
28.5 SYNCHRONIZATION ISSUES

We have our basic mechanism for starting concurrent executions (separate creation) and for requesting operations from these executions (the usual feature call mechanism). Any concurrent computation, object-oriented or not, must also provide ways to synchronize concurrent executions, that is to say to define timing dependencies between them.

If you are familiar with concurrency issues, you may have been surprised by the announcement that a single language mechanism, separate declarations, is enough to add full concurrency support to our sequential object-oriented framework. Surely we need specific synchronization mechanisms too? Actually no. The basic O-O constructs suffice to cover a wide range of synchronization needs, provided we adapt the definition of their semantics when they are applied to separate elements. It is a testament to the power of the object-oriented method that it adapts so simply and gracefully to concurrent computation.

Synchronization vs. communication

To understand how we should support synchronization in object-oriented concurrency, it is useful to begin with a review of non-O-O solutions. Processes (the concurrent units in most of these solutions) need mechanisms of two kinds:

  • Synchronization mechanisms enforce timing constraints. A typical constraint might state that a certain operation of a process, such as accessing a database item, may only occur after a certain operation of another process, such as initializing the item.

  • Communication mechanisms allow processes to exchange information, which in the object-oriented case will be in the form of objects or object structures.

A simple classification of approaches to concurrency rests on the observation that some of them focus on the synchronization mechanism and then use ordinary non-concurrent techniques such as argument passing for communication, whereas others treat communication as the fundamental issue and deduce synchronization from it. We may talk about synchronization-based and communication-based mechanisms.

Synchronization-based mechanisms

The best known and most elementary synchronization-based mechanism is the semaphore, which can be viewed as a locking tool for controlling shared resources. A semaphore is an object on which two operations are available: reserve and free (traditionally called P and V, but more suggestive names are preferable). At any time the semaphore is either reserved by a certain client or free. If it is free and a client executes reserve, the semaphore becomes reserved by that client. If the client that has reserved the semaphore executes free, the semaphore becomes free. If the semaphore is reserved by a client and another executes reserve, the new client will wait until the semaphore is free again. The following table summarizes this specification:

---------------------------------------------------------------------------
OPERATION \ STATE    Free                 Reserved by me     Reserved by
                                                             someone else
---------------------------------------------------------------------------
reserve              Becomes reserved     --                 I wait.
                     by me.
free                 --                   Becomes free.      --
---------------------------------------------------------------------------

Events represented by the entries marked "--" are not supposed to occur; they can be treated either as errors or as having no effect.

The policy for deciding which client gets through when two or more are waiting for a semaphore that gets freed may be part of the semaphore's specification, or may be left unspecified. (Usually clients expect a fairness property guaranteeing that, if everyone gaining access to the semaphore ultimately frees it, no one will wait forever.)

This description covers binary semaphores. The integer variant lets at most n clients through at any given time, for some n, rather than at most one.

Semaphores are widely considered to be too low-level for building large, reliable concurrent systems. But they can be used for small examples, and provide a starting point for discussing more advanced techniques.
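As a present-day illustration, not part of the original discussion, here is a minimal sketch in Go of the reserve-free specification above, built on a buffered channel; the type and all names are mine, chosen to match the text. A capacity of 1 gives the binary semaphore, a capacity of n the integer variant.

// semaphore.go: illustrative sketch only, not from the original text.
package main

import "fmt"

// Semaphore admits at most cap(slots) clients at a time;
// capacity 1 behaves as the binary semaphore described above.
type Semaphore struct {
    slots chan struct{}
}

func NewSemaphore(n int) *Semaphore {
    return &Semaphore{slots: make(chan struct{}, n)}
}

// Reserve waits ("I wait.") until a slot is available, then takes it.
func (s *Semaphore) Reserve() { s.slots <- struct{}{} }

// Free gives the slot back, letting one waiting client through.
func (s *Semaphore) Free() { <-s.slots }

func main() {
    sem := NewSemaphore(1) // binary semaphore
    sem.Reserve()
    fmt.Println("reserved by me")
    sem.Free()
    fmt.Println("free again")
}

Note that this sketch leaves the invalid events of the table undetected: a free on a semaphore that is already free, for instance, simply blocks.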

A more abstract approach is based on critical regions. A critical region is a sequence of instructions that may be executed by at most one client at a time. To ensure exclusive access to a certain object a you may write something like

hold a then ... Operations involving fields of a ... end

where the critical region is delimited by then ... end. Only one client can execute the critical region at any given time; others executing a hold will wait. Most applications need a more general variant, the conditional critical region, in which execution of the critical region is subject to a certain boolean condition. Consider for example a buffer shared by a producer, which can only write into the buffer if it is not full, and a consumer, which can only read from it if it is not empty; they may use the two respective schemes

hold buffer when not buffer.full then "Write into buffer, making it not empty" end
hold buffer when not buffer.empty then "Read from buffer, making it not full" end
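Conditional critical regions are not available as such in mainstream languages, but their effect can be approximated with a lock and a condition variable. The following Go sketch, in which all names and the buffer capacity are mine and purely illustrative, makes Put wait until the buffer is not full and Get wait until it is not empty, mirroring the two schemes above.

// bounded_buffer.go: illustrative sketch of a conditional critical region.
package main

import (
    "fmt"
    "sync"
)

type Buffer struct {
    mu    sync.Mutex
    cond  *sync.Cond
    items []int
    size  int
}

func NewBuffer(size int) *Buffer {
    b := &Buffer{size: size}
    b.cond = sync.NewCond(&b.mu)
    return b
}

// Put corresponds to: hold buffer when not buffer.full then write end
func (b *Buffer) Put(x int) {
    b.mu.Lock()
    defer b.mu.Unlock()
    for len(b.items) == b.size { // wait while full
        b.cond.Wait()
    }
    b.items = append(b.items, x)
    b.cond.Broadcast() // the buffer is now not empty
}

// Get corresponds to: hold buffer when not buffer.empty then read end
func (b *Buffer) Get() int {
    b.mu.Lock()
    defer b.mu.Unlock()
    for len(b.items) == 0 { // wait while empty
        b.cond.Wait()
    }
    x := b.items[0]
    b.items = b.items[1:]
    b.cond.Broadcast() // the buffer is now not full
    return x
}

func main() {
    b := NewBuffer(2)
    go func() { b.Put(1); b.Put(2); b.Put(3) }()
    fmt.Println(b.Get(), b.Get(), b.Get())
}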

Such interplay between input and output conditions cries for introducing assertions and giving them a role in synchronization, an idea to be exploited later in this chapter.

Another well-known synchronization-based mechanism, combining the notion of critical region with the modular structure of some modern programming languages, is the monitor. A monitor is a program module, not unlike the packages of Modula or Ada. The basic synchronization mechanism is simple: mutual exclusion at the routine level. At most one client may execute a routine of the monitor at any given time.
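In a language without a monitor construct, the closest counterpart is a module whose exported routines all acquire one shared lock on entry, so that at most one client is ever executing inside the module. A rough Go sketch, with an example and names of my own choosing:

// counter_monitor.go: sketch of monitor-style mutual exclusion at the routine level.
package main

import (
    "fmt"
    "sync"
)

// Counter plays the role of a monitor: every exported routine
// locks the same mutex, so at most one client executes inside it at a time.
type Counter struct {
    mu    sync.Mutex
    value int
}

func (c *Counter) Increment() {
    c.mu.Lock()
    defer c.mu.Unlock()
    c.value++
}

func (c *Counter) Value() int {
    c.mu.Lock()
    defer c.mu.Unlock()
    return c.value
}

func main() {
    var c Counter
    var wg sync.WaitGroup
    for i := 0; i < 100; i++ {
        wg.Add(1)
        go func() { defer wg.Done(); c.Increment() }()
    }
    wg.Wait()
    fmt.Println(c.Value()) // always 100: increments never interleave
}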

Also interesting is the notion of path expression. A path expression specifies the possible sequencing of a set of processes. For example the expression

init ; (reader* | writer)+ ; finish

prescribes the following behavior: first an init process; then a state in which at any time either one writer process or any number of reader processes may be active; then a finish process. The asterisk * means any number of concurrent instances; the semicolon ; indicates sequencing; | means "either-or"; + means any number of successive repetitions. An argument often cited in favor of path expressions is that they specify synchronization constraints outside of the processes, avoiding interference with the description of their algorithmic tasks.
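The middle phase of this expression, in which at any time either one writer or any number of readers may be active, is precisely what a readers-writer lock enforces. The Go sketch below, my own illustration which ignores the init and finish phases, uses sync.RWMutex for that phase.

// readers_writer.go: sketch of the (reader* | writer) phase of the path expression.
package main

import (
    "fmt"
    "sync"
)

var (
    rw   sync.RWMutex
    data = map[string]int{}
)

// reader corresponds to a "reader" process: any number may run concurrently.
func reader(key string, wg *sync.WaitGroup) {
    defer wg.Done()
    rw.RLock()
    defer rw.RUnlock()
    _ = data[key]
}

// writer corresponds to a "writer" process: it excludes readers and other writers.
func writer(key string, v int, wg *sync.WaitGroup) {
    defer wg.Done()
    rw.Lock()
    defer rw.Unlock()
    data[key] = v
}

func main() {
    var wg sync.WaitGroup
    wg.Add(3)
    go writer("x", 1, &wg)
    go reader("x", &wg)
    go reader("x", &wg)
    wg.Wait()
    fmt.Println("done:", data["x"])
}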

Communication-based mechanisms

Starting with Hoare's "Communicating Sequential Processes" (CSP) in the late seventies, most non-O-O concurrency work has focused on communication-based approaches.

The rationale is easy to understand. If you have solved the synchronization problem, you must still find a way to make concurrent units communicate. But if you devise a good communication mechanism you might very well have solved synchronization too: because two units cannot communicate unless the sender is ready to send and the receiver ready to receive, communication implies synchronization; pure synchronization may be viewed as the extreme case of communicating an empty message. If your communication mechanism is general enough, it will provide all the synchronization you need.
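In a channel-based language the last remark is quite literal: a message with no content at all still synchronizes sender and receiver. A minimal Go sketch, with names of my own choosing:

// rendezvous.go: pure synchronization as the exchange of an empty message.
package main

import "fmt"

func main() {
    ready := make(chan struct{}) // unbuffered channel of empty messages

    go func() {
        fmt.Println("worker: initialization done")
        ready <- struct{}{} // send nothing but the fact of being ready
    }()

    <-ready // the receiver waits here until the sender is ready
    fmt.Println("main: may now proceed")
}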

CSP is based on this "I communicate, therefore I synchronize" view. The starting point is a generalization of a fundamental concept of computing, input and output: a process receives information v from a certain "channel" c through the construct c ? v; it sends information to a channel through the construct c ! v. Channel input and output are only two among the possible examples of events.

For more flexibility CSP introduces the notion of non-deterministic wait, represented below by the vertical bar |, enabling a process to wait on several possible events and execute the action associated with the first that occurs. Assume for example a system enabling a bank's customers to make inquiries and transfers on their accounts, and the bank manager to check what is going on:

  (customer_input ? balance_enquiry ->
      (customer_input ? password ->
          (password_valid -> customer_output ! balance)
          |
          (password_invalid -> (customer_output ! rejection)))
  |
  customer_input ? transfer_request -> ...
  |
  manager ? control_operation -> ...)

In the initial state the system stands ready to accept one of three possible input events, two on the customer_input channel and one on the manager channel. The first event that occurs will trigger the behavior described on the right of the corresponding arrow; that behavior is specified in the same way. Once the event's processing is complete, the system returns to its initial state, listening to possible input events.
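Go's channels and its select instruction descend directly from this CSP style, so the shape of the example can be sketched almost literally. In the sketch below the channel names mirror those of the CSP text, but everything else, the request type, the password check and the replies, is my own simplification; in particular the password travels with the request rather than being read in a second input event.

// bank_server.go: sketch of the CSP bank example using Go's select.
package main

import "fmt"

type request struct {
    kind     string // "balance_enquiry", "transfer_request", ...
    password string
}

func server(customerInput <-chan request, customerOutput chan<- string,
    manager <-chan string, done <-chan struct{}) {
    for {
        // Non-deterministic wait on several possible input events,
        // like the CSP choice between guarded alternatives.
        select {
        case req := <-customerInput:
            switch req.kind {
            case "balance_enquiry":
                if req.password == "secret" { // password_valid
                    customerOutput <- "balance: 100"
                } else { // password_invalid
                    customerOutput <- "rejection"
                }
            case "transfer_request":
                customerOutput <- "transfer handled"
            }
        case op := <-manager:
            customerOutput <- "control operation: " + op
        case <-done:
            return
        }
    }
}

func main() {
    customerInput := make(chan request)
    customerOutput := make(chan string)
    manager := make(chan string)
    done := make(chan struct{})

    go server(customerInput, customerOutput, manager, done)

    customerInput <- request{kind: "balance_enquiry", password: "secret"}
    fmt.Println(<-customerOutput)
    manager <- "audit"
    fmt.Println(<-customerOutput)
    close(done) // server returns to its initial state until told to stop
}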

CSP was a major influence on the concurrency mechanism of Ada, whose "tasks" are processes able to wait on several possible "entries" through an "accept" instruction. The Occam language, a direct implementation of CSP, is the primary programming tool for the transputer, a family of microprocessors designed specifically by Inmos (now SGS-Thomson) for the construction of highly concurrent architectures.

Synchronization for concurrent O-O computation

Many of the ideas just reviewed will help us find the right approach to concurrency in an object-oriented context. In the final form of the solution you will recognize concepts coming from CSP as well as monitors and conditional critical regions.

The CSP emphasis on communication seems right for us, since the central mechanism of object technology, feature call with arguments, is indeed a communication mechanism. But there is another, O-O-specific reason for preferring a communication-based solution: a synchronization-based mechanism can conflict with inheritance.

This conflict is most obvious if we consider path expressions. The idea of using path expressions has attracted many researchers on O-O concurrency as a way to specify the actual processing, as given by the features of a class, separately from the synchronization constraints, given by path expressions. The purely computational aspects of the software, which may have existed prior to the introduction of concurrency, will thus remain untainted by concurrency concerns. So for example if a class BUFFER has the features remove (remove the oldest element of the buffer) and put (add an element), we may express the synchronization through constraints such as

empty: {put}
partial: {put, remove}
full: {remove}

using a path-expression-like notation which lists three states and, for each of them, the permitted operations. (Notation and example from [Matsuoka 1993], which introduced the term "inheritance anomaly". For more details on the example see exercise 28.3, page 1004.) But then assume you want a descendant NEW_BUFFER to provide an extra feature remove_two which removes two buffer items at a time, assuming a buffer size of at least three. Then you need an almost completely new set of states:

empty: {put}
partial_one: {put, remove}
partial_two_or_more: {put, remove, remove_two}
full: {remove, remove_two}

and if the routines specify what states they produce in each possible case, they must all be redefined from BUFFER to NEW_BUFFER, defeating the purpose of inheritance.

This problem, and similar ones identified by several researchers, have been dubbed the inheritance anomaly, and have led some concurrent O-O language designers to view inheritance with suspicion. The first versions of the POOL parallel object-oriented language, for example, excluded inheritance (see the bibliographical notes).

Concerns about the "inheritance anomaly" have sparked an abundant literature proposing solutions, which generally try to decrease the amount of redefinition by looking for more modular ways of specifying the synchronization constraints, so that descendants can describe the changes more incrementally, instead of having to redefine everything.

On closer examination, however, the problem does not appear to be inheritance, or even an inherent conflict between inheritance and concurrency, but instead the idea of specifying synchronization constraints separately from the routines themselves. (The formalisms discussed actually do not quite meet this goal anyway, since the routines must specify their exit states.)

To the reader of this book, familiar with the principles of Design by Contract, the technique using explicit states and a list of the features applicable in each state will look rather low-level. The specifications of BUFFER and NEW_BUFFER obscure fundamental properties that we have learned to characterize through preconditions: put should state require not full; similarly, remove_two should state require count >= 2, and so on. This more compact and more abstract specification is easier to explain, easier to adapt (changing a routine's precondition does not affect any other routine!), and easier to relate to the views of outsiders such as customers. Assessed against this technique, a state-based specification seems tedious, restrictive, and error-prone.
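To make the contrast concrete, here is a sketch of the precondition style in Go, with explicit checks standing in for require clauses and type embedding standing in for inheritance; Go has neither contracts nor class inheritance, so this only gestures at the idea, and all names simply follow the text. The point to note is that the counterpart of NEW_BUFFER adds remove_two with its own precondition and leaves put and remove untouched.

// buffer_contracts.go: sketch of the precondition-based specification,
// using explicit checks in place of Eiffel's require clauses.
package main

import "fmt"

type Buffer struct {
    items    []int
    capacity int
}

func (b *Buffer) Full() bool  { return len(b.items) == b.capacity }
func (b *Buffer) Empty() bool { return len(b.items) == 0 }
func (b *Buffer) Count() int  { return len(b.items) }

// Put -- require: not full
func (b *Buffer) Put(x int) {
    if b.Full() {
        panic("precondition violated: not full")
    }
    b.items = append(b.items, x)
}

// Remove -- require: not empty
func (b *Buffer) Remove() int {
    if b.Empty() {
        panic("precondition violated: not empty")
    }
    x := b.items[0]
    b.items = b.items[1:]
    return x
}

// NewBuffer extends Buffer with RemoveTwo; Put and Remove are reused untouched.
type NewBuffer struct {
    Buffer
}

// RemoveTwo -- require: count >= 2
func (b *NewBuffer) RemoveTwo() (int, int) {
    if b.Count() < 2 {
        panic("precondition violated: count >= 2")
    }
    return b.Remove(), b.Remove()
}

func main() {
    nb := &NewBuffer{Buffer{capacity: 3}}
    nb.Put(1)
    nb.Put(2)
    a, c := nb.RemoveTwo()
    fmt.Println(a, c)
}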

The "inheritance anomaly" only occurs because such specifications tend to be rigid and fragile: change anything, and the whole specification crumbles.

At the beginning of this chapter we saw another apparent inheritance-concurrency clash; but the culprit turned out to be the notion of active object. In both cases inheritance is at odds not with concurrency but with a particular approach to concurrency (active objects, state-based specifications); rather than dismissing or limiting inheritance --- cutting the arm whose finger itches --- the solution is to look for better concurrency mechanisms.

One of the practical consequences of this discussion is that we should try to rely, for synchronization in concurrent computation, on what we already have in the object-oriented model, in particular assertions. Preconditions will indeed play a central role for synchronization, although their semantics will need to be adapted from the sequential case.
