Multithreading, distribution and Internet programming can be easy
Five philosophers are seated around a table. Each thinks, then eats, then thinks again and so on; thinking does not require any utensil, but eating requires grabbing two forks, and there are only five forks, creating a contention problem.
All right, you have seen this problem more often than you care to remember (although it might have been in its more recent, diversity-correct variant, involving rice and chopsticks). What you probably have not seen, however, is the simplest of all solutions: in class PHILOSOPHER, use the call
eat (left, right)
where eat does not need to do anything special:
eat (fork1, fork2: FORK) is
   do
      -- Nothing here (or perhaps a delay instruction).
   end
That's it! More precisely that's it provided we have declared class FORK as separate class instead of just class. Calling a routine such as eat with "separate" arguments means that the caller will wait for all the corresponding objects to be available. No semaphores, no Ada "accept" instructions, no Java "synchronized method" declarations; just say what separate objects you need, and the mechanism will grab them for you.
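For completeness, here is one way the surrounding class might look. This is only a sketch; the feature names live and think, and the loop structure, are illustrative assumptions, not part of the article's solution:

class PHILOSOPHER feature
   left, right: FORK
         -- The two forks within reach (FORK being a separate class)

   live is
         -- Think and eat, forever.
      do
         from until False loop
            think
            eat (left, right)
               -- Waits until both forks are held exclusively
         end
      end
   ...
end

The point is that eat need not contain any synchronization code: the exclusive hold on both forks follows from its separate arguments.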
There is also no need to state what concurrent architecture you are using. Concurrent programming comes in many flavors. What gets many people excited these days is multithreading, and indeed the philosophers and forks in this example can be assigned to threads on a multithreaded operating system. But why limit ourselves to multithreading? Threads share many properties with processes; we can use exactly the same model to describe concurrent execution of processes running on the same machine. Then again, why limit ourselves to one machine? The processes can be on different computers on a local network. Why a local network? A process can be running in Santa Barbara and one in Tokyo, communicating through the Internet. Or they can be on a company's network, one of the processes managing a client workstation and the other a database server (say).
Indeed the model sketched in this article will cover all these forms of concurrency, from multithreading to client-server computing to distributed processing to Internet programming. You will design your application as a network of concurrent objects, and decide separately how to map the virtual concurrency to the concurrent resources physically available.
The rest of this article describes how simple it is to extend the object-oriented paradigm, as illustrated by Eiffel, to cover all the principal variants of concurrent computation. We hope you will be struck by how gracefully object technology lends itself to the programming of concurrent applications.
Executing an operation on an object requires a "processor". Not necessarily a physical processor: it can be a computer, an operating system process, or a thread in a multithreaded OS. Indeed the word "thread" might be better than "processor", but since it has now been preempted by the more specific sense used in multithreading I'll stick to "processor". Just remember that a processor is not necessarily a physical computer.
One of the nice things about object technology is that the basic model of computation is so simple. In a true object-oriented language all that we ever execute is
supplier.operation (arguments)
where supplier represents an object, operation a feature (also called method) of the corresponding class, and there may be some arguments to the operation. Such a feature call is in turn executed as part of an operation performed on some client object.
There is very, very little to change when we move from sequential to concurrent programming. Simply, the processors (in the above sense) that handle the client and supplier objects can now be different! This implies that we go from a synchronous model, forcing the client to wait for operation to complete before it can proceed, to an asynchronous model where it does not need to wait.
But of course the difference in semantics must be reflected by a difference in the software text. Hence the only syntactical extension that we need to go from sequential Eiffel to concurrent Eiffel: to specify that a supplier will be handled by a different processor we will write its type declaration as
supplier: separate SOME_CLASS
or, as seen in the FORK example, declare SOME_CLASS itself as separate. This does not specify what processor supplier will use, as this would usually be overspecifying: the software text should not know about actual processors, which in most cases need only be selected at run time. But one thing we need to say at compile time: whether the processor is the same as that of client, or different, because the semantics is fundamentally different.
In plain (sequential) Eiffel you create an object through the instruction
create x.creation_procedure (...)
which allocates space for a new object, attaches x to that object, initializes its fields to default values, and may override some of these values by applying the given creation procedure (constructor) if provided. Now if x has been declared as separate, this creation instruction will also assign to the object a new processor. So to program in a multithreaded environment we do not need any special operations to create threads, as in Java for example: using the above creation instruction we simply create separate objects, which will go off on their own, using new threads.
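As an illustration of this last point, here is a minimal sketch; the class name WORKER and its features make and process_jobs are assumptions introduced for the example:

   worker: separate WORKER
   ...
   create worker.make
      -- Allocates the object, initializes it through make,
      -- and assigns it a new processor (for example a new thread).
   worker.process_jobs
      -- Asynchronous: the client continues without waiting.

There is no thread object to manage; creating the separate object is what creates the thread.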
To program concurrent applications requires synchronization mechanisms. Often these are very complicated; Java, for example, uses synchronized methods, wait, notify, yield, thread suspension, thread interruption, volatile fields, runnable objects, thread groups, and so on.
Things can be made much simpler. We have already seen the basic synchronization mechanism: a call of the form some_routine (supplier), where supplier is separate, ensures exclusive access to supplier; its execution will wait until supplier is available. Accessing a separate object without a guarantee of mutual exclusion would be foolish anyway, as the interleaving of operations executed by different clients could leave the object in an inconsistent state. This is a major source of bugs that are particularly difficult to catch (because they depend on the clients' relative speeds and hence cannot easily be reproduced). Although the basic book on Java (Arnold and Gosling, The Java Programming Language, Addison-Wesley) warns against this risk through the example of two bank tellers concurrently updating the same bank account, Java leaves the risk completely open, since you are not required to declare your methods as "synchronized".
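Echoing the bank-teller example, here is how the rule removes the interleaving hazard. The class ACCOUNT and its features withdraw and deposit are assumptions made up for this sketch, not from the Java book or this model's libraries:

transfer (source, target: separate ACCOUNT; amount: INTEGER) is
      -- Move amount from source to target.
   do
      -- The call to transfer waits until both accounts are
      -- available, then holds them exclusively for the
      -- duration of the routine; no other client can
      -- interleave its own withdrawals or deposits.
      source.withdraw (amount)
      target.deposit (amount)
   end

Because exclusive access follows from the separate arguments, there is no way to forget the synchronization, as one can forget a "synchronized" keyword.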
Once you have grabbed an object through the call some_routine (supplier), you can apply arbitrary operations to it, using the form seen above:

supplier.operation (arguments)
which will start off the operation in parallel, using the processor of supplier, making the client object free to continue with its own agenda.
Sometimes you do not want to grab an object unconditionally, but only when a certain condition becomes true. Again if you look at concurrent programming languages you will see a variety of complex mechanisms to achieve this. But there is no need for a new mechanism in Eiffel, whose assertions come in handy. Consider a routine of the form
oldest (q: BUFFER [SOME_TYPE]): SOME_TYPE is
      -- The oldest element not yet removed
   require
      not q.empty
   do
      Result := ... Implementation of access to oldest element ...
   end
In sequential programming, the precondition require not q.empty would be a correctness condition, expressing that you can only execute the "oldest element" request on a queue if the queue is non-empty. But if BUFFER is declared as separate the precondition becomes a wait condition: a client calling oldest will wait until q is available and non-empty.
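A client might then use oldest as follows. This is a hypothetical sketch; the routine consume and the processing it performs are assumptions for illustration:

consume (q: separate BUFFER [SOME_TYPE]) is
      -- Retrieve and process the oldest element of q.
   local
      item: SOME_TYPE
   do
      item := oldest (q)
         -- Waits until q is both available and non-empty.
      ... Process item ...
   end

No polling loop and no condition variable: the precondition on the separate object expresses the wait condition directly.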
We need one more synchronization mechanism, to resynchronize a thread (or more generally a processor) with a computation started by another. As we have seen, the execution of
supplier.operation (arguments)
does not block the object executing it -- the client -- if the supplier is separate. The client can proceed with its own instructions. But of course it may later need to make sure that the operation is complete, for example because it needs some of the objects that the operation creates or updates. In Concurrent Eiffel such resynchronization is handled automatically; there is no need for any explicit instruction. The rule, known as wait by necessity, is that the client will wait when it logically needs to; this happens when it executes a query of the form
value := supplier.some_query (...)
where some_query is a value-returning function or attribute. There is no need to wait for a non-value-returning procedure call, of the form supplier.other_command (...); but for some_query, which returns a value, the computation cannot proceed until we have that value. Wait by necessity provides a powerful implicit synchronization mechanism, not requiring any special work on the programmer's part.
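To illustrate wait by necessity, consider the following client fragment; the feature names compute, other_work and result_value are assumptions introduced for the example:

   supplier.compute
      -- Asynchronous: the client proceeds immediately.
   other_work
      -- Executed by the client in parallel with compute.
   value := supplier.result_value
      -- A query: the client now waits, if necessary, until the
      -- earlier call on supplier has completed and the value
      -- is available.

The client waits exactly when it logically must, and nowhere else.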
A final mechanism takes care of the relatively rare cases in which we need to interrupt a client's hold on a separate supplier object, so as to service a request from a VIP client. This is a delicate situation, because we cannot just stop the current client, execute the request of the VIP, and then resume the original client: we would, in many circumstances, end up with an inconsistent object.
What we can do, however, is cause an exception in the original holder; the holder will be notified of the exception, and will have to take corrective actions since it was not able to complete its call. It may of course retry the operation later.
Simple library facilities support this mechanism. To accept being interrupted by a VIP client, call yield (the default is not to yield). If you are a VIP and want to request immediate service for the next call, use demand. Note that if the object of your desire is already taken, and its holder has not executed yield, you will be the one that gets the exception. A VIP request less extreme than demand is insist, which will interrupt the holder if possible, but otherwise simply go through the normal wait.
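Here is how the holder's side might look, assuming yield is available as a library feature and using made-up names (STOCK, update) for the application objects; the corrective action is sketched through Eiffel's normal exception mechanism (rescue clause):

risky_update (s: separate STOCK) is
      -- Update s, accepting interruption by a VIP client.
   do
      yield
         -- From here on, a VIP's demand or insist may
         -- interrupt our hold on s.
      s.update
   rescue
      -- We were interrupted and could not complete the call:
      -- take corrective action, perhaps scheduling a retry.
   end

A VIP client would simply execute demand (or insist) before its own call on the shared object.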
The concurrency model, as noted, maintains a strict separation between the conceptual architecture, which assumes the availability of processors as needed, and the physical architecture, which must decide how processors are implemented -- threads, processes, computers... -- and allocate processors to physical resources, for example by assigning a thread to a newly created separate object.
In most cases the software text should not have to know about this allocation; it simply describes the concurrent computation. It would be a pity to have to rewrite our programs just because we switch from one physical architecture to another. The actual processor-resource allocation is described outside of the program, through a Concurrency Configuration File (CCF) which is read only at run time. The CCF uses a straightforward set of conventions; a typical extract is
creation
   local_nodes:
      system
         "peanuts" (2): "c:\system1\appl.exe"
         "almonds" (4): "/home/users/syst1"
      end
end
external
   Oracle_handler: "pecans" port 9000
end
listing two servers for local creation of separate objects (peanuts for the first two, almonds for the next four, and again cyclically, each time using the given executable), and a remote database server. In this case processors are implemented as OS processes; similar conventions are available for processors that will be implemented as threads.
Library mechanisms are available to re-read the CCF at run time, and to provide the equivalent facilities from within Eiffel, so that an application that does want to control the physical allocation of processors can do so.
Concurrent programming has long had the reputation, largely justified, of being tricky and error-prone. But it does not have to be. By using the object-oriented paradigm for all that it has to offer, we can tackle concurrent applications -- whether they use threads, the Internet or a client-server model -- just as effectively as sequential ones. We hope this article has shown how attractive the result can be.
Early versions of the Concurrent Eiffel mechanism, also known as SCOOP (Simple Concurrent Object-Oriented Programming), have been presented at various TOOLS conferences since 1990. An article in Communications of the ACM (September 1993) described a model close to the current one.
The most complete description is in the 86-page chapter on concurrency of the second edition of the book Object-Oriented Software Construction (Prentice Hall, March 1997), which can be ordered from this site. See also the other on-line technology papers at http://www.eiffel.com.