Understanding Java Multithreading and Read-Write Locks, Part 1

There’s a lot of misunderstanding about the concepts related to multithreading in the developer community. I must say that some of this misunderstanding is harmless because Java hides a lot of complexity and nuances behind some very simple and convenient language constructs. However, it is imperative for anyone into serious thread programming to know all that goes on under the hood.

In this article, we try to demystify some of the key concepts and shed light on the lesser-known ones, using a working implementation of a read-write lock as the example. Finally, we touch upon some concepts that are not used in the example but are important in completing the picture. The article assumes that the reader is familiar with what threads are and how to work with the Thread class, the synchronized keyword, and the wait() and notify() methods.

Ground-Clearing Concepts

Before we delve deep into details, let us first quickly run through some of the facts on which much of the forthcoming discussion is based.

Main and working memory

In the JVM, there is a main memory, which is shared between all threads, and each thread has its own private working memory. No thread can access another thread’s working memory. The main and working memory should not be confused with the stack and heap areas of the JVM. There is a close association, yet they are not the same, and the language specification does not speak of any such correspondence. More specifically, the heap area may be ‘likened’ to the main memory, as it is here that the shared data in objects resides, but the operand stack of a thread is not the same as the working memory of that thread. How and where the working memory of a thread is located depends on the virtual machine implementation. It could be physical memory; it could be the CPU cache. For the purpose of understanding, we shall just refer to it as a thread’s working memory, which is accessible only to the thread that owns it.

Concurrency and parallelism

Two threads performing two tasks are concurrent when they appear to be running at the same time. What actually happens is that the tasks are composed of many byte-code instructions, and the two threads take turns executing some of these instructions and thus share the CPU time between them. How and when one of the threads is scheduled (or de-scheduled) for execution is dictated by the scheduler, which depends on the operating system. Parallel threads are the ones that really do run at the same time, and for that they require multiple processors. Because Java threads map to OS-level threads, the scheduling of the threads is left entirely to the scheduler.

Thread states

A thread can be in one of several states: ‘Ready’, ‘Running’, ‘Sleeping’, ‘Blocked’, and ‘Waiting’. In the ‘Ready’ state the thread is just waiting for CPU time; when the scheduler permits it to run, it goes into the ‘Running’ state. ‘Running’ is the actual execution state, in which the instructions in the run() method (or in a method called from run()) are being executed. A thread goes into the ‘Sleeping’ state when Thread.sleep() is called. A thread goes into the ‘Blocked’ state when it is blocked for some resource (typically I/O). Finally, a thread goes into the ‘Waiting’ state when it calls the Object.wait() method. We will talk about locks and resources further ahead; for now it suffices to know that there is a lock associated with every object and that a thread may possess the lock of any object. A thread that possesses a lock does not relinquish it by going into the ‘Sleeping’ or ‘Blocked’ state; the lock is relinquished, however, when the thread goes into the ‘Waiting’ state (specifically, the lock of the object on which wait() was called). We will explain this in more detail.
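
The claim that a waiting thread gives up its lock can be checked with a small experiment. The sketch below is illustrative only: the class WaitReleasesLockDemo and its members are hypothetical names, not part of the article's listing. It relies on the JDK's Thread.State.WAITING, which is the standard library's name for what is called ‘Waiting’ above.

```java
// Hypothetical demo class; names are illustrative, not from the article's listing.
class WaitReleasesLockDemo {
    private final Object lock = new Object();
    private boolean signalled = false;        // guarded by lock

    /** Returns true if the caller acquired the lock while the other thread was waiting. */
    static boolean run() {
        WaitReleasesLockDemo demo = new WaitReleasesLockDemo();
        Thread waiter = new Thread(demo::waitForSignal);
        waiter.start();
        try {
            // Poll until the waiter has parked itself inside lock.wait().
            while (waiter.getState() != Thread.State.WAITING) {
                Thread.sleep(1);
            }
            boolean acquired;
            synchronized (demo.lock) {        // possible only because wait() gave up the lock
                acquired = Thread.holdsLock(demo.lock);
                demo.signalled = true;
                demo.lock.notifyAll();        // let the waiter continue
            }
            waiter.join();
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    private void waitForSignal() {
        synchronized (lock) {
            while (!signalled) {              // loop guards against spurious wakeups
                try {
                    lock.wait();              // releases the lock while waiting
                } catch (InterruptedException e) {
                    return;
                }
            }
        }
    }
}
```

If Thread.sleep() were substituted for lock.wait() in waitForSignal(), the calling thread would stay stuck at its synchronized block for as long as the sleep lasted, because a sleeping thread keeps its lock.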

Locks and sharing of resources

Each object has a lock associated with it. Critical sections of the code that access a common resource are guarded by monitors (which essentially are synchronized methods or blocks). To enter a monitor, a thread has to acquire the lock corresponding to that monitor. Whenever there is a possibility of a resource (any class variable) being accessed by more than one thread, we need to guard the access to that resource with the help of monitors. One and only one thread can hold the lock of an object, and thus be inside the critical section, at any point in time.

We will now build on this understanding with an example of a read-write lock.
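
As a minimal sketch of a resource guarded by a monitor, consider a counter shared by two threads. The class GuardedCounter and the helper runTwoThreads() are hypothetical names introduced here for illustration.

```java
// Hypothetical example: a shared counter whose accesses are guarded by the instance's lock.
class GuardedCounter {
    private int count = 0;                    // the shared resource

    synchronized void increment() {           // critical section: one thread at a time
        count++;
    }

    synchronized int get() {
        return count;
    }

    /** Starts two threads that each increment the counter perThread times. */
    static int runTwoThreads(int perThread) {
        GuardedCounter counter = new GuardedCounter();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }
}
```

Because count++ is a read, an add, and a write, removing the synchronized keyword from increment() would allow two threads to interleave those steps and lose updates; with it, the total always equals twice perThread.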

Read-Write Lock

We have seen that critical sections are guarded by monitors and that only the thread holding the lock associated with a monitor can access the resources in that critical section. In simple terms, consider a synchronized method or a synchronized block of code. Before executing that code, a thread takes the lock of an object. For a synchronized method, that object is the instance on which the method is called; for a synchronized block, it can be any object whose reference is given there, for example, synchronized (anyObject) { }.
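
Which object's lock is taken in each case can be observed with Thread.holdsLock(), a standard JDK method that reports whether the current thread holds a given object's lock. The class LockOwnerDemo below is a hypothetical illustration, not part of the article's listing.

```java
// Hypothetical example showing which object's lock each construct acquires.
class LockOwnerDemo {
    private final Object other = new Object();

    // Synchronized method: the lock taken is that of the instance ("this").
    synchronized boolean methodHoldsThis() {
        return Thread.holdsLock(this);
    }

    // Synchronized block: the lock taken is that of whatever object is named.
    boolean blockHoldsOther() {
        synchronized (other) {
            return Thread.holdsLock(other) && !Thread.holdsLock(this);
        }
    }

    // Outside any synchronized code, neither lock is held.
    boolean holdsNothingOutside() {
        return !Thread.holdsLock(this) && !Thread.holdsLock(other);
    }
}
```

Each of the three methods returns true when called from a thread that holds no locks beforehand.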

A read access to the guarded resource is one in which the thread just uses the values of the resource without changing them, and a write access is one in which the values are changed. If there are multiple threads, some of which are readers and some writers, then we have to take locks every time, because even if we are just reading there is a chance of a writer modifying the resource at the same time, and the reader could get an inconsistent view. But if most of the threads are just reading, it would be unwise to serialize read access to the resource.

The example listing RWLock.java shows a simple read-write lock. We will explain the program so that the concepts behind it are clearly understood, but first the usage. Any class that wishes to guard one or more of its resources (typically objects, but they could well be primitives) against concurrent modification, while permitting concurrent reads, may create an instance of RWLock. Before reading, threads must call the getReadLock() method, and after reading they must call the releaseReadLock() method.

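public synchronized void getReadLock() {
    while (writingInProgress || (writersWaiting >= maxWritersWaiting)) {
        try {
            wait();   // gives up the lock on this RWLock until notified
        } catch (InterruptedException ie) {
            // interruption is ignored; the loop re-checks the condition
        }
    }
    readers++;   // one more active reader
}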

Here, maxWritersWaiting is a variable that is set at initialization; you can supply its value at construction. Just how many writing threads you wish to allow to wait depends on your application’s needs. Note that this method is synchronized, and the question that comes to mind is: what have we gained if we have to take a lock anyway? There are three important aspects to this.

  1. Since we are changing the class variable ‘readers’ and accessing variables like writersWaiting, we have to guard against other threads changing these values.
  2. Next, and perhaps more important, is the “rule about the interaction of locks and variables”. Any application thread that wishes to read a resource requires the latest, consistent copy of the data. As explained earlier, the thread copies values from main memory to its working memory, but when these values are copied depends on the implementation. The synchronized keyword, however, acts as a ‘memory barrier’: when the thread steps into a synchronized block it flushes its local copies, so for every subsequent use of any variable after crossing the barrier the thread copies the value afresh from main memory to its private memory. The words ‘any variable’ deserve emphasis. The synchronized method above makes no reference to the guarded resource, yet any copy of that resource in the thread’s local memory is still flushed clean, and any read after getReadLock() is guaranteed to fetch the latest value of any variable used by the thread. So the synchronized keyword means not only mutex (mutual exclusion) but also “synchronization” of local memory with main memory.
  3. Finally, about the serial access to getReadLock() itself. It’s true that only one thread can be inside getReadLock() at any time; but if the loop condition on the second line of the listing is false (no write is in progress and the number of waiting writers is below the limit), the thread is out in a flash and any other reader can then get the read lock. This scheme is suited to applications that spend considerable time reading the resources and performing calculations based on them. Any number of reader threads can hold the read lock, and thus access the resource for reading, at the same time.
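
The second point can be sketched in code. In the hypothetical class BarrierDemo below (all names are illustrative), the field data is written outside any synchronized block, yet the reader is guaranteed to see the new value because it reads data only after passing through a synchronized block on the same lock that the writer released. In the terms of the current Java memory model, the guarantee is stated as a happens-before relation between an unlock and a later lock of the same monitor, so the sketch assumes both threads synchronize on the same object.

```java
// Hypothetical example: a plain field made visible by crossing a synchronized block.
class BarrierDemo {
    private final Object lock = new Object();
    private boolean ready = false;            // only touched while holding lock
    private int data = 0;                     // plain field, written outside synchronized

    /** Returns the value of data as seen by the reading thread. */
    static int run() {
        BarrierDemo d = new BarrierDemo();
        Thread writer = new Thread(() -> {
            d.data = 42;                      // ordinary write, no lock held
            synchronized (d.lock) {
                d.ready = true;               // leaving this block publishes the earlier write
            }
        });
        writer.start();
        while (true) {
            synchronized (d.lock) {           // entering the block refreshes the reader's view
                if (d.ready) {
                    break;
                }
            }
            Thread.yield();
        }
        return d.data;                        // read after the barrier
    }
}
```

Note that the synchronized blocks never mention data, which mirrors the way getReadLock() never mentions the guarded resource.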

Note that if a write is in progress, or the number of waiting writers has reached the stipulated maximum, the reader thread goes into the waiting state after giving up the lock (on the RWLock object).

In Part 2, we’ll address the releaseReadLock() method and the writer lock and unlock, run our example program, and clarify some common misunderstandings with regard to multithreading.
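
To make the usage concrete before the remaining methods are presented, here is a minimal, self-contained sketch of how such a lock might fit together. Only getReadLock() follows the listing above. The class name SketchRWLock, the constructor, and the bodies of releaseReadLock(), getWriteLock(), releaseWriteLock(), and activeReaders() are assumptions made for illustration and should not be read as the RWLock.java implementation.

```java
// Sketch only: getReadLock() follows the article's listing; every other member is an assumption.
class SketchRWLock {
    private final int maxWritersWaiting;
    private int readers = 0;
    private int writersWaiting = 0;
    private boolean writingInProgress = false;

    SketchRWLock(int maxWritersWaiting) {
        this.maxWritersWaiting = maxWritersWaiting;
    }

    synchronized void getReadLock() {
        while (writingInProgress || writersWaiting >= maxWritersWaiting) {
            try { wait(); } catch (InterruptedException ie) { /* ignored, as in the listing */ }
        }
        readers++;
    }

    synchronized void releaseReadLock() {     // assumed implementation
        readers--;
        notifyAll();                          // a waiting writer may now proceed
    }

    synchronized void getWriteLock() {        // assumed implementation
        writersWaiting++;
        while (writingInProgress || readers > 0) {
            try { wait(); } catch (InterruptedException ie) { /* ignored */ }
        }
        writersWaiting--;
        writingInProgress = true;
    }

    synchronized void releaseWriteLock() {    // assumed implementation
        writingInProgress = false;
        notifyAll();                          // wake waiting readers and writers
    }

    synchronized int activeReaders() {        // hypothetical helper for the demo
        return readers;
    }

    /** Takes two read locks in a row, standing in for two reader threads. */
    static int demo() {
        SketchRWLock lock = new SketchRWLock(1);
        lock.getReadLock();
        lock.getReadLock();                   // second reader is admitted without waiting
        int concurrentReaders = lock.activeReaders();
        lock.releaseReadLock();
        lock.releaseReadLock();
        lock.getWriteLock();                  // no readers left, so the writer gets in
        lock.releaseWriteLock();
        return concurrentReaders;
    }
}
```

In real use, a reader would typically call releaseReadLock() in a finally block so that the lock is released even if the read throws. Note also that in this sketch a maxWritersWaiting of 0 would make every reader wait forever, since the loop condition in getReadLock() would always be true.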

About the Author

Nasir Khan is a Sun Certified Java programmer, with a B.E. in electrical engineering and a master’s degree in systems. His areas of interest include Java applications, EJB, RMI, CORBA, JDBC and JFC. He is presently working with BayPackets Inc. in high-technology areas of telecom software.
