Concurrency

Item78: Synchronize access to shared mutable data

The synchronized keyword ensures that only a single thread can execute a method or block at one time. Not only does synchronization prevent threads from observing an object in an inconsistent state, but it ensures that each thread entering a synchronized method or block sees the effects of all previous modifications that were guarded by the same lock.

The language specification guarantees that reading or writing a variable is atomic unless the variable is of type long or double.

Synchronization is required for reliable communication between threads as well as for mutual exclusion.

Because read and write of boolean variable is atomic, it is recommended to stop one thread from another by having the first thread poll a boolean field that is initially false but can be set to true by the second thread to indicate that the first thread is to stop itself. Though it can be tricky:

public class StopThread {
  private static boolean stopRequested;
  
  public static void main(String[] args)
    throws InterruptedException {
      Thread backgroundThread = new Thread(() -> {
        int i = 0;
        while (!stopRequested)
        	i++;
  		});
      backgroundThread.start();
    
      TimeUnit.SECONDS.sleep(1);
      stopRequested = true;
  }
}

This program is expected to terminate after about 1 second, but it will not. The problem is that in the absence of synchronization, there is no guarantee as to when, if ever, the background thread will see the change in the value of stopRequested made by the main thread. The JVM could simply transform this code:

while (!stopRequested)
	i++;

into this:

if (!stopRequested)
  while (true)
  	i++;

This optimization is known as hoisting. To avoid this, we can use two synchronized methods: read and write the value of the boolean variable. Synchronization is not guaranteed to work unless both read and write operations are synchronized. Also, for these atomic operations, we could use violatile modifier to perform no mutual exclusion. But for some non-atomic operations, violatile does not guarantee consistency with multi-threads. Operations happened between two operations belonging to one operator would lead to this kind of inconsistency, like ++;

The best way to avoid the problems discussed in this item is not to share mutable data, and we should confine mutable data to a single thread.

Item79: Avoid excessive synchronization

To avoid liveness and safety failures, never cede control to the client within a synchronized method or block. In other words, inside a synchronized region, do not invoke a method that is designed to be overridden, or one provided by a client in the form of a function object. The disastrous results include ConcurrentModificationException and deadlocks.

The libraries provide a concurrent collection (Item 81) known as CopyOnWriteArrayList that is tailor-made for moving the alien method invocations out of the synchronized block. An alien method invoked outside of a synchronized region is known as an open call. Besides preventing failures, open calls can greatly increase concurrency.

As a rule, you should do as little work as possible inside synchronized regions. Obtain the lock, examine the shared data, transform it as necessary, and drop the lock. The performance of synchronization is another factor required concerning, which is preferred to be avoided, though the cost has been optimized long ago.

There are two options to write a mutable class:

  • omit all synchronization and allow the client to synchronize externally if concurrent use is desired
  • synchronize internally, making the class thread-safe

We should choose the latter option only if we can achieve significantly higher concurrency with internal synchronization than we could by having the client lock the entire object externally. If we do synchronize your class internally, we can use various techniques to achieve high concurrency, such as lock splitting, lock striping, and nonblocking concurrency control.

If a method modifies a static field and there is any possibility that the method will be called from multiple threads, we must synchronize access to the field internally.

Item80: Prefer executors, tasks, and streams to threads

java.util.concurrent contains an Executor Framework, which is a flexible interface-based task execution facility. This is now the way to create a work queue:

ExecutorService exec = Executors.newSingleThreadExecutor();

and submit a runnable for execution:

exec.execute(runnable);

and tell the executor to terminate gracefully:

exec.shutdown();

We can also call a different static factory to create a thread pool to have more than one thread to process requests from the queue. The java.util.concurrent.Executors class contains static factories that provide most of the executors we will ever need.

For a small program, or a lightly loaded server, Executors.newCachedThreadPool is generally a good choice. But in a heavily loaded production server, you are much better off using Executors.newFixedThreadPool, which gives you a pool with a fixed number of threads, or using the ThreadPoolExecutor class directly, for maximum control.

Not only should we refrain from writing your own work queues, but we should generally refrain from working directly with threads.

Item81: Prefer concurrency utilities to wait and notify

There is far less reason to use wait and notify. Given the difficulty of using wait and notify correctly, you should use the higher-level concurrency utilities instead. The higher-level utilities in java.util.concurrent fall into three categories: the Executor Framework, which was covered briefly in Item 80; concurrent collections; and synchronizers.

The concurrent collections are high-performance concurrent implementations of standard collection interfaces such as List, Queue, and Map. These implementations manage their own synchronization internally, so that it is impossible to exclude concurrent activity from a concurrent collection; locking it will only slow the program.

Concurrent collection interfaces were outfitted with state-dependent modify operations, which combine several primitives into a single atomic operation.

Some of the collection interfaces were extended with blocking operations, which wait (or block) until they can be successfully performed.

Synchronizers are objects that enable threads to wait for one another, allowing them to coordinate their activities. The most commonly used synchronizers are CountDownLatch and Semaphore. Less commonly used are CyclicBarrier and Exchanger. The most powerful synchronizer is Phaser.

Sometimes we have to use wait and notify, and there is the standard idiom for using the wait method:

synchronized (obj) {
  while (<condition does not hold>)
  	obj.wait(); // (Releases lock, and reacquires on wakeup)
  ... // Perform action appropriate to condition
}

Always use the wait loop idiom to invoke the wait method; never invoke it outside of a loop. Testing the condition before waiting and skipping the wait if the condition already holds are necessary to ensure liveness. And testing the condition after waiting and waiting again if the condition does not hold are necessary to ensure safety.

A related issue is whether to use notify or notifyAll to wake waiting threads. Using notifyAll would guarantee that we could wake the threads that need to be awakened. As an optimization, we may choose to invoke notify instead of notifyAll if all threads that could be in the wait-set are waiting for the same condition and only one thread at a time can benefit from the condition becoming true.

But as placing the wait invocation in a loop protects against accidental or malicious notifications on a publicly accessible object, using notifyAll in place of notify protects against accidental or malicious waits by an unrelated thread.

Item82: Document thread safety

The presence of the synchronized modifier in a method declaration is an implementation detail, not a part of its API. It does not reliably indicate that a method is thread-safe.

To enable safe concurrent use, a class must clearly document what level of thread safety it supports. The following list summarizes levels of thread safety:

  • Immutable—Instances of this class appear constant. No external synchronization is necessary.
  • Unconditionally thread-safe—Instances of this class are mutable, but the class has sufficient internal synchronization that its instances can be used concurrently without the need for any external synchronization.
  • Conditionally thread-safe—Like unconditionally thread-safe, except that some methods require external synchronization for safe concurrent use.
  • Not thread-safe—Instances of this class are mutable. To use them concurrently, clients must surround each method invocation (or invocation sequence) with external synchronization of the clients' choosing.
  • Thread-hostile—This class is unsafe for concurrent use even if every method invocation is surrounded by external synchronization. Thread hostility usually results from modifying static data without synchronization.

Documenting a conditionally thread-safe class requires care. We must indicate which invocation sequences require external synchronization, and which lock (or in rare cases, locks) must be acquired to execute these sequences.

The description of a class's thread safety generally belongs in the class's doc comment, but methods with special thread safety properties should describe these properties in their own documentation comments.

Lock fields should always be declared final. This prevents you from inadvertently changing its contents, and minimize the mutability of the lock field.

The private lock object idiom can be used only on unconditionally thread-safe classes.

Item83: Use lazy initialization judiciously

Lazy initialization is the act of delaying the initialization of a field until its value is needed. This technique is applicable to both static and instance fields.

Lazy initialization is a double-edged sword. It decreases the cost of initializing a class or creating an instance, at the expense of increasing the cost of accessing the lazily initialized field. That said, lazy initialization has its uses. If a field is accessed only on a fraction of the instances of a class and it is costly to initialize the field, then lazy initialization may be worthwhile.

In the presence of multiple threads, lazy initialization is tricky. If two or more threads share a lazily initialized field, it is critical that some form of synchronization be employed, or severe bugs can result.

Under most circumstances, normal initialization is preferable to lazy initialization. If we use lazy initialization to break an initialization circularity, use a synchronized accessor:

private FieldType field;

private synchronized FieldType getField() {
  if (field == null)
  	field = computeFieldValue();
  return field;
}

If we need to use lazy initialization for performance on a static field, use the lazy initialization holder class idiom.

// Lazy initialization holder class idiom for static fields
private static class FieldHolder {
	static final FieldType field = computeFieldValue();
}

private static FieldType getField() { return FieldHolder.field; }

The beauty of this idiom is that the getField method is not synchronized and performs only a field access, so lazy initialization adds practically nothing to the cost of access. A typical VM will synchronize field access only to initialize the class. Once the class is initialized, the VM patches the code so that subsequent access to the field does not involve any testing or synchronization.

If you need to use lazy initialization for performance on an instance field, use the double-check idiom, which avoids the cost of locking when accessing the field after initialization.

// Double-check idiom for lazy initialization of instance fields
private volatile FieldType field;
private FieldType getField() {
  FieldType result = field;
  if (result == null) { // First check (no locking)
    synchronized(this) {
      if (field == null) // Second check (with locking)
      	field = result = computeFieldValue();
    }
  }
  return result;
}

So check twice, the first time we need to check whether the filed has been initialized. If not, we need to lock it and check the field again. This is to avoid the field has been initialized between the first check and the intended initialization.

As for static fields, the lazy initialization holder class idiom is a better choice.

If the field can tolerate repeated initialization, we could abandon the second check and this is called singlecheck idiom. We can even omit the volatile modifier, if we can tolerate that each thread recalculates the value of a field. This variant is known as the racy single-check idiom. It speeds up field access on some architectures, at the expense of additional initializations (up to one per thread that accesses the field).

Item84: Don't depend on the thread scheduler

Any program that relies on the thread scheduler for correctness or performance is likely to be nonportable. Because the policy of different operating systems vary.

The best way to write a robust, responsive, portable program is to ensure that the average number of runnable threads is not significantly greater than the number of processors.

The main technique for keeping the number of runnable threads low is to have each thread do some useful work, and then wait for more. Threads should not run if they aren't doing useful work, like what thread pools do.

Threads should not busy-wait, repeatedly checking a shared object waiting for its state to change. This makes the program vulnerable to the vagaries of the thread scheduler, and increases the load on the processor.

When faced with a program that barely works because some threads aren't getting enough CPU time relative to others, resist the temptation to "fix" the program by putting in calls to Thread.yield. Thread.yield has no testable semantics. A better course of action is to restructure the application to reduce the number of concurrently runnable threads.

A related technique, to which similar caveats apply, is adjusting thread priorities. Thread priorities are among the least portable features of Java. So never use it.