Thursday, August 28, 2008

Concurrency – Past and Present

Java Concurrency in PracticeAt QCON London 2008, Brian Geotz gave a talk on concurrency. Brian starts by mentioning that concurrency is hard. It is unnatural, error prone and untestable, hard for most programmers to use. He then explains the basics concepts of concurrency. A thread is a sequential process with its own program counter and call stack. Thread share VM wide resources such as memory, file handles, and security credentials. The pros of threads is fine grained data sharing between threads. However, that is also its disadvantage. Threads execute concurrently unless you use explicit locking to ensure that threads take turns accessing critical resources (synchronize key word in java).

If mutable data is shared between threads, all access requires synchronization. Failing to follow this rule results in data race which results in reading the wrong value or writing the wrong value. This generally occurs whenever there is a if something then do something or ready-modify-write. In the 1st case, another thread might change the value between the time we do the check and the time we call the method. In the second case, 1 statement is actually 3 statements like x++. Again here another thread might read the same value and then increment it by just 1 when it should have gotten incremented by 2.

Adding synchronized on methods in the class that modify data is a start (account with debit/credit), but then we have to be aware of the locking mechanism when composing operations in other classes (accountManager with transfer).

Programmers have to specify which operations are required to be atomic. But when doing so, this might lead to deacklock. Example, transferMoney(me, you, 100) and transferMoney(you, me, 50). One way to deal with this, is to induce a lock ordering. Example, always acquire lock 1st on the smaller value.

If from

synchronize(from) {

synchronize(to) {

}

}


Next Brian explains how this is getting more and more complicated. The main reason is because there is a fundamental tension between concurrency and OO design. OO encourage encapsulation and hiding of implementation details. OO also encourages composition, but composing thread safe objects requires knowing how they implement locking in order to participate in their locking protocols and know how to avoid deadlocks. However, the language is hiding the implementation details.

Next Brian offers some solutions to this problem. One solution is software transactional memory (STM). Explicit locks are replaced by transactional boundaries that define atomic blocks. The VM then provides locking nesting semantics and choose a locking strategy. This is under research and promising but not yet available.

Threads and locks are just one model for concurrency. Lock based concurrency rules hold locks when accessing shared mutable state and hold locks for the duration of atomic operations. Alternatives are don’t mutate state or don’t share state. Functional languages like Haskell or JOCaml have no mutable state. In java, we should get in the habit of making everything final unless you have a clear need to make it mutable. In Erlang everything is an Actor (like a lightweight thread). It has a designated behavior for when a message is received. No shared state. Scala has a similar actor library.

Brian concludes that in java, we can try to restore predictability by limiting concurrent interactions to well defined points, limiting sharing, limiting immutability. Concurrency is hard, so minimize the amount of code that has to deal with concurrency (isolate concurrency in concurrent components such as blocking queues and isolate code that accesses shared state in frameworks), use immutable objects wherever you can. Sometimes it is cheaper to share a non thread safe object by copying than to make it thread safe.


This presentation is available on InfoQ at http://www.infoq.com/presentations/goetz-concurrency-past-present