Clojure Concurrency by Rich Hickey

Clojure is a dynamic language, like Python and Ruby, but designed to be as performant as Java.
It is useful in any context where you would use Java.

Fundamentals of Clojure:

  • functional language (immutable persistent data structures: valuable to concurrency in multithreaded context)
  • no mutable local variables (no for looping, with mutating local counter)
  • a Lisp dialect (code is data; it has a reader; the language is defined in terms of the interpretation of data structures, not a stream of characters)
  • not Common Lisp or Scheme (no backwards compatibility with these)
  • hosted and embraces the JVM
  • direct support for concurrency (including data structures), with primitives for concurrency
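
A minimal sketch of the "code is data" point above, using only core functions. The form being read is an arbitrary example; the names form and product-form are made up for illustration.

```clojure
;; The reader turns text into an ordinary data structure (here, a list).
(def form (read-string "(+ 2 3 4)"))

(list? form)    ; the form is a list
(first form)    ; its first element is the symbol +
(eval form)     ; evaluating the data structure runs it as code

;; Because code is data, it can be rewritten with ordinary sequence
;; functions before it is evaluated: swap + for *.
(def product-form (cons '* (rest form)))
(eval product-form)
```

Macros rely on the same property: they receive forms as data and return new forms.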

Features of Clojure:

  • supports dynamic development like Lisp. Not interpreted: always compiled on-the-fly into JVM bytecode and loaded
  • supports metadata as a first-class entity: metadata can be associated with data without affecting equality semantics
  • functions are a first-class data type, and are closures
  • has a functional looping construct (does not have proper tail-call optimisation: Clojure uses JVM calling which does not support this)
  • has extensive destructuring binding system (not just for lists)
  • has Common Lisp style macros
  • has multi-methods, which are a way to do polymorphism without any inheritance
  • has concurrency support
  • has Java interoperability
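
Several of the features above can be sketched in a few lines. This is an illustrative sketch only: the names (tagged, sum-to, describe, area) and the keys (:source, :who, :langs, :shape) are hypothetical and not from the talk.

```clojure
;; Metadata: attached to a value without affecting equality.
(def plain  [1 2 3])
(def tagged (with-meta [1 2 3] {:source "user-input"}))

(= plain tagged)   ; metadata is ignored by =
(meta tagged)      ; the attached map

;; Functional looping: recur rebinds the loop locals instead of mutating
;; a counter, and runs in constant stack space without general TCO.
(defn sum-to [n]
  (loop [i 1 acc 0]
    (if (> i n)
      acc
      (recur (inc i) (+ acc i)))))

;; Destructuring works on maps as well as on sequential things.
(defn describe [{:keys [who langs]}]
  (let [[primary & others] langs]
    (str who " uses " primary " and " (count others) " more")))

;; Multimethods: dispatch on an arbitrary function of the arguments
;; (here the :shape key), with no class hierarchy involved.
(defmulti area :shape)
(defmethod area :circle [{:keys [r]}] (* Math/PI r r))
(defmethod area :rect   [{:keys [w h]}] (* w h))

;; Java interop: call a java.lang.String method directly.
(def shouted (.toUpperCase "clojure"))
```

For example, (sum-to 100) walks the loop 100 times without growing the stack, and (area {:shape :rect :w 2 :h 3}) selects the :rect method purely from the map's contents.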

Mutable objects are very much the wrong way to do most things. Mutable objects are an okay way to do some things. Encapsulation does not fix this.
The mess comes from a network of objects that can change.

The vast majority of programs don't need mutability, and are easier to understand without it. Mutable objects are a disaster once concurrency is added.

Concurrency:

  • things happening at the same time. JVM is capable of simultaneous execution: multiple threads running on separate CPUs at the same time.
  • even without actual concurrency, using threads gives simulated concurrency (interleaved execution). Even on a single CPU, as soon as a program is
    multithreaded it has the problems of concurrency (would like to avoid inconsistent data).
  • Clojure is not a parallel language: the emphasis is on the coordination aspects

What we do now in a mutable object-orientated language:

  • lock it, synchronise. Only one thread has access at a time, all others block
  • problem one: have to remember to lock it, which is manual
  • problem two: relying on convention, since the locking approach has to be decided by the programmer and is not provided for by the language
  • can cause a bottleneck on a multi-CPU machine

Direct references to things that change: stuck with locking and hope for best.

Persistent data structures:

  • a persistent data structure in functional programming is immutable: a modification produces a new structure rather than changing the
    existing one. The old data structure remains accessible.
  • to count as persistent, a collection must keep the big-O performance guarantees of its operations across changes
  • persistent achieved by sharing structure
  • thread safe
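
A small sketch of the behaviour described above, using the built-in vector and map (the names v1, v2, m1, m2 are illustrative):

```clojure
(def v1 [1 2 3])
(def v2 (conj v1 4))        ; a new vector; v1 is not touched

(def m1 {:a 1 :b 2})
(def m2 (assoc m1 :b 20))   ; a new map; m1 is not touched

;; Both versions remain usable, and any thread holding v1 or m1
;; can keep reading it without locks because it can never change.
[v1 v2]
[m1 m2]
```

The new versions are not full copies: internally they share most of their structure with the old ones, which is what keeps the operations cheap.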

Clojure references:

  • have references to things: a type of label which is mutable in a controlled way.
  • the reference is stable, but what it refers to can change. The reference is therefore indirect: a cell that holds a pointer to a thing. The cell
    is the atomic thing that can mutate. Same feel as object-orientated, but the reference is indirect.
  • reference types are the only things in Clojure that can change/mutate.
  • they have concurrency semantics, which means the semantics are automatic and enforced
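
A minimal sketch of an indirect reference, using a Ref (one of Clojure's reference types) that holds an immutable map. The names scores, before and attempt are made up for illustration.

```clojure
(def scores (ref {:alice 1}))   ; the reference itself stays the same

(def before @scores)            ; deref: grab the current immutable value

;; Point the reference at a new value, inside a transaction.
(dosync (alter scores assoc :bob 2))

before     ; the old value is unchanged
@scores    ; the reference now points at the new value

;; The semantics are enforced: changing a Ref outside a transaction
;; throws instead of silently mutating.
(def attempt
  (try
    (alter scores assoc :carol 3)
    (catch IllegalStateException _ :rejected)))
```

Code that captured the old value is never surprised by a change; only code that derefs again sees the new value.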

3 types of mutable references in Clojure:

  • Vars: Isolate changes within threads
  • Refs: Share synchronous coordinated changes between threads
  • Agents: Share asynchronous independent changes between threads

Vars are used to hold global variables and functions. In addition, a Var can be rebound to a new value inside a thread. Once bound, that binding is independent of the bindings in other threads.
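
A sketch of per-thread Var binding, assuming Clojure 1.3 or later, where a Var must be marked ^:dynamic before it can be rebound. The names *level*, current-level and results are hypothetical. A raw java.lang.Thread is used on purpose, as it starts with no thread-local bindings.

```clojure
(def ^:dynamic *level* :global)   ; root binding, seen by every thread

(defn current-level [] *level*)

(def results
  (binding [*level* :thread-local]           ; rebind in this thread only
    (let [other (promise)
          t     (Thread. (fn [] (deliver other (current-level))))]
      (.start t)
      (.join t)
      {:here         (current-level)         ; sees the thread-local binding
       :other-thread @other})))              ; sees the root binding

(current-level)   ; outside the binding form, the root value is back
```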

Making two changes in two places at once needs a transactional system. Refs provide this in Clojure: change more than one thing in a co-ordinated way, with the
changes visible to more than one thread. Refs are the "worldview".

Agents allow things to change asynchronously and independently while still being accessible to multiple threads. Agents are the "workers".

Refs and Transactions:
Similar to a database transaction. A transaction system meant for use inside programs (often called a software transactional memory system, or STM).
Refs can only be changed within a transaction.

All changes are Atomic and Isolated:

  • Either every change to a set of Refs made within a transaction occurs, or none does (Atomic)
  • No transaction sees the effects of any other transaction while it is running (Isolated)

Transactions are speculative and automatically retried. No side-effects are allowed (for example print statements): a transaction may only manipulate Refs.
Transactions are "nestable". If another transaction ("dosync") starts within a parent transaction, the child is absorbed by the parent.
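
A minimal sketch of a coordinated change to two Refs. The account names, amounts and the transfer! function are invented for illustration; raw threads are used only to show several transactions running at once.

```clojure
(def checking (ref 100))
(def savings  (ref 50))

(defn transfer! [from to amount]
  ;; Both alters commit together or not at all. The body may be retried,
  ;; so it must not contain side-effects such as printing.
  (dosync
    (alter from - amount)
    (alter to + amount)))

(transfer! checking savings 30)

;; Ten concurrent transfers of 1 each; conflicting transactions are
;; retried automatically, so no update is lost.
(let [threads (mapv (fn [_] (Thread. (fn [] (transfer! checking savings 1))))
                    (range 10))]
  (doseq [^Thread t threads] (.start t))
  (doseq [^Thread t threads] (.join t)))

;; A nested dosync joins the enclosing transaction, so these two
;; transfers commit as one unit.
(dosync
  (transfer! checking savings 5)
  (transfer! savings checking 5))

;; Reading both Refs inside a transaction gives a consistent snapshot.
(def total (dosync (+ @checking @savings)))
```

Whatever the interleaving, the total across both Refs stays the same, because no thread can observe one half of a transfer without the other.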

Agents:
Agents manage independent state. State changes through actions, which are ordinary functions (state=>new-state).
Actions are dispatched using send or send-off, which return immediately. Actions are queued up. Actions occur asynchronously on thread-pool threads.
Only one action per agent happens at a time.

Used for a looser form of concurrency: updating caches, incrementors, independent workers. High degree of independence.

Under the hood, no more than one action happens per agent at a time. All activities are serialised by the system.

Unlike in systems such as Erlang, an agent's state is always available: any thread can read it directly, without sending a message.
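
A minimal sketch of agents as independent workers. The names counter, after-two and log are illustrative. The final shutdown-agents call is an assumption that this runs as a standalone script, where it lets the JVM exit; at a REPL it would normally be left out.

```clojure
(def counter (agent 0))

(send counter inc)        ; returns immediately; the action is queued
(send counter + 10)       ; an action is (fn [state & args]) => new state

(await counter)           ; block until the actions sent so far have run
(def after-two @counter)  ; state is read directly with deref

;; Many actions sent to one agent are run one at a time, in order,
;; so no increment is lost and no locking is needed.
(dotimes [_ 100] (send counter inc))
(await counter)

;; send-off is the variant intended for actions that may block (I/O etc.).
(def log (agent []))
(send-off log conj "started")
(await log)

(shutdown-agents)         ; standalone scripts only: stop the agent thread pools
```

Reading with @counter never blocks and never needs a reply from the agent; it simply returns whatever state the last completed action produced.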