Distributed locking

Distributed locks are primitives that help you control access to the resource that is shared between different processes. In modern cloud-based applications processes are usually distributed between multiple physical machines.

Redis provides a great opportunities for implementing distributed locking. It’s fast (in cloud you can archive sub-millisecond response time), and it provides atomic execution guarantees for LUA scripts.

Redisson locks

The most popular Java Redis client library that implements distributed locking algorithms is Redisson (lock docs).

Mainly Redisson lock implementations rely on Redis’s pubsub mechanism. Basically, their algorithm can be described as following:

  1. Try aquire a lock by executing LUA script
  2. If lock is aquired — return
  3. Else subscribe to the changes to lock object using pubsub mechanism
  4. When the thread gets notified that lock is released, return to step 1 if acquisition process didn’t timeout

Redis pubsub implementation

This approach has one major disadvantage — current Redis pubsub implementation in cluster mode is inefficient. According to docs (link) all pubsub messages are distributed to all cluster nodes. It means that if you have a cluster of 20 shards, and each shard consists of master and slave, message is distributed to 40 different places. If you aquire and release many locks in a short period of time, for example, 1.000.000 locks per minute, inefficient message distribution can saturate your node’s network. One of symptoms of this problem is unexpected raise of cpu to 100%.

Naive pubsub implementation also makes your solution’s Redis not scalable horizontally. If you add more nodes to the cluster, each node will get less hash slots and thus less objects. But still each node will receive all pubsub messages and distribute it’s own messages to all other nodes.

If we have m messages published in a time period, and a cluster has n nodes, then each node will:

  1. Publish m/n messages, and distribute them to all other nodes. Messages sent during a time period — m(n-1)/n = m — m/n. This number grows when n grows, so each node’s output traffic will grow if we add new nodes to the cluster
Sending messages to other nodes
Sending messages to other nodes
Message output from a node
  1. Reseive message from other nodes. Their amout is total messages minus messages published by this node, or, m — m/n. So, the input traffic also grows when we add more nodes to the cluster

Spinlock

RedissonSpinLock is a reentrant implementation of RLock interface. It eliminates the RedissonLock’s major disadvantage. It doesn’t rely on pubsub mechanism. It’s algorithm can be described as following:

  1. Try aquire the lock
  2. If lock is aquired — return
  3. Else wait for some time
  4. Return to step 1 if lock aquisition didn’t timeout

Pros:

  1. As mentioned earlier, it doesn’t depend on pubsub mechanism, so the solution that uses spinlock is scalable horizontally
  2. In large clusters (20+ nodes) it can make each node’s network input and output 90% lower

Cons:

  1. Thread that tries to aquire the lock sleeps for full backoff period despite the fact that lock could already be released. Aquisition time during contention is expected to be higher
  2. Under contention thread frequently wakes up, makes failing aquire attempt and falls asleep again. The amount of Redis requests will be higher.

Backoff

Backoff controlls the amount of time, that lock waits between two tryAquire calls. It’s required to prevent massive spam of failing requests to the Redis.

There are two backoffs provided by default:

  1. ExponentialBackOff — back off period is exponentially increased every time up to maxDelay. To prevent massive amount of simultanious requests from different threads some random value is added to the backoff period.
  2. ConstantBackOff — returns the same back off period each time. In production environment should be used carefully.

Other back off algorithms can be easily implemented by extending appropriate classes.

RedissonSpinLock is a new distributed locking algorithm that eliminates some problems of other RLock implementations and can be used in huge Redis clusters. Maybe it doesn’t fit your needs, but you can still test it.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store