
Understanding Jitter Backoff: A Beginner's Guide
Traditional exponential backoff can lead to a "thundering herd" problem. Adding jitter to the backoff helps address this issue.
Why Retries Are Necessary
Network failures are inevitable, and they can happen for many reasons. Client applications must implement retry mechanisms to handle them.
A good way to implement retries is exponential backoff with jitter.
What is Exponential Backoff?
In exponential backoff, we increase the delay between retry attempts exponentially.
let delay = 5000 // initial delay in milliseconds
let timerId = setTimeout(function request() {
  const requestSuccess = Math.random() > 0.7 // simulate 30% success rate
  if (!requestSuccess) {
    delay *= 2 // double the delay after each failure
    timerId = setTimeout(request, delay) // retry request
  }
  else {
    clearTimeout(timerId) // stop retries on success
  }
}, delay)
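To make the growth concrete, here is a minimal sketch that only computes the delay schedule. The helper name `backoffDelay` and the 60-second cap are assumptions added for illustration; the snippet above has no upper limit on the delay.

```javascript
// Hypothetical helper: delay before retry number `attempt` (0-based),
// doubling from `baseMs` and never exceeding `capMs` (the cap is an assumption).
function backoffDelay(attempt, baseMs = 5000, capMs = 60000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Delays for the first six attempts.
const schedule = Array.from({ length: 6 }, (_, attempt) => backoffDelay(attempt));
console.log(schedule); // 5000, 10000, 20000, 40000, 60000, 60000
```

Capping the delay is a common choice because pure doubling quickly produces waits that are longer than most users or callers will tolerate.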
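However, exponential backoff on its own can still lead to the thundering herd problem, because clients that fail at the same moment compute identical delays and retry in lockstep.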
Thundering Herd Problem
In a traditional backoff strategy, a client that encounters a failure (such as an unavailable server) waits and then retries. In a naive approach, all clients retry at the same time after a fixed interval, which creates the thundering herd problem.
A simple scenario:
Imagine 100 clients trying to access a service that temporarily goes down. With a traditional fixed backoff:
- All 100 clients detect the service failure and wait exactly 5 seconds.
- After 5 seconds, all 100 clients attempt to reconnect simultaneously, which can cause further service instability and trigger another round of failures❗
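The scenario above can be sketched as a tiny simulation. This is only an illustration under simple assumptions: every client fails at time 0, and the helper `peakRetriesPerBucket` (a hypothetical name) groups retry times into 100 ms buckets to find the busiest moment.

```javascript
// Count how many clients retry inside each time bucket and return the
// size of the busiest bucket.
function peakRetriesPerBucket(retryTimesMs, bucketMs = 100) {
  const counts = new Map();
  for (const t of retryTimesMs) {
    const bucket = Math.floor(t / bucketMs);
    counts.set(bucket, (counts.get(bucket) || 0) + 1);
  }
  return Math.max(...counts.values());
}

const CLIENTS = 100;
const FIXED_DELAY_MS = 5000;

// Every client fails at t = 0 and waits exactly the same fixed delay.
const fixedRetryTimes = Array.from({ length: CLIENTS }, () => FIXED_DELAY_MS);

console.log(peakRetriesPerBucket(fixedRetryTimes)); // 100: every client lands in the same bucket
```

The recovering service sees the full load of 100 requests in a single instant instead of a gradual trickle.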
Jitter, on the other hand, adds some randomness to the retry attempts.
What is Jitter?
Jitter adds a random factor to the retry intervals. This prevents synchronization (multiple clients no longer retry at the same moment) and improves system stability (recurring peak-load patterns are avoided).
Let's add jitter to our code example:
let delay = 5000 // initial delay in milliseconds
let timerId = setTimeout(function request() {
  const requestSuccess = Math.random() > 0.7 // simulate 30% success rate
  if (!requestSuccess) {
    const jitter = Math.random() * 1000 // random extra delay of up to 1 second
    delay = delay * 2 + jitter // double the delay, then add the jitter
    timerId = setTimeout(request, delay) // retry request
  }
  else {
    clearTimeout(timerId) // stop retries on success
  }
}, delay)
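The example above adds a small random amount on top of the doubled delay. Another common variant, often called "full jitter", instead picks a random delay anywhere between 0 and the capped exponential delay. The sketch below shows that variant with a retry limit; the names `fullJitterDelay`, `retryWithJitter`, and `sleep`, as well as the default base, cap, and attempt count, are assumptions chosen for illustration.

```javascript
// Hypothetical helper: "full jitter" picks a random delay between 0 and the
// capped exponential delay for this attempt (0-based).
function fullJitterDelay(attempt, baseMs = 1000, capMs = 30000, random = Math.random) {
  const exponential = Math.min(capMs, baseMs * 2 ** attempt);
  return random() * exponential;
}

// Promise-based wrapper around setTimeout.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retry an async operation up to `maxAttempts` times, sleeping a jittered
// delay between attempts. `operation` is any function that returns a promise.
async function retryWithJitter(operation, maxAttempts = 5, wait = sleep) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Give up after the last allowed attempt and surface the error.
      if (attempt === maxAttempts - 1) throw error;
      await wait(fullJitterDelay(attempt));
    }
  }
}
```

A caller would wrap its own request function, for example `retryWithJitter(() => sendRequest())`, where `sendRequest` is a placeholder for whatever promise-returning call the application makes. Limiting the number of attempts keeps a client from retrying forever against a service that stays down.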
Conclusion:
Exponential backoff combined with jitter is a crucial technique for building resilient and performant distributed systems. It allows client applications to handle failures gracefully without overwhelming the system, leading to a smoother and more reliable user experience.
5 Questions and Answers about this topic
1. What is the thundering herd problem?
A scenario where multiple clients simultaneously retry connecting to a service after a failure, potentially overwhelming the recovering system.
2. What is a backoff strategy?
A retry strategy where clients progressively increase wait times between reconnection attempts to reduce system load and avoid immediate re-failures.
3. How does jitter help?
Jitter adds randomness to retry intervals, spreading out reconnection attempts and preventing simultaneous reconnections from multiple clients.
4. What is exponential backoff with jitter?
A retry pattern where wait times double with each attempt (e.g., 1s, 2s, 4s, 8s) before adding jitter to further randomize intervals.
5. Why is randomization important?
Randomization prevents synchronized retry attempts, reduces network congestion, and gives systems a better chance to recover without additional sudden load spikes.