If your producer pushes 1000 messages/sec and your consumer only processes 100/sec, something has to give. That "something" is backpressure — a signal flowing UPSTREAM from the slow consumer to the fast producer. Get this wrong and you OOM the server or drop messages silently.
The three strategies
BLOCK — the producer stops writing until the consumer catches up. Safest, but couples performance. DROP — discard messages when the buffer fills (head-drop, tail-drop, or priority-aware). BUFFER — accept everything and grow the queue. Eventually OOMs unless bounded.
HTTP/2 flow control (built-in)
Each HTTP/2 stream has a flow-control window. When the consumer doesn't send WINDOW_UPDATE frames, the producer's TCP send buffer fills and writes block. This is BLOCK strategy by default — works correctly if your application reads from the stream regularly.
Application-layer DROP
from collections import deque
from threading import Lock
class DropOldestQueue:
def __init__(self, maxlen=1000):
self.q = deque(maxlen=maxlen) # auto-drops on overflow
self.lock = Lock()
def put(self, msg):
with self.lock:
self.q.append(msg) # silently drops oldest
def get(self):
with self.lock:
return self.q.popleft() if self.q else NoneReactive streams (RxJava / Project Reactor)
Reactive frameworks formalize backpressure with request(N) calls — the subscriber tells the publisher exactly how many items it can take. This gives clean DROP/BUFFER semantics and is the most flexible pattern when consumers vary in speed.
Monitoring is mandatory
Track queue_depth as a gauge, messages_dropped_total as a counter, and consumer_lag_seconds as a histogram. If you can't see backpressure happening, you can't tune it.