A new log replication disaggregation survey post is out!
The Kafka Replication Protocol:
🔹Separation of control plane from data plane.
🔹Role separation with minimal coupling.
🔹Kafka’s alignment with Paxos roles.
https://t.co/noV1kRzaAU
Queues for Kafka is the hottest new feature being discussed right now!
"KIP-932: Queues for Kafka" was announced just 7 days ago.
But what is it?
First - let’s define a queue.
A traditional queue system is one where either:
🔹- many consumers read from the same queue (pub-sub)
🔹- one specific consumer reads from one specific producer (point to point)
The messages are typically stored until they’re consumed once - the queues have a maximum depth.
Kafka has never supported traditional queuing like this.
One of its strengths has precisely been the decoupling between producer and consumer.
A bad consumer has a close-to-zero effect on a producer. (unless it causes Kafka to read from disk and exhaust IOs)
One pain point with this approach is that consumer groups are coupled with the number of partitions in a topic.
If you have a topic with 10 partitions, you cannot scale beyond 10 consumers.
So, people usually over-partition.
But that's very unintuitive for a uniform workload. 🙅♂️
If all your messages are independent work items with no logical grouping, a single queue consumed by a pool of applications is the intuitive solution.
So, KIP-932 proposes a solution with the following benefits:
✅ - the ability for many consumers to read from the same partition
✅ - individual records acknowledgments
✅ - still keep producers and consumers decoupled
✅ - no maximum queue depth
✅ - messages are still retained - so you have the ability to replay
And the following limitations:
🔴 - Ordering is NOT guaranteed. Out-of-order delivery is possible within a partition.
🔴 - No exactly once - it is at least once.
🔴 - Maximum processing delta. A consumer cannot read more than N messages ahead of the slowest one.
How does it do it?
Shared Consumer Groups. ✨
Each broker will be a shared group coordinator for the data partitions it is leading and manage the sharing of reads.
It will keep a sliding window of start <-> end offset for each pair of partition and group.
The records available for consumption will only be those within that offset range, essentially adding a maximum lag between the slowest and fastest consumer. 🛑✋
Consumers from the same shared group can read from the same partition by exclusively reserving a few records (offset range) in the partition via a time-limited acquisition lock.
A consumer can then ack/release/reject the message(s):
🥇 ack - acknowledges successful processing and moves the shared group’s offset progress
🥈 release - unsuccessful processing - retry. Release the record for another delivery.
🥉 reject - unsuccessful processing - abort. Blacklists the record, making it unavailable for another delivery.
☠️ To avoid poison messages - a delivery count is kept per message. When it goes over a maximum retry limit, the message is rejected.
That’s it! Quite the out of the box proposal. 💡
In one sentence, this is a usability feature for un-ordered consumption by an arbitrary number of consumers.
Note that nothing here is set in stone. This proposal is still pending discussion.
I was the first person to reply to the discussion on the mailing list (humblebrag).
The proposal may look quite different by the time it gets in.
What do you think about it?