Kafka's new queue semantics: when Share Groups help and when to stay on consumer groups
KIP-932 removed the partition ceiling on consumer parallelism. Here is how to decide which workloads benefit and which should not switch.
When a traffic spike hit and consumer lag climbed, the fix looked obvious: add more consumers. The topic had 12 partitions. The consumer group already had 12 active consumers. Adding a 13th did nothing.
This is Kafka's partition ceiling, and it has been there since the beginning. In a traditional consumer group, each partition belongs to exactly one consumer within the group. Maximum active consumers equals maximum partitions. To scale past that, you either re-partition the topic or accept the limit.
Kafka 4.0 introduced Kafka Share Groups, finalised in KIP-932 and shipping at general availability in Kafka 4.2. They allow more consumers than partitions by replacing offset-based acknowledgment with per-record acquisition locks. The mechanical change is significant. Whether to use them, and for which workloads, is more nuanced than most coverage suggests.
The problem consumer groups quietly created
The partition ceiling produced two coping patterns that most large Kafka deployments share.
The first: over-provisioning partitions at topic creation. Teams set 50 or 100 partitions for future scale when current load needed 12. More partitions mean more broker metadata, slower rebalancing, and more coordinator traffic. A common rule of thumb was partition count equals expected peak consumers multiplied by three. That is guesswork dressed up as capacity planning.
The second: consumer group fragmentation. Multiple consumer groups consume the same topic, with the application sharding work between them. This achieves queue-like behaviour by layering application coordination on top of Kafka's model, and it requires careful deduplication logic to avoid double-processing during failover.
Beyond the ceiling, rebalancing itself was operationally costly. Every consumer join or leave triggered a group rebalance. During a rebalance, the group processes no messages. Incremental cooperative rebalancing, introduced in Kafka 2.4, reduced the size of each pause but not the frequency. A rolling deployment of a 20-consumer service could trigger 20 successive rebalances, each lasting hundreds of milliseconds to several seconds depending on partition count and coordinator load.
What Kafka Share Groups are
The change is at the partition level. Instead of assigning entire partitions to consumers, the broker tracks per-record acquisition locks within each share-partition. When a consumer calls poll(), the broker grants a batch of records with acquisition locks. Those records are unavailable to other consumers in the same share group until they are acknowledged or the lock expires.
Acknowledgment has three outcomes. ACCEPT marks a record as successfully processed. RELEASE returns the record to the broker's available pool for another consumer to process. REJECT discards the record permanently, equivalent to routing it to a dead-letter store. The broker handles redelivery: if a lock expires without acknowledgment, the record re-enters the pool automatically.
Adding a consumer to a share group does not trigger a rebalance. There is no partition reassignment. The new consumer starts pulling from the broker's acquisition pool immediately. This is the mechanism that enables elastic scaling: add and remove consumers freely without a coordination pause.
On Confluent Cloud, Share Groups reached general availability in early 2026. On self-managed clusters, they require Kafka 4.2 or later and the broker configuration group.share.enable=true. The API class is KafkaShareConsumer, separate from the existing KafkaConsumer. Existing consumer group consumers on the same cluster are unaffected.
The ordering trade-off
Share Groups do not preserve per-key ordering. This is not a limitation that will be closed in a later release. It is the direct consequence of the mechanism that makes elastic scaling possible.
In a consumer group, partition assignment gives you a hard invariant: all records for a given key arrive at the same consumer in write order. Event sourcing, CDC pipelines, and sequential state machine processing all rely on this guarantee.
Lock-free record distribution and per-partition ordered delivery are structurally incompatible. Per-consumer partition ownership gives you ordering; removing partition ownership to enable elastic scaling removes the guarantee along with it. No configuration option restores both.
When Share Groups help
Work queues with variable processing time
The canonical case: a topic of jobs where some take 100ms and some take 10 seconds. With consumer groups, the slow job blocks its partition: the partition cannot advance until the slow consumer finishes, and other consumers cannot help. With Share Groups, other consumers keep acquiring the remaining records in that partition while the slow job runs, and if its acquisition lock expires the broker redelivers the record to a free consumer. Head-of-line blocking across variable-duration tasks disappears.
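The lock timeout that drives redelivery is a broker-side setting. A sketch, assuming the KIP-932 configuration name `group.share.record.lock.duration.ms` and its documented 30-second default; check the name against your Kafka 4.2 broker configuration reference:

```properties
# server.properties — how long a consumer may hold an acquired record
# before the broker returns it to the pool (KIP-932 default: 30s)
group.share.record.lock.duration.ms=30000
```

Tune this to your slowest legitimate job: too short and healthy-but-slow work gets redelivered and double-processed; too long and a crashed consumer holds records hostage for the full duration.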
Notification and delivery pipelines
Push notifications, webhook delivery, outbound email — workloads where processing order within a user's events is irrelevant to correctness, but volume is bursty. Share Groups allow you to scale consumers up during a spike and back down without a rebalancing event.
Background data enrichment
Enriching records against external APIs, including fraud scoring, CRM lookups, and geocoding, where each record is independent. The bottleneck is usually external API latency and rate limits, not local compute. Share Groups allow more concurrent requests without requiring a higher partition count to support them.
Workloads running on a separate job queue
If you are running RabbitMQ or SQS purely because Kafka could not provide queue semantics, Share Groups remove that reason. Whether consolidating is worth migrating away from an established job queue broker is a separate decision. The option now exists.
| Dimension | Consumer Groups | Share Groups |
|---|---|---|
| Max consumers per partition | 1 (hard ceiling) | Unlimited |
| Consumer scaling | Rebalance required | Add/remove; no rebalance |
| Ordering guarantee | Per-partition, strict | Within poll batch only |
| Acknowledgment unit | Offset commit (group) | Per-record: ACCEPT / RELEASE / REJECT |
| Exactly-once support | Yes, via Transactions API | Not available in Kafka 4.2 |
| Dead-letter handling | Application layer | Built into REJECT acknowledgment |
| Operational maturity | 10+ years; extensive tooling | GA as of 2026; tooling maturing |
| Best for | Ordered streams, CDC, event sourcing | Job queues, notifications, enrichment |
When to stay on consumer groups
Ordered streams and event-sourced workloads
If your application assumes all records for a given key arrive in order, stay on consumer groups. This covers CDC pipelines, event-sourced aggregates, user-activity ledgers, and sequential state machines. Share Groups will break this assumption in ways that produce incorrect application state rather than visible errors. The failure mode is silent data inconsistency.
Exactly-once processing
The Kafka Transactions API, which enables exactly-once semantics through atomic producer-consumer commits with isolation.level=read_committed, is not compatible with Share Groups as of Kafka 4.2. Exactly-once pipelines must remain on consumer groups until transactional support is added to the share group model. That work is on the roadmap but has not shipped.
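For reference, the read side of an exactly-once pipeline stays on the classic KafkaConsumer. A minimal configuration sketch using plain string keys (the group name is illustrative):

```java
import java.util.Properties;

Properties props = new Properties();
props.put("bootstrap.servers", "kafka:9092");
props.put("group.id", "eos-pipeline");
// Only deliver records from committed transactions; aborted writes are filtered out.
props.put("isolation.level", "read_committed");
// Offsets are committed via the producer's sendOffsetsToTransaction(),
// inside the same transaction as the output writes, so auto-commit is off.
props.put("enable.auto.commit", "false");
```

The key point is that both settings belong to the consumer group protocol; there is no share-group equivalent of `isolation.level` as of Kafka 4.2.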
High-throughput streams with stable load
If your consumer group is already one-to-one with partitions and operates near its throughput ceiling without scaling friction, consumer groups carry less per-record overhead. The acquisition lock tracking in Share Groups is efficient, but at very high sustained message rates the broker-side tracking adds overhead. Consumer groups also benefit from a decade of operational tooling — kafka-consumer-groups.sh, Grafana dashboards, lag alerting — whose Share Group equivalents are still maturing.
For contrast, here is the Share Group consumer loop in full:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.AcknowledgeType;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaShareConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka:9092");
props.put(ConsumerConfig.GROUP_ID_CONFIG, "image-processor-group");
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);

// processImage, RetryableException, and PoisonPillException are application-defined.
try (KafkaShareConsumer<String, String> consumer = new KafkaShareConsumer<>(props)) {
    consumer.subscribe(Collections.singleton("image-jobs"));
    while (running) {
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
        for (ConsumerRecord<String, String> record : records) {
            try {
                processImage(record.value());
                consumer.acknowledge(record, AcknowledgeType.ACCEPT);   // done
            } catch (RetryableException e) {
                consumer.acknowledge(record, AcknowledgeType.RELEASE);  // requeue for another consumer
            } catch (PoisonPillException e) {
                consumer.acknowledge(record, AcknowledgeType.REJECT);   // dead-letter
            }
        }
        consumer.commitSync(); // flush the batch's acknowledgments to the broker
    }
}
```

The partition count question
For new topics using Share Groups, partition count should reflect throughput and ordering requirements, not anticipated consumer parallelism. The old heuristic of setting a high partition count to leave room for future consumer scaling no longer applies.
A practical revision: set partition count to match your write throughput ceiling. If a single partition can sustain 50MB/s and your target is 200MB/s, four partitions is the right number. Consumer parallelism beyond that count is free with Share Groups.
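That sizing rule is just ceiling division. A hypothetical helper (the method name and signature are illustrative, not from any Kafka API) makes the revised heuristic concrete:

```java
// Hypothetical sizing helper: partitions follow write throughput, not consumer count.
static int partitionsForThroughput(double targetMBps, double perPartitionMBps) {
    if (perPartitionMBps <= 0) {
        throw new IllegalArgumentException("per-partition throughput must be positive");
    }
    // Round up so the last partial partition's worth of load is still covered.
    return (int) Math.ceil(targetMBps / perPartitionMBps);
}
```

With the article's numbers, `partitionsForThroughput(200, 50)` returns 4. Consumer parallelism beyond that count comes from the share group, not from extra partitions.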
For existing over-provisioned topics: you cannot reduce partition counts on a live Kafka topic without recreating it. If you have 100 partitions on a topic that needed 8, you gain nothing immediately. The recalibration applies to new topics going forward.
Version readiness
Share Groups in Kafka 4.0 were Early Access. Kafka 4.1 moved them to Preview. Kafka 4.2 is the general availability release for self-managed clusters. Confluent Cloud made them generally available in early 2026 with Java client support; non-Java client support is on the near-term roadmap.
To enable Share Groups on a self-managed cluster, set group.share.enable=true in your broker configuration and upgrade to Kafka 4.2 or later. Existing consumer group consumers on the same cluster continue working without change. For Kafka Streams and Kafka Connect users: neither framework currently supports Share Groups. Both continue to use the consumer group protocol.
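As a broker-side sketch, enabling the feature is a single flag, shown here as a server.properties entry:

```properties
# server.properties on each broker (Kafka 4.2 or later)
group.share.enable=true
```

Consumer group clients on the same cluster need no corresponding change; the flag only turns on the share group coordinator and the KafkaShareConsumer protocol paths.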
One thing actually changed
Share Groups solve one specific problem: the hard ceiling on consumer parallelism set by partition count. For workloads that do not require per-key ordering, they remove the need to over-provision partitions and eliminate rebalancing during scaling events.
For ordered streams, exactly-once pipelines, and workloads where the consumer group model has worked without friction, consumer groups remain the simpler and better-tooled choice. Kafka now supports two distinct consumption models. Picking based on what your workload actually needs is the only real decision.