Two of the most important concepts in Apache Kafka are offsets and consumer groups. Together they make sure messages are read correctly, even when many consumers are involved.
What is an Offset?
- Each message inside a partition has a unique number called an offset.
- Offsets start from 0 and increase by 1 for each new message.
- Think of an offset as a bookmark in a log file.
Example:
- Partition orders-0 has messages:
  - Offset 0: Order A
  - Offset 1: Order B
  - Offset 2: Order C
If a consumer has read up to offset 1, the next message will be offset 2.
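The idea of offsets as positions in a log can be sketched with a small in-memory model. This is only an illustration, not a Kafka client: `PartitionLog`, `append`, and `read_from` are hypothetical names invented for this sketch.

```python
class PartitionLog:
    """Toy in-memory stand-in for a single Kafka partition (not a real client)."""

    def __init__(self, name):
        self.name = name
        self._messages = []

    def append(self, value):
        """Store a message and return the offset it was assigned."""
        self._messages.append(value)
        return len(self._messages) - 1

    def read_from(self, offset):
        """Return (offset, value) pairs starting at the given offset."""
        return list(enumerate(self._messages))[offset:]


log = PartitionLog("orders-0")
for order in ["Order A", "Order B", "Order C"]:
    log.append(order)

# A consumer that has read up to offset 1 continues at offset 2.
next_offset = 1 + 1
print(log.read_from(next_offset))  # [(2, 'Order C')]
```

Because the offset is just the message's position in the partition, the consumer only needs to remember one number per partition to know where it is.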
Why Do We Need Offsets?
- Offsets let consumers remember where they stopped.
- If a consumer crashes, it can continue reading from the last saved offset.
- This avoids reading the same messages again and again (unless we want to replay them).
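A rough simulation of this behavior is shown here, assuming the saved offset is the position of the next message to read. The function `consume` and its `crash_at` parameter are hypothetical and exist only to imitate a crash; a real consumer would save its offset in Kafka rather than in a local variable.

```python
def consume(messages, committed_offset, crash_at=None):
    """Process messages starting at committed_offset.

    Returns (processed_values, new_committed_offset). If crash_at is given,
    the consumer stops before processing that offset, simulating a crash.
    """
    processed = []
    for offset in range(committed_offset, len(messages)):
        if offset == crash_at:
            break  # simulated crash: nothing further is processed or saved
        processed.append(messages[offset])
        committed_offset = offset + 1  # save the position of the next message
    return processed, committed_offset


messages = ["Order A", "Order B", "Order C", "Order D"]

# First run crashes before handling offset 2.
first_run, saved = consume(messages, committed_offset=0, crash_at=2)

# The restarted consumer resumes from the saved offset instead of offset 0.
second_run, saved_after = consume(messages, committed_offset=saved)

# Replaying is still possible by deliberately starting from an earlier offset.
replay, _ = consume(messages, committed_offset=0)
```

Here the second run picks up exactly where the first one stopped, and the replay run shows that re-reading old messages is a choice rather than an accident.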
What is a Consumer Group?
A consumer group is a set of consumers working together.
- Each consumer in the group gets part of the work.
- Kafka makes sure that each partition is read by only one consumer in the group at a time.
- If a new consumer joins, Kafka rebalances the work.
Example:
- Topic orders has 3 partitions: orders-0, orders-1, orders-2.
- Consumer Group billing has 3 consumers:
  - Consumer 1 reads orders-0
  - Consumer 2 reads orders-1
  - Consumer 3 reads orders-2
If Consumer 2 crashes, Kafka will give orders-1 to another consumer in the group.
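The assignment and rebalance in this example can be imitated with a simplified round-robin function. This is a sketch under assumptions: Kafka's real assignment strategies are configurable and more involved, and `assign` and the consumer names are hypothetical. The function also assumes the group has at least one consumer.

```python
def assign(partitions, consumers):
    """Spread partitions over consumers round-robin.

    Every partition gets exactly one owner within the group.
    """
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(sorted(partitions)):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = ["orders-0", "orders-1", "orders-2"]

# Three consumers, three partitions: one partition each.
before = assign(partitions, ["consumer-1", "consumer-2", "consumer-3"])

# consumer-2 crashes, so the work is redistributed over the survivors.
after = assign(partitions, ["consumer-1", "consumer-3"])
```

In this toy version the whole assignment is recomputed after the crash. The key property it preserves is the one described above: no partition is ever owned by two consumers of the same group.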
Offset Management
Offsets are stored in a special Kafka topic called __consumer_offsets.
There are two main strategies:
- Automatic Commit (Auto-Commit)
  - The consumer client automatically commits the offset at regular intervals.
  - Easy to use, but can cause message loss or duplicates if the consumer crashes, because commits are not tied to when processing finishes.
- Manual Commit
  - The application decides when to commit offsets.
  - Safer, because you can commit only after processing is done.
  - Gives more control, but requires extra code.
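Why the timing of a commit matters can be shown with a small simulation. It is a simplified model, not the behavior of any specific client: `run` and its `commit_first` flag are hypothetical, and a commit that lands before processing finishes is only one way the loss case can arise.

```python
def run(messages, start, crash_at, commit_first):
    """Simulate one consumer run that crashes while handling offset crash_at.

    Returns (processed_offsets, committed_offset).
    commit_first=True: the position is saved before the work is done.
    commit_first=False: the position is saved only after the work is done.
    """
    processed, committed = [], start
    for offset in range(start, len(messages)):
        if commit_first:
            committed = offset + 1    # position saved before processing
            if offset == crash_at:
                break                 # crash: message is never processed
            processed.append(offset)
        else:
            processed.append(offset)  # processing happens first
            if offset == crash_at:
                break                 # crash: commit never happens
            committed = offset + 1
    return processed, committed


messages = ["Order A", "Order B", "Order C"]

# Commit before processing: offset 1 is skipped after a restart (loss).
done, committed = run(messages, 0, crash_at=1, commit_first=True)
lost = [o for o in range(committed) if o not in done]

# Commit after processing: offset 1 is handled again after a restart (duplicate).
done2, committed2 = run(messages, 0, crash_at=1, commit_first=False)
repeated = [o for o in done2 if o >= committed2]
```

Committing after processing trades possible message loss for possible duplicates, which is usually easier to handle if the processing step can tolerate seeing the same message twice.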
Diagram: Offsets and Consumer Groups
flowchart LR
  subgraph Topic["Topic: orders"]
    P0["Partition 0: Offsets 0..100"]
    P1["Partition 1: Offsets 0..100"]
    P2["Partition 2: Offsets 0..100"]
  end
  subgraph Group["Consumer Group: billing"]
    C1[Consumer 1<br/>Offset = 45]
    C2[Consumer 2<br/>Offset = 30]
    C3[Consumer 3<br/>Offset = 60]
  end
  P0 --> C1
  P1 --> C2
  P2 --> C3