Widhian Bramantya

Offset Management and Consumer Groups in Kafka

Posted on September 22, 2025 by admin

When talking about Apache Kafka, two very important concepts are offsets and consumer groups. These two work together to make sure messages are read correctly, even when there are many consumers.

What is an Offset?

  • Each message inside a partition has a unique number called an offset.
  • Offsets start from 0 and increase by 1 for each new message.
  • Think of an offset as a bookmark in a log file.

Example:

  • Partition orders-0 has messages:
    • Offset 0: Order A
    • Offset 1: Order B
    • Offset 2: Order C

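If a consumer has read up to offset 1, the next message it reads will be offset 2. Kafka records a consumer's position as the offset of the next message to read, so the saved position for this consumer would be 2.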
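The bookmark idea can be sketched in a few lines of Python. This is a toy in-memory model, not the Kafka client API: the `PartitionLog` class and its `append` and `read_from` methods are hypothetical names invented for illustration, and the list index stands in for the offset.

```python
class PartitionLog:
    """Toy in-memory stand-in for one Kafka partition (not a Kafka client API)."""

    def __init__(self, name):
        self.name = name
        self.records = []  # the list index doubles as the offset

    def append(self, value):
        """Append a record and return the offset assigned to it."""
        self.records.append(value)
        return len(self.records) - 1

    def read_from(self, offset):
        """Return (offset, value) pairs starting at the given offset."""
        return list(enumerate(self.records[offset:], start=offset))


orders_0 = PartitionLog("orders-0")
for order in ["Order A", "Order B", "Order C"]:
    orders_0.append(order)

last_read = 1                # the consumer has read offsets 0 and 1
next_offset = last_read + 1  # so the next record to fetch is offset 2
remaining = orders_0.read_from(next_offset)
print(remaining)
```

Here `remaining` holds only the record at offset 2, which matches the example above.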

Why Do We Need Offsets?

  • Offsets let consumers remember where they stopped.
  • If a consumer crashes, it can continue reading from the last saved offset.
  • This avoids reading the same messages again and again (unless we want to replay them).
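A rough sketch of resuming after a crash, again as a plain Python simulation rather than real Kafka code. The `consume` function and its `crash_after` parameter are assumptions made up for this example. The saved offset is the offset of the next record to read.

```python
def consume(records, committed, crash_after=None):
    """Process records starting at the committed offset.

    Returns (processed_values, new_committed_offset).
    crash_after is a hypothetical knob: stop after that many records.
    """
    processed = []
    position = committed
    while position < len(records):
        if crash_after is not None and len(processed) == crash_after:
            break  # simulated crash; everything so far was committed
        processed.append(records[position])
        position += 1  # this toy model commits after every record
    return processed, position


log = ["Order A", "Order B", "Order C", "Order D"]

# First run crashes after two records; the saved offset is 2.
first_run, saved = consume(log, committed=0, crash_after=2)

# After a restart the consumer resumes at the saved offset.
second_run, saved = consume(log, committed=saved)

# A deliberate replay simply starts again from offset 0.
replay, _ = consume(log, committed=0)
```

The second run picks up at Order C without touching Order A or Order B again, while the replay reads all four records because it ignores the saved offset on purpose.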

What is a Consumer Group?

A consumer group is a set of consumers that share the same group ID (the group.id setting) and work together to read a topic.

  • Each consumer in the group gets part of the work.
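  • Kafka makes sure that each partition is read by only one consumer in the group at a time. One consumer can own several partitions, and if the group has more consumers than partitions, the extra consumers stay idle.
  • If a consumer joins or leaves the group, Kafka rebalances the work by reassigning partitions.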

Example:

  • Topic orders has 3 partitions: orders-0, orders-1, orders-2.
  • Consumer Group billing has 3 consumers:
    • Consumer 1 reads orders-0
    • Consumer 2 reads orders-1
    • Consumer 3 reads orders-2

If Consumer 2 crashes, Kafka will give orders-1 to another consumer in the group.
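The assignment and the rebalance after a crash can be sketched as below. This is a simplified round-robin model, not the algorithm of any real Kafka assignor; the `assign` function and the consumer names are hypothetical.

```python
def assign(partitions, consumers):
    """Spread partitions over consumers round-robin.

    A simplified stand-in for Kafka's assignors: every partition gets
    exactly one owner within the group.
    """
    if not consumers:
        return {}
    ordered = sorted(consumers)
    assignment = {consumer: [] for consumer in ordered}
    for i, partition in enumerate(sorted(partitions)):
        owner = ordered[i % len(ordered)]
        assignment[owner].append(partition)
    return assignment


partitions = ["orders-0", "orders-1", "orders-2"]
group = ["consumer-1", "consumer-2", "consumer-3"]

before = assign(partitions, group)

# consumer-2 crashes, so the group rebalances with the two survivors.
group.remove("consumer-2")
after = assign(partitions, group)

# A fourth consumer in a three-partition topic would get nothing to do.
crowded = assign(partitions, ["c1", "c2", "c3", "c4"])
```

In this sketch orders-1 moves to consumer-3 after the crash. Because the toy recomputes everything from scratch, orders-2 also moves to consumer-1; real assignors can be configured to keep such extra movement small.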

Offset Management

Committed offsets are stored in an internal Kafka topic called __consumer_offsets, tracked per consumer group, topic, and partition.
There are two main strategies for committing them:

  1. Automatic Commit (Auto-Commit)
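    • The consumer client commits offsets automatically at regular intervals (enable.auto.commit=true, every 5 seconds by default via auto.commit.interval.ms).
    • Easy to use, but a crash can cause duplicates (messages processed after the last commit are read again) or message loss (an offset is committed before its message has been fully processed, for example when processing runs in another thread).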
  2. Manual Commit
    • Application decides when to commit offsets.
    • Safer because you can commit only after processing is done, so a crash leads to a re-read instead of a lost message.
    • Gives more control, but you need extra code.
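The difference between losing and duplicating a message comes down to whether the offset is committed before or after processing. The simulation below is a minimal sketch of that timing, not real consumer code; the `run` function and its `commit_first` and `crash_at` parameters are hypothetical. Auto-commit can land in either case depending on when the crash hits, while manual commit after processing corresponds to `commit_first=False`.

```python
def run(records, commit_first, crash_at):
    """Simulate one consumer that crashes once, then restarts.

    commit_first=True  -> the offset is committed before processing
    commit_first=False -> the offset is committed after processing
    crash_at is the offset at which the crash hits, between the two steps.
    Returns the list of records that were actually processed.
    """
    committed = 0  # offset of the next record to read
    processed = []
    crashed = False
    while committed < len(records):
        offset = committed  # (re)start from the committed offset
        if commit_first:
            committed = offset + 1
            if offset == crash_at and not crashed:
                crashed = True
                continue  # crash: the record was never processed
            processed.append(records[offset])
        else:
            processed.append(records[offset])
            if offset == crash_at and not crashed:
                crashed = True
                continue  # crash: the commit never happened
            committed = offset + 1
    return processed


records = ["Order A", "Order B", "Order C"]

lost = run(records, commit_first=True, crash_at=1)
duplicated = run(records, commit_first=False, crash_at=1)
```

With the commit first, Order B is skipped after the restart. With the commit last, Order B is processed twice, which is why commit-after-processing code usually needs processing that tolerates repeats.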

Diagram: Offsets and Consumer Groups

flowchart LR
    subgraph Topic["Topic: orders"]
        P0["Partition 0: Offsets 0..100"]
        P1["Partition 1: Offsets 0..100"]
        P2["Partition 2: Offsets 0..100"]
    end

    subgraph Group["Consumer Group: billing"]
        C1[Consumer 1<br/>Offset = 45]
        C2[Consumer 2<br/>Offset = 30]
        C3[Consumer 3<br/>Offset = 60]
    end

    P0 --> C1
    P1 --> C2
    P2 --> C3
Category: Kafka
