Widhian Bramantya

Producers and Consumers: How Data Flows in Kafka

Posted on September 14, 2025 by admin

Batch Consuming in Kafka

Consumers usually read messages in batches (groups of messages) instead of fetching them one at a time. Batching spreads the cost of each network round trip across many messages, which raises throughput.

Normal Batch Flow

  1. Poll: The consumer fetches a batch (e.g., 10 messages) from a partition.
  2. Process: It handles the messages in the batch one by one.
  3. Commit: After the batch is done, it commits the offset of the next message to read (last processed offset + 1).
  4. Resume: When the consumer next starts, it continues from the committed offset.

Example: Batch of 5 Messages

  • Consumer polls offsets [100, 101, 102, 103, 104].
  • Processes all successfully.
  • Commits offset 105 (the last processed offset, 104, plus one: Kafka stores the offset of the next message to read).
  • Next poll starts at offset 105.
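
The flow and the example above can be sketched with a small in-memory simulation. This is only an illustration under stated assumptions: `FakePartition` and `consume_batch` are hypothetical names invented here, not part of any Kafka client, and no broker is involved. Kafka records the offset of the next message to read, so the sketch commits 105 after processing 104.

```python
class FakePartition:
    """In-memory stand-in for one Kafka partition (hypothetical, not a real client)."""

    def __init__(self, first_offset, values):
        self.first_offset = first_offset
        self.values = values
        self.committed = first_offset  # next offset to read for this consumer

    def poll(self, max_records):
        """Return up to max_records (offset, value) pairs starting at the committed offset."""
        start = self.committed - self.first_offset
        chunk = self.values[start:start + max_records]
        return [(self.committed + i, value) for i, value in enumerate(chunk)]

    def commit(self, next_offset):
        """Record the offset of the next message to read."""
        self.committed = next_offset


def consume_batch(partition, max_records, handle):
    """Poll one batch, process each message, then commit once at the end."""
    batch = partition.poll(max_records)
    for offset, value in batch:
        handle(offset, value)
    if batch:
        # Commit the position after the last processed message.
        partition.commit(batch[-1][0] + 1)
    return [offset for offset, _ in batch]


# Ten messages at offsets 100..109, consumed in batches of 5.
partition = FakePartition(100, [f"event-{n}" for n in range(100, 110)])
seen = []
first = consume_batch(partition, 5, lambda offset, value: seen.append(offset))
committed_after_first = partition.committed
second = consume_batch(partition, 5, lambda offset, value: seen.append(offset))

print(first, committed_after_first)  # [100, 101, 102, 103, 104] 105
print(second, partition.committed)   # [105, 106, 107, 108, 109] 110
```

With a real client the same loop shape applies, but polling and committing go over the network and can fail, which is what the next section looks at.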

Negative Case: Error in the Middle

What if something goes wrong while processing the batch?

Example

  • Batch = [100, 101, 102, 103, 104]
  • Consumer processes 100, 101 successfully
  • Error happens at 102 → the app crashes

Effect Depends on Offset Commit Strategy:

  1. Auto Commit (default)
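    • Depending on the client library and commit timing, the offset for the whole batch (105, the position after 104) may already have been committed before processing finished.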
    • On restart, the consumer starts at 105.
    • Messages 102, 103, 104 are skipped → data loss for this consumer.
  2. Manual Commit After Batch
    • Offset is only committed when the whole batch succeeds.
    • On restart, consumer starts again from 100.
    • Messages 100, 101 are reprocessed → possible duplicates.
  3. Manual Commit Per Message
    • Consumer commits after each message.
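    • On crash at 102, messages 100 and 101 are already committed, so the restart continues from 102.
    • No loss and at most one reprocessed message (if the crash falls between processing a message and committing it), but slower, because every message costs a commit round trip.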
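
To make the three outcomes concrete, here is a minimal simulation of the crash at offset 102. It is a sketch, not real client code: `crash_outcome` and the strategy names are hypothetical, and the `"auto"` branch assumes the worst case described above, where the auto commit covers the whole batch before processing finishes. Committed values are "next offset to read", as Kafka stores them.

```python
def crash_outcome(batch, strategy, fail_at):
    """Simulate processing one batch that crashes at offset fail_at.

    Returns where the consumer restarts and which offsets were
    skipped (never processed) or reprocessed (processed twice).
    """
    committed = batch[0]  # position before this batch was polled
    processed = []

    if strategy == "auto":
        # Assumption: auto commit already covered the whole delivered batch.
        committed = batch[-1] + 1

    for offset in batch:
        if offset == fail_at:
            break  # simulated crash: nothing below runs for this batch
        processed.append(offset)
        if strategy == "per_message":
            committed = offset + 1
    else:
        # Reached only when the loop finished without a crash.
        if strategy == "after_batch":
            committed = batch[-1] + 1

    unprocessed = [o for o in batch if o not in processed]
    return {
        "restart_at": committed,
        "skipped": [o for o in unprocessed if o < committed],
        "reprocessed": [o for o in processed if o >= committed],
    }


batch = list(range(100, 105))  # offsets 100..104
for strategy in ("auto", "after_batch", "per_message"):
    print(strategy, crash_outcome(batch, strategy, fail_at=102))
# auto        -> restart_at 105, skipped [102, 103, 104], reprocessed []
# after_batch -> restart_at 100, skipped [], reprocessed [100, 101]
# per_message -> restart_at 102, skipped [], reprocessed []
```

The per-message case shows no duplicates here only because the simulated crash happens during processing; a crash after processing a message but before its commit would cause that one message to be reprocessed.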

Diagram: Batch Consuming with Error

sequenceDiagram
    participant C as Consumer
    participant B as Broker (Partition)

    C->>B: Poll batch [100-104]
    B-->>C: Deliver messages

    C->>C: Process 100 ✅
    C->>C: Process 101 ✅
    C->>C: Process 102 ❌ (Error)

    alt Auto Commit
        Note over C,B: Whole batch already committed (offset 105) → messages 102-104 skipped
    else Manual Commit (after batch)
        Note over C,B: Batch not committed → reprocess 100-104 (duplicates possible)
    else Commit Per Message
        Note over C,B: 100-101 committed (offset 102) → restart from 102
    end
Category: NATS
