Widhian Bramantya

Producers and Consumers: How Data Flows in Kafka

Posted on September 14, 2025 by admin

Negative Cases in Producer–Consumer Flow

Even though Kafka is designed to be reliable, things can still go wrong. Here are some common failure scenarios and how Kafka handles them.

1. Producer Failure (Message Not Sent)

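  • Case: The producer sends a message, but the network goes down before the broker’s acknowledgment arrives.
  • Result: The producer doesn’t get an acknowledgment (ack), so it cannot tell whether the broker actually wrote the message.
  • Handling:
    • If retries are enabled, the producer will send the message again.
    • If the first attempt was in fact written, the retry creates a duplicate. The retry goes to the same partition as the original, but Kafka does not remove the duplicate on its own.

This is why idempotent producers (enable.idempotence=true) are recommended in production: the broker uses a producer ID and per-partition sequence numbers to discard retried writes it has already stored.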
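
To make the duplicate problem concrete, here is a minimal sketch in plain Python. It is a toy model, not Kafka client code: the `PartitionLog` class, the `send_with_lost_ack` helper, and the sample value are all made up for illustration, and the deduplication check is a simplified version of what a broker does with sequence numbers.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionLog:
    """Toy model of one partition's log on a broker (hypothetical, not a Kafka API)."""
    records: list = field(default_factory=list)
    last_seq: dict = field(default_factory=dict)  # producer_id -> last appended sequence

    def append(self, producer_id, seq, value, idempotent):
        # With idempotence on, a retry that carries an already-seen sequence
        # number is acknowledged but not written a second time.
        if idempotent and self.last_seq.get(producer_id, -1) >= seq:
            return "duplicate-ignored"
        self.records.append(value)
        self.last_seq[producer_id] = seq
        return "appended"


def send_with_lost_ack(log, idempotent):
    """Send one record, pretend the ack was lost, then retry the same record."""
    log.append(producer_id=1, seq=0, value="order-42", idempotent=idempotent)
    # The producer never saw the ack, so it retries with the same sequence number.
    return log.append(producer_id=1, seq=0, value="order-42", idempotent=idempotent)


plain = PartitionLog()
send_with_lost_ack(plain, idempotent=False)

safe = PartitionLog()
retry_result = send_with_lost_ack(safe, idempotent=True)

print(len(plain.records))  # 2: the retry was stored as a duplicate
print(len(safe.records))   # 1: the retry was recognized and dropped
```

In this model the retry is harmless only because the broker remembers the last sequence number per producer, which is the idea behind the idempotent producer.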

2. Broker Failure (Leader Partition Down)

  • Case: The broker that is the leader for a partition crashes.
  • Result: Producers and consumers cannot write or read from that partition temporarily.
  • Handling:
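    • Kafka will elect a new leader from the in-sync replicas (ISR), the replicas that were fully caught up with the old leader.
    • Producers/consumers refresh their metadata and automatically reconnect to the new leader.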
    • Some delay (latency spike) may happen during re-election.
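
The election step can be sketched as a small function. This is a simplified model under stated assumptions, not the controller's real code: `elect_leader` and the broker IDs are hypothetical, and the rule used here (take the first assigned replica that is still in sync and alive) ignores details such as unclean leader election.

```python
def elect_leader(replicas, in_sync, failed_broker):
    """Return the new leader's broker ID, or None if no in-sync replica is left.

    replicas      -- broker IDs that hold a copy of the partition, in assignment order
    in_sync       -- set of broker IDs that were fully caught up with the old leader
    failed_broker -- broker ID of the leader that just crashed
    """
    candidates = [b for b in replicas if b in in_sync and b != failed_broker]
    return candidates[0] if candidates else None


replicas = [1, 2, 3]  # broker 1 was the leader
in_sync = {1, 3}      # broker 2 had fallen behind

new_leader = elect_leader(replicas, in_sync, failed_broker=1)
print(new_leader)  # 3

# If the crashed leader was the only in-sync replica, nobody can take over.
print(elect_leader(replicas, {1}, failed_broker=1))  # None
```

The `None` case corresponds to a partition that stays unavailable until an in-sync replica comes back, which is why keeping more than one replica in sync matters.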

3. Consumer Failure (Consumer Crashes)

  • Case: A consumer in a group stops working.
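  • Result: The partitions assigned to that consumer are no longer processed.
  • Handling:
    • Once the consumer stops sending heartbeats, Kafka rebalances the group.
    • Another consumer in the group takes over the partition.
    • Processing continues from the last committed offset, though some lag may appear and messages that were processed but not yet committed may be delivered again.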
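
A rebalance can be pictured as recomputing the partition assignment for whichever consumers are still alive. The sketch below uses a simple round-robin rule; the `assign` function and the consumer names are hypothetical, and Kafka's real assignors are more sophisticated than this.

```python
def assign(partitions, consumers):
    """Spread partitions over the live consumers in round-robin order."""
    members = sorted(consumers)
    if not members:
        return {}  # nobody is left to read; partitions stay unassigned
    assignment = {member: [] for member in members}
    for index, partition in enumerate(sorted(partitions)):
        assignment[members[index % len(members)]].append(partition)
    return assignment


partitions = [0, 1, 2, 3]

before = assign(partitions, {"consumer-a", "consumer-b"})
print(before)  # {'consumer-a': [0, 2], 'consumer-b': [1, 3]}

# consumer-a crashes; the group rebalances with the remaining member.
after = assign(partitions, {"consumer-b"})
print(after)  # {'consumer-b': [0, 1, 2, 3]}
```

After the rebalance the surviving consumer owns twice as many partitions, which is one reason lag can appear until the failed consumer is replaced.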

4. Consumer Lag (Too Slow to Keep Up)

  • Case: Producer sends data faster than the consumer can process.
  • Result: Lag grows (the gap between the latest offset and the consumer’s offset).
  • Handling:
    • If the consumer falls so far behind that unread messages age out of the topic’s retention window, Kafka deletes them before the consumer reads them → data loss for that consumer.
    • The solution is to scale out consumers (add more in the group, up to the number of partitions), speed up processing, or increase retention.
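
As a rough illustration, lag and the time needed to recover from it can be computed with simple arithmetic. The function names, offsets, and rates below are made-up values for the example, not output from a real cluster.

```python
def partition_lag(log_end_offset, committed_offset):
    """Lag is the gap between the latest offset and the consumer's committed offset."""
    return max(0, log_end_offset - committed_offset)


def seconds_until_caught_up(lag, produce_rate, consume_rate):
    """Estimate how long it takes to clear the lag; None means it never clears.

    Rates are in messages per second.
    """
    if consume_rate <= produce_rate:
        return None  # the consumer is not faster than the producer, so lag keeps growing
    return lag / (consume_rate - produce_rate)


lag = partition_lag(log_end_offset=12_000, committed_offset=9_000)
print(lag)  # 3000

# One consumer handling 400 msg/s against 500 msg/s of input never catches up.
print(seconds_until_caught_up(lag, produce_rate=500, consume_rate=400))  # None

# Doubling the consumers (800 msg/s in total) clears the backlog.
print(seconds_until_caught_up(lag, produce_rate=500, consume_rate=800))  # 10.0
```

If the first case persists longer than the topic's retention period, the oldest unread messages are the ones that get deleted.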

Diagram: Normal vs Negative Flow

flowchart TD
    subgraph Producers
        P1[Producer]
    end

    subgraph KafkaCluster["Kafka Cluster"]
        Part[Partition Leader]
        Replica[Partition Replica]
    end

    subgraph Consumers
        C1[Consumer A]
        C2[Consumer B]
    end

    %% Normal case
    P1 -- send --> Part
    Part -- replicate --> Replica
    Part -- deliver --> C1

    %% Negative cases
    P1 -. network fail .-> Part
    Part -. leader down .-> Replica
    C1 -. crash .-> C2
    C1 -. too slow (lag) .-> Part

Conclusion

In Kafka, the flow of data is simple but powerful:

  • Producers put data into topics.
  • Brokers store data safely and replicate it.
  • Consumers read data from partitions and track their progress with offsets.

This design makes Kafka an efficient and scalable backbone for real-time applications.
