Negative Cases in Producer–Consumer Flow
Even though Kafka is designed to be reliable, things can still go wrong. Here are some common failure scenarios and how Kafka handles them.
1. Producer Failure (Message Not Sent)
- Case: The producer tries to send a message, but the network goes down before the broker confirms.
- Result: The producer doesn’t get an acknowledgment (ack).
- Handling:
- If retries are enabled, the producer resends the batch automatically.
- Retries can produce duplicate messages; since a retry always targets the same partition, the duplicates at least land in that same partition and stay in order there.
This is why idempotent producers, which let the broker detect and discard such duplicates, are recommended in production; a configuration sketch follows below.
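A minimal sketch of such a producer with the Java client, assuming a broker at localhost:9092 and a topic named orders (both placeholders): enabling idempotence makes lost acks safe to retry.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SafeProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // placeholder broker address
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        // Retry on transient failures and let the broker discard duplicates from retries
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        props.put(ProducerConfig.RETRIES_CONFIG, Integer.toString(Integer.MAX_VALUE));

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            ProducerRecord<String, String> record =
                    new ProducerRecord<>("orders", "order-1", "payload");       // placeholder topic/key/value
            producer.send(record, (metadata, exception) -> {
                if (exception != null) {
                    // No ack received (e.g. network failure) even after the client's internal retries
                    System.err.println("Send failed: " + exception.getMessage());
                } else {
                    System.out.printf("Acked at partition %d, offset %d%n",
                            metadata.partition(), metadata.offset());
                }
            });
        }
    }
}
```

With enable.idempotence=true the client requires acks=all and retries, so a lost acknowledgment leads to a safe resend rather than a silent drop or an extra copy.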
2. Broker Failure (Leader Partition Down)
- Case: The broker that is the leader for a partition crashes.
- Result: Producers and consumers temporarily cannot write to or read from that partition.
- Handling:
- Kafka will elect a new leader from the replicas.
- Producers/consumers automatically reconnect to the new leader.
- A brief latency spike may occur during re-election (a replication setup sketch follows below).
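Leader election only works if replicas exist, so production topics are normally created with a replication factor greater than one. A sketch with the Java AdminClient, assuming a three-broker cluster and the same placeholder broker address and topic name as above:

```java
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.config.TopicConfig;

public class CreateReplicatedTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker address

        try (AdminClient admin = AdminClient.create(props)) {
            // 3 partitions, replication factor 3: each partition can survive a single broker failure,
            // because a follower replica can be elected leader when the current leader goes down
            NewTopic topic = new NewTopic("orders", 3, (short) 3)
                    // require at least 2 in-sync replicas before an acks=all write succeeds
                    .configs(Map.of(TopicConfig.MIN_IN_SYNC_REPLICAS_CONFIG, "2"));
            admin.createTopics(Collections.singletonList(topic)).all().get();
        }
    }
}
```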
3. Consumer Failure (Consumer Crashes)
- Case: A consumer in a group stops working.
- Result: Messages in the partitions assigned to that consumer go unprocessed until the group rebalances.
- Handling:
- Kafka detects the missing heartbeats and rebalances the group.
- Another consumer in the group takes over the orphaned partitions.
- Processing continues, though some lag may appear (a consumer-group sketch follows below).
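A sketch of a group consumer with the Java client (the group id order-processors and topic orders are placeholders): several copies of this process can run in the same group, and if one crashes, the survivors take over its partitions and resume from the last committed offsets.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class GroupConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // placeholder broker address
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "order-processors");          // placeholder group id
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        // Commit offsets manually so a partition taken over after a crash resumes from the last committed position
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("orders"));            // placeholder topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
                // If this consumer dies before committing, another group member re-reads from the last commit
                consumer.commitSync();
            }
        }
    }
}
```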
4. Consumer Lag (Too Slow to Keep Up)
- Case: The producer sends data faster than the consumer can process it.
- Result: Lag grows (the gap between the latest offset and the consumer’s offset).
- Handling:
- If the lag grows beyond the topic's retention, old records are deleted before the consumer reads them → data loss for that consumer.
- The fix is to scale out the consumer group (add consumers, up to the number of partitions) or to increase retention; a lag-check sketch follows below.
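Lag can be measured from inside a consumer by comparing each partition's end offset with the consumer's current position. A sketch with the Java client, reusing the same placeholder broker, topic, and group names:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import java.util.Set;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class LagCheck {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // placeholder broker address
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "order-processors");          // placeholder group id
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("orders"));            // placeholder topic
            while (consumer.assignment().isEmpty()) {
                consumer.poll(Duration.ofMillis(200));       // join the group and wait for partition assignment
            }
            Set<TopicPartition> assigned = consumer.assignment();
            Map<TopicPartition, Long> endOffsets = consumer.endOffsets(assigned);
            for (TopicPartition tp : assigned) {
                long lag = endOffsets.get(tp) - consumer.position(tp);   // latest offset minus current position
                System.out.printf("%s lag=%d%n", tp, lag);
            }
        }
    }
}
```

In practice the bundled kafka-consumer-groups.sh --describe tool reports the same per-partition lag without writing any code.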
Diagram: Normal vs Negative Flow
```mermaid
flowchart TD
    subgraph Producers
        P1[Producer]
    end
    subgraph KafkaCluster["Kafka Cluster"]
        Part[Partition Leader]
        Replica[Partition Replica]
    end
    subgraph Consumers
        C1[Consumer A]
        C2[Consumer B]
    end

    %% Normal case
    P1 -- send --> Part
    Part -- replicate --> Replica
    Part -- deliver --> C1

    %% Negative cases
    P1 -. network fail .-> Part
    Part -. leader down .-> Replica
    C1 -. crash .-> C2
    C1 -. "too slow (lag)" .-> Part
```
Conclusion
In Kafka, the flow of data is simple but powerful:
- Producers put data into topics.
- Brokers store data safely and replicate it.
- Consumers read data from partitions and track their progress with offsets.
This design makes Kafka an efficient and scalable backbone for real-time applications.