Apache Kafka is built to handle big data and stay reliable even when some servers fail. The secret lies in three concepts: partitions, replication, and fault tolerance.
Partitions
A partition is a slice of a topic.
- Every topic can be split into many partitions.
- Messages inside a partition are stored in order with an offset (like line numbers).
- Different partitions can live on different brokers (servers).
Example:
- Topic `orders` with 3 partitions:
  - `orders-0`: [offset 0, 1, 2…]
  - `orders-1`: [offset 0, 1, 2…]
  - `orders-2`: [offset 0, 1, 2…]
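The offsets above can be pictured as positions in an append-only log: each new record simply gets the next index. A minimal sketch (the `Partition` class here is illustrative, not Kafka's API):

```python
class Partition:
    """A partition as an append-only log; a record's offset is its position."""

    def __init__(self):
        self.log = []

    def append(self, record):
        # Records are only ever appended, never reordered,
        # which is why ordering inside one partition is guaranteed.
        self.log.append(record)
        return len(self.log) - 1  # the record's offset

p = Partition()
print(p.append("order-a"))  # → 0
print(p.append("order-b"))  # → 1
```

This also shows why ordering is only per-partition: two records appended to different partitions have independent offset sequences.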
Why use partitions?
- Scalability → Many consumers can read in parallel.
- Load distribution → Messages are spread across brokers, so no single machine is overloaded.
- Performance → Kafka can handle millions of events per second by splitting the data.
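The parallel-reading point can be sketched by dividing a topic's partitions among the consumers in a group, so each partition is read by exactly one consumer. This is a simplified stand-in for what Kafka's group coordinator does; the function name and the round-robin strategy are illustrative:

```python
def assign_partitions(partitions, consumers):
    # Round-robin assignment, similar in spirit to Kafka's
    # RoundRobinAssignor: every partition gets exactly one owner,
    # and the load spreads evenly across the group.
    assignment = {c: [] for c in consumers}
    for i, partition in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment

result = assign_partitions(["orders-0", "orders-1", "orders-2"], ["c1", "c2"])
print(result)  # → {'c1': ['orders-0', 'orders-2'], 'c2': ['orders-1']}
```

With 3 partitions, at most 3 consumers in a group can read in parallel; extra consumers would sit idle, which is why partition count caps consumer parallelism.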
Replication
Replication means making copies of data.
- Each partition has one leader and a set of follower replicas.
- The leader handles all reads and writes for that partition.
- Followers continuously copy data from the leader and stand ready to take over if it fails.
Example:
- Partition `orders-0` has its leader on Broker 1 and replicas on Brokers 2 and 3.
- If Broker 1 goes down, Broker 2 (a replica) becomes the new leader.
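The failover above can be sketched as picking the first surviving broker from the partition's replica list. This is a loose simplification of Kafka's leader election, which chooses from the in-sync replica (ISR) set; the function name is illustrative:

```python
def elect_leader(replica_brokers, alive_brokers):
    # Prefer replicas in list order (a stand-in for Kafka preferring
    # in-sync replicas that are fully caught up with the old leader).
    for broker in replica_brokers:
        if broker in alive_brokers:
            return broker
    # No replica left alive: the partition goes offline.
    raise RuntimeError("no live replica available")

# orders-0: leader on Broker 1, replicas on Brokers 2 and 3
replicas = [1, 2, 3]
print(elect_leader(replicas, alive_brokers={1, 2, 3}))  # → 1
print(elect_leader(replicas, alive_brokers={2, 3}))     # → 2 (Broker 1 failed)
```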
Why replicate?
- Durability → Data is safe even if a server dies.
- High availability → Another broker can take over quickly.
- Consistency → Producers and consumers always talk to the current leader.
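Durability hinges on how many replicas must acknowledge a write before the leader confirms it. A toy sketch of the rule behind Kafka's `acks=all` and `min.insync.replicas` settings (the config names are real; the function itself is illustrative):

```python
def write_succeeds(acks_received, min_insync_replicas):
    # With acks=all, the leader confirms a write only once enough
    # in-sync replicas have it; if the ISR shrinks below
    # min.insync.replicas, producers get an error instead of
    # silently risking data loss.
    return acks_received >= min_insync_replicas

print(write_succeeds(3, 2))  # → True  (all 3 replicas acked)
print(write_succeeds(1, 2))  # → False (only the leader has the write)
```

This trade-off is the heart of Kafka's durability story: you give up a little latency waiting for replica acknowledgments, and in return a confirmed write survives a broker failure.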