Introduction
Modern systems need real-time data. For example:
- An e-commerce site wants to see new orders instantly.
- A bank must detect suspicious transactions immediately.
- A logistics company must update delivery tracking without delay.
Batch jobs are too slow for these cases. The solution is a real-time streaming pipeline:
- Debezium captures every change in a database.
- Kafka delivers these changes to other systems in real time.
Together, they form a strong foundation for event-driven systems.
Why Kafka + Debezium?
Using Debezium with Kafka brings many benefits:
- Real-time: changes appear in seconds.
- Scalable: Kafka can handle millions of events per second.
- Reliable: Kafka stores messages safely and can replay them.
- Multiple consumers: analytics, search, microservices, and caches can all read the same data.
- Ordering by key: within a partition, Kafka preserves event order, so all events that share a key (for example a customer id) are consumed in the order they were produced (see the sketch below).
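To make the last point concrete, here is a minimal sketch using the confluent-kafka Python client; the broker address and topic name are illustrative assumptions. Two events produced with the same key are routed to the same partition, which the delivery callback makes visible:

```python
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})  # assumption: local broker

def on_delivery(err, msg):
    # The delivery report shows which partition each event landed on.
    if err is None:
        print(f"key={msg.key().decode()} -> partition {msg.partition()}")

# Both events share the key "2001", so Kafka routes them to the same
# partition and consumers see them in the order they were produced.
for status in ("NEW", "PAID"):
    producer.produce("orders", key="2001",
                     value=f'{{"status": "{status}"}}',
                     callback=on_delivery)

producer.flush()
```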
How the Flow Works
```mermaid
flowchart LR
  subgraph DB[Database]
    A[(PostgreSQL WAL)]
  end
  subgraph DBZ[Debezium Connector]
    B[Postgres Connector]
  end
  subgraph KAFKA[Apache Kafka]
    T1[(orders topic)]
    T2[(customers topic)]
  end
  subgraph CONSUMERS[Consumers]
    C1[Analytics Dashboard]
    C2[Fraud Detection Service]
    C3[Shipping Service]
  end
  A --> B
  B --> T1
  B --> T2
  T1 --> C1
  T1 --> C2
  T1 --> C3
```
Step by step:
1. The database writes every change into its transaction log (e.g., the WAL in Postgres).
2. Debezium reads the log and turns each change into an event.
3. Debezium publishes these events to Kafka topics.
4. Consumers subscribe to the topics and act on the events, as in the sketch below.
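As a sketch of the last step, the consumer below uses confluent-kafka; the broker address and group id are assumptions. Debezium names topics `<database.server.name>.<schema>.<table>`, so with the connector config shown later the orders topic is `ecommerce.public.orders`:

```python
import json
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",   # assumption: local broker
    "group.id": "shipping-service",
    "auto.offset.reset": "earliest",         # start from the beginning on first run
})
# Debezium topic naming: <database.server.name>.<schema>.<table>
consumer.subscribe(["ecommerce.public.orders"])

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        # With the ExtractNewRecordState transform (see the connector config
        # below), the message value is the flat row state as JSON.
        event = json.loads(msg.value())
        print(f"order {event['order_id']} for customer "
              f"{event['customer_id']}: {event['status']}")
finally:
    consumer.close()
```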
Example Use Case: E-Commerce Orders
Imagine an e-commerce system where new orders must be processed quickly.
- All events for the same `customer_id` should go to the same partition.
- This preserves per-customer ordering for fraud detection, shipping, and loyalty services; the sketch after this list shows why the mapping is stable.
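Kafka's default partitioner makes this mapping deterministic: it hashes the key with murmur2 and takes the result modulo the partition count. The pure-Python port below is for illustration only; the exact key bytes Debezium produces depend on the configured key converter.

```python
def murmur2(data: bytes) -> int:
    """Pure-Python port of the murmur2 hash used by Kafka's default partitioner."""
    length = len(data)
    m, r = 0x5BD1E995, 24
    h = (0x9747B28C ^ length) & 0xFFFFFFFF
    # Process the key four bytes at a time, little-endian, with 32-bit overflow.
    for i in range(0, length - length % 4, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * m) & 0xFFFFFFFF
        k ^= k >> r
        k = (k * m) & 0xFFFFFFFF
        h = ((h * m) & 0xFFFFFFFF) ^ k
    # Mix in the remaining 1-3 bytes, mirroring the Java switch fallthrough.
    tail = length % 4
    base = length - tail
    if tail >= 3:
        h ^= data[base + 2] << 16
    if tail >= 2:
        h ^= data[base + 1] << 8
    if tail >= 1:
        h ^= data[base]
        h = (h * m) & 0xFFFFFFFF
    h ^= h >> 13
    h = (h * m) & 0xFFFFFFFF
    h ^= h >> 15
    return h

def partition_for(key: bytes, num_partitions: int) -> int:
    # Default partitioner: positive 32-bit hash modulo partition count.
    return (murmur2(key) & 0x7FFFFFFF) % num_partitions

# Every event keyed by the same customer id maps to the same partition.
print(partition_for(b"2001", 12))  # deterministic: always the same partition
```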
Example Debezium Connector Config
{ "name": "ecommerce-orders-connector", "config": { "connector.class": "io.debezium.connector.postgresql.PostgresConnector", "database.hostname": "ecommerce-db.example.com", "database.port": "5432", "database.user": "orders_user", "database.password": "******", "database.dbname": "ecommerce_production", "database.server.name": "ecommerce", "table.include.list": "public.orders(.*)", "message.key.columns": "public.orders(.*):customer_id", "transforms": "unwrap", "transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState", "tasks.max": "3", "topic.creation.default.partitions": "12", "topic.creation.enable": "true", "slot.name": "cdc_orders", "plugin.name": "pgoutput", "exactly.once.support": "required" } }
👉 Important parts:
- `"message.key.columns": "public.orders:customer_id"` → events are keyed, and therefore partitioned, by customer.
- `"exactly.once.support": "required"` → prevents duplicate events when the connector restarts or retries.
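Once the JSON above is saved to a file (here assumed to be named ecommerce-orders-connector.json), the connector can be registered through Kafka Connect's REST API, which listens on port 8083 by default; the host is an assumption:

```python
import json
import requests

CONNECT_URL = "http://localhost:8083/connectors"  # assumption: local Connect worker

# Load the connector definition shown above and register it.
with open("ecommerce-orders-connector.json") as f:
    connector = json.load(f)

resp = requests.post(CONNECT_URL, json=connector)
resp.raise_for_status()
print(resp.json()["name"], "registered")

# Check the connector's status afterwards.
status = requests.get(f"{CONNECT_URL}/ecommerce-orders-connector/status").json()
print(status["connector"]["state"])  # e.g. "RUNNING"
```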
Example Event in Kafka
When a new order row is inserted, Debezium produces an event like this (already flattened to the plain row state by the ExtractNewRecordState transform in the config above):
{ "customer_id": 2001, "order_id": "ORD-88421", "total": 499.99, "status": "NEW", "created_at": "2025-09-27T12:45:00Z" }
- Partition key: `customer_id`.
- All events for customer 2001 are guaranteed to land in the same partition.
- Consumers therefore process them in the order they occurred.
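For comparison, without the ExtractNewRecordState transform the same change would arrive wrapped in Debezium's change-event envelope, roughly like this (abridged; most source metadata fields omitted, values illustrative):

```json
{
  "before": null,
  "after": {
    "customer_id": 2001,
    "order_id": "ORD-88421",
    "total": 499.99,
    "status": "NEW",
    "created_at": "2025-09-27T12:45:00Z"
  },
  "source": { "connector": "postgresql", "db": "ecommerce_production", "table": "orders" },
  "op": "c",
  "ts_ms": 1758977100000
}
```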