Modern systems generate millions of log entries every day from API servers, databases, applications, and containers. Managing, searching, and visualizing all of these logs in real time is not easy.
This is where the ELK Stack (Elasticsearch, Logstash, and Kibana) comes in.
When combined with Beats, it becomes one of the most powerful and flexible log management systems in the world.
In this article, we’ll explore how ELK works together, how data flows through the pipeline, and how to build a scalable log management architecture.
What Is the ELK Stack?
ELK is short for:
- Elasticsearch: stores and indexes log data
- Logstash: processes and transforms logs
- Kibana: visualizes data and builds dashboards
Later, Beats was added to make it easier to collect data from servers and containers.
Together, the full stack is now called the Elastic Stack (ELK + Beats).
Architecture Overview
Let’s understand how logs move through the system.
```mermaid
flowchart TD
    A[Applications / Servers] --> B[Beats]
    B --> C[Logstash]
    C --> D[Elasticsearch]
    D --> E[Kibana]
```
Each part has a specific role:
| Component | Purpose |
| --- | --- |
| Beats | Lightweight agent that sends logs from servers |
| Logstash | Central processor to filter, parse, and enrich logs |
| Elasticsearch | Search and analytics engine storing structured logs |
| Kibana | Dashboard for visualization and alerting |
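To make the flow concrete, here is a minimal sketch of how these four components could be wired together for a local experiment using Docker Compose. The image tag, the single-node discovery setting, and the disabled security are assumptions for a throwaway sandbox, not a production deployment.

```yaml
# Minimal single-node sandbox for the Elastic Stack (illustrative only).
# The 8.13.4 tag is an assumption; any recent 8.x release should behave similarly.
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.13.4
    environment:
      - discovery.type=single-node       # no clustering for a local demo
      - xpack.security.enabled=false     # never disable security in production
    ports:
      - "9200:9200"                      # REST API used by Logstash and Kibana

  logstash:
    image: docker.elastic.co/logstash/logstash:8.13.4
    ports:
      - "5044:5044"                      # Beats input (see the pipeline config later)
    depends_on:
      - elasticsearch

  kibana:
    image: docker.elastic.co/kibana/kibana:8.13.4
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
    ports:
      - "5601:5601"                      # Kibana web UI
    depends_on:
      - elasticsearch
```

Beats run on the machines that actually produce the logs, so they are installed on the hosts (or as sidecar containers) rather than in this Compose file.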
Beats: Lightweight Data Shippers
Beats are small agents installed on servers to collect and send data.
They are written in Go and have minimal resource usage.
Common Types of Beats
| Beat | Purpose |
| --- | --- |
| Filebeat | Read and ship log files |
| Metricbeat | Collect CPU, memory, and disk metrics |
| Packetbeat | Capture network traffic |
| Heartbeat | Monitor uptime and ping endpoints |
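All Beats share the same configuration style. As a sketch, a Metricbeat setup that collects basic host metrics might look like this; the module, metricsets, and ten-second period are illustrative choices, and the output could just as easily point at Logstash as in the Filebeat example below.

```yaml
# Illustrative Metricbeat sketch: sample host metrics every 10 seconds
metricbeat.modules:
  - module: system
    metricsets: ["cpu", "memory", "filesystem"]
    period: 10s

# Ship directly to Elasticsearch; pointing at Logstash instead works the same way
output.elasticsearch:
  hosts: ["http://elasticsearch:9200"]
```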
Example: Filebeat → Logstash → Elasticsearch
Configuration (simple Filebeat example):
```yaml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/nginx/*.log

output.logstash:
  hosts: ["logstash:5044"]
```
Each Beat batches and compresses events and automatically retries failed sends, which protects against losing logs during transient failures.
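That behavior can be tuned in the Beat's output section. The sketch below assumes the Logstash output from the Filebeat example above and two Logstash hosts; the option names exist in current Filebeat releases, but check the reference for your version before copying them.

```yaml
# Illustrative tuning of Filebeat's Logstash output (values are assumptions)
output.logstash:
  hosts: ["logstash-1:5044", "logstash-2:5044"]
  loadbalance: true       # spread batches across both Logstash hosts
  bulk_max_size: 2048     # maximum number of events per batch
  compression_level: 3    # gzip level used on the wire (0 disables compression)
```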
Logstash: The Central Pipeline
Logstash is the heart of the Elastic Stack. It receives logs, processes them, and forwards them to Elasticsearch.
It uses a pipeline model with three stages:
```mermaid
flowchart LR
    A[Input] --> B[Filter] --> C[Output]
```
Example Configuration
```conf
input {
  beats {
    port => 5044
  }
}

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}

output {
  elasticsearch {
    hosts => ["http://elasticsearch:9200"]
    index => "nginx-logs-%{+YYYY.MM.dd}"
  }
}
```
- Grok filters parse raw text logs into structured fields (a sketch of the resulting document follows this list).
- The date filter parses the timestamp embedded in each log line, so @timestamp reflects when the event actually happened rather than when it was ingested.
- Date-based index names make retention and lifecycle management easier, since old data can be removed by deleting whole indices.
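As a sketch of what this pipeline produces, a single nginx access-log line parsed with the classic COMBINEDAPACHELOG pattern ends up in Elasticsearch as a document roughly like the one below (shown as YAML with comments for readability; the stored document is JSON). The values are invented, and recent Logstash releases default to ECS-style field names, so the exact keys depend on your version and settings.

```yaml
# Illustrative parsed event; values are made up and field names follow the
# classic (non-ECS) COMBINEDAPACHELOG pattern
"@timestamp": "2024-05-12T09:15:32.000Z"   # set by the date filter from the log line
clientip: "203.0.113.7"
verb: "GET"
request: "/index.html"
httpversion: "1.1"
response: "200"                            # grok captures strings unless told otherwise
bytes: "512"
referrer: '"-"'                            # the pattern keeps the surrounding quotes
agent: '"curl/8.0.1"'
message: '203.0.113.7 - - [12/May/2024:09:15:32 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "curl/8.0.1"'
```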