Widhian Bramantya

coding is an art form

Index Lifecycle Management (ILM) in Elasticsearch: Automatic Data Control Made Simple

Posted on October 5, 2025 by admin

When your Elasticsearch cluster grows very large, managing every index by hand becomes impractical.
Old data takes up space, slows down queries, and increases cost. Index Lifecycle Management (ILM) automates this work: it decides when to roll over, move, merge, archive, or delete indices.

This article explains ILM in simple English, including the frozen phase and how it can even store data in S3 for searchable archiving.

What Is ILM?

ILM (Index Lifecycle Management) is a built-in Elasticsearch feature that manages an index automatically as it ages.
You write a policy made of phases, and each phase matches an age range of your data, from hot (newest) through warm, cold, and frozen, to delete.

With ILM you can:

  • Create new indices when the old one grows too large (rollover)
  • Move older data to cheaper storage nodes
  • Merge and compress old indices
  • Keep old data searchable on S3 (frozen)
  • Delete data automatically after it’s no longer needed

ILM = automated data aging system for Elasticsearch.

Why ILM Is Important

Without ILM:

  • Indices grow forever
  • Searches get slow
  • Cluster uses too much memory and disk
  • You must delete old data manually

With ILM:

  • Better performance
  • Lower cost
  • Automatic cleanup
  • Easier scaling

ILM Phases Explained

ILM divides an index’s lifetime into five phases:

Phase  | Description                                                          | Typical Actions                       | License
Hot    | Active data (frequent writes & queries)                              | Rollover, set priority                | Basic
Warm   | Read-only data, still queried                                        | Move to warm nodes, shrink, forcemerge | Basic
Cold   | Rarely queried data                                                  | Move to cold tier, reduce replicas    | Basic
Frozen | Archived data, searchable from a snapshot repository (for example S3) | Searchable snapshot                   | Enterprise
Delete | Data no longer needed                                                | Delete index                          | Basic
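
To make the table concrete, here is a small sketch of how an index's age maps to a phase. The thresholds for warm and cold (7 and 30 days) are assumptions for illustration; only 180 days (frozen) and 365 days (delete) appear in this article. Real ILM measures age from the rollover time (or from index creation if the index never rolled over) and checks policies on a polling interval, so this is a simplified model, not how Elasticsearch implements it.

```python
# Simplified model: an index is in the last phase whose min_age it has reached.
# The warm and cold thresholds are illustrative assumptions.
PHASE_MIN_AGE_DAYS = [
    ("hot", 0),
    ("warm", 7),
    ("cold", 30),
    ("frozen", 180),
    ("delete", 365),
]


def phase_for_age(age_days: float) -> str:
    """Return the name of the phase an index of this age would be in."""
    current = PHASE_MIN_AGE_DAYS[0][0]
    for name, min_age in PHASE_MIN_AGE_DAYS:
        if age_days >= min_age:
            current = name
    return current


if __name__ == "__main__":
    for age in (1, 10, 45, 200, 400):
        print(f"{age:>3} days old -> {phase_for_age(age)}")
```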

Hot Phase

This is where Elasticsearch actively writes and searches new data.
The index is read-write, and ILM rolls over to a new index when the current one becomes too large or too old.

Example triggers:

  • "max_size": "30gb"
  • "max_age": "7d"
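
The two triggers above can be combined into one hot phase definition. The sketch below builds that phase as a Python dict (the helper names `hot_phase` and `should_roll_over` are made up for this example) and also models the rule that rollover fires as soon as any one of the max conditions is met.

```python
import json


def hot_phase(max_size: str = "30gb", max_age: str = "7d") -> dict:
    """Build the hot phase of an ILM policy with a rollover action."""
    return {
        "min_age": "0ms",
        "actions": {
            "rollover": {"max_size": max_size, "max_age": max_age},
        },
    }


def should_roll_over(size_gb: float, age_days: float,
                     max_size_gb: float = 30, max_age_days: float = 7) -> bool:
    """Simplified model: roll over when ANY max condition is reached."""
    return size_gb >= max_size_gb or age_days >= max_age_days


if __name__ == "__main__":
    print(json.dumps({"hot": hot_phase()}, indent=2))
    # A small index still rolls over once it is old enough.
    print(should_roll_over(size_gb=12, age_days=8))
```

Using both conditions together means a quiet index still rolls over on schedule, while a busy one rolls over before it becomes too large.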

Warm Phase

Data becomes read-only but still useful.
You can move it to warm nodes (slower, cheaper machines).

Actions in this phase:

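  • allocate: move the index to warm nodes
  • forcemerge: merge segments for faster reads
  • shrink: reduce the shard count
  • set_priority: lower the recovery priority, so hot indices are restored first after a node restart (it does not affect query priority)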
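
A warm phase using these four actions could look like the sketch below. The `min_age` of 7 days, the priority of 50, and the node attribute `data: warm` are assumptions for illustration; the attribute only works if your warm nodes were started with a matching custom attribute. ILM runs the actions of a phase in its own fixed order, so the order of the keys here does not matter.

```python
import json


def warm_phase(min_age: str = "7d",
               node_attr: str = "data",
               node_value: str = "warm") -> dict:
    """Build a warm phase; node_attr/node_value are hypothetical node attributes."""
    return {
        "min_age": min_age,
        "actions": {
            # Require shards to live on nodes carrying the given attribute.
            "allocate": {"require": {node_attr: node_value}},
            # Reduce the index to a single primary shard.
            "shrink": {"number_of_shards": 1},
            # Merge each shard down to one segment.
            "forcemerge": {"max_num_segments": 1},
            # Recover this index after higher-priority (hot) indices.
            "set_priority": {"priority": 50},
        },
    }


if __name__ == "__main__":
    print(json.dumps({"warm": warm_phase()}, indent=2))
```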

Cold Phase

Data is old and rarely accessed.
You can move it to cold nodes with slower disks or cheaper storage.
It’s still searchable but slower.

Actions:

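  • allocate: move the index to cold nodes
  • freeze (optional): originally kept the index searchable with very low memory usage; the action is deprecated since 7.14 and does nothing in 8.x, so new policies can leave it out
  • Reduce replicas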
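
As a rough sketch, the cold phase can move the index and reduce replicas in a single allocate action. The 30-day `min_age`, the replica count of 0, and the `data: cold` node attribute are assumptions for this example, not values from a real cluster. Dropping to zero replicas saves disk but leaves only one copy of each shard, so choose it only if you can restore from a snapshot.

```python
import json


def cold_phase(min_age: str = "30d", replicas: int = 0) -> dict:
    """Build a cold phase that relocates the index and lowers its replica count."""
    return {
        "min_age": min_age,
        "actions": {
            "allocate": {
                "number_of_replicas": replicas,
                # Hypothetical custom node attribute marking cold nodes.
                "require": {"data": "cold"},
            },
            # Cold indices are recovered last after a restart.
            "set_priority": {"priority": 0},
        },
    }


if __name__ == "__main__":
    print(json.dumps({"cold": cold_phase()}, indent=2))
```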

Frozen Phase

This is the final searchable archive phase.
The index is replaced by a partially mounted searchable snapshot: the full data lives in a snapshot repository on object storage (like Amazon S3, Azure Blob, or Google Cloud Storage), and frozen nodes keep only a local cache of the recently used parts.

  • Data takes almost no local disk space, only the cache.
  • Queries are slower because uncached data is fetched from the repository.
  • Ideal for compliance or historical archives: data that must remain searchable but is rarely queried.

Example actions:

"frozen": {
  "min_age": "180d",
  "actions": {
    "searchable_snapshot": {
      "snapshot_repository": "s3_repository"
    }
  }
}

Note:

  • Searchable snapshots, and therefore the frozen phase, require an Enterprise license.
  • You must first register a snapshot repository connected to S3 (or similar).
  • The feature is not available in the Basic (free) license.
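
The repository named in the frozen phase has to exist before the policy can use it. The sketch below builds the two request bodies side by side: one for registering an S3 repository (sent with `PUT _snapshot/s3_repository`) and one for the frozen phase itself. The bucket name `my-archive-bucket` is a placeholder, and the S3 credentials are assumed to be configured separately on the cluster.

```python
import json


def s3_repository_body(bucket: str) -> dict:
    """Body for registering an S3 snapshot repository."""
    return {"type": "s3", "settings": {"bucket": bucket}}


def frozen_phase(repository: str = "s3_repository",
                 min_age: str = "180d") -> dict:
    """Frozen phase that mounts the index from the given repository."""
    return {
        "min_age": min_age,
        "actions": {
            "searchable_snapshot": {"snapshot_repository": repository},
        },
    }


if __name__ == "__main__":
    # "my-archive-bucket" is a hypothetical bucket name.
    print(json.dumps(s3_repository_body("my-archive-bucket"), indent=2))
    print(json.dumps({"frozen": frozen_phase()}, indent=2))
```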

Delete Phase

Data has no value anymore.
ILM can automatically delete it after a given time to save space.

Example:

"delete": {
  "min_age": "365d",
  "actions": { "delete": {} }
}
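
Putting the pieces together, a complete policy is just the phases wrapped in one body and sent with `PUT _ilm/policy/<policy_name>`. The sketch below assembles such a body from the hot, frozen, and delete values used in this article and adds a simple sanity check that `min_age` never decreases from one phase to the next. The helper names and the check are my own illustration; the parser only understands the units d, h, m, and s.

```python
import json
import re

PHASE_ORDER = ["hot", "warm", "cold", "frozen", "delete"]
_UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def min_age_seconds(value: str) -> int:
    """Convert a duration like '180d' to seconds (d, h, m, s only)."""
    match = re.fullmatch(r"(\d+)(d|h|m|s)", value)
    if match is None:
        raise ValueError(f"unsupported min_age: {value!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def build_policy(phases: dict) -> dict:
    """Wrap phases in a policy body, checking names and min_age ordering."""
    unknown = set(phases) - set(PHASE_ORDER)
    if unknown:
        raise ValueError(f"unknown phases: {sorted(unknown)}")
    ordered = {}
    previous = 0
    for name in PHASE_ORDER:
        if name not in phases:
            continue
        age = min_age_seconds(phases[name].get("min_age", "0s"))
        if age < previous:
            raise ValueError(f"min_age of {name!r} is lower than an earlier phase")
        previous = age
        ordered[name] = phases[name]
    return {"policy": {"phases": ordered}}


if __name__ == "__main__":
    policy = build_policy({
        "hot": {"min_age": "0s", "actions": {
            "rollover": {"max_size": "30gb", "max_age": "7d"}}},
        "frozen": {"min_age": "180d", "actions": {
            "searchable_snapshot": {"snapshot_repository": "s3_repository"}}},
        "delete": {"min_age": "365d", "actions": {"delete": {}}},
    })
    print(json.dumps(policy, indent=2))
```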