Understanding “Hot Nodes”

A hot node is a node that carries noticeably more load than its peers,
for example because it holds too many shards or a few very active ones.

Causes of Hot Nodes:

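Uneven shard distribution (some nodes hold more shards, or larger ones, than others)
Skewed traffic (queries always hit certain indices)
Oversized shards (a single shard is too heavy for one node to serve comfortably)
Missing ILM or rollover policy (indices keep growing without limit)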

How to Prevent Hot Nodes:

Check shard allocation: GET _cat/allocation?v
Balance shards across nodes. Elasticsearch does this automatically, but you can tune the balancing weights (cluster.routing.allocation.balance.shard and cluster.routing.allocation.balance.index).
Use ILM to roll over indices before they grow too big.
Add more nodes to the hot tier if needed (scale horizontally).
Avoid routing all writes to a single shard (use routing key wisely).
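
As a rough illustration of the rollover advice above, this Python sketch builds the JSON body for an ILM policy. The policy name logs_policy, the 50 GB and 30 day rollover thresholds, and the 90 day retention are assumptions picked for the example, not values from this article. The printed body would be sent with PUT _ilm/policy/logs_policy.

```python
import json


def build_ilm_policy(max_shard_size="50gb", max_age="30d", delete_after="90d"):
    """Return an ILM policy body that rolls over in the hot phase and deletes later.

    All three thresholds are illustrative assumptions; tune them to your data.
    """
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        # Roll over as soon as either condition is met.
                        "rollover": {
                            "max_primary_shard_size": max_shard_size,
                            "max_age": max_age,
                        }
                    }
                },
                "delete": {
                    # For rolled-over indices, min_age counts from the rollover.
                    "min_age": delete_after,
                    "actions": {"delete": {}},
                },
            }
        }
    }


if __name__ == "__main__":
    # Print the body to paste after: PUT _ilm/policy/logs_policy
    print(json.dumps(build_ilm_policy(), indent=2))
```

Capping the primary shard size rather than the whole index size keeps individual shards bounded regardless of how many primaries the index has.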

A balanced cluster keeps CPU, heap, and disk usage similar across all nodes.
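
One way to spot imbalance automatically is to compare each node's shard count with the cluster average. The sketch below assumes you have already fetched GET _cat/allocation?format=json (where numeric fields arrive as strings); the function name find_hot_nodes, the sample node names, and the 1.5 tolerance are made up for illustration.

```python
from statistics import mean


def find_hot_nodes(allocation_rows, tolerance=1.5):
    """Return names of nodes holding more than `tolerance` times the mean shard count.

    `allocation_rows` is the parsed output of GET _cat/allocation?format=json.
    The 1.5 tolerance is an arbitrary starting point, not an Elasticsearch default.
    """
    # Skip the synthetic UNASSIGNED row that appears when shards are unallocated.
    counts = {
        row["node"]: int(row["shards"])
        for row in allocation_rows
        if row.get("node") != "UNASSIGNED"
    }
    if not counts:
        return []
    average = mean(counts.values())
    return sorted(name for name, n in counts.items() if n > tolerance * average)


if __name__ == "__main__":
    sample = [
        {"node": "es-data-1", "shards": "120", "disk.percent": "71"},
        {"node": "es-data-2", "shards": "40", "disk.percent": "35"},
        {"node": "es-data-3", "shards": "38", "disk.percent": "33"},
    ]
    print(find_hot_nodes(sample))
```

Shard count is only a proxy for load. A node with an average number of shards can still be hot if one of them takes most of the traffic, so CPU and heap metrics are worth checking as well.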

Shrink Oversized Indices

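If you already have an index with more primary shards than it needs, you can shrink it into a new index with fewer shards.

Block writes and move a copy of every shard onto one node (shrink_node_name is a placeholder for a real node name): PUT my_index/_settings { "settings": { "index.blocks.write": true, "index.routing.allocation.require._name": "shrink_node_name" } }
Once relocation finishes, shrink to fewer shards; the target count must be a factor of the source count: POST my_index/_shrink/my_index_small { "settings": { "index.number_of_shards": 1 } }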

Fewer shards mean less heap spent on per-shard overhead and fewer shard-level requests to coordinate for each search.
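
The shrink API only accepts a target shard count that is a factor of the source index's primary shard count. A small helper (the name valid_shrink_targets is hypothetical) can list the options before you pick one:

```python
def valid_shrink_targets(source_shards):
    """List the shard counts a source index can be shrunk to, largest first.

    The target count must divide the source count evenly, so an index with
    8 primaries can go to 4, 2, or 1, while one with 7 can only go to 1.
    """
    if source_shards < 1:
        raise ValueError("source_shards must be a positive integer")
    return [n for n in range(source_shards - 1, 0, -1) if source_shards % n == 0]


if __name__ == "__main__":
    for shards in (8, 12, 7):
        print(shards, "->", valid_shrink_targets(shards))
```

This is one reason to prefer primary shard counts with many factors (such as 6 or 12) when an index is likely to be shrunk later.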

Force Merge Old Indices

When an index is no longer updated (like last month’s logs),
you can reduce segment count to improve read speed:

POST logs-2025-09/_forcemerge?max_num_segments=1

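Use this only on indices that no longer receive writes. A force merge is I/O-intensive, and on an index that is still being written to it can produce very large segments that the background merge process will not pick up again for a long time.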
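
To avoid force merging an index that is still active, you can select candidates by name. This sketch assumes monthly indices named like logs-2025-09, as in the example above, and assumes that an index stops receiving writes once its month is over; both are assumptions about your naming scheme, and forcemerge_candidates is a hypothetical helper.

```python
import re
from datetime import date

# Matches names such as logs-2025-09 (prefix, four-digit year, two-digit month).
MONTHLY_INDEX = re.compile(r"^(?P<prefix>.+)-(?P<year>\d{4})-(?P<month>\d{2})$")


def forcemerge_candidates(index_names, today):
    """Return monthly indices whose month ended before the month of `today`."""
    current = (today.year, today.month)
    picked = []
    for name in index_names:
        match = MONTHLY_INDEX.match(name)
        if match is None:
            continue  # Not a monthly index; leave it alone.
        if (int(match["year"]), int(match["month"])) < current:
            picked.append(name)
    return sorted(picked)


if __name__ == "__main__":
    names = ["logs-2025-08", "logs-2025-09", "logs-2025-10", "metrics"]
    for name in forcemerge_candidates(names, date(2025, 10, 15)):
        print(f"POST {name}/_forcemerge?max_num_segments=1")
```

If ILM already manages these indices, its forcemerge action can do the same job without a separate script.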

Monitor Shard and Node Health

Regular monitoring keeps problems small.

GET _cat/indices?v
GET _cat/shards?v
GET _cluster/health

Green: All primary and replica shards are assigned
Yellow: All primary shards are assigned, but at least one replica is not
Red: At least one primary shard is unassigned, so some data is unavailable (critical)
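
The three states can be turned into a short action hint in a monitoring script. The sketch below works on the parsed JSON of GET _cluster/health and reads only the status and unassigned_shards fields; the function name describe_health and the wording of the messages are illustrative.

```python
def describe_health(health):
    """Turn a parsed GET _cluster/health response into a one-line summary."""
    status = health["status"]
    unassigned = health.get("unassigned_shards", 0)
    if status == "green":
        return "green: all shards assigned, nothing to do"
    if status == "yellow":
        return (
            f"yellow: all primaries active, some replicas are not "
            f"({unassigned} shard(s) unassigned)"
        )
    if status == "red":
        return (
            f"red: at least one primary is unassigned "
            f"({unassigned} shard(s) unassigned), some data is unavailable"
        )
    raise ValueError(f"unexpected cluster status: {status!r}")


if __name__ == "__main__":
    print(describe_health({"status": "yellow", "unassigned_shards": 3}))
```

A yellow cluster still serves every request, so it usually warrants a ticket rather than a page; red means reads or writes against the affected indices can fail.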

Use Kibana Stack Monitoring or ElasticHQ to see hot nodes and shard distribution visually.

Use Snapshots for Backup

Back up large datasets with snapshot and restore.
Snapshots are incremental: each one copies only the segments that are not already in the repository.

PUT _snapshot/my_backup
{
  "type": "fs",
  "settings": { "location": "/mnt/es_backup" }
}

PUT _snapshot/my_backup/snapshot_2025_10
{ "indices": "my_index*" }

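Always store snapshots outside the nodes' own data disks, for example on a shared filesystem or object storage, so that losing nodes does not also lose the backups. For an fs repository, the location must be listed under path.repo in elasticsearch.yml and mounted at the same path on every node.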
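
A backup is only useful if it restores. One way to rehearse a restore on a running cluster is to restore into renamed indices so that the copies do not collide with the live ones. The sketch below builds the request for the snapshot created above; build_restore_request and the restored_ prefix are assumptions for illustration, while rename_pattern and rename_replacement are parameters of the restore API.

```python
import json


def build_restore_request(repository, snapshot, indices, prefix="restored_"):
    """Return (path, body) for a restore that writes to renamed indices."""
    path = f"_snapshot/{repository}/{snapshot}/_restore"
    body = {
        "indices": indices,
        # Capture the whole original index name and prepend the prefix.
        "rename_pattern": "(.+)",
        "rename_replacement": f"{prefix}$1",
    }
    return path, body


if __name__ == "__main__":
    path, body = build_restore_request("my_backup", "snapshot_2025_10", "my_index*")
    print("POST", path)
    print(json.dumps(body, indent=2))
```

After checking document counts on the restored copies, delete them so they do not add to the shard count.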

Avoid Too Many Indices

Thousands of tiny indices can hurt performance just like oversized ones.
Each index adds mappings and settings to the cluster state, and each of its shards carries a fixed heap cost no matter how little data it holds.

Combine similar data into a single index with a category or source field.
Use index templates to ensure consistent settings:

PUT _index_template/logs_template
{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": { "number_of_shards": 3 },
    "mappings": { "properties": { "timestamp": { "type": "date" } } }
  }
}
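
To see how quickly small indices add up, multiply the number of indices by the primaries per index and by the number of copies of each primary. The helper below is a plain arithmetic sketch (total_shards is a hypothetical name); it assumes every index uses the same settings, as it would under the template above with the default of one replica.

```python
def total_shards(index_count, primaries_per_index, replicas_per_primary=1):
    """Total shard copies the cluster must track for a set of identical indices."""
    return index_count * primaries_per_index * (1 + replicas_per_primary)


if __name__ == "__main__":
    # One year of daily indices versus one year of monthly indices,
    # both with 3 primaries and 1 replica.
    print("daily:  ", total_shards(365, 3))
    print("monthly:", total_shards(12, 3))
```

With these numbers, a year of daily indices means 2190 shards, while monthly indices need only 72 for the same data.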

Schedule Regular Maintenance

To keep the cluster clean and healthy:

Delete or roll over old indices
Review shard balance periodically and adjust allocation settings if it drifts
Check heap usage and disk thresholds
Restart nodes one at a time (rolling restart) when applying upgrades or configuration changes
Test ILM policies and snapshot restore regularly

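Example to delete old data (since Elasticsearch 8.0, wildcard deletes are rejected unless action.destructive_requires_name is set to false, so you may need to list the index names explicitly):

DELETE logs-2024*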

Summary of Trade-Offs and Balance

Decision	Benefit	Risk
More shards	Parallel queries, faster indexing	Memory overhead, coordination cost
Fewer shards	Less overhead, simpler state	Longer recovery, possible “hot shard”
Too many small indices	Easy to isolate data	Metadata overload
Too large index	Simple management	Slow queries and snapshots

The goal is balance: not too many, not too few.

Conclusion

Maintaining very large Elasticsearch datasets is about balance and planning.
Use ILM to automate data flow, monitor shard sizes, distribute load evenly, and always keep an eye on node health. Avoid both shard explosion and giant shards, and find your sweet spot.

“Good clusters are like good teams: evenly balanced, not too hot, not too cold.”

If you follow these habits, your Elasticsearch cluster will stay fast, stable, and scalable, no matter how big your data grows.


Maintaining Super Large Datasets in Elasticsearch