Widhian Bramantya

Elasticsearch Best Practices for Beginners

Posted on October 5, 2025 by admin

Use Analyzers Properly

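Analyzers decide how Elasticsearch turns text into searchable terms: they split it into tokens and normalize them (for example, by lowercasing).
The default is standard, but you can choose others such as english or whitespace, or define a custom analyzer.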

Example:

"analyzer": "english"

helps with stemming (e.g., running → run).
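
One way to see what an analyzer does before committing to a mapping is the _analyze API. The sketch below runs a sample sentence (made up for illustration) through the english analyzer; the response lists the tokens that would end up in the index.

```console
# Assumption: the sample text is illustrative; use text from your own data.
POST _analyze
{
  "analyzer": "english",
  "text": "Running shoes for runners"
}
```

Running the same request with "analyzer": "standard" makes it easy to compare how much stemming and stop-word removal change the tokens.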

If you store multilingual data, define separate fields:

"name_en": { "type": "text", "analyzer": "english" },
"name_id": { "type": "text", "analyzer": "indonesian" }
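
For context, here is a minimal sketch of where those two field definitions live in a full index mapping, followed by a query that searches both languages at once. The index name products and the search text are assumptions for illustration only.

```console
# Assumption: "products" is a hypothetical index name.
PUT products
{
  "mappings": {
    "properties": {
      "name_en": { "type": "text", "analyzer": "english" },
      "name_id": { "type": "text", "analyzer": "indonesian" }
    }
  }
}

# Search both language fields with one query (the search text is illustrative).
GET products/_search
{
  "query": {
    "multi_match": {
      "query": "sepatu lari",
      "fields": ["name_en", "name_id"]
    }
  }
}
```

Each field is analyzed with its own language rules at index time, and the query text is analyzed per field at search time.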

Delete Old Data Automatically with ILM

If you store logs or time-based data, don’t keep everything forever.
Use Index Lifecycle Management (ILM) to:

  • Roll over to a new index after N GB or N days
  • Move old data to “cold” nodes
  • Delete very old data automatically

Example Policy:

PUT _ilm/policy/logs_policy
{
  "policy": {
    "phases": {
      "hot": { "actions": { "rollover": { "max_age": "7d" } } },
      "delete": { "min_age": "30d", "actions": { "delete": {} } }
    }
  }
}
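
The policy above only covers rollover by age and deletion. A possible variant that also rolls over by size and adds a cold phase might look like the sketch below. The policy name and all thresholds are assumptions, not recommendations. On recent versions with data tiers, entering the cold phase should also move the index to cold-tier nodes by default; older clusters may need an explicit allocate action instead.

```console
# Assumption: policy name and thresholds are hypothetical; tune them for your data.
PUT _ilm/policy/logs_policy_tiered
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_age": "7d", "max_size": "50gb" }
        }
      },
      "cold": {
        "min_age": "14d",
        "actions": {
          "set_priority": { "priority": 0 }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": { "delete": {} }
      }
    }
  }
}
```

Rollover happens as soon as either condition (age or size) is met.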

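With this policy, Elasticsearch rolls over to a new index about every 7 days and deletes each index about 30 days after it rolled over. This keeps your storage clean and your cluster fast.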
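
A policy does nothing until indices use it. A minimal sketch of wiring logs_policy to new indices, assuming composable index templates are available (Elasticsearch 7.8 or later) and using the hypothetical applogs names:

```console
# Assumption: "applogs" names are hypothetical; logs_policy is the policy defined earlier.
# 1. Every new index matching the pattern picks up the policy.
PUT _index_template/applogs_template
{
  "index_patterns": ["applogs-*"],
  "template": {
    "settings": {
      "index.lifecycle.name": "logs_policy",
      "index.lifecycle.rollover_alias": "applogs"
    }
  }
}

# 2. Create the first index and mark it as the write index for the alias.
PUT applogs-000001
{
  "aliases": {
    "applogs": { "is_write_index": true }
  }
}

# 3. Check which phase each index is in.
GET applogs-*/_ilm/explain
```

Your application then writes to the applogs alias, and rollover creates applogs-000002, applogs-000003, and so on.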

Monitor Your Cluster Health

Always know if your cluster is healthy.
You can check it with:

GET _cluster/health

Statuses:

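  • Green: all primary and replica shards are assigned
  • Yellow: all primary shards are assigned, but some replicas are not
  • Red: at least one primary shard is unassigned (serious problem, some data is unavailable)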
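
When the status is not green, a few follow-up requests usually help narrow down the cause. The sketch below shows one possible sequence; the 30-second timeout is an arbitrary choice.

```console
# Wait until the cluster is at least yellow, or give up after 30 seconds.
GET _cluster/health?wait_for_status=yellow&timeout=30s

# List shards with their state and, for unassigned ones, the reason.
GET _cat/shards?v&h=index,shard,prirep,state,unassigned.reason

# Ask why a shard is unassigned. With no body, this explains an arbitrary
# unassigned shard and returns an error if every shard is assigned.
GET _cluster/allocation/explain
```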

Use Kibana Stack Monitoring or a tool such as ElasticHQ to watch performance.

Use Aliases for Safer Updates

Instead of changing index names in your app, use index aliases.
This makes versioning easier.

Example:

POST /_aliases
{
  "actions": [
    { "add": { "index": "users_v2", "alias": "users" } },
    { "remove": { "index": "users_v1", "alias": "users" } }
  ]
}

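Both actions in the request are applied atomically, so there is no moment when “users” points to nothing.
Your app always searches “users”, even if you change versions behind the scenes.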
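
Before the alias swap can run, users_v2 has to exist and contain the data. A minimal sketch of that preparation, assuming users_v1 already exists and using made-up field names:

```console
# Assumption: field names are hypothetical; users_v1 already exists.
# 1. Create the new index with the updated mapping.
PUT users_v2
{
  "mappings": {
    "properties": {
      "name":  { "type": "text" },
      "email": { "type": "keyword" }
    }
  }
}

# 2. Copy the documents from the old index into the new one.
POST _reindex
{
  "source": { "index": "users_v1" },
  "dest":   { "index": "users_v2" }
}

# 3. After swapping the alias, confirm which index it points to.
GET _alias/users
```

Keeping users_v1 around for a while after the swap gives you a quick way back: the same _aliases request with the index names reversed.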

Cache Smartly and Tune Queries

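  • Put yes/no conditions (exact matches, ranges) in the filter clause of a bool query. Filters skip relevance scoring, and Elasticsearch can cache frequently used ones.
  • Sort on keyword, numeric, or date fields, not on analyzed text fields.
  • Use _source filtering to return only needed fields:
"_source": ["id", "name", "price"]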

This saves bandwidth and memory.
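
Put together, a search that follows all three tips might look like the sketch below. The index name products and the fields category and name.keyword are assumptions; name.keyword only exists if name was mapped with a keyword sub-field (which dynamic mapping creates by default for strings).

```console
# Assumption: index and field names are hypothetical.
GET products/_search
{
  "_source": ["id", "name", "price"],
  "query": {
    "bool": {
      "must": [
        { "match": { "name": "wireless keyboard" } }
      ],
      "filter": [
        { "term":  { "category": "electronics" } },
        { "range": { "price": { "lte": 100 } } }
      ]
    }
  },
  "sort": [
    { "price": "asc" },
    { "name.keyword": "asc" }
  ]
}
```

Only the match clause affects the relevance score; the term and range clauses just narrow the result set.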

Secure Your Cluster

Never expose Elasticsearch directly to the internet.
Use:

  • HTTP authentication or API keys
  • Firewalls or VPC access only
  • SSL/TLS for encryption
  • Regular backups with snapshots
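
As a rough sketch of two of these points, the requests below create a read-only API key and set up scheduled snapshots. They assume security features are enabled and that the repository path is listed under path.repo in elasticsearch.yml. All names, the path, the schedule, and the retention values are hypothetical.

```console
# Assumption: key name, role name, and expiration are illustrative.
# An API key that can only read the "users" alias or index.
POST /_security/api_key
{
  "name": "app-search-key",
  "expiration": "30d",
  "role_descriptors": {
    "read_users": {
      "indices": [
        { "names": ["users"], "privileges": ["read"] }
      ]
    }
  }
}

# Assumption: the location is a hypothetical shared filesystem path.
PUT _snapshot/my_backup
{
  "type": "fs",
  "settings": { "location": "/mount/backups/my_backup" }
}

# Nightly snapshots at 01:30, kept for 30 days.
PUT _slm/policy/nightly-snapshots
{
  "schedule": "0 30 1 * * ?",
  "name": "<nightly-snap-{now/d}>",
  "repository": "my_backup",
  "config": { "indices": ["*"] },
  "retention": { "expire_after": "30d", "min_count": 5, "max_count": 50 }
}
```

The API key response includes the secret only once, so store it somewhere safe right away.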

Conclusion

Elasticsearch can handle millions of records, but only if you use it the right way.
By planning your mappings, managing shards wisely, and using filters and ILM, you will have a stable, fast, and safe system.

Start small. Test queries. Watch performance.
These small habits will make you a professional Elasticsearch user faster than you think.

Category: ElasticSearch
