Widhian Bramantya

Blue-Green Deployment in Elasticsearch: Safe Reindexing and Zero-Downtime Upgrades

Posted on October 5, 2025 by admin

Blue-Green Upgrade to a New Cluster

When upgrading Elasticsearch itself (for example 7.x → 8.x), it’s safer to use two separate clusters.

Step 1 Create the New (Green) Cluster

Install the new version on new nodes.
Do not mix old and new nodes in the same cluster if the major version is different.

Step 2 Register a Snapshot Repository

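On the blue cluster (the location must be listed under path.repo in elasticsearch.yml on every node, and on a multi-node cluster it must be a shared filesystem mounted at the same path):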

PUT _snapshot/backup_repo
{
  "type": "fs",
  "settings": { "location": "/mnt/backup" }
}

Step 3 Take a Snapshot

PUT _snapshot/backup_repo/snapshot_oct2025
{
  "indices": "my_index*",
  "ignore_unavailable": true,
  "include_global_state": false
}
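
Snapshot creation returns immediately by default and continues in the background, so it is worth confirming that the snapshot finished before restoring it. A minimal check, reusing the repository and snapshot names from above:

```console
# Run on Blue; look for "state": "SUCCESS" in the response
GET _snapshot/backup_repo/snapshot_oct2025

# Per-shard progress while the snapshot is still running
GET _snapshot/backup_repo/snapshot_oct2025/_status
```

A state of PARTIAL or FAILED means some shards were not captured, and the snapshot should be retaken.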

Step 4 Restore to the New Cluster
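
The restore request below only works if Green can see the same repository. A sketch of that prerequisite, assuming /mnt/backup is also mounted on the Green nodes and listed under path.repo there. The readonly flag is a precaution so that only Blue writes to the repository:

```console
# Run on Green before restoring
PUT _snapshot/backup_repo
{
  "type": "fs",
  "settings": {
    "location": "/mnt/backup",
    "readonly": true
  }
}

# Confirm Green can list the snapshot taken on Blue
GET _snapshot/backup_repo/_all
```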

On the green cluster:

POST _snapshot/backup_repo/snapshot_oct2025/_restore
{
  "indices": "my_index*",
  "rename_pattern": "my_index_v1",
  "rename_replacement": "my_index_v2"
}

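The green cluster now holds the data. Restored indices keep the index format of the version that created them: the new version can read and write them, but they are not rewritten into the new format unless you reindex. Indices created before the previous major version (for example, 6.x indices going to 8.x) cannot be restored as regular indices and must be reindexed first.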

Step 5 Validate and Switch

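First confirm that Green is healthy and that document counts match Blue. Any writes Blue received after the snapshot are not in Green, so pause writes or replay them before the switch. Then point your load balancer or application connection to the new cluster endpoint.
If problems appear, point back to the old cluster (blue); writes that Green accepted in the meantime will be missing there.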
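
A minimal pre-switch check might compare health and document counts for the restored index. The index names follow the rename used in Step 4; adjust them to whatever your restore produced:

```console
# On Green: wait until all shards of the restored index are allocated
GET _cluster/health/my_index_v2?wait_for_status=green&timeout=60s

# On Green: document count after the restore
GET my_index_v2/_count

# On Blue: the count should match if no writes arrived after the snapshot
GET my_index_v1/_count
```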

Continuous Sync with Cross-Cluster Replication (CCR)

If your dataset is very large or you cannot afford any delay between the blue and green clusters,
Cross-Cluster Replication (CCR) is usually the better fit.

CCR keeps a follower index in the new (green) cluster that automatically copies all operations from the leader index in the old (blue) cluster, including new documents, updates, and deletions, in near real-time.

How CCR Works

  • Leader Cluster (Blue): keeps the original data
  • Follower Cluster (Green): continuously replicates data
  • You can later promote the follower cluster to production

CCR is pull-based: each follower shard on Green requests new operations from the matching leader shard on Blue over the remote cluster connection.

Requirements for Cross-Cluster Replication

Requirement | Description
Elasticsearch Platinum or Enterprise license | CCR is a commercial feature (not in the Basic license)
Compatible versions on both clusters | Green (follower) must run the same version as Blue (leader) or a newer compatible one, e.g., 8.5 → 8.10
Remote connection configured | Green must know where to find Blue
Security enabled | TLS/SSL + user roles for replication
Soft deletes enabled on the Blue index | Default for indices created in ES 7.0+
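
For the "user roles for replication" requirement, here is a sketch of the two role definitions, assuming certificate-based remote cluster security. The role name remote-replication is only an example; under this model the same role name has to exist on both clusters, because role names are forwarded with the request:

```console
# On Blue (leader): allow the leader index to be read for replication
POST _security/role/remote-replication
{
  "cluster": [ "read_ccr" ],
  "indices": [
    { "names": [ "my_index_v1" ], "privileges": [ "monitor", "read" ] }
  ]
}

# On Green (follower): allow creating and writing the follower index
POST _security/role/remote-replication
{
  "cluster": [ "manage_ccr" ],
  "indices": [
    {
      "names": [ "my_index_v2" ],
      "privileges": [ "monitor", "read", "write", "manage_follow_index" ]
    }
  ]
}
```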

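Check the soft deletes setting (include_defaults shows the value even if it was never set explicitly):

GET my_index_v1/_settings?include_defaults=true

If index.soft_deletes.enabled is false, it cannot be switched on in place: it is a static setting that is fixed when the index is created. Create a new index with soft deletes enabled (my_index_v1_sd is only an example name), copy the data into it with _reindex, and use that index as the leader:

PUT my_index_v1_sd
{ "settings": { "index.soft_deletes.enabled": true } }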

Setting Up CCR

Step 1 Register the Remote Cluster (on Green)

PUT _cluster/settings
{
  "persistent": {
    "cluster.remote.blue_cluster.seeds": [ "blue-node1:9300" ]
  }
}
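
Before creating followers, it helps to confirm that Green actually reached Blue. A quick check, using the blue_cluster alias registered above:

```console
# On Green: blue_cluster should report "connected": true
GET _remote/info
```

If connected is false, check that port 9300 (the transport port, not the HTTP port 9200) is reachable from the Green nodes.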

Step 2 Create a Follower Index

PUT my_index_v2/_ccr/follow
{
  "remote_cluster": "blue_cluster",
  "leader_index": "my_index_v1"
}

Now, my_index_v2 will continuously follow all changes from my_index_v1.
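
To confirm that the follower was created and is actively replicating, the follower info API can be queried. A minimal check:

```console
# On Green: "status" should be "active" for my_index_v2
GET my_index_v2/_ccr/info
```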


Step 3 Monitor Replication

GET my_index_v2/_ccr/stats

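Check:

  • operations_read → how many operations were read from the leader (operations_written shows how many were applied)
  • time_since_last_read_millis → time since the follower last sent a read request to the leader
  • follower_global_checkpoint vs leader_global_checkpoint → the gap is the number of operations still to replicate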
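
The stats response is large, so it can help to trim it to the fields that describe lag. A sketch using the generic filter_path parameter; in the response the checkpoint fields are named leader_global_checkpoint and follower_global_checkpoint:

```console
# On Green: keep only the per-shard fields that describe replication lag
GET my_index_v2/_ccr/stats?filter_path=indices.shards.shard_id,indices.shards.leader_global_checkpoint,indices.shards.follower_global_checkpoint,indices.shards.time_since_last_read_millis
```

For example, if a shard reports leader_global_checkpoint 1500 and follower_global_checkpoint 1480, the follower is 20 operations behind on that shard. Once writes to Blue stop, the two values should become equal.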

Step 4 Promote the Follower (Switch to Green)

When ready to make Green the new production cluster:

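  1. Stop writes to Blue and wait until the follower has caught up, then convert it to a regular index: pause following, close the index, run POST my_index_v2/_ccr/unfollow, and reopen it. Unfollow only works on a paused, closed follower.
  2. Switch your application endpoint from Blue to Green.
  3. Optionally, start replicating back for rollback. This means creating a follower on Blue that follows Green, so it only works if Blue's version is allowed to follow Green's.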

Green is now the active environment; Blue can be kept as a standby or backup.
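
The promotion in step 1 takes four requests, because the unfollow API only accepts a follower that is paused and closed. A sketch of the full sequence, run on Green:

```console
# 1. Stop pulling changes from Blue
POST my_index_v2/_ccr/pause_follow

# 2. The index must be closed before it can be unfollowed
POST my_index_v2/_close

# 3. Convert the follower into a regular index (cannot be undone)
POST my_index_v2/_ccr/unfollow

# 4. Reopen it so it accepts reads and writes
POST my_index_v2/_open
```

After the last request, my_index_v2 behaves like any other index and accepts writes directly.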

CCR vs. Reindex/Snapshot

Method | Speed | Real-time | Cross-cluster | License
_reindex | Medium | No | Optional | Free
Snapshot & Restore | Slow | No | Yes | Free
CCR | Fast | Yes | Yes | Paid (Platinum/Enterprise)

CCR is ideal for:

  • Real-time migrations
  • Large, high-volume clusters
  • Zero-downtime cross-region setups