Failing over with single Replication Group on ElastiCache Redis

I'm testing out ElastiCache backed by Redis with the following specs:

  • Using Redis 2.8, with Multi-AZ
  • Single replication group
  • 1 master node in us-east-1b, 1 slave node in us-east-1c, 1 slave node in us-east-1d
  • The part of the application writing is directly using the endpoint for the master node (primary-node.use1.cache.amazonaws.com)
  • The part of the application doing only reads is pointing to a custom endpoint (readonly.redis.mydomain.com) configured in HAProxy, which then points to the two read slave endpoints (readslave1.use1.cache.amazonaws.com and readslave2.use1.cache.amazonaws.com)

Now let's say the primary node (master) fails in us-east-1b.

From what I understand, if the master instance fails, I won't have to change the URL of the endpoint for writing to Redis (primary-node.use1.cache.amazonaws.com). That still leaves me with the following questions:

  • Do I have to change the endpoint names for the read only slaves?
  • How long until the missing slave is added into the pool?

If there's anything else I'm missing, I'd appreciate the advice/information.

Thanks!

If you are using ElastiCache, you should make use of the "Primary Endpoint" provided by AWS.

That endpoint is backed by Route53. If the primary (master) Redis node goes down and Multi-AZ is enabled, it automatically fails over to one of the read replicas (slaves). In that case, you don't need to modify your Redis endpoint.

I'm not sure why you chose this design; it seems you want to write only to the master and always read from the slaves. For the HAProxy part, you should include a TCP check for ALL 3 Redis nodes, using their "Read Endpoint". In HAProxy, you can check whether a node is currently a SLAVE and, if it is, route read traffic to it.
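To illustrate the kind of role check described above, here is a minimal Python sketch that asks a node for `INFO replication` over a plain TCP connection and reads the `role` field. The function names (`parse_role`, `query_role`, `is_read_replica`) are hypothetical, and it assumes the node accepts unauthenticated connections on port 6379 and returns the role within the first chunk of the reply.

```python
import socket


def parse_role(info_text: str) -> str:
    """Return the value of the `role` field from `INFO replication` output."""
    for line in info_text.splitlines():
        line = line.strip()
        if line.startswith("role:"):
            return line.split(":", 1)[1]
    raise ValueError("no role field in INFO output")


def query_role(host: str, port: int = 6379, timeout: float = 2.0) -> str:
    """Send `INFO replication` as an inline command and parse the reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"INFO replication\r\n")
        # Assumption: the role line appears near the start of the reply,
        # so a single recv() is enough for this sketch.
        data = sock.recv(4096).decode("utf-8", errors="replace")
    return parse_role(data)


def is_read_replica(host: str, port: int = 6379) -> bool:
    """True when the node reports role:slave; False on any error."""
    try:
        return query_role(host, port) == "slave"
    except (OSError, ValueError):
        return False
```

Calling `is_read_replica("readslave1.use1.cache.amazonaws.com")` against each of the three nodes would tell you which ones should currently receive read traffic. HAProxy can run a comparable check itself with its TCP check send/expect rules, so a script like this is mainly useful for verifying what the health check should see.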

Note that at the application layer, if your Redis driver doesn't support auto reconnect, your script will fail to connect to the new master node.
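If your driver does not reconnect on its own, you can wrap calls in a small retry helper. The sketch below is generic and not tied to any particular Redis library: `call_with_reconnect`, `connect` and `operation` are hypothetical names, and you would need to pass your driver's own connection exception types in `retry_on`, since many drivers define their own.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

C = TypeVar("C")
R = TypeVar("R")


def call_with_reconnect(
    connect: Callable[[], C],
    operation: Callable[[C], R],
    attempts: int = 5,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (OSError,),
    sleep: Callable[[float], None] = time.sleep,
) -> R:
    """Run operation(connect()); on a connection error, reconnect and retry."""
    last_error = None
    for attempt in range(attempts):
        try:
            # A fresh connection is built on every attempt, so a failed-over
            # endpoint is picked up as soon as it becomes reachable.
            conn = connect()
            return operation(conn)
        except retry_on as exc:
            last_error = exc
            if attempt < attempts - 1:
                # Exponential backoff: base_delay, 2x, 4x, ...
                sleep(base_delay * (2 ** attempt))
    raise ConnectionError(f"gave up after {attempts} attempts") from last_error
```

Building a new connection on each attempt, rather than reusing the old one, is deliberate: the old socket is still bound to the failed master.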

In addition to auto reconnect: since AWS uses Route53 DNS to do the failover, some libraries will NOT perform a DNS lookup again after the first connection, which means the client keeps using the OLD IP, which belongs to the old master.
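One way to see whether you are affected is to re-resolve the primary endpoint yourself and compare the result with what you resolved earlier. The following is a rough sketch using only the Python standard library; `resolve_ipv4` and `EndpointWatcher` are hypothetical names, and it assumes the operating system's resolver does not cache the old answer beyond the record's TTL.

```python
import socket
from typing import Callable, List


def resolve_ipv4(hostname: str, port: int = 6379) -> List[str]:
    """Ask the system resolver for the current IPv4 addresses of hostname."""
    infos = socket.getaddrinfo(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is an (ip, port) tuple.
    return sorted({info[4][0] for info in infos})


class EndpointWatcher:
    """Remembers the last resolved addresses and reports when they change."""

    def __init__(self, hostname: str,
                 resolver: Callable[[str], List[str]] = resolve_ipv4):
        self.hostname = hostname
        self._resolver = resolver
        self.addresses = resolver(hostname)

    def refresh(self) -> bool:
        """Re-resolve the hostname; return True if the address set changed."""
        current = self._resolver(self.hostname)
        changed = current != self.addresses
        self.addresses = current
        return changed
```

A client could call `refresh()` on a watcher for primary-node.use1.cache.amazonaws.com before each reconnect attempt and, when it returns True, discard any pooled connections that still point at the old address.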

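Using HAProxy can solve this problem, provided HAProxy itself is set up to re-resolve the endpoint at runtime (by default it resolves server hostnames only at startup).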
