Rancher for Microservices: Upgrades and Rollback

So far we’ve seen how easy it is to get up and running with Rancher. We also deployed a very simple HTTP service on our Rancher cluster, attached an L7 load balancer, and successfully scaled up the containers running this service.

In this post, I’ll use the same service with a slight modification: the version number returned in the HTTP response is now 2.0. So far our service has been on v1.0. Let’s say we’ve worked very hard and released a new version with the latest features. We want to release it to our users while ensuring there is no downtime during deployment. At this stage, the Docker image for service 2.0 has been pushed to the Docker repository (ravirdv/app:2.0).

In a world without container orchestration platforms, we’d have to write scripts to spin up compute resources (EC2, VMs, etc.) and then use something like Ansible, Chef, or Fabric scripts to provision the required services and dependencies. Once that is done, we’d push our package and hope there is no dependency or version mismatch and that our service starts up. After the new version of the service is up, we’d slowly migrate our traffic to it and clean up the compute resources of the old version, or maybe keep them running as a standby. There are variations on how this is done, but this is roughly how people did it in the past.

Enough of the history lesson 🙂, let’s get back to how we can achieve a similar result with more predictability and fault tolerance using Rancher. Rancher calls a collection of containers a “Workload”; in this case, our HTTP service containers are part of the same workload. So when we ask Rancher to upgrade the workload, it spins up a new set of containers with the specified revision of the Docker image and instructs our L7 load balancer to gradually redirect traffic to the new set of containers. Rancher does all of this for you, without breaking a sweat! Awesome, isn’t it? 🤩
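To make the "gradually redirect traffic" idea a bit more concrete, here is a toy model of a rolling upgrade in Python. This is only a sketch of the general technique, not Rancher's actual implementation: the function name `rolling_upgrade`, the `batch_size` parameter, and the idea of representing the load balancer pool as a plain list of version strings are all my own assumptions for illustration.

```python
def rolling_upgrade(pool, new_version, batch_size=1):
    """Toy rolling upgrade: return the load balancer pool after each step.

    pool is a list of version strings, one per container serving traffic.
    This is an illustrative model only, not how Rancher is implemented.
    """
    current = list(pool)
    desired = len(current)
    snapshots = []
    while any(version != new_version for version in current):
        old_count = sum(1 for version in current if version != new_version)
        batch = min(batch_size, old_count)
        # Start the replacement containers first and add them to the pool,
        # so serving capacity never drops below the desired count.
        current.extend([new_version] * batch)
        # Then retire the same number of old containers.
        for _ in range(batch):
            index = next(i for i, v in enumerate(current) if v != new_version)
            current.pop(index)
        assert len(current) >= desired
        snapshots.append(list(current))
    return snapshots


steps = rolling_upgrade(["1.0", "1.0", "1.0"], "2.0")
for step in steps:
    print(step)
```

Each printed line is the pool after one step: the share of 2.0 containers grows while the pool size stays constant, which is why clients see a mix of versions for a short while instead of an outage.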

Upgrading a workload is very simple. You just click Edit, update the image version, and hit the “Upgrade” button. Rancher will then fetch the Docker image from the registry and provision a set of containers using that image.

Demo time! Let’s try to upgrade our service from v1.0 to v2.0 and see how our curl script handles it.

Looking at the “App Version” value, we can see that curl gradually starts receiving v2.0 responses.
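If you'd rather count responses than eyeball them, the same check can be sketched in Python. Treat this as a rough equivalent of the curl loop, under a few assumptions of mine: the response body contains the text "App Version" followed by a dotted number (the exact response format isn't shown here), and the names `parse_version`, `http_get`, and `watch_versions` are made up for this example.

```python
import re
import time
import urllib.request
from collections import Counter

# Assumption: the body contains something like `App Version: 2.0`
# or `"App Version": "2.0"`. Adjust the pattern to the real format.
VERSION_RE = re.compile(r"App Version\W*(\d+(?:\.\d+)*)")


def parse_version(body):
    """Extract the version number from a response body, or "unknown"."""
    match = VERSION_RE.search(body)
    return match.group(1) if match else "unknown"


def http_get(url):
    """Fetch a URL and return its body as text."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8", errors="replace")


def watch_versions(url, requests=20, delay=0.5, fetch=http_get):
    """Poll the service and tally which version answered each request."""
    counts = Counter()
    for _ in range(requests):
        try:
            counts[parse_version(fetch(url))] += 1
        except OSError:
            # URLError and socket timeouts are subclasses of OSError.
            counts["error"] += 1
        time.sleep(delay)
    return counts
```

Calling `watch_versions` with your load balancer's address while the upgrade is running should show the tally shifting from 1.0 to 2.0, and any entries under "error" would hint at downtime during the switch. The `fetch` parameter is there so the loop can be tried with a fake function before pointing it at a real service.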

It takes a few more seconds to switch all containers to the new version. The screen capture below shows that all responses are now from 2.0.

Now that our service is upgraded, the “App Version” value in the response is 2.0, and the container hosts are different too.

I was completely blown away by this feature. This “nuke and pave” approach gives me much more confidence than modifying an existing server configuration. However, things may go wrong. Maybe there is a last-minute bug that impacts a set of users, or a top-secret feature that shouldn’t have been part of this release, or really anything else where you’d wish your infrastructure had an “undo” button.

And you guessed it! While not really an undo button, Rancher makes it very easy to roll back to the previous version. The process for a rollback is the same as for an upgrade. Let’s say we found an issue in our cutting-edge 2.0 release and have decided to roll back to version 1.0 ASAP 🔥. Let’s see how Rancher handles it:

As you can see from the HTTP responses, the service 2.0 containers are replaced with version 1.0 containers. This would be much messier if the services were not stateless; I’ll write more about that when covering persistent storage. If you’re new to containerization and its ecosystem, I highly recommend starting with stateless services. In my opinion, running your app on a local machine with Docker Compose is the easiest way to start.
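The point that a rollback is really just another upgrade can be sketched in a few lines of Python. This is a toy model of my own, not Rancher's API: the `Workload` class and its methods are hypothetical, and the `ravirdv/app:1.0` tag is an assumption (only the 2.0 tag is mentioned above).

```python
class Workload:
    """Toy model of a workload that remembers which images it has run."""

    def __init__(self, image):
        self.history = [image]

    @property
    def image(self):
        """The image the workload is currently running."""
        return self.history[-1]

    def upgrade(self, image):
        """Record the new image as the latest revision and return it."""
        self.history.append(image)
        return image

    def rollback(self):
        """Roll back by upgrading to the previous image again."""
        if len(self.history) < 2:
            raise ValueError("no previous revision to roll back to")
        return self.upgrade(self.history[-2])


# Assumed tag for the old release; only ravirdv/app:2.0 appears in this post.
service = Workload("ravirdv/app:1.0")
service.upgrade("ravirdv/app:2.0")
service.rollback()
print(service.image)
```

Because the images themselves are immutable and the service is stateless, going back to 1.0 needs nothing more than pointing the workload at the old image. That is exactly what gets harder once persistent state is involved.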

I hope this series of posts gives you enough ammunition to start playing with Rancher. In the next series of posts, I’ll write about how you can use this platform to perform service discovery, handle configuration changes, store secrets, and, most importantly, debug and monitor services.

If you found this interesting, or if there is something I can improve, please let me know in the comments.