Zero Downtime Releases using Kubernetes and Istio

The idea behind a zero-downtime release is to release a new version of a service without affecting any users; ideally, users don't even notice when a new version goes out. A practical example: if you have a website running, how can you release a new version without taking the site down? For services, it means that callers can make continuous requests while the new version is being released and never get that dreaded 503 Service Unavailable response.

In this article, I'll explain two ways to do zero-downtime deployments: one using only Kubernetes and a second one using the Istio service mesh.

Using Kubernetes (Rolling Updates)

As the first option for zero-downtime deployments, we are going to use “plain” Kubernetes, without any Istio involved. We are going to use a simple Node.js web application that shows a “Hello World!” message.
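
As a rough sketch, a helloworld.yaml along the following lines would match the description in this section: a service and a deployment both named helloworld, three replicas, a container named web, and port 3000. The app: helloworld label, the port name, and the container port are assumptions made for illustration and may differ from the real manifest.

```yaml
# helloworld.yaml (illustrative sketch; labels, port name, and containerPort are assumptions)
apiVersion: v1
kind: Service
metadata:
  name: helloworld
  labels:
    app: helloworld
spec:
  selector:
    app: helloworld
  ports:
    - name: http
      port: 3000
      targetPort: 3000
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloworld
  labels:
    app: helloworld
spec:
  replicas: 3
  selector:
    matchLabels:
      app: helloworld
  template:
    metadata:
      labels:
        app: helloworld
    spec:
      containers:
        - name: web
          image: learnistio/helloworld:1.0.0
          ports:
            - containerPort: 3000
```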

You can deploy the above YAML using kubectl apply -f helloworld.yaml. This creates a Kubernetes service called helloworld and a Kubernetes deployment with three replicas and with an image called learnistio/helloworld:1.0.0 — pretty much the simplest YAML you can come up with to run something inside Kubernetes.

To access the website, you can use the port-forward command from the Kubernetes CLI. This command opens a tunnel between the service running inside the cluster and your local machine:

kubectl port-forward service/helloworld 3000:3000

Open http://localhost:3000 in the browser of your choice, and you should get an ugly looking page like the one in the figure below:

Now, we want to update this website with a newer release. Assuming you built and created an image called learnistio/helloworld:2.0.0, we need to update the deployment with this new image. Not only that, we need to update the deployment in such a way that it doesn't impact anyone who might be trying to access the website.

Kubernetes supports zero-downtime deployments through a strategy called RollingUpdate, which is set on individual Kubernetes deployments and is also the default for every deployment. The other strategy is called Recreate, and as the name suggests, it kills all existing pods and then creates the new ones.

The default rolling update strategy updates the pods in a rolling fashion. To control the process, you can use two settings: maxUnavailable and maxSurge. With the first setting, you can specify how many pods (expressed as an absolute number or as a percentage of the desired replicas) can be unavailable during the update (the default value is 25%). So if you have four replicas running and you do a rolling update, at most one pod can be unavailable at any time, which means at least three pods (75%) keep serving traffic while new pods are coming online. With the maxSurge option, you control how many pods can be created over the desired number of pods (the default is also 25%). If we take the previous example with four replicas and set this value to 50%, the rolling update process will scale up the new pods in a way that the total number of old and new pods won't exceed 150% of the desired pods (i.e., six pods).
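
To make the two settings concrete, here is a sketch of how they might appear in a deployment spec, using the four-replica example from this section. It is only a fragment, not a complete manifest, and the values are the ones from the example rather than a recommendation.

```yaml
# Fragment of a Deployment spec (illustrative; not a complete manifest)
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 25%   # at most 1 of the 4 pods may be unavailable
      maxSurge: 50%         # at most 2 extra pods, so no more than 6 in total
```

Both fields also accept absolute numbers (for example, maxUnavailable: 1) if percentages are awkward for small replica counts.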

To update the existing deployment with the new image version, we will use the set command in the Kubernetes CLI. Let's run the command to perform the rolling update:

kubectl set image deployment helloworld web=learnistio/helloworld:2.0.0 --record

The above command sets the image for the container named web to learnistio/helloworld:2.0.0 and records the command in the deployment's change-cause annotation (so the rollout history shows what changed if you need to roll back).

Let's run another command that will show the progress of the rolling update:

$ kubectl rollout status deployment helloworld
Waiting for deployment "helloworld" rollout to finish: 2 out of 3 new replicas have been updated...
helloworld successfully rolled out

During all this time, you could have accessed the website and gotten either the v1 or the v2 home page. Once the command completes, though, you should only be getting responses from the v2 version of the home page.

Using Istio

You might wonder why in the heck would I use Istio to do rolling updates — the Kubernetes option is much simpler. That's true, and you probably shouldn't be using Istio if zero-downtime releases are the only thing you're going to use it for. You can achieve the same behavior with Istio; however, you have way more control over how and when the traffic gets routed to specific versions.

To use Istio for zero-downtime releases, you need to keep these things in mind:

One deployment per version

Each deployment of the service needs to be versioned: you need a label such as version: v1 (or something similar), and the deployment name should make it clear which version it represents (e.g., helloworld-v1). Usually, you'd have at a minimum these two labels set on each deployment:

labels:
  app: helloworld
  version: v1

You could also include a bunch of other labels if it makes sense, but you should have a label that gets used for identifying your component and its version.
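
To show where those labels go, here is a sketch of what a versioned deployment could look like. The replica count, container name, image tag, and port are carried over from the Kubernetes example earlier in the article; the container port is an assumption.

```yaml
# helloworld-v1 deployment (illustrative sketch; containerPort is an assumption)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloworld-v1
  labels:
    app: helloworld
    version: v1
spec:
  replicas: 3
  selector:
    matchLabels:
      app: helloworld
      version: v1
  template:
    metadata:
      labels:
        app: helloworld   # identifies the component
        version: v1       # identifies the version; matched by the Istio subset
    spec:
      containers:
        - name: web
          image: learnistio/helloworld:1.0.0
          ports:
            - containerPort: 3000
```

The labels under template.metadata are the ones that end up on the pods, so those are the ones the destination rule subsets select on.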

Generic Kubernetes service

The Kubernetes service should be generic: there is no need for the version in the selector; the app/component name is enough.
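
For instance, a version-agnostic service might look like the sketch below. The app: helloworld label and port 3000 follow the examples in this article; the port name is an assumption.

```yaml
# Generic service that selects every version of helloworld (illustrative sketch)
apiVersion: v1
kind: Service
metadata:
  name: helloworld
spec:
  selector:
    app: helloworld   # no version label here; Istio subsets pick the version
  ports:
    - name: http      # assumed port name
      port: 3000
```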

Keep destination rules up to date

Start with a destination rule that contains versions you are currently running and make sure you keep it in sync. There's no need to end up with a destination rule that has a bunch of unused or obsolete subsets.

Define a fallback version

If you are using matching and conditions, always define a “fallback” version of the service in your virtual service. If you don't, any requests not matching the conditions will end up in digital heaven and won't be served.
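
As an illustration of a fallback, the sketch below routes requests that carry a hypothetical x-user-group: beta header to the v2 subset and sends everything else to v1. The header name and value are made up for this example, and it assumes the destination rule already defines both the v1 and v2 subsets.

```yaml
# Virtual service with a match condition and a fallback route (illustrative sketch)
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: helloworld
spec:
  hosts:
    - helloworld
  http:
    # Requests carrying the hypothetical header go to v2
    - match:
        - headers:
            x-user-group:
              exact: beta
      route:
        - destination:
            host: helloworld
            port:
              number: 3000
            subset: v2
    # Fallback: anything that did not match above goes to v1
    - route:
        - destination:
            host: helloworld
            port:
              number: 3000
            subset: v1
```

The http routes are evaluated in order, so the route without a match block needs to come last.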

With those guidelines in mind, here's a rough process for doing a zero-downtime release using Istio. We are starting with a Kubernetes deployment called helloworld-v1, a destination rule with one subset (v1), and a virtual service that routes all traffic to the v1 subset. Here's the destination rule YAML:

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: helloworld
spec:
  host: helloworld.default.svc.cluster.local
  subsets:
    - name: v1
      labels:
        version: v1

And here is the corresponding virtual service:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: helloworld
spec:
  hosts:
    - helloworld
  http:
    - route:
        - destination:
            host: helloworld
            port:
              number: 3000
            subset: v1
          weight: 100

With these resources deployed, all traffic gets happily routed to the v1 version. Let's start and roll out the v2:

Deploy the modified destination rule that adds the new subset:

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: helloworld
spec:
  host: helloworld.default.svc.cluster.local
  subsets:
    - name: v1
      labels:
        version: v1
    - name: v2
      labels:
        version: v2

Deploy the updated virtual service with 100% of the traffic routed to the v1 subset.
Create/deploy the helloworld-v2 Kubernetes deployment.
Update the virtual service and re-deploy it to route x% of the traffic to the v1 subset and y% of the traffic to the new v2 subset. There are multiple ways you can do this: you can gradually route more and more traffic to v2 (e.g., in 10% increments), do a straight 50/50 split between versions, or even route 100% of the traffic to the new v2 subset right away. Regardless of which way you go, you are going to get a zero-downtime deployment and end up with a 0/100 traffic split between v1 and v2.
Remove the v1 subset from the virtual service and re-deploy the virtual service.
Remove the v1 subset from the destination rule and re-deploy it.
Delete the v1 Kubernetes deployment.
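
The traffic-shifting step in the process above could look like the following sketch, which sends 90% of requests to v1 and 10% to v2. The 90/10 split is just an example starting point; you would re-deploy the virtual service with adjusted weights until v2 receives 100%.

```yaml
# Virtual service splitting traffic between v1 and v2 (illustrative sketch)
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: helloworld
spec:
  hosts:
    - helloworld
  http:
    - route:
        - destination:
            host: helloworld
            port:
              number: 3000
            subset: v1
          weight: 90   # share of traffic that stays on the old version
        - destination:
            host: helloworld
            port:
              number: 3000
            subset: v2
          weight: 10   # share of traffic sent to the new version
```

The weights across the destinations of a route should add up to 100.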

If you got to this part, all traffic is now flowing to the v2 subset and you don't have any v1 artifacts running anymore.