Circuit breakers in Istio explained

Peter Jausovec

May 21, 2019

Circuit breaking is an important pattern that can help with service resiliency. This pattern is used to prevent additional failures by controlling and managing access to the failing services.

The easiest way to explain circuit breakers is with a simple example: a frontend called Hello web and a backend called Greeter service. Let's say the greeter service starts failing. Instead of calling it continuously, we could detect the failures and either stop or reduce the number of requests made to the service. If we added a database to this example, you could quickly imagine how continuing to call the service could put more stress on different parts of the system and potentially make everything even worse. This scenario is where the circuit breaker comes into play. We define the conditions under which we want the circuit breaker to trip (for example, if we get more than 10 failures within a 5-second period). Once the circuit breaker trips, we stop making calls to the underlying service and instead return an error directly from the circuit breaker. This way, we prevent additional strain and damage to the system.
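
To make the idea concrete, here is a minimal shell sketch of the trip condition from the example (more than 10 failures within 5 seconds). It only illustrates the pattern and is not how Istio or Envoy implement it. The names THRESHOLD, WINDOW, record_failure, and is_open are made up for this example, and timestamps are passed in as plain seconds so the logic is easy to follow.

```shell
# Illustrative values taken from the example in the text.
THRESHOLD=10   # trip when there are MORE than this many failures...
WINDOW=5       # ...within this many seconds
failures=""    # space-separated timestamps (in seconds) of recent failures

# record_failure NOW: remember a failure at time NOW and forget failures
# that are older than the window.
record_failure() {
  now=$1
  kept=""
  for t in $failures $now; do
    if [ $((now - t)) -lt "$WINDOW" ]; then
      kept="$kept $t"
    fi
  done
  failures=$kept
}

# is_open NOW: succeeds (exit status 0) when the breaker has tripped,
# that is, when more than THRESHOLD failures fall inside the window.
is_open() {
  now=$1
  count=0
  for t in $failures; do
    [ $((now - t)) -lt "$WINDOW" ] && count=$((count + 1))
  done
  [ "$count" -gt "$THRESHOLD" ]
}

# Eleven failures in the same second trip the breaker.
i=0
while [ "$i" -lt 11 ]; do
  record_failure 100
  i=$((i + 1))
done
if is_open 100; then
  echo "open: return an error without calling the service"
fi
```

A caller would check is_open before each request and skip the call while the breaker is open. Once the failures age out of the window, is_open fails again and requests can resume.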

In Istio, circuit breakers are defined in the destination rule. The circuit breaker tracks the status of each host, and if any of those hosts starts to fail, it ejects that host from the pool. Practically speaking, if we have five instances of our pod running, the circuit breaker will eject any of the instances that misbehave, so that calls are no longer made to those hosts. Ejection is controlled by outlier detection, which can be configured by setting the following properties:

  • number of consecutive errors
  • scan interval
  • base ejection time

In addition to outlier detection, we can also set connection pool properties, such as the maximum number of requests and the number of requests per connection made to the service.

Let's look at an example for the greeter service:

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: greeter-service
spec:
  host: greeter-service.default.svc.cluster.local
  trafficPolicy:
    connectionPool:
      http:
        http2MaxRequests: 10
        maxRequestsPerConnection: 5
    outlierDetection:
      consecutiveErrors: 3
      interval: 10s
      baseEjectionTime: 10m
      maxEjectionPercent: 10
  subsets:
    - name: v1
      labels:
        version: v1
    - name: v2
      labels:
        version: v2

The above rule limits the connection pool to a maximum of 10 concurrent HTTP requests to the greeter service, with no more than five requests per connection. With the outlier detection properties, we scan the hosts every 10 seconds (the default value), and if any host fails three consecutive times (the default is 5) with a 5xx error, it is ejected for 10 minutes. At most 10% of the hosts can be ejected at any one time.
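
One detail worth knowing about baseEjectionTime: according to the Istio documentation, it is a minimum, and a host stays ejected for the base ejection time multiplied by the number of times it has been ejected. The sketch below works through that arithmetic for the 10m value used above. The function name ejection_seconds is made up for this example, and any upper bound that a particular Envoy version may apply is not modeled.

```shell
# ejection_seconds BASE_SECONDS TIMES_EJECTED
# Prints how long a host stays out of the pool, assuming the duration is the
# base ejection time multiplied by the number of times the host was ejected.
ejection_seconds() {
  echo $(( $1 * $2 ))
}

ejection_seconds 600 1   # first ejection with baseEjectionTime: 10m -> 600
ejection_seconds 600 3   # third ejection of the same host -> 1800
```

In other words, a host that keeps misbehaving is kept out of the pool for longer each time it is ejected.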

Let's deploy the destination rule that configures a simple circuit breaker:

cat <<EOF | kubectl apply -f -
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: greeter-service
spec:
  host: greeter-service
  trafficPolicy:
    connectionPool:
      http:
        http1MaxPendingRequests: 1
        maxRequestsPerConnection: 1
    outlierDetection:
      consecutiveErrors: 1
      interval: 1s
      baseEjectionTime: 2m
      maxEjectionPercent: 100
EOF

To demonstrate circuit breaking, we will use a load-testing tool called Fortio. With Fortio, we can easily control the number of connections, the concurrency, and the delays of the outgoing HTTP calls. Let's deploy Fortio:

cat <<EOF | kubectl create -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fortio-deploy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: fortio
  template:
    metadata:
      labels:
        app: fortio
    spec:
      containers:
      - name: fortio
        image: istio/fortio:latest_release
        imagePullPolicy: Always
        ports:
        - containerPort: 8080
          name: http-fortio
        - containerPort: 8079
          name: grpc-ping
EOF

Next, we will deploy the greeter service:

cat <<EOF | kubectl create -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: greeter-service-v1
  labels:
    app: greeter-service
    version: v1
spec:
  replicas: 3
  selector:
    matchLabels:
      app: greeter-service
      version: v1
  template:
    metadata:
      labels:
        app: greeter-service
        version: v1
    spec:
      containers:
        - image: learnistio/greeter-service:1.0.0
          imagePullPolicy: Always
          name: svc
          ports:
            - containerPort: 3000
---
kind: Service
apiVersion: v1
metadata:
  name: greeter-service
  labels:
    app: greeter-service
spec:
  selector:
    app: greeter-service
  ports:
    - port: 3000
      name: http
EOF

Finally, let's make a simple call from the Fortio pod to the greeter service:

export FORTIO_POD=$(kubectl get pod | grep fortio | awk '{ print $1 }')
kubectl exec -it $FORTIO_POD -c fortio -- /usr/local/bin/fortio load -curl http://greeter-service:3000/hello


HTTP/1.1 200 OK
x-powered-by: Express
content-type: application/json; charset=utf-8
content-length: 43
etag: W/"2b-DdO+hdtaORahq7JZ8niOkjoR0XQ"
date: Fri, 04 Jan 2019 00:53:19 GMT
x-envoy-upstream-service-time: 7
server: envoy

{"message":"hello 👋 ","version":"1.0.0"}

With the above command, we make a single call to the greeter service, and it works: we get the response back. Let's try to trip the circuit breaker now. To make the circuit breaker trip faster, we will decrease the number of replicas in our greeter service deployment from 3 to 1.

kubectl scale deploy greeter-service-v1 --replicas=1

Now we can use Fortio and make 20 requests with 2 concurrent connections:

kubectl exec -it $FORTIO_POD -c fortio -- /usr/local/bin/fortio load -c 2 -qps 0 -n 20 -loglevel Warning http://greeter-service:3000/hello

In the output, you will notice the following lines:

Code 200 : 19 (95.0 %)
Code 503 : 1 (5.0 %)

This tells us that 19 of the 20 requests (95%) succeeded and one (5%) failed. Let's increase the number of concurrent connections to 3 and send 50 requests:

kubectl exec -it $FORTIO_POD -c fortio -- /usr/local/bin/fortio load -c 3 -qps 0 -n 50 -loglevel Warning http://greeter-service:3000/hello

Now we are getting more failures:

Code 200 : 41 (82.0 %)
Code 503 : 9 (18.0 %)

This tells us that 82% of the requests succeeded, and the rest were caught by the circuit breaker. Another way to see the calls that were trapped by the circuit breaker is to query the Istio proxy stats:

$ kubectl exec -it $FORTIO_POD  -c istio-proxy  -- sh -c 'curl localhost:15000/stats' | grep greeter-service | grep pending
cluster.outbound|3000||greeter-service.default.svc.cluster.local.upstream_rq_pending_active: 0
cluster.outbound|3000||greeter-service.default.svc.cluster.local.upstream_rq_pending_failure_eject: 107
cluster.outbound|3000||greeter-service.default.svc.cluster.local.upstream_rq_pending_overflow: 9
cluster.outbound|3000||greeter-service.default.svc.cluster.local.upstream_rq_pending_total: 2193

The above stats show that 9 calls have been flagged for circuit breaking (the upstream_rq_pending_overflow counter), which equals the number of failed requests we had with Fortio.
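
If you only care about that one counter, a small filter can pull it out of the stats output. This is a sketch, and the function name pending_overflow is made up for this example. It reads the stats text from standard input, so it can be tried here on lines copied from the output above without a cluster.

```shell
# pending_overflow: read Envoy stats from stdin and print the value of the
# upstream_rq_pending_overflow counter for the greeter service.
# The tr call strips carriage returns, which can appear when the stats are
# fetched through a TTY (kubectl exec -it).
pending_overflow() {
  tr -d '\r' | awk -F': ' '/greeter-service/ && /upstream_rq_pending_overflow/ { print $2 }'
}

# Sample input copied from the stats output in the article.
printf '%s\n' \
  'cluster.outbound|3000||greeter-service.default.svc.cluster.local.upstream_rq_pending_active: 0' \
  'cluster.outbound|3000||greeter-service.default.svc.cluster.local.upstream_rq_pending_overflow: 9' \
  | pending_overflow
```

Against a live cluster, the same function can be placed at the end of the stats command in place of the two grep calls.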
