AKS with Istio Service Mesh

Securing traffic with Istio service mesh on AKS

Update: Aug 14, 2021: reduced commands, gists for sequences of 4+ commands, fixes

So now you have your distributed application, monolith, or microservices packaged as a container and deployed to Kubernetes. Congratulations!

But now you need security, such as encrypted traffic and network firewalls, and for all of these secured services, you need monitoring and proper load balancing of gRPC (HTTP/2) traffic, especially as Kubernetes falls short in this department (ref).

And all of these things need to happen every time you roll out a new pod.

The Solution

This article will cover how to get started with Istio, coupled with the famous Envoy proxy, which together form one of the most popular service mesh platforms on Kubernetes.

Goals and Non-Goals

  1. Install AKS with Calico and install Istio with Istio addons.
  2. Install Dgraph and some clients (a Python script using pydgraph).
  3. Test that outside traffic is blocked after installing the network policy.
  4. Test that traffic works through the service mesh.
  5. Generate traffic (gRPC and HTTP) and observe it in Kiali.

The non-goals (reserved for later):

  • Restricting traffic within the mesh to authorized clients.
  • Automatic authentication (AuthN, JWT, etc.) for mesh members.
  • Managing external inbound or outbound traffic through Gateways.
  • Other traffic management features, such as retries and circuit breaking.

Architecture

Istio Architecture: Control Plane vs. Data Plane

A service mesh can be logically organized into two primary layers:

a control plane layer that’s responsible for configuration and management, and a data plane layer that provides network functions valuable to distributed applications. (ref)

Articles in Series

  1. AKS with Azure Container Registry
  2. AKS with Calico network policies
  3. AKS with Linkerd service mesh
  4. AKS with Istio service mesh (this article)

Requirements

Required Tools

  • Azure CLI tool (az): command line tool that interacts with Azure API.
  • Kubernetes client tool (kubectl): command line tool that interacts with the Kubernetes API.
  • Helm (helm): command line tool for “templating and sharing Kubernetes manifests” (ref) that are bundled as Helm chart packages.
  • helm-diff plugin: allows you to see the changes made with helm or helmfile before applying the changes.
  • Helmfile (helmfile): command line tool that uses a “declarative specification for deploying Helm charts across many environments” (ref).
  • Istio CLI (istioctl): command line tool to configure and deploy the Istio environment.

Optional Tools

  • POSIX shell (sh) such as GNU Bash (bash) or Zsh (zsh): the scripts in this guide were tested using either of these shells on macOS and Ubuntu Linux.
  • Docker (docker): command line tool to build, test, and push docker images.

Project setup

~/azure_istio
├── addons
│   ├── grafana.yaml
│   ├── jaeger.yaml
│   ├── kiali.yaml
│   ├── prometheus.yaml
│   ├── prometheus_vm.yaml
│   └── prometheus_vm_tls.yaml
├── env.sh
└── examples
    ├── dgraph
    │   ├── helmfile.yaml
    │   └── network_policy.yaml
    └── pydgraph
        ├── Dockerfile
        ├── Makefile
        ├── helmfile.yaml
        ├── load_data.py
        ├── requirements.txt
        ├── sw.nquads.rdf
        └── sw.schema

With either Bash or Zsh, you can create the file structure with the following commands:
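Something like the following (a sketch of the original gist) creates the tree above:

mkdir -p ~/azure_istio/addons ~/azure_istio/examples/{dgraph,pydgraph}
cd ~/azure_istio
touch env.sh \
  addons/{grafana,jaeger,kiali,prometheus,prometheus_vm,prometheus_vm_tls}.yaml \
  examples/dgraph/{helmfile,network_policy}.yaml \
  examples/pydgraph/{Dockerfile,Makefile,helmfile.yaml,load_data.py} \
  examples/pydgraph/{requirements.txt,sw.nquads.rdf,sw.schema}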

Project environment variables

Copy this source script and save as env.sh:
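The original script is embedded as a gist; a minimal sketch, where every value is a placeholder assumption to adapt:

# env.sh: project environment variables (values are examples, not the originals)
export AZ_RESOURCE_GROUP="azure-istio"       # hypothetical resource group name
export AZ_LOCATION="westus2"                 # hypothetical Azure region
export AZ_CLUSTER_NAME="aks-istio-demo"      # hypothetical AKS cluster name
export AZ_ACR_NAME="aksistiodemo"            # hypothetical ACR name (must be globally unique)
export KUBECONFIG="$HOME/.kube/config"       # kubeconfig used by kubectl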

Provision Azure resources

Azure Resources

Both the AKS cluster (with Azure CNI and Calico network policies) and the ACR cloud resources can be provisioned with the steps outlined in the script below.
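A sketch of those steps, assuming the variables from env.sh above (the original gist may differ):

source env.sh
# create the resource group and container registry
az group create --name ${AZ_RESOURCE_GROUP} --location ${AZ_LOCATION}
az acr create --resource-group ${AZ_RESOURCE_GROUP} --name ${AZ_ACR_NAME} --sku Basic
# create an AKS cluster with Azure CNI and Calico network policies
az aks create \
  --resource-group ${AZ_RESOURCE_GROUP} \
  --name ${AZ_CLUSTER_NAME} \
  --node-count 3 \
  --network-plugin azure \
  --network-policy calico \
  --attach-acr ${AZ_ACR_NAME}
# fetch credentials for kubectl
az aks get-credentials \
  --resource-group ${AZ_RESOURCE_GROUP} \
  --name ${AZ_CLUSTER_NAME} \
  --file ${KUBECONFIG}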

Verify AKS and KUBECONFIG

source env.sh
kubectl get all --all-namespaces

The results should look something like the following:

AKS with Azure CNI and Calico

NOTE: As of Aug 1, 2021, this will install Kubernetes v1.20.7 with Calico v3.19.0. This reflects recent changes that introduce two new namespaces: calico-system and tigera-operator.

Verify Azure CNI

You can print the IP addresses on the nodes and pods with the following:
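A sketch that produces the listing below (the original gist may differ):

# list node names with their internal IP addresses
printf 'Nodes:\n------------\n'
kubectl get nodes --output jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}'
# list pod names with their pod IP addresses
printf 'Pods:\n------------\n'
kubectl get pods --all-namespaces --output jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.podIP}{"\n"}{end}'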

This should show something like this:

Nodes:
------------
aks-nodepool1-56788426-vmss000000 10.240.0.4
aks-nodepool1-56788426-vmss000001 10.240.0.35
aks-nodepool1-56788426-vmss000002 10.240.0.66
Pods:
------------
calico-kube-controllers-7d7897d6b7-qlrh6 10.240.0.36
calico-node-fxg66 10.240.0.66
calico-node-j4hlq 10.240.0.35
calico-node-kwfjv 10.240.0.4
calico-typha-85c77f79bd-5ksvc 10.240.0.4
calico-typha-85c77f79bd-6cl7p 10.240.0.66
calico-typha-85c77f79bd-ppb8x 10.240.0.35
azure-ip-masq-agent-6np6q 10.240.0.66
azure-ip-masq-agent-dt2b7 10.240.0.4
azure-ip-masq-agent-pltj9 10.240.0.35
coredns-9d6c6c99b-5zl69 10.240.0.28
coredns-9d6c6c99b-jzs8w 10.240.0.85
coredns-autoscaler-599949fd86-qlwv4 10.240.0.75
kube-proxy-4tbs4 10.240.0.35
kube-proxy-9rxr9 10.240.0.66
kube-proxy-bjjq5 10.240.0.4
metrics-server-77c8679d7d-dnbbt 10.240.0.89
tunnelfront-589474564b-k8s88 10.240.0.67
tigera-operator-7b555dfbdd-ww8sn 10.240.0.4

The Istio service mesh

Kubernetes components

There are a few ways to install Istio: Helm charts, operators, or the istioctl command. For this article, we take the easy road: the istioctl command.

Istio Platform

source env.sh
istioctl install --set profile=demo -y
kubectl get all --namespace istio-system

This should show something like the following:

Deployment of Istio

Istio addons
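With the addon manifests saved under addons/ (as in the project tree above), they can be applied in one pass; a sketch:

source env.sh
kubectl apply --filename ./addons/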

NOTE: The first time kubectl apply is run, there will be errors, as the CRDs were not yet installed. Run kubectl apply again to apply the remaining manifests that depend on the CRDs.

After adding these components, you can see new resources with kubectl get all -n istio-system:

Deployment of Istio + Addons

The Dgraph service

Save the following as examples/dgraph/helmfile.yaml:
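The original manifest is embedded as a gist; a minimal sketch, assuming the public Dgraph chart and the release name demo (which matches the demo-dgraph-alpha service seen later):

repositories:
  # official Dgraph chart repository
  - name: dgraph
    url: https://charts.dgraph.io

releases:
  # deploy Dgraph into the dgraph namespace
  - name: demo
    namespace: dgraph
    chart: dgraph/dgraph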

NOTE: The dgraph namespace will need the required label istio-injection: enabled to signal Istio to inject Envoy proxy sidecars.

Both the namespace with the needed label and Dgraph can be installed and verified with these commands:

source env.sh
helmfile --file examples/dgraph/helmfile.yaml apply
kubectl --namespace dgraph get all

After about two minutes, this should show something like the following:

Deployment of Dgraph

The pydgraph client

  1. Build pydgraph-client image and push to ACR
  2. Deploy pydgraph-client in pydgraph-allow namespace. Istio will inject an Envoy proxy into the pod.
  3. Deploy pydgraph-client in pydgraph-deny namespace.

Fetch the build and deploy scripts

In the previous blog, I documented steps to build and release a pydgraph-client image, and then deploy a container using that image.

Below is a script you can use to download the gists and populate the files needed to run through these steps.

NOTE: These scripts and further details are covered in the previous article (see AKS with Azure Container Registry).

Build and Push

source env.sh
az acr login --name ${AZ_ACR_NAME}
pushd examples/pydgraph && make build && make push && popd

Deploy to pydgraph-deny namespace

helmfile \
  --namespace "pydgraph-deny" \
  --file examples/pydgraph/helmfile.yaml \
  apply

Afterward, you can check the results with kubectl get all -n pydgraph-deny:

Namespace: pydgraph-deny

Deploy to pydgraph-allow namespace
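This mirrors the pydgraph-deny deployment, assuming the same chart handles the istio-injection: enabled label on this namespace:

helmfile \
  --namespace "pydgraph-allow" \
  --file examples/pydgraph/helmfile.yaml \
  apply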

The final results in the pydgraph-allow namespace should look similar to the following:

Namespace: pydgraph-allow

This will add the Envoy proxy sidecar container:

Test 0 (Baseline): No Network Policy

No Network Policy

Log into pydgraph-deny

PYDGRAPH_DENY_POD=$(
  kubectl get pods --namespace "pydgraph-deny" --output name
)
kubectl exec -ti --namespace "pydgraph-deny" \
  ${PYDGRAPH_DENY_POD} \
  -- bash

HTTP check (no network policy)

curl ${DGRAPH_ALPHA_SERVER}:8080/health | jq

The expected result should be the health status of one of the Dgraph Alpha nodes:

/health (HTTP)

gRPC check (no network policy)

grpcurl -plaintext -proto api.proto \
${DGRAPH_ALPHA_SERVER}:9080 \
api.Dgraph/CheckVersion

The expected results will be the Dgraph server version.

api.Dgraph/CheckVersion (gRPC)

Test 1: Apply a network policy

After adding the policy, the expected results will be timeouts, as communication from the pydgraph-client in the pydgraph-deny namespace, which is not in the service mesh, will be blocked.

Network Policy added to block traffic outside the mesh

Adding a network policy

Dgraph Network Policy for Istio (made with https://editor.cilium.io)

Copy the following and save as examples/dgraph/network_policy.yaml:
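The original manifest is embedded as a gist; a sketch with the same effect, admitting only traffic from namespaces labeled for Istio sidecar injection (the policy name and selectors are assumptions):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: dgraph-allow-istio        # hypothetical name
  namespace: dgraph
spec:
  podSelector: {}                 # applies to all pods in the dgraph namespace
  ingress:
    - from:
        # only admit traffic from namespaces in the mesh
        - namespaceSelector:
            matchLabels:
              istio-injection: enabled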

When ready, apply this with the following command:

kubectl apply --filename ./examples/dgraph/network_policy.yaml

Log into pydgraph-deny

PYDGRAPH_DENY_POD=$(
  kubectl get pods --namespace "pydgraph-deny" --output name
)
kubectl exec -ti --namespace "pydgraph-deny" \
  ${PYDGRAPH_DENY_POD} \
  -- bash

HTTP check (network policy applied)

curl ${DGRAPH_ALPHA_SERVER}:8080/health

The expected result in this case, after a very long wait (about 5 minutes), will be something similar to this:

gRPC check (network policy applied)

grpcurl -plaintext -proto api.proto \
${DGRAPH_ALPHA_SERVER}:9080 \
api.Dgraph/CheckVersion

The expected result for gRPC, after about 10 seconds, will be:

Test 2: Test with Envoy proxy sidecar

Log into pydgraph-allow

PYDGRAPH_ALLOW_POD=$(
  kubectl get pods --namespace "pydgraph-allow" --output name
)
kubectl exec -ti --namespace "pydgraph-allow" \
  ${PYDGRAPH_ALLOW_POD} \
  -- bash

HTTP check (namespace label applied)

curl ${DGRAPH_ALPHA_SERVER}:8080/health | jq

The expected result is JSON data about the health of one of the Dgraph Alpha pods.

/health (HTTP)

gRPC check (namespace label applied)

grpcurl -plaintext -proto api.proto \
${DGRAPH_ALPHA_SERVER}:9080 \
api.Dgraph/CheckVersion

The expected result is JSON detailing the Dgraph server version.

api.Dgraph/CheckVersion (gRPC)

Test 3: Listening to traffic streams

Kiali dashboard

istioctl dashboard kiali

Once in the dashboard, click on Graph and select dgraph for the Namespace.

Generate Traffic

curl ${DGRAPH_ALPHA_SERVER}:8080/health
grpcurl -plaintext -proto api.proto \
  ${DGRAPH_ALPHA_SERVER}:9080 \
  api.Dgraph/CheckVersion
python3 load_data.py --plaintext \
  --alpha ${DGRAPH_ALPHA_SERVER}:9080 \
  --files ./sw.nquads.rdf \
  --schema ./sw.schema
curl "${DGRAPH_ALPHA_SERVER}:8080/query" --silent \
  --request POST \
  --header "Content-Type: application/dql" \
  --data $'{ me(func: has(starring)) { name } }'

Observe the resulting traffic

In the graph you can see the following content:

  • Kubernetes services are represented by triangle △ icon and Pod containers as the square ◻ icon.
  • Both gRPC and HTTP incoming traffic connect to the demo-dgraph-alpha service and then to the alpha container, which is labeled latest due to the lack of a version label.
  • The Dgraph Alpha service then communicates with the Dgraph Zero service, also labeled latest for the same reason.

Cleanup

az aks delete \
--resource-group $AZ_RESOURCE_GROUP \
--name $AZ_CLUSTER_NAME

Resources

Blog Source Code

Service Mesh

gRPC Load Balancing

Istio vs. Calico: Combining Network Policies with Istio

Istio vs AKS: Installing Istio on AKS

Documentation

Articles

Example Application

Conclusion

There are a few things I would like to explore as next steps, around managing external traffic and further securing traffic within the mesh.

For traffic access, this means restricting traffic within the mesh using AuthorizationPolicy, and exploring an added layer of authentication, so that a service must authenticate to access a component.

External Traffic

For a public-facing service, you would want to use a friendly DNS name like https://dgraph.example.com, as this is easier to remember than something like https://20.69.65.109. This can be automated with the Kubernetes addons external-dns and cert-manager. Through these two addons, you can automate DNS record updates and the issuing of X.509 certificates from a trusted certificate authority.

So how can you integrate these addons with Istio?

You can integrate these addons using either the native Gateway and VirtualService resources or an Ingress resource.

For the Ingress, you can select the Istio ingress implementation by setting the annotation kubernetes.io/ingress.class: istio. I wrote an earlier article, AKS with Cert Manager, that demonstrates how to use ingress-nginx with both external-dns using Azure DNS and cert-manager using Let's Encrypt. The process is identical, except the annotation selects istio instead of nginx.
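For example, a hypothetical Ingress selecting Istio and routing to the Dgraph Alpha HTTP port might look like this:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: dgraph                    # hypothetical name
  namespace: dgraph
  annotations:
    kubernetes.io/ingress.class: istio
spec:
  rules:
    - host: dgraph.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: demo-dgraph-alpha
                port:
                  number: 8080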

For Gateway and VirtualService resources, external-dns can scan these sources directly. With cert-manager, you would configure a Certificate resource, and then reference the secret it creates from the Gateway resource.
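For example, a hypothetical Gateway that terminates TLS using a secret issued by a cert-manager Certificate might look like this:

apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: dgraph-gateway            # hypothetical name
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway         # Istio's default ingress gateway
  servers:
    - hosts:
        - dgraph.example.com
      port:
        name: https
        number: 443
        protocol: HTTPS
      tls:
        mode: SIMPLE
        credentialName: dgraph-example-com-tls  # secret created by cert-manager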
