AKS with gRPC and ingress-nginx

Using gRPC with ingress-nginx add-on with AKS

Joaquín Menchaca (智裕)
12 min read · Jul 4, 2021


Updated 2021-08-30: moved multi-line code to gists, as it was hard to copy from Medium.

In previous articles in this series, I covered how to publish endpoints for applications deployed on Kubernetes, ultimately detailing how to use ingress resources with automation support for TLS certificates and updating DNS records. This article picks up from there and shows how to support gRPC, a popular protocol for efficient web APIs.

Web interfaces have traditionally been built with RESTful APIs, which ultimately expose a CRUD (Create, Read, Update, Delete) interface over the HTTP/1.1 protocol. Using such web interfaces requires converting data structures to and from JSON, a process called serialization and deserialization. Shuttling data around as JSON text is not particularly efficient.

gRPC (gRPC Remote Procedure Calls) is an alternative that provides significant performance gains for web interfaces in at least two regards: the more performant HTTP/2 protocol and serialization using the binary protobuf format.

This article demonstrates how to configure an ingress resource for gRPC with ingress-nginx.

Articles in this Series

  1. AKS with external-dns: service with LoadBalancer type
  2. AKS with ingress-nginx: ingress (HTTP)
  3. AKS with cert-manager: ingress (HTTPS)
  4. AKS with gRPC and ingress-nginx: ingress (gRPC and HTTPS)

Previous Article

The previous article showed how to automate the creation of TLS certificates with cert-manager.

Requirements

These are some logistical and tool requirements for this article:

Registered domain name

Nginx requires encryption with TLS certificates so that it can route traffic for both gRPC (HTTP/2) and HTTPS (HTTP/1.1). A TLS certificate issued by a trusted CA requires that you own a public domain name, which can be purchased from a provider for about $2 to $20 per year.

The fictional domain example.com will be used throughout. Depending on the example, there would be, for instance, hello.example.com, ratel.example.com, and alpha.example.com.

Required tools

These tools are required for this article:

  • Azure CLI tool (az): command line tool that interacts with Azure API
  • Kubernetes client tool (kubectl): command line tool that interacts with Kubernetes API
  • Helm (helm): command line tool for “templating and sharing Kubernetes manifests” that are bundled as Helm chart packages.
  • helm-diff plugin: allows you to see the changes made with helm or helmfile before applying the changes.
  • Helmfile (helmfile): command line tool that uses a “declarative specification for deploying Helm charts across many environments”.

Optional tools

I highly recommend these tools:

  • POSIX shell (sh) such as GNU Bash (bash) or Zsh (zsh): the scripts in this guide were tested with both of these shells on macOS and Ubuntu Linux.
  • curl (curl): tool to interact with web services from the command line.
  • grpcurl (grpcurl): a tool to interact with gRPC services from the command line.
  • jq (jq): a JSON processor tool that can transform and extract objects from JSON, as well as provide colorized JSON output for greater readability.
  • Python (python) and pip (pip): needed to run the Dgraph Python script for gRPC interaction. I recommend using pyenv and pyenv-virtualenv to manage the requirements.

Project setup

Below is a file structure that will be used for this article to keep things consistent and referenceable.

Project file structure

The following structure will be used:

~/azure_ingress_nginx_grpc/
├── env.sh
├── examples
│ ├── dgraph
│ │ ├── data
│ │ │ ├── getting_started.py
│ │ │ ├── sw.nquads.rdf
│ │ │ └── sw.schema
│ │ └── helmfile.yaml
│ └── yages
│ └── helmfile.yaml
├── helmfile.yaml
└── issuers.yaml

With either Bash or Zsh, you can create the file structure with the following commands:
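As a minimal sketch, the layout above can be created like this (brace expansion works in both shells):

mkdir -p ~/azure_ingress_nginx_grpc/examples/{dgraph/data,yages}
cd ~/azure_ingress_nginx_grpc

touch env.sh helmfile.yaml issuers.yaml \
  examples/yages/helmfile.yaml \
  examples/dgraph/helmfile.yaml \
  examples/dgraph/data/{getting_started.py,sw.nquads.rdf,sw.schema}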

The instructions from this point on assume that you are in the ~/azure_ingress_nginx_grpc directory, so when in doubt:

cd ~/azure_ingress_nginx_grpc

Project environment variables

Set up the environment variables below to keep things consistent amongst a variety of tools: helm, helmfile, kubectl, jq, az, curl, grpcurl.

If you are using a POSIX shell, you can save these into a script and source that script whenever needed. Copy this source script and save as env.sh:
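As a rough sketch, based on the variables used throughout this article, env.sh could look something like the following. The domain, resource group, cluster name, region, and e-mail are placeholder values that you should change to your own.

#!/usr/bin/env bash
# placeholder values -- change these to your own
export AZ_DNS_DOMAIN="example.com"              # your registered domain
export AZ_RESOURCE_GROUP="ingress-grpc-demo"    # hypothetical resource group name
export AZ_CLUSTER_NAME="ingress-grpc-aks"       # hypothetical AKS cluster name
export AZ_LOCATION="westus2"                    # any Azure region
export ACME_ISSUER="letsencrypt-prod"           # or letsencrypt-staging
export ACME_ISSUER_EMAIL="your.name@example.com"

# values fetched from the current Azure CLI session
export AZ_TENANT_ID=$(az account show --query tenantId --output tsv)
export AZ_SUBSCRIPTION_ID=$(az account show --query id --output tsv)

# kubeconfig fetched later with az aks get-credentials
export KUBECONFIG=~/.kube/$AZ_CLUSTER_NAME.yaml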

After composing this script, source it so that the variables can be used in the rest of the steps.

source env.sh

Azure components

Similar to previous articles, you will create both AKS and Azure DNS zone cloud resources, and then authorize the VMSS worker nodes to access Azure DNS, so that any pod running on AKS can make the required DNS updates.

For simplicity, you can use the following steps below to create the required resources. If you use your own scripts, such as Terraform, be sure to enable MSI or managed identities with the AKS cluster.

Cloud resources

You can create AKS and Azure DNS cloud resources with the following commands:
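As a sketch using the Azure CLI (the node count and default VM size are my assumptions), provisioning could look like this:

source env.sh

# resource group that will hold the AKS cluster and DNS zone
az group create --name $AZ_RESOURCE_GROUP --location $AZ_LOCATION

# Azure DNS zone for the domain (or sub-domain) you manage
az network dns zone create \
  --resource-group $AZ_RESOURCE_GROUP \
  --name $AZ_DNS_DOMAIN

# AKS cluster with managed identity (MSI) enabled
az aks create \
  --resource-group $AZ_RESOURCE_GROUP \
  --name $AZ_CLUSTER_NAME \
  --node-count 3 \
  --enable-managed-identity

# fetch the KUBECONFIG for the new cluster
az aks get-credentials \
  --resource-group $AZ_RESOURCE_GROUP \
  --name $AZ_CLUSTER_NAME \
  --file $KUBECONFIG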

You will need to transfer domain management to Azure DNS for a root domain like example.com, or, if you are using a sub-domain like dev.example.com, you'll need to update the NS records to point to the Azure DNS name servers. This process, as well as how to provision the equivalent with Terraform, is fully detailed in the Azure Linux VM with DNS article.

For a more robust script for provisioning Azure Kubernetes Service, see the Azure Kubernetes Service: Provision an AKS Kubernetes Cluster with Azure CLI article.

Verify AKS and KUBECONFIG

Verify that the AKS cluster was created and that you have a KUBECONFIG that is authorized to access the cluster by running the following:

source env.sh # fetch KUBECONFIG
kubectl get all --all-namespaces

The final results should look something like this:

Provisioned AKS (kubenet network plugin)

NOTE: Regarding security around AKS access, this setup with KUBECONFIG grants full unfettered access to anyone that has the configuration. This may be fine for single-user development environments, but in a shared environment, you will want to secure this further, such as by using Azure Active Directory RBAC to secure access to AKS.

Authorizing access to Azure DNS

Both external-dns and cert-manager will need access to the Azure DNS zone. This can be done by associating a role that grants access to the Azure DNS zone with the Managed Identity installed on the VMSS node pool workers.

Once completed, this allows all pods running on the AKS cluster to update records on the Azure DNS zone.

NOTE: In regards to security, allowing any pod to change DNS records could be dangerous, especially if any of the records were used for production infrastructure. Currently in preview, Azure has an upcoming feature called Azure Active Directory Pod Identities that can secure access at the pod level instead of the node level. This can be used to allow access for only the cert-manager and external-dns services.

You can grant access to the Azure DNS zone using the following commands:
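One way to do this with the Azure CLI, assuming the built-in DNS Zone Contributor role is assigned to the kubelet's managed identity, is sketched below:

source env.sh

# object id of the managed identity used by the AKS node pool (kubelet)
PRINCIPAL_ID=$(az aks show \
  --resource-group $AZ_RESOURCE_GROUP \
  --name $AZ_CLUSTER_NAME \
  --query "identityProfile.kubeletidentity.objectId" \
  --output tsv)

# resource id of the Azure DNS zone
DNS_ZONE_ID=$(az network dns zone show \
  --resource-group $AZ_RESOURCE_GROUP \
  --name $AZ_DNS_DOMAIN \
  --query id \
  --output tsv)

# allow that identity to manage records in the zone
az role assignment create \
  --assignee $PRINCIPAL_ID \
  --role "DNS Zone Contributor" \
  --scope $DNS_ZONE_ID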

About Managed Identities on AKS

A Managed Identity is a wrapper around service principals to make management simpler. Essentially, they are mapped to an Azure resource, so that when the Azure resource no longer exists, the associated service principal will be removed.

The service principal's credential is presented to Azure Active Directory, which, if authorized based on the role assignment, will grant a token. This token is then used to access the Azure DNS service, explicitly for the zone we specified, e.g. example.com.

Managed Identity authorized to access Azure DNS

Kubernetes components

The Kubernetes add-ons can be installed with the script below.

Install the addons

Copy this script below and save as helmfile.yaml:

Once ready, simply run:

############
# fetch environment variables
# AZ_TENANT_ID, AZ_SUBSCRIPTION_ID, AZ_DNS_DOMAIN
############################################
source env.sh

############
# deploy k8s add-ons
############################################
helmfile apply

Install ClusterIssuers

Copy the following and save as issuers.yaml:

It will take a few seconds before the cert-manager pods are ready and online. When they are ready, run this:

############ 
# fetch environment variables
# ACME_ISSUER_EMAIL, AZ_DNS_DOMAIN, AZ_SUBSCRIPTION_ID
############################################
source env.sh
############
# deploy cert-manager ACME issuers
############################################
helmfile --file issuers.yaml apply

Verify Addons and Cluster Issuer resources

Once this process is complete, you can view the results with:

kubectl get all,clusterissuer --namespace kube-addons

This should look something like the following:

Kube addon deploy (cert-manager, external-dns, ingress-nginx)

gRPC ingress example: YAGES

This example uses a small service called Yet another gRPC echo server (YAGES).

Copy the following below and save it as examples/yages/helmfile.yaml:

Deploy YAGES

This has three embedded Kubernetes resources: a service, an ingress, and a deployment for the YAGES application. You can deploy these into the yages namespace, using the letsencrypt-prod issuer, with the following:

############
# fetch environment variables
# ACME_ISSUER, AZ_DNS_DOMAIN
############################################
source env.sh

############
# deploy yages
############################################
helmfile -f examples/yages/helmfile.yaml apply

Verify YAGES

After about 3 minutes, once the certificate is ready, you can run the following to verify that the resources are deployed and ready:

kubectl get all,ing,certificate --namespace yages

The results should look something like this:

yages deployment

Testing gRPC with YAGES

Make sure that the certificate Ready state is true.

kubectl get certificate --namespace yages --watch

Once ready, we can test the service with gRPC using the following commands:
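As a sketch, assuming the ingress host for YAGES is yages.example.com (adjust the hostname to whatever you set in the helmfile) and that the service exposes gRPC server reflection, the test looks like this:

source env.sh

# list the services exposed through gRPC server reflection
grpcurl yages.$AZ_DNS_DOMAIN:443 list

# call the echo method; the expected reply is {"text": "pong"}
grpcurl yages.$AZ_DNS_DOMAIN:443 yages.Echo/Ping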

NOTE: This assumes that ACME_ISSUER is set to letsencrypt-prod. If it is set to letsencrypt-staging, the certificates are issued by a CA that is not publicly trusted, so add the --insecure argument to grpcurl.

gRPC ingress example: Dgraph

Dgraph is a distributed graph database and has a helm chart that can be used to install Dgraph into a Kubernetes cluster. You can use either helmfile or helm methods to install Dgraph.

Dgraph and Ratel infrastructure

This example will deploy three endpoints using two ingress resources, one for gRPC traffic and one for HTTPS traffic. Both reference the same certificate, so behind the scenes it will only be created once for both ingress resources. Ultimately, the rules from all ingresses are added to the same configuration (nginx.conf) in the ingress-controller pods.

This will allow the following access (swapping out example.com for the domain name you are using):

  • dgraph.example.com: backend graph database through gRPC (port 9080)
  • alpha.example.com: backend graph database through HTTPS (port 8080)
  • ratel.example.com: the graphical user interface client (React SPA application) (port 80)

Securing Dgraph (optional)

In a production scenario, public endpoints should be secured, especially for a backend database, but in order to keep things simple for this article, the endpoint will not be secured. Normally, this would be done with a network security group or, even better, by only exposing the endpoints internally and then using a jump host or VPN to access the backend service.

Some level of security can be added on the Dgraph Alpha service itself by adding an allow list (also called a whitelist):
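For example, you can restrict access to your own public IP address. This is just a sketch, and ifconfig.me is only one of several services that echo back your IP:

# allow only your current public IP address (as a /32 CIDR) to reach Dgraph Alpha
export DG_ALLOW_LIST="$(curl -s ifconfig.me)/32"
echo $DG_ALLOW_LIST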

The DG_ALLOW_LIST value will be used later during deployment. If it is not set, the default allows the whole public Internet to access the database server.

Dgraph Deploy Code

Copy the file below and save as examples/dgraph/helmfile.yaml:

Deploy Dgraph Services and Ingresses

This will deploy the Dgraph helm chart, which creates several resources: the Dgraph Ratel UI deployment and service; the Dgraph Alpha and Dgraph Zero statefulsets, persistent volume claims, services, and headless services; and a configmap. It also includes some custom manifests that deploy two ingress resources: one for gRPC traffic and one for HTTPS traffic.

This can be deployed with:

############
# fetch environment variables
# ACME_ISSUER, AZ_DNS_DOMAIN
############################################
source env.sh

############
# deploy dgraph
############################################
helmfile -f examples/dgraph/helmfile.yaml apply

Verify deployed Dgraph resources

After a few minutes, you can verify the resources with the following command:

kubectl get all,ing,certificate --namespace dgraph

This should look something like the following:

Dgraph + ingress + certificate deployment

The certificate may take about a minute before it is ready. You can monitor it with:

kubectl get certificate --namespace dgraph --watch

The pods may also take about a minute before they are in a ready state. This can be monitored with:

kubectl get pods --namespace dgraph --watch

Verify Basic gRPC

To get off the ground and determine whether gRPC works over h2 (HTTP/2 over TLS), the grpcurl tool can be used to test this:
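Dgraph does not (as far as I know) expose gRPC server reflection, so grpcurl needs the api.proto definition. As a sketch, assuming the dgraph.example.com hostname from the ingress above and fetching api.proto from the pydgraph repository (the path is my assumption):

source env.sh

# fetch the Dgraph gRPC API definition (path assumed from the pydgraph repo)
curl -sOL https://raw.githubusercontent.com/dgraph-io/pydgraph/master/pydgraph/proto/api.proto

# call the CheckVersion RPC over TLS (port 443 on the ingress)
grpcurl -proto api.proto \
  dgraph.$AZ_DNS_DOMAIN:443 \
  api.Dgraph/CheckVersion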

The response should be something like:

{
"tag": "v21.03.0"
}

Upload Data and Schema

There are some scripts adapted from the tutorial at https://dgraph.io/docs/get-started/ that you can download.

First, save the following as examples/dgraph/data/getting_started.py:

The above script is based on simple.py and tls_example.py, and it works only with certificates issued by publicly trusted CAs. It can easily be adapted for insecure (no TLS) traffic, or for private certificates, including mutual TLS where a client certificate is required. For further information, check out the code repository:

The above Python script has a few module requirements, which you can install with pip from your Python 3 environment:

pip install pydgraph
pip install certifi

The next step is to download the schema and data files, and then run the script:

This will upload the schema and data files, so that we can run queries.

Connect to Ratel UI

After a few moments, you can check the results at https://ratel.example.com (substituting your domain for example.com).

In the Dgraph Server Connection dialog, configure the domain, e.g. https://alpha.example.com (substituting your domain for example.com).

Test Using the Ratel UI

In the Ratel UI, paste the following query and click run:

You should see something like this:

Cleanup the Project

You can clean up the resources that incur costs with the following:

Remove the Azure Resources

This will remove the Azure resources:

############
# Delete the AKS cluster
############################################
az aks delete \
  --resource-group $AZ_RESOURCE_GROUP \
  --name $AZ_CLUSTER_NAME

############
# Delete the Azure DNS Zone
############################################
az network dns zone delete \
  --resource-group $AZ_RESOURCE_GROUP \
  --name $AZ_DNS_DOMAIN

Resources

Here are some articles and links on related topics that I came across while writing this article.

Blog Source Code

Azure Resources

Kubernetes Addons

Services used in ingress examples

Ingress-Nginx and GRPC

Protobuffers and gRPC

Python (gRPC and TLS)

In doing this guide, I needed to figure out how to communicate securely with gRPC and TLS in Python. Most documentation covers only self-signed certificates or certificates issued by a non-trusted private CA. As we are using certificates from a publicly trusted CA, through Let's Encrypt via cert-manager, it was hard to find material on this. Below are some links that may be useful, should you find yourself on the same journey.

Conclusion

This article was a challenge in two pieces: getting gRPC to work alongside existing HTTPS services that may still be needed, and testing gRPC itself, where the requirements go beyond a simple curl command. The underlying protocol is different: it requires serialization through protobuf, and HTTP/2, unlike HTTP/1.1, is not stateless, so load balancers have to balance HTTP/2 sessions rather than individual transactions.

The Python script (or one in any other language, for that matter) was a little challenging, as gRPC services will almost never be available publicly, at least not without some form of authentication. Typically, for internal private communication, such as microservice-to-microservice or microservice-to-graph-database, private certificates are perfectly fine.

So I think this concludes this series of articles. I could perhaps document how to further lock down Azure or these services, but more likely I will try out some other options for gRPC with different ingress controllers, like Contour, Ambassador, Gloo, or Traefik, or jump into service meshes like Linkerd or Istio.


Joaquín Menchaca (智裕)

DevOps/SRE/PlatformEng — k8s, o11y, vault, terraform, ansible