AKS with gRPC and ingress-nginx
Using gRPC with ingress-nginx add-on with AKS
Updated 2021-08-30: moved multi-line code into gists, as it was hard to copy from Medium.
In previous articles in this series, I covered how to publish endpoints for applications deployed on Kubernetes, ultimately detailing how to use ingress resources with automation support for TLS certificates and DNS record updates. This article picks up from there and shows how to support gRPC, a popular protocol for efficient web APIs.
Web interfaces have traditionally been RESTful APIs, ultimately a CRUD (Create, Read, Update, Delete) interface over the HTTP/1.1 protocol. Using such web interfaces requires converting data structures to and from JSON, a process called serialization and deserialization. Doing this with a text-based format like JSON is not particularly efficient.
gRPC (gRPC Remote Procedure Calls) is an alternative that provides significant performance gains for web interfaces in at least two regards: the more performant HTTP/2 protocol and serialization using the binary protobuf format.
This article demonstrates how to configure an ingress resource for gRPC with ingress-nginx. The following components will be created:
- Azure resources: AKS, Azure DNS zone
- Kubernetes add-ons: cert-manager, external-dns, ingress-nginx
- Example applications: Yages (Yet Another gRPC Echo Server), Dgraph
Articles in this Series
- AKS with external-dns: service with LoadBalancer type
- AKS with ingress-nginx: ingress (HTTP)
- AKS with cert-manager: ingress (HTTPS)
- AKS with gRPC and ingress-nginx: ingress (gRPC and HTTPS)
Previous Article
The previous article showed how to automate the creation of TLS certificates with cert-manager.
Requirements
These are some logistical and tool requirements for this article:
Registered domain name
Nginx requires encryption with TLS certificates so that it can route traffic between gRPC (HTTP/2) and HTTPS (HTTP/1.1). A TLS certificate issued by a trusted CA requires that you own a public domain name, which can be purchased from a provider for about $2 to $20 per year.
A fictional domain of example.com will be used as an example. Thus, depending on the examples used, there would be, for example, hello.example.com, ratel.example.com, and alpha.example.com.
Required tools
These tools are required for this article:
- Azure CLI tool (az): command line tool that interacts with the Azure API
- Kubernetes client tool (kubectl): command line tool that interacts with the Kubernetes API
- Helm (helm): command line tool for “templating and sharing Kubernetes manifests” that are bundled as Helm chart packages
- helm-diff plugin: allows you to see the changes made with helm or helmfile before applying the changes
- Helmfile (helmfile): command line tool that uses a “declarative specification for deploying Helm charts across many environments”
Optional tools
I highly recommend these tools:
- POSIX shell (sh) such as GNU Bash (bash) or Zsh (zsh): the scripts in this guide were tested using either of these shells on macOS and Ubuntu Linux
- curl (curl): tool to interact with web services from the command line
- grpcurl (grpcurl): tool to interact with gRPC services from the command line
- jq (jq): a JSON processor tool that can transform and extract objects from JSON, as well as provide colorized JSON output for greater readability
- Python (python) and pip (pip): needed to use the Dgraph Python script for gRPC interaction. I recommend using pyenv and pyenv-virtualenv to manage requirements.
Project setup
Below is a file structure that will be used for this article to keep things consistent and referenceable.
Project file structure
The following structure will be used:
~/azure_ingress_nginx_grpc/
├── env.sh
├── examples
│ ├── dgraph
│ │ ├── data
│ │ │ ├── getting_started.py
│ │ │ ├── sw.nquads.rdf
│ │ │ └── sw.schema
│ │ └── helmfile.yaml
│ └── yages
│ └── helmfile.yaml
├── helmfile.yaml
└── issuers.yaml
With either Bash or Zsh, you can create the file structure with the following commands:
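The original commands live in a gist; a minimal sketch that produces the same layout with Bash or Zsh (brace expansion is used, so plain POSIX sh may need the paths spelled out) is:

############
# create the project directories and empty files
############################################
mkdir -p ~/azure_ingress_nginx_grpc/examples/{dgraph/data,yages}
cd ~/azure_ingress_nginx_grpc
touch env.sh helmfile.yaml issuers.yaml \
  examples/yages/helmfile.yaml \
  examples/dgraph/helmfile.yaml \
  examples/dgraph/data/{getting_started.py,sw.nquads.rdf,sw.schema}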
The instructions from this point on assume that you are in the $HOME/azure_ingress_nginx_grpc directory, so when in doubt:
cd ~/azure_ingress_nginx_grpc
Project environment variables
Set up the environment variables below to keep things consistent amongst a variety of tools: helm, helmfile, kubectl, jq, az, curl, grpcurl.
If you are using a POSIX shell, you can save these into a script and source that script whenever needed. Copy this source script and save as env.sh:
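The full env.sh lives in a gist; a minimal sketch of the variables used throughout this article is below. The resource names, region, and the AZ_LOCATION variable are assumptions, so adjust them for your environment:

#!/usr/bin/env bash
############
# Azure resource names (example values, change as needed)
############################################
export AZ_RESOURCE_GROUP="aks-grpc-demo"
export AZ_LOCATION="westus2"           # hypothetical variable for the Azure region
export AZ_CLUSTER_NAME="aks-grpc-demo"
export AZ_DNS_DOMAIN="example.com"     # replace with your registered domain

############
# account identifiers used by external-dns and cert-manager
############################################
export AZ_TENANT_ID=$(az account show --query tenantId --output tsv)
export AZ_SUBSCRIPTION_ID=$(az account show --query id --output tsv)

############
# cert-manager ACME (Let's Encrypt) settings
############################################
export ACME_ISSUER_EMAIL="you@example.com"
export ACME_ISSUER="letsencrypt-prod"  # or letsencrypt-staging

############
# project-local kubeconfig
############################################
export KUBECONFIG=$HOME/azure_ingress_nginx_grpc/kube_config.yaml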
After composing this script, source it so that the variables can be used in the rest of the steps.
source env.sh
Azure components
Similar to previous articles, you will create both AKS and Azure DNS zone cloud resources, and then authorize the VMSS worker nodes to access Azure DNS, so that any pod running on AKS can make the required DNS updates.
For simplicity, you can use the steps below to create the required resources. If you use your own automation, such as Terraform, be sure to enable MSI (managed identities) on the AKS cluster.
Cloud resources
You can create AKS and Azure DNS cloud resources with the following commands:
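The exact commands are kept in a gist; a minimal Azure CLI sketch, assuming the variables from env.sh (including the hypothetical AZ_LOCATION), looks like this:

source env.sh

############
# create resource group and Azure DNS zone
############################################
az group create --name $AZ_RESOURCE_GROUP --location $AZ_LOCATION
az network dns zone create \
  --resource-group $AZ_RESOURCE_GROUP \
  --name $AZ_DNS_DOMAIN

############
# create AKS cluster with managed identity (MSI) enabled
############################################
az aks create \
  --resource-group $AZ_RESOURCE_GROUP \
  --name $AZ_CLUSTER_NAME \
  --enable-managed-identity \
  --node-count 3

############
# fetch credentials into the KUBECONFIG set in env.sh
############################################
az aks get-credentials \
  --resource-group $AZ_RESOURCE_GROUP \
  --name $AZ_CLUSTER_NAME \
  --file $KUBECONFIG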
You will need to transfer domain management to Azure DNS for a root domain like example.com, or, if you are using a sub-domain like dev.example.com, update the DNS name server (NS) records to point to the Azure DNS name servers. This process is fully detailed, along with how to provision the equivalent with Terraform, in the Azure Linux VM with DNS article.
For a more robust script on provisioning Azure Kubernetes Service, see the Azure Kubernetes Service: Provision an AKS Kubernetes Cluster with Azure CLI article.
Verify AKS and KUBECONFIG
Verify that the AKS cluster was created and that you have a KUBECONFIG that is authorized to access the cluster by running the following:
source env.sh # fetch KUBECONFIG
kubectl get all --all-namespaces
The final results should look something like this:
NOTE: Regarding security around AKS access, this setup with KUBECONFIG grants full unfettered access to anyone that has the configuration. This may be fine for single-user development environments, but in a shared environment you will want to secure this further, such as by using Azure Active Directory RBAC to secure access to AKS.
Authorizing access to Azure DNS
Both external-dns and cert-manager will need access to the Azure DNS zone. This can be done by assigning a role that grants access to the Azure DNS zone to the Managed Identity installed on the VMSS node pool workers.
Once completed, this allows all pods running on the AKS cluster to update records on the Azure DNS zone.
NOTE: In regards to security, allowing any pod to change DNS records could be dangerous, especially if any of the records were used for production infrastructure. Azure has an upcoming feature, currently in preview, called Azure Active Directory Pod Identities that can secure access at the pod level instead of the node level. This can be used to allow access for only the cert-manager and external-dns services.
You can grant access to the Azure DNS zone using the following commands:
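The original commands are in a gist; a hedged sketch of the role assignment, using the hypothetical helper variables AZ_PRINCIPAL_ID and AZ_DNS_SCOPE, would be:

source env.sh

############
# object ID of the kubelet (VMSS node pool) managed identity
############################################
export AZ_PRINCIPAL_ID=$(az aks show \
  --resource-group $AZ_RESOURCE_GROUP \
  --name $AZ_CLUSTER_NAME \
  --query "identityProfile.kubeletidentity.objectId" --output tsv)

############
# resource ID of the Azure DNS zone (scope of the role assignment)
############################################
export AZ_DNS_SCOPE=$(az network dns zone show \
  --resource-group $AZ_RESOURCE_GROUP \
  --name $AZ_DNS_DOMAIN \
  --query id --output tsv)

############
# grant the node pool identity rights to manage records in the zone
############################################
az role assignment create \
  --assignee $AZ_PRINCIPAL_ID \
  --role "DNS Zone Contributor" \
  --scope $AZ_DNS_SCOPE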
About Managed Identities on AKS
A Managed Identity is a wrapper around service principals that makes management simpler. Essentially, they are mapped to an Azure resource, so that when the Azure resource no longer exists, the associated service principal is removed.
A service principal is sent to Azure Active Directory, which, if authorized based on the role assignment, will grant a ticket or token. This token is then used to access the Azure DNS service, explicitly for the zone we specified, e.g. example.com.
Kubernetes components
The Kubernetes add-ons can be installed with the script below.
Install the addons
Copy this script below and save as helmfile.yaml:
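The helmfile.yaml itself lives in the linked gist. As a rough point of reference only, an approximately equivalent installation with plain helm is sketched below; the chart names and value keys follow each chart's public documentation and are not necessarily the article's exact configuration:

source env.sh

############
# chart repositories
############################################
helm repo add jetstack https://charts.jetstack.io
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update

############
# cert-manager (CRDs installed by the chart)
############################################
helm install cert-manager jetstack/cert-manager \
  --namespace kube-addons --create-namespace \
  --set installCRDs=true

############
# ingress-nginx controller
############################################
helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace kube-addons

############
# external-dns for Azure DNS, using the node pool managed identity
############################################
helm install external-dns bitnami/external-dns \
  --namespace kube-addons \
  --set provider=azure \
  --set azure.resourceGroup=$AZ_RESOURCE_GROUP \
  --set azure.tenantId=$AZ_TENANT_ID \
  --set azure.subscriptionId=$AZ_SUBSCRIPTION_ID \
  --set azure.useManagedIdentityExtension=true \
  --set "domainFilters[0]=$AZ_DNS_DOMAIN" \
  --set txtOwnerId=$AZ_CLUSTER_NAME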
Once ready, simply run:
############
# fetch environment variables
# AZ_TENANT_ID, AZ_SUBSCRIPTION_ID, AZ_DNS_DOMAIN
############################################
source env.sh

############
# deploy k8s add-ons
############################################
helmfile apply
Install ClusterIssuers
Copy the following and save as issuers.yaml:
It will take a few seconds before the cert-manager pods are ready and online. When ready, run this:
############
# fetch environment variables
# ACME_ISSUER_EMAIL, AZ_DNS_DOMAIN, AZ_SUBSCRIPTION_ID
############################################
source env.sh

############
# deploy cert-manager ACME issuers
############################################
helmfile --file issuers.yaml apply
Verify Addons and Cluster Issuer resources
Once this process is complete, you can view the results with:
kubectl get all,clusterissuer --namespace kube-addons
This should look something like the following:
GRPC ingress example: Yages
This example uses a small service called Yet another gRPC echo server (YAGES).
Copy the following below and save it as examples/yages/helmfile.yaml:
Deploy YAGES
This has three embedded Kubernetes resources for the YAGES application: service, ingress, and deployment. You can deploy these into the yages namespace using the letsencrypt-prod issuer with the following:
############
# fetch environment variables
# ACME_ISSUER, AZ_DNS_DOMAIN
############################################
source env.sh

############
# deploy yages
############################################
helmfile -f examples/yages/helmfile.yaml apply
Verify YAGES
After waiting about 3 minutes for the certificate to be ready, you can run the following to verify that the resources are deployed and ready:
kubectl get all,ing,certificate --namespace yages
The results should look something like this:
Testing gRPC with YAGES
Make sure that the certificate Ready state is true.
kubectl get certificate --namespace yages --watch
Once ready, we can test the service with gRPC using the following commands:
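The exact commands live in a gist; a sketch using grpcurl is shown below, assuming the YAGES ingress host is hello.$AZ_DNS_DOMAIN (check examples/yages/helmfile.yaml for the actual hostname):

source env.sh

############
# list services via gRPC reflection, then call the echo service
# (hostname hello.$AZ_DNS_DOMAIN is an assumption)
############################################
grpcurl hello.$AZ_DNS_DOMAIN:443 list
grpcurl hello.$AZ_DNS_DOMAIN:443 yages.Echo/Ping

The Ping call should return a small JSON pong message.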
NOTE: This assumes that ACME_ISSUER is set to letsencrypt-prod. If it is set to letsencrypt-staging, whose certificates are not issued by a publicly trusted CA, use the grpcurl --insecure argument.
GRPC ingress example: Dgraph
Dgraph is a distributed graph database and has a helm chart that can be used to install Dgraph into a Kubernetes cluster. You can use either helmfile or helm methods to install Dgraph.
This example will deploy three endpoints using two ingress resources: one for gRPC traffic and one for HTTPS traffic. Both use the same certificate, so in the background it will only be created once for both ingress resources. Ultimately, the rules from all ingresses will be added to the same configuration (nginx.conf) in the ingress-controller pods.
This will allow the following access (swapping out example.com for the domain name you are using):
- dgraph.example.com: backend graph database through gRPC (port 9080)
- alpha.example.com: backend graph database through HTTPS (port 8080)
- ratel.example.com: the graphical user interface client (React SPA application) (port 80)
Securing Dgraph (optional)
In a production scenario, public endpoints should be secured, especially for a backend database, but in order to keep things simple for this article, the endpoint will not be secured. Normally, this would be done with a network security group or, even better, by only exposing the endpoints internally and then using a jump host or VPN to access the backend service.
Some level of security can be added on the Dgraph Alpha service itself by adding an allow list (also called a whitelist):
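The original snippet is in a gist; one hedged way to build the value is to allow only your current public IP (the article's exact composition may differ, for example by also including the AKS egress IP):

############
# allow only your current public IP; 0.0.0.0/0 would open the
# database to the whole Internet
############################################
export DG_ALLOW_LIST="$(curl -s ifconfig.me)/32"
echo $DG_ALLOW_LIST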
This value DG_ALLOW_LIST will be used later during deployment. If it is not set, it will default to allowing the whole public Internet to access the database server.
Dgraph Deploy Code
Copy the file below and save as examples/dgraph/helmfile.yaml:
Deploy Dgraph Services and Ingresses
This will deploy the Dgraph helm chart with several resources, including the Dgraph Ratel UI deployment and service; Dgraph Alpha and Dgraph Zero statefulsets, pvc, service, and headless service; and a configmap, as well as some custom manifests that deploy two ingress resources: one for gRPC traffic and one for HTTP traffic.
This can be deployed with:
############
# fetch environment variables
# ACME_ISSUER, AZ_DNS_DOMAIN
############################################
source env.sh

############
# deploy dgraph
############################################
helmfile -f examples/dgraph/helmfile.yaml apply
Verify deployed Dgraph resources
After a few minutes, you can verify the resources with the following command:
kubectl get all,ing,certificate --namespace dgraph
This should look something like the following:
The certificate may take about a minute before it is ready. You can monitor it with:
kubectl get certificate --namespace dgraph --watch
Also, the pods may take about a minute before they are in a ready state. This can be monitored with:
kubectl get pods --namespace dgraph --watch
Verify Basic gRPC
To get off the ground and determine whether gRPC over h2 (HTTP/2 over TLS) is working, the grpcurl tool can be used to test this:
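The exact command is in a gist; a sketch is below. Dgraph does not expose gRPC server reflection, so grpcurl needs the api.proto definition first (the download URL shown, from the pydgraph repository, is an assumption):

source env.sh

############
# fetch Dgraph's protobuf definition, then call CheckVersion over gRPC/TLS
############################################
curl -sOL https://raw.githubusercontent.com/dgraph-io/pydgraph/master/pydgraph/proto/api.proto
grpcurl -proto api.proto \
  dgraph.$AZ_DNS_DOMAIN:443 \
  api.Dgraph/CheckVersion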
The response should be something like:
{
"tag": "v21.03.0"
}
Upload Data and Schema
There are some scripts adapted from the tutorials at https://dgraph.io/docs/get-started/ that you can download.
First, save the following as examples/dgraph/data/getting_started.py:
The above script is based on simple.py and tls_example.py, and is limited to working only with certificates issued by publicly trusted CAs. It can easily be adapted for insecure (no TLS) traffic, or for private certificates, including mutual TLS where a client certificate is required. For further information, check out the code repository:
The above Python script has a few module requirements, which you can install with pip from your Python 3 environment:
pip install pydgraph
pip install certifi
The next step is to download the schema and data files, and then run the script:
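The download-and-run commands are in a gist; as a sketch, once sw.schema and sw.nquads.rdf from the blog's source repository (see Resources) are placed under examples/dgraph/data/, the loader can be run against the gRPC endpoint. The argument format below is an assumption, so check getting_started.py for what it actually expects:

source env.sh

############
# load the schema and RDF data through gRPC over TLS
# (the hostname:port argument is an assumption)
############################################
python3 examples/dgraph/data/getting_started.py dgraph.$AZ_DNS_DOMAIN:443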
This will upload the schema and data file, so that we can run queries.
Connect to Ratel UI
After a few moments, you can check the results at https://ratel.example.com (substituting your domain for example.com).
In the dialog for Dgraph Server Connection, configure the domain, e.g. https://alpha.example.com (substituting your domain for example.com).
Test Using the Ratel UI
In the Ratel UI, paste the following query and click run:
You should see something like this:
Cleanup the Project
You can clean up resources that incur costs with the following:
Remove the Azure Resources
This will remove the Azure resources:
############
# Delete the AKS cluster
############################################
az aks delete \
  --resource-group $AZ_RESOURCE_GROUP \
  --name $AZ_CLUSTER_NAME

############
# Delete the Azure DNS Zone
############################################
az network dns zone delete \
  --resource-group $AZ_RESOURCE_GROUP \
  --name $AZ_DNS_DOMAIN
Resources
Here are some articles and links on related topics that I came across while writing this article.
Blog Source Code
- Blog Source Code: https://github.com/darkn3rd/blog_tutorials/tree/master/kubernetes/aks/series_1_endpoint/part_4_ingress_nginx_grpc
- nginx.conf automatically generated by ingress-nginx: https://gist.github.com/darkn3rd/43af50fe6b6229e6a13e2dad7901e0e7
Azure Resources
- VMSS manages the worker nodes used with AKS: https://docs.microsoft.com/azure/virtual-machine-scale-sets/overview
- AKS: https://azure.microsoft.com/services/kubernetes-service/
- Control access to Kubernetes via AAD: https://docs.microsoft.com/en-us/azure/aks/azure-ad-rbac
- AAD Pod Identity for AKS: https://docs.microsoft.com/en-us/azure/aks/use-azure-ad-pod-identity
Kubernetes Addons
- Cloud native certificate management: https://cert-manager.io/
- ExternalDNS: https://github.com/kubernetes-sigs/external-dns
- Nginx Ingress Controller (OpenResty): https://kubernetes.github.io/ingress-nginx/
Services used in ingress examples
- Yet another gRPC echo server: https://mhausenblas.info/yages/
- YAGES source code: https://github.com/mhausenblas/yages
- Dgraph: https://dgraph.io/
- Dgraph source code: https://github.com/dgraph-io/dgraph
- Dgraph’s Python gRPC client: https://github.com/dgraph-io/pydgraph
Ingress-Nginx and GRPC
- ingress-nginx gRPC docs: https://kubernetes.github.io/ingress-nginx/examples/grpc/
- grpc-fortune-teller application referenced in the docs (Bazel build scripts no longer compile): https://github.com/kubernetes/ingress-nginx/tree/master/images/grpc-fortune-teller
- grpc manifest examples: https://github.com/kubernetes/ingress-nginx/tree/master/docs/examples/grpc
- ingress-nginx GRPC example for ArgoCD: https://argoproj.github.io/argo-cd/operator-manual/ingress/#kubernetesingress-nginx
Protobuffers and gRPC
- protobuf documentation: https://developers.google.com/protocol-buffers
- gRPC documentation: https://www.grpc.io/docs/
- JSON vs Protobuf: https://auth0.com/blog/beating-json-performance-with-protobuf/
- Flatbuffers (not in this article, but related to protobuf): https://google.github.io/flatbuffers/
- JSON vs. Protobuf vs. Flatbuffers: https://codeburst.io/json-vs-protocol-buffers-vs-flatbuffers-a4247f8bda6f
Python (gRPC and TLS)
In doing this guide, I needed to figure out how to communicate securely with gRPC and TLS in Python. Most documentation covered using only self-signed certificates or certificates issued by a non-trusted private CA. As we are using certificates from a publicly trusted CA (Let's Encrypt through cert-manager), it was hard to find material on this. Below are some links that may be useful, should you find yourself on the same journey.
- Descartes Labs’s Python gRPC client script (great coding style): https://docs.descarteslabs.com/_modules/descarteslabs/client/grpc/client.html
- Cāobīn Táng’s (唐超斌) article on working with SSL/TLS and HTTPS with Python: https://chaobin.github.io/2015/07/22/a-working-understanding-on-SSL-and-HTTPS-using-python/
- gRPC Python Reference: https://grpc.github.io/grpc/python/grpc.html
- Real Python’s Python gRPC microservices guide: https://realpython.com/python-microservices-grpc/
- Dgraph’s Python gRPC client: https://github.com/dgraph-io/pydgraph
- gRPC Basics Tutorial for Python: https://grpc.io/docs/languages/python/basics/
- GRPC Quick Start for Python: https://grpc.io/docs/languages/python/quickstart/
Conclusion
This article was a challenge in two pieces: getting gRPC to work alongside existing HTTPS services that may still be needed, and testing gRPC itself, where the requirements go beyond a simple curl command. The underlying protocol is different: it requires serialization through protobuf, and HTTP/2, unlike HTTP/1.1, is not stateless, so load balancers have to balance HTTP/2 sessions rather than individual transactions.
The Python script (or a script in any other language, for that matter) was a little challenging, as gRPC services will almost never be available publicly, at least not without some form of authentication. Typically, for internal private communication, such as microservice-to-microservice or microservice-to-graph-database, private certificates are perfectly fine.
So I think this concludes this series of articles. I could perhaps document how to further lock down Azure or these services, but more likely I will try out some other options for gRPC with different ingress controllers, like Contour, Ambassador, Gloo, or Traefik, or jump into service meshes like Linkerd or Istio.