
Bootstrapping GCE Instances with Knife
Part I: Google Cloud Platform
This article covers the knife tool’s role within the Chef workflow and how to get started with Google Cloud by creating an SSH deploy key that knife will use for its bootstrap process.
The Knife Tool
Knife is an orchestration tool that interacts with the Chef Server to update Chef components, such as nodes, environments, roles, and cookbooks, and to bootstrap newly created nodes.
The bootstrap process will do the following:
- register a node configuration on the Chef Server
- install a chef agent on the remote system
- configure the remote system to access the Chef Server
- proceed to apply role(s) or cookbook recipes on the remote system
All of this magic happens over SSH: as long as you have configured SSH and enabled a deploy key on the target systems, you can remotely configure them to their initial desired state.
This guide will show you how to do this with GCP (Google Cloud Platform) and AWS (Amazon Web Services).
Google Cloud Platform
If you are new to GCP, you can get started with the free tier, and afterward download and install the Google Cloud SDK. This will create a default project, Google’s organizational category for managing cloud resources.
At this point we’ll need to run through the following steps:
- Generate SSH key pair (our deploy key)
- Install SSH key into our project
- Create some systems
- Bootstrap some systems using our key
Generate SSH Key Pair
This is the typical process to generate a key pair, which will create a private key named gce.key and a public SSH key named gce.key.pub.
KEYPATH="${HOME}/.ssh"
ssh-keygen -t rsa -b 4096 -f "${KEYPATH}/gce.key" -C ubuntu -q -N ""
Install SSH Public Key into our Project
We need to create a special keysfile format that the gcloud tool requires for this process. The format of this keysfile will look like this (with users foo, bar, and baz):
foo:ssh-rsa <PUBLICKEY> foo
bar:ssh-rsa <PUBLICKEY> bar
baz:ssh-rsa <PUBLICKEY> baz
We can easily craft this file in shell with the following:
KEYPATH="${HOME}/.ssh"
GCE_KEYSFILE="${KEYPATH}/gce.keysfile"
printf '%s:%s\n' 'ubuntu' "$(cat "${KEYPATH}/gce.key.pub")" \
>> "${GCE_KEYSFILE}"
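Because the command above appends with >>, running it a second time leaves a duplicate entry in the keysfile. A minimal sketch of a re-runnable variant follows; the function name add_key_entry is hypothetical and not part of gcloud or the original workflow.

```shell
# Hypothetical helper: append "<user>:<public key>" to a keysfile only if
# that exact line is not already present.
add_key_entry() {
  local user="$1" pubkey_file="$2" keysfile="$3"
  local entry
  entry="$(printf '%s:%s' "${user}" "$(cat "${pubkey_file}")")"
  # Create the keysfile if it does not exist yet.
  touch "${keysfile}"
  # -F fixed string, -x whole-line match, -q quiet
  if ! grep -Fxq -- "${entry}" "${keysfile}"; then
    printf '%s\n' "${entry}" >> "${keysfile}"
  fi
}

# Example usage (paths follow the conventions used above):
# add_key_entry ubuntu "${HOME}/.ssh/gce.key.pub" "${HOME}/.ssh/gce.keysfile"
```

Calling the function twice with the same arguments leaves exactly one line in the keysfile.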
Once we have our keysfile, we can upload the data to our default project, so that future systems will create the ubuntu account with the installed public key:
GCE_KEYSFILE="${HOME}/.ssh/gce.keysfile"
gcloud compute project-info add-metadata \
--metadata-from-file ssh-keys="${GCE_KEYSFILE}"
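One caveat: add-metadata sets the whole ssh-keys value, so any keys already stored in the project are replaced rather than extended. A rough sketch of merging the existing entries with the new ones first is shown below. The helper name merge_keysfiles and the file names existing.keys and merged.keys are hypothetical, and the --format expression in the usage comment is an assumption about the project metadata layout that you should verify against your own project.

```shell
# Hypothetical helper: combine key entries that already exist in the project
# with new ones, dropping blank lines and exact duplicates while keeping the
# original order.
merge_keysfiles() {
  # $1 = file with keys already in the project, $2 = file with new keys
  awk 'NF && !seen[$0]++' "$1" "$2"
}

# Example usage (requires an authenticated gcloud):
# gcloud compute project-info describe \
#   --format="value(commonInstanceMetadata[items][ssh-keys])" > existing.keys
# merge_keysfiles existing.keys "${HOME}/.ssh/gce.keysfile" > merged.keys
# gcloud compute project-info add-metadata \
#   --metadata-from-file ssh-keys=merged.keys
```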
Create Some Systems
Now we can create some systems, such as a three-node Elasticsearch cluster (or whatever you desire). In the Google Cloud console, you can find this under the Compute Engine area, in the VM Instances sub-section. There’s a CREATE INSTANCE button that opens a graphical form for configuring the new instance.

Select an Ubuntu 14 image, and create the systems, e.g. es-01, es-02, es-03. The defaults are fine for this example. We can view the results with gcloud compute instances list, and inspect the metadata about a system with gcloud compute instances describe <instance_name>.
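The same instances can also be created from the command line instead of the console. The sketch below is one way to do it, not the only one: the zone and the ubuntu-1404-lts image family are assumptions (that image family is old and may no longer be offered, so substitute a current one), and the function name create_nodes is hypothetical.

```shell
# Assumed values, not taken from the article: adjust to your project.
ZONE="${ZONE:-us-central1-a}"
IMAGE_FAMILY="${IMAGE_FAMILY:-ubuntu-1404-lts}"   # may no longer be available
IMAGE_PROJECT="${IMAGE_PROJECT:-ubuntu-os-cloud}"

# Hypothetical wrapper: create one instance per name passed as an argument.
create_nodes() {
  gcloud compute instances create "$@" \
    --zone "${ZONE}" \
    --image-family "${IMAGE_FAMILY}" \
    --image-project "${IMAGE_PROJECT}"
}

# Example usage (requires an authenticated gcloud):
# create_nodes es-01 es-02 es-03
```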
We can log into a system using its public external IP address, as reported by gcloud compute instances list, and our generated private SSH key for the ubuntu user:
IP=$(gcloud compute instances list | grep es-01 | awk '{print $5}')
ssh -i ~/.ssh/gce.key ubuntu@${IP}
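The grep and awk lookup above depends on the column layout of the table output, and grep es-01 would also match a name such as es-010. A slightly sturdier sketch asks gcloud for the field directly with --filter and --format; the function name external_ip is hypothetical, and it assumes the instance has a single network interface with one external access config.

```shell
# Hypothetical helper: print the external (NAT) IP of one instance by name.
external_ip() {
  gcloud compute instances list \
    --filter="name=$1" \
    --format="value(networkInterfaces[0].accessConfigs[0].natIP)"
}

# Example usage (requires an authenticated gcloud):
# ssh -i ~/.ssh/gce.key "ubuntu@$(external_ip es-01)"
```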
Bootstrap Some Systems Using the Private Key
Now that we can access our systems using our generated and installed key, we can easily bootstrap with a script (bash v4 required, e.g. brew install bash for macOS users) like this:
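A minimal sketch of what such a script might look like is shown here. It is an illustration under assumptions rather than a definitive implementation: the function names are hypothetical, the short knife flags follow the older knife releases that match the Ubuntu 14 era (newer Chef versions rename the SSH user flag), and the run list is hard-coded by node name.

```shell
KEY="${HOME}/.ssh/gce.key"

# Hard-coded mapping from node name to run list.
run_list_for() {
  case "$1" in
    es-*) echo 'role[elasticsearch]' ;;
    *)    echo '' ;;
  esac
}

# Bootstrap a single node over SSH as the ubuntu user with our deploy key.
bootstrap_node() {
  local name="$1" ip="$2"
  knife bootstrap "${ip}" \
    -x ubuntu \
    -i "${KEY}" \
    --sudo \
    -N "${name}" \
    -E production \
    -r "$(run_list_for "${name}")"
}

# Bootstrap every instance in the project, one name/IP pair per line.
bootstrap_all() {
  gcloud compute instances list \
    --format="value(name,networkInterfaces[0].accessConfigs[0].natIP)" |
  while read -r name ip; do
    # Redirect stdin so knife's SSH session cannot swallow the loop's input.
    bootstrap_node "${name}" "${ip}" </dev/null
  done
}

# Example usage (requires gcloud, knife, and a configured knife.rb):
# bootstrap_all
```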
The snippet above is a simplified example, and needs to be tailored to your environment. There are some assumptions and requirements for your Chef repository:
- an environment configuration: /path/to/chef_repo/environments/production.json
- configured credentials in knife.rb for your Chef Server
- a role called elasticsearch with the required Chef cookbooks that make this magic happen.
Also, these were not addressed for brevity:
- systems are not ordered, so if they need to be bootstrapped in a particular order, for example because of a shared dependency, then you’ll need to re-order the list of systems.
- if you need to override attributes, such as generating a list of internal IPs for configuring discovery in Elasticsearch, then you would need to use --json-attributes with a custom crafted JSON string.
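As a rough illustration of the second point, the sketch below builds such a JSON string from a list of internal IPs. The function name discovery_json and the attribute path elasticsearch.discovery.hosts are hypothetical; use whatever attribute your cookbook actually reads.

```shell
# Hypothetical helper: turn a list of IPs into a JSON attributes string.
discovery_json() {
  local hosts="" ip
  for ip in "$@"; do
    # Add a comma separator before every element except the first.
    hosts="${hosts:+${hosts},}\"${ip}\""
  done
  printf '{"elasticsearch":{"discovery":{"hosts":[%s]}}}' "${hosts}"
}

# Example usage (requires gcloud and knife; networkIP is the internal address):
# ips="$(gcloud compute instances list \
#   --format="value(networkInterfaces[0].networkIP)")"
# knife bootstrap <external_ip> ... --json-attributes "$(discovery_json ${ips})"
```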
A note about the script itself: the script has ephemeral state embedded into the code logic, which is an anti-pattern in my view. Specifically, the script relies on a hard-coded static run list chosen by node name. A better solution, omitted for brevity, would be to:
- create a data structure extracted from a file (like TSV or other delimiter separated file) that has system name (or grep pattern) and the corresponding run list, which itself is a string with a comma separated list.
- lookup the run list and insert it into the knife bootstrap command.
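These two steps could be sketched as follows. This is only one possible shape: the function name lookup_run_list and the file layout are hypothetical, and the first column is treated as a shell glob rather than a grep pattern to keep the example short.

```shell
# Hypothetical file format, one entry per line:
#   <name pattern><TAB><run list>
# The pattern is a shell glob matched against the node name; the run list is
# a comma-separated string passed to knife unchanged.
lookup_run_list() {
  local node="$1" map_file="$2" pattern run_list
  while IFS="$(printf '\t')" read -r pattern run_list || [ -n "${pattern}" ]; do
    case "${node}" in
      ${pattern}) printf '%s\n' "${run_list}"; return 0 ;;
    esac
  done < "${map_file}"
  # No pattern matched this node.
  return 1
}

# Example usage (run_lists.tsv is a hypothetical file name):
# knife bootstrap <external_ip> ... -r "$(lookup_run_list es-01 run_lists.tsv)"
```

The first matching line wins, so more specific patterns should be listed before broader ones.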
I kept the script as is to illustrate the process involved; we can expand upon this in future articles.
Conclusion
There you have it: the process to bootstrap systems with knife through an SSH deploy key installed into your Google Cloud project. In Part II, I will document how to do the same process with AWS.