
DevOps Concepts: Bake vs Fry 2
Part 2: Configuration Methods during the Cloud Age
In the last article we introduced the concept of system configuration using baking and frying, and we covered ways this was applied during the Iron Age.
We covered:
- Imaging Client Systems (baking)
- Provisioning Server Systems (frying)
- Change Configuration (frying)
- Application Deployments (frying)
- Service Discovery (frying)
Now we’ll show how this concept is applied in the Cloud Age.
Previous Article
The Cloud Age
With the arrival of the Cloud Age, the entire data center was virtualized, not just the virtual machine instances, but storage and network infrastructure as well.
These new virtual resources are accessible through a web interface from the IaaS platforms, like AWS (2006), Google Cloud (2008), and Azure (2010), or OpenStack. Suddenly now with a few mouse clicks or a small script, you can create the whole infrastructure of your organization.
As a greater number of applications were released as web services in the cloud, whether software as a service (SaaS) or platform as a service (PaaS), focus shifted toward maintaining fleets of instances and deploying web applications to those instances.
Provisioning Instances (Frying)
Configuring systems with virtualization on your own hardware and with cloud virtualization is virtually the same. The one exception is that you can no longer use network booting for early stage provisioning. Instead, with a new virtual machine instance, you select a pre-baked image supplied by the cloud provider, such as Debian, Red Hat, or Ubuntu.
With your operating system image selected, you configure an SSH key (so that you can access the systems later), and optionally use a startup script solution, such as cloud-init, to configure the system at launch time. From here, you can provision the whole system at launch or hand off to a change configuration solution, like Puppet (2005), Chef (2009), or the more recent arrivals SaltStack (2011) and Ansible (2012).
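As a sketch of the launch-time approach, a minimal cloud-init user data file might look like this (the package, service, and key value here are illustrative placeholders, not from a specific setup):

```yaml
#cloud-config
# Illustrative cloud-init user data: install a web server, start it,
# and drop an SSH public key for later access (key is a placeholder).
packages:
  - nginx
runcmd:
  - systemctl enable --now nginx
ssh_authorized_keys:
  - ssh-rsa AAAA...example-key
```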

The above illustration shows the approaches to configuring systems:
- Provisioning everything at launch of the system using cloud-init or another alternative.
- Mixed Approach, where provisioning installs an agent (like the Puppet agent or Chef client) that later configures the remaining components through convergence.
- Change Configuration used to configure the whole system after launch. This requires push-based remote execution, such as Chef's knife or Ansible, to either install an agent or configure the system directly.
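For the push-based approach, a single ad-hoc command can configure systems over SSH with no pre-installed agent. Using Ansible as one example (the inventory file and package name here are hypothetical):

```shell
# Push a package install to every host in a hypothetical inventory
# over SSH; nothing needs to be pre-installed on the target systems.
ansible all -i inventory.ini -m apt -a "name=nginx state=present" --become
```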
Making Images (Baking)
Instead of the selection of images from the cloud provider, you can decide to bake your own images with tools like:
- Netflix Aminator (2013–2016) for AWS
- HashiCorp Packer (2013) for many providers
Building images with such automation is far more efficient than before, especially with Packer, where you can use previously popular tools like Debian Preseed and Kickstart to build the system, and then use Packer provisioners to run change configuration tools like Chef, Ansible, Puppet, and SaltStack to finish baking the image.
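As a rough sketch of this bake-then-provision flow, a Packer template might look like the following (the region, AMI ID, and playbook name are illustrative assumptions):

```json
{
  "builders": [{
    "type": "amazon-ebs",
    "region": "us-east-1",
    "source_ami": "ami-0123456789abcdef0",
    "instance_type": "t2.micro",
    "ssh_username": "ubuntu",
    "ami_name": "web-baked-{{timestamp}}"
  }],
  "provisioners": [{
    "type": "ansible",
    "playbook_file": "web.yml"
  }]
}
```

Packer launches a temporary instance from the source image, runs the provisioners against it, then snapshots the result as a new versioned image.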
Despite the efficiency of building images this way, there's still the Golden Image problem, where curating a library of images in various combinations becomes a maintenance nightmare.
To Bake or Not to Bake?
Despite the higher costs to build and maintain baked images, the costs for deployment and remediation (repairing systems) go down dramatically when using images. Systems are disposed of and replaced when they need to be upgraded or repaired, a pattern called immutable infrastructure.
Additionally, with baked images, a single image encapsulates the application, its dependencies, and configuration into a single deployable artifact that can be versioned.
In contrast, when frying with incremental changes on a mutable system, you have multiple artifacts that need to be rigorously maintained, synchronized, and tested. Systems are upgraded and repaired incrementally, sometimes without automation. This leads to higher complexity when deploying systems.

The illustration shows:
- Fully Baked Image: a single artifact that is disposed of and replaced
- Frying in Stages: four artifacts (system image, provisioning scripts, change configuration scripts, and application deployment scripts) used in conjunction to support the application.
Self Healing Managed Instances
Wouldn’t it be great if we didn’t have to configure systems each and every time? What if we just set them up once, and afterward, they managed themselves? Or if they failed or crashed, they just replaced themselves?
This is actually possible with what, for lack of a better term, we’ll call cloud managed groups:
- ASG (Auto Scaling Groups) from AWS
- MIGs (Managed Instance Groups) from Google Cloud
- Virtual Machine Scale Sets from Microsoft Azure
You declare some attributes in a configuration template, such as how many identical systems belong in the fleet, the health checks used to determine whether an instance is functional, and rules for scaling the number of instances up or down depending on usage.
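As one illustration of such a template, an AWS CloudFormation fragment declaring fleet size and health checks might look roughly like this (the resource names and the referenced launch configuration are hypothetical):

```yaml
# Hypothetical CloudFormation fragment for an Auto Scaling Group:
# fleet size bounds, load-balancer-based health checks, and a
# grace period before health checks begin.
Resources:
  WebFleet:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      MinSize: "2"
      MaxSize: "10"
      DesiredCapacity: "4"
      HealthCheckType: ELB
      HealthCheckGracePeriod: 300
      LaunchConfigurationName: !Ref WebLaunchConfig
```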

The above illustration is a generalized overview of the process. When you apply your configuration template, the cloud provider will launch a fleet of instances and park them all behind a load balancer.
The cloud provider uses what can be described as a reconciliation loop (to borrow a term from Kubernetes): it continually monitors the health of your fleet of instances, leveraging the health check mechanism of the load balancer in conjunction with a monitoring and alerting system.
As part of the fleet management process, any failed instances will be removed from the load balancer set and replaced with new instances. As part of the dynamic scaling process, instances will be added or removed as needed when the load on the existing instances increases or decreases.
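The reconciliation loop described above can be sketched in a few lines of Python; the health check and launch functions here are stand-ins for real cloud provider calls, not an actual API:

```python
def check_health(instance):
    # Stand-in for a load balancer / monitoring health check.
    return instance["healthy"]

def launch_instance(instance_id):
    # Stand-in for a provider API call that creates a VM and runs
    # its launch-time provisioning (user data, startup script).
    return {"id": instance_id, "healthy": True}

def reconcile(fleet, desired_count):
    """One pass of a reconciliation loop: drop failed instances,
    then launch or remove instances toward the desired count."""
    fleet = [i for i in fleet if check_health(i)]  # remove failed
    next_id = len(fleet)
    while len(fleet) < desired_count:              # replace / scale up
        fleet.append(launch_instance(f"i-{next_id}"))
        next_id += 1
    return fleet[:desired_count]                   # scale down if needed
```

A real provider runs this loop continuously, so the fleet converges back to the declared state after any failure.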
When launching new instances as part of a fleet, they either need to be fully baked for their role or provisioned (fried) at launch time (with an optional hand-off to a change configuration solution). The cloud provider will have some mechanism to provision the system at launch:
- AWS EC2 user data (shell script or cloud-init)
- GCE startup script
- Azure extension profile
Cloud Provisioning (Frying)
We discussed configuring systems through baking or frying, but what about configuring the cloud itself?
Change Configuration platforms are good at configuring system resources: files, packages, services, users, groups, etc. What about cloud resources: virtual machine instances, virtual networks, virtual storage, auto-scale groups?

Configuring cloud resources is also sometimes called cloud provisioning; combined with configuring operating systems, you can have end-to-end infrastructure as code.
In my experience, these are the three most popular platforms for cloud provisioning:
- HashiCorp Terraform (2014) for AWS, Google Cloud, Azure, OpenStack, and many others
- AWS CloudFormation (2011) for AWS only
- Ansible Cloud Modules (2014): AWS, Google Cloud, Azure, OpenStack, others
Below is a more comprehensive list of other solutions, sorted in order of their release:
- Pallet (2010): AWS, OpenStack
- Chef Provisioning (2013): AWS and others through fog gem
- Salt Cloud (2013): AWS and other smaller providers
- Troposphere (2013): AWS, OpenStack
- Azure Resource Manager (2014): Azure
- Cloud Deployment Manager (2018): Google Cloud
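As a minimal illustration of cloud provisioning as code, a single instance on AWS could be declared in Terraform like this (the AMI ID and tag values are placeholder assumptions):

```hcl
# Hypothetical Terraform configuration: one AWS instance as code.
provider "aws" {
  region = "us-east-1"
}

resource "aws_instance" "web" {
  ami           = "ami-0123456789abcdef0"
  instance_type = "t2.micro"

  tags = {
    Name = "web-1"
  }
}
```

Running `terraform apply` compares this declared state against what exists in the cloud and creates, updates, or destroys resources to match.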
Service Discovery (Frying)
The previous article introduced service discovery for cluster membership or external services during the Iron Age. The application itself can query a discovery service to find other members in its set as well as external services, such as a database or message queue.
General Discovery Solutions (any cloud)
I am repeating the list from the last article. These can be implemented on any cloud platform.
- Apache ZooKeeper (2010): key-value store; discovery must be built into the application logic itself
- Airbnb SmartStack (2012): service discovery, key-value store, health checking
- CoreOS Etcd (2013): service discovery, key-value store, events
- HashiCorp Serf (2013): node discovery, group membership, failure detection, events
- HashiCorp Consul (2014): service discovery, key-value store, health checking
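As one concrete example, registering a service with a local Consul agent over its HTTP API looks roughly like this (the service name, port, and health check URL are illustrative, and the commands assume a running Consul agent):

```shell
# Register a hypothetical "web" service with the local Consul agent,
# including an HTTP health check, then query its catalog entry.
curl -X PUT http://localhost:8500/v1/agent/service/register \
  -d '{"Name": "web", "Port": 80,
       "Check": {"HTTP": "http://localhost:80/health", "Interval": "10s"}}'
curl http://localhost:8500/v1/catalog/service/web
```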
AWS Discovery Solutions
These two were built for AWS only.
- Netflix Eureka (2012): service discovery tightly coupled with other Netflix services.
- Cloud Map (2018): service discovery that can register resources and broadcast events on availability of those registered resources.
Cloud Metadata (Frying)
As an alternative to using service discovery, you can use cloud metadata.
The Cloud Age adds a new dimension in that services now include cloud resources exposed through a web interface. Each cloud provider offers metadata features that can be used as a form of service discovery.
Google Cloud Metadata
- Instance Tags: list of values that can be added to an instance
- Resource Labels: key-value pairs that serve as lightweight method to group resources together
- Instance Metadata: information about running instance such as IP address, machine type, zone, networks, and other information.
- Project Metadata: shared environment variables and configs that applications can access securely.
Azure Metadata
- Azure Tags: key-value pairs associated to resources
- Azure Instance Metadata: information about the running VM instance, its network configuration, and more.
AWS Metadata
- EC2 Tags (2010): key-value pairs associated to resources
- Systems Manager Parameter Store (2017): store key-value pairs
- Secrets Manager (2018): store and retrieve encrypted secrets
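From inside an EC2 instance, these metadata sources can be queried directly; for example (the metadata endpoint is only reachable from inside an instance, and the parameter name is hypothetical):

```shell
# Query the EC2 instance metadata service for this instance's ID.
curl http://169.254.169.254/latest/meta-data/instance-id

# Read a hypothetical configuration value from Parameter Store.
aws ssm get-parameter --name /myapp/db-url --with-decryption
```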
Disposability Patterns
With all these technologies and methods, the question arises of how to go about managing systems, from single systems to fleets of systems.
Before we dive into this, I want to take a moment to talk about two related metaphors: pets vs. cattle and snowflake servers vs. phoenix servers.
Pets vs. Cattle
The pets vs. cattle metaphor centers around whether instances are managed as a group (cattle) or as single instances (pets). When managed as a group, the instances are replaceable; when managed individually, they are irreplaceable.
A while back I wrote an earlier article that expanded on this topic:
Snowflake Server vs. Phoenix Server
The snowflake server vs. phoenix server concept centers around configuration drift and the disposability of instances.
Instances become snowflakes as their configuration gradually diverges, or drifts, from the desired state. To remedy this, you need to continually monitor the system and incrementally configure it back to the desired state.
As an alternative, you can dispose of the snowflake server completely, and replace it with a newly provisioned instance. The new instance rises from the ashes of the disposed instance, hence the term phoenix server.
A phoenix server can be created by frying (provisioning or change configuration) or baking (imaging). The latter is also called an immutable server.
Combining the Concepts
The illustration below shows the two concepts combined and illustrates whether baking or frying is applicable.

These are the categories of tools you might use to support such systems:

Types of Tools include the following:
- Bubble Gum and Scripts: scripting language often combined with undocumented commands, mouse clicks in web interface, etc.
- Provisioning (Early Stage): Kickstart, Debian preseed, Cloud-Init
- Change Configuration: CFEngine, Chef, Ansible, Puppet, SaltStack
- Cloud Managed Groups: AWS ASG, Google Cloud MIG, Azure VMSS
- Imaging: Packer or custom solution
Conclusion
In the Cloud Age, we saw further virtualization of all resources beyond virtual server instances, and new opportunities for automation: using infrastructure as code to configure not only components in the operating system, but also components in the cloud, such as networks and storage.
In the next article, I will discuss containers and orchestration and scheduling solutions for containers, similar to cloud managed groups, and show how baking and frying relate to these topics.
References
Immutable Infrastructure
- What Is Immutable Infrastructure? by Hazel Virdó on September 26, 2017.
Infrastructure as Code (IaC)
- What is Infrastructure as Code by Sam Guckenheimer on April 3rd, 2017.
- Infrastructure as Code: Why is it Important on August 20th, 2018
- Infrastructure as Code: A Reason to Smile by Jafari Stiakange on March 14th, 2016
Phoenix vs. Snowflake
- Snowflake Server by Martin Fowler on July 10, 2012
- Phoenix Server by Martin Fowler on July 10, 2012
- Immutable Server by Kief Morris on June 13, 2013
- Configuration Drift: Phoenix Server vs Snowflake Server Comic by Teddy Hose, Masami Kubo and Hazel Virdo on March 26, 2018
Google Cloud Topics
- Service Discovery and Configuration on Google Cloud Platform — Spoiler: It’s built in! by Sandeep Dinesh on Jan 7, 2016
AWS Topics
- New — Tag EC2 Instances & EBS Volumes on Creation by Jeff Barr on March 28, 2017
- The Right Way to Store Secrets using Parameter Store by Ananth Vaidyanathan on August 27, 2017.
- AWS Cloud Map: Easily create and maintain custom maps of your applications by Abby Fuller on November 28, 2018.