DevOps Concepts: Bake vs Fry 1
When provisioning a system, you have a choice: bake the configuration into the system before launching it, or add the configuration after launch, which is called frying. In reality, though, you’ll likely use a mixture of the two, where parts of the system are baked and ready to go, and other parts are fried.
The concept applies to all ages, from the Iron Age of computing, using raw hardware, to the Cloud Age, using virtual systems and later container technologies. However, the implementation and relative popularity of each method vary between the ages.
This will be a three-part series covering baking and frying:
- The Iron Age
- The Cloud Age (Virtualization)
- The Cloud Age (Containers)
The Iron Age
Before the rise of popular cloud platforms and virtualization, enterprises built their own data centers using hardware (iron). During this age, enterprises managed large consolidated centralized servers and fleets of client systems, sometimes called workstations (desktop or laptop).
Baking Client Systems
Managing fleets of client systems was expensive, so baking became a popular method for managing them. An enterprise would bake a standard corporate image and, through a process called imaging, replace the existing system with the corporate image.
The Golden Image Problem
For servers, this was not as popular, because imaging technologies at the time were slow, and there was the infamous Golden Image problem: numerous combinations of configurations lead to a large image library that must be maintained.
As an example of this problem, suppose operations were tasked with building systems that support combinations of 12 components, such as databases, web services, and application versions. Operations would need at least 132 golden images just to cover the 12 × 11 pairings of two components, and far more if every possible subset had to be supported.
Maintaining such a library became known as the Golden Image problem, and is one reason why baking at the time was not popular on servers.
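To get a feel for how quickly such a library grows, here is a rough counting sketch. It assumes the 12 components are independent and optional, which is a simplification; the variable names are illustrative only.

```python
from math import comb, perm

COMPONENTS = 12  # number of components in the example above

# Ordered pairings of two distinct components: 12 * 11
ordered_pairs = perm(COMPONENTS, 2)

# Every non-empty subset of the components: 2**12 - 1
all_subsets = sum(comb(COMPONENTS, k) for k in range(1, COMPONENTS + 1))

print(ordered_pairs)  # 132
print(all_subsets)    # 4095
```

Adding a thirteenth component roughly doubles the subset count, which is why the library becomes unmanageable so quickly.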
Frying on Iron Servers
Though the golden image problem made baking undesirable for servers, there was still a great need for automation. This arrived through provisioning systems like JumpStart for SunOS (1994), Anaconda on Red Hat Enterprise Linux (1999), and FAI (Fully Automatic Installation) with Debian. As with imaging solutions, network booting was also used to provision systems.
The provisioning solution would use automation scripts, such as a Kickstart script or a Debian preseed script, at the launch of a new system to configure it to a baseline with the desired components.
Eventually a system’s configuration will drift away from the baseline or the desired state. Changing the system back to the desired state was expensive and risky, but leaving a system out of alignment was also risky.
The solution for this arrived in the form of CFEngine (1993) for Unix and Linux systems. CFEngine applies promise theory to converge a system to the desired state using an agent installed on each system. When the system diverges from the desired state, the agent changes the system to meet its promise, converging it back to the desired state.
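The idea of convergence can be sketched in a few lines. This is not CFEngine's language or implementation, only a minimal illustration that assumes a system's state can be modeled as a dictionary; `converge`, `desired`, and `actual` are hypothetical names.

```python
def converge(actual: dict, desired: dict) -> list:
    """Bring `actual` in line with `desired`; return the changes that were made."""
    changes = []
    for key, want in desired.items():
        have = actual.get(key)
        if have != want:
            changes.append((key, have, want))
            actual[key] = want  # repair only what has drifted
    return changes

# Hypothetical desired state and a system that has drifted from it
desired = {"ntp": "installed", "sshd_port": 22}
actual = {"ntp": "missing", "sshd_port": 22}

first = converge(actual, desired)   # repairs the drifted "ntp" entry
second = converge(actual, desired)  # nothing left to change
```

Running the function a second time makes no changes, which is the property that lets an agent run repeatedly without harm.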
With the rising popularity of Linux data centers and a growing need for these solutions, new change configuration platforms arrived, such as Puppet and Chef.
Rise of Virtualization
As virtualization became more mainstream, we saw a shift from single consolidated servers to numerous virtual systems on the same hardware box.
The common approach is to use a mixture of frying tools: early-stage provisioning with tools like Kickstart and Cobbler creates a baseline (security, monitoring, agents, packages) at launch, and after launch a change configuration solution like Puppet or Chef installs frameworks and deploys applications.
The above illustration shows the approaches to frying the configuration on systems:
- Provisioning configures the system at launch.
- Mixed Approach usually installs a change configuration agent at launch, then afterward converges the configuration to the desired state.
- Change Configuration converges the configuration to the desired state after the system is launched.
The application, such as a SaaS web service, is the one component that is updated frequently on a system. Installing a new version of the application is called deployment.
The web application can be deployed using either a pull or a push method.
In the pull method, the build systems, now using continuous integration solutions like Jenkins, build an artifact in the form of a software package, such as a jar. These artifacts are stored in an artifact repository. The change configuration solution, like Puppet or Chef, then installs the approved software package.
In the push method, the software is installed directly on the web servers hosting the application. Here, change configuration solutions like Puppet or Chef only configure the system, such as a web server like Apache or Nginx. Then a tool like Capistrano (2006), Fabric (2008), Mina (2012), or MCollective (2011-2018) pushes the application to these servers.
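The difference between the two methods is mostly about who initiates the install. The sketch below models that with plain Python objects; it does not use the API of any of the tools named above, and `Server`, `ARTIFACT_REPO`, `pull_deploy`, and `push_deploy` are all hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class Server:
    name: str
    app_version: str = "none"

# Hypothetical artifact repository: approved version -> artifact file name
ARTIFACT_REPO = {"1.4.2": "webapp-1.4.2.jar"}

def pull_deploy(server: Server, approved_version: str) -> None:
    """Pull: the agent on the server fetches the approved artifact itself."""
    if approved_version not in ARTIFACT_REPO:
        raise LookupError(f"version {approved_version} is not in the repository")
    if server.app_version != approved_version:
        server.app_version = approved_version

def push_deploy(servers: list, version: str) -> list:
    """Push: a central tool walks the server list and installs on each one."""
    for server in servers:
        server.app_version = version
    return [server.name for server in servers]

servers = [Server("web1"), Server("web2")]

pull_deploy(servers[0], "1.4.2")         # web1 pulls the approved package
after_pull = servers[0].app_version

pushed = push_deploy(servers, "1.4.3")   # the deploy tool pushes to every server
```

In the pull case the repository acts as the gate for what may be installed; in the push case the deploy tool decides what goes where and when.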
Push Deploy or Pull Deploy?
So… which one is better? Push Deploy or Pull Deploy? Well… that depends.
Large enterprise organizations that require strict controls, especially in heavily regulated industries, will likely use the pull method.
A lean startup that needs to deploy software regularly, or an organization that needs complex orchestration, such as stopping and starting a message queue or applying a database schema before making the web application available, will want to use the push method.
With the growing need for push deployments, newer arrivals like Salt Stack (2011) and Ansible (2012) have become popular. These tools can use the push method to deploy applications, facilitate orchestration, perform remediation to repair systems, and converge configuration like a change configuration solution.
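Orchestration of this kind comes down to running steps in a fixed order and stopping if one fails. A minimal sketch, assuming each step is a callable that returns True on success; the step names and the `step` and `run_in_order` helpers are hypothetical and stand in for real commands:

```python
log = []

def step(name: str, ok: bool = True):
    """Build a hypothetical action that records its name and reports success."""
    def action() -> bool:
        log.append(name)
        return ok
    return action

def run_in_order(steps: list) -> list:
    """Run (name, action) pairs in sequence; stop at the first failure."""
    completed = []
    for name, action in steps:
        if not action():
            break
        completed.append(name)
    return completed

completed = run_in_order([
    ("stop_queue", step("stop_queue")),
    ("migrate_schema", step("migrate_schema")),
    ("deploy_app", step("deploy_app")),
    ("start_queue", step("start_queue")),
])
```

A push tool makes this ordering explicit in one place, which is harder to express when each server converges independently on its own schedule.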
Zero Configuration with Discovery
Wouldn’t it be great if you didn’t need to configure your applications? The applications could just discover the configuration, such as the address of a database server, and then configure themselves. Well, that is actually possible with service discovery.
Service discovery is a method for discovering the availability of services. For a cluster like Hadoop, Apache Storm, or ElasticSearch, where an application exists as a single service distributed across several nodes instead of many services on a single node, service discovery allows the cluster application to find all of its members and to detect errors in any of them.
One essential feature of service discovery is a distributed key-value store, where you can save configuration artifacts. The application can fetch these artifacts to configure itself, which is self-frying.
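As a rough illustration of self-frying, the sketch below uses an in-memory dictionary as a stand-in for a distributed key-value store. A real store would be reached over the network through its own client library; the key names, host name, and `load_config` helper are assumptions made for the example.

```python
# Stand-in for a distributed key-value store (hypothetical keys and values)
KV_STORE = {
    "webapp/db_host": "db1.example.internal",
    "webapp/db_port": "5432",
    "other/setting": "ignored",
}

def load_config(store: dict, prefix: str) -> dict:
    """Collect every key under `prefix` and strip the prefix from its name."""
    return {
        key[len(prefix):]: value
        for key, value in store.items()
        if key.startswith(prefix)
    }

# At startup the application fetches its own settings and configures itself
config = load_config(KV_STORE, "webapp/")
db_address = f"{config['db_host']}:{config['db_port']}"
```

Nothing about the database address is baked into the application or written by a provisioning script; the application finds it at run time.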
Some service discovery solutions provide a mechanism for health checks and event broadcasts that your application can use to do remediation, where it can repair itself or self-heal.
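The self-healing loop can be sketched as a health check followed by a repair action. This is only an illustration with simulated state, not the health-check API of any particular tool; `heal`, `state`, and the node names are hypothetical.

```python
def heal(members: list, is_healthy, restart) -> list:
    """Restart every member that fails its health check; return their names."""
    restarted = []
    for member in members:
        if not is_healthy(member):
            restart(member)
            restarted.append(member)
    return restarted

# Simulated cluster state: node name -> whether it currently passes its check
state = {"node1": True, "node2": False, "node3": True}

restarted = heal(
    list(state),
    is_healthy=lambda node: state[node],
    restart=lambda node: state.__setitem__(node, True),
)
```

In practice the check and the repair would be triggered by the discovery tool's events rather than by a direct function call.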
These are some of the solutions under the umbrella of service discovery:
- Apache Zookeeper (2010): key-value store, discovery must be built into application logic itself
- AirBnB Smart Stack (2012): service discovery, key-value store, health checking
- CoreOS Etcd (2013): service discovery, key-value store, events
- Hashicorp Serf (2013): node discovery, group membership, failure detection, events
- Hashicorp Consul (2014): service discovery, key-value store, health checking
HashiCorp also has a few complementary tools that can work with Consul:
- HashiCorp Vault (2015): stores secret artifacts securely (using Consul or another solution as a backend), and provides encryption services, a certificate authority, auditing of secrets, etc.
- Consul-Template: queries either Consul or Vault to apply templates and configure systems, which is a form of frying.
There you have it, numerous applications of frying and baking for the Iron Age. Despite the allure of cloud virtualization IaaS platforms, like AWS (2006), Google Cloud (2008), and Azure (2010), running a data center on your own hardware is still quite popular in the industry.
Many enterprises use a mixture of both, called hybrid cloud, where they run their own hardware data centers in addition to using services from cloud providers. Thus, the solutions used in the Iron Age remain popular going forward into the Cloud Age.
Iron Age vs Cloud Age
- Infrastructure as Code, Managing Servers in the Cloud, “The Iron Age and The Cloud Age” side bar by Kief Morris, O’Reilly. June 2016.
- Infrastructure as Code: From the Iron Age to the Cloud Age, by Kief Morris, January 8, 2016.
Change Configuration Topics
- Configuration Drift by Kief Morris on December 6, 2011
- What is Configuration Drift? (security context) by Irfahn Khimji on Jan 22, 2018
Deployment and Orchestration
- Automated deployment systems: push vs. pull by Grig Gheorghiu on March 03, 2010
- Service Discovery in a Microservices Architecture by Chris Richardson on Oct 12, 2015.