Immutable Infrastructure and Security, Part 1

  • Leah Berges
  • April 24, 2024

Immutable infrastructure is a concept that, despite its many benefits, often remains a lofty goal rather than an implemented practice. Many organizations have considered adopting it, but it tends to be de-prioritized in favor of faster time-to-market and shipping new features.

However, immutable infrastructure is a critical practice for safely deploying application infrastructure at scale. Minimizing configuration drift and unintended changes makes application infrastructure easier to scale and safer to maintain, and it also improves the overall security posture.

In this two-part series, we’ll discuss what immutable infrastructure is, why it’s more than just a safe way to deploy applications at scale, and why it’s an important part of any organization’s security strategy.

What Is Immutable Infrastructure?

The principle of Immutable Infrastructure (II) can be summarized as follows: Once software infrastructure is deployed, it is not changed until a new version, feature, or revision completely replaces it.

This seems simple enough, but what does it mean to say infrastructure is immutable? What does this look like in practice?

A common analogy for explaining II is “cattle vs. pets.” When we think of pets, we think of animals with unique names, behaviors, diets, and care regimens. Cattle, in contrast, are generally not given names, tend to act with a herd mentality, and are fed and cared for in a generalized and highly scaled manner.

Replace “animals” in the previous analogy with “servers,” and the parallels become obvious. Servers, nodes, or virtual machines that are treated as pets have unique names and behaviors, and the sequence of steps or actions that were taken to arrive at the current state or configuration could not be easily replicated on another node. Conversely, servers that are treated as cattle are typically named as part of a larger herd, with alphanumeric sequences as their only unique identifier. Their care and feeding, or in this case configuration, is a scalable and repeatable process that can be applied to new members of the herd to quickly bring them into the same state as other members.

In this paradigm, deployed network components such as servers are not patched, upgraded, or modified. If an update is needed, the existing component is destroyed and replaced by a new one.
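The replace-instead-of-patch rule can be sketched in a few lines of Python. This is a minimal illustration rather than a real provisioning tool: the `Server` model, the `provision` and `roll_out` helpers, the `web-` naming scheme, and the version strings are all hypothetical.

```python
import itertools
from dataclasses import dataclass

@dataclass(frozen=True)
class Server:
    """Hypothetical model of a deployed server; frozen, so it cannot be edited."""
    server_id: str
    image_version: str

_ids = itertools.count(1)  # source of herd-style sequential identifiers

def provision(image_version: str) -> Server:
    """Create a brand-new server from a versioned image."""
    return Server(server_id=f"web-{next(_ids):04d}", image_version=image_version)

def roll_out(fleet: list[Server], new_version: str) -> list[Server]:
    """Build one fresh replacement per existing server; nothing is patched in place."""
    return [provision(new_version) for _ in fleet]

fleet = [provision("1.4.0") for _ in range(3)]
new_fleet = roll_out(fleet, "1.5.0")  # the old fleet would then be destroyed
```

Because the dataclass is frozen, an attempt to "patch" a running server, such as assigning to `fleet[0].image_version`, raises an error instead of silently changing state.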

The concept of immutable infrastructure also extends to software artifacts. Software artifacts are generally thought of as a by-product or output of software development. (In the context of this article, the term “software artifact” will specifically refer to a unit of code or application functionality that can be deployed as a stand-alone feature or application or as part of a larger system.)

Mutable software artifacts often result in difficult, error-prone deployment processes. In a hypothetical development environment, a developer may need to make small changes to their software to enable it to run correctly in their local development environment, in the testing environment, and finally in the staging environment. What happens if the production deployment fails? Was the original feature work at fault, or was it one of the many changes made to make the artifact functional in the various environments? Mutable artifacts result in excessive complexity and make debugging software and infrastructure issues much more difficult.

A more immutable approach, such as versioning software artifacts with immutable tags (such as a Git commit hash) or using containerization to encapsulate everything, results in a process that is much simpler and less error-prone.
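One way to picture an immutable tag is to derive it from the artifact's content, much as a Git commit hash is derived from what it identifies. The sketch below assumes a SHA-256 content digest rather than an actual Git hash, and the `artifact_tag` helper and the `billing-api` name are hypothetical.

```python
import hashlib

def artifact_tag(name: str, content: bytes) -> str:
    """Return a tag that is fully determined by the artifact's bytes."""
    digest = hashlib.sha256(content).hexdigest()
    return f"{name}@sha256:{digest}"

build_a = artifact_tag("billing-api", b"binary contents of build A")
build_b = artifact_tag("billing-api", b"binary contents of build B")
```

Identical bytes always produce the same tag, and any change to the contents produces a different one, so a tag can never quietly come to mean something new.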

What Are the Benefits of Immutable Infrastructure?

Making infrastructure immutable provides a variety of benefits to a software organization: it reduces management overhead and complexity, enables a better baseline security posture, and improves operational outcomes and scalability.

Reducing complexity

Exploring the software artifact example a bit further, immutable software artifacts help reduce complexity and management overhead in software development. Most modern software applications aren’t just source code handed to a compiler or interpreter anymore; they often represent a complex web of software dependencies, various build artifacts, configuration files, and middleware.

Trying to manage all of these interconnected components quickly becomes untenable at scale; how can development teams effectively track all of the various dependencies and unique configurations across thousands of nodes and microservices? With immutable software artifacts, all of the dependencies, configuration files, and tooling are encapsulated in something like a Docker container, which is uniquely versioned and tagged.

This process occurs once, at build time, and that artifact is never changed again, regardless of the deployment environment or context. If a developer makes a change to any aspect of the artifact, even a small dependency or configuration change, a new artifact with a new version tag is created and deployed over the previous one.
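The build-once rule can be enforced mechanically by a registry that refuses to overwrite an existing tag. The following is a minimal in-memory sketch; `ImmutableRegistry` and its methods are hypothetical and do not represent the API of any real container registry.

```python
class ImmutableRegistry:
    """Hypothetical in-memory registry: once pushed, a tag never changes."""

    def __init__(self) -> None:
        self._artifacts: dict[str, bytes] = {}

    def push(self, tag: str, content: bytes) -> None:
        existing = self._artifacts.get(tag)
        if existing is not None and existing != content:
            # Changed content must be published under a new version tag.
            raise ValueError(f"tag {tag!r} already exists and cannot be overwritten")
        self._artifacts[tag] = content

    def pull(self, tag: str) -> bytes:
        return self._artifacts[tag]

registry = ImmutableRegistry()
registry.push("app:1.0.0", b"original build")
registry.push("app:1.0.1", b"build with one changed dependency")  # new tag, accepted
```

Pushing different bytes to `app:1.0.0` a second time raises `ValueError`, so every environment that pulls that tag receives exactly the same artifact.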

Improving security

Immutable infrastructure also helps software teams maintain a better overall security posture. Immutability can extend to the management of traditional remote access pathways like SSH, SFTP, and other types of connectivity.

In legacy, mutable systems, servers are configured and upgraded via manual remote access using SSH or similar tools. SSH access requires managing the user accounts and authorized SSH keys on each server. Managing this type of access at scale often means software and ops teams also need to deploy and manage configuration management software, adding further complexity and attack surface to the stack. Every node that allows remote access is a potential target for attackers and increases management complexity.

In an immutable pattern, access and user accounts are defined during the infrastructure configuration and deployment phase, and they are never modified again unless a new version of the configuration is deployed as a complete replacement. Organizations that frequently need to grant or revoke access to systems may want to explore alternative management patterns, rather than replacing their fleet every time a new user is added. Part 2 of this article will explore in much greater detail various implementation patterns that can be used for infrastructure access and management.

Improving operational outcomes

Improving operational outcomes in software deployments should be a key focus for any engineering organization. One of the key DORA metrics is “Change Failure Rate: the percentage of deployments causing a failure in production.” A low change failure rate is a strong indicator of a healthy software development and operations environment.
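As a quick worked example, the change failure rate defined above is simply failed deployments divided by total deployments, expressed as a percentage. The function name and the sample numbers below are illustrative assumptions, not figures from any real team.

```python
def change_failure_rate(total_deployments: int, failed_deployments: int) -> float:
    """Percentage of deployments that caused a failure in production."""
    if total_deployments <= 0:
        raise ValueError("total_deployments must be positive")
    if not 0 <= failed_deployments <= total_deployments:
        raise ValueError("failed_deployments must be between 0 and total_deployments")
    return 100.0 * failed_deployments / total_deployments

# Hypothetical month: 40 production deployments, 3 of which caused a failure.
rate = change_failure_rate(total_deployments=40, failed_deployments=3)  # 7.5
```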

Consider a hypothetical software deployment in a large, distributed system. In the legacy, mutable pattern of infrastructure, small changes and tweaks are allowed to accumulate over time. This results in configuration drift; in other words, live systems no longer reflect their canonically accepted configuration state as defined by developers and operations staff. Changes or deployments to the system result in unexpected behaviors and failures that cannot be modeled adequately in testing due to the lack of homogeneous environments. The deployment causes an outage in production, and engineers have to scramble to determine the root cause. Rolling back to a working state is nearly impossible because the defined working state is not reflective of reality.
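Configuration drift can be made concrete as the difference between the declared state and what is actually running. The sketch below compares two plain dictionaries; the `detect_drift` helper, the package names, and the version numbers are all made up for illustration.

```python
def detect_drift(declared: dict, live: dict) -> dict:
    """Map each drifted key to a (declared, live) pair; None means the key is absent."""
    drift = {}
    for key in sorted(declared.keys() | live.keys()):
        want, have = declared.get(key), live.get(key)
        if want != have:
            drift[key] = (want, have)
    return drift

# Hypothetical declared configuration versus what a long-lived server really runs.
declared = {"nginx": "1.25.3", "openssl": "3.0.13"}
live = {"nginx": "1.25.3", "openssl": "3.0.11", "debug_shell": "enabled"}

drift = detect_drift(declared, live)
```

Here `drift` reports the OpenSSL version mismatch and the undeclared `debug_shell` setting, which is exactly the kind of accumulated tweak that makes the defined working state stop reflecting reality.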

With an immutable deployment, the current working state of infrastructure and application components is known and defined; everything is clearly tagged with commit hashes and version numbers. If a new software deployment or infrastructure change causes a production outage, it is easy to identify the at-fault change. Rolling back is simply a matter of restoring the previous known working state. Mean time to resolution (MTTR) is another key DORA metric, and minimizing it results in better outcomes across engineering and business teams.
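When every release is an immutable, tagged artifact, rollback reduces to pointing back at the previous tag. The sketch below is a simplified, hypothetical deployment history; a real system would redeploy the earlier artifact rather than only update a list, and the tags shown are invented.

```python
class DeploymentHistory:
    """Hypothetical record of which immutable release tags have been deployed."""

    def __init__(self) -> None:
        self._releases: list[str] = []

    def deploy(self, tag: str) -> None:
        self._releases.append(tag)

    @property
    def current(self) -> str:
        return self._releases[-1]

    def rollback(self) -> str:
        """Drop the latest release and return to the previous known working tag."""
        if len(self._releases) < 2:
            raise RuntimeError("no previous release to roll back to")
        self._releases.pop()
        return self.current

history = DeploymentHistory()
history.deploy("app:1.4.0")   # known working state
history.deploy("app:1.5.0")   # this release causes an outage
restored = history.rollback()
```

After the rollback, `restored` is `app:1.4.0`. Because that tag still refers to the same unchanged artifact, restoring it reproduces the earlier working state.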

Immutable Infrastructure in the Cloud

Modern distributed software infrastructure is often hosted in the cloud. Cloud providers like AWS, Azure, and Google Cloud Platform offer a plethora of managed and unmanaged services that startups and enterprise organizations alike can use to host their applications. However, the public-facing nature of cloud infrastructure brings its own unique security challenges that aren’t necessarily present in legacy, on-premises infrastructure. The larger attack surface means any win in reducing complexity is a win in security posture.

Immutable infrastructure can reduce complexity by limiting unscoped change and, ultimately, configuration drift in the environment. This is a key security outcome for cloud environments. Reducing complexity leads to software architecture that is easier to manage, understand, and ultimately secure.

Here’s a good thought experiment for software infrastructure: could your engineering teams correctly enumerate all of the access vectors that an attacker could potentially exploit without looking at live, running workloads? Live systems require maintenance and security testing, to be sure. But with an immutable infrastructure configuration, organizations can fully understand the state of critical production systems.
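The thought experiment becomes tractable when access rules exist only as declared configuration. The sketch below enumerates publicly reachable entry points from a declared rule set without touching a live system; the `IngressRule` model, the service names, ports, and CIDR ranges are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IngressRule:
    """Hypothetical declared network rule: who may reach which service port."""
    service: str
    port: int
    source_cidr: str

ANYWHERE = ("0.0.0.0/0", "::/0")  # IPv4 and IPv6 "any address" ranges

def public_access_vectors(rules: list[IngressRule]) -> list[tuple[str, int]]:
    """List (service, port) pairs reachable from anywhere, per the declared config."""
    return sorted((r.service, r.port) for r in rules if r.source_cidr in ANYWHERE)

declared_rules = [
    IngressRule("web", 443, "0.0.0.0/0"),
    IngressRule("web", 22, "10.0.0.0/8"),       # internal network only
    IngressRule("metrics", 9090, "0.0.0.0/0"),  # probably unintended exposure
]

exposed = public_access_vectors(declared_rules)
```

This answer is only trustworthy if the running systems match the declaration, which is the guarantee immutability is meant to provide.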

Not all cloud services are created equal in this regard; it’s perfectly possible to build a complex, brittle, mutable software stack on top of traditional compute services like Amazon EC2. Further drawing on the AWS example, an organization could choose to build immutable containers, store them in Amazon ECR with build tags, and deploy them to Amazon ECS, providing a much more immutable solution. Part 2 of this series will go into more detail and depth on specific immutability patterns that can be implemented with cloud services.

Immutable Infrastructure Isn’t Just a Buzzword

Immutable Infrastructure isn’t just a “nice-to-have”; it should be considered a fundamental principle of building distributed, cloud-based application infrastructure. It enables engineering teams to manage large amounts of infrastructure securely and at scale. Manual change management quickly bogs down and leads to vulnerabilities in legacy, mutable infrastructure.

It is also related to other important concepts and practices in tech today, such as Infrastructure as Code, container security, and DevSecOps. In Part 2 of this series, we’ll dig deeper into immutable infrastructure and web security, including some examples and benefits, as well as specific implementation patterns for cloud-based infrastructure.
