Introduction

If you're just starting on your journey into DevOps (Development and Operations) or cloud computing, you're likely to run across the concept of virtualization, which can be rather confusing.

Virtualization is a difficult subject to navigate because there are so many cross-cutting concepts between the layers of abstraction that define a modern virtualization stack. The illustration I painted for this post is a good visual metaphor for how I felt when I first began to learn about virtualization.

In this article, I will explain the two most important types of virtualization: Virtual Machines and Containers.

Typically, virtual machines are used in the context of infrastructure and DevOps practices, whereas containers fall within the domain of application development. However, these two layers of virtualization are interrelated, and it's important to understand their similarities and differences.


Hardware Virtualization

Virtual Machines (VMs) are hardware virtualizations that are configured with computing resources (CPU, RAM, etc.). The key thing to know is that they are software-based computers: their purpose is to emulate an entire computer, virtually. Multiple virtual machines can share the same physical hardware while remaining logically isolated from each other, as if they were separate machines.

Virtual machines are usually hosted on physical hardware, such as a server or personal computer. A virtual machine has all the functionality and features that you might commonly associate with a personal computer; with hardware virtualization, multiple virtual machines share that hardware. In contrast, a standard out-of-the-box PC is a single piece of hardware running a single operating system.

The software that allows any number of virtual machines (guests) to run on a single piece of hardware is what's known as a hypervisor, or Virtual Machine Monitor (VMM).

There are two main types of hypervisors:

Hypervisor Types

Type 1 hypervisor, aka "bare-metal"
Bare-metal hypervisors are installed as software directly on the computing hardware and control any number of guest operating systems. The guests can have different kernels because each virtual machine's operating system is isolated and is not tied to specific hardware devices or drivers.

Bare-metal hypervisors are the most commonly used. They are also very secure, since there is no host operating system to attack, and they are the most efficient in terms of performance and latency.

Bare-metal hypervisors function in place of what would typically be an operating system; their primary purpose is to manage guest operating systems and allocate resources. Type 1 hypervisors are typically used in enterprise data centers, often in what are known as "single-tenant" environments.

Type 2 hypervisor, aka "hosted"
Hosted hypervisors are installed inside an operating system on the host machine. Hosted hypervisors have increased latency because communication between the hypervisor and the hardware passes through an intermediate operating system layer.

In contrast to bare-metal hypervisors, a hosted hypervisor is encapsulated in the host machine's operating system instead of being installed above the hardware layer. In this context, the hosted hypervisor runs just as any other application within the host would.

A common piece of virtualization software, which can be installed and run on your home computer, is VirtualBox.

VirtualBox is a Type 2 hypervisor owned by Oracle. When it's installed, your personal computer's operating system serves as the host.

VirtualBox, for example, can be hosted on a Mac running the latest version of macOS. The virtual machines managed by VirtualBox can run any number of guest operating systems, such as Windows, Linux distributions, or additional macOS instances.

Virtual machines are flexible in the sense that computing resources can be customized and storage/memory can be attached or detached as needed. Virtual machines are also portable: they can be stored as snapshots and moved between hypervisors.
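As a sketch of what this looks like in practice, here is roughly how a VM might be created and snapshotted with VirtualBox's VBoxManage command-line tool. The VM name, OS type, and resource values below are illustrative, not prescriptive:

```shell
# Create and register a new VM (name and OS type are examples)
VBoxManage createvm --name demo-vm --ostype Ubuntu_64 --register

# Allocate computing resources: 2 GB of RAM and 2 CPUs
VBoxManage modifyvm demo-vm --memory 2048 --cpus 2

# Take a snapshot that the VM can later be restored from or moved with
VBoxManage snapshot demo-vm take baseline

# Start the VM without opening a GUI window
VBoxManage startvm demo-vm --type headless
```

These commands assume VirtualBox is installed on the host; the same ideas apply to other hypervisors, just with different tooling.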


Containers

Containers are OS-level virtualizations. Essentially, they're computing environments that run as isolated processes within a host machine, often a virtual machine. The big difference between a container and a virtual machine is that containers do not have exclusive computing resources allocated to them when they are created.

Hardware virtualization seeks to emulate an entire machine, whereas a container's purpose is to emulate just an operating system and any applications that run therein.

The most common use for containers is deploying distributed applications built with a microservice architecture, which is a topic worthy of its own article. For now, it's enough to understand that containers are used to make the components of an application modular and isolated (distributed). For example, a database for an application may run in one container while the web server for the same application runs in a separate container. Microservice architecture is resilient because failures are isolated, and the individual components of a distributed application can be triaged, redeployed, or scaled independently.
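To make that concrete, here is a minimal sketch of such a setup as a hypothetical Docker Compose file. The service names, images, and password are purely illustrative:

```yaml
# docker-compose.yml: a web server and a database as separate containers
services:
  web:
    image: nginx:alpine           # web server container
    ports:
      - "8080:80"                 # expose the web server to the host
    depends_on:
      - db
  db:
    image: postgres:16            # database container
    environment:
      POSTGRES_PASSWORD: example  # illustrative only; use a secret in practice
```

If one service fails or needs to scale, it can be restarted or replicated without touching the other.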

The most common containerization engine is Docker.

Docker containers can run any number of operating systems, but the basic requirement is that the operating system of the container must be compatible with the kernel of the virtual machine in which it's deployed.

For example, you can "spin up" a virtual machine running some version of Linux and deploy a container inside of that virtual machine, as long as the container also runs some distribution of Linux, since all Linux distributions share the same kernel. The virtual machine is the "computer", which is created with allocated computing resources (CPU, RAM, storage, etc.). The container is a process that utilizes those resources and has its own operating system that runs on top of the virtual machine's.
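You can observe the shared kernel yourself, assuming Docker is installed on a Linux host or VM. The container reports the host's kernel version because it has no kernel of its own:

```shell
# Kernel version reported by the host (or virtual machine) itself
uname -r

# Kernel version reported from inside an Alpine Linux container:
# the same value, because the container shares the host's kernel
docker run --rm alpine uname -r
```

The container's filesystem and userland are Alpine's, but the kernel underneath is the host's.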

Containerization Diagram

Typically containers are used for applications such as web servers and databases. The operating systems that containers use are usually lightweight in the sense that they often don't include a graphical user interface or the user experience associated with the sorts of operating systems that are installed on personal computers.

The benefit of containers is that they can be configured with very minimal code, and once they're built as images they can be deployed within seconds. Containers are generally used to deploy standalone applications or a set of tightly coupled services.
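As an illustration of just how little code a container can take, a hypothetical static web server could be defined in a Dockerfile of only a few lines (the source path here is an assumption for the example):

```dockerfile
# Minimal example: serve static files with nginx
FROM nginx:alpine
COPY ./site /usr/share/nginx/html
EXPOSE 80
```

Building this produces an image that can be started in seconds, anywhere a compatible container engine runs.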

Pre-built, open-source containers can be downloaded as "images", which are consumed by a containerization engine and deployed within a host machine. Common images, such as databases or other off-the-shelf services, save developers a great deal of time since they eliminate the need to install and configure the software on the host operating system.
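For example, a pre-built PostgreSQL image can be pulled and running in a couple of commands; the container name, password, and port mapping below are illustrative:

```shell
# Download the official PostgreSQL image from a registry
docker pull postgres:16

# Run it as a container; nothing is installed on the host OS itself
docker run -d --name my-db \
  -e POSTGRES_PASSWORD=example \
  -p 5432:5432 \
  postgres:16
```

A database that might otherwise take an afternoon to install and configure is available in moments.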

Containers are important for modern applications and in software development because they are portable and offer consistent/predictable environments between deployments.


Final Example

Let's walk through a very simple example of how containers and virtual machines relate to one another.

It's important to know that a lot of details are omitted in this example, but for this article we'll keep it simple.

Final Example Diagram 01

Define and Build Container
To deploy an application as a container, we first define an operating system for it to use. If the virtual machine this container is meant to run on is a Linux machine, then it can be any distribution of Linux, such as Debian-slim.

The container is built from a set of configurations and execution steps, either shell scripts defined as static files or CLI commands that are executed at build time on your behalf. Ideally, the execution steps are described in such a way that no further human interaction is needed before the application(s) within the container can be used. The applications within a container are made available through an exposed port or are networked with middleware.

Let's assume that our container is built on top of a base image that has additional pre-packaged software to streamline the process of deploying the application within it. In this example, software such as Git and Node.js is available through a command-line interface because our container has been built with these already installed.

Containers are defined in a Dockerfile, which is used to build an image. The build process will set up everything so that when the image is deployed it is a fully functional environment with all the desired software installed and running.
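A hedged sketch of what such a Dockerfile might look like for the Node.js scenario described above; the base image, paths, and scripts are assumptions for illustration, not a definitive recipe:

```dockerfile
# Start from a base image with Node.js pre-installed
FROM node:20-slim

# Add Git so the container can clone repositories at runtime if needed
RUN apt-get update && apt-get install -y git && rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Copy the application source in and install its dependencies at build time
COPY . .
RUN yarn install && yarn build

# Make the application reachable through an exposed port
EXPOSE 3000
CMD ["yarn", "start"]
```

Running `docker build` against this file produces the image; `docker run` then deploys it as a container.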

Deploy Containerized Application
Additionally, this application's code might be hosted on a version-control platform, such as GitHub. We may want to clone this repo at runtime so our application is always up to date when a new container is deployed. When our container is launched, we can control it with an interactive terminal, or provide scripts that are automatically executed as soon as the container starts, cloning the latest version of our application into the container's local storage. After cloning our repo, the container continues with the next predefined commands, which install the application's dependencies, start the software, and then perform a series of tests to confirm that the application was successfully deployed. As an added convenience, our container has an exposed port so that our application can be interacted with through a REST API.

For Node.js we might even have a single entry-point command that runs all the aforementioned deployment scripts for the repository, such as yarn && yarn build && yarn start && yarn healthcheck.
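The clone-at-runtime flow could be captured in a small entrypoint script along these lines. The repository URL and script names are hypothetical, and the health check is run before the (foreground, long-running) start command so that it actually executes:

```shell
#!/bin/sh
# entrypoint.sh: fetch the latest application code when the container starts
git clone https://github.com/example/app.git /app
cd /app

# Install dependencies, build, verify, then start the application
# (yarn start runs in the foreground and keeps the container alive)
yarn && yarn build && yarn healthcheck && yarn start
```

A Dockerfile would point at this script with an ENTRYPOINT or CMD instruction.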

However, in most cases we'll probably choose to build our application directly into our container image as bundled JavaScript, in which case there is no need to clone our repo and set up the application. This can be useful for creating a more secure environment or for reducing start-up time.

As previously mentioned this example container is deployed inside a virtual machine, which also has other containers deployed on it. Perhaps we want these containers to interact with each other. Since they are separate processes they can be configured with middleware, such as a networking bridge. We might even create a container that serves as a hub, which manages communication between all the containers on a virtual machine. Most containerization software, like Docker, has solutions for these sorts of things.
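Docker's user-defined bridge networks are one such solution. A hedged sketch, with illustrative container names and images:

```shell
# Create a user-defined bridge network for containers to talk over
docker network create app-net

# Attach a database and a web server container to the same network;
# they can now reach each other by container name (e.g. the host "db")
docker run -d --name db  --network app-net -e POSTGRES_PASSWORD=example postgres:16
docker run -d --name web --network app-net -p 8080:80 nginx:alpine
```

Only the web container is exposed to the outside world here; the database is reachable solely over the internal network.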

Additionally, with container orchestration software such as Kubernetes, there's a whole range of possibilities for managing and interacting with groups of containers across a cluster of machines.
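As a small taste, a hypothetical Kubernetes Deployment that keeps three replicas of a containerized web server running might look like this (names and image are illustrative):

```yaml
# deployment.yaml: run and manage three identical web server containers
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:alpine
          ports:
            - containerPort: 80
```

If a container dies, the orchestrator replaces it automatically; scaling up is a matter of changing the replica count.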

The Virtual Machine
All the containers on a virtual machine are isolated processes that have their own operating systems but share a kernel and compute resources with their host virtual machine. These containers will dynamically utilize the resources available to the virtual machine as needed, just as any normal application running on a typical computer would. However, containers can also be configured in such a way as to limit their use of computing resources.
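With Docker, for instance, such limits can be set when a container is started; the name, image, and values below are illustrative:

```shell
# Cap this container at 512 MB of RAM and 1.5 CPUs
docker run -d --name limited-app \
  --memory 512m \
  --cpus 1.5 \
  nginx:alpine
```

Without flags like these, the container competes freely with its neighbors for the virtual machine's resources.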

Final Example Diagram 02

The Hypervisor
The virtual machine that hosts our containers is managed by a hypervisor. For this example, let's assume our virtual machine exists alongside a number of other virtual machines, which all run their own sets of containers.

The hypervisor in this setup is "Type 1, bare metal" and runs directly on a single piece of hardware, with no operating system between the hardware and the hypervisor. It allocates compute resources to all of the virtual machines it manages and is efficient, low-latency, and secure.

If the containers within a virtual machine need access to more compute resources than are available, they can be easily redeployed, entirely or as snapshots, inside a new virtual machine that has the appropriate allocation. Container migration is possible because the storage associated with an instance is typically detachable, which means data can persist despite its associated virtual machine being terminated. Container orchestration software can also perform snapshots of the containers themselves, on a schedule or through a CLI script, and store them in block storage. If your virtual machine instances are part of a cloud infrastructure, there are a number of tools that make snapshots and migrations simple and easy to do.
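With Docker, one way to snapshot a running container into an image and move it elsewhere looks roughly like this; the container and image names are illustrative:

```shell
# Snapshot a running container's filesystem into a new image
docker commit my-app my-app-snapshot:v1

# Export that image to an archive that can be copied to another machine
docker save -o my-app-snapshot.tar my-app-snapshot:v1

# On the destination virtual machine, load the archive and run it again
docker load -i my-app-snapshot.tar
docker run -d my-app-snapshot:v1
```

Note that this captures the container's filesystem, not attached volumes; persistent data travels via the detachable storage described above.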

Similarly, if the virtual machines in our example need access to more computing resources than are available on the underlying hardware they can also migrate to different hypervisors.


Conclusion

As mentioned at the beginning of this article, Virtualization is very complex and abstract. But the basic starting point is understanding the distinction and interrelationship between virtual machines and containers.

There's far more information about this topic than can reasonably be covered in a single, digestible article. For example, there are ways to manage computing resources and utilization, such as load balancing and autoscaling. In the context of Infrastructure as a Service, there exists a whole range of elastic computing possibilities that aren't possible in traditional IT infrastructure. But that's a topic for another time!

If you enjoyed this topic and want to see more, all you have to do is check back regularly. Stay tuned for more articles related to cloud computing, infrastructure, and DevOps.


Additional Resources