An introduction to containers

I've had some people ask me what containers are, so here goes. A good way to think of software containers is with the metaphor of shipping containers. Shipping containers have standardized sizes and allow goods to be transported without unloading and reloading their cargo. This allows ports and shipping hubs around the world to build infrastructure that works with these standardized containers.

Containers are a way of encapsulating a program and its dependencies so that it can be deployed in a consistent way across multiple machines. The technique used to achieve this is operating-system-level virtualization, as opposed to hardware virtualization in the form of hypervisors or virtual machine monitors (VMMs). A brief comparison between OS-level virtualization and hardware virtualization follows.

Hardware virtualization allows fully virtualized servers to run on the same physical hardware. This is the technique used on cloud platforms such as Amazon's EC2 service. Each virtual machine (VM) runs in a simulated environment on the host hardware. The environment is simulated because access to physical system resources (e.g. CPU, memory, disk storage) is managed by the hypervisor or VMM.

Hardware virtualization

VMs are used because they provide the ability to run many servers on one physical server. In addition, VMs provide isolation and can be moved between different host systems. However, VM images are large and VMs are slow to start up. This is where OS-level virtualization comes in.

OS-level virtualization allows the kernel of an operating system to support multiple isolated user space instances. Each instance (or container) is isolated from other instances. This is akin to the chroot mechanism on Unix systems, where each instance has its own root directory for all its processes and child processes.
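To make the chroot analogy concrete, here is a toy sketch (not how chroot is actually implemented) of the effect it has: the same path inside two different containers resolves to two different places on the host. The directory names are made up for illustration.

```python
import os.path

def resolve(container_root: str, path: str) -> str:
    """Map a path as seen inside the container to the host path.
    Mimics the effect of chroot: '/' inside the container is
    container_root on the host."""
    # Normalise first so '..' components cannot escape the root.
    inside = os.path.normpath("/" + path.lstrip("/"))
    return container_root.rstrip("/") + inside

# Two "containers" see different files behind the same path:
print(resolve("/var/lib/containers/a", "/etc/hosts"))
# -> /var/lib/containers/a/etc/hosts
print(resolve("/var/lib/containers/b", "/etc/hosts"))
# -> /var/lib/containers/b/etc/hosts
```

Each instance believes it is looking at `/etc/hosts`, but neither can see the other's copy, which is the essence of filesystem isolation.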

OS-level virtualization

The above diagram presents a graphical representation of OS-level virtualization. Unlike with hardware virtualization, there is only one OS. The OS virtualization layer provides an abstraction of the OS that the containers interact with.

This means there is little overhead in starting a container, as there is no need to simulate an OS. There is one limitation, though: all containers must share the host's kernel. It is not possible, for example, to run a Windows container on a Linux host, but it is possible to run a different Linux distribution from that of the host.

Docker, a project that was first released in 2013, 'is an open platform for developing, shipping and running applications'[1]. It automates OS-level virtualization on Linux, using kernel namespaces and cgroups to isolate containers running on a single Linux host. Each container has its own network stack, process space and filesystem instance.
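Namespaces are not something bolted on for containers; every Linux process already belongs to a set of them, and you can see which ones via `/proc`. The sketch below lists the namespace types the current process is in (it returns an empty list on non-Linux systems):

```python
import os

def current_namespaces() -> list[str]:
    """List the kernel namespace types the current process belongs to.
    On Linux, /proc/self/ns holds one entry per namespace type
    (e.g. pid, net, mnt, uts); on other systems return []."""
    ns_dir = "/proc/self/ns"
    if not os.path.isdir(ns_dir):
        return []
    return sorted(os.listdir(ns_dir))

print(current_namespaces())
```

A container runtime creates fresh instances of these namespaces for each container, so that, for example, process IDs and network interfaces inside one container are invisible to the others, while cgroups limit how much CPU and memory each one may consume.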

How Docker works

The diagram above shows how Docker works at a high level. Docker is an engine that runs on top of the Linux kernel's virtualization features. Before libcontainer, Docker accessed the kernel through libvirt (a management tool for hardware virtualization), LXC (containment features for the Linux kernel) and systemd (a system and session manager for Linux).

Docker now uses libcontainer to access the Linux kernel. Libcontainer is a key part of Docker: an open-source project to which companies such as Microsoft, Google and Red Hat contribute.

Docker is not the only container option for Linux. However, it has exploded onto the scene thanks to support from cloud providers and the tooling that has grown up around its ecosystem. Docker provides tools for getting applications into containers and also has a platform for distributing and sharing container images, Docker Hub.

Finally, a key feature of Docker that isn't present in many other container tools is layered filesystem images (via AUFS). This allows container images to be built on top of other container images, so many containers can share the same base image, which reduces disk usage and simplifies filesystem management.

Since containers use OS-level virtualization, they are lightweight and quick to start and stop; much faster than hypervisor-based VMs. This is one reason why containers have grown in popularity recently, along with microservices.


  1. What is Docker? https://docs.docker.com/introduction/understanding-docker/