Introduction to Containerization – Vol I

For some time now, terms like container and containerization have come up in nearly every conversation about IT infrastructure. The topic is indeed a hot one, so I decided to collect all my notes on it from the last two years, the ones that have been rattling around in drawers and various nooks and crannies. Of course, I won’t be able to discuss everything associated with containers, but I think that those who are starting their adventure with Docker, which is what this text mainly revolves around, will certainly find something interesting here.

Let’s start with the word Docker itself, which can be read as a compound of two words, Dock (port) and Worker (employee); joined together, they give us a short name for a dock worker. Two things are hidden behind the term. The first is Docker Inc., the company behind the development of the container platform of the same name. The second is the product itself. So, we have a company and a product with the same name; the product is not a single piece of software but consists of elements that together form the Docker Platform, called Docker for short.

Therefore, Docker is responsible for delivering containers using this platform. The elements of the Docker Platform from Docker Inc. include, among others: the Docker Engine, which allows us to run containers; Docker Swarm, a system for clustering and orchestrating engines; Docker Registry, a kind of database that stores container images; and Docker Universal Control Plane (only available in the commercial version), our window for managing the platform graphically. There are several more mechanisms in the platform itself, but to start with we only need a few.

Generally, the Docker Platform comes in two variants: a free one called Docker Community Edition (CE), for which you can find support in the community, and a commercial one, Docker Enterprise Edition (EE), which comes with support from Docker Inc. The EE variant is a slightly richer version of the Docker Engine, equipped with the Docker Universal Control Plane mentioned above. The Docker Platform is one fragment of a huge ecosystem that has grown around containerization:

The sheer number of tools can be startling, but there is no need to panic: not all of them need to be used. Many perform very specific functions that are completely unnecessary at the beginning of your adventure with Docker; they only dazzle the eyes and breed a certain amount of uncertainty. There is nothing to worry about, and it’s worth starting with the simplest technology stack possible. This way you can expand it only when necessary, adding software that implements only the specific functions you need.

In the meantime, who uses Docker? Many of you may be surprised, but the list is massive: Netflix, General Electric, BBC, Lyft, Spotify, eBay, Yelp, Box, Expedia, New York Times, Business Insider, PayPal, Shopify and Uber. While this is just a sample, it shows that Docker is not a product intended only for startups; large organizations implement it too.

At this point, it’s also worth mentioning that Docker can be run natively in the cloud, and cloud providers such as AWS, Azure or Google have dedicated services just for that. So, if you already have a local Docker installation, you can easily migrate to the cloud when the time is right or when your organization wishes to do so. No doors get slammed shut. You can start on your laptop, in a server room on bare metal, in virtual machines or in the cloud. It’s also important to note that Docker can run both on Linux and on Windows. However, you should carefully examine the differences between these Docker editions, because they exist and can sometimes be surprisingly big. Nonetheless, there is a Docker for Windows and it works, so we can dismiss the rumor that containers are only for Linux users.

Docker, Docker, Docker… but that’s not all! As I’ve already mentioned, the ecosystem of applications around containerization is madly big, so it should come as no surprise that there are other platforms too, e.g. rkt (Rocket), created by CoreOS. I think many of you flinched right away, just knowing that there’s an alternative. On the one hand, it’s always a good thing: competition in the industry is always a plus for us, the technology consumers. As an old Polish proverb says, where two fight, the third benefits. Yet this sudden feeling of distaste might have been caused by the thought that with two different suppliers there are, most likely, two different standards… Well, there aren’t! The world of containerization did not make the same mistake as its peers and predecessors in the IT sector. As soon as two standards appeared, the suppliers sat down at the table and set up the Open Container Initiative, an organization responsible for overseeing one standard, so that developers can deliver an application in one format and the IT department can run it on the platform of its choice without any conversion. It is worth noting that this organization operates under the Linux Foundation.

All for the apps! Let’s start with this, because it’s basically the catalyst for the change in IT we’re witnessing right now. A long time ago, when the world was simple and money didn’t matter as much, one physical server would run one OS and was used to operate one app. Later, the accountants decided that the books must balance, and the search began for a solution in which more applications could run on one server. Virtualization was deemed the solution, and this was enough for a couple of years. Unfortunately, as with most solutions, it turned out that this one wasn’t perfect. We need to remember that, while the approach is all for the apps, the VM model focuses heavily on operating systems: one VM, one OS… And while sometimes this is needed, at other times the lighter the OS, the better.

In the virtualization model, we install a hypervisor on a physical server, and it doesn’t matter which one: it could be VMware vSphere, Microsoft Hyper-V, KVM, Xen or anything along these lines. We cover the physical layer with a virtual one and, regardless of which hypervisor it is, it allows us to share the physical server’s resources among our virtual machines. Then, on each virtual machine we install Windows or Linux. The services are another layer, one that allows us to launch our apps… and it all looks something like this:

You do not have to be a genius to realize that we have quite a lot of duplicated elements here, and the hypervisor seems to be a redundant piece of ballast, in certain scenarios at least. Please keep in mind that I’m not saying the idea of virtualization is bad. Virtualization has given us a lot and no one is going to argue otherwise; however, when you look at the above drawing, it’s tempting to try to optimize the hypervisor, VM and OS layers… In short, it would be best if they were not there but all the advantages of their existence remained. It would be a solution along the lines of having your cake and eating it too. This is where the idea of containers appears, serving as a kind of compromise in optimization. They do not eliminate everything, but they limit the number of elements. So, what are containers? Very generally speaking, they can be compared to VMs of sorts, i.e. environments for running applications in which the operating system is limited to the necessary minimum.

If you remember jail or chroot, and I know we’re talking about the borderline Triassic/Jurassic era of IT… but maybe someone is still out there ;) If you remember them, you should have a basic understanding of the container concept. So, wrapping up this trip into prehistory, let’s take the above diagram and modify it so that it shows the general architecture of a containerization environment:

Straight from the start you can see the fundamental difference: the lack of a hypervisor and the appearance of small containers. So, what happened? Well, the hypervisor layer isn’t required, and the operating system layer is shared by all applications. Let’s introduce a naming convention: when speaking about containerization, the operating system is called Hardware Land, and it runs a program called the Docker Engine. This in turn allows you to run Docker Images, i.e. images containing the applications, as containers. In our analogy to virtualization, the Docker Engine is a launching service in Hardware Land that acts as a hypervisor for Docker Images, which play the role of virtual machines.
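To make the difference between the two architectures a bit more tangible, here is a minimal Python sketch that simply lists the layers on one physical server in each model and counts how many operating system instances we end up with. It is a toy model under my own assumptions: the layer names, the function names (`vm_stack`, `container_stack`, `count_os`) and the three example apps are purely illustrative and are not part of Docker or of any hypervisor.

```python
# Toy model only: layer names are illustrative, not real Docker or hypervisor objects.


def vm_stack(apps):
    """Layers on one physical server in the virtualization model."""
    layers = ["hardware", "hypervisor"]
    for app in apps:
        # Every app gets its own VM, its own OS and its own services.
        layers += [f"vm[{app}]", f"os[{app}]", f"services[{app}]", f"app[{app}]"]
    return layers


def container_stack(apps):
    """Layers on one physical server in the containerization model."""
    # One shared operating system ("Hardware Land") running one Docker Engine.
    layers = ["hardware", "os[shared]", "docker-engine"]
    for app in apps:
        # Each app lives in its own container on top of the shared OS.
        layers.append(f"container[{app}]")
    return layers


def count_os(layers):
    """Number of operating system instances in a stack."""
    return sum(1 for layer in layers if layer.startswith("os["))


if __name__ == "__main__":
    apps = ["web", "api", "db"]  # hypothetical applications
    for build in (vm_stack, container_stack):
        layers = build(apps)
        print(f"{build.__name__}: {len(layers)} layers, {count_os(layers)} OS instance(s)")
```

For three apps, the virtualization model ends up with 14 layers and 3 operating systems, while the container model has 6 layers and a single shared operating system. Real environments are of course more nuanced, but the duplication that containers remove is exactly this per-app VM and OS overhead.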