These last two days, I attended Devoxx in Paris. One of the star topics of the event this year was Docker, so I went to several talks about this container technology, which I had already discovered in a previous JUG session. At the time, it was not obvious to me that it was anything more than an interesting tool for DEV teams needing small, fast-starting, low-memory environments for testing. With my OPS hat on, I did not see the point. That was before Devoxx and its talks. The aspects I will detail in this post were not really addressed during Devoxx, as far as I could see, since mostly the DEV side was covered, so I am going to summarize my opinion here.
To start, let's talk a little bit about the Docker technology itself. I do not want to go into much detail, as you will find many posts about it on the Internet. To put it quickly, Docker is a container solution. Containers are often compared to virtualization, but frankly speaking it is like comparing a bicycle to a truck: both can help you move, but technically speaking they are two different things. A container is a way to start a process on an existing operating system while limiting its view of the underlying resources such as the filesystem, the network and so on. The process you start sees a restricted filesystem, which is in fact a subpart of the main one, and network filtering rules hide the different containers from each other. The objective is to keep containers isolated from one another, using different techniques, until you explicitly ask them to interact. Explained like that, containers look like virtual machines, but that is completely wrong: all the containers share the same Linux kernel, the same physical devices (without passing through virtual devices), the same glibc and so on. If you look at the containers from the host, you will see them as processes like any other, and you can access each container's filesystem directly: they are just processes running on the host. Seen from inside, each container can be considered an independent machine, like a VM. This is a really great thing, and I am going to explain why!
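To make this concrete, here is a minimal sketch, assuming the Docker SDK for Python (`pip install docker`) and a local Docker daemon: it starts a container and shows that, from the host's side, the container is just an ordinary process with a PID in the host's process table.

```python
import docker

client = docker.from_env()

# Start a long-running process (nginx) inside a container.
web = client.containers.run("nginx:alpine", detach=True, name="demo-web")
web.reload()  # refresh the inspect data

# Docker's inspect data exposes the host PID of the container's main process:
# from the host's point of view this is just another entry in the process
# table, and /proc/<pid>/root is the container's private view of the filesystem.
host_pid = web.attrs["State"]["Pid"]
print(f"nginx is running on the host as PID {host_pid}")

web.stop()
web.remove()
```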
What is so great about this technology? Since a container is just a process on the host, memory is allocated in the standard way: the container only consumes what it needs, no more. Starting a container means starting the process you want to execute in a specific container context, not booting a machine, so it is extremely fast, almost instantaneous. Devices are accessed directly through the shared kernel, avoiding virtual devices, so performance is really close to bare metal. As a consequence you can run many containers on a single system, many more than VMs. Thanks to these advantages, it is a really interesting solution for developers.
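As a rough illustration of that speed, the following sketch (same assumptions as above: Docker SDK for Python, a local daemon, and the `alpine` image already pulled) starts a throw-away container, runs a command in it, removes it, and prints how long the whole round trip took.

```python
import time
import docker

client = docker.from_env()

start = time.time()
# No machine to boot: this just starts a process in an isolated context,
# captures its output and throws the container away (remove=True).
output = client.containers.run("alpine",
                               ["echo", "hello from a fresh container"],
                               remove=True)
print(output.decode().strip())
print(f"started, ran and cleaned up in {time.time() - start:.2f}s")
```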
DEV teams need a way to deploy, many times a day and quickly, a clean environment to build and test the solution they are creating. They want to be sure this environment is isolated from their own workstation, where conflicting dependencies may exist, and then they need to throw it all away cleanly. They currently use virtualization for this, but managing VMs on a laptop, or even on a server, means cloning a template image (10 GB), booting a system (2 minutes), deploying the software, starting it, testing it, cleaning it up... These VMs slow down the main system and consume a lot of memory, which limits the ability to recreate a prod-like architecture on a laptop: having two web-server nodes with a clustered database and a load balancer, for example, is heavy. Containers solve this: no large image is needed, and many containers can be scripted and started in a short time. This is really a good solution, and as a consequence DEV teams will love it and are already talking a lot about it, for good reasons.
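This is the kind of thing that is painful with VMs and trivial with containers. Below is a hypothetical sketch of the "prod-like architecture on a laptop" idea, still using the Docker SDK for Python; the images, names and port are illustrative only (a real load balancer would also need its configuration mounted, and a real database would be clustered, both of which I omit here).

```python
import docker

client = docker.from_env()

# A private network so the containers can talk to each other while staying
# isolated from everything else on the laptop.
net = client.networks.create("demo-net", driver="bridge")

# One database node, two web nodes, one load balancer.
db = client.containers.run("postgres:alpine", detach=True, name="db",
                           network="demo-net",
                           environment={"POSTGRES_PASSWORD": "secret"})
web_nodes = [client.containers.run("nginx:alpine", detach=True,
                                   name=f"web{i}", network="demo-net")
             for i in (1, 2)]
lb = client.containers.run("haproxy:alpine", detach=True, name="lb",
                           network="demo-net", ports={"80/tcp": 8080})

# ... run the tests against http://localhost:8080 ...

# Then trash everything cleanly, which is the whole point.
for c in [lb, *web_nodes, db]:
    c.stop()
    c.remove()
net.remove()
```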
Now, let's see why OPS may also love it, potentially in a different way and for different reasons. The thing is, containers do not solve the OPS problems that virtualization addresses. If we are only talking about consolidation, they look perfect. But if we are talking about high availability, disaster recovery, capacity planning, scalability and so on, containers do not address these points, and in my opinion it would not be a good idea to try to address them through containers. So comparing containers with virtualization is the wrong debate: from an OPS point of view they are complementary solutions addressing different problems. Let me explain.
Today, DEV teams tend to create one VM per application, and more broadly this is what IaaS solutions propose to you... OK, this is a shortcut, as IaaS does not impose that, but in practice, in the cloud, with devops automation systems, it is the consequence: one application = one or more VMs. From an OPS point of view this creates a really huge problem: how do you maintain such a large number of VMs? How do you optimize cost when, even if a single VM is not expensive, you have about 10K of them? That is a really large budget. Before devops, the answer was easy: mutualization, meaning many applications deployed on the same server. This is an efficient way to handle the OPS constraints, but it has a big impact on DEV, and it has also been shown to create problems due to dependencies and unexpected interactions between applications. This is exactly what Docker addresses. In fact, in my opinion, running Docker on top of a virtualized system solves most of the DEV and OPS expectations at the same time.
From an OPS point of view, the VM remains the thing to manage: the VM is the unit that is backed up, monitored, alarmed, upgraded... and that can be moved across physical servers. OPS do not really have to look inside the containers themselves. The way a container is packaged, with all its dependencies, is for OPS simply an easier way to industrialize applications. It is also a good way to move an application from one server to another, since the application, through its container, is portable and carries all its specificities and dependencies with it. Said differently, in my opinion, containers provide a way to keep the existing infrastructure processes in place while simplifying industrialization and management.
From a DEV point of view, the VM no longer has to be considered; only the container has to be managed. If the standard VM does not contain the needed dependencies, you just add them to the container and deliver it. Industrialization becomes easy, and the final architecture can be tested in the dev environment. Combined with tools like Chef or Puppet, it is also possible to manage the container configuration on the DEV side, in sync with the OPS team.
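As a small sketch of that delivery flow, assuming again the Docker SDK for Python and a hypothetical application directory and registry: the image built from the app's Dockerfile carries every dependency, so the artefact that DEV tests is the one OPS runs.

```python
import docker

client = docker.from_env()

# Build the image from the application's directory; its Dockerfile declares
# every dependency the app needs, so nothing is assumed about the target host.
image, build_logs = client.images.build(path="./myapp",
                                        tag="registry.example.com/myapp:1.0")

# Push it to a (hypothetical) internal registry: the same artefact can then
# be pulled and run on a laptop, on a VM in the datacentre or in a public cloud.
client.images.push("registry.example.com/myapp", tag="1.0")
```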
All that said, what really matters is that OPS processes depend on the hosting solution: the way you back up, monitor and upgrade differs between internal tools and processes, a public cloud and a managed private cloud. This impacts the applications, because OPS scripts are built on top of norms that differ from one provider to another. The container solution, by packaging the application and everything it depends on into one logical unit, solves this: it makes the container, and therefore the application, portable across many OPS providers.
As a conclusion, I would say: YES, Docker is a revolution. It is the first real opportunity to build a DEVOPS organisation on top of existing DEV and OPS organisations, where we can keep one or more dedicated OPS teams and a DEV team working on the same processes in an efficient way. And this is really good news, because expecting DEV to become OPS in a large company makes no sense, and expecting different OPS teams to converge on the same industrialization standards is no more realistic. So if you retain only one thing from this post: in a multi-sourced infrastructure context, or simply to avoid being dependent on a single IaaS provider, you should take a serious look at Docker soon.