Braindump at https://docs.google.com/document/d/1LlPWnzTHH6spRrvOXOXa2P5GIqsBA-cS5Mw_V7RM8tI/edit# is commentable.
WOMS: It works on my system - let's ship it!
"If all you have is a container, everything looks like an encapsulation problem"
The current trend in system deployment seems to be containerization. Docker is one example of it, and "Revisiting how we put together Linux systems" seems to be heading in a similar direction.
What is containerization?
Typically you are looking at a file system with delta capabilities: BTRFS send/receive, AUFS2 and similar systems are able to do a variant of copy-on-write that records changes to an underlying base image and creates a layer on top of that. Apple's Time Machine is a very primitive way of doing a similar (but different) thing for purposes of backup instead of system execution.
Depending on the way these changes are being recorded, there are various ways in which the change recording is dependent on the underlying base image and how it can be recombined later with other images.
The main idea is to create layers of stacked file systems. The resulting image for execution can be composed of a base operating system, installed system components, a base configuration and additional customization. Optionally, the system has a capability to merge layers that are adjacent in the stack into a single composed layer ('merge the latest install with my additional changes' being the most enticing and problematic option here).
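To make the stacking idea concrete, here is a toy model in Python. It is only a sketch of the principle, not how BTRFS or AUFS actually work: the class `LayeredFS`, its methods and the `DELETED` marker are names I made up for illustration, and "files" are just dictionary entries.

```python
# Whiteout marker: a layer stores this value to record that a path was deleted.
DELETED = object()


class LayeredFS:
    """Toy model of stacked copy-on-write layers (hypothetical, not a real driver)."""

    def __init__(self, *layers):
        # layers[0] is the base image; later entries sit on top of it.
        self.layers = [dict(layer) for layer in layers]
        # All changes are recorded in a fresh, initially empty top layer.
        self.layers.append({})

    def read(self, path):
        # Search from the top down; the first layer that knows the path wins.
        for layer in reversed(self.layers):
            if path in layer:
                if layer[path] is DELETED:
                    raise FileNotFoundError(path)
                return layer[path]
        raise FileNotFoundError(path)

    def write(self, path, content):
        # Copy-on-write: lower layers are never modified.
        self.layers[-1][path] = content

    def delete(self, path):
        # The file still exists in a lower layer; it is only hidden from view.
        self.layers[-1][path] = DELETED

    def merge_top_two(self):
        # Squash the two topmost layers into one; entries of the upper layer win.
        if len(self.layers) < 2:
            raise ValueError("need at least two layers to merge")
        upper = self.layers.pop()
        self.layers[-1] = {**self.layers[-1], **upper}


base = {"/etc/hostname": "base", "/bin/sh": "shell-v1"}
install = {"/usr/bin/app": "app-v1"}
fs = LayeredFS(base, install)
fs.write("/etc/hostname", "web01")   # customization lands in the top layer
fs.delete("/bin/sh")                 # hidden, but still present in the base layer
print(fs.read("/etc/hostname"))
fs.merge_top_two()                   # 'merge the latest install with my changes'
print(len(fs.layers))
```

Note what the merge does: after squashing, the install step and the manual changes can no longer be told apart, which is exactly why that option is both enticing and problematic.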
Containerization is sexy because it allows us to run programs inside containers through the use of Linux namespaces and Linux cgroups. The namespaces provide the mechanism necessary to hide things from each other (separate system names, PID spaces, UID spaces, mount spaces, IPC spaces, and network namespaces). The cgroups provide the required mechanism to limit the resource consumption of each container and to manage system resources efficiently and in isolation.
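You can look at namespaces from the outside without any container tooling. On Linux, every process has a directory /proc/<pid>/ns whose entries are symbolic links naming the namespaces that process lives in. The following is a minimal sketch that reads them; the function names and the `proc_root` parameter are my own, added only so the code can be pointed at a different directory.

```python
import os


def namespaces_of(pid="self", proc_root="/proc"):
    """Return {namespace type: identifier} for one process.

    Each entry under /proc/<pid>/ns is a symbolic link whose target
    identifies the namespace, for example "pid:[4026531836]".
    """
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    return {
        name: os.readlink(os.path.join(ns_dir, name))
        for name in sorted(os.listdir(ns_dir))
    }


def shared_namespaces(pid_a, pid_b, proc_root="/proc"):
    """Return the namespace types in which both processes see the same world."""
    a = namespaces_of(pid_a, proc_root)
    b = namespaces_of(pid_b, proc_root)
    return sorted(name for name in a if b.get(name) == a[name])


# Only meaningful on Linux; on other systems this part is skipped.
if __name__ == "__main__" and os.path.isdir("/proc/self/ns"):
    for kind, ident in namespaces_of().items():
        print(kind, ident)
```

Two processes in the same container should share all of these identifiers, while a process on the host and one inside a container should differ in most of them. Reading the links of processes you do not own may require root.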
Docker is a formalization of this as an execution model: Basically, each app is run in its own Linux environment, with its own libraries and other environmental dependencies provided as part of the image. A full container is created to isolate the application, exposing only individual network ports to the outside in order to create a defined interface and access point for service delivery.
While the image is usually a full Linux image (it need not be, but the way Docker containers are being made these days is pretty wasteful), starting an instance does not run a full Linux subsystem. Instead, the actual application is usually run in place of the container's init process.
Due to the way Linux manages namespaces and cgroups, this provides very fast startup, very little overhead and no dependencies on the host platform beyond the kernel.
It looks like a really awesome concept. Docker even adds a central registry of images, with dependencies listed. You request a dockerized application, and it downloads itself from there, with dependent subimages, stacks everything together properly and then starts stuff.
Images are cached, and because everybody and their dog uses the Ubuntu 12.04 cloud image as a base, that's cached locally anyway and the download times are really tiny. Awesome!
The downside: Fake reproducibility
You create docker images using a Dockerfile. That's a small text file with build instructions for the image. These things are really simple, and can be understood with almost no explanation.
Here is how to create a firefox in a box, useable via VNC:
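# Assumed base image; the text below only says "an Ubuntu base image"
FROM ubuntu
RUN apt-get update && apt-get install -y x11vnc xvfb firefox
RUN mkdir /.vnc
RUN x11vnc -storepasswd 1234 ~/.vnc/passwd
RUN bash -c 'echo "firefox" >> /.bashrc'
# x11vnc listens on port 5900 by default (assumed to be the single exposed port)
EXPOSE 5900
CMD ["x11vnc", "-forever", "-usepw", "-create"]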
This dockerfile is a good dockerfile: It downloads an Ubuntu base image, installs a minimum amount of additional software, via documented installation commands, exposes a single port for access and defines a startup command.
That's how dockerfiles work: They describe how an image has been built from other images, by adding files, and running commands to prepare it.
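Because a dockerfile is just a list of steps, it is easy to take one apart and see what it does and does not document. Here is a rough sketch of such a reader. It is deliberately simplified (comments, blank lines and backslash continuations only, no parser directives or heredocs), the function names are made up, and the sample text with its `FROM ubuntu` line is an assumed example, not a file from any registry.

```python
def parse_dockerfile(text):
    """Split Dockerfile text into (INSTRUCTION, arguments) pairs."""
    steps = []
    pending = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.endswith("\\"):
            # Line continuation: remember this part and keep reading.
            pending += line[:-1].rstrip() + " "
            continue
        parts = (pending + line).split(None, 1)
        pending = ""
        arguments = parts[1].strip() if len(parts) > 1 else ""
        steps.append((parts[0].upper(), arguments))
    return steps


def provenance_report(steps):
    """Summarize what the file itself says about where the image comes from."""
    return {
        "base_images": [args for ins, args in steps if ins == "FROM"],
        "build_commands": [args for ins, args in steps if ins == "RUN"],
        "exposed_ports": [args for ins, args in steps if ins == "EXPOSE"],
    }


SAMPLE = """\
# assumed example, modelled on the firefox-in-a-box file
FROM ubuntu
RUN apt-get update && \\
    apt-get install -y x11vnc xvfb firefox
EXPOSE 5900
CMD ["x11vnc", "-forever", "-usepw", "-create"]
"""

report = provenance_report(parse_dockerfile(SAMPLE))
for key, values in report.items():
    print(key, values)
```

What such a report cannot tell you is how the base image named in FROM was built, or what happened to the image after the build. That is the gap the rest of this article is about.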
Only: this is not what typically happens.
Look at this Google search: ssh into Docker
What people actually do is munge together a bunch of images from unknown sources, created in an undocumented way, ssh into the result, and then go on a customization spree through the container.
Nobody has downloaded the complete public registry of Docker images yet to run some statistics on it. But I bet that if you did, and scanned it, you'd find a lot of interesting things. Among the things of interest, I expect:
- .bash_history files, containing things that should have been part of the dockerfile instead
- any amount of ssh and ssl private key material
- entire git repositories, or traces of them after deletion ("git clone myproject.git && cd myproject && make && make install && cd .. && rm -rf myproject")
- and a lot of other nasty surprises that are invisible if you look at dockerfiles
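Such a scan is not hard to write. The sketch below walks an unpacked image directory and reports the kinds of leftovers listed above. The function name and the detection rules are my own simplifications: shell history is matched by file name, git repositories by a `.git` directory, and private keys by the PEM "PRIVATE KEY" header near the start of a file.

```python
import os

# All common PEM private key headers end in this byte string.
PEM_PRIVATE_KEY_MARKER = b"PRIVATE KEY-----"


def scan_image_tree(root):
    """Return a sorted list of (kind, path) findings below `root`."""
    findings = []
    for dirpath, dirnames, filenames in os.walk(root):
        if ".git" in dirnames:
            findings.append(("git repository", os.path.join(dirpath, ".git")))
            dirnames.remove(".git")  # no need to descend into the repository
        for name in filenames:
            path = os.path.join(dirpath, name)
            if name == ".bash_history":
                findings.append(("shell history", path))
                continue
            # Skip symlinks, devices and FIFOs; only regular files are read.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as handle:
                    head = handle.read(4096)
            except OSError:
                continue
            if PEM_PRIVATE_KEY_MARKER in head:
                findings.append(("private key", path))
    return sorted(findings)
```

You would call it as `scan_image_tree("/path/to/unpacked/image")`. To catch the "deleted after use" cases, the scan would have to be run on every layer separately, because a file removed in an upper layer is still present in the layer below it.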
The difference between RPM spec files and a dockerfile
This is no accident. A dockerfile is fundamentally different from a spec file for rpmbuild.
The spec file is a set of build instructions, but it is also a kind of documentation. It contains references to source file URLs, patches, and build dependencies. It then contains instructions on how to combine all that into an actually successful build, and then explains in detail what the deliverables are, which of them are the actual deliverables, config files or documentation, and finally spells out the full list of installation dependencies.
That is, it is a full description of how a certain result has been achieved, in order to make it reproducible by any interested third party. If that reads like the requirement list for the scientific method, that's because it is: A spec file contains only steps and ingredient source lists, no results, and the system has been built in a way to make manual intervention at any stage pretty hard, on purpose.
That's annoying, if you want just the results, hence things like checkinstall exist, and while they are sometimes handy, they are in no way a replacement for a proper spec file. You don't build distros based on that.
A dockerfile is, by contrast, in its minimal form, a binary patch to a binary blob downloaded from a questionable non-original source.
We have had that before, back 25 years ago.
We have had that before, back 25 years ago, and we abandoned it for a reason. It was called Smalltalk back then, but has been reinvented or reproduced on other platforms after that, several times:
Smalltalk was a wonderful development system: on error it dropped not a message or a stacktrace, but dropped you into the dev environment, with the cursor positioned at the error and all variables and the stack set to proper current values. That was really great, for a developer.
If something was broken or lacking, you could easily fix it. For example, if your system libraries did not support the PNG format, because it hadn't been invented yet, you could open the system bitmap image class, add a new subclass and patch it into the main class. Voila, everything in your system that used bitmap images now understood the PNG format.
It makes "Hello, world!" a bit unwieldy, though, if you want to ship it. Actually, there never were any Smalltalk applications. In order to ship, you froze the current state of the dev system and shipped that:
WOMS. Works on my system.
That's dev culture. It is very different from operations culture. That's because dev culture focuses on the creation of new features, and not on infrastructure, dependencies, requirements and other nasty outside factors.
It creates results, but does not record the way we arrived at them: there are no instructions that document in a binding way how we arrived at the blob that actually delivers the intended result. There is also no binding between the executable deliverables and everything necessary to reproduce them: sources, patches and build instructions.
That makes results hard to verify, makes dependencies invisible, and makes statistics close to impossible. Instead, dependencies are shipped as part of the image.
Down the road, in 5 years, we'll be in a world with a lot of cargo culting: "If you clone this image, it will be faster and more stable." "I have no idea why it works, but if you choose that other image, it won't." Great! We had just gotten rid of that with rpm and puppet (or apt and Chef, or whatever makes your systems fly).
Anyway, Operations exists, and it has nasty little requirements: "Which versions of what are you deploying and running in your systems?" "Can you guarantee the absence of the following code snippet in all versions of SSL in all of your boxes and subboxes?" "And, while you are at it, can you please replace all versions of SSL in all of your boxes and subboxes with this new, improved and more current version, by tomorrow, because vulnerable?"
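With a package database, those questions are simple queries. With opaque images, somebody first has to build the inventory by cracking open every image. The sketch below shows what the query looks like once such an inventory exists. Everything in it is made up for illustration: the image names, the package names, the version strings and the two helper functions.

```python
# Hypothetical inventory: image name -> {package name: version found inside}.
INVENTORY = {
    "web-frontend": {"libssl": "1.0", "app": "3.2"},
    "billing": {"libssl": "1.1", "app": "0.9"},
    # A statically linked blob: whatever SSL code it contains is not listed.
    "static-blob": {"app": "7.0"},
}

# Hypothetical set of versions known to be vulnerable.
AFFECTED = {"1.0"}


def images_shipping(inventory, package, affected_versions):
    """Images that carry their own copy of `package` in an affected version."""
    return sorted(
        image
        for image, packages in inventory.items()
        if packages.get(package) in affected_versions
    )


def images_without_record(inventory, package):
    """Images where nothing can be said, because `package` is not listed at all."""
    return sorted(
        image for image, packages in inventory.items() if package not in packages
    )


print("must be rebuilt:", images_shipping(INVENTORY, "libssl", AFFECTED))
print("cannot be vouched for:", images_without_record(INVENTORY, "libssl"))
```

The second list is the uncomfortable one. "Not listed" does not mean "not present", so guaranteeing the absence of a code snippet is only possible for images whose contents are fully accounted for.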
You could pessimize that even more, arguing that shared libraries make no sense in a dockerized environment at all. After all, if you look at the internals of, say, a Sonos box, you'll see a Linux kernel booting as a shell for device drivers to access the hardware, and then, after setting up a network environment, starting one large static C++ blob that is the actual Sonos sound system software. In a dockerized environment it would make a lot of sense to structure your software in a similar way: if distributions and operating system environments no longer matter, you can tailor your environment completely to the needs of your application, after all.
So where an upgrade used to mean replacing a library and then restarting all binaries using it, it now means waiting for a rebuild of something outside of our control that contains a static copy of it.
It might as well be Windows, using a Linux outer shell to host device drivers.