LinuxKit – the software-defined OS

If you were lucky enough to go to DockerCon 2017 (I wasn’t) you might have seen the announcement of Moby and LinuxKit, Docker’s new framework for assembling specialised container systems. Traditionally, a bare-metal or virtual machine that runs Docker has run a “full service” distribution like Debian, Ubuntu or Red Hat Enterprise Linux. Docker and containerized applications are then installed and run on that server. LinuxKit gives us a quick and easy way of building an OS for the host machine that’s customised to run containers and not a lot else.

In March 2013, Docker gave us an easy and lightweight way to run services in isolation on a single server, from both a development and a production point of view. Docker has since developed to include networking, storage and clustering capabilities. One interesting side-effect of the architecture was to decouple the host OS from the guest (container) OS. Developers need not write their applications for the type and flavour of Linux that’s installed on the host. In fact, as far as the developer is concerned, the host distro is completely irrelevant.

LinuxKit takes the idea of decoupling the host and guest OS to the next level by applying the principles used to build a containerized application to the OS itself. We build a container by starting with a very small base and then adding only what we need on top. With LinuxKit we build an OS the same way: start with a very small base and add only what we need on top. Extra system services on the host, such as ntpd or a DHCP client like dhcpcd, are themselves run inside containers on the host.
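To make the "small base plus only what we need" idea concrete, here is a rough sketch in Python that assembles a LinuxKit-style image definition as text. The section names (kernel, init, onboot, services) follow the layout of a LinuxKit YAML file, but the render_config helper is hypothetical and every image name and tag is a placeholder I made up for illustration, so treat this as a picture of the structure rather than a working configuration.

```python
# Hypothetical sketch of a LinuxKit-style image definition.
# Section names mirror LinuxKit's YAML layout; all image names and
# tags below are placeholders, not real image references.

def render_config(kernel_image, init_images, onboot, services):
    """Return YAML-like text describing a minimal container host OS."""
    # The small base: a kernel plus a handful of init components.
    lines = ["kernel:", f"  image: {kernel_image}", "init:"]
    lines += [f"  - {image}" for image in init_images]
    # Everything else is a container: one-shot boot tasks (onboot)
    # and long-running system daemons (services).
    for section, entries in (("onboot", onboot), ("services", services)):
        lines.append(f"{section}:")
        for name, image in entries:
            lines.append(f"  - name: {name}")
            lines.append(f"    image: {image}")
    return "\n".join(lines)


config = render_config(
    kernel_image="example/kernel:<tag>",
    init_images=["example/init:<tag>", "example/containerd:<tag>"],
    onboot=[("sysctl", "example/sysctl:<tag>")],
    services=[
        ("ntpd", "example/ntpd:<tag>"),
        ("dhcpcd", "example/dhcpcd:<tag>"),
    ],
)
print(config)
```

The point of the sketch is that adding or removing a host service is a one-line change to a list, in the same way that adding a layer to a container image is.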

This method is in stark contrast to the modern general-purpose Linux OS that runs on servers today. A minimal install of Debian, Ubuntu or CentOS occupies between 100 and 200MB depending on what other services you have installed. After a certain point, though, it’s impossible to slim down a Linux install any further because the base packages are so interconnected. In Debian and Ubuntu, for example, the size of the base install is blown out merely by having the packaging system and systemd present, both of which pull in other dependencies such as libstdc++ and Perl.

Let’s take the idea of applying containerization concepts to the host OS a step further. What if the host OS were to have a lifecycle as short as that of a container? Compute nodes typically have a relatively long lifecycle as cluster members, but thinking of the host OS as a throwaway might lead us to build infrastructure that can be developed, deployed and maintained with the same speed that we can achieve with applications.

LinuxKit is effectively a system for building software-defined operating systems. We have been able to build OS installs automatically for a long time now, but they are relatively large and clunky due to their general-purpose nature. In the computing world, there’s always a tension between generalization and specialization. Using LinuxKit to build host operating systems tips the scales towards specialization and away from the generalized nature that compute nodes have traditionally had. I think it’s going to be very interesting to see where this direction takes us in the future.