An HTTP proxy is an essential component if you have a slow Internet link, or are simply doing a lot of builds that require downloading a lot of data. I like the Polipo caching HTTP proxy as it’s simple and single-threaded. I was a bit sad to discover that Polipo is no longer maintained, as of August 2016. Hopefully it will remain useful for some time yet or maybe even be picked up by a new maintainer. Continue reading “Container of the Week – clue/polipo”
Machine Learning is a very popular field at the moment and is something that’s in the news and geek culture a lot. Kaggle is a machine learning competition site where you can take part in a (usually sponsored) competition to apply your skills and solve a real-world problem.
Putting aside the controversial nature of spec work (read no!spec and this Wired article for some background) Kaggle have put together a pretty nice container image for getting started with machine learning. Continue reading “Container of the Week – kaggle/python”
While researching another article I discovered that it’s now possible to change the restart policy for a container without stopping, deleting and re-running it. Apparently this has been possible for quite a long time, ever since Docker 1.11 was released in April 2016.
The docker update command allows you to change the configuration of a container in several ways. To change the restart policy use the --restart command line option:
$ docker update --restart=always CONTAINER
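As a rough sketch of how this might be wrapped in a script, the helper below checks the policy name before calling docker update. The function name set_restart_policy is hypothetical and not part of Docker. The accepted values (no, always, unless-stopped, on-failure with an optional retry count) are Docker's documented restart policies.

```shell
# Sketch: wrap `docker update --restart` with a check on the policy name.
# set_restart_policy is a hypothetical helper, not part of Docker.
set_restart_policy() {
    policy=$1
    container=$2
    case "$policy" in
        # The restart policies Docker documents; on-failure may carry
        # an optional ":max-retries" suffix.
        no|always|unless-stopped|on-failure|on-failure:*) ;;
        *)
            echo "unknown restart policy: $policy" >&2
            return 1
            ;;
    esac
    docker update --restart="$policy" "$container"
}

# Usage (CONTAINER is a placeholder for a container name or ID):
#   set_restart_policy unless-stopped CONTAINER
#   docker inspect --format '{{.HostConfig.RestartPolicy.Name}}' CONTAINER
```

The docker inspect line in the usage comment is one way to confirm that the new policy has been applied.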
This week we are looking at a container for Apache Spark. Spark is a cluster-computing framework for data processing, in particular MapReduce and more recently machine learning, graph analysis and streaming analytics. Clustered systems are sometimes difficult to run on a single machine, for example a laptop or desktop, as this use case is often not given a high priority by developers. Luckily, there is the gettyimages/spark image available for those who wish to quickly and easily explore the Spark environment.
What are base images?
Every application running inside a container is built upon a foundation. This foundation is called the base image and supports everything above it. A virtual machine requires an operating system that is used by the application running inside it, and in a similar way a containerized application requires a base image.
The choice of base image is very important and the decision needs to be made with due care and attention. A good choice of base image gives your application room to grow and change as your requirements grow and change. A bad choice of base image will restrict your application and can result in costly rewriting and refactoring.
A base image is often a minimized installation of a regular server operating system, like Debian, Ubuntu or CentOS. Unused or unnecessary components are removed or not installed. This leads to an image that has minimal support requirements, a small attack surface, and is easy to test and validate. It can then be used to create intermediate images to support particular software ecosystems, for example Java.
I have been doing a bit of work on analysing the Docker official library images using the bashbrew tool. If you are an experienced Go developer then perhaps it’s obvious how to get it working but I had some trouble.
Here is a quick introduction to getting the bashbrew tool working.
There’s literally not much to say about the scratch container as it’s completely empty! This container is usually only used when creating a base container from an external root filesystem in combination with the ADD command.
A Dockerfile that does this would look like:
FROM scratch
ADD rootfs.tar /
The root filesystem can be created outside of Docker or downloaded from a third party. For example when creating a base container using Debian the root filesystem can be created with debootstrap. The ADD command takes the tarball and extracts it into the container.
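A minimal sketch of that workflow for Debian follows. It assumes debootstrap, tar and docker are installed. The suite (stable), the mirror URL, the directory name rootfs and the image tag my-debian-base are illustrative choices, and the two function names are hypothetical helpers.

```shell
# Sketch only: assumes debootstrap and tar are installed. The suite and
# mirror URL below are illustrative choices; adjust them to your setup.

# Build a minimal Debian root filesystem in directory $1.
# debootstrap needs root privileges, hence sudo.
build_rootfs() {
    sudo debootstrap stable "$1" http://deb.debian.org/debian
}

# Pack root filesystem directory $1 into tarball $2.
# -C changes into the directory first, so paths inside the archive are
# relative to the filesystem root, which is what ADD rootfs.tar / expects.
pack_rootfs() {
    tar -C "$1" -cf "$2" .
}

# Typical sequence, run from a root shell in the directory that holds the
# Dockerfile (root is needed so tar can read every file debootstrap made):
#   build_rootfs rootfs
#   pack_rootfs rootfs rootfs.tar
#   docker build -t my-debian-base .
```

Keeping the tarball next to the Dockerfile means it is part of the build context, so the ADD line can refer to it by its bare file name.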
This week we are going to look at a fairly popular container that is often used as a base for larger images – busybox. We’re also going to look at some of the upsides and downsides of busybox, a somewhat tempestuous project in the free software world.
There is a rather unhealthy obsession, in my opinion, in the Docker community with producing the smallest possible container image. Obviously you don’t want your container to contain hundreds of megabytes of useless junk, but perhaps we have passed the point of diminishing returns. It turns out that it is less expensive to have files in your base image that aren’t used than it is to have duplicated files in higher layers.
This post is part of a series where we examine a different container image each week. See previous Containers of the Week here. This week’s image is the official image for the Jenkins project, an open source application for building, deploying and automating software.
Running Jenkins inside a container is a simple task, but I’m going to give you a few tips to improve your Jenkins experience with Docker.