Container of the Week – clue/polipo

An HTTP proxy is an essential component if you have a slow Internet link, or are simply doing a lot of builds that download a lot of data. I like the Polipo caching HTTP proxy as it’s simple and single-threaded. I was a bit sad to discover that Polipo is no longer maintained, as of August 2016. Hopefully it will remain useful for some time yet, or maybe even be picked up by a new maintainer.

Container of the Week – kaggle/python

Machine learning is a very popular field at the moment and comes up a lot in the news and in geek culture. Kaggle is a machine learning competition site where you can take part in a (usually sponsored) competition to apply your skills and solve a real-world problem.

Putting aside the controversial nature of spec work (read no!spec and this Wired article for some background), Kaggle have put together a pretty nice container image for getting started with machine learning.

Changing a Docker container’s restart policy

While researching another article I discovered that it’s now possible to change the restart policy for a container without stopping, deleting and re-running it. Apparently this has been possible for quite a long time, ever since Docker 1.11 was released in April 2016.

The docker update command allows you to change the configuration of a container in several ways. To change the restart policy, use the --restart command line option:

$ docker update --restart=always CONTAINER
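To check that the change took effect, the policy can be read back with docker inspect. The following is only a sketch: set_restart_policy is a hypothetical helper name, not part of Docker, and it assumes the usual policy values (no, always, unless-stopped, on-failure with an optional retry count).

```shell
# Hypothetical helper (not a Docker command): set a restart policy,
# then print the policy name Docker reports for the container.
set_restart_policy() {
    policy=$1
    container=$2
    # Reject obviously invalid values before calling Docker.
    case "$policy" in
        no|always|unless-stopped|on-failure|on-failure:[0-9]*) ;;
        *) echo "unknown restart policy: $policy" >&2; return 1 ;;
    esac
    # docker update prints the container name, which is not needed here.
    docker update --restart="$policy" "$container" >/dev/null || return 1
    docker inspect --format '{{.HostConfig.RestartPolicy.Name}}' "$container"
}
```

Running set_restart_policy unless-stopped CONTAINER should then print the policy name that Docker has stored for that container.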

Container of the Week: gettyimages/spark

This week we are looking at a container for Apache Spark. Spark is a cluster-computing framework for data processing, in particular MapReduce and more recently machine learning, graph analysis and streaming analytics. Clustered systems are sometimes difficult to run on a single machine, for example a laptop or desktop, as this use case is often not given a high priority by developers. Luckily, there is the gettyimages/spark image available for those who wish to quickly and easily explore the Spark environment.


Docker Base Images

What are base images?

Every application running inside a container is built upon a foundation. This foundation is called the base image and supports everything above it. A virtual machine requires an operating system that is used by the application running inside it, and in a similar way a containerized application requires a base image.

The choice of base image is very important and the decision needs to be made with due care and attention. A good choice of base image gives your application room to grow and change as your requirements grow and change. A bad choice of base image will restrict your application and can result in costly rewriting and refactoring.

A base image is often a minimized installation of a regular server operating system, like Debian, Ubuntu or CentOS. Unused or unnecessary components  are removed or not installed. This leads to an image that has minimal support requirements, a small attack surface, and is easy to test and validate. It can then be used to create intermediate images to support particular software ecosystems, for example Java.


Kubernetes: Why does it matter?

I wrote an article for opensource.com titled “Kubernetes: Why does it matter?”. Here is an excerpt:

Developing and deploying cloud-native applications has become very popular—for very good reasons. There are clear advantages to a process that allows rapid deployment and continuous delivery of bug fixes and new features, but there’s a chicken-and-egg problem no one talks about: How do you get there from here? Building the infrastructure and developing processes to develop and maintain cloud-native applications—all from scratch—are non-trivial, time-intensive tasks.

Container of the Week – scratch

There’s not much to say about the scratch container as it’s completely empty! This container is usually only used when creating a base container from an external root filesystem, in combination with the ADD command.

A Dockerfile that does this would look like:

FROM scratch
ADD rootfs.tar /

The root filesystem can be created outside of Docker or downloaded from a third party. For example, when creating a base container using Debian, the root filesystem can be created with debootstrap. The ADD command takes the tarball and extracts it into the container.
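As a rough sketch of the debootstrap route, the tarball for the Dockerfile above could be produced along these lines. The function name make_rootfs_tar, the default suite "stable", the mirror URL and the minbase variant are all assumptions for illustration; it needs root and a host with debootstrap installed.

```shell
# Hypothetical helper: build rootfs.tar in the current directory
# using debootstrap (requires root and the debootstrap package).
make_rootfs_tar() {
    suite=${1:-stable}                         # Debian suite (assumed default)
    mirror=${2:-http://deb.debian.org/debian}  # package mirror (assumed default)
    # Install a minimal Debian system into ./rootfs ...
    sudo debootstrap --variant=minbase "$suite" rootfs "$mirror" &&
    # ... then pack its contents so that ADD can extract them at /.
    sudo tar -cf rootfs.tar -C rootfs .
}
```

After make_rootfs_tar has finished, running docker build in the same directory picks up rootfs.tar through the ADD line.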

Mass-Deleting Docker Images

I’m having a cleanup of my Docker images and there’s a bit of a mismatch between the output format of docker images and the input of docker rmi. I don’t, however, want to delete everything, only a selection of images.

Luckily there’s a --format argument to docker images which allows an output format to be specified. Here’s the trick:

$ docker images --format "{{.Repository}}:{{.Tag}}" | \
    grep :foo | \
    xargs docker rmi

This command deletes all images with the tag “foo”, something which is tricky using the standard output format.
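Since a substring match on :foo would also select tags such as “foobar”, it can be worth previewing the list before deleting anything. Here is a small sketch of a stricter filter; images_with_tag is a hypothetical helper name, and it assumes one repository:tag entry per line, as produced by the format string above.

```shell
# Hypothetical helper: read "repository:tag" lines on stdin and keep
# only those whose tag (the text after the last colon) is exactly $1.
images_with_tag() {
    awk -F: -v tag="$1" '$NF == tag'
}
```

Piping the docker images output through images_with_tag foo shows exactly what would be removed; once the list looks right, append | xargs docker rmi.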

The documentation for the docker images command has all the details.