Building Docker Images from a Container
By Jeff Nickoloff
This article was excerpted from the book Docker in Action.
It is easy to get started building images if you are already familiar with using containers. A union file system (UFS) mount provides a container's file system, so any changes that you make to the file system inside a container are written as new layers owned by the container that created them.
Before you work with real software, this article will detail the typical workflow using a Hello World example.
Packaging Hello World
The basic workflow for building an image from a container includes three steps. First, you need to create a container from an existing image. You will choose the image based on what you want to be included with the new finished image and the tools you will need to make the changes.
The second step is to modify the file system of the container. These changes will be written to a new layer on the union file system for the container. The relationship between images, layers, and repositories is covered in more depth in the book.
Finally, once the changes have been made, you commit them. After that, you will be able to create new containers from the resulting image. Figure 1 illustrates this workflow.
Figure 1: Building an image from a container
With these steps in mind, work through the following commands to create a new image, named "hw_image."
# Modify a new container
docker run --name hw_container ubuntu:latest touch /HelloWorld

# Commit the changes you made in that container to a new image
docker commit hw_container hw_image

# Remove the changed container
docker rm -vf hw_container

# Test the new image
docker run --rm hw_image ls -l /HelloWorld
# Outputs:
# -rw-r--r-- 1 root root 0 Apr 15 22:06 /HelloWorld
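The same three steps can be wrapped in a small shell function. This is only a sketch and not part of the original text: the function name image_from_container, the temporary container name, and the DOCKER variable (an override that lets you substitute another command, such as echo, for a dry run) are all assumptions made for illustration.

```shell
# Hypothetical helper: build an image from a throwaway container.
# DOCKER is an assumed override so the steps can be dry-run with DOCKER=echo.
image_from_container() {
    base_image=$1          # image to start from, e.g. ubuntu:latest
    new_image=$2           # name for the resulting image, e.g. hw_image
    shift 2                # remaining arguments: the command that makes the change
    container="build-$$"   # temporary container name based on the shell's PID

    # Steps 1 and 2: create a container and modify its file system
    "${DOCKER:-docker}" run --name "$container" "$base_image" "$@" || return 1
    # Step 3: commit the container's changes to a new image
    "${DOCKER:-docker}" commit "$container" "$new_image" || return 1
    # Clean up the container now that its changes are captured in the image
    "${DOCKER:-docker}" rm -vf "$container"
}

# Usage (requires a running Docker daemon):
#   image_from_container ubuntu:latest hw_image touch /HelloWorld
```

Running it with DOCKER=echo prints the three docker subcommands instead of executing them, which is a cheap way to check the order of the steps.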
If that seems stunningly simple, you should know that it does become a bit more nuanced as the images you produce become more sophisticated. However, the basic steps will always be the same. Now that you've gotten an idea of the workflow, you should try to build a new image with real software. In this case, you'll be packaging a program called Git.
Preparing Packaging for Git
Git is a popular distributed version control tool. Whole books have been written about the topic. If you are unfamiliar with it, I recommend that you spend some time learning how to use it. At the moment, however, you only need to know that it is a program that you are going to install onto an Ubuntu image.
To get started building your own image, the first thing you will need is a container created from an appropriate base image:
docker run -it --name image-dev ubuntu:latest /bin/bash
This will start a new container running the Bash shell. From this prompt you can issue commands to customize your container. Ubuntu ships with a Linux tool for software installation called "apt-get." This will come in handy for acquiring the software that you want to package in a Docker image. You should now have an interactive shell running with your container. Next you need to install Git in the container. You can do that by running the following command:
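apt-get update && apt-get -y install git
This first refreshes APT's package lists, which are typically empty in a fresh Ubuntu image, and then tells APT to download and install Git and all of its dependencies on the container's file system. When it is finished, you can test the installation by running the "git" program: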
git version
# Outputs something like:
# git version 1.9.1
Package tools like apt-get make installing and uninstalling software easier than doing everything by hand. However, they provide no isolation for that software, and dependency conflicts occur often. Because this copy of Git is installed inside a container, you can be sure that other software you install outside of the container will not affect it.
Now that Git has been installed on your Ubuntu container, you can simply exit the container:
exit
The container should be stopped but still present on your computer. Git has been installed in a new layer on top of the ubuntu:latest image. If you were to walk away from this example right now and return a few days later, how would you know exactly what changes were made? When you're packaging software it is often useful to review the list of files that have been modified in a container, and Docker has a command for that.
Reviewing File System Changes
Docker has a command that shows you all of the file system changes that have been made inside of a container. These changes include added, changed, or deleted files and directories. To review the changes that you made when you used APT to install Git, run the following command:
docker diff image-dev
# Outputs a LONG list of file changes...
Lines that start with an "A" are files that were added. Those starting with a "C" were changed. Finally, those with a "D" were deleted. Installing Git with APT in this way made several changes. For that reason, it might be better to see this at work with a few specific examples:
# Add a new file to busybox
docker run --name tweak-a busybox:latest touch /HelloWorld
docker diff tweak-a
# Output:
# A /HelloWorld

# Remove an existing file from busybox
docker run --name tweak-d busybox:latest rm /bin/vi
docker diff tweak-d
# Output:
# C /bin
# D /bin/vi

# Change an existing file in busybox
docker run --name tweak-c busybox:latest touch /bin/vi
docker diff tweak-c
# Output:
# C /bin
# C /bin/busybox
Always remember to clean up your workspace:
docker rm -vf tweak-a
docker rm -vf tweak-d
docker rm -vf tweak-c
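Because the listing for a real installation such as Git can be very long, it can help to count the changes by type before reading them in detail. The following is a minimal sketch, assuming only the "A", "C", and "D" line prefixes described above; the function name summarize_diff is made up for this example, and the sample input is the tweak-d output rather than a live container.

```shell
# Summarize `docker diff` output read from standard input.
# Each input line has the form "<type> <path>", where type is A, C, or D.
summarize_diff() {
    awk '
        $1 == "A" { added++ }
        $1 == "C" { changed++ }
        $1 == "D" { deleted++ }
        END { printf "added=%d changed=%d deleted=%d\n", added, changed, deleted }
    '
}

# Sample input copied from the tweak-d example
printf 'C /bin\nD /bin/vi\n' | summarize_diff
# prints: added=0 changed=1 deleted=1

# With a real container (requires Docker):
#   docker diff image-dev | summarize_diff
```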
Now that you've seen the changes that you've made to the file system, you're ready to commit the changes to a new image. Just like most other things, this involves a single command that does several things.
Committing a New Image
You use the docker commit command to create an image from a modified container. It is a best practice to use the -a flag, which signs the image with an author string. You should also always use the -m flag, which sets a commit message. Create and sign a new image named "ubuntu-git" from the "image-dev" container where you installed Git:
docker commit -a "@dockerinaction" -m "Added git" image-dev ubuntu-git
# Outputs a new unique image identifier like:
# bbf1d5d430cdf541a72ad74dfa54f6faec41d2c1e4200778e9d4302035e5d143
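One way to confirm that the author string and commit message were recorded is to read them back with docker inspect. This is a sketch rather than part of the original text: it assumes the image metadata exposes them as the Author and Comment fields, and the function name show_commit_info and the DOCKER dry-run override are hypothetical.

```shell
# Hypothetical helper: print the author and commit message stored in an image.
# Assumes the image metadata exposes them as .Author and .Comment.
# DOCKER is an assumed override for dry runs (DOCKER=echo).
show_commit_info() {
    "${DOCKER:-docker}" inspect \
        --format 'author={{.Author}} message={{.Comment}}' "$1"
}

# Usage (requires Docker and the ubuntu-git image):
#   show_commit_info ubuntu-git
```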
Once you've committed the image, it should show up in the list of images installed on your computer:
docker images
# Outputs:
# REPOSITORY   TAG      IMAGE ID       CREATED         VIRTUAL SIZE
# ubuntu-git   latest   bbf1d5d430cd   5 seconds ago   226 MB
Make sure it works by testing git in a container created from that image:
docker run --rm ubuntu-git git version
# Outputs:
# git version 1.9.1
Now you've created a new image based on an Ubuntu image and installed Git. That is a great start, but what do you think will happen if you omit the command override? Try it to find out:
docker run --rm ubuntu-git
Nothing appears to happen when you run that command. That's because the command you started the original container with was committed with the new image. The container from which the image was created was started with "/bin/bash." When you create a container from this image using the default command, it starts a shell that immediately exits. That's not a terribly useful default command.
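The immediate exit can be reproduced without Docker. The sketch below assumes that, as happens when a container is started without an interactive input attached, the shell has nothing to read: it sees end-of-file on standard input, has no commands to run, and exits successfully. It uses sh as a stand-in for /bin/bash.

```shell
# A shell that reads commands from an empty standard input
# sees end-of-file right away and exits with status 0.
sh </dev/null
status=$?
echo "shell exited with status $status"
```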
I doubt that any users of an image named ubuntu-git would expect that they would need to manually invoke git each time. It would be better to set an entrypoint on the image to "git." An entrypoint is the program that will be executed when the container starts. If the entrypoint is not set, then the default command will be executed directly. If the entrypoint is set, the default command and its arguments will be passed to the entrypoint as arguments.
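The rule in the last two sentences can be modeled in a few lines of shell. This is a simplified illustration of the behavior described here, not Docker's implementation, and the function name effective_command is hypothetical.

```shell
# Model of how the process started in a container is chosen.
# $1 is the entrypoint (empty string if unset); the rest is the command.
effective_command() {
    entrypoint=$1
    shift
    if [ -n "$entrypoint" ]; then
        # Entrypoint set: the command and its arguments become its arguments
        set -- "$entrypoint" "$@"
    fi
    # No entrypoint: the command is executed directly
    echo "$*"
}

effective_command ""    git version   # prints: git version
effective_command "git" version       # prints: git version
effective_command "git"               # prints: git
```

The first two calls produce the same process, which is why an image with "git" as its entrypoint lets users drop the program name from the command.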
To set the entrypoint, you'll need to create a new container with the --entrypoint flag set and create a new image from that container.
# Set the entrypoint for a new container
docker run --name cmd-git --entrypoint git ubuntu-git
# Outputs the standard git help and exits

# Commit the new image to the same name
docker commit -m "Set CMD git" -a "@dockerinaction" cmd-git ubuntu-git

# Clean up the container
docker rm -vf cmd-git

# Test the entrypoint by passing the version command to git
docker run --rm ubuntu-git version
# Outputs the git version and exits
Now that the entrypoint has been set to "git," users no longer need to type the command at the end. This might seem like a marginal savings with this example, but many tools that people use are not as succinct. Setting the entrypoint is just one thing that you can do to make images easier for people to use and integrate into their projects.
Configurable Image Attributes
When you use docker commit, you commit a new layer to an image. The file system snapshot is not the only thing included with this commit. Each layer also includes metadata describing the execution context. Of the parameters that can be set when a container is created, all of the following will carry forward with an image created from the container:
- All environment variables
- Working directory
- The set of exposed ports
- All volume definitions
- Container entrypoint
- Command and arguments
If these values were not specifically set for the container, the values will be inherited from the original image. Let's examine a detailed example.
# Create a specialization of busybox with two environment variables set
docker run --name rich-image-example \
  -e ENV_EXAMPLE1=Rich \
  -e ENV_EXAMPLE2=Example \
  busybox:latest

# Commit the image
docker commit rich-image-example rie
# e37d1db55196c828fd6be3b180c24e446633fec84a1c558b4cc85294ba5cd1d5

docker run --rm rie \
  /bin/sh -c "echo \$ENV_EXAMPLE1 \$ENV_EXAMPLE2"
# Outputs: Rich Example

# Further specialize by setting a default entrypoint and command
docker run --name rich-image-example-2 \
  --entrypoint "/bin/sh" \
  rie \
  -c "echo \$ENV_EXAMPLE1 \$ENV_EXAMPLE2"
# Outputs: Rich Example

# Commit the image
docker commit rich-image-example-2 rie
# bc18509b86d20e721eea8eda95b879b7a906c6aef01831ea3f1f5a950d2fbc79

# Run the bare image
docker run --rm rie
# Outputs: Rich Example
This example builds two additional layers on top of BusyBox. In neither case are files changed, but the behavior changes because the context metadata has been altered. These changes include two new environment variables in the first new layer. Those environment variables are inherited by the second new layer, which sets the entrypoint and default command to display their values. The last command used the final image without specifying any alternative behavior, and its output shows that the previously defined behavior has been inherited.
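To see the inherited metadata directly rather than inferring it from program output, you could read it back with docker inspect. The sketch below assumes the six attributes listed earlier appear under the image's Config section as Env, WorkingDir, ExposedPorts, Volumes, Entrypoint, and Cmd; the function name show_image_context and the DOCKER dry-run override are hypothetical.

```shell
# Hypothetical helper: print the execution-context metadata stored with an image.
# Assumes the fields live under .Config with the names used below.
# DOCKER is an assumed override for dry runs (DOCKER=echo).
show_image_context() {
    for field in Env WorkingDir ExposedPorts Volumes Entrypoint Cmd; do
        printf '%s: ' "$field"
        "${DOCKER:-docker}" inspect --format "{{json .Config.$field}}" "$1"
    done
}

# Usage (requires Docker and the rie image from the example):
#   show_image_context rie
```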
Now that you understand how to modify an image, take the time to dive deeper into the mechanics of images and layers. Doing so will help you produce high-quality images in real-world situations.