Building the app, installing the dependencies and services, automating the deployment, and more — it all starts with the Dockerfile. Let’s review the syntax, from basic to elaborate, and some best practices when building your Docker images.
In this guide, we’ll write a Dockerfile instructing Docker to select a minimal Linux (the base image) for the application we’ll deliver, and to ship with it a set of tools of our choice and a specific configuration, effectively building our own Linux distribution that’s just right for running our app.
Why Docker
With Docker you can “Build, ship, and run any app, anywhere”. That is, you can pack your application with all of the binaries and runtime libraries, back-end tools, OS tweaks, and even specific services your application needs for running — and make it readily available for instant delivery and automatic deployment.
The software containers technology that Docker implements is what makes this possible. And although I won’t cover much of the detail behind it here, you can read more about Docker, what software containers are, and how they work in Understanding Docker, Containers and Safer Software Delivery.
Installing Docker
Before starting, you’ll need to have Docker installed, whether on your local machine or on a remote server.
Fortunately, the latest version of Docker (1.12 as of this writing) has made the installation process really smooth, and there are easy-to-follow guides for Windows, macOS and Linux.
The Dockerfile
In order to build an image in Docker, you first need to set the instructions for this build in a plain text file named Dockerfile, together with a context (more on this later). This file has a syntax similar to that of Apache configuration files: one instruction per line with its respective arguments, and all instructions processed in sequence, one after another. Comment lines start with the # character. Finally, once you have a Dockerfile, the docker build command will build the image, as we’ll see in more detail later.
Before we start writing the Dockerfile, we’ll set up our workspace. We’ll create a directory called my_build in our home directory, use it as our working directory, and place the Dockerfile in there:
mkdir ~/my_build
cd ~/my_build
touch Dockerfile
Now we’re ready to start building the image.
Selecting the base image
Most of the time when creating an image, you’ll use a starting point — that is, another image. This can be an official Ubuntu, MySQL, WordPress, or any other image available from the Docker Hub. You can also use an image you created yourself previously.
Note: You can create your own base image with your own core tools and directory structure, using Docker’s reserved, minimal image, called scratch
. It’s a process I won’t cover here, but you can refer to the Docker site’s guide on creating a base image.
For example, if you want to start off with a minimal Debian distribution, you’ll add the following content to the Dockerfile
:
# set the base image
FROM debian
FROM must be the first instruction you use when writing a Dockerfile. Notice that you can also use a specific version of your base image, by appending : and the version name (tag) at the end of the image name. For example:
# set the base image
FROM debian:sid
In the code above, we’re using the “sid” Debian release (the unstable distribution). This will also be relevant when you want a specific version of a Ruby or Python interpreter, a particular MySQL version, or what have you, when you use an official base image for any of these tools. For now, we’ll stick to the default (stable) debian image for this guide.
Specifying a maintainer and adding metadata
Optionally, you can specify who the MAINTAINER is, replacing Lucero del Alba with your name, or with the person or team responsible for the build:
# author
MAINTAINER Lucero del Alba
It isn’t necessary, but we may also add some metadata using the LABEL
instruction, and this information will become available later when using the docker inspect
command to examine the image:
# extra metadata
LABEL version="1.0"
LABEL description="First image with Dockerfile."
For more on this feature, refer to Docker object labels.
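Once the image is built (we’ll get to docker build below), you can read those labels back with docker inspect. A quick sketch, assuming the image is tagged my_image as in the rest of this guide:
docker inspect --format '{{ json .Config.Labels }}' my_image
This should print something like {"description":"First image with Dockerfile.","version":"1.0"}.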
Making your own distro
At this point, we’re going to select some tools and libraries to be included in our image, so that our container has everything it needs for what we intend it to do. At the end of this tutorial, we’ll be doing something that’s very close to actually building a Linux distribution.
Some containers, such as one running a PostgreSQL database, are meant to run in the background. But often we need a console to perform some operations on the container, so we’re likely to need some extra tools, because the base image will bundle just a minimal set of GNU tools.
Dealing with cache issues
It’s almost guaranteed that you’ll experience cache issues when trying to install additional packages on your image. This is because the package lists bundled with the base image are either stripped out or out of date, while the live repositories you’re pulling packages from change constantly.
In Debian-based distributions, you can handle this by adding the following commands before installing new packages:
# update sources list
RUN apt-get clean
RUN apt-get update
Installing basic tools
Code editors, locales, tools such as git
or tmux
— this is the time to install everything you’re going to need later, so that they’re bundled in the image.
We’ll install one per line:
# install basic apps, one per line for better caching
RUN apt-get install -qy git
RUN apt-get install -qy locales
RUN apt-get install -qy nano
RUN apt-get install -qy tmux
RUN apt-get install -qy wget
We could install all of them on a single line, but then if we later wanted to add or remove a package, the whole step would have to be re-run. So the approach here is to install one package per line, so we can benefit from Docker’s layer caching.
Also, keep it tight. You don’t want to install tools “just in case”, as this may increase the build time and the image size.
Installing runtime libraries for your app
We’ll be shipping our app in this image as well. Do you need a specific version of PHP, Ruby or Python, together with certain modules? Now’s the time to deliver all of the programs and runtimes our app is going to need.
Be as specific as you like, as this container is intended to run only your app:
# install app runtimes and modules
RUN apt-get install -qy python3
RUN apt-get install -qy python3-psycopg2
RUN apt-get install -qy python3-pystache
RUN apt-get install -qy python3-yaml
For this example, we’ll install Python 3 with the packages Psycopg 2 (to connect to PostgreSQL databases), the Mustache for Python module, and the YAML module. (You’ll naturally install the specific dependencies you need when doing your own Dockerfile.)
Compiling and downloading packages
It’s also possible that your distribution won’t have a package for a certain module or program that you need. But there’s no need to install it by hand in a running container! Instead, you can use the RUN instruction (one per line) to batch the process of downloading, compiling and setting up whichever libraries your application will need.
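As a sketch only (the URL, library name and version below are made up, and this assumes build tools such as gcc and make have already been installed, for example with RUN apt-get install -qy build-essential), such a step could look like this:
# download, build and install a library from source (hypothetical example)
RUN wget -q https://example.com/somelib-1.0.tar.gz && \
    tar -xzf somelib-1.0.tar.gz && \
    cd somelib-1.0 && \
    ./configure && make && make install && \
    cd .. && rm -rf somelib-1.0 somelib-1.0.tar.gz
Chaining the commands with && keeps the whole build in a single layer, and removing the sources at the end keeps them from bloating the image.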
You can even write a script on a separate file, add this file to the build and run it, as we’ll see later in the “Shipping Your Own App” section.
Cleaning up
To keep your image tidy and as small as possible, it’s also a good idea to do a cleanup at the end of the installation sequence:
# cleanup
RUN apt-get -qy autoremove
Again, notice we’re using apt-get
because we chose Debian, but use the appropriate command for the distribution of your base image.
Shipping your own app
The whole point of building this environment is so that you can deliver your application smoothly and ready to run. To add files, directories, and even the content of remote URLs to the image, we’ll use the ADD
instruction.
However, before adding files, we need to put them in the appropriate context. To make things easier, we’ll just locate everything in the aforementioned my_build
directory, alongside the Dockerfile
itself.
Let’s say that, with the app and everything we want to put into the image, we have the following files in ~/my_build
(where app.py
and lib.py
are inside the sub-directory app/
):
.bashrc
.profile
app/app.py
app/lib.py
Dockerfile
We’ll add .bashrc
and .profile
scripts to the /root
directory in the container so that they execute whenever we launch a shell on the container, and we’ll copy the contents of app/
to the /app/
directory in the container.
We add the following instructions:
# add scripts to the container
ADD .bashrc /root/.bashrc
ADD .profile /root/.profile
# add the application to the container
ADD app /app
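A side note on ADD versus COPY: for plain local files and directories like these, COPY does the same job, while ADD additionally handles remote URLs and unpacks local tar archives. If you don’t need those extras, the equivalent with COPY would be:
# same result with COPY
COPY .bashrc /root/.bashrc
COPY .profile /root/.profile
COPY app /app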
Setting your environment
Finally, we’ll set some environment variables that we’ll need at a system and application level.
Many of you will do just fine with the default Debian charset, but since we’re aiming at an international audience, let’s see how to have a UTF-8 terminal. We previously installed the locales
package, so all we have to do now is generate the charsets and set the appropriate Linux environment:
# locales to UTF-8
RUN locale-gen C.UTF-8 && /usr/sbin/update-locale LANG=C.UTF-8
ENV LC_ALL C.UTF-8
You may also need to set some environment variables for your application, for exchanging passwords and paths. The Dockerfile provides the ENV
instruction for doing precisely this:
# app environment
ENV PYTHONIOENCODING UTF-8
ENV PYTHONPATH /app/
Notice that you can also pass environment variables from the command line when launching the container, which may be convenient for sharing some sensitive information such as passwords.
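For instance, you could pass a variable at launch time with the -e flag of docker run (the variable name and value here are just placeholders, and we’ll look at docker run in more detail below):
docker run -ti -e DB_PASSWORD=s3cret my_image /bin/bash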
The Complete Dockerfile
Naturally, you’ll have to adapt the Dockerfile to your needs, but hopefully you get the idea of the possibilities.
Here’s the full file:
# set the base image
FROM debian
# author
MAINTAINER Lucero del Alba
# extra metadata
LABEL version="1.0"
LABEL description="First image with Dockerfile."
# update sources list
RUN apt-get clean
RUN apt-get update
# install basic apps, one per line for better caching
RUN apt-get install -qy git
RUN apt-get install -qy locales
RUN apt-get install -qy nano
RUN apt-get install -qy tmux
RUN apt-get install -qy wget
# install app runtimes and modules
RUN apt-get install -qy python3
RUN apt-get install -qy python3-psycopg2
RUN apt-get install -qy python3-pystache
RUN apt-get install -qy python3-yaml
# cleanup
RUN apt-get -qy autoremove
# add scripts to the container
ADD .bashrc /root/.bashrc
ADD .profile /root/.profile
# add the application to the container
ADD app /app
# locales to UTF-8
RUN locale-gen C.UTF-8 && /usr/sbin/update-locale LANG=C.UTF-8
ENV LC_ALL C.UTF-8
# app environment
ENV PYTHONIOENCODING UTF-8
ENV PYTHONPATH /app/
Building the image
From inside the my_build
directory, we’ll use the docker build
command, passing the -t
flag to “tag” the new image with a name, which in this case will be my_image
. The .
indicates that the Dockerfile
is in the current directory, along with so-called “context” — that is, the rest of the files that may be in that location:
cd ~/my_build
docker build -t my_image .
That will generate a long output where every “step” is an instruction in our Dockerfile. This is a truncated output:
Sending build context to Docker daemon 5.12 kB
Step 1 : FROM debian
---> 7b0a06c805e8
Step 2 : MAINTAINER Lucero del Alba
---> Running in d37e46e5455d
---> 2d76561de558
Removing intermediate container d37e46e5455d
Step 3 : LABEL version "1.0"
---> Running in 904dde1b4cd7
---> a74b7a492aaa
Removing intermediate container 904dde1b4cd7
Step 4 : LABEL description "First image with Dockerfile."
---> Running in 9aaef0353256
---> 027d8c10e966
Removing intermediate container 9aaef0353256
Step 5 : RUN apt-get clean
---> Running in bc9ed85dda16
---> a7407036e74a
Removing intermediate container bc9ed85dda16
Step 6 : RUN apt-get update
---> Running in 265e757a7563
Get:1 http://security.debian.org jessie/updates InRelease [63.1 kB]
Ign http://deb.debian.org jessie InRelease
Get:2 http://deb.debian.org jessie-updates InRelease [145 kB]
Get:3 http://deb.debian.org jessie Release.gpg [2373 B]
Get:4 http://deb.debian.org jessie Release [148 kB]
Get:5 http://security.debian.org jessie/updates/main amd64 Packages [402 kB]
Get:6 http://deb.debian.org jessie-updates/main amd64 Packages [17.6 kB]
Get:7 http://deb.debian.org jessie/main amd64 Packages [9064 kB]
Fetched 9843 kB in 10s (944 kB/s)
Reading package lists...
---> 93fa0a42fcdc
Removing intermediate container 265e757a7563
Step 7 : RUN apt-get install -qy git
---> Running in c9b93cecd953
(...)
Listing images
We can list our images with the docker images
command:
docker images
This will output our newly created my_image
alongside other base images we have downloaded:
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
my_image            latest              e71dc183df2b        8 seconds ago       305.6 MB
debian              latest              7b0a06c805e8        2 weeks ago         123 MB
debian              sid                 c1857cb435d7        3 weeks ago         97.77 MB
… and there it is, our image is ready to ship and run!
Launching a container
Finally, to launch an interactive terminal of our newly created image, we’ll use the docker run
command:
docker run -ti my_image /bin/bash
What To Do Next
I haven’t covered all of the possibilities of the Dockerfile. In particular, I haven’t reviewed how to EXPOSE ports so that you can run services and even link containers to one another; how to HEALTHCHECK containers to verify they’re still working; or how to specify a VOLUME to store and recover data from the host machine … among other useful features.
We’ll cover those in future articles. For now, you may like to check out the following resources.
From SitePoint:
- Understanding Docker, Containers and Safer Software Delivery
- The Docker subchannel
- All Docker-related articles
FAQs on How to Build an Image with the Dockerfile
What is the Importance of Using Dockerfile in Building Docker Images?
A Dockerfile is a text document that contains all the commands a user could call on the command line to assemble an image. Using a Dockerfile simplifies the process of building images in Docker. It allows you to automate the process, making it more efficient and less prone to human error. A Dockerfile also provides clear, version-controlled documentation of how your image is built, which makes it easier for other developers to understand your work and use or modify it.
How Can I Optimize the Build Process Using Dockerfile?
Dockerfile provides several ways to optimize the build process. One of the most effective ways is to use a multi-stage build. This allows you to use multiple FROM statements in your Dockerfile. Each FROM instruction can use a different base, and each of them begins a new stage of the build. You can selectively copy artifacts from one stage to another, leaving behind everything you don’t want in the final image.
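Note that multi-stage builds arrived in Docker 17.05, later than the 1.12 release mentioned earlier in this guide. As a rough sketch only, with a hypothetical make-based project and made-up paths, a two-stage Dockerfile could look like this:
# first stage: install build tools and compile the program
FROM debian AS builder
RUN apt-get update && apt-get install -qy gcc make
COPY src/ /src/
RUN make -C /src    # assume this produces /src/myprog

# second stage: start from a clean base and copy in only the built artifact
FROM debian
COPY --from=builder /src/myprog /usr/local/bin/myprog
CMD ["myprog"]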
What are the Best Practices for Writing Dockerfiles?
There are several best practices for writing Dockerfiles. First, you should avoid installing unnecessary packages to keep the image size small. Second, use multi-stage builds to optimize the build process. Third, each Dockerfile should represent a single application; if you have multiple applications, use multiple Dockerfiles. Lastly, use a .dockerignore file to exclude files and directories that shouldn’t be included in the image.
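As an illustration, a minimal .dockerignore placed next to the Dockerfile might look like this (the entries are just typical candidates):
.git
__pycache__/
*.pyc
*.log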
How Can I Debug a Dockerfile?
Debugging a Dockerfile usually starts with building the image: if the build fails, Docker reports which step failed and why, which helps you identify the problem. You can also add temporary RUN instructions that print diagnostic output, or start a shell in the image (or in an intermediate layer) and try the failing commands by hand.
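With the classic builder used in this guide, each successful step prints an intermediate image ID (the ---> lines in the sample build output above), so one common approach is to start a shell from the last successful layer and rerun the failing command manually:
docker build -t my_image .
# suppose the build fails right after the layer with ID a7407036e74a (taken from the output above)
docker run -ti a7407036e74a /bin/bash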
Can I Use Environment Variables in Dockerfile?
Yes, you can use environment variables in a Dockerfile. The ENV instruction sets an environment variable to a given value. This value is available to all subsequent instructions in the build stage, and can also be substituted inline in many of them.
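For instance (APP_HOME is just an arbitrary variable name used for illustration):
ENV APP_HOME /app
WORKDIR $APP_HOME
COPY app $APP_HOME
Here both WORKDIR and COPY pick up the value set by ENV.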
How Can I Copy Files from the Host to the Docker Image?
You can use the COPY instruction to copy new files from your host to the Docker image. The files are copied from the source on the host to the destination in the Docker image.
How Can I Expose Ports in a Docker Image?
You can use the EXPOSE instruction to inform Docker that the container listens on the specified network ports at runtime. However, this doesn’t actually publish the port. To publish it, you need to use the -p flag with the docker run command.
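For example, assuming an app listening on port 8000 (the port number is arbitrary here), you’d add this to the Dockerfile:
EXPOSE 8000
and then publish it on the host at run time:
docker run -ti -p 8000:8000 my_image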
How Can I Set the Working Directory in a Docker Image?
You can use the WORKDIR instruction to set the working directory for any RUN, CMD, ENTRYPOINT, COPY, and ADD instructions that follow it in the Dockerfile.
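For example, a small sketch building on the /app layout used earlier in this guide:
WORKDIR /app
COPY app .
CMD ["python3", "app.py"]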
How Can I Run a Command in a Docker Image?
You can use the RUN instruction to run a command in a Docker image. This will execute any commands in a new layer on top of the current image and commit the results.
How Can I Specify a Default Command for a Docker Image?
You can use the CMD instruction to provide defaults for an executing container. These can include an executable, or they can omit the executable, in which case you must specify an ENTRYPOINT instruction.
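For example, a common pattern (sketched here with the /app layout from earlier) fixes the interpreter with ENTRYPOINT and leaves the script as an overridable default in CMD:
ENTRYPOINT ["python3"]
CMD ["/app/app.py"]
Running the container with no extra arguments executes python3 /app/app.py; passing another script path on the docker run command line replaces only the CMD part.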
Lucero is a programmer and entrepreneur with a feel for Python, data science and DevOps. Raised in Buenos Aires, Argentina, he's a musician who loves languages (those you use to talk to people) and dancing.