Gitlab-ci pipelines can run in docker containers. Although you can start from a common docker image and use before_script to install the utilities your stage needs, it is better to create your own docker image. Indeed, it will make your CICD:
- clearer: no need to skip the first hundred lines of installation logs to get to the actual build logs
- faster: no need to spend time installing packages at every build
- more resilient: no need to worry about failures of the repository servers hosting the packages to be installed
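For comparison, here is a sketch of the before_script approach in gitlab-ci.yml; the job name, package list and build script are hypothetical examples:

```yaml
build:
  image: python:3.9-alpine
  before_script:
    # these installations run at every pipeline execution,
    # slowing the build and cluttering the logs
    - apk add --no-cache git jq curl
    - pip3 install awscli databricks-cli
  script:
    - ./build.sh
```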
In this post, I will show you how to easily create a docker image with all the needed utilities and push it to dockerhub in order to use it in gitlab-ci. As an example, I will create a docker image with the following utilities:
- cloud utilities: awscli and databricks-cli
- other utilities: git, jq, curl, envsubst
First, I need a base docker image as a starting point for my custom docker image.
Base docker image
As base docker image, I use the python image based on alpine, because it is a small docker image. If you’re not familiar with this docker image, it is a good idea to explore it to see what utilities are already in it. You can do so by retrieving the docker image, launching it and connecting to it with a shell:
```shell
docker pull python:3.9-alpine
docker run -d -t --rm --name test python:3.9-alpine
docker exec -i -t test sh
```
Let’s look at those commands. First, docker pull retrieves the docker image from dockerhub.
Next, docker run launches a container from the retrieved docker image. The options of the docker run command are:
- -d for detach: the container runs in the background
- -t to allocate a pseudo-TTY; here it prevents the container from stopping immediately after being set up
- --rm to destroy the container once we stop it. It is useful when you don’t expect to start this container again, hence for testing
- --name test to name your container, as it is easier to use a human-readable name instead of a container hash identifier
Finally, the docker exec command connects you to your container. This command takes two arguments: the name/id of the container, and the command to execute in the container. To get a shell, use sh, as bash is not installed on alpine docker images. The options of docker exec are:
- -i for interactive, to be able to interact in your terminal
- -t to allocate a pseudo-TTY
Those two options combined with the sh command create an interactive shell.
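Without -i and -t, docker exec simply runs one command in the container and prints its output. For instance, assuming the test container from above is still running, you can check the python version it ships:

```shell
# run a single non-interactive command inside the container
docker exec test python3 --version
```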
Once you are finished, you can quit your container with exit and then stop it:
```shell
docker stop test
```
As the option --rm was set when we started the container, the container is destroyed and all the commands you ran in it are forgotten.
I now have my base docker image, time to customize it.
Customize docker image
To transform my base docker image into my desired docker image, with all the utilities needed for my CICD, I need to install them. To do so, I could launch a container from the base docker image, connect to it as explained above, install everything I need and create a new docker image from my running container using docker commit. However, to create a reproducible docker image, it is better to use a Dockerfile.
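For reference, the docker commit route would look like the sketch below; it works, but the resulting image cannot be rebuilt from source. The image tag used here is a hypothetical example:

```shell
# start a container from the base image and install a package in it
docker run -d -t --rm --name test python:3.9-alpine
docker exec test apk add --no-cache git
# snapshot the running container as a new image, then stop the container
docker commit test mydockerusername/cicd-aws-databricks:manual
docker stop test
```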
First, I create a file named Dockerfile in an empty directory:
```shell
mkdir cicd-aws-databricks
touch cicd-aws-databricks/Dockerfile
```
It is very important to put the Dockerfile in a dedicated directory, as building a docker image from a Dockerfile requires passing a directory containing a Dockerfile as argument. I put the following lines in my Dockerfile:
```dockerfile
FROM python:3.9-alpine
RUN apk add --no-cache --update git jq curl groff && \
    pip3 install --no-cache-dir awscli envsubst databricks-cli
```
I set the base docker image using the FROM instruction and then I execute commands using the RUN instruction. You can notice two things: first, instead of having two RUN instructions, one for apk add and another for pip3 install, I execute both in a single RUN instruction. It is actually a docker best practice, in order to avoid creating too many layers. Second, I set the "no cache" option for both the apk add and pip3 install commands. Indeed, as I want to create the smallest possible docker image, I don’t keep useless cached packages.
Next, I can build my docker image from my Dockerfile with docker build. I pass the directory containing my Dockerfile as argument:
```shell
docker build --tag mydockerusername/cicd-aws-databricks:latest cicd-aws-databricks
```
I use the --tag option to set my docker image’s name. As I will push this docker image to the repository mydockerusername/cicd-aws-databricks with the version latest, I tag it mydockerusername/cicd-aws-databricks:latest.
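To check that the single RUN instruction and the no-cache options paid off, you can inspect the size and the layers of the freshly built image:

```shell
# overall image size
docker images mydockerusername/cicd-aws-databricks
# size contributed by each layer
docker history mydockerusername/cicd-aws-databricks:latest
```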
Now I have my docker image on my local machine, it is time to share it with the world.
Push docker image on dockerhub
I’ve already created an account on dockerhub. To push my docker image, I need to add a repository to this account. To do so, I go to repositories and click on "Create Repository". Then I name my repository cicd-aws-databricks. The complete name of the repository will be mydockerusername/cicd-aws-databricks.
Then I push my docker image to my repository:
```shell
docker login --username=mydockerusername
docker push mydockerusername/cicd-aws-databricks:latest
```
And it’s done, my docker image is uploaded on dockerhub: https://hub.docker.com/r/vincentdoba/cicd-aws-databricks.
Now I can use this docker image in gitlab-ci, by adding the following line at the top of my gitlab-ci.yml:
```yaml
image: mydockerusername/cicd-aws-databricks:latest
```
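A minimal gitlab-ci.yml using the image could then look like this; the stage name and script commands are hypothetical examples:

```yaml
image: mydockerusername/cicd-aws-databricks:latest

deploy:
  stage: deploy
  script:
    # the utilities baked into the image are directly available,
    # no before_script installation needed
    - aws --version
    - databricks --version
    - jq --version
```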