
GitLab Runner Installation

This page is a tutorial for installing GitLab Runner on a node of the Monolithe cluster.

Docker Installation

Ubuntu

# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository to Apt sources:
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update

# Install the latest version
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
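
On Ubuntu, the package starts and enables the Docker service automatically; this can be checked with:

sudo systemctl status docker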

Tested on Ubuntu 24.04 LTS.

Fedora

# uninstall system packages related to docker
sudo dnf remove docker docker-client docker-client-latest docker-common docker-latest docker-latest-logrotate docker-logrotate docker-selinux docker-engine-selinux docker-engine

# setup the repository
sudo dnf -y install dnf-plugins-core
sudo dnf-3 config-manager --add-repo https://download.docker.com/linux/fedora/docker-ce.repo

# install docker
sudo dnf install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

# start docker
sudo systemctl start docker
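
# optionally, enable the service so that Docker also starts at boot
sudo systemctl enable docker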

Tested on Fedora 39 and 40.

Check that it works:

sudo docker run hello-world
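
If Docker is correctly installed, the output should contain the following lines:

Hello from Docker!
This message shows that your installation appears to be working correctly.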

GitLab Runner Installation

Ubuntu

# Add the official GitLab repository:
curl -L "https://packages.gitlab.com/install/repositories/runner/gitlab-runner/script.deb.sh" | sudo bash

# Install the latest version of GitLab Runner
sudo apt install gitlab-runner

Warning

For now, this does not work on Ubuntu 24.10 and fails with the following error:

Installing /etc/apt/sources.list.d/runner_gitlab-runner.list...curl: (22) The requested URL returned error: 404

In that case, download the right package manually:

# Replace ${arch} with any of the supported architectures, e.g. amd64, arm, arm64
# A full list of architectures can be found here https://s3.dualstack.us-east-1.amazonaws.com/gitlab-runner-downloads/latest/index.html
curl -LJO "https://s3.dualstack.us-east-1.amazonaws.com/gitlab-runner-downloads/latest/deb/gitlab-runner_${arch}.deb"

Install it:

# Replace ${arch} with any of the supported architectures, e.g. amd64, arm, arm64
sudo dpkg -i gitlab-runner_${arch}.deb
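
For example, on an amd64 machine, the two steps above become:

curl -LJO "https://s3.dualstack.us-east-1.amazonaws.com/gitlab-runner-downloads/latest/deb/gitlab-runner_amd64.deb"
sudo dpkg -i gitlab-runner_amd64.deb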

Fedora

# Add the official GitLab repository:
curl -L "https://packages.gitlab.com/install/repositories/runner/gitlab-runner/script.rpm.sh" | sudo bash 

# Install the latest version of GitLab Runner
sudo dnf install gitlab-runner

Warning

For now, this does not work on Asahi Linux (Fedora 39 and 40) and fails with the following error:

Downloading repository file: https://packages.gitlab.com/install/repositories/runner/gitlab-runner/config_file.repo?os=fedora-asahi-remix&dist=39&source=script
curl: (22) The requested URL returned error: 404

In that case, download the right package manually:

# Replace ${arch} with any of the supported architectures, e.g. amd64, arm, arm64
# A full list of architectures can be found here https://s3.dualstack.us-east-1.amazonaws.com/gitlab-runner-downloads/latest/index.html
curl -LJO "https://s3.dualstack.us-east-1.amazonaws.com/gitlab-runner-downloads/latest/rpm/gitlab-runner_${arch}.rpm"

Install it:

# Replace ${arch} with any of the supported architectures, e.g. amd64, arm, arm64
sudo rpm -i gitlab-runner_${arch}.rpm
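
For example, on an Apple Silicon machine running Asahi Linux (arm64), the two steps above become:

curl -LJO "https://s3.dualstack.us-east-1.amazonaws.com/gitlab-runner-downloads/latest/rpm/gitlab-runner_arm64.rpm"
sudo rpm -i gitlab-runner_arm64.rpm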

GitLab Runner Configuration

First you need to create a new runner on gitlab.lip6.fr. To do this, go to a project, open the CI/CD Settings, and click on the New project runner blue button.

Here are the tags related to the architectures:

ARM: arm64, fma, neon [, armhf] [, armie]

Info

The armie tag is useful because only some ARM machines support the "Arm Instruction Emulator" (ArmIE); for instance, ArmIE does not work on the M1 Ultra.

x86: avx512bw, avx512f, avx2, avx, sse4.2, sse4.1, ssse3, sse3, sse2, x86_64

Tags that are not architecture dependent:

docker, powerful, linux

In Runner description, put linux-alsoc-hostname, where hostname is the name of the machine.

Once you have picked the right tags and set a description, you can continue the registration procedure by clicking on the Create runner blue button.

Then you will have to paste a command that looks like this on the node:

sudo gitlab-runner register --url https://gitlab.lip6.fr --token glrt-a-token

Danger

sudo is very important in the previous command; otherwise the runner will be attached to the current $USER.

Enter the GitLab instance URL:

https://gitlab.lip6.fr

Enter a name for the runner (if it is a Linux machine from the ALSOC team, replace hostname with the hostname of the machine where the runner is running):

linux-alsoc-hostname

Enter an executor:

docker

Enter the default Docker image:

ubuntu:24.04
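
Alternatively, the whole registration can be done in a single non-interactive command; here is a sketch with the same answers (glrt-a-token stands for the real token, see gitlab-runner register --help for the exact flags of your version):

sudo gitlab-runner register --non-interactive \
  --url "https://gitlab.lip6.fr" \
  --token "glrt-a-token" \
  --description "linux-alsoc-hostname" \
  --executor "docker" \
  --docker-image "ubuntu:24.04"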

To check if it works:

sudo gitlab-runner run
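
Note that the deb/rpm packages also install a gitlab-runner systemd service, so for normal operation you can stop the foreground process and rely on the service instead; the registration itself can be checked with:

sudo systemctl status gitlab-runner
sudo gitlab-runner verify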

You are good to go!

Prevent GitLab Runner CI and SLURM jobs from running at the same time on a node

Info

Note that, in this section, the proposed solution is more a workaround than a perfect solution. It assumes that GitLab Runner has been installed manually on each node (which is not the common way to use compute nodes with SLURM) and that you have root privileges on each compute node. However, it is relevant in the Monolithe cluster, where each node has a different hardware and software configuration.

A gitlab-nfs user has been created to submit SLURM jobs before starting the CI. It is a standard account, but its password has been disabled (with sudo passwd -l gitlab-nfs on the front node). Public and private keys have been generated in the /nfs/users/gitlab-nfs/.ssh/ folder (id_rsa and id_rsa.pub files).
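
For reference, here is a minimal sketch of how such an account and key pair can be created on the front node (the authorized_keys step is an assumption, needed because the scripts below log in with this key):

(front): sudo useradd -m -d /nfs/users/gitlab-nfs gitlab-nfs
(front): sudo passwd -l gitlab-nfs
(front): sudo -u gitlab-nfs mkdir -p -m 700 /nfs/users/gitlab-nfs/.ssh
(front): sudo -u gitlab-nfs ssh-keygen -t rsa -N "" -f /nfs/users/gitlab-nfs/.ssh/id_rsa
(front): sudo -u gitlab-nfs sh -c 'cat /nfs/users/gitlab-nfs/.ssh/id_rsa.pub >> /nfs/users/gitlab-nfs/.ssh/authorized_keys'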

First, these keys need to be copied to the node where GitLab Runner is installed:

(node):  sudo mkdir /opt/gitlab-runner
(node):  sudo mkdir /opt/gitlab-runner/ssh_keys
(node):  sudo chmod 700 /opt/gitlab-runner/ssh_keys
(front): sudo chmod o+r /nfs/users/gitlab-nfs/.ssh/id_rsa # temporarily let the node read the private key over NFS
(node):  sudo cp /nfs/users/gitlab-nfs/.ssh/id_rsa.pub /opt/gitlab-runner/ssh_keys
(node):  sudo cp /nfs/users/gitlab-nfs/.ssh/id_rsa /opt/gitlab-runner/ssh_keys
(node):  sudo chmod o-r /opt/gitlab-runner/ssh_keys/id_rsa
(front): sudo chmod o-r /nfs/users/gitlab-nfs/.ssh/id_rsa # restore restrictive permissions on the NFS key
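
Before going further, it is worth checking that the copied key actually allows a connection to the front node (with the same ssh options as the scripts below):

(node):  sudo ssh -o "IdentitiesOnly=yes" -i /opt/gitlab-runner/ssh_keys/id_rsa gitlab-nfs@front.mono.proj.lip6.fr hostname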

Then, edit the config.toml file:

(node): sudo vim /etc/gitlab-runner/config.toml

Keys and scripts need to be mounted as volumes in the runner's Docker containers. In the [runners.docker] section, edit or add a volumes entry as follows:

  volumes = ["/opt/gitlab-runner/ssh_keys:/opt/ssh_keys:ro", "/nfs/scripts/gitlab-runner:/opt/scripts:ro", "/cache"]

And, after the executor = "docker" line in the [[runners]] section, add the following lines:

  pre_build_script = '''
    bash /opt/scripts/pre_build_script.sh
  '''
  post_build_script = '''
    bash /opt/scripts/post_build_script.sh
  '''
  environment = ["SLURM_PARTITION=<the_slurm_partition_here>"]

Replace <the_slurm_partition_here> with the real SLURM partition of the current node.
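
To make sure the new settings are taken into account, restart the GitLab Runner service:

(node): sudo systemctl restart gitlab-runner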

Voilà, this is done, you're good to go!

This is not a perfect solution: GitLab Runner jobs will start even if a SLURM job is currently running on the node. In this case, however, the GitLab Runner job will loop (passive waiting) until the SLURM job ends.

Warning

pre_build_script.sh and post_build_script.sh require ssh to be installed in the Docker image to work. If the image is Debian-like, the scripts install it automatically through apt.

Danger

When a GitLab job is cancelled from the GitLab web interface, the post_build_script is NOT called. The SLURM job corresponding to the GitLab job will then stay active for CI_JOB_TIMEOUT seconds (generally one hour). Meanwhile, the node will be unavailable for regular SLURM jobs. This is not a problem for new GitLab Runner jobs because the pre_build_script.sh script cancels all its previous SLURM jobs (only on the corresponding partition) before submitting a new one.
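
If a node stays blocked this way, the stale SLURM job can be cancelled manually from the front node, using the same scancel call as the scripts:

(front): sudo scancel -p <the_slurm_partition_here> -u gitlab-nfs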

Source Codes

Here are the contents of the pre_build_script.sh and post_build_script.sh scripts located in /nfs/scripts/gitlab-runner:

pre_build_script.sh
#!/bin/bash
set -x

# install ssh client if not found and if OS is Debian-like
if ! [ -x "$(command -v ssh)" ]; then
  echo 'Warning: ssh client not found.' >&2
  if [ -x "$(command -v apt)" ]; then
    apt update
    apt install -y openssh-client
  fi
fi

# just print environment variables and set `SLURM_JOB_TIMEOUT_MIN` variable
echo "CI_JOB_TIMEOUT=${CI_JOB_TIMEOUT}" # used to determine the maximum time (in seconds) of the SLURM job
echo "CI_JOB_ID=${CI_JOB_ID}" # used to set the SLURM job name
SLURM_JOB_TIMEOUT_MIN=$((CI_JOB_TIMEOUT/60))
echo "SLURM_JOB_TIMEOUT_MIN=${SLURM_JOB_TIMEOUT_MIN}" # used to determine the maximum time (in minutes) of the SLURM job

# for the following lines, commands are executed on the front node (through an ssh connection); here are the steps:
#   1. cancel other SLURM jobs from the same partition and user
#   2. submit the SLURM job (non-blocking)
#   3. if the SLURM job is not RUNNING just after, print a message
#   4. loop while the previously submitted SLURM job is not RUNNING (passive waiting)
ssh -o "IdentitiesOnly=yes" -o "StrictHostKeyChecking=accept-new" -i /opt/ssh_keys/id_rsa gitlab-nfs@front.mono.proj.lip6.fr /bin/bash << EOF
scancel -p ${SLURM_PARTITION} -u gitlab-nfs
sbatch -p ${SLURM_PARTITION} -J ${CI_JOB_ID}_job --exclusive --nodes=1 --time=${SLURM_JOB_TIMEOUT_MIN} --wrap="sleep ${CI_JOB_TIMEOUT}"
sleep 3
state=\$(squeue -n ${CI_JOB_ID}_job -p ${SLURM_PARTITION} -l | tail -n 1 | awk '{print \$5}')
if [[ "\$state" != "RUNNING" ]]; then echo "Waiting for SLURM job(s) on the same node to be complete..."; fi
while [[ "\$state" != "RUNNING" ]]; do sleep 30; state=\$(squeue -n ${CI_JOB_ID}_job -p ${SLURM_PARTITION} -l | tail -n 1 | awk '{print \$5}'); done
EOF

post_build_script.sh
#!/bin/bash
set -x

# install ssh client if not found and if OS is Debian-like
if ! [ -x "$(command -v ssh)" ]; then
  echo 'Warning: ssh client not found.' >&2
  if [ -x "$(command -v apt)" ]; then
    apt update
    apt install -y openssh-client
  fi
fi

# connect to the Monolithe frontend to cancel the CI job in SLURM
ssh -o "IdentitiesOnly=yes" -o "StrictHostKeyChecking=accept-new" -i /opt/ssh_keys/id_rsa gitlab-nfs@front.mono.proj.lip6.fr /bin/bash << EOF
scancel -p ${SLURM_PARTITION} -n ${CI_JOB_ID}_job
EOF

List of Installed Nodes

  • brubeck.soc.lip6.fr
  • front.mono.proj.lip6.fr
  • xu4.mono.proj.lip6.fr
  • rpi4.mono.proj.lip6.fr
  • m1u.mono.proj.lip6.fr (manual install)
  • opi5.mono.proj.lip6.fr
  • em780.mono.proj.lip6.fr
  • x7ti.mono.proj.lip6.fr (manual install)