Week 7 GitLab CI/CD Tutorial

In this week’s lab, you will learn about:

Deploying a Continuous Integration service in GitLab
Annotating tests written in JUnit

Prerequisites

You have installed Java (preferably with IntelliJ as the IDE) and have some experience programming in it.

Overview

This intro is inspired by MIT’s metaprogramming tutorial, which is under the CC BY 4.0 license [1].

Continuous integration, or CI, is an umbrella term for “event-triggered actions that run whenever a particular event happens (usually on the server where the code is stored)”. Here, an action means that some script or sequence of programs is invoked. Many companies provide various types of CI, often for free for open-source projects.

When talking about build processes on code changes, there are many considerations, such as dependency management, setting up build systems locally and remotely, and testing. For larger builds, we might even need a hierarchy of pipelines in which a parent pipeline controls its children. In large teams, this can become very complicated and error-prone. For every PR, post-commit event, or commit with a specific message, you might want to update the documentation to a new version (for which only minimal tests need to run), upload a Docker image or a compiled version of your application to a remote machine, release a major version of the code to production, run your test suite, check for formatting errors, run benchmarks, and many other things.

In fact, with the recent trend towards microservices, a change is less likely to break anything inside the developer’s local environment; it is more likely to break an application on the other side of a network call that depends on this microservice.

[Figure: gitlab-eg]

To solve this (especially in the context of teams), we use a cloud-based build system called continuous integration (CI).

Some of the big ones are Travis CI, Azure Pipelines, GitHub Actions, and GitLab Pipelines. They all work in roughly the same way: you add a file to your repository that describes what should happen when various things happen to that repository. By far the most common rule is “when someone pushes code, run the test suite” (which is what Activity 1 of this lab is about). When the event triggers, the CI provider spins up one or more virtual machines, runs the commands in your “recipe”, and then usually records the results somewhere. You might set it up so that you are notified if the test suite stops passing, or so that a little badge appears on your repository as long as the tests pass. Surfacing test results in this way is called test annotation.

As an example of a CI system, the class website is set up using a GitLab CI pipeline, which itself calls various scripts to automate building, testing, running, and deploying to the cloud. It runs the Jekyll blog software on every push to master and makes the built site available on a particular GitHub domain. This makes it trivial for us to update the website! We just make our changes locally, commit them with git, and then push. CI takes care of the rest. An example is given below:

[Figure: gitlab-eg]

Benefits and considerations

The fundamental goal of CI is to automatically catch problematic changes as early as possible, resulting in a fast-feedback loop.

Some of the considerations are discussed in a StackExchange question.

Aside: CI at SE@Google

Continuous integration: the continuous assembling and testing of our entire complex and rapidly evolving ecosystem.

[Figure: ci-google]

Based on the diagram above (the life of a code change, including CI/CD), what considerations would Google take into account for an efficient cycle of new releases of a project?

Activity 1

Fork and then clone the repository for the lab from your workspace. You can see that it is a solution to one of the problems in Codeforces, an online platform for competitive programming.

The source code includes the JUnit 4 libraries for testing, and we will set up the tests to run on every subsequent commit to master. The project uses Gradle as its build system; there are others, such as Maven, but the build commands would be similar. For the purposes of the lab, the project has been set up such that the corresponding annotated test pipeline should consist of two stages:

gradle-build - Compiles the source code into application bytecode. This is run via the gradle assemble shell command.
gradle-test - Runs all the test cases written for the project. This is run via the gradle test shell command.

The pipeline for GitLab is written in the YAML file format, in a file named .gitlab-ci.yml at the root of the project (see this). For this exercise, the completed pipeline should look like the following (go to CI/CD -> Pipelines -> <Job Number> on the GitLab repository website):

[Figure: gitlab-pipeline]
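As a rough starting point, a two-stage .gitlab-ci.yml could be sketched as follows. This is a minimal sketch, not the lab's reference solution: it assumes the jobs are named after the two stages described above and that gradle is available on the runner's PATH. Runner selection and test annotation are left out here.

```yaml
# .gitlab-ci.yml (sketch; job names are assumed to match the stage names in this lab)
stages:              # stages run in the order listed
  - gradle-build
  - gradle-test

gradle-build:
  stage: gradle-build
  script:
    - gradle assemble   # compile the source code into bytecode

gradle-test:
  stage: gradle-test
  script:
    - gradle test       # run the JUnit test cases
```

Because gradle-test belongs to a later stage, it only starts once gradle-build has succeeded.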

Now, when we push any changes to the remote repository, we expect these pipelines to be run by a computing entity (called a runner in GitLab’s/GitHub’s lingo). Normally, you have to install and set up the runner yourself on a specific machine that takes jobs from a queue (see Install GitLab Runner for more details). In this scenario, however, we can assume that the runner has been set up by an administrator.

Notably, we are going to use a shared runner, which can take jobs from anyone in the organization. Shared runners already exist for courses such as comp2100 and comp2300, so we can use these.

For the purposes of the lab, we are going to use the COMP2100 runner, which is selected by tagging the necessary steps of the pipeline:

specific-pipeline-step:
    tags:
        - comp2100

Finally, we are also going to display the annotated results in the browser as follows:

[Figure: gitlab-pipeline]

Check out the necessary documentation for instructions for setting up annotations.

Hint: see an example.
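One way to get test annotations is to have the test job upload its JUnit XML reports as an artifact. The fragment below is a sketch under two assumptions: the job is named gradle-test, and Gradle writes its reports to its default location, build/test-results/test/. Adjust the path if the project is configured differently.

```yaml
gradle-test:
  stage: gradle-test
  tags:
    - comp2100          # run on the COMP2100 shared runner
  script:
    - gradle test
  artifacts:
    when: always        # upload the reports even when tests fail
    reports:
      # assumed default Gradle output location for JUnit XML reports
      junit: build/test-results/test/**/TEST-*.xml
```

Setting when: always matters here, because a failing test run is exactly the case in which you want the results to show up in the browser.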

Activity 2

As of now, the pipeline runs on every push to the main branch. In practice, however, this would waste a lot of computational resources, so we need the runners to run specific stages only under specific circumstances (such as running the whole build and test stage on pull requests, or running something like proselint instead of extensive tests on documentation changes). In the same file, we can configure filters on stages that control the conditions under which they run.

Configure the GitLab pipeline so that gradle-test runs only for pull requests (this could be done for gradle-build as well, but this is just to show a differentiating example between the two).

You can look at the corresponding documentation for potential solutions.
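One possible approach uses the rules keyword together with the predefined CI_PIPELINE_SOURCE variable (GitLab calls pull requests merge requests). The fragment below is a sketch that assumes the job is named gradle-test; it is one of several ways to express this filter.

```yaml
gradle-test:
  stage: gradle-test
  script:
    - gradle test
  rules:
    # add this job only to pipelines triggered by a merge request
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
```

With this rule, gradle-test is skipped on ordinary branch pushes, while a job without rules, such as gradle-build, keeps running on them.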

Extension: Setting up your own GitLab Runner

Instead of relying on ANU’s machine for running stages, we can use self-hosted GitLab runners on machines from providers such as AWS (via EC2) or Azure (via its VM service). In this case, we assume that the host is your local machine, and you need to configure it to run the pipelines.

Install GitLab Runner on your local machine and run the CI tests from there. See the corresponding documentation on:
1. Installing
2. Registering
3. Configuring

References

[1] MIT Missing semester: https://missing.csail.mit.edu/2020/metaprogramming/
