Published on 6 April 2023.
After drafting this article, I asked for feedback on James’ Discord. Emily wrote back and said that this sounded a lot like pre-tested integration that she had written about (here and here) earlier. She describes almost exactly the same workflow as I imagine for this CI server, and there is also a Jenkins plugin to support that workflow. I encourage you to check out her writing as well.
I think I have figured out what a Continuous Integration (CI) server should do. It is very simple. Yet common tools used for CI, like Jenkins, make it hard or near impossible.
CI probably means different things to different people.
I’ve tried to find the root of the practice, and a lot of my thoughts here are based on James Shore’s descriptions in AOAD2.
So with that in mind, CI to me is about two things: integrating often, and making sure the main branch always works.
Integrate means to merge your changes into the main branch. This branch is commonly also referred to as master or trunk.
From what I’ve read, the consensus seems to be that you should integrate at least once a day. If you do it less frequently, you are not doing continuous integration.
Every time you integrate, you have to make sure that the main branch is still working. This is the second aspect of CI.
How can you do that?
The only way to do that, and still integrate often, is with an automatic test suite.
When you integrate your code, you want to run the test suite to make sure that everything still works.
The test suite should give you confidence that when it’s time to deploy to production, it will just work.
I’m using the term test suite here to include everything you need to gain that confidence, so it includes compiling, linting, static analysis, unit tests, deploy to test environment, smoke test… everything.
James Shore writes that Continuous Integration is an Attitude, Not a Tool and points out that you can do CI without a tool.
No tool can choose to integrate your changes often. You have to change your way of working so that you can integrate more often and also do so. This requires practice.
No tool can enforce that your main branch is always working. You have to have a mindset of working like that. This requires practice.
However, there are some things that a tool can help with. To make it easier to work in this way.
A CI server should merge changes to the main branch in a “safe” way.
Here is pseudo code for how a CI server should integrate changes from a branch in a Git repo:
def integrate(repo, branch):
    with lock(repo):
        sh("git clone {repo}")
        sh("git merge origin/{branch}")
        sh("<command to run test suite>")
        sh("git push")
The lock step ensures that only one integration can happen at a time. If you have two branches that want to integrate, one has to wait for the other to be integrated first.
The branch is then integrated by performing a git merge.
To make sure the new main branch works, a test suite is then run. This test suite should be defined in the repo.
If the test suite passes, a git push is performed to “publish” the new main branch.
This workflow ensures that every change that is merged into the main branch works. Where “works” is defined as passing the test suite.
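To make this concrete, here is a minimal sketch of that integrate step in Python. It is a sketch under assumptions, not a real CI server: it assumes a POSIX system (for the file lock) and a hypothetical `./run-tests` script in the repo that runs the whole test suite.

```python
# A concrete sketch of the integrate step. Assumptions: a POSIX system
# (fcntl), and a hypothetical "./run-tests" script in the repo that
# runs the whole test suite.
import fcntl
import subprocess
import tempfile

def sh(command, cwd):
    # check=True raises on failure, aborting the integration at the
    # first failing command.
    return subprocess.run(command, cwd=cwd, shell=True, check=True)

def integrate(repo_url, branch, lock_path="/tmp/ci.lock"):
    with open(lock_path, "w") as lock_file:
        # Only one integration at a time; a second caller blocks here.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        workdir = tempfile.mkdtemp()
        sh(f"git clone {repo_url} .", cwd=workdir)
        sh(f"git merge origin/{branch}", cwd=workdir)
        sh("./run-tests", cwd=workdir)  # the repo-defined test suite
        sh("git push", cwd=workdir)
        # The lock is released when the file is closed.
```

The file lock plays the role of `with lock(repo)` in the pseudo code; a real server would want one lock file per repository.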
That is the basic function that I think a CI server should perform. Let’s look at some directions where this design can be evolved to make a more full fledged CI server.
One thing that a dedicated CI server helps prevent is the problem that code works on one developer’s machine, but not on another’s. Perhaps it is due to a dependency missing on one developer’s machine.
With a CI server, the one true environment is the CI server’s environment.
Preferably, this should also be set up in the exact same way before every test run so that two test runs have the exact same clean environment.
Clean environments make test runs more predictable and help make integrations safe.
Setting up a clean environment looks different in different contexts. One option would be to use Docker containers. In the Python world, virtual environments could be set up for each test run.
Any function that a CI server can perform to help set up a clean environment is useful.
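As an illustration of the Python option, here is a sketch that creates a fresh virtual environment for every test run. The requirements file name, the POSIX `bin/python` layout, and the pytest command are assumptions for the sketch:

```python
# Sketch: a brand-new virtual environment per test run, so no dependency
# can leak over from a previous run. Assumes a POSIX layout ("bin/python")
# and a requirements.txt in the repo.
import subprocess
import tempfile
import venv
from pathlib import Path

def clean_environment():
    # A fresh venv in a fresh temporary directory every time.
    env_dir = Path(tempfile.mkdtemp()) / "venv"
    venv.create(env_dir, with_pip=True)
    return env_dir / "bin" / "python"

def run_tests_in_clean_env():
    python = clean_environment()
    subprocess.run([python, "-m", "pip", "install", "-r", "requirements.txt"],
                   check=True)
    subprocess.run([python, "-m", "pytest"], check=True)
```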
Another advantage of a dedicated CI server is that you can make sure that your code works in an environment that you don’t have access to on your development machine.
You might write Python code that should work on both Windows and Linux, but your laptop only runs Windows.
A CI server should have functionality to run the test suite in different environments.
To take full advantage of the CI server, the “command to run the test suite” should be written in a “pipeline language” that the CI server understands.
Consider this pseudo example:
step('compile') {
    sh('make')
}
parallel {
    step('test unix') {
        environment('unix') {
            sh('./test')
        }
    }
    step('test windows') {
        environment('windows') {
            sh('test.exe')
        }
    }
}
This script could not have been written as, for example, a Bash script, because then it could not have taken advantage of the CI server’s functionality to run commands in different environments.
When I asked for feedback on this article, I got some objections about a CI server being responsible for environments and a pipeline language.
One person wrote these two comments:
… having a pipeline script that works only with the ci software seems like a huge lockin and risk
I feel that the moment I say I can’t do this locally and I need a pre-configured build server, I am violating the basic principles of development.
I partly agree with those objections.
It would be better if you could run your whole pipeline locally and have it set up all the clean environments for you. With virtualisation technology, this is becoming more and more possible.
If you manage to get this setup, then the CI server only functions as a single integration point that everyone has to go through.
I still think that a pipeline language would be useful for programming your pipeline. However, it could be used outside the CI server as well. That way you could also debug your pipeline locally without involving the CI server. If a pipeline step requires a specific environment that you can’t get locally, that step could be skipped when run locally.
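To sketch what “skip steps locally” could look like, here is a toy pipeline runner in Python. The step format and the idea that the runner knows which environments are available are my own assumptions, not an existing tool:

```python
# Toy pipeline runner. Each step is (name, required_environment, action);
# required_environment is None when the step can run anywhere.
def run_pipeline(steps, available_environments):
    results = {}
    for name, required_env, action in steps:
        if required_env is not None and required_env not in available_environments:
            # Locally you might not have, say, a Windows environment:
            # skip the step instead of failing the whole pipeline.
            results[name] = "skipped"
        else:
            action()
            results[name] = "passed"
    return results

# Running "locally" with only a unix environment available:
steps = [
    ("compile", None, lambda: None),
    ("test unix", "unix", lambda: None),
    ("test windows", "windows", lambda: None),
]
print(run_pipeline(steps, available_environments={"unix"}))
# → {'compile': 'passed', 'test unix': 'passed', 'test windows': 'skipped'}
```

On the CI server, all environments would be available and no step would be skipped.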
Another aspect of continuous integration is communication.
For example, when you integrate, you want to tell your team members about the change so that they can pull your changes and test their code against it.
A CI server can help with this communication, for example by notifying the team every time a change is integrated into the main branch.
The lock step in the basic workflow ensures that only one integration can happen at a time.
In some situations you might have a longer running test suite that you don’t want to block further integrations.
A CI server could support that with something like this:
def integrate(repo, branch):
    with lock(repo):
        sh("git clone {repo}")
        sh("git merge origin/{branch}")
        sh("<command to run fast test suite>")
        sh("git push")
        sh("<command to run slow test suite>")
Of course, when you do this, you risk breaking the main branch since all tests are not run before the change is integrated.
One scenario where this could be useful is if you have a slow running test suite today that you can’t make instantly faster. You can start using this pattern with the goal of making all your slow tests fast. As a rule of thumb, the fast test suite should not take more than 10 minutes. If it takes longer for an integration to complete, chances are that you start multitasking because you don’t want to wait for it.
Some tests might also be impossible to run in less than 10 minutes. In that case, this pattern is also good. But make sure that all basic functionality is tested in the fast test suite.
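One way to implement the fast/slow split, assuming pytest, is a marker convention. The `slow` marker here is not built in; it is something you would define yourself in your pytest configuration:

```python
# Sketch: splitting the suite in two with a pytest marker convention.
# The "slow" marker is an assumption: you register it yourself in
# pytest.ini and put @pytest.mark.slow on the long-running tests.
import subprocess

def suite_command(speed):
    # Build the pytest invocation for one half of the suite.
    marker = "slow" if speed == "slow" else "not slow"
    return ["pytest", "-m", marker]

def run_suite(speed):
    # The fast suite gates "git push"; the slow suite runs afterwards.
    return subprocess.run(suite_command(speed)).returncode == 0
```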
When it comes to tools commonly used for CI, I primarily have experience with Jenkins. The two most common patterns in Jenkins, which I believe are not unique to Jenkins, prevent you from doing continuous integration. Let’s have a look.
The first pattern runs a pipeline only after you have merged your changes to the main branch.
If the test suite fails, your main branch is broken, and everyone who pulls your changes will base their work on something broken.
If you are serious about continuous integration, you fix this problem immediately. Either by reverting the change or merging a fix. It might not be too big a problem.
If you are not serious about continuous integration, you might leave the main branch broken and hope that someone else fixes it.
With the CI server I describe in this article, it is simply impossible to merge something broken. (Given that your test suite catches the broken things.)
The second pattern runs a pipeline on every branch so that you know that your changes work before you merge them. And when you merge them, the pipeline is run again.
This is a slight improvement over the previous pattern, but it still has a flaw. Consider this scenario:
0---0
     \
      \---A
       \
        \---B
A and B are two branches that both have passing test suites, so they both go ahead and merge, resulting in this:
0---0-------A'---B'
     \     /    /
      \---A    /
       \      /
        \----B
A' has already been tested on the branch, but B' has never been tested. That is, the combination of A’s and B’s changes has never been tested, until they are both merged.
With the CI server I describe in this article, this problem is solved by the lock, where multiple integrations have to wait for each other.
If you use the multiple test suites pattern, you still have this problem, at least for functionality only covered by the slow test suite. But then it’s a choice you make: you decide whether the trade-off is worth it for you or not.
I think that tools for CI should help you do CI well. Why don’t they?
I have two speculations.
First, if your team is committed to doing continuous integration, a broken main branch might not be too big a deal since everyone is committed to fixing it fast.
Second, back in the days of SVN (which was my first version control system), branching was expensive. The default way to share changes was to push directly to the main branch, and having a CI tool do the actual integration was probably technically more difficult. With Git, that is no longer true.
Do you know why tools for CI don’t work like I describe in this article? Please let me know.
Emily responded the following to that question:
I think it’s hard to tell at this distance, but I suspect the people building the tools weren’t always the same people who really understood what CI is, and there was a communication gap. The tools that ended up becoming popular were perhaps the easiest to adopt and had the best marketing?
That sounds reasonable to me.
Another person responded with this:
i think most [build servers] can be configured that way [proper CI]. many users do not want to because they don’t understand the ci process. instead they regard the build server as some central platform on which development is done.
So people find value in build servers even though they are not designed explicitly for CI. That also makes sense.
So perhaps the reason why we don’t have better tools for CI is that people don’t understand the value of CI or don’t want to adopt it?
Pull requests are a common way of working, but they don’t play nicely with CI.
First of all, when working with pull requests, you integrate your code by pressing a button that will perform the merge. With a CI tool like the one I describe in this article, the CI tool performs the merge. With the former, no tool can prevent broken code on the main branch. (The best they can do is test the branch, then test again after merge.)
Second of all, pull requests, at least blocking ones, add delay to the process of integrating code, making it difficult to integrate often.
Pull requests are often used to review changes before they are merged. With the CI server I describe in this article, nothing prevents you from having a manual review step before the CI server is allowed to merge. However, a manual review step adds delays and makes it difficult to integrate often.