Software Versioning

I want to take a moment to acknowledge a hole that we sometimes find ourselves in when maintaining software for the long term: version management. This is not the same as version control, which is what lets you easily revert or compare changes and makes collaboration easier. No, I’m talking about the difference between named versions, semantic versions, and rolling releases, not to mention long term support and other interesting structures.

Named Versions

Most of us are familiar with named versions. Ubuntu and the rest of the *nix ecosystem quite like them. They’re easy to remember, and are good for use in conversation. We know the difference between bullseye and bookworm (maybe). The name only changes when there are significant changes made, although they still manage to happen every year or so (more often for other things). I would actually go so far as to suggest that versions of the Windows operating system are also using named versioning. The names are just boring at the moment (eleven, ten, etc).

So, we know who does use named versions, maybe we should ask why they do it, and whether we should be doing the same?

Most of the time, you don’t need it. Names need to stick around long enough to be remembered, which, given how bad we are at remembering things, means years. Most of us are not willing to keep our software stable for that long, and most of us do not need to. That is important. That level of stability is required for things which really shouldn’t be noticeably changing frequently. Jumping from bullseye to bookworm may have notable differences, and may cause some grief as systems no longer work the way you anticipate. We all know what upgrading the major version of macOS or Windows does to people. They might even lose an entire day of productivity.

If you are building software that sticky, then sure, give it a name. Otherwise you are just signing yourself up to think up new names at an unsustainable rate, and we all know that “naming things” is one of the hardest problems in tech, let alone software.

Semantic Versioning

Another incredibly familiar concept, one which most of us have implicitly internalised without necessarily acknowledging the meaning of SemVer. Interestingly, there are places where the use of SemVer is enforced, and correctly so, which should really guide where you think about using it.

Helm charts use SemVer to the point that, if you forget to bump the patch version when you push a change, you might spend a lot of time debugging why the chart wasn’t updated before you realise that you forgot to increment a number. This leads to a whole bunch of clever things like pre-commit hooks which automatically update the version, and tools like release-please which go a step further and automatically include a changelog with each version change.
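As a rough illustration of the kind of guard a pre-commit hook could run, here is a minimal sketch in Python. The function names are hypothetical and this is not part of Helm or release-please; it assumes plain x.y.z versions with no pre-release or build suffixes, and it leaves out how the previous and current version strings are actually read from the chart.

```python
import re

# Hypothetical helper for a pre-commit style check; not a Helm API.
SEMVER = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def parse(version: str) -> tuple[int, int, int]:
    """Split 'x.y.z' into integers (pre-release tags are not handled)."""
    match = SEMVER.match(version)
    if match is None:
        raise ValueError(f"not a plain x.y.z version: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch)


def version_was_bumped(previous: str, current: str) -> bool:
    """True when the current version is strictly greater than the previous one."""
    # Tuples compare element by element, so 1.10.0 correctly beats 1.9.0.
    return parse(current) > parse(previous)
```

Comparing integers rather than strings matters here: as text, "1.10.0" sorts before "1.9.0", which is exactly the sort of thing that costs an afternoon.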

SemVer provides a communicable contract without having to go through the rigmarole of defining it. There is a pre-agreed meaning to each part of the semantic version number, and many named versions actually use this rule under the hood. A semantic version has the structure x.y.z, which can be read as major.minor.patch. The major version must change if there are breaking changes in the release. This is so that software which depends on it won’t automatically update and break in ways that leave half the internet scratching their heads. A great example is when Python moved from 2.7 to 3.x; we had to re-write large amounts of Python code to fit the new syntax, such as print becoming a function which needs parentheses. It is also why it took so many years to migrate, but that’s a different matter.
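To make the "won’t automatically update" part concrete, here is a small sketch of the rule a dependency resolver might apply. The function name is made up for illustration, and it assumes plain x.y.z versions with a major version of at least 1 (SemVer treats 0.y.z as unstable, which this ignores).

```python
def safe_to_auto_update(installed: str, candidate: str) -> bool:
    """Accept a candidate only if it is newer and keeps the same major version."""
    installed_parts = tuple(int(part) for part in installed.split("."))
    candidate_parts = tuple(int(part) for part in candidate.split("."))
    same_major = candidate_parts[0] == installed_parts[0]
    return same_major and candidate_parts > installed_parts
```

With this rule, going from 2.7.0 to 2.8.1 is picked up automatically, while 2.7.18 to 3.0.0 is held back until someone decides to deal with the breaking changes.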

The minor version changes for most significant updates, particularly those which may impact some consumers. So a security fix which stops you misusing the library in a cool way will annoy a small minority of users, and be an easy update for the large majority. Adding new features is often a minor version bump. Significant but not breaking changes are usually what you’re looking at. This leads to minor versions which climb quickly while a library is new; as the library matures the minor version increases less often, because the nature of the changes is less invasive or the branching strategy has changed.

The patch version should change every time something in the internals changes. Minor bug fixes, small logging updates, utility changes, security updates. Small stuff that most developers don’t really need to worry about. For the most part we want to know there was a change, so that if somehow something breaks we can figure out where it happened, but we don’t expect anything to break.
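The three rules above can be sketched as a single bump function. This is a minimal illustration, not a reference implementation: the Change enum and the bump function are names invented here, and pre-release and build metadata are not handled.

```python
from enum import Enum


class Change(Enum):
    """Hypothetical labels for the kind of change in a release."""
    BREAKING = "major"
    FEATURE = "minor"
    FIX = "patch"


def bump(version: str, change: Change) -> str:
    """Return the next x.y.z version for the given kind of change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change is Change.BREAKING:
        # A breaking change resets minor and patch.
        return f"{major + 1}.0.0"
    if change is Change.FEATURE:
        # A new feature resets patch only.
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"
```

So bump("1.4.2", Change.FIX) gives "1.4.3", a feature gives "1.5.0", and a breaking change gives "2.0.0". Resetting the lower parts to zero is what keeps the number readable as a contract.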

So, I started by saying you should use semantic versioning based on those who enforce it. SemVer is a contract. You should use SemVer if there is someone with whom you need to maintain that contract. If you have an API then the API should be semantically versioned (if it is super stable you could even name it). If you curate a library, Docker image, or application which people download, then you want semantic versioning. I know that GIMP-3 and GIMP-2 will feel somewhat different when I open them, because that is a major version change.

Rolling Releases

The bread and butter of CI/CD, rolling releases don’t really use versions much at all. This is intentional. Continuous Integration and Continuous Deployment are not intended to be gated on anything other than the software working. You push a change to main and the system picks it up. CI (integration) runs some tests and validations, then passes it over to CD (deployment), which packages it up and makes it available for you to use. If you’re paranoid you have a couple of stages of this which you run through with different levels of testing, but no one really cares about the version which is in production, so long as they can link it back to the version of the code in version control.
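One common way to keep that link back to version control is to label each build with the commit it came from. The sketch below is an assumption about how you might do it, not a standard: the build_id name and the timestamp-plus-short-hash format are made up here, and the commit hash is passed in rather than read from git so the example stays self-contained.

```python
from datetime import datetime, timezone


def build_id(commit_sha: str, built_at: datetime) -> str:
    """Combine a UTC timestamp and a short commit hash into a sortable label."""
    stamp = built_at.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
    # Seven characters is the conventional short hash length, not a requirement.
    return f"{stamp}-{commit_sha[:7]}"
```

In a pipeline you would feed this the hash of the commit being built and the current time. The label sorts chronologically and tells you which commit is running, which is all a rolling release needs from a "version".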

Most SaaS and PaaS application code is using this structure. Not all of it, but much of it. There is no contractual obligation to notify anyone if the internals of the system change. If we remove a bottleneck and suddenly it gets faster, that’s a nice win, but the end user doesn’t really care all that much. If the frontend gets a massive overhaul you may want to have two versions deployed, with a cookie choice deciding which one the user sees, but that doesn’t actually require versioning. That is usually called feature flagging. You can relate the two, but you do not need to.
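The cookie-gated frontend can be as simple as the sketch below. The cookie name new_ui and the two bundle names are hypothetical, and a real setup would more likely sit in a proxy or a feature flag service than in a bare function.

```python
def choose_frontend(cookies: dict[str, str]) -> str:
    """Return which frontend bundle to serve, based on an opt-in cookie."""
    # "new_ui" is an illustrative cookie name; anything other than "1" keeps the old UI.
    if cookies.get("new_ui") == "1":
        return "frontend-next"
    return "frontend-current"
```

Notice that no version number appears anywhere: both frontends are simply deployed, and the flag decides who sees which.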

So which should I use?

Bearing in mind that you can always change what you’re doing, usually by adding or removing a couple of automatic actions, make the choice that most closely fits the way you release software.

If you are releasing something which users really don’t want to update often, named releases with long term support are great.
If you are releasing a library, package, or API which comes with a contract to the consumer, and there are consumers other than you, you should be using Semantic Versioning.
If you are deploying a SaaS or PaaS application, in a mono-repo, where you own the entire system end to end, then stick with rolling releases. You can use SemVer if you really want the tooling which does change-log management, but for the most part it is just going to slow you down.

There isn’t really a wrong choice in all of this. You should always have a way to roll a change back, or to roll new changes out. There are more and less annoying choices for the way in which you manage your code, and if you find you’re using the more annoying way, you can always try one of the others and possibly that will speed up your releases.