Commit e69c9490 authored by Peter Senna Tschudin, committed by Emanuele Aina

T6883: Handle stakeholder feedback on artifact tracking and version freeze document


This patch fixes a few typos and minor issues and adds a concrete example
of how to reproduce the build.

Signed-off-by: Peter Senna Tschudin <peter.senna@collabora.com>
parent 6a76e9af
Merge request !54
Pipeline #145448 passed
# Background
One of the main goals for Apertis is to provide teams the tools to support
their products for the long lifecycles needed in many industries, from civil
infrastructure to automotive.
This document discusses some of the challenges related to long-term support and
how Apertis addresses them, with particular interest in reliably reproducing
builds over a long time span.
Apertis addresses that need by providing stable release channels as a platform
for products with a clear trade-off between up-to-dateness and stability. Apertis
encourages products to track these channels closely to deploy updates on a
regular basis to ensure important fixes reach devices in a timely manner.
involve things like timestamps or items being listed differently in places
where order is not significant, cause builds to not be bit-by-bit identical
while the runtime behavior is not affected.
# Apertis artefacts and release channels
As described in the [release flow]( {{< ref "release-flow.md" >}} ) document, at any given time Apertis
has multiple active release channels to both provide a stable foundation for
product teams and also give them full visibility on the latest developments.
Each release channel has its own artefacts, the main ones being the
[deployable images](https://apertis.org/images/) targeting the [reference
hardware platforms](https://www.apertis.org/reference_hardware/), which get
built by mixing:
* reproducible build environments
* build recipes
* packages
* external artefacts
These inputs are also artefacts themselves in moderately complex ways:
* build environments are built by mixing dedicated recipes and packages
* packages are themselves built using dedicated reproducible build environments
However, the core principle for maintaining multiple concurrent release
channels is that each channel should have its own set of inputs, so that
external resources to keep the impact of the environment as low as possible.
For the most critical components, even the container images themselves are
created using Apertis resources, minimizing the reliance on any external
service and artefacts.
For instance, the `apertis-v2020-image-builder` container image provides
the reproducible environment to run the pipelines building the reference
image artefacts for the v2020 release, and the
`apertis-v2020-package-source-builder` container image is used to convert the
source code stored in GitLab into a format suitable for building on OBS.
Each version of each image is identified by a hash, and possibly by some tags.
As an example the `latest` tag points to the image which gets used by default
By default the Docker registry where images are published keeps all the past
versions, so every build environment can be reproduced exactly.
Unfortunately this comes with a significant cost from a storage point of view,
so each team needs to evaluate the trade-off that best fits their goals
in the spectrum that goes from keeping all Docker images around for the whole
lifespan of the product to more aggressive pruning policies involving the
deletion of old images on the assumption that changes in the build environment
have a limited effect on the build and using an image version which is close to
but not exactly the original one gives acceptable results.
......@@ -150,7 +150,7 @@ recipes are invoked and combined.
Relying on git for the definition of the build pipelines makes preserving old
versions and tracking changes over time trivial.
Rebuilding the `v2020` artefacts locally is then a matter of checking out the
recipes in the `apertis/v2020` branch and launching `debos` from a container
based on the `apertis-v2020-image-builder` container image.
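Assuming a recipe layout along the lines below (the repository URL, registry path, and recipe filename are illustrative assumptions, not verified paths), the rebuild could be sketched as:

```shell
# Check out the v2020 branch of the image recipes (URL is an assumption):
git clone -b apertis/v2020 https://gitlab.apertis.org/infrastructure/apertis-image-recipes.git
cd apertis-image-recipes

# Run debos inside the matching build environment container
# (registry path and recipe filename are likewise illustrative):
docker run --rm -t -v "$PWD":/recipes -w /recipes \
    registry.gitlab.apertis.org/infrastructure/apertis-docker-images/apertis-v2020-image-builder \
    debos apertis-ospack-minimal.yaml
```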
the full history of each archive.
More advanced use-cases can be addressed using the optional
[Aptly HTTP API](https://www.aptly.info/doc/api/).
## External artefacts
While the packaging pipeline effectively forbids any reliance on external
artefacts, the other pipelines in some cases include components not under the
previously mentioned systems to track per-release resources.
For instance, the recipes for the HMI-enabled images include a set of
Another example is given by the `apertis-image-builder` recipe checking out
Debos directly from the master branch on GitHub.
In both cases, any change on the external resources impacts directly all the
release channels when building the affected artefacts.
A minimal solution for `multimedia-demo.tar.gz` would be to put a version in its
URL, so that recipes can be updated to download new versions without affecting
In the Debos case it would be sufficient to encode in the recipe a specific
revision to be checked out. A more robust solution would be to use the packaged
version shipped in the Apertis repositories.
## Main artefacts and metadata
The purpose of the previously described software items is to generate a set of
artefacts, such as those described in [the v2019 release artefacts
document](release-v2019-artifacts.md). Alongside the artefacts themselves, a few metadata
entries are generated to help track what has been used during the build.
In particular, the `pkglist` files capture the full list of packages installed
on each artefact along with their versions. The `filelist` files instead provide
basic information about the actual files in each artefact.
With the information contained in the `pkglist` files it is possible to find
the exact binary package version installed and from there find the
corresponding commit for the sources stored in GitLab by looking at the
matching git tag.
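As a sketch, assuming a `pkglist` line of the form `package version` and an `apertis/<version>` tag scheme (both assumptions for illustration, not the documented formats), the lookup can be scripted:

```shell
# Hypothetical pkglist entry: "<package> <version>"
line="openssl 1.1.1d-1+apertis1"

pkg=${line%% *}   # package name, e.g. "openssl"
ver=${line#* }    # package version, e.g. "1.1.1d-1+apertis1"

# Hypothetical URL of the matching packaging tag on GitLab:
echo "https://gitlab.apertis.org/pkg/${pkg}/-/tags/apertis/${ver}"
```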
Other files capture other pieces of information that can be useful to reproduce
snapshots of the archive contents so that subsequent builds can point to the
snapshotted version and retrieve the exact package versions originally used.
To provide the needed server-side support, the archive manager needs to be
switched to `aptly`, as it provides explicit support for
snapshots. The build recipes then need to be updated to capture the current
snapshot version and to be able to optionally specify one when initiating
the build.
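A sketch of that flow with `aptly` (mirror, snapshot, and distribution names are illustrative assumptions):

```shell
# Refresh the local mirror of the archive, then freeze its current state:
aptly mirror update target
aptly snapshot create v2020-20200401 from mirror target

# Publish the snapshot so builds can reference an immutable archive state:
aptly publish snapshot -distribution=v2020 v2020-20200401
```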
For full reproducibility it is recommended to use the exact image originally
used, but to be able to do so the image hash needs to be stored in the
metadata for the build.
## Version control external artefacts
External artefacts like the sample multimedia files need to be versioned just
like all the other components. Using Git-LFS and git tags would give fine
control to the build recipe over what gets downloaded.
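A minimal sketch of that approach (file names and the tag are illustrative; assumes git-lfs is installed):

```shell
# Store large media files as LFS pointers instead of raw blobs:
git lfs track "*.mp4" "*.ogg"
git add .gitattributes sintel-trailer.mp4

# Commit and tag so build recipes can reference an exact revision:
git commit -m "Import sample multimedia files"
git tag media-v2020.0
```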
The package name and package version as captured in the `pkglist` files are
sufficient to identify the exact sources used to generate the packages
installed on each artefact, as they can be used to identify an exact commit.
However, the process can be further automated by providing explicit hyperlinks
to the tagged revision on GitLab.
## Identify the recipe and build environment
1. Open the folder containing the build artefacts, for instance
`https://images.apertis.org/release/v2021dev1/v2021dev1.0/`
1. Find the `recipe-revision.txt` metadata,
for instance `https://images.apertis.org/release/v2021dev1/v2021dev1.0/meta/recipe-revision.txt`
Once all the input metadata are known, the build can be reproduced.
on the newly created branch, specifying parameters for the exact Docker
image revision and the APT snapshot identifier
When the pipeline completes, the produced artefacts should closely match the
original ones, albeit not being bit-by-bit identical.
## Customizing the build
For instance, to install a custom package:
experiments during development)
1. Commit the results and push the branch
1. Execute the pipeline as described in the previous section
# Example 1: OpenSSL security fix 2 years after release v1.0.0
Today a product team makes the official release of version 1.0.0 of their
software that is based on Apertis. Two years from now a critical security
vulnerability will be found and fixed in OpenSSL. How can the product team
issue a new release two years from now with the only change being the fix to
OpenSSL?
It is important for product teams to consider their future requirements at the
point they make a release. To ensure bug and security fixes can be deployed
with minimal impact on users, a number of artefacts need to be preserved from
the initial release:
1. The image recipes
1. The Docker images used as build environment
1. The APT repositories
1. External artefacts
## Getting started with Apertis: one year before release 1.0.0
Good news! A product team has decided to use Apertis as the platform for their
product. At this stage there are a few recommendations on how to get started
that will make it easier to use Apertis' long-term reproducibility features.
The product team needs control over their software releases, and it is important
to decouple their releases from Apertis. One important objective is to give
the product team control over importing changes from Apertis, such as package
updates. We recommend using release channels for that.
A product team can have multiple release channels, each reflecting what is
deployed for a specific product. Because release channels are independent
and parallel deliveries, a single product may even have multiple release
channels, for instance a stable channel and a development one.
In turn each product release channel is based on an Apertis release channel. As
a hypothetical example the `automotive` product team may have an
`automotive/cluster-v1` release channel for delivering stable updates to their
`cluster` product, and an `automotive/cluster-v2` release channel for
development purposes, both based on the same `apertis/v2020` release channel.
Git repositories need to use a different branch for each release channel, and
each release channel has its own set of projects on OBS. However, only the components
that the product team needs to customize have to be branched or forked. To
maximize reuse, it is expected that the bulk of packages used by every product
team will come directly from the main Apertis release channels.
1. What: Create a dedicated release channel
1. Where: GitLab and OBS
1. How: Create release channel branches in each git repository that diverge
from the ones provided by Apertis; set up OBS projects matching those
release channels to build the packages
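For a single forked component the branching step might look as follows; the repository URL is a placeholder and the branch names follow the hypothetical `automotive/cluster-v1` channel used above:

```shell
# Fork only the components that need product-specific changes:
git clone https://gitlab.example.com/automotive/openssl.git
cd openssl

# Start the product release channel from the Apertis one and publish it:
git checkout -b automotive/cluster-v1 origin/apertis/v2020
git push -u origin automotive/cluster-v1
```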
In this way the product team has complete control over the components used to
build their products:
* Source code for all packages is stored on GitLab with full development
history
* Compiled binary packages are tracked by the APT archive snapshotting system
for both the product-specific packages and the packages in the main Apertis
archive.
The previous step took care of the Apertis layer of the software stack, but
there is one important set of components missing: the product team software. We
suggest that product teams use one of Apertis' recommended ways of shipping
software, namely .deb packages or Flatpaks. For this example we
are going to use .deb packages.
While there are multiple ways of handling product team specific software, for
this example we are going to recommend the product team to create a new APT
suite and a few APT components, and host them on the Apertis infrastructure. We
will call the new suite `cluster-v1`. The list of APT repositories will then
be:
    deb https://repositories.apertis.org/apertis/ v2020 target development sdk
    deb https://repositories.apertis.org/automotive/ cluster-v1 target
For reference, in [APT
terminology](https://manpages.debian.org/testing/apt/sources.list.5.en.html)
both `v2020` and `cluster-v1` are suites or distributions, and `target`,
`development`, and `sdk` are components.
The steps are:
1. What: Create new APT suite and APT components for the product team
1. Where to host: Apertis infrastructure
## Creating the list of golden components: the day of the release 1.0.0
As we mentioned earlier, each component is identified by a hash, and it is also
possible to create tags. We recommend using hashes for identification of
specific revisions because hashes are immutable. Tags can also be used, but we
recommend careful evaluation as most tools allow tags to be modified after
creation. Modifying tags can lead to problems that are difficult to debug.
The image recipe is usually a small set of files that are stored in a single
git repository. Collect the hash of the latest commit of the recipe repository.
1. What: Image recipe
1. Where: Apertis GitLab
1. How: Collect the git hash of the latest commit of the recipe files
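Collecting the hash boils down to `git rev-parse`; the sketch below fabricates a tiny throwaway repository so the commands can run anywhere, and stores the result following the `recipe-revision.txt` naming used by the image server:

```shell
# Stand-in for the real recipe checkout, so the sketch is self-contained:
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" -c user.name=ci -c user.email=ci@example.com \
    commit -q --allow-empty -m "image recipes"

# Record the golden commit for later rebuilds:
git -C "$repo" rev-parse HEAD > recipe-revision.txt
```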
The Docker containers used for building are stored in the GitLab Container
Registry. The Registry also allows identifying containers by their hashes. A
note of caution: there are expiration policies and clean-up tools for deleting
old versions of containers. Make sure the golden containers are protected
against clean-up and expiration.
1. What: Docker containers used for building: `apertis-v2020-image-builder` and
`apertis-v2020-package-source-builder`
1. Where: GitLab Container Registry
1. How: On the GitLab Container Registry collect the hash for each container used
for building
1. Do not forget: Make sure the expiration policy and clean-up routines will
not delete the golden containers
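A sketch of recording the container hash (the registry path is an assumption); a digest, unlike a tag, cannot be reassigned later:

```shell
IMAGE=registry.gitlab.apertis.org/infrastructure/apertis-docker-images/apertis-v2020-image-builder

# RepoDigests holds the immutable sha256 reference of a pulled image:
docker inspect --format '{{index .RepoDigests 0}}' "$IMAGE" > docker-image.txt
```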
From the perspective of APT clients, such as the tools used to create Apertis
images, APT repositories are simply a collection of static files served through
the web. The recommended method for creating the golden set of APT repositories
is to create snapshots using `aptly`. Aptly is used by Debian upstream and is
capable of making efficient use of disk space for snapshots. Aptly snapshots
are identified by names, created with something along the lines of `aptly
snapshot create v1.0.0 from mirror target`, repeating the command for `target`,
`development`, `sdk`, and `cluster-v1`.
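The repeated command can be sketched as a loop over the mirrors listed above (mirror names are assumptions):

```shell
# One snapshot per mirror, all labelled with the product release version:
for mirror in target development sdk cluster-v1; do
    aptly snapshot create "v1.0.0-${mirror}" from mirror "${mirror}"
done
```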
It is important to mention that the product team needs to create a snapshot
**every time a package is updated**. This is the only way to keep track of the
full history of the APT archive.
1. What: APT repositories:
   `deb https://repositories.apertis.org/apertis/ v2020 target development sdk`
   `deb https://repositories.apertis.org/automotive/ cluster-v1 target`
1. Where: aptly
1. How: create a snapshot for each repository using aptly
1. Do not forget: create a snapshot for every package update
External artefacts should be avoided, but sometimes they are required. An
example of external artefacts are the multimedia files Apertis uses for
testing. Those files are currently simply hosted on a webserver, which creates
two problems: no versioning information and no long-term guarantee of
availability.
To address this issue we recommend creating a repository on GitLab and copying
all external artefacts to it. This gives the benefit of using the well-defined
processes around versioning and tracking that are already used by the other
components. For large files we recommend using git-lfs.
1. What: External artefacts: files that are needed during the build but that are
not in git repositories
1. Where: A new repository in GitLab
1. How: Create a GitLab repository for external artefacts, add files, use
git-lfs for large files, and collect the hash pointing to the correct
version of files
Notice that the main idea is to collect hashes for the various resources used
for building. The partial exception is external resources, but our suggestion
is to also create a git repository for hosting the external artefacts and then
collect and use the git hash as a pointer to the correct version of the
content.
At the time of writing there is work planned to automate the collection of
relevant hashes that were used to create an image. The outcome of the planned
work will be the publication of text files containing all relevant hashes for
future use.
## Using the golden components two years after release 1.0.0: creating the new release
We recommend product teams make regular releases, for example on a quarterly
basis, to cover security updates and to minimize the technical debt to Apertis
upstream. However in some cases a product team may decide to have a much longer
release cycle, and for our example, the product team decided to make the second
release two years after the first one.
For our example the product team wants the second release to include a fix for
OpenSSL that corrects a security vulnerability, but be as identical as possible
otherwise. A note of caution here: deterministic builds, that is, the ability
to build packages that are byte-by-byte identical across different builds, are
not expected to happen naturally and are outside the scope of this guide. A good
source of information about this topic is the [Debian Reproducible
Builds](https://wiki.debian.org/ReproducibleBuilds) page.
Our aim is to be able to reproduce builds closely enough so that one can
reasonably expect that no regressions are introduced. For instance some
non-essential variations could be caused by different timestamps or different
paths for files. These variations cause builds to not be byte-by-byte identical
while the runtime behavior is not affected.
For our example the product team will import the updated OpenSSL package from
Apertis, build the OpenSSL package, and build images for the new v1.0.1
release.
The first step is to retrieve all the hashes that were collected on the day of
the original release.
## Identify the recipe and build environment
Once the automation to collect relevant hashes is in place, this step will be
as simple as making a copy of a few files such as `recipe-revision.txt`,
`apt-snapshot.txt`, and `docker-image.txt`. These files will be published on
the same server from which Apertis images are available for download.
However, it is also possible to collect the information by hand, and the
important hashes are:
1. Image recipe: `git log` from the image repository
1. The Docker images used for the build environment
1. The APT repositories
1. External artefacts: `git log` from the external artefacts repository. See the section External Artefacts for more information
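Once the planned automation publishes those files, retrieving them is a matter of a few downloads; the URL follows the v2021dev1 example earlier, and the availability of each file name is an assumption:

```shell
BASE=https://images.apertis.org/release/v2021dev1/v2021dev1.0/meta

# Fetch the recorded inputs of the golden build:
for f in recipe-revision.txt apt-snapshot.txt docker-image.txt; do
    wget -q "${BASE}/${f}"
done
```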
Once all the input metadata are known, the build can be reproduced.
## Reproduce the build
1. On GitLab [create a new
branch](https://docs.gitlab.com/ee/user/project/repository/web_editor.html#create-a-new-branch-from-a-projects-dashboard)
on the previously identified recipe repository. The branch should point to the
golden commit, which was identified in the steps above.
1. [Execute a CI pipeline](https://docs.gitlab.com/ee/ci/pipelines.html#manually-executing-pipelines)
on the newly created branch, specifying parameters for the exact Docker
image revision and the APT snapshot identifier
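The two steps can also be driven through the GitLab pipeline-trigger API; the project id, variable names, and token below are assumptions for illustration:

```shell
# Trigger the image build pipeline on the golden branch, pinning the inputs:
curl --request POST \
    --form "token=${TRIGGER_TOKEN}" \
    --form "ref=v1.0.1-rebuild" \
    --form "variables[DOCKER_IMAGE]=apertis-v2020-image-builder@sha256:..." \
    --form "variables[APT_SNAPSHOT]=v1.0.0" \
    "https://gitlab.apertis.org/api/v4/projects/<project-id>/trigger/pipeline"
```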
When the pipeline completes, the produced artefacts should closely match the
original ones, albeit not being bit-by-bit identical.
## Customizing the build
On the newly created branch in the forked recipe repository, changes can be
committed just like on the main repository.
For instance, to install a custom package:
1. Check out the newly-created branch
1. Edit the relevant ospack recipe to install the custom package, either by
adding a custom APT archive in the `/etc/apt/sources.list.d` folder if
available, or retrieving and installing it with `wget` and `dpkg` (small
packages can even be committed as part of the repository to run quick
experiments during development)
1. Commit the results and push the branch
1. Execute the pipeline as described in the previous section
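The `wget` and `dpkg` route from step 2 could look like this when run inside the ospack build (package name and URL are illustrative):

```shell
# Download and install a custom package during the ospack build:
wget -O /tmp/cluster-hmi_1.0.0_arm64.deb \
    https://example.com/packages/cluster-hmi_1.0.0_arm64.deb
dpkg -i /tmp/cluster-hmi_1.0.0_arm64.deb || apt-get --fix-broken install -y
```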