  1. May 09, 2022
  2. Mar 31, 2022
    • fetch-downstream: include additional licensing information · 7c144fef
      Arnaud Ferraris authored and Detlev Casanova committed
      
      In order to facilitate license compliance, the dashboard should report
      the most severe license issues found in packages from the "target"
      repository: this includes cases where we override the existing license
      information (especially when specifying a global default license) or
      packages for which whitelisting is enabled for the whole source tree.
      
      As part of this effort, this commit fetches the copyright override and
      whitelisting files, and processes them in order to detect major issues.
      The corresponding new per-branch flags are then set accordingly and
      added to the output file.
      
      Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>
    • fetch-downstream: rework filtering method for clarity and completeness · 59e5999d
      Arnaud Ferraris authored and Detlev Casanova committed
      
      The `filter_cache()` method used an optional `check_tags` argument to
      decide between several code paths. This does not properly reflect the
      purpose of this argument, which is to differentiate between 2 use cases:
      - check and populate information about the descendants of a given ref
      - check and populate component and license information
      
      As additional fields will have to be taken into account in the future,
      this rework aims at making the aforementioned method more flexible,
      while making it clear which use case each code path belongs to. To that
      effect, the `check_flags` argument is renamed to `purpose` and now uses
      an enum to distinguish between use-cases. It also only processes the
      relevant data for the current use-case:
      - when checking for descendants, only descendants data is looked up and
        updated
      - when checking for license info, only this information is checked and
        updated
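
The split by purpose described above could be sketched roughly as follows. This is only an illustration: the enum name `FilterPurpose`, its members, the cache layout, and the free-function form of `filter_cache()` are all assumptions, since the commit message only states that `purpose` is an enum.

```python
from enum import Enum, auto


class FilterPurpose(Enum):
    """Hypothetical enum; the commit does not name its members."""
    DESCENDANTS = auto()
    LICENSE_INFO = auto()


def filter_cache(cache, purpose):
    """Return the names of cached entries still missing data for `purpose`."""
    if purpose is FilterPurpose.DESCENDANTS:
        # Only descendants data is inspected in this use case.
        return [name for name, entry in cache.items()
                if "descendants" not in entry]
    if purpose is FilterPurpose.LICENSE_INFO:
        # Only component and license information is inspected here.
        return [name for name, entry in cache.items()
                if "component" not in entry or "license" not in entry]
    raise ValueError(f"unknown purpose: {purpose!r}")


# Hypothetical cache content, for illustration only.
cache = {
    "debian/buster": {"descendants": ["apertis/v2022"]},
    "apertis/v2022": {"component": "target", "license": "MIT"},
}
stale_descendants = filter_cache(cache, FilterPurpose.DESCENDANTS)
stale_license = filter_cache(cache, FilterPurpose.LICENSE_INFO)
```

An enum makes each call site state its intent explicitly, and a new use case becomes a new member rather than another boolean argument.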
      
      Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>
  3. Mar 17, 2022
  4. Feb 11, 2022
  5. Feb 08, 2022
    • fetch-downstream: Fix branch version detection · 19fe3eda
      Emanuele Aina authored
      
      Retrieve all descendants to correctly detect when a branch got merged in
      another and which tags apply.
      
      Unfortunately `commit.refs()` does not currently handle pagination, so
      only the first 20 items were downloaded.
      
      This meant that in some cases no tags were returned, so no version could
      be computed. In other cases this may have caused incorrect reports of
      branches not being merged into their downstream, or other issues due to
      computing the wrong version for the branch.
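
The effect of reading only the first page can be illustrated with a generic pagination loop. This sketch does not use python-gitlab: `fetch_all()`, `fake_fetch_page()` and the fake ref names are hypothetical stand-ins for a paginated API with a page size of 20.

```python
def fetch_all(fetch_page, per_page=20):
    """Collect every item by requesting pages until a short page is returned."""
    items = []
    page = 1
    while True:
        batch = fetch_page(page, per_page)
        items.extend(batch)
        if len(batch) < per_page:
            # A page shorter than per_page (possibly empty) is the last one.
            return items
        page += 1


# Stand-in for a paginated API: 45 fake refs served in slices.
ALL_REFS = [{"type": "tag", "name": f"debian/1.0-{i}"} for i in range(45)]


def fake_fetch_page(page, per_page):
    start = (page - 1) * per_page
    return ALL_REFS[start:start + per_page]


first_page_only = fake_fetch_page(1, 20)   # what a non-paginating call sees
everything = fetch_all(fake_fetch_page)    # what is needed to find all tags
```

Any tag that happens to fall outside the first 20 refs is invisible to the non-paginating call, which is how a branch could end up with no version or the wrong one.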
      
      Signed-off-by: Emanuele Aina <emanuele.aina@collabora.com>
  6. Jan 31, 2022
  7. Dec 31, 2021
  8. Dec 21, 2021
  9. Sep 17, 2021
  10. Feb 26, 2021
    • packaging-data-fetch-downstream: Workaround python-gitlab escaping bug · 5e651ed8
      Emanuele Aina authored
      
      Git refnames are relatively free-form and can contain all sorts of
      special characters, not just `/` and `#`, see
      http://git-scm.com/docs/git-check-ref-format
      
      In particular, Debian's DEP-14 standard for storing packaging in git
      repositories mandates the use of the `%` character in tags in some
      cases like `debian/2%2.6-21`.
      
      Unfortunately python-gitlab currently only escapes `/` to `%2F` and in
      some cases `#` to `%23`. This means that when using the commit API to
      retrieve information about the `debian/2%2.6-21` tag only the slash is
      escaped before being inserted in the URL path and the `%` is left
      untouched, resulting in something like
      `/api/v4/projects/123/repository/commits/debian%2F2%2.6-21`. When
      urllib3 sees that, it detects the invalid `%` escape and then urlencodes
      the whole string, resulting in
      `/api/v4/projects/123/repository/commits/debian%252F2%252.6-21`, where
      the original `/` got escaped twice and produced `%252F`.
      
      This works around the issue while waiting for the upstream fix,
      see https://github.com/python-gitlab/python-gitlab/pull/1336
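
The escaping problem can be reproduced with the standard library alone. This is a sketch of the behaviour described above, not the actual workaround from this commit: it uses `urllib.parse.quote()` with `safe=""` to show what fully escaping the refname looks like, and to mimic the second encoding pass attributed to urllib3.

```python
from urllib.parse import quote

ref = "debian/2%2.6-21"

# Escaping only the slash leaves a stray `%` that is not a valid escape.
slash_only = ref.replace("/", "%2F")

# Percent-encoding every reserved character, including `/` and `%`.
fully_escaped = quote(ref, safe="")

# Encoding the slash-only string a second time reproduces the double escaping.
double_escaped = quote(slash_only, safe="")

print(slash_only)
print(fully_escaped)
print(double_escaped)
```

Only the fully escaped form decodes back to the original tag name on the server side.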
      
      
      
      Signed-off-by: Emanuele Aina <emanuele.aina@collabora.com>
    • packaging-data-fetch-downstream: Fix latest version on branch · db27eb7f
      Emanuele Aina authored
      
      When the latest commit on a branch is not tagged (that is, there are
      unreleased commits) picking the first descendant tag is not correct
      since it can end up picking tags from descendant branches.
      
      For instance, if `apertis/v2022dev0` has unreleased commits and they get
      released in `apertis/v2022dev1`, the current code gets confused and
      considers them to have the same version.
      
      To avoid that, check which tags are actually contained in each branch
      and pick the latest.
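
A minimal sketch of that idea, assuming each branch is available as a list of commit ids ordered newest first and that "latest" means the tag on the newest tagged commit of the branch. The helper name, commit ids and tag names are hypothetical.

```python
def latest_tag_on_branch(branch_commits, tags_by_commit):
    """Return the tag on the newest tagged commit contained in the branch.

    branch_commits: commit ids of the branch, newest first.
    tags_by_commit: mapping from commit id to tag name.
    """
    for commit in branch_commits:
        if commit in tags_by_commit:
            return tags_by_commit[commit]
    return None


# c3 is an unreleased commit on dev0 that only gets tagged via c4 on dev1.
tags_by_commit = {"c2": "apertis/1.0-1", "c4": "apertis/1.0-2"}
dev0 = ["c3", "c2", "c1"]
dev1 = ["c4", "c3", "c2", "c1"]

dev0_version = latest_tag_on_branch(dev0, tags_by_commit)
dev1_version = latest_tag_on_branch(dev1, tags_by_commit)
```

Because only commits contained in the branch are considered, the tag on `c4` can no longer leak into the version reported for the older branch.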
      
      Signed-off-by: Emanuele Aina <emanuele.aina@collabora.com>
  11. Feb 24, 2021
  12. Aug 23, 2020
  13. Jul 29, 2020
    • Index data by package name · 070001e5
      Emanuele Aina authored and Martyn Welch committed
      
      Rather than indexing by repository name, use the package name as the
      main key since it is the common concept that ties GitLab, OBS and
      upstream sources.
      
      This simplifies some parts of the code as all the information is
      available from a single object instead of being spread across multiple
      data sources.
      
      Error reporting is also largely simplified by having a single `errors:`
      array on each package and making each error an object rather than a
      single string: iterating over every error is thus much simpler and the
      information about the error itself is now explicit rather than implicit
      based on its surrounding context (for instance, whether it was located
      on a branch, on the git project, or on the OBS package entry).
      
      The YAML structure went from:
      
          obs:
            packages:
              aalib:
                entries:
                  apertis:v2020:target:
                    name: aalib
                    errors:
                      - "ooops"
          projects:
            pkg/target/aalib:
              branches:
                debian/buster:
                  name: debian/buster
                  errors:
                    - "eeeww"
              errors:
                - "aaargh"
          sources:
            debian/buster:
              packages:
                aalib: [...]
      
      to:
      
          packages:
            aalib:
              obs:
                entries:
                  apertis:v2020:target: {...}
              git:
                branches:
                  debian/buster: {...}
              upstreams:
                debian/buster: [...]
              errors:
                - msg: "aaargh"
                - msg: "eeeww"
                  branch: debian/buster
                - msg: "ooops"
                  projects: [ "apertis:v2020:target" ]
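
With the new layout, collecting every error needs a single loop. The sketch below mirrors the example above as a plain Python dict instead of parsing YAML; the `iter_errors()` helper is hypothetical.

```python
data = {
    "packages": {
        "aalib": {
            "errors": [
                {"msg": "aaargh"},
                {"msg": "eeeww", "branch": "debian/buster"},
                {"msg": "ooops", "projects": ["apertis:v2020:target"]},
            ],
        },
    },
}


def iter_errors(data):
    """Yield (package name, error object) pairs from the new structure."""
    for name, package in data["packages"].items():
        for error in package.get("errors", []):
            yield name, error


all_errors = list(iter_errors(data))
# The context of each error is explicit in the object itself.
branch_errors = [error["msg"] for _, error in all_errors if "branch" in error]
```

The old layout would have required walking `obs`, `projects` and each branch separately, inferring the context of each string from where it was found.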
      
      Signed-off-by: Emanuele Aina <emanuele.aina@collabora.com>
    • cb04a701
  14. Jul 13, 2020
  15. Jul 11, 2020
  16. May 15, 2020
    • packaging: Gather data and trigger actions on packaging repositories · e7163690
      Emanuele Aina authored
      
      Introduce a pipeline to fetch data from multiple sources, cross-check
      the retrieved information and trigger actions.
      
      Each step emits YAML data that can be consumed by later steps and then
      merged again to render a dashboard, with the goal of easing the addition
      of more data sources and checks as much as possible.
      
      The current steps are:
      * packaging-data-fetch-upstream: grab package listings from the
        configured upstream sources
      * packaging-data-fetch-downstream: scan GitLab to collect data about
        the packaging repositories and branches
      * yaml-merge: dedicated tool to merge data from multiple sources
      * packaging-sanity-check: verify some invariants and report mismatches
      * packaging-updates: compute which packages have a newer upstream and
        trigger the pipeline to pull them in
      * dashboard: render a basic dashboard listing the identified errors
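
The merging step between stages could look roughly like the recursive merge below. This is a sketch under assumptions, not the behaviour of the actual `yaml-merge` tool: the function name `deep_merge()`, its merge rules (mappings merged, lists concatenated, other values overridden) and the sample data are all hypothetical.

```python
def deep_merge(base, extra):
    """Recursively merge `extra` into a new mapping based on `base`.

    Nested mappings are merged key by key, lists are concatenated,
    and for any other value the one from `extra` wins.
    """
    merged = dict(base)
    for key, value in extra.items():
        current = merged.get(key)
        if isinstance(current, dict) and isinstance(value, dict):
            merged[key] = deep_merge(current, value)
        elif isinstance(current, list) and isinstance(value, list):
            merged[key] = current + value
        else:
            merged[key] = value
    return merged


# Hypothetical outputs of three pipeline steps, already loaded from YAML.
upstream = {"sources": {"debian/buster": {"packages": {"aalib": {}}}}}
downstream = {"projects": {"pkg/target/aalib": {"branches": {"debian/buster": {}}}}}
checks = {"projects": {"pkg/target/aalib": {"errors": ["aaargh"]}}}

merged = deep_merge(deep_merge(upstream, downstream), checks)
```

Keeping each step's output as a separate document and merging late means a new data source only has to emit keys that fit into the shared tree.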
      
      By triggering only the pipelines where there's a known update pending
      we avoid the issues with the previous approach that involved running
      the pipeline on each of the 4000+ repositories every week, which ended
      up overwhelming GitLab.
      
      Signed-off-by: Emanuele Aina <emanuele.aina@collabora.com>