Skip to content
Snippets Groups Projects
Commit 2f7e27e6 authored by Ryan Gonzalez's avatar Ryan Gonzalez
Browse files

dh_setup_copyright: Improve performance of matching files on large trees


The previous version of this code was `O(m*n)` where `m` = the number of
files in the source tree and `n` = the number of files referenced by
the binaries. In most cases, those numbers are quite small, but on large
packages they can grow incredibly large. For instance, for rustc's main
binary, `m > 300k` and `n > 33k`, resulting in each outer loop iteration
taking an average of ~1.9s. That would result in a runtime of over 17
hours, which is a rather absurd bump to the build time.

Instead, we can reorganize the code so that the source tree contents are
stored in a hash, indexed by basename. That turns the entire inner
matching loop into a single hash lookup, bringing the outer loop runtime
to a worst-case single-digit number of milliseconds.

Fixes: infrastructure/apertis-issues#595

Signed-off-by: default avatarRyan Gonzalez <ryan.gonzalez@collabora.com>
parent 74473f7b
No related branches found
No related tags found
Loading
Checking pipeline status
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment