Use JSON to improve efficiency
YAML is a nice, very readable format, but unfortunately the Python YAML library is very inefficient in both CPU and memory usage. Loading the same content using JSON takes roughly 10 times less memory and time.
Since the dashboard is always struggling with OOM errors, let's use JSON for the data it produces.
As a reference, the results of importing a 10 MB file with YAML and JSON are shown below:
yaml-json $ ./test.py yaml
Time 15.507138013839722 s
Memory (70914086, 394044198) bytes (current, peak)
yaml-json $ ./test.py json
Time 0.6210496425628662 s
Memory (58913059, 67501787) bytes (current, peak)
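The measurement above can be reproduced with a small harness built on `time.perf_counter` and `tracemalloc`. This is a sketch of what a script like `test.py` might look like; the actual script is not part of this MR, and the sample payload here is generated in-process rather than read from a 10 MB file:

```python
import json
import time
import tracemalloc

def measure(load, text):
    """Return (seconds, peak_bytes) for parsing `text` with `load`."""
    tracemalloc.start()
    t0 = time.perf_counter()
    load(text)
    elapsed = time.perf_counter() - t0
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return elapsed, peak

# Illustrative payload; the real benchmark imported a 10 MB file.
data = json.dumps([{"job": i, "status": "ok"} for i in range(10000)])

secs, peak = measure(json.loads, data)
print(f"Time {secs} s")
print(f"Memory peak {peak} bytes")

# The YAML side would be measured the same way, e.g.:
#   import yaml
#   measure(yaml.safe_load, data)  # JSON documents are also valid YAML
```

Because JSON is (for practical purposes) a subset of YAML, the very same payload can be fed to both parsers, which makes the comparison apples-to-apples.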
https://phabricator.apertis.org/T10163
Signed-off-by: Walter Lozano walter.lozano@collabora.com
requested review from @em
assigned to @wlozano
- Resolved by Christopher Obbard
the Python yaml library is very inefficient both CPU and memory wise
Can you quantify before vs after in the commit message?
- Resolved by Walter Lozano
Cool, thank you!
Now, if you can switch all jobs to the lightweight runner without having them go OOM this would be a massive win! :D
added 4 commits
-
c0779143...7978628f - 2 commits from branch
master
- a76e1389 - Use JSON to improve efficiency
- a0f2dd7e - Switch to lightweight runners
added 6 commits
-
b383488c...ebab6716 - 3 commits from branch
master
- deb6cc98 - Only pass cache if file is not empty
- 254b71ed - Use JSON to improve efficiency
- a122dd3e - Switch to lightweight runners
- Resolved by Emanuele Aina
@em I have pushed a small fix, a new commit that finishes fixing the pipeline for the case where a cache cannot be downloaded. Please give it a quick look and, if you are happy, hit the button.
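The "Only pass cache if file is not empty" commit suggests a guard along these lines. The function name and paths below are illustrative, not the dashboard's actual code; the idea is simply to treat a missing or zero-byte cache file as "no cache" instead of letting `json.load` fail on it:

```python
import json
import os

def load_cache(path):
    """Return cached data, or None when the cache file is missing or empty.

    A failed cache download can leave no file (or an empty one) behind;
    json.load would raise on either, so we bail out early instead.
    """
    if not os.path.exists(path) or os.path.getsize(path) == 0:
        return None
    with open(path) as f:
        return json.load(f)

print(load_cache("missing-cache.json"))  # missing cache yields None
```

With this shape, the pipeline can fall back to regenerating the data whenever `load_cache` returns `None`, instead of crashing mid-run.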