author | Jake Hunsaker <jhunsake@redhat.com> | 2021-07-16 12:50:06 -0400 |
---|---|---|
committer | Jake Hunsaker <jhunsake@redhat.com> | 2021-08-04 09:00:27 -0400 |
commit | 611b17788ee289bebfe3b13404fae73efd112f70 (patch) | |
tree | c835998174af28aec1052ee9145d1810a30ac990 /tmpfilesd-sos.conf | |
parent | 4e5bebffca9936bcdf4d38aad9989970a15dd72b (diff) | |
download | sos-611b17788ee289bebfe3b13404fae73efd112f70.tar.gz |
[cleaner] Use a nested ProcessPoolExecutor for extraction

This commit inserts a nested ProcessPoolExecutor into the extraction
workflow for archives that are being obfuscated by `sos clean`.

Previously, extraction was handled inside the same thread as the rest of
the obfuscation routines for each archive. However, it has been found
that when very large archives are processed concurrently, performance
can take a massive hit during the extraction process. This is due to
limitations of the Python global interpreter lock (GIL).

In this context, 'very large archives' means archives containing many
tens of thousands of files, e.g. 50K+. Because TarFile uses a 10K
internal buffer, we end up spending a lot of time processing each file
via the interpreter.

By shunting each extraction off into a new process space, we can avoid
the GIL issues altogether.

Signed-off-by: Jake Hunsaker <jhunsake@redhat.com>
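
The approach described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the sos tree: the function names `extract_archive`, `extract_in_subprocess`, and `process_archives` are hypothetical, and the outer `ThreadPoolExecutor` is an assumption standing in for whatever per-archive threading `sos clean` actually uses. The point it shows is that each worker thread hands its extraction to a single-worker `ProcessPoolExecutor`, so the tarfile work runs in a separate interpreter with its own GIL.

```python
import tarfile
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def extract_archive(archive_path, dest_dir):
    """Extract a tarball into dest_dir and return dest_dir.

    Defined at module level so it can be pickled and sent to a worker process.
    """
    with tarfile.open(archive_path) as tar:
        if hasattr(tarfile, "data_filter"):
            # Newer Python releases support extraction filters.
            tar.extractall(dest_dir, filter="data")
        else:
            tar.extractall(dest_dir)
    return dest_dir


def extract_in_subprocess(archive_path, dest_dir):
    """Hypothetical helper: run one extraction in its own process.

    The calling thread only blocks on the result, so the extraction itself
    does not compete for the parent interpreter's GIL.
    """
    with ProcessPoolExecutor(max_workers=1) as pool:
        return pool.submit(extract_archive, archive_path, dest_dir).result()


def process_archives(jobs, max_threads=4):
    """Handle several archives concurrently (assumed outer thread pool).

    jobs is a list of (archive_path, dest_dir) tuples. Each thread nests a
    process pool for the extraction step; further per-archive work would
    follow in the same thread.
    """
    with ThreadPoolExecutor(max_workers=max_threads) as threads:
        futures = [threads.submit(extract_in_subprocess, a, d) for a, d in jobs]
        return [f.result() for f in futures]
```

On platforms where multiprocessing starts workers with `spawn` or `forkserver`, the code calling `process_archives` should sit under an `if __name__ == "__main__":` guard so that worker processes can import the module safely.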