author: Jake Hunsaker <jhunsake@redhat.com> 2022-01-13 13:52:34 -0500
committer: Jake Hunsaker <jhunsake@redhat.com> 2022-01-17 12:24:06 -0500
commit: ed618678fd3d07e68e1a430eb7d225a9701332e0
tree: ca347bf38aa8a5f84b4cc89fbc0b026b2bec5b14 /tests/unittests
parent: f270220fddb70ef71a8da0376333b2454d7c4983
[clean,parsers] Build regex lists for static items only once

For parsers such as the username and keyword parsers, we don't discover new items through parsing archives - these parsers use static lists determined before we begin the actual obfuscation process.

As such, we can build a list of regexes for these static items once, and then reference those regexes during execution, rather than rebuilding the regex for each of these items for every obfuscation.

For use cases where hundreds of items, e.g. hundreds of usernames, are being obfuscated this results in a significant performance increase. Individual per-file gains are minor - fractions of a second - however these gains build up over the course of the hundreds to thousands of files a typical archive can be expected to contain.

Signed-off-by: Jake Hunsaker <jhunsake@redhat.com>
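
The idea described above can be illustrated with a minimal sketch. This is not the sos implementation: `StaticItemObfuscator`, its `mapping` attribute, and the `obfuscatedN` placeholders are hypothetical names chosen for illustration, and only the method name `generate_item_regexes` is borrowed from the diff below. The sketch assumes the item list is fixed before any line is processed, so each pattern can be compiled once and reused.

```python
import re


class StaticItemObfuscator:
    """Hypothetical stand-in for a parser whose items are known up front."""

    def __init__(self, items):
        # Map each static item to a placeholder (illustrative naming only).
        self.mapping = {
            item: f"obfuscated{idx}" for idx, item in enumerate(items)
        }
        self.item_regexes = []

    def generate_item_regexes(self):
        # Compile one pattern per item a single time. Longer items go first
        # so an item that contains a shorter one is replaced before it.
        ordered = sorted(
            self.mapping.items(), key=lambda kv: len(kv[0]), reverse=True
        )
        self.item_regexes = [
            (re.compile(re.escape(item), re.I), replacement)
            for item, replacement in ordered
        ]

    def obfuscate_line(self, line):
        # Reuse the precompiled patterns instead of rebuilding them per line.
        count = 0
        for regex, replacement in self.item_regexes:
            line, n = regex.subn(replacement, line)
            count += n
        return line, count


parser = StaticItemObfuscator(["foobar", "jdoe"])
parser.generate_item_regexes()
result = parser.obfuscate_line("user jdoe logged in to FOOBAR host")
```

In this sketch, a caller that skips `generate_item_regexes()` gets lines back unchanged, which is presumably why the unit test setup in the diff below calls it explicitly after constructing the keyword parser.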
Diffstat (limited to 'tests/unittests')
-rw-r--r-- tests/unittests/cleaner_tests.py 1
1 file changed, 1 insertion, 0 deletions
diff --git a/tests/unittests/cleaner_tests.py b/tests/unittests/cleaner_tests.py
index cb20772f..b59eade9 100644
--- a/tests/unittests/cleaner_tests.py
+++ b/tests/unittests/cleaner_tests.py
@@ -105,6 +105,7 @@ class CleanerParserTests(unittest.TestCase):
         self.host_parser = SoSHostnameParser(config={}, opt_domains='foobar.com')
         self.kw_parser = SoSKeywordParser(config={}, keywords=['foobar'])
         self.kw_parser_none = SoSKeywordParser(config={})
+        self.kw_parser.generate_item_regexes()
 
     def test_ip_parser_valid_ipv4_line(self):
         line = 'foobar foo 10.0.0.1/24 barfoo bar'