Deduplication of Detections by Public Id #542
Merged
To support this properly, engines need to process their rules in a deterministic order. If different detections are marked as duplicates from one sync to the next, you never really know which rules your system is executing.
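A minimal sketch of the idea, not the code in this PR: when detections arrive in a stable order, a "first one wins" pass over the public ID always marks the same entries as duplicates. The `Detection` type, its fields, and `DedupeByPublicID` are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// Detection is a hypothetical, trimmed-down stand-in for a real detection.
type Detection struct {
	PublicID string
	Title    string
}

// DedupeByPublicID keeps the first detection seen for each PublicID and
// reports every later one as a duplicate. Input order is preserved, so a
// deterministic input order gives a deterministic result.
func DedupeByPublicID(dets []Detection) (kept, dupes []Detection) {
	seen := make(map[string]struct{}, len(dets))
	for _, d := range dets {
		if _, ok := seen[d.PublicID]; ok {
			dupes = append(dupes, d)
			continue
		}
		seen[d.PublicID] = struct{}{}
		kept = append(kept, d)
	}
	return kept, dupes
}

func main() {
	dets := []Detection{
		{PublicID: "100", Title: "from first repo"},
		{PublicID: "200", Title: "unique"},
		{PublicID: "100", Title: "from second repo"},
	}
	kept, dupes := DedupeByPublicID(dets)
	fmt.Println("kept:", kept)
	fmt.Println("duplicates:", dupes)
}
```

If the input order changed between syncs, the entry from the second repo could be the one that is kept, which is the instability described above.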
UpdateRepos has been refactored to return an array instead of an unordered map. The array of RepoOnDisk (formerly DirtyRepo) will be in the same order as the repos listed in the config.
Strelka needed even more refactoring because it used to process each rule from beginning to end before moving on to the next rule. Now it gathers all the detections, dedupes them, then syncs them. The call to WalkDir was also refactored to go through the IOManager instead of calling filepath.WalkDir directly.
Suricata parses the rules in the order that they exist in the `communityRulesFile` file. Strelka processes the repos in the order they appear in the config; inside each repo, the files are parsed in lexical order.

Some tests needed to be updated to work around the changes to UpdateRepos and other determinism changes. Added a new test for the deduplication process.