Performance boost #72

dennispo · 2021-03-14T16:11:10Z

Addresses issue #67.

The main idea of this change is to defer the "unpacking" of instances to a later stage. Unpacking now happens after the initial filtering of the observables, but before the high-level operators such as REPEATS or FOLLOWEDBY.
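To illustrate why the ordering matters, here is a minimal sketch, not the code in `stix2matcher/matcher.py`. It assumes that each observation carries a `number_observed` count and that "unpacking" means expanding it into that many instances. The names `unpack`, `filter_after_unpacking`, `unpack_after_filtering`, and `CountingPredicate` are hypothetical and exist only for this example.

```python
from typing import Any, Callable, Dict, List

Observation = Dict[str, Any]


def unpack(obs: Observation) -> List[Observation]:
    """Expand one observation into `number_observed` identical instances."""
    return [obs] * int(obs.get("number_observed", 1))


class CountingPredicate:
    """Wrap a filtering predicate and count how often it is evaluated."""

    def __init__(self, func: Callable[[Observation], bool]) -> None:
        self.func = func
        self.calls = 0

    def __call__(self, obs: Observation) -> bool:
        self.calls += 1
        return self.func(obs)


def filter_after_unpacking(observations, predicate):
    """Old ordering (assumed): unpack everything, then filter each instance."""
    instances = [inst for obs in observations for inst in unpack(obs)]
    return [inst for inst in instances if predicate(inst)]


def unpack_after_filtering(observations, predicate):
    """New ordering (assumed): filter once per observation, then unpack survivors."""
    survivors = [obs for obs in observations if predicate(obs)]
    return [inst for obs in survivors for inst in unpack(obs)]


# Made-up sample data: two observations standing for 503 instances in total.
observations = [
    {"name": "cmd.exe", "number_observed": 3},
    {"name": "notepad.exe", "number_observed": 500},
]

eager_pred = CountingPredicate(lambda o: o["name"] == "cmd.exe")
deferred_pred = CountingPredicate(lambda o: o["name"] == "cmd.exe")

eager_result = filter_after_unpacking(observations, eager_pred)
deferred_result = unpack_after_filtering(observations, deferred_pred)

print(eager_pred.calls, deferred_pred.calls)
```

Both orderings return the same three matching instances, but the predicate runs once per instance in the first case and once per observation in the second. Operators such as REPEATS still see the unpacked instances, which is why the unpacking cannot be skipped entirely.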

In experiments on real data produced by stix-shifter from low-level monitoring events (such as EDR or Sysmon events), the performance boost can be enormous. In one example, 15,754 instances are generated from 100 original observables. In another, 189,023 instances are generated from 300 original observables.

Comparing the baseline against a version that avoids duplicating instances, the timings for the two examples above are as follows:

| Improvement | Time measured for 100 observed_data | % of improvement | Time measured for 300 observed_data | % of improvement |
|---|---|---|---|---|
| basic | 0:04:59.641 | | 1:00:40.532 | |
| events deduplication | 0:00:06.043 | 97.98% | 0:00:15.945 | 99.56% |

For the two examples above, the speedup is roughly 50x and more than 200x, respectively.
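The percentages and speedup factors can be re-derived from the measured times. The sketch below assumes the times are formatted as `H:MM:SS.mmm` and that "% of improvement" means the fraction of run time saved. The helper names `parse_elapsed` and `improvement` are made up for this check.

```python
from datetime import timedelta
from typing import Tuple


def parse_elapsed(text: str) -> float:
    """Convert an 'H:MM:SS.mmm' string into seconds."""
    hours, minutes, seconds = text.split(":")
    delta = timedelta(hours=int(hours), minutes=int(minutes), seconds=float(seconds))
    return delta.total_seconds()


def improvement(before: str, after: str) -> Tuple[float, float]:
    """Return (percent of time saved, speedup factor)."""
    b, a = parse_elapsed(before), parse_elapsed(after)
    return (1 - a / b) * 100, b / a


# Timings copied from the comparison in this description.
measurements = [
    ("100 observed_data", "0:04:59.641", "0:00:06.043"),
    ("300 observed_data", "1:00:40.532", "0:00:15.945"),
]

for label, before, after in measurements:
    pct, speedup = improvement(before, after)
    print(f"{label}: {pct:.2f}% less time, {speedup:.1f}x faster")
```

This reproduces the 97.98% and 99.56% figures and gives speedup factors of about 49.6 and 228.3.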

CLAassistant · 2021-03-14T16:11:14Z

All committers have signed the CLA.

codecov-io · 2021-03-14T16:12:11Z

Codecov Report

Merging #72 (654591f) into master (20b7d86) will increase coverage by 0.11%.
The diff coverage is 94.11%.

@@            Coverage Diff             @@
##           master      #72      +/-   ##
==========================================
+ Coverage   90.00%   90.11%   +0.11%     
==========================================
  Files          13       13              
  Lines        1070     1082      +12     
==========================================
+ Hits          963      975      +12     
  Misses        107      107

| Impacted Files | Coverage Δ | |
|---|---|---|
| stix2matcher/test/test_complex.py | `100.00% <ø> (+4.54%)` | ⬆️ |
| stix2matcher/matcher.py | `87.23% <94.11%> (+0.07%)` | ⬆️ |

dennispo added 2 commits March 14, 2021 17:47

Fix minor typos

ae43a5a

Push observations "unpacking" after the filtering predicates, before …

654591f

…the high level operators like REPEATS.

mdazam1942 approved these changes Mar 17, 2021

mdazam1942 merged commit 59d13c7 into oasis-open:master Mar 17, 2021
