Fix missing sinks in attack graphs #14

Merged
merged 5 commits into from
Jul 12, 2023

@jzelenjak jzelenjak commented Jun 18, 2023

Description

Issue 1

The traverse method misses some sink states. Namely, if there is a transition from a non-sink state to a sink state (the transition is defined for the non-sink node in the final.json file), then the sink state is missed (i.e. drawn solid, see images below).

Example 1: states 427, 67 and 371 are sinks in the finalsinks.json file generated by FlexFringe (this can also be seen in the S-PDFA fragments, where the count of these states is < sink_count=5); however, in the AGs they are drawn as non-sinks. Note: this example was produced after applying the fix for unknown from PR #12. The left AG is before the fix, the right AG is after the fix.

pr1 1
pr1 2
pr1 3
pr1 4

Example 2: state 82 is a non-sink state (in the left image, before the fix), while in the finalsinks.json file generated by FlexFringe it is a sink.

pr1 5

Example 3: states 80 and 81 are non-sink states (in the left image, before the fix), while in the finalsinks.json file generated by FlexFringe they are sinks.
pr1 6

Example 4: state 70 is a non-sink state (in the left image before the fix), while in finalsinks.json file generated by FlexFringe it is a sink

pr1 7

Issue 2

If a sequence ends with a sink, the sink is not added to sev_sinks, because state is not an empty string at that point. The final.json file contains transitions from non-sinks to sinks, but the sinks themselves are not defined there (they are defined in the finalsinks.json file). As a result, SAGE misses sinks that are at the end of a sequence.

pr1 8
pr1 9

Issue 3

Sometimes there is a transition from a red non-sink state to a blue sink state. However, in the default spdfa-config.ini, the printblue parameter is set to 0. As a result, the transition is missing from the final.json file, so medium-severity states are assigned ID: -1 even though they do have an ID. This is especially problematic when an AG contains multiple such states, because they are all merged into a single state with ID: -1. It also hinders interpretability.

Before (fragment):

pr1 10

After (fragment):
pr1 11

Another example, together with the S-PDFA fragments (note: here 287 and 289 are the same node, as are 1451 and 1437; the screenshot of the S-PDFA merge was taken after implementing all the fixes, including PR #12 and PR #11):

pr1 12
pr1 13
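
The merging effect described in Issue 3 can be illustrated with a toy lookup. This is only a sketch under assumptions: the dictionaries, the source state IDs "10" and "20", and the helper name are hypothetical and do not reflect SAGE's data structures; only the -1 fallback and the target IDs 371 and 863 come from this PR.

```python
def target_ids(transitions, edges):
    """Resolve each (state, symbol) edge to a target ID, falling back to "-1"."""
    return [transitions.get(state, {}).get(symbol, "-1") for state, symbol in edges]


# Two different red states, each with a transition to a distinct blue sink.
edges = [("10", "snmp"), ("20", "snmp")]

# Hypothetical model contents with printblue=0: transitions to blue states are absent.
without_blue = {}
# Hypothetical model contents with printblue=1: the same transitions are present.
with_blue = {"10": {"snmp": "371"}, "20": {"snmp": "863"}}

print(sorted(set(target_ids(without_blue, edges))))  # ['-1']
print(sorted(set(target_ids(with_blue, edges))))     # ['371', '863']
```

With the transitions missing, both targets collapse into the single ID -1, which is why two distinct states end up drawn as one node.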

Issue 4

In the make_AG method, there are three places that check whether states are sinks and, if they are, draw them dotted. The check uses the endswith method, namely obj.endswith(sink). However, this produces a false positive whenever sink is a proper suffix of the last part of the objective variant. For example, the condition obj.endswith(sink) evaluates to True if sink=75 and obj="DATA_EXFILTRATION|http|375", so state 375 is drawn dotted even though it is not a sink.

pr1 14
pr1 15
pr1 16

Proposed fixes

First, replace the endswith method with a strict equality comparison after splitting the objective or vertex name.

pr1 17
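
A minimal sketch of the difference between the two checks, assuming (as in the Issue 4 example) that names are separated by "|" and that the state ID is the last field. The helper names are hypothetical and are not taken from SAGE.

```python
def is_sink_suffix(name: str, sink: str) -> bool:
    """Original-style check: true whenever the name merely ends with the sink ID."""
    return name.endswith(sink)


def is_sink_exact(name: str, sink: str, sep: str = "|") -> bool:
    """Stricter check: the last separator-delimited field must equal the sink ID."""
    return name.split(sep)[-1] == sink


obj = "DATA_EXFILTRATION|http|375"
print(is_sink_suffix(obj, "75"))  # True: false positive, 375 is not sink 75
print(is_sink_exact(obj, "75"))   # False
print(is_sink_exact(obj, "375"))  # True
```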

Second, at the end of the traverse method, add a check that adds the last state to the sev_sinks set when it is in sinks. This prevents skipping sinks that are the heads of edges whose tails are non-sinks (defined in the main model). Note that low-severity sinks, which have ID: -1, are not added, as -1 is never in sinks (this is the intended behaviour).

pr1 18
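
The effect of the end-of-sequence check can be sketched with a toy traversal. This is not SAGE's traverse method: the data layout, the function signature, and the check_last flag are assumptions made purely for illustration. The only details taken from this PR are that sinks have no entry of their own in the main model, that a failed lookup yields an empty string, and that -1 is never in sinks.

```python
def traverse(main, sinks, sequence, root="0", check_last=True):
    """Walk `sequence` through a toy main model and collect the sinks reached.

    main:  dict of state -> {symbol: next_state}; sink states have no entry.
    sinks: set of state IDs that the sinks model marks as sinks.
    """
    sev_sinks = set()
    state = root
    for symbol in sequence:
        nxt = main.get(state, {}).get(symbol, "")
        if nxt == "":
            # No transition in the main model: only here does the toy
            # traversal notice that the current state may be a sink.
            if state in sinks:
                sev_sinks.add(state)
            nxt = "-1"
        state = nxt
    # The added check: a sink reached by the final symbol is never followed
    # by a failed lookup, so without this it would be missed.
    if check_last and state in sinks:
        sev_sinks.add(state)
    return sev_sinks


main = {"0": {"a": "12"}, "12": {"b": "82"}}
sinks = {"82"}
print(traverse(main, sinks, ["a", "b"], check_last=False))  # set()
print(traverse(main, sinks, ["a", "b"], check_last=True))   # {'82'}
```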

Third, set printblue=1 by default in the spdfa-config.ini file. This removes ID: -1 from all medium-severity states and uses the actual IDs from the files generated by FlexFringe (FF).

Result

  1. The resulting AGs are the same in terms of node names (excluding state IDs) and edges, except for the AG for 10.0.0.1|NETWORKDOS|mswbtserver, where before there was one ACCOUNT MANIPULATION|-1 node and after there are two such nodes: ACCOUNT MANIPULATION|snmp|ID: 371 and ACCOUNT MANIPULATION|snmp|ID: 863.

pr1 19

  2. Plenty of lost sinks have been found (note: these stats are from before the fixes in PR #11, "Fix bug in make_ag() if statement", and PR #12, "Change Unknown/unknown to behave the same on both linux and windows").

pr1 20

  3. All medium-severity states with ID: -1 have disappeared:
[jegor@arch SAGE-fork]$ find before-2017AGs/ -type f -name '*.dot' | xargs gvpr 'N { print(gsub(gsub($.name, "\r"), "\n", " | ")); }' | grep -- 'ID: -1' | wc -l
71
[jegor@arch SAGE-fork]$ find before-2017AGs/ -type f -name '*.dot' | xargs gvpr 'N { print(gsub(gsub($.name, "\r"), "\n", " | ")); }' | grep -- 'ID: -1' | sort -u | wc -l
10
[jegor@arch SAGE-fork]$ find after-2017AGs/ -type f -name '*.dot' | xargs gvpr 'N { print(gsub(gsub($.name, "\r"), "\n", " | ")); }' | grep -- 'ID: -1' | wc -l
0
[jegor@arch SAGE-fork]$ find before-2018AGs/ -type f -name '*.dot' | xargs gvpr 'N { print(gsub(gsub($.name, "\r"), "\n", " | ")); }' | grep -- 'ID: -1' | wc -l
2
[jegor@arch SAGE-fork]$ find after-2018AGs/ -type f -name '*.dot' | xargs gvpr 'N { print(gsub(gsub($.name, "\r"), "\n", " | ")); }' | grep -- 'ID: -1' | wc -l
0
  4. We have verified all our findings with the following pipelines. First, we get all unique sinks in CPTC-2017 (analogously for CPTC-2018) for the AGs before and after the fixes. Then we take the ones that are present only after the fix (comm -13 …) and extract their IDs (sed …). In parallel, using jq, we take the before-2017.txt.ff.finalsinks.json file (which is identical to the after-2017.txt.ff.finalsinks.json file) and print the IDs of only those nodes that are sinks. Finally, we take the intersection of the two ID lists (comm -12 …) and count the result ( … | wc -l). The count equals the number of sink IDs that are present only after the fix (i.e. the sinks lost by the original version of SAGE).
# All new-found sinks in 2017 are indeed sinks
[jegor@arch SAGE-fork]$ sinks_before_2017=$(find before-2017AGs/ -type f -name '*.dot' | xargs grep -F -l "dotted" | xargs gvpr 'N [ $.style == "dotted" || $.style == "filled,dotted" || $.style == "dotted,filled" ] { print(gsub(gsub($.name, "\r"), "\n", " | ")); }' | sort -u | uniq -i)
[jegor@arch SAGE-fork]$ sinks_after_2017=$(find after-2017AGs/ -type f -name '*.dot' | xargs grep -F -l "dotted" | xargs gvpr 'N [ $.style == "dotted" || $.style == "filled,dotted" || $.style == "dotted,filled" ] { print(gsub(gsub($.name, "\r"), "\n", " | ")); }' | sort -u | uniq -i)
[jegor@arch SAGE-fork]$ echo -e "$sinks_before_2017" | wc -l
66
[jegor@arch SAGE-fork]$ echo -e "$sinks_after_2017" | wc -l
141
[jegor@arch SAGE-fork]$ new_sinks_2017=$(comm -13 <(echo -e "$sinks_before_2017") <(echo -e "$sinks_after_2017"))
[jegor@arch SAGE-fork]$ echo -e "$new_sinks_2017" | wc -l
75
[jegor@arch SAGE-fork]$ diff -q before-2017.txt.ff.finalsinks.json after-2017.txt.ff.finalsinks.json 
[jegor@arch SAGE-fork]$ all_sinks_2017=$(jq '.nodes[] | select(.issink==1) | .id' before-2017.txt.ff.finalsinks.json | sort)
[jegor@arch SAGE-fork]$ comm -12 <(echo -e "$new_sinks_2017" | sed 's/^.*ID: \([0-9-]\+\)$/\1/' | sort) <(echo -e "$all_sinks_2017") | wc -l
75

# All new-found sinks in 2018 are indeed sinks
[jegor@arch SAGE-fork]$ sinks_before_2018=$(find before-2018AGs/ -type f -name '*.dot' | xargs grep -F -l "dotted" | xargs gvpr 'N [ $.style == "dotted" || $.style == "filled,dotted" || $.style == "dotted,filled" ] { print(gsub(gsub($.name, "\r"), "\n", " | ")); }' | sort -u | uniq -i)
[jegor@arch SAGE-fork]$ sinks_after_2018=$(find after-2018AGs/ -type f -name '*.dot' | xargs grep -F -l "dotted" | xargs gvpr 'N [ $.style == "dotted" || $.style == "filled,dotted" || $.style == "dotted,filled" ] { print(gsub(gsub($.name, "\r"), "\n", " | ")); }' | sort -u | uniq -i)
[jegor@arch SAGE-fork]$ echo -e "$sinks_before_2018" | wc -l
58
[jegor@arch SAGE-fork]$ echo -e "$sinks_after_2018" | wc -l
104
[jegor@arch SAGE-fork]$ new_sinks_2018=$(comm -13 <(echo -e "$sinks_before_2018") <(echo -e "$sinks_after_2018"))
[jegor@arch SAGE-fork]$ echo -e "$new_sinks_2018" | wc -l
46
[jegor@arch SAGE-fork]$ diff -q before-2018.txt.ff.finalsinks.json after-2018.txt.ff.finalsinks.json
[jegor@arch SAGE-fork]$ all_sinks_2018=$(jq '.nodes[] | select(.issink==1) | .id' before-2018.txt.ff.finalsinks.json | sort)
[jegor@arch SAGE-fork]$ comm -12 <(echo -e "$new_sinks_2018" | sed 's/^.*ID: \([0-9-]\+\)$/\1/' | sort) <(echo -e "$all_sinks_2018") | wc -l
46
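
The set logic behind the comm pipelines above can also be sketched in Python. This is an illustrative equivalent, not the procedure actually used in this PR: the function names and the sample labels are hypothetical, while the "ID: <n>" suffix and the nodes / issink / id fields follow the sed and jq expressions above. Note that it counts unique IDs rather than lines.

```python
import re

ID_PATTERN = re.compile(r"ID: (-?\d+)$")


def extract_ids(labels):
    """Collect the trailing 'ID: <n>' value from every label that has one."""
    ids = set()
    for label in labels:
        match = ID_PATTERN.search(label)
        if match:
            ids.add(match.group(1))
    return ids


def model_sink_ids(model):
    """IDs of nodes flagged issink == 1 in a parsed finalsinks.json-like dict."""
    return {str(node["id"]) for node in model["nodes"] if node.get("issink") == 1}


def lost_sinks(before_labels, after_labels, model):
    """Return (IDs of sinks found only after the fix, those confirmed by the model)."""
    new_labels = set(after_labels) - set(before_labels)  # like comm -13
    new_ids = extract_ids(new_labels)
    confirmed = new_ids & model_sink_ids(model)          # like comm -12
    return new_ids, confirmed


# Hypothetical sample data in the shape printed by the gvpr commands.
model = {"nodes": [{"id": 82, "issink": 1}, {"id": 70, "issink": 1}, {"id": 12, "issink": 0}]}
before = ["A|http | ID: 70"]
after = ["A|http | ID: 70", "B|ssh | ID: 82"]

new_ids, confirmed = lost_sinks(before, after, model)
print(new_ids == confirmed)  # True when every newly found sink is a sink in the model
```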

Furthermore, in a similar way, we have verified that all sinks found after all fixes are indeed sinks in FlexFringe and that all non-sinks with defined IDs (i.e. the ID has not been removed) are indeed non-sinks in FlexFringe:

# All found sinks in 2017 are indeed sinks (after all fixes)
[jegor@arch SAGE-fork]$ sinks_after_2017=$(find after-2017AGs/ -type f -name '*.dot' | xargs grep -F -l "dotted" | xargs gvpr 'N [ $.style == "dotted" || $.style == "filled,dotted" || $.style == "dotted,filled" ] { print(gsub(gsub($.name, "\r"), "\n", " | ")); }' | sort -u | uniq -i)
[jegor@arch SAGE-fork]$ all_sinks_2017=$(jq '.nodes[] | select(.issink==1) | .id' before-2017.txt.ff.finalsinks.json | sort)
[jegor@arch SAGE-fork]$ echo -e "$sinks_after_2017" | wc -l
141
[jegor@arch SAGE-fork]$ comm -12 <(echo -e "$sinks_after_2017" | sed 's/^.*ID: \([0-9-]\+\)$/\1/' | sort) <(echo -e "$all_sinks_2017") | wc -l
141

# All found sinks in 2018 are indeed sinks (after all fixes)
[jegor@arch SAGE-fork]$ sinks_after_2018=$(find after-2018AGs/ -type f -name '*.dot' | xargs grep -F -l "dotted" | xargs gvpr 'N [ $.style == "dotted" || $.style == "filled,dotted" || $.style == "dotted,filled" ] { print(gsub(gsub($.name, "\r"), "\n", " | ")); }' | sort -u | uniq -i)
[jegor@arch SAGE-fork]$ all_sinks_2018=$(jq '.nodes[] | select(.issink==1) | .id' before-2018.txt.ff.finalsinks.json | sort)
[jegor@arch SAGE-fork]$ echo -e "$sinks_after_2018" | wc -l
104
[jegor@arch SAGE-fork]$ comm -12 <(echo -e "$sinks_after_2018" | sed 's/^.*ID: \([0-9-]\+\)$/\1/' | sort) <(echo -e "$all_sinks_2018") | wc -l
104
# All non-sinks with IDs in 2017 are indeed non-sinks (after all fixes)
[jegor@arch SAGE-fork]$ non_sinks_with_ids_after_2017=$(find after-2017AGs/ -type f -name '*.dot' | xargs gvpr 'N [ $.style != "dotted" && $.style != "filled,dotted" && $.style != "dotted,filled" ] { print(gsub(gsub($.name, "\r"), "\n", " | ")); }' | sort -u | uniq -i | grep 'ID: ')
[jegor@arch SAGE-fork]$ echo -e "$non_sinks_with_ids_after_2017" | wc -l
28
[jegor@arch SAGE-fork]$ all_non_sinks_after_2017=$(jq '.nodes[] | select(.issink==0) | .id' after-2017.txt.ff.final.json | sort -u)
[jegor@arch SAGE-fork]$ comm -12 <(echo -e "$non_sinks_with_ids_after_2017" | sed 's/^.*ID: \([0-9-]\+\)$/\1/' | sort -u) <(echo -e "$all_non_sinks_after_2017") | wc -l
28

# All non-sinks with IDs in 2018 are indeed non-sinks (after all fixes)
[jegor@arch SAGE-fork]$ non_sinks_with_ids_after_2018=$(find after-2018AGs/ -type f -name '*.dot' | xargs gvpr 'N [ $.style != "dotted" && $.style != "filled,dotted" && $.style != "dotted,filled" ] { print(gsub(gsub($.name, "\r"), "\n", " | ")); }' | sort -u | uniq -i | grep 'ID: ')
[jegor@arch SAGE-fork]$ echo -e "$non_sinks_with_ids_after_2018" | wc -l
16
[jegor@arch SAGE-fork]$ all_non_sinks_after_2018=$(jq '.nodes[] | select(.issink==0) | .id' after-2018.txt.ff.final.json | sort -u)
[jegor@arch SAGE-fork]$ comm -12 <(echo -e "$non_sinks_with_ids_after_2018" | sed 's/^.*ID: \([0-9-]\+\)$/\1/' | sort -u) <(echo -e "$all_non_sinks_after_2018") | wc -l
16

@jzelenjak jzelenjak changed the title Missing sinks Fix missing sinks in attack graphs Jun 18, 2023
@jzelenjak jzelenjak marked this pull request as ready for review June 18, 2023 10:58
@jzelenjak jzelenjak marked this pull request as draft June 21, 2023 07:31
@jzelenjak jzelenjak marked this pull request as ready for review June 21, 2023 09:47
@azqanadeem

The .ini file is not necessary here; it is already part of the docker container. Please remove the unnecessary changes from this PR so that it also works directly with the docker container.

jzelenjak commented Jul 10, 2023

I have removed the config file and changed back the path_to_ini. I thought that the spdfa-config.ini file was missing, since we had to take it from the docker branch.

@azqanadeem azqanadeem merged commit acd4bce into tudelft-cda-lab:main Jul 12, 2023
@jzelenjak jzelenjak mentioned this pull request Aug 25, 2023

2 participants