Set GitHub Actions with test cases (#46)
* Add method-level documentation for SAGE

* Add test files for SAGE

* Add sudo to apt in the workflow

* Add GitHub token to the workflow

* Add gzipped alerts

* Add extracting alerts to workflow

* Update test.yml

* Update test.yml

* Update test.yml

* Update test.yml

* Update test.yml

* Update test.yml

* Update test.yml

* Update test.yml

* Update test.yml

* Update test.yml

* Update test.yml

* Update test.yml

* Update test.yml

* Update test.yml

* Update test.yml

* Update test.yml

* Update stats-ff.sh

* Update test-ags.sh

* Update test.yml

* Update test scripts

* Update workflow file

* Fix PEP warnings

* Add Python style guide checker to workflow

* Address the comments on PR

* Add CONTRIBUTING.md

* Address some comments for the PR

* Add 'changes-ids' to the workflow

* Add chmod back to the workflow file

* Update test.yml

* Remove unnecessary command from test.yml

* Fix missing alert windows at the end

* Update diff-ags.sh and test-ags.sh

* Mention the docker branch in CONTRIBUTING.md

* Update tests.py

* Remove blank lines

* Remove printing in ag_generation.py
jzelenjak authored Oct 2, 2023
1 parent cf47897 commit 0325a81
Showing 36 changed files with 1,180 additions and 64 deletions.
186 changes: 186 additions & 0 deletions .github/workflows/test.yml
@@ -0,0 +1,186 @@
name: 'Run tests'

on:
  pull_request:
    branches-ignore:
      - 'docker'

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      ################################################################
      ################# INSTALL DEPENDENCIES ######################
      ################################################################

      - name: Install dependencies
        run: |
          sudo apt update
          sudo apt install graphviz bc
          pip install requests numpy matplotlib pycodestyle
      - name: Fetch flexfringe binary
        env:
          GH_TOKEN: ${{ github.token }}
          FF_REPO: "https://github.com/tudelft-cda-lab/FlexFringe"
        shell: bash
        run: |
          gh release download latest -R $FF_REPO -p "flexfringe-x64-linux"  # Might have to be updated if FlexFringe decides to change the binary name
      - name: Clone main branch of tudelft-cda-lab/SAGE
        uses: actions/checkout@v4
        with:
          repository: 'tudelft-cda-lab/SAGE'
          ref: 'main'
          path: 'SAGE-main'

      - name: Clone PR branch of tudelft-cda-lab/SAGE
        uses: actions/checkout@v4
        with:
          path: 'SAGE-updated'

      ################################################################
      ################## RUN STYLE CHECK #######################
      ################################################################

      - name: Run Python style guide checker (PEP 8)
        shell: bash
        run: |
          cd SAGE-updated/
          pycodestyle --ignore=E501,W503,W504 *.py  # Ignore "E501 line too long", and warnings "Line break occurred before/after a binary operator"
          pycodestyle --ignore=E501,W503,W504 signatures/*.py
      ################################################################
      ############## PREPARE THE ENVIRONMENT ##################
      ################################################################

      - name: Extract alerts
        shell: bash
        run: |
          cd SAGE-updated/
          find alerts/ -type f -name '*.gz' | xargs gunzip
          cd ..
          rm -rf SAGE-main/alerts/
          cp -R SAGE-updated/alerts SAGE-main/alerts/
      - name: Copy FlexFringe to SAGE (main branch)
        shell: bash
        if: '!contains(github.event.pull_request.labels.*.name, ''changes-ags'')'  # With 'changes-ags' label no regression tests will be run
        run: |
          cd SAGE-main/
          mkdir FlexFringe/
          mkdir FlexFringe/ini/
          cp ../flexfringe-x64-linux FlexFringe/flexfringe  # Might have to be updated if FlexFringe decides to change the binary name
          chmod u+x FlexFringe/flexfringe
          mv spdfa-config.ini FlexFringe/ini/
      - name: Copy FlexFringe to SAGE (updated branch)
        shell: bash
        run: |
          cd SAGE-updated/
          mkdir FlexFringe/
          mkdir FlexFringe/ini/
          cp ../flexfringe-x64-linux FlexFringe/flexfringe  # Might have to be updated if FlexFringe decides to change the binary name
          chmod u+x FlexFringe/flexfringe
          mv spdfa-config.ini FlexFringe/ini/
      - name: Copy the test file in the top directory
        shell: bash
        run: |
          cd SAGE-updated/
          mv test-scripts/* ..
      ################################################################
      ############# RUN BOTH VERSIONS OF SAGE #################
      ################################################################

      - name: Run SAGE on the main branch
        if: '!contains(github.event.pull_request.labels.*.name, ''changes-ags'')'  # With 'changes-ags' label no regression tests will be run
        shell: bash
        run: |
          cd SAGE-main/
          echo "Running CPTC-2017..."
          python sage.py alerts/cptc-2017/ orig-2017 --dataset cptc --keep-files
          echo "Running CPTC-2018..."
          python sage.py alerts/cptc-2018/ orig-2018 --dataset cptc --keep-files
          echo "Running CCDC-2018..."
          python sage.py alerts/ccdc/ orig-ccdc --dataset other --keep-files
          cp -R orig-2017.txt orig-2017.txt.ff.final.json orig-2017.txt.ff.finalsinks.json orig-2017AGs/ ../  # Might have to be updated if FlexFringe changes the names
          cp -R orig-2018.txt orig-2018.txt.ff.final.json orig-2018.txt.ff.finalsinks.json orig-2018AGs/ ../
          cp -R orig-ccdc.txt orig-ccdc.txt.ff.final.json orig-ccdc.txt.ff.finalsinks.json orig-ccdcAGs/ ../
      - name: Run SAGE on the updated branch
        shell: bash
        run: |
          cd SAGE-updated/
          echo "Running CPTC-2017..."
          python sage.py alerts/cptc-2017 updated-2017 --dataset cptc --keep-files
          echo "Running CPTC-2018..."
          python sage.py alerts/cptc-2018/ updated-2018 --dataset cptc --keep-files
          echo "Running CCDC-2018..."
          python sage.py alerts/ccdc/ updated-ccdc --dataset other --keep-files
          cp -R updated-2017.txt updated-2017.txt.ff.final.json updated-2017.txt.ff.finalsinks.json updated-2017AGs/ ../  # Might have to be updated if FlexFringe changes the names
          cp -R updated-2018.txt updated-2018.txt.ff.final.json updated-2018.txt.ff.finalsinks.json updated-2018AGs/ ../
          cp -R updated-ccdc.txt updated-ccdc.txt.ff.final.json updated-ccdc.txt.ff.finalsinks.json updated-ccdcAGs/ ../
      ################################################################
      ################# RUN REGRESSION TESTS ###################
      ################################################################

      - name: Run regression tests on CPTC-2017
        env:
          CHANGES_IDS: ${{ contains(github.event.pull_request.labels.*.name, 'changes-ids') }}  # With 'changes-ids' label state IDs will be removed from the AGs
        if: '!contains(github.event.pull_request.labels.*.name, ''changes-ags'')'  # With 'changes-ags' label no regression tests will be run
        shell: bash
        run: |
          # if/else instead of "A && B || C", so the second branch never runs after a failure in the first
          if [[ "$CHANGES_IDS" == "true" ]]; then
            echo "Running regression tests without state IDs"
            ./test-ags.sh -i orig-2017 updated-2017
          else
            echo "Running regression tests with state IDs"
            ./test-ags.sh orig-2017 updated-2017
          fi
      - name: Run regression tests on CPTC-2018
        env:
          CHANGES_IDS: ${{ contains(github.event.pull_request.labels.*.name, 'changes-ids') }}  # With 'changes-ids' label state IDs will be removed from the AGs
        if: '!contains(github.event.pull_request.labels.*.name, ''changes-ags'')'  # With 'changes-ags' label no regression tests will be run
        shell: bash
        run: |
          if [[ "$CHANGES_IDS" == "true" ]]; then
            echo "Running regression tests without state IDs"
            ./test-ags.sh -i orig-2018 updated-2018
          else
            echo "Running regression tests with state IDs"
            ./test-ags.sh orig-2018 updated-2018
          fi
      - name: Run regression tests on CCDC-2018
        env:
          CHANGES_IDS: ${{ contains(github.event.pull_request.labels.*.name, 'changes-ids') }}  # With 'changes-ids' label state IDs will be removed from the AGs
        if: '!contains(github.event.pull_request.labels.*.name, ''changes-ags'')'  # With 'changes-ags' label no regression tests will be run
        shell: bash
        run: |
          if [[ "$CHANGES_IDS" == "true" ]]; then
            echo "Running regression tests without state IDs"
            ./test-ags.sh -i orig-ccdc updated-ccdc
          else
            echo "Running regression tests with state IDs"
            ./test-ags.sh orig-ccdc updated-ccdc
          fi
      ################################################################
      ################### RUN SINKS TESTS ######################
      ################################################################

      - name: Run sinks tests on CPTC-2017
        shell: bash
        run: |
          ./test-sinks.sh updated-2017
      - name: Run sinks tests on CPTC-2018
        shell: bash
        run: |
          ./test-sinks.sh updated-2018
      - name: Run sinks tests on CCDC-2018
        shell: bash
        run: |
          ./test-sinks.sh updated-ccdc
      ################################################################
      ################## RUN PYTHON TESTS ######################
      ################################################################

      - name: Run Python tests for episodes
        run: |
          cd SAGE-updated/
          python tests.py
59 changes: 59 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,59 @@
# Contributing

This file contains information that is useful if you want to contribute to SAGE. Thank you!

## Pull Request

### No changes in attack graphs

If you introduce code changes that do not change the attack graphs (e.g. refactoring or more test cases), make sure that the regression tests, the sink tests and the Python tests pass. You can find the regression tests and the sink tests (written in Bash and taken from [this repository](https://github.com/jzelenjak/research-project)) in the `test-scripts/` directory and the Python tests in the `tests.py` file in the root directory of the repository. These tests can also be run locally before pushing the changes.

Furthermore, you can see PEP 8 errors and warnings in PyCharm or you can also run `pycodestyle --ignore=E501,W503,W504 *.py` and `pycodestyle --ignore=E501,W503,W504 signatures/*.py` (to install the [Python style guide checker](https://pycodestyle.pycqa.org/en/latest/), run `pip install pycodestyle`). The error "E501 line too long" and the warnings "W503 line break occurred before a binary operator" and "W504 line break occurred after a binary operator" are ignored, since addressing them does not improve readability of the code.

Finally, write a Pull Request description and, if applicable, mention the issue that is addressed/closed by this Pull Request (e.g. "Closes issue #38"). Choose the corresponding label(s), but do *not* select the label `changes-ags` (see below). Mark [Azqa Nadeem](https://github.com/azqanadeem) as a reviewer.

### Changes to the attack graphs

Follow the same procedure as above, but **do add the label `changes-ags`**. This will skip the regression tests and only run the sink tests and the Python tests. Since there is no ground truth for the attack graphs, make sure that the changes to the attack graphs make sense. Please describe them carefully in the Pull Request description.

### Changes to the IDs

State IDs in attack graphs depend on state IDs in the S-PDFA, which in turn depend on the episode traces that are passed to FlexFringe. For example, the traces might all be the same, but have a different order, in case of changes to the episode sequence generation. As a result, the attack graphs will be the same, even though the state IDs might be different, resulting in failing regression tests. To avoid this problem, you can add the label `changes-ids` to your Pull Request. This will remove the state IDs when comparing the attack graphs, so it is a way for you to verify that the attack graphs are indeed the same, despite having different state IDs. If you still run into anomalies, you can also set the `changes-ags` label to skip regression tests entirely.
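
The idea behind comparing graphs without state IDs can be illustrated with a minimal Python sketch. This is not the logic of `test-ags.sh`; the label format (a trailing `| ID: <number>` part) and the helper names `strip_state_ids` and `same_graph_ignoring_ids` are assumptions made purely for illustration.

```python
import re

# Hypothetical label format: "<description> | ID: <number>".
# The real attack graph labels may look different.
_STATE_ID = re.compile(r"\s*\|\s*ID:\s*\d+")


def strip_state_ids(edges):
    """Return the set of edges with the state ID removed from both endpoints."""
    return {(_STATE_ID.sub("", src), _STATE_ID.sub("", dst)) for src, dst in edges}


def same_graph_ignoring_ids(edges_a, edges_b):
    """Compare two edge sets after dropping the state IDs."""
    return strip_state_ids(edges_a) == strip_state_ids(edges_b)


# Same structure, different state IDs (illustrative labels only)
orig = {("Network scan | ID: 3", "Data exfiltration | ID: 7")}
updated = {("Network scan | ID: 5", "Data exfiltration | ID: 9")}

print(orig == updated)                         # strict comparison: False
print(same_graph_ignoring_ids(orig, updated))  # comparison without IDs: True
```

Note that a comparison of this kind is weaker than the strict one: two distinct states with the same description become indistinguishable once their IDs are removed.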

### Adding more test cases

If you want to add new test cases, feel free to do so. For Python tests, you can add them to the `tests.py` file. In addition, you can add the new tests to the GitHub Actions by modifying the `.github/workflows/test.yml` file. The tests, however, first need to be approved; see the procedure above.

### Updating the docker branch

Make sure that the changes that you have introduced do not affect the `docker` branch. In case they do, you also have to create a Pull Request to the `docker` branch that will take those changes into account. This might happen, for example, if you change the structure of the files.

## GitHub Actions

The workflow of the GitHub Actions is structured as follows:

1. Dependencies are installed: the required (Python) packages are obtained, the FlexFringe executable for Linux is downloaded and the two versions of SAGE are cloned (the one on the `main` branch, which is assumed to be the ground truth, and the one on the branch with the Pull Request)
2. The Python style check (PEP 8) is executed on the Python files on the Pull Request branch
3. The environment is prepared: alerts are extracted, the `flexfringe` executable and the `spdfa-config.ini` file are placed in the directories where they are expected to be (for both versions of SAGE) and the test scripts are moved to the root directory
4. The SAGE version on the `main` branch is executed on the three datasets (CPTC-2017, CPTC-2018 and CCDC-2018) and the necessary output files are copied to the root directory; this step is skipped when the `changes-ags` label is present on the Pull Request
5. The SAGE version on the Pull Request branch is executed on the same three datasets and the necessary output files are copied to the root directory
6. Regression tests are executed on the resulting attack graphs to make sure that the graphs are the same; this step is skipped when the `changes-ags` label is present on the Pull Request
7. Tests for sinks are executed on the SAGE version on the Pull Request branch; these tests check that the (non-)sinks in the attack graphs are consistent with the (non-)sinks in FlexFringe's S-PDFA model
8. Python tests are executed on the SAGE version on the Pull Request branch; these tests check the functionality of the code (currently only the episode generation, but more tests might be added in the future)
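
The effect of the two labels on the steps above can be summarised in a small Python sketch. It only mirrors the `if:` conditions and the `CHANGES_IDS` variable of the workflow; the function `planned_steps` and the step names are hypothetical and exist only for this illustration.

```python
def planned_steps(labels):
    """Return the workflow stages that would run for the given Pull Request labels."""
    labels = set(labels)
    # 'changes-ags' disables everything that needs the main branch as ground truth
    run_regression = "changes-ags" not in labels

    steps = ["style check"]
    if run_regression:
        steps.append("run SAGE (main branch)")
    steps.append("run SAGE (PR branch)")
    if run_regression:
        # 'changes-ids' only changes how the graphs are compared
        mode = "without state IDs" if "changes-ids" in labels else "with state IDs"
        steps.append(f"regression tests ({mode})")
    steps += ["sink tests", "Python tests"]
    return steps


for pr_labels in ([], ["changes-ids"], ["changes-ags"]):
    print(pr_labels, "->", planned_steps(pr_labels))
```

When both labels are set, `changes-ags` takes precedence: the regression tests are skipped, so `changes-ids` has no effect.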

## Documentation

Feel free to improve the documentation if you can come up with better wording. Also, don't forget to update the documentation if you change the code in a way that requires it (e.g. changing parameters to the methods or changing the files).

## Relevant files

- `sage.py` - the entry point to SAGE, contains alert parsing and filtering as well as some global parameters
- `episode_sequence_generation.py` - the first part of the SAGE pipeline that creates episodes and episode (sub)sequences from the alerts, i.e. from making hyperalert sequences to episode subsequence generation
- `model_learning.py` - the second part of the SAGE pipeline that learns the (S-PDFA) model, i.e. running FlexFringe with the generated episode traces and parsing (traversing) the resulting model to create state sequences
- `ag_generation.py` - the third part of the SAGE pipeline that creates the attack graphs, i.e. converting state sequences into attack graphs
- `plotting.py` - contains the functions that are related to plotting (not needed for running SAGE, but might give more insights into alerts or episodes)
- `signatures/` - contains the mappings for Micro/Macro Attack Stages and alert signatures (files `alert_signatures.py`, `attack_stages.py`, `mappings.py`)
- `.github/workflows/test.yml` - the file for the workflow
- `test-scripts/` - the Bash scripts used for testing
- `tests.py` - contains the Python tests

15 changes: 10 additions & 5 deletions README.md
@@ -48,14 +48,19 @@ Tip: in case you often use the same non-default values, you can create an alias

 ## First time use

-- Clone [FlexFringe repository](https://github.com/tudelft-cda-lab/FlexFringe)
-- Move `spdfa-config.ini` file to `FlexFringe/ini/` directory. Alternatively, you can set the `path_to_ini` variable in `sage.py` to `"./spdfa-config.ini"`
+- Clone [FlexFringe repository](https://github.com/tudelft-cda-lab/FlexFringe).
+- Move `spdfa-config.ini` file to `FlexFringe/ini/` directory. Alternatively, you can set the `path_to_ini` variable in `sage.py` to `"./spdfa-config.ini"`.
 - In case you move the `FlexFringe/` directory to another location, update the function `flexfringe` in `model_learning.py` accordingly.
-- A sample alert file is provided with the name `sample-input.json` (T5 alerts from CPTC-2018) to test SAGE. Use the following command:
+- You can find the compressed alerts for the [Collegiate Penetration Testing Competition (CPTC)](https://cp.tc/research) and [Collegiate Cyber Defense Competition (CCDC)](https://github.com/FrankHassanabad/suricata-sample-data) datasets (taken from the linked sources) in the `alerts/` directory. To uncompress the alerts, run:

-`python sage.py alerts/ firstExp`
+`find alerts/ -type f -name '*.gz' | xargs gunzip`

-where `alerts/` contains `sample-input.json`. For other options, see Usage section above.
+from the root directory of the repository. You can add other datasets, however make sure that they follow the same format.
+- You can run SAGE with the default parameters using the following command:
+
+`python sage.py alerts/ firstExp`,
+
+where `alerts/` contains the uncompressed alerts. For other options, see Usage section above.

 **If you use SAGE in a scientific work, consider citing the following papers:**

1 change: 0 additions & 1 deletion ag_generation.py
@@ -11,7 +11,6 @@ def _translate(label, root=False):
     @param root: whether this node is a root node (will be prepended with 'Victim: <victim_ip>\n')
     @return: a new more human-readable version of the label
     """
-    print(label)
     new_label = ""
     parts = label.split("|")
     if root:
Binary file added alerts/ccdc/wrcddc-2018.json.gz
Binary file not shown.
Binary file added alerts/cptc-2017/NationalTeam1.json__.gz
Binary file not shown.
Binary file added alerts/cptc-2017/NationalTeam10.json.gz
Binary file not shown.
Binary file added alerts/cptc-2017/NationalTeam2.json.gz
Binary file not shown.
Binary file added alerts/cptc-2017/NationalTeam3.json.gz
Binary file not shown.
Binary file added alerts/cptc-2017/NationalTeam4.json.gz
Binary file not shown.
Binary file added alerts/cptc-2017/NationalTeam5.json.gz
Binary file not shown.
Binary file added alerts/cptc-2017/NationalTeam6.json.gz
Binary file not shown.
Binary file added alerts/cptc-2017/NationalTeam7.json.gz
Binary file not shown.
Binary file added alerts/cptc-2017/NationalTeam8.json.gz
Binary file not shown.
Binary file added alerts/cptc-2017/NationalTeam9.json.gz
Binary file not shown.
Binary file added alerts/cptc-2018/suricata_alert_t1.json.gz
Binary file not shown.
Binary file added alerts/cptc-2018/suricata_alert_t2.json.gz
Binary file not shown.
Binary file added alerts/cptc-2018/suricata_alert_t5.json.gz
Binary file not shown.
Binary file added alerts/cptc-2018/suricata_alert_t7.json.gz
Binary file not shown.
Binary file added alerts/cptc-2018/suricata_alert_t8.json.gz
Binary file not shown.
Binary file added alerts/cptc-2018/suricata_alert_t9.json.gz
Binary file not shown.