Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for ActorPacks #66

Merged
merged 8 commits into from
Dec 7, 2022
Merged

Support for ActorPacks #66

merged 8 commits into from
Dec 7, 2022

Conversation

cybersec-code
Copy link
Contributor

An initial set of features to support ActorPack validation and insertion into the database. A new taxonomy was introduced (country) as well as new tables to the schema.

@soad003
Copy link
Member

soad003 commented Dec 6, 2022

Thanks for the PR.

Are there any test actorpacks to test with in the repo? And some tags that use those actors? Would be great for testing (tests/testfiles).

I have not done a full review yet but here are a couple of things I noticed.

Problems

  • tagpack-tool taxonomy insert country does not work (invalid choice: 'country' (choose from 'abuse', 'entity', 'confidence')
    ) but tagpack-tool taxonomy insert tries to insert a coutry taxomony.

Nits

  • Readme could use an update e.g (Ingest taxonomies, Ingest TagPacks)
  • Files to test the actorpacks would be great

CREATE TABLE actor_jurisdictions (
id SERIAL PRIMARY KEY,
actor_id VARCHAR REFERENCES actor(id) ON DELETE CASCADE,
country_id VARCHAR REFERENCES concept(id) ON DELETE CASCADE
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this constraint to the concepts table intended?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe i misunderstand the schema. Have seen that the tests also treat coutries as concepts.

@@ -227,7 +332,7 @@ def get_quality_measures(self, currency="") -> float:
keys = ["count", "avg", "stddev"]
return {keys[i]: v for row in self.cursor.fetchall() for i, v in enumerate(row)}

def calculate_quality_measures(self) -> float:
def calculate_quality_measures(self) -> dict:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this quality view something we want to keep up to date in general? If so maybe add call to this to tp-tool tagstore refresh_views

@cybersec-code
Copy link
Contributor Author

Thanks for the feedback,

  • I added an actors file to the tagpacks repo for testing. We will add more ActorPacks to this folder in the near future.
  • We are still defining how to integrate actors with tags, so there are no tests for it at the moment.
  • We added country as a taxonomy, that's why the constraint references concept(id) in the schema.
  • tagpack-tool taxonomy insert country is working now.
  • I think for now it's not necessary to keep up to date the quality view, because we want to take "snapshots" of those results at different points in time. But maybe in the future this changes.
  • I'm updating the README file

@soad003
Copy link
Member

soad003 commented Dec 6, 2022

Thanks for the clarification and updates.

I have tested what I can test so far (import of actors, computing quality stats etc.) and quickly skimmed over the source-code.

I think its got to go like it is right now. Any objections?

@soad003 soad003 self-requested a review December 7, 2022 07:50
Copy link
Member

@mdragaschnig mdragaschnig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested validation and insert for tagpacks and actorpacks, looking good so far.

Copy link
Member

@soad003 soad003 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have integrated recent changes from develop. I think it all looks fine. Will merge into develop now.

@cybersec-code: there have been some changes regarding folder structure of the repo and some others infrastructure change like enforced formatting and linting via pre-commit hooks. If you have any questions or troubles while developing, let me know.

Thanks everyone for the effort.

@soad003 soad003 merged commit 400a7e6 into develop Dec 7, 2022
@soad003 soad003 deleted the feature_list_low_quality_addrs branch December 7, 2022 12:29
@soad003 soad003 restored the feature_list_low_quality_addrs branch December 6, 2024 12:43
@soad003 soad003 deleted the feature_list_low_quality_addrs branch December 6, 2024 12:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants