New command to apply Compliance objects to dataset #556

Open
igorbrigadir opened this issue Oct 18, 2021 · 2 comments

@igorbrigadir
Contributor

igorbrigadir commented Oct 18, 2021

There should be a command to actually apply the compliance objects to a dataset. Currently you can grab a list of IDs and get the compliance job results, but there's no command that will do the actual filtering.

I'm thinking of a command like:

twarc2 compliance apply dataset.json compliance.json result.json

That would take a dataset and the compliance results, and output a clean, compliant dataset (either full objects or IDs).
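
As a rough sketch of what the filtering step might look like (not an existing twarc command or API), assume both files are line-oriented JSON: the dataset has one tweet object per line with an `id` field, and the compliance results have one record per line with an `id` field for each tweet that needs action. The function names `load_noncompliant_ids` and `apply_compliance` are made up for illustration, and treating every listed ID as "remove the tweet" is a simplification.

```python
import json


def load_noncompliant_ids(compliance_lines):
    """Collect tweet IDs from compliance result lines (one JSON object per line).

    Assumption: every record listed in the results marks a tweet to drop.
    """
    ids = set()
    for line in compliance_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        ids.add(str(record["id"]))
    return ids


def apply_compliance(dataset_lines, noncompliant_ids):
    """Yield only the dataset lines whose tweet ID is not flagged.

    Assumption: each dataset line is a single tweet object with an "id" field.
    """
    for line in dataset_lines:
        line = line.strip()
        if not line:
            continue
        tweet = json.loads(line)
        if str(tweet["id"]) not in noncompliant_ids:
            yield line
```

Passing open file handles for `compliance.json` and `dataset.json` as the two line iterables and writing the yielded lines to `result.json` would cover the "full objects" case; writing only the `id` of each kept tweet would cover the "IDs" case.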

@SamHames
Contributor

Big thumbs up from me. I'd be happy to collaborate on that one, if for no other reason than to have a consistent way to do this kind of filtering (i.e., do we remove referenced tweets that were deleted but are still present in includes?)
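
To make the includes question concrete, here is a hedged sketch of one possible policy, assuming an unflattened response page shaped like `{"data": [...], "includes": {"tweets": [...]}}` with string `id` fields. The function `filter_page` and its `prune_includes` flag are hypothetical names for illustration only, and whether pruning should be the default is exactly the open question.

```python
def filter_page(page, noncompliant_ids, prune_includes=True):
    """Return a copy of a response page with flagged tweets removed.

    Assumptions: page["data"] is a list of tweet objects, and referenced
    tweets live under page["includes"]["tweets"]. With prune_includes=False
    the referenced tweets are left untouched.
    """
    kept = [t for t in page.get("data", []) if t["id"] not in noncompliant_ids]
    result = dict(page, data=kept)  # shallow copy, original page is not mutated

    if prune_includes and "tweets" in page.get("includes", {}):
        includes = dict(page["includes"])
        includes["tweets"] = [
            t for t in includes["tweets"] if t["id"] not in noncompliant_ids
        ]
        result["includes"] = includes
    return result
```

Keeping the choice behind a flag would let the command apply either policy consistently instead of each user filtering includes differently.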

@igorbrigadir
Contributor Author

Yeah, sure thing! I haven't started this at all yet.

@igorbrigadir igorbrigadir added the good first issue Good for newcomers label Mar 20, 2022
2 participants