Create plugin to determine file categories #1745

Open
johnmhoran opened this issue Oct 3, 2019 · 2 comments

@johnmhoran
Member

This plugin will apply a set of rules to certain fields/values collected during a scan (e.g., file_type, mime_type) and add a category (or similarly named) field and associated value (e.g., Java, JavaScript) to the JSON output file.
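
As a rough illustration of the idea (not the plugin's actual code), a post-scan step could walk the file entries in the scan results, test each one against an ordered list of rules, and write a `category` field back. The rule list, the helper names `categorize` and `add_categories`, and the exact field values below are all assumptions made for this sketch; it also assumes the scan results are a dict with a `files` list whose entries carry `name`, `file_type` and `mime_type`.

```python
# Hypothetical rules: (category, predicate) pairs, checked in order.
# Each predicate receives one file entry (a dict) from the scan results.
RULES = [
    ("Java", lambda f: (f.get("name") or "").endswith(".java")
        or (f.get("file_type") or "").lower().startswith("java source")),
    ("JavaScript", lambda f: "javascript" in (f.get("mime_type") or "").lower()
        or (f.get("name") or "").endswith(".js")),
]


def categorize(resource):
    """Return the category of the first matching rule, or None."""
    for category, predicate in RULES:
        if predicate(resource):
            return category
    return None


def add_categories(scan):
    """Add a 'category' field to every entry in scan['files'] (in place)."""
    for resource in scan.get("files", []):
        resource["category"] = categorize(resource)
    return scan
```

Files that match no rule get `category: None` here, so an uncategorized file is still distinguishable from one that was never processed.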

johnmhoran self-assigned this Oct 3, 2019
johnmhoran added a commit that referenced this issue Oct 7, 2019
* Install by navigating to /scancode-toolkit/plugins/scancode-categories/
   and running 'pip install .'
* Rules comprise a set of any() and all() functions contained as string
   values in a list of JSON objects.
* Current test ruleset is quite small, based solely on scan of
   bionic-master-libc-bionic.tar.gz-extract
* Current working ruleset:
   /scancode-categories/src/python_rules/python_rules_01.py
* Command example: scancode -i -n 2 <path to codebase> --categories
   <path to JSON object> --json <path to JSON output file>
* Currently uses JSON object inside .py file.  Test of .json file coming soon.
* Unable thus far to create working rules (1) using YAML or text files or
   (2) without including Python code (any/all() functions) inside the
   rules themselves.
* Code not yet cleaned up -- still a WIP.

Signed-off-by: John M. Horan <[email protected]>
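
One possible way around the last open point above (rules that need Python `any()`/`all()` code inside them) is to express `any` and `all` as plain JSON keys and keep the evaluation logic in the plugin. This is only a sketch under that assumption, not what the commit implements: the rule schema (`category`, `any`, `all`, `field`, `contains`, `endswith`) and the function names are invented for illustration.

```python
import json

# Hypothetical data-only ruleset: no Python code inside the rules.
RULESET_JSON = """
[
  {"category": "Java",
   "all": [{"field": "name", "endswith": ".java"},
           {"field": "file_type", "contains": "source"}]},
  {"category": "JavaScript",
   "any": [{"field": "mime_type", "contains": "javascript"},
           {"field": "name", "endswith": ".js"}]}
]
"""


def check_matches(check, resource):
    """Evaluate one field check against a file entry (case-insensitive)."""
    value = str(resource.get(check["field"]) or "").lower()
    if "contains" in check:
        return check["contains"] in value
    if "endswith" in check:
        return value.endswith(check["endswith"])
    return False


def rule_matches(rule, resource):
    """A rule matches if all of its 'all' checks and any of its 'any' checks pass."""
    all_checks = rule.get("all", [])
    any_checks = rule.get("any", [])
    if not all_checks and not any_checks:
        return False  # an empty rule never matches
    ok_all = all(check_matches(c, resource) for c in all_checks)
    ok_any = any(check_matches(c, resource) for c in any_checks) if any_checks else True
    return ok_all and ok_any


def first_category(rules, resource):
    """Return the category of the first matching rule, or None."""
    for rule in rules:
        if rule_matches(rule, resource):
            return rule["category"]
    return None


rules = json.loads(RULESET_JSON)
```

Because the ruleset is pure data, the same structure could in principle be loaded from a YAML file as well, and nothing in the rules needs to be passed to `eval()`.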
johnmhoran added a commit that referenced this issue Oct 7, 2019
* .json ruleset performs as intended.
* Cleaned up plugin_categories.py

Signed-off-by: John M. Horan <[email protected]>
johnmhoran added a commit that referenced this issue Oct 10, 2019
* New rules: /scancode-categories/src/json_rules/json_rules_simple_01.json
* Seems to work well on test codebase
   bionic-master-libc-bionic.tar.gz-extract (largely C++).
* Next steps include expanding rules using more-diverse test codebases.
* No formal test suite yet but coming soon.
* This branch also includes code for 'Hello ScanCode' plugin
   illustrated in ScanCode wiki entry 'How To: Add a post scan plugin'
   (see /scancode-hello/).

Signed-off-by: John M. Horan <[email protected]>
@pombredanne
Member

See also #426
I think we could eventually bundle all this as part of the --info scans. This is quite essential, IMHO.

