-
Notifications
You must be signed in to change notification settings - Fork 343
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Transform and load dependencies from setup.cfg #718
Conversation
a5ebad1
to
e590707
Compare
4e397a7
to
08a9d77
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some initial thoughts
Performs data transformations for the requirements.txt file in a GitHub repo, if available. | ||
:param req_file_contents: Dict: The contents of the requirements.txt file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gotta love the random cleanup 🙂 , very "leave it better than you found it"
cartography/intel/github/repos.py
Outdated
}) | ||
|
||
def _transform_setup_cfg_requirements( | ||
setup_cfg_contents: Dict, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thoughts on using typing-extensions to get TypedDict
and defining which contents are in this Dict
? or alternatively making a quick NamedTuple
which contains the contents?
totally optional, this is just me pushing my static typing agenda.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The contents are whatever is returned from the graphql API. Right now that's a single key value pair where the value is the entire setup.cfg file as a string, which I don't think is worth defining static typing for.
8fab58e
to
5fe59b7
Compare
5fe59b7
to
4bb75c3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code changes look good to me. let's also get a review from cartography owners as discussed over slack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re-approving after change and merge
_transform_requirements_txt(repo_object['requirements'], repo_object['url'], transformed_requirements_files) | ||
_transform_setup_cfg_requirements(repo_object['setupCfg'], repo_object['url'], transformed_requirements_files) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if a dep is listed in both places (e.g. no bounds in setup.cfg but pinned in requirements.txt)?
What do we want to have happen (e.g. what would be most useful for our query patterns/the https://github.com/lyft/cartography/blob/master/docs/usage/samplequeries.md sample queries)?
This is likely worth a test case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's listed in both places and has different specifiers for each usage, cartography will create two separate nodes, which I think makes sense rather than any sort of "merging" logic. This allows users to query what version(s) are being used or perhaps find out that they are specifying a dependency in multiple files when it's not needed. Added a test case for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That SGTM, thanks! Just wanted to check that we didn't have one "overwrite" the other
…/cartography into oh/add-setup-cfg-dependency-support
cartography/intel/github/repos.py
Outdated
:param repo_url: str: The URL of the GitHub repo. | ||
:param out_requirements_files: Output array to append transformed results to. | ||
:return: Nothing. | ||
""" | ||
if req_file_contents and req_file_contents.get('text'): | ||
text_contents = req_file_contents['text'] | ||
reqs_list = text_contents.split("\n") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would prefer this named the same as the param in the function you call below
Re testing, have you run this on a few setup.cfgs from some of our bigger projects? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good; a few minor questions but good to go.
I tested on repos with setup.cfg's that had requirements across install_requires, setup_requires, and extras_require. |
Previously, requirements.txt was the only file cartography loaded dependencies from.
This PR adds support for setup.cfg, which is where many Python libraries define dependencies;
specifically in
install_requires
,setup_requires
, andextras_require
Also tested locally by running cartography, ensuring the data was loaded correctly, and checking that I could query dependencies from
setup.cfg
.Note: the GH dependency graph unfortunately does not support setup.cfg ingestion at the moment. Although setup.cfg support has been implement in dependabot, the two projects are completely separate with no timeline for merge at the moment.