-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: support matrices and stdout for all output types #48
Conversation
…--matrix (or any combination of the three)
I think these changes are considered breaking because of the new It will break test scripts like this: https://github.com/rapidsai/cudf/blob/72c067726ccfb6e87033d34ab07b4dc79b5e4a3e/ci/test_python_common.sh#L10-L14 @trxcllnt, can you add a Our CI images pin to the current major version of But we'll need a way to incrementally roll this out to each repository before we update the version in our CI images so that we don't break CI for everyone. I thought about this in the past, but never took any action on it due to time constraints. I think one way we can fix it is to add a new optional input,
Then each repo can manually opt-in to the new major version. Once all the repos are using the new version, we can update the CI image version accordingly and then go back and clean up all of the optional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! I have a couple of concerns about the current implementation but I think the core work around extending the CLI is a step in the right direction.
|
||
# If --clean was passed without arguments, default to cleaning from the root of the | ||
# tree where the config file is. | ||
if args.clean == "": | ||
args.clean = os.path.dirname(os.path.abspath(args.config)) | ||
|
||
args.file_key = list(sorted(list(set(sorted(args.file_key))))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure this is equivalent?
args.file_key = list(sorted(list(set(sorted(args.file_key))))) | |
args.file_key = sorted(set(args.file_key)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it gets converted to a list:
>>> abc = set((1,2,3,4,5))
>>> abc
{1, 2, 3, 4, 5}
>>> sorted(abc)
[1, 2, 3, 4, 5]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, and in general there are a lot of extra ops here.
@@ -323,7 +315,34 @@ def should_use_specific_entry(matrix_combo, specific_entry_matrix): | |||
) | |||
|
|||
|
|||
def make_dependency_files(parsed_config, config_file_path, to_stdout): | |||
def name_with_cuda_suffix(name, cuda_version=None, cuda_suffix="-cu"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we want this logic in dfg. It feels like too much scope creep. I would rather dependencies.yaml files use matrix entries to handle this in the short term. In the longer term, if we decide we need this functionality I would suggest that we add some sort of generic support for variables and string interpolation into dfg as a more general solution. I'd like to be able to use the same solution for our packages and cupy, for instance, and the two use different naming conventions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can (and do) put the -cuXX
names in the dependencies.yaml
matrix entries. This is explicitly about the name
key of the pyproject.toml
.
Overall pyproject.toml
should probably be fully auto-generated instead of read and mutated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vyasr where do you think the logic to define the project.name
key in the pyproject.toml
should live?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can (and do) put the -cuXX names in the dependencies.yaml matrix entries. This is explicitly about the name key of the pyproject.toml.
Yup, I realized that later in my review around #48 (comment) 😅
Overall pyproject.toml should probably be fully auto-generated instead of read and mutated.
When you say fully auto-generated, what are you envisioning? I could see using something like a Jinja templated pyproject.toml.in and filling in a suitable set of fields, perhaps. There are large swathes of the file that are necessary for things like 1) running linters, 2) specifying "extra" dependency lists, 3) configuring build backends, and more that have to be encoded somewhere else. None of those should be dependency-file-generator's responsibility, and at least the first one requires the file to already exist somewhere in the repo for normal usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the project.name
, I honestly do not know. My approach to taking the magic out of the wheel building process so far has been very incremental, fixing one problem at a time. I've been viewing the name as sort of a final frontier, one for which I don't have a good answer for yet unfortunately. I don't think a tool dedicated to dependency management is the right place to put that, though.
I'd love to work with you on resolving that problem. I completely agree that the current approach with apply_wheel_modification.sh
is not a very good one.
for file_key, file_config in parsed_config["files"].items(): | ||
file_config["matrix"] = matrix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for file_key, file_config in parsed_config["files"].items(): | |
file_config["matrix"] = matrix | |
for file_config in parsed_config["files"].values(): | |
file_config["matrix"] = matrix |
@@ -394,7 +410,7 @@ def make_dependency_files(parsed_config, config_file_path, to_stdout): | |||
# exists. In that case we save the fallback_entry result | |||
# and only use it at the end if nothing more | |||
# specific is found. | |||
if not specific_matrices_entry["matrix"]: | |||
if not specific_matrices_entry.get("matrix", None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When is the matrix key optional? For a specific entry it should be required, even with the new CLI functionality, right?
# Append `-cuXX` to `[package.name]` | ||
results[output_file_path]["project"][ | ||
"name" | ||
] = name_with_cuda_suffix( | ||
results[output_file_path]["project"]["name"], | ||
matrix_combo.get("cuda", None), | ||
cuda_suffix, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I first saw the function I actually thought the goal was to handle dependency suffixes rather than package name suffixes. I'm even less comfortable putting this logic into this tool. Package renaming to support our wheels-specific workflows is definitely scope creep. I'd love to find a better solution to what we're currently doing in our wheel builds, but I don't think this is it.
""" | ||
) | ||
|
||
if isinstance(data, dict): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this implicitly relying on pyproject outputs generating a dict here while other outputs just contain a long string of text? We should make that condition explicit if so, otherwise it's confusing why tomlkit is getting used in a generic dict path.
f.write(contents) | ||
def write_output(data, output_dir, f): | ||
|
||
relpath_to_config_file = os.path.relpath(config_file_path, output_dir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we hoist this logic into the calling loop then the write_output
function could be moved outside instead of defined as a nested function right? I think that would be cleaner if we do decide to keep it.
@trxcllnt apologies for the delay in reviewing this PR. I propose that we we split this up so that the uncontroversial pieces can get merged quickly, if you're open to that.
|
@trxcllnt would you like some help finishing this up? Let me know if you want to chat about it or need some extra person-hour help. |
I'll defer to @trxcllnt here. He had an idea of how best to rewrite the generator to better support this behavior. |
Replaced by #74 |
stdout
for all output types via the new--stdout
flag--file_key
,--output
, or--matrix
(or any combination of the three)--file_key
and--output
argumentsFixes #46
Enables the
dependencies.yaml
changes in this branch.Example generating requirements.txt:
Example generating pyproject.toml