Simplify input requirement parsing #7019
Comments
How does this relate to PR #6203?
Also, in your write-up of the proposed rules, can you distinguish between choices that are forced by / follow logically from PEPs, and rules that are more heuristics of your choosing? It seems like PR #6203 uses different heuristics (though I’m not certain). I think it would be helpful for people to know if / where there might be any ambiguity in interpreting and applying any of the PEPs, and if we are making any choices here.
Unrelated. The changes there are improvements on the existing code, but keep most of the logic locked up in |
I have updated the original issue to make it more explicit.
Also, PR #6203 introduces its own heuristics for resolving the ambiguities and deciding whether something should be interpreted as a path (something this issue uses different techniques for). It adds a |
Okay, so it is related then (it would subsume it). I think it would be worth comparing the heuristics used there with the approach proposed here then, as there was lots of discussion around that. (I think the reason that PR was never merged is because of the trickiness here and lack of clarity on the correct way to proceed.) This is another reason why I think it's worth separating the behavior changes from refactoring in the discussion. The behavior changes alone are subtle and a bit tricky.
In that PR only a single branch within |
A couple of other things that would help in the description of the proposed rules (the "leads to the following rules" part of the original issue comment): first, distinguishing between the parts encoding pip's current behavior and the new logic being introduced. In other words, how much of this is new versus describing what pip already does. Second, it would help to know whether what's being proposed is backwards compatible or what, if anything, might break for people.
What's the problem this feature will solve?
Currently pip accepts several types of input as "requirements":
The parsing for these is ad-hoc and pretty complicated, with lots of code paths (see here). This makes it hard to understand:
`InstallRequirement` given a user input
It is also impossible to re-use the current code to initialize any other kind of type than an `InstallRequirement` (so this is a prereq for some of the build refactoring).
Describe the solution you'd like
At a high level we need to map any arbitrary input to one of the 4 categories mentioned above. This is difficult to do unambiguously because we accept file paths, so I think we should make some assumptions and then users that want to use weird file paths can feel free to use an explicit `file://` URL.
The primary standards-based constraints are:
- `@` followed by `<scheme>://` followed optionally by `;` and markers which can have any content
- `@` characters followed optionally by extras and specifiers and then by `;` and markers which can have any content
Simplifying assumptions:
- `.`, `/`, or `\` (on Windows), followed optionally by something that looks like extras and something that looks like markers
- `://`
- `://`
That leads to the following rules for deciding how to process input:
1. `Requirement` and derive all fields of `RequirementInfo` from that
2. `#egg=` fragment, which are used to instantiate a `Requirement` if present. Any missing fields get derived from the `Requirement` if set.
3. `os.pathsep` or `os.altsep` or starts with `.` then we treat it like a path, convert it to an absolute file URL and process the same as 2.
4. `Requirement` and derive all fields of `RequirementInfo` from that
Other details:
The module to be added is `pip._internal.req.parsing` with a function `parse_requirement_text` that takes a string as would be input by a user or in dependency metadata and returns a `RequirementInfo`.
`RequirementInfo` would contain:
- `markers: Set[Marker]`
- `link: Optional[Link]` - if `None` then it's a name-based requirement
- `requirement: Optional[Requirement]` - if `None` then it's an "unnamed" requirement
- `extras: Set[str]`
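To make the proposal easier to discuss, here is a rough, standard-library-only sketch of how the four fields and the rule order might fit together. It is not pip's code: plain strings stand in for `Requirement`, `Link` and `Marker`, the regular expressions are simplified stand-ins for real PEP 508 parsing, and the rule order is one reading of the list above. The rules mention `os.pathsep`; the sketch assumes the path component separators `os.sep` and `os.altsep` are what is meant, since `os.pathsep` is the separator used in search-path lists such as `PATH`.

```python
import os
import re
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional, Set
from urllib.parse import parse_qs, urlsplit


class RequirementParsingError(Exception):
    """Carries the processing mode that was being attempted."""

    def __init__(self, text: str, mode: str) -> None:
        super().__init__(f"could not parse {text!r} as a {mode} requirement")
        self.mode = mode


@dataclass
class RequirementInfo:
    # Assumption: plain strings stand in for pip's Requirement, Link, Marker.
    requirement: Optional[str] = None  # None: "unnamed" requirement
    link: Optional[str] = None  # None: name-based requirement
    markers: Set[str] = field(default_factory=set)
    extras: Set[str] = field(default_factory=set)


# Simplified patterns, not a full PEP 508 grammar.
_NAME = r"[A-Za-z0-9](?:[A-Za-z0-9._-]*[A-Za-z0-9])?"
_EXTRAS = r"(?:\[([^\]]*)\])?"
_DIRECT_REF = re.compile(rf"^({_NAME})\s*{_EXTRAS}\s*@\s*(\S+://\S+)$")
_NAME_BASED = re.compile(rf"^({_NAME})\s*{_EXTRAS}\s*[^;@\[\]]*$")


def _extras(raw: Optional[str]) -> Set[str]:
    return {e.strip() for e in (raw or "").split(",") if e.strip()}


def parse_requirement_text(text: str) -> RequirementInfo:
    text = text.strip()
    # URLs may contain ';', so link forms only split markers on ' ;'.
    sep = " ;" if "://" in text else ";"
    body, _, marker = text.partition(sep)
    body = body.strip()
    markers = {marker.strip()} if marker.strip() else set()

    # Rule 1: "name @ scheme://..." direct reference.
    match = _DIRECT_REF.match(body)
    if match:
        return RequirementInfo(match.group(1), match.group(3), markers,
                               _extras(match.group(2)))

    # Rule 3: path-like input becomes an absolute file URL, then is handled
    # by rule 2. Path.absolute() consults the working directory only; it
    # does not check whether the path exists.
    has_sep = os.sep in body or bool(os.altsep and os.altsep in body)
    if "://" not in body and (body.startswith(".") or has_sep):
        body = Path(body).absolute().as_uri()

    # Rule 2: bare URL; the name comes from an optional #egg= fragment.
    if "://" in body:
        egg = parse_qs(urlsplit(body).fragment).get("egg", [None])[0]
        return RequirementInfo(egg, body, markers)

    # Rule 4: name-based requirement.
    match = _NAME_BASED.match(body)
    if not match:
        raise RequirementParsingError(text, "name-based")
    return RequirementInfo(body, None, markers, _extras(match.group(2)))
```

For example, `parse_requirement_text("pip @ https://example.com/pip.zip")` would yield a named requirement with a link, while a bare `https://example.com/pkg.zip` would yield an "unnamed" one. Extras and markers on path inputs are left out of the sketch to keep it short.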
`parse_requirement_text` would do the steps as described above.
`parse_requirement_text` would not do any filesystem operations or logging and it should map any expected exceptions to `RequirementParsingError` with an indication of how we were trying to process the text (direct reference, url, path, or name-based).
Once implemented, we should refactor `req.constructors.install_req_from_*` to delegate parsing to `parse_requirement_text` and just do operations on the returned `RequirementInfo`.
Alternative Solutions
Additional context