Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support git subfolder (#755) #1822

Closed

Conversation

finswimmer
Copy link
Member

@finswimmer finswimmer commented Jan 3, 2020

This PR implements the possibility to add git dependencies, where the package is located within a subdirectory of the repository.

The path to the subdirectory can be specified as follows:

poetry add "git+https://github.com/demo/project_in_subdirectory.git?subdirectory=pyproject-demo"

Where pyproject-demo is a subfolder relative to the root after checkout the repository. This can be combined with the revision tag:

poetry add "git+https://github.com/demo/project_in_subdirectory.git#dev?subdirectory=pyproject-demo"

To specify the subdir in the pyproject.toml a new argument subdirectory was introduced:

mypackage = {git = "https://github.com/demo/project_in_subdirectory.git", rev = "dev", subdirectory = "pyproject-demo"}

To implement this feature is was necessary to checkout the repository first to a temporary folder and copy the desired folder to it's destination. For this a new DownloadTmpDir class was implemented, which avoids downloading from the same git source multiple times, during run time.

I would like to get some feed back from people, who can test this on a real world use case. Especially if this all works as expected.

Implements: #755

Pull Request Check List

This is just a reminder about the most common mistakes. Please make sure that you tick all appropriate boxes. But please read our contribution guide at least once, it will save you unnecessary review cycles!

  • Added tests for changed code.
  • Updated documentation for changed code.

@finswimmer finswimmer changed the title [WIP] support git subfolder (#755) WIP - support git subfolder (#755) Jan 3, 2020
@finswimmer finswimmer added the kind/feature Feature requests/implementations label Jan 5, 2020
@finswimmer finswimmer force-pushed the issue-0755-git-subdirectory branch from 8a169b8 to abf7f04 Compare January 13, 2020 20:15
@finswimmer finswimmer force-pushed the issue-0755-git-subdirectory branch from bbfe858 to f255b42 Compare January 22, 2020 04:54
@finswimmer finswimmer force-pushed the issue-0755-git-subdirectory branch from abe6603 to b46e8ea Compare February 2, 2020 09:05
@finswimmer finswimmer changed the title WIP - support git subfolder (#755) support git subfolder (#755) Feb 2, 2020
@finswimmer finswimmer requested a review from a team February 2, 2020 09:24
@Red-Eyed
Copy link

Red-Eyed commented Feb 13, 2020

It's very interesting to see this changes in release.

@approximator
Copy link

Hi @finswimmer,
Thank you for this feature!
I tested it on my project with private deps using a command like:

poetry add "git+ssh://[email protected]:12345/prj_group/common_libs.git?subdirectory=packages/some_package"

And it works just fine.

…g properly

new (poetry-schema): add subdirectory property to git-dependency
new (pip_installer): git repositories are cloned to a tmp folder, to be able to copy only subdirectory to dest
new (locker): write to and read from `poetry.lock` the subdirectory value
new (provider): take the subdirectory value into account when looking for packing
new (package): new property `source_subdirectory`
new (console.commands.init): git subdirectory can be provided in url when using `poetry add`
…ed delimiter for arguments

new (test_git): tests for URLs containing revision tag and subdirectory
@finswimmer finswimmer force-pushed the issue-0755-git-subdirectory branch from b46e8ea to 51093e5 Compare February 22, 2020 08:35
@finswimmer finswimmer force-pushed the issue-0755-git-subdirectory branch from 1675400 to e979d3e Compare March 17, 2020 05:28
@@ -81,6 +81,9 @@ def _export_requirements_txt(
line = "-e git+{}@{}#egg={}".format(
package.source_url, package.source_reference, package.name
)
if package.source_subdirectory:
line += "?subdirectory={}".format(package.source_subdirectory)
Copy link
Member

@abn abn Mar 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
line += "?subdirectory={}".format(package.source_subdirectory)
line += "&subdirectory={}".format(package.source_subdirectory)

See https://pip.readthedocs.io/en/stable/reference/pip_install/#vcs-support, think this is also mentioned in PEP 610.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @abn,
thanks a lot for your comments. You're right here 👍

tmp_dir = Path(
mkdtemp(prefix="pypoetry-git-{}".format(url.split("/")[-1].rstrip(".git")))
)
tmp_dir = Path(DownloadTmpDir.mkcache(url, prefix="pypoetry-git"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This this is not longer required, since this has changed:

tmp_dir = Path(
mkdtemp(prefix="pypoetry-git-{}".format(url.split("/")[-1].rstrip(".git")))
)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, don't know if you link the correct lines?

The reason for the way I create the temp folder for cloning the repository is, that at the moment poetry clone the repository twice during install. The first time to resolve the package version and second time when installing. For the case, let's say I like to add 3 packages from subdirectory of the same repo, poetry would clone this repo 6 times. With the newly added class DownloadTmpDir it will clone the repository only once during runtime and reuse it for every package that is clone from the same repository.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you can ignore the snippet. I understand the intention here now.

Do you know if there was a reason why we need a temporary clone here at all? Can we for example, make this not runtime specific? Clone a repo to a deterministic source directory (maybe a name containing hash(source_url)) and switch revisions instead of a new clone? This should avoid cloing across runtimes - might be advantageous in non subdirectory use case as well. If so, I would say it might even be a separate PR independant from the subdirectory change as it has broader impact.

@@ -14,7 +14,8 @@
"port": r"\d+",
"path": r"[\w~.\-/\\]+",
"name": r"[\w~.\-]+",
"rev": r"[^@#]+",
"rev": r"[^@#?]+",
"subdir": r"[\w\-/\\]+",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit unsure about this one. As per the current poetry docuemtnation the argument for poetry add when providing a url needs to be a valid git url, which the python vcs dependency url us not. In the curent version we support <url>#<rev> (valid git) which is not adherant to pip specification (this seems to be standardised by PEP 610). It would seem that this is not fully compliant to either specification. The use of query string syntax might also vendor dependant (github vs gitlab).

Do you think this could be confusing? And if so, should we pick one and stick with it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When starting with this PR I thought about a new command line argument. I refuse it because it will lead to more changes in the code compared with passing it along with the URL. Second, I was inspired by the way pip is doing this. I guess most new users would find this intuitive.

Other opinions are welcome. I will have to work on this PR the next days, because of the changes introduced by poetry_core now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea for sure, I am all for consistency with PEP 610. The concern right now is that this change would mean we will neither conform to that nor git's url spec. We should probably make a decision how how to handle this.

More changes might not be a bad thing if it makes things clearer. Additionally, we can definitely tackle exposing this via a cli option in a later PR. Right now, I guess we just need to decide which url spec to follow, we can accept both too - but this would mean some assumptions in error handling during parsing.

There are definitely pros and cons for following either format. From a cli perspective having a standard git url and options that map to the toml spec will reduce the cognitive load on end users as it is explcit (more intuitive?) for someone who is familiar with poetry's config I feel (eg: poetry add <git url> --subdirectory=x --revision=x will map cleanly to dependency specification in toml). On the other hand, having it be the same will mean, atleast for new users as you said, be more intuitive.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking more about it; we should probably consider enabling both by doing the following.

  1. For users who use poetry add vcs+protocol://repo_url[@rev]/#egg=pkg&subdirectory=pkg_dir we simply convert this to the appropriate inline table in the toml pkg = { vcs = "protocol://repo_url", rev = "rev", subdirectory = "pkg_dir"}. Any cli options --subdirectory | --revision overrides values parsed from vcs url.
  2. For users who use poetry add vcs+protocol://repo_url[#rev] we maintain current behavior with the possibility of using new cli options.
  3. Error out with recommendation if the url is explicitly added to the toml file and is not a valid remote url.

(1) would be really be important for cases where the pip VCS dependency url is copied over.

@LongBeachHXC
Copy link

how we looking on this one? This would be extremely useful for my use case.

@remram44
Copy link
Contributor

remram44 commented Aug 4, 2021

Is this done or wontfix? Related PRs have been closed or are still open. Was this issue closed by mistake?

@ashnair1
Copy link
Contributor

ashnair1 commented Sep 16, 2021

This commit (python-poetry/poetry-core#192) was recently merged into poetry-core's master so this issue might be solved in 1.2

Copy link

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 29, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
kind/feature Feature requests/implementations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants