-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
env: fix islation when pyvenv.cfg is present #99
Conversation
cd0d336
to
a17f862
Compare
WRT the approach taken here, it'd be good if @gaborbernat could weigh in on this, because I am far from confident in my understanding of how virtual environment magic works. I do think that it would be a good idea to add at least one test that actually covers the isolation behavior, though. The current test only ensures that the monkey-patch works (which I would expect to change if we can come up with a better solution), but won't detect regressions if something else breaks the behavior. Creating a basically empty project and installing it into a virtualenv, then trying to build a project from within that |
a17f862
to
1340db5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have a typo in the title (islation
) and IMHO I'd like a better description on why this is needed, what are the moving parts within python that allow it to happen and why we need to alter some of the knobs you do. The CI is failing also.
WRT the approach taken here, it'd be good if @gaborbernat could weigh in on this, because I am far from confident in my understanding of how virtual environment magic works.
I'd wait with judgment for @FFY00 to explain more in-depth why he took this approach, and what problems he's trying to fix and how.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrong button, sorry!
Signed-off-by: Filipe Laíns <[email protected]>
d3f98d4
to
17e167f
Compare
Signed-off-by: Filipe Laíns <[email protected]>
Signed-off-by: Filipe Laíns <[email protected]>
17e167f
to
28fb3b4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're isolation logic basically tries to create a virtual environment, and breaks in a lot of situations. That might be ok if this was a personal project, but for a pypa one you need to support all major interpreters. Maybe just use virtualenv instead?
self._place_path_relative(sysconfig.get_config_var('LIBPL')) | ||
|
||
''' | ||
We use PYTHONHOME to relocate the Python installation to our environment, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need the pyvenv.cfg only on python 3.4+. Copy on python 2 will not work. We should raise if that happens. Same on python 3 that's signed (macos+ windows store). We basically don't support those environments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you elaborate? We are not using pyvenv.cfg
.
Can you describe how it breaks? |
For starters windows store python can't be copied for security reasons. So when symlink is off for the is os this will break. Furthermore as far as I read cpython code it checks all parent folders, not just the immediate parent for pyvenv.cfg. it's more - complicated but I'm on my phone now. |
The number of systems without symlinking support is already pretty small, so this won't affect many users. I am okay with not supporting this use case. How does virtualenv deal with this anyway?
That is not documented, nor is the behavior I see. This could change between os, but on Linux, CPython is only checking the same directory and one directory up. I would be happy to discuss this further. virtualenv is kind of a heavy dependency, so I am reluctant to pull it. |
If you say not supporting windows without symlink is ok I'll need to retract my support for this joining pypa. |
*supporting Windows without symlink support when the interpreter cannot be copied (which is the case of Python distributed by the Windows store) The core functionality of the package still works, you just can't build in isolation. We can actually disable the Python interpreter copy behavior outside virtual environments, which would running python-build from a virtual environment as an additional requirement above. Please let me know how virtualenv solves this issue, we may be able to apply the same approach here. |
Per PEP 517 building in isolated environments is considered core functionality. Maybe you can get away with using venv instead of virtualenv to be fair here. And then have some hack like this for python 2, where copy only might be enough. virtualenv solution is hundreds os lines of code. Rewriting e.g. mach-o files is non trivial. |
It definitely would be nice to avoid pulling in If it's significantly non-trivial to get the isolation logic right, maybe we should use With regards to platform support, I don't think PyPA has any particular requirements for universal support for all features for all platforms — I'll note that |
I think using venv for python 3, and some hacked together for python 2 would be the easiest and best investment. Paul pip wants to use this tool to build packages, so it must support everything because pip needs to. |
virtualenv should have exactly the same issue we do, so what I am interested here is to understand if the way it solves the issue is complex enough that would make sense for us to pull it as a dependency. If it isn't complicated we could possibly do the same thing here. Both venv and virtualenv work by giving you a custom Python interpreter. We have already established that overriding Now, I think the use cases that wouldn't be supported are obscure enough for us to justify it. And if we (or more likely, I) do manage to fix the current workflow in the Python interpreter, they would start to be supported. As a maintainer, I think it is very important to keep in mind the scope of the project and understand that you can't do everything. I think supporting the use case in question will block progress for the project. If I had unlimited time, I would maintain all backend approaches, maximizing the use cases we support, but I don't. If anyone does, please let me know, I would be more than glad to have that. To finish, I just wanted to note that this is not a run, it's a marathon. Not supporting use cases right now doesn't mean we can't support them in the future. We need to save our breath to try to pass the finish line, not waste it trying to go fast. Sorry for this corny metaphor 😅, that is just how I feel. |
Oh, I think it's import point this out, in case anyone is forgetting, python-build would still be able to build packages in the environments in question, it's just that it won't be able to build in an isolated environment. |
Can you ellaborate on this, what does custom covers here?
Why do you need to override?
Not sure I follow, but we need to support Python in general, not just CPython. Please don't exclude other Python implementations.
Unlikely without tens of lines of extra code for every release type (version constrained, etc). The main reason virtualenv moved away from single file architecture is because they are way too many variations, and you end up with hundereds lines of code.
Exactly. The scope of this project is to build a python package. Creating isolated python environments is a goal for virtual envrionment creation projects. So let's not try to adddress that within this project, but instead delegate that job to a tool that does this (venv/virtualenv). This means we can spend more time on addressing the project goals, less trying to get support for various python releases.
This is exactly my issue raised. IMHO you're trying to fix a dumpster fire, and I'm just warning if you go down this path you'll end up with few more of this. I'd recommend instead avoiding them by delegating isolation part to tools who are dedicated to do this.
If you just switch to venv/virtualenv (I'd say use venv on py3.4+, virtualenv otherwise - pulling in virtualenv on python 2 seems good enough compromise considering it's EOL) you'll no longer need to revisit this issue at any point in the future. You solve this and all future isolation issues in one go. Seems the efficient thing to do.
Violating the requirement of PEP-518 should not be taken lighly IMHO. Building in isolated environment is not optional per that PEP. |
|
||
Passing non PEP 508 strings will result in undefined behavior, you | ||
*should not* rely on it. It is merely an implementation detail, it may | ||
change any time without warning. | ||
''' | ||
if not requirements: | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realized that you are calling:
subprocess.check_call([sys.executable, '-m', 'ensurepip'], cwd=self.path)
This will install pip and setuptools. IMHO this is bad because now setuptools is by default provisioned and not specifying it within your build requires will silently pass. We should ensure only pip is installed in there 🤔 . Perhaps we should uninstall if not specified within the build-requires list. On handle the easy to make error of forgetting setuptools as build-require but build still passes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, I hadn't noticed before. We should be able to just delete setuptools*
from site-packages
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be able to just delete
setuptools*
fromsite-packages
.
I don't think that's wise, you'd be relying there on implementation details of what pip uninstall does. Probably better to just use pip uninstall directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WTF? Why does ensurepip
install setuptools
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because in a source only non PEP-518 world pip install
would break without a local setuptools.. Is my guess.
They give you their own Python interpreter.
Because
Sorry, in Python.
Can you elaborate, I don't follow? Unless Python implementations are not following the PEP, I see no reason to have custom code for each release.
The scope of the project is to provide a CLI tool to build Python packages, that is easy to bootstrap. I understand that other people may not care about bootstrapability, and just vendor dependencies, but I do and that is exactly the reason I created this project. I want to make sure that Python environments are easily bootstrapble, preferring following all that PEP 517 says. You might also notice the 'correct' in the description, which means that we should do things correctly, I've been burned many times by projects not caring about this, especially on such critical components as build systems. We shouldn't be relying on hacks such as overriding The long term plan can't be to use venv and virtualenv. But we could use them venv and virtualenv short term, I am not arguing with that. The long term plan would be to fix this use-case in Python itself. The current short term effects are that a few obscure use cases are not supported, and I don't think that warrants me taking the time to move to venv/virtualenv. I would rather take that time to work on fixing the workflow in Python, it takes time and as soon as we do that, as soon we will have this problem fixed. With that said, if anyone does want to take the time to change the code to venv/virtualenv, please be my guest, I would really appreciate it. But that will not be the long term plan.
My position on this is to wait, if the maintenance burden becomes higher than the burden of moving the code to venv/virtualenv, I will consider doing that.
As I just said above, I think that would be a fine short term approach. You are free to do it.
I do not take it lightly. |
Also, we are not on 1.0.0, we are on 0.0.4, having rough edges is fine. |
If we invoke the build operations in a subprocess of the virtual environment (isolated) this is not a problem. You already do a lot of subprocess calls, so see no reason why not do it here too.
Well it's not really their own. It's a python environment that tweaks enough parts of the global python to hide away globally installed packages. Which also sounds to me exactly what an isolated python environment is.
I've suggested to use venv, that's part of the standard library so should not need boostrapping. I did say virtualenv for python 2, but if you'd not want that we can use some hackish solution like here.
I'm complelty on board with you on this. This is why I'm suggesting a better isolation system than what you have here. Because what's here is not correct when symlinks are not available, or the python executable is not read-able (but can be executed).
Why not? Those systems have been designed especially to create isolated build environments, and venv does not require any custom boostraping.
I guess we fundamentally disagree on how one should go about supporting python interpreters. I prefer designed to work upfront, while you're aiming at wait until enough people complain to make your effort worthwhile. From my POV people trying to use this tool, and then finding out it does not work because it does not support their use cases is hurtful to the whole packaging ecosystem, because another thing that they should use but they cannot. |
I can understand
Yeah, @gaborbernat to be clear the
Anything we fix in Python would only hit Python 3.10+. Assuming
I think we're already doing significantly better than
The upside of this is that it should help us avoid the complicated logic involved in supporting dozens of different platforms while also giving most users the benefits of not using Another option that is possibly more palatable would be to default to using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't consider my comments as deal breakers, but just notting the project as it stands will (very likely) not work (correctly) with Windows Store or macOs framework python implementations.
I agree with Bernát here, it would be very difficult to distinguish where to use venv and where not. I think we should just use venv.
Essentially I think we should have an environment variable to disable The goal is to allow spinning up custom environments completely independently on the interpreter. sysconfig paths:
With this we should be able to simply start a process from What do you think? |
Not sure I understand how would it be useful. How would you provision pip for a such isolation, without making the entire system packages available🤔maybe would help to create a no packages installed venv via just env vars but creating virtual environments without pip is not expensive 🤷♂️so IMHO would cost more than benefit for interpreter maintenance. |
I don't see any reason why ensurepip would not work since I am not touching The benefit here would be that we can run a virtual environment with any interpreter, that is the whole point. It would also solve the Windows store Python issue, as we wouldn't need to copy the executable anywhere, no weird workarounds needed. |
I guess could work... But arguably only useful for tools, as would be impractical to use by end users. I'm a bit skeptical core devs would accept this kind of fine tunability while you can already solve the problem with just invoking virtualenv. The benefits over direct virtualenv invocation overall are very low the way I see it. |
Yes, only useful for tools. virtualenv would indeed be available but the problem I see here is that bootstraping virtual environments when we have some limitations regarding the interpreter, like your example of the Windows store, is very tricky. virtualenv is great and I am very glad it exists but I would like to see virtual environments being easily bootstrapable by anyone. The current status quo is use virtualenv or have pain. Being able to create virtual environments without any dependency on what Python interpreter is being run would really help that. Btw, I am still not clear on how virtualenv solves the issue of using Windows store Python when symlinks are not available. I would really appreciate if you could clarify that for me. The main issue I see here is how most console entrypoints are generated. For tools like pipx that want to run an entrypoint from one specific environment, they would have to clear the environment before running the Python process. This is prefectly doable, but could be annoying. I think it is at least worth exploring and getting feedback from other people. Do you agree? |
For Python 3 you can just use venv. For Python 2 you need virtualenv, but let's be realistic any PEP we'd come up would be python 3 only.
It doesn't solve it. Delegates the job to the window store pythons venv module, that knows how to handle himself. The venv module basically looks up a redirect script and uses that as python executable.
If you have the bandwidth for it sure. From my side not worth the gain considering the effort one needs to put in, but feel free to pursue that path. |
I'm not a packaging guru (by any measure!) but I don't understand what we're trying to do here. As I understand it, build isolation means that the builder should not have access to system- and user-level packages (anything that is not part of the stdlib). To my knowledge, you can accomplish this in two ways: minimally, by overriding |
We have no control over how the Python subprocess in |
Thanks for explaining. Using a virtual env sounds like a far better option than trying to account for what looks like a hotchpotch of edge cases in this package. |
@gaborbernat Are there plans to port your |
That's for the venv maintainers to decide onto. Note venv uses ensurepip to provision pip/setuptools, so in all likelyhood it's up to the ensurepip maintainers at the end of day. This seems to be @ncoghlan @pfmoore @dstufft. So if they agree that the path taken by virtualenv is preferable, can make a PR against CPython (and PyPy respectively). |
Can someone summarise the issue here, and what it is that is being suggested ensurepip (or venv) needs to do? I don't have time right now to read the whole of this thread, but I'm happy to comment on something specific. |
off-topic for this discussion but currently ensurepip uses pip to install pip+setuptools. @ofek was suggesting it should do the same as virtualenv does: extract wheel himself (potentially via the install project), and cache it. So that venv virtual environment creation time is on par with virtualenv. |
OK, I'd say that's not likely to be something we'd add to ensurepip. Once installer is mature, maybe it would be worth using to do the installs - but why add an extra component to the core when we already have a wheel installer included in the form of pip? More likely would be improvements to speed up pip. We currently use |
@ofek answers this.
Your words above do object against this. And instead redirect people to make pip faster. Note this later is likely by a few order of magnitudes harder task, due to the legacy/complexity of pip compared to a simple wheel unistall/cache system. |
@gaborbernat OK, sorry if I'm missing things - as I say, I don't have time to read the whole thread, so I'm somewhat relying on the summary. To repeat what I was trying to say - what is the proposed change, and do we have a demonstration that it'll increase venv creation speed sufficiently to justify any additional costs and overheads? I suspect a big proportion of the speed penalty is pip unpacking to a temp location and copying. Has anyone confirmed whether the new version of pip with unpack in place improves things here and by how much? That would be the first thing to check. The proposal to add the installer module has a number of problems right now. First, the installer module isn't even complete. Let's look again when that's sorted out. Secondly, we would be adding a module which would be solely in support of ensurepip, and wouldn't be installed on the user's system. That's a bit of a weird thing to do - why not either bring installer into the stdlib (immediate response - it needs to mature a lot before that's worth considering) or use the tools that are already there, i.e., pip (the response being "pip's too slow", to which my counter is "why not improve pip and help everyone?") To directly answer your question:
I'm happy to see a PR, as long as it comes with benchmarks that demonstrate the improvement, and the code isn't more of a maintenance burden than the improvement warrants. One particular question I'd want to see addressed, is who would update the code to deal with possible future changes in the wheel format? Would that task fall on the core devs? |
Signed-off-by: Filipe Laíns <[email protected]>
Signed-off-by: Filipe Laíns <[email protected]>
cf294d7
to
892aa38
Compare
Signed-off-by: Filipe Laíns [email protected]