Pin requirements to specific versions and start removing python 2 support #502
Conversation
Looks fine, but I'll approve after you respond to my comments.
@@ -408,6 +408,28 @@ instance (without any instrumentation)
through variable `driver`.

### Managing requirements

We use [`pip-tools`](https://github.com/jazzband/pip-tools) to pin
This feels complicated :(
Is there any way we can abstract the dev/production dependency layering away from the dev?
There's a bit of discussion around this here: jazzband/pip-tools#398. I think it's worth keeping despite the minor complexity for the developer handling dependency upgrades. IMO it's better to pay that complexity cost at the time of adding new dependencies than to pay the cost of chasing down bugs that result from an inconsistent environment that changes depending on whether the dev requirements were installed.
As an example, let's say `requirements.txt` installs the newest version of `pyarrow`, and this version causes the bug we experienced in #498, but some dependency in `requirements-dev.txt` pins to an older version of the library (which doesn't cause #498). If we run `pip install -r requirements.txt -r requirements-dev.txt`, pip will throw an error about conflicting, incompatible pinned versions -- this is actually what prompted me to look into layering. If we instead run `pip install -r requirements.txt` followed by `pip install -r requirements-dev.txt`, pip will first install the new version of `pyarrow`, then uninstall it and install the older pinned version from `requirements-dev.txt`. All of the CI tests will pass just fine, but any user who only runs `pip install -r requirements.txt` will receive the newer version of `pyarrow` and thus experience the bug.
This post discusses other tools that try to hide some of this complexity, but ultimately recommends `pip-tools` because of recurring issues with the alternatives.
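To make the layered workflow concrete, here is a minimal sketch of how `pip-tools` can be wired up. The file names match the discussion above, but the `.in` file contents and the exact commands are assumptions about how a repository might set this up, not a description of this PR:

```sh
# requirements.in lists direct production dependencies;
# requirements-dev.in lists dev-only tools and constrains itself to the
# compiled production pins so the two files cannot disagree, e.g.:
#
#   requirements.in:      pyarrow
#   requirements-dev.in:  -c requirements.txt
#                         pytest

# Compile each .in file into a fully pinned .txt file.
pip-compile requirements.in --output-file requirements.txt
pip-compile requirements-dev.in --output-file requirements-dev.txt

# Installing both pinned files now yields one consistent environment,
# rather than the second install silently replacing packages from the first.
pip install -r requirements.txt -r requirements-dev.txt
```

With a setup like this, re-running `pip-compile` after editing an `.in` file is the only manual step, and the compiled `.txt` files stay internally consistent.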
@@ -0,0 +1,29 @@
beautifulsoup4
My first impression from reading your comment above was that these .in files are like buffers that get fed into the final .txt files. If that's correct, perhaps we should add these to .gitignore?
Does feeding an empty .in file to pip-compile result in an empty .txt file? If so, I'm wrong - we should commit these.
Yes, feeding an empty `.in` file will result in an empty `.txt` file. Including both is the recommended workflow.

Think of the `.txt` files as compiled versions of the `.in` files. The `.in` files specify the packages we want to use, plus any pins we want to specify manually. The `.txt` file then includes the full set of pinned dependencies (including dependencies of dependencies). So we will need to regularly re-compile the `.txt` files from the `.in` files to pick up the latest versions of all of the packages we rely on, but we know that between updates we will have a deterministic set of dependencies.
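As a rough illustration of that relationship (the package names, versions, and generated-file layout below are illustrative, not taken from this PR):

```sh
# requirements.in is hand-edited and holds only direct dependencies,
# plus any manual pins (illustrative contents):
#
#   beautifulsoup4
#   pyarrow<0.12        # hypothetical manual pin
#
# Compiling it produces requirements.txt with every transitive
# dependency pinned and annotated with where each pin came from:
pip-compile requirements.in --output-file requirements.txt
#
#   beautifulsoup4==4.7.1   # via -r requirements.in
#   soupsieve==1.7.3        # via beautifulsoup4
#   pyarrow==0.11.1         # via -r requirements.in
#   numpy==1.16.1           # via pyarrow
#
# Periodic refreshes re-resolve everything to the latest allowed versions:
pip-compile --upgrade requirements.in --output-file requirements.txt
```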
Gotcha, sounds like `package.json` and `package-lock.json` in the world of Node.js.
Now that we have proper support for pinned packages, this is more trouble than it's worth. Over the past few years the only time I've seen this test fail is when I needed to track down a mismatch between a package name on pip and the import name. That's only going to get worse now that we are pinning all sub-dependencies.
This PR itself looks good.
However, I'd love to see a move to Anaconda eventually (a vital step towards #503), and this could be a good moment to do it. Anaconda's `environment.yml` is a well-known standard, involves no pre-processing step, and `conda env export` pins versions by default. There is a test-passing PR for this in #429 -- it needs rebasing and cleanup for sure, but it may be worth going down that path now rather than first introducing this and then moving to conda.
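For reference, the conda-based flow being suggested would look roughly like the sketch below; these are standard conda commands, and the exported file name is an arbitrary choice, not something taken from #429:

```sh
# Create the environment from a hand-maintained spec.
conda env create -f environment.yml

# Export the resolved environment; conda pins exact versions by default,
# so the exported file can serve as the lock/pin artifact.
conda env export > environment.lock.yml   # file name is illustrative
```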
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but I don't have an opinion on what @motin said about moving to anaconda.
Thanks! I still need to spend some time thinking through Anaconda and whether it's something we'd want to move to. For now, this PR solves an immediate need and I don't think we should block on making a decision about #429.
(Includes #498, but will be rebased once that is merged) DONE

This PR does three things:

- Uses `pip-compile` from `pip-tools` to pin requirements to specific versions. Fixes "Pin versions for all pip-installed dependencies" #499.