Use uv instead of pip to manage virtualenvs #432

Merged: 9 commits, Jul 23, 2024

jherland (Member) commented May 7, 2024

uv is much faster than pip for installing packages (it is also faster than python -m venv for creating virtualenvs). This PR switches from venv/pip to uv in three separate parts of our project:

  1. Instruct Nox to use uv instead of pip to manage the virtualenv associated with each Nox session (first commit).
  2. Use uv instead of venv/pip (when available) to manage virtualenvs for our sample_project and real_projects tests (next three commits).
  3. Use uv instead of venv/pip (when available) to create and populate the temporary virtualenv managed by the --install-deps option (next two commits).

The first two points above only concern our tests and have no effect on the FawltyDeps program itself. The last one changes the behavior of the --install-deps option to use uv when available (otherwise it falls back to venv/pip).

The last commit in this PR adds a small convenience for our users: If they want to ensure that uv is available for FawltyDeps to use, they can now install fawltydeps[uv]. This uv extra will bring in uv as a dependency in much the same way that we currently depend on nox[uv] in our own development environment.

Commits:

  • Use uv instead of pip to manage nox virtualenvs
  • tests/project_helpers: Use uv (if available) to prepare virtualenvs
  • .github/workflows/tests.yaml: Preserve uv cache across test runs
  • Sidestep issues installing projectq with 'uv pip install'
  • Prepare to decouple --install-deps from pip
  • TemporaryAutoInstallResolver: Use uv if available
  • pyproject.toml: Add uv as an "extra" dependency for FawltyDeps

@jherland jherland force-pushed the jherland/ruff-format-instead-of-black branch from a2c964c to fc666fd Compare May 13, 2024 11:16
Base automatically changed from jherland/ruff-format-instead-of-black to main May 13, 2024 11:59
@jherland jherland force-pushed the jherland/uv-replaces-pip branch from e6f968f to ac3e5bd Compare May 22, 2024 15:56
@jherland jherland force-pushed the jherland/uv-replaces-pip branch from 0ba8f7d to 005c054 Compare June 11, 2024 10:45
@jherland jherland marked this pull request as ready for review June 11, 2024 10:56
@jherland jherland mentioned this pull request Jun 11, 2024
11 tasks
@jherland jherland linked an issue Jun 11, 2024 that may be closed by this pull request
11 tasks
@jherland jherland force-pushed the jherland/uv-replaces-pip branch 3 times, most recently from f47229f to 3978d7d Compare June 12, 2024 09:28
@jherland jherland changed the title Use uv instead of pip to manage nox virtualenvs Use uv instead of pip to manage virtualenvs Jul 9, 2024
jherland added 7 commits July 9, 2024 11:15
Use uv instead of pip to manage nox virtualenvs

Upgrade Nox to (at least) 2024.03.02, which is the first version with
support for managing virtualenvs with uv.

Add uv as an indirect dependency by depending on "nox[uv]" instead of
"nox" (exception: inside the "lint" dependency group, we only depend on
"nox" in order for Mypy to access Nox' type annotations; uv is not
needed here).

Also we cannot use/depend on uv when using Python 3.7, since uv requires
Python >=v3.8. I'm not actually sure _why_ uv requires >=v3.8, as it is
apparently able to create venvs for Python v3.7 (see e.g.
https://github.com/astral-sh/uv?tab=readme-ov-file#python-discovery);
still, astral-sh/uv#1239 prevents uv from being
installed on <=v3.7.

Finally, in noxfile.py, use uv as our default venv_backend instead of
Nox's default (virtualenv with pip), but only when it is in fact available.
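
A minimal sketch of the "only when available" check, assuming a lookup on `$PATH` via `shutil.which`. The helper name `pick_venv_backend` and its `which` parameter are hypothetical (the parameter only exists to make the function testable); the actual noxfile.py may do this differently.

```python
import shutil
from typing import Callable, Optional


def pick_venv_backend(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Return "uv" if a uv executable is on PATH, else "virtualenv"."""
    return "uv" if which("uv") is not None else "virtualenv"


# In a noxfile.py one would then assign the result, for example:
#     nox.options.default_venv_backend = pick_venv_backend()
print(pick_venv_backend())
```

The result is a plain backend name, so the same noxfile keeps working on machines where uv is not installed (for instance under Python 3.7).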

A final complication on Nix(OS) happens when we install requirements for
the current session; we do this in two steps, and then we make sure that
whatever we installed was patched appropriately:

    session.install("-r", str(requirments_txt))
    if include_self:
        session.install("-e", ".")

    if not session.virtualenv._reused:  # noqa: SLF001
        patch_binaries_if_needed(session, session.virtualenv.location)

However, with uv in the mix, we have to consider that session.install()
itself _runs_ uv, and that the first session.install() may
also _install_ uv itself into the virtualenv. The second
session.install() can then end up _running_ a uv that was _installed_
by the first session.install(), and this will break on Nix(OS) unless
the uv binary has been patched in the meantime.

We therefore need to insert a call to patch_binaries_if_needed()
_between_ the two session.install() calls. Since the second
session.install() only installs FawltyDeps itself (which does not
introduce any binaries to be patched), we can get away with simply
reordering the second session.install() and the call to
patch_binaries_if_needed():

    session.install("-r", str(requirments_txt))

    if not session.virtualenv._reused:  # noqa: SLF001
        patch_binaries_if_needed(session, session.virtualenv.location)

    if include_self:
        session.install("-e", ".")
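
Why the order matters can be illustrated with a toy model. Everything below is a hypothetical stand-in (`FakeSession`, `old_order`, `new_order`), not Nox's real API: it only encodes the rule described above, namely that once uv has been installed into the virtualenv, later installs run that copy and fail if it is still unpatched.

```python
class FakeSession:
    """Hypothetical stand-in for a Nox session on Nix(OS)."""

    def __init__(self):
        self.uv_in_venv = False  # has uv been installed into the venv?
        self.uv_patched = False  # has that uv binary been patched?
        self.log = []

    def install(self, *args):
        # Once uv lives in the venv, every later install runs that copy.
        if self.uv_in_venv and not self.uv_patched:
            raise RuntimeError("unpatched uv binary cannot run")
        self.log.append(args)
        if "-r" in args:
            self.uv_in_venv = True  # assumed: requirements pull in uv

    def patch_binaries(self):
        self.uv_patched = True


def old_order(session):
    session.install("-r", "requirements.txt")
    session.install("-e", ".")
    session.patch_binaries()


def new_order(session):
    session.install("-r", "requirements.txt")
    session.patch_binaries()
    session.install("-e", ".")


try:
    old_order(FakeSession())
    old_result = "ok"
except RuntimeError as exc:
    old_result = f"failed: {exc}"

new_session = FakeSession()
new_order(new_session)
print(old_result)
print(len(new_session.log))
```

In this model the old order fails on the second install, while the new order completes both installs.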

tests/project_helpers: Use uv (if available) to prepare virtualenvs

Our sample_projects and real_projects tests use CachedExperimentVenv in
tests/project_helpers.py to prepare virtualenvs containing the
dependencies for each of these tests. Establishing these virtualenvs is
costly, which is why we also _cache_ these virtualenvs between test
runs (using the pytest cache).

Using `uv` instead of `pip` can considerably speed up the creation of
these virtualenvs. The speedup is largely due to two factors:

1. `uv` is simply faster than `pip`, even when they essentially perform
   the same tasks.
2. `uv` also implements its own cache of downloaded packages and will
   install a package into a virtualenv by _hardlinking_ the package
   files from its own cache.
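
The hardlinking idea in point 2 can be illustrated with the standard library alone. This is a sketch of the general mechanism, not of uv's internals; the helper `install_by_hardlink` and the file names are made up for the example.

```python
import os
import tempfile
from pathlib import Path


def install_by_hardlink(cached_file: Path, venv_dir: Path) -> Path:
    """Hardlink cached_file into venv_dir instead of copying its bytes."""
    venv_dir.mkdir(parents=True, exist_ok=True)
    target = venv_dir / cached_file.name
    os.link(cached_file, target)
    return target


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "cache" / "module.py"
    cache_file.parent.mkdir()
    cache_file.write_text("VALUE = 42\n")

    installed = install_by_hardlink(cache_file, Path(tmp) / "venv")
    # Both names now refer to the same data on disk.
    same_file = os.path.samefile(cache_file, installed)
    link_count = cache_file.stat().st_nlink

print(same_file, link_count)
```

Because no file contents are copied, "installing" from a warm cache costs little more than creating directory entries, which fits the near-instant venv creation reported below.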

Here are some measurements before and after this commit. We run

  `time nox -Rs integration_tests-3.12 -- -k Python:all_reqs_installed`

which times the execution of _one_ real_projects test with a fairly
large set of dependencies: "The Algorithms - Python:all_reqs_installed".
Each scenario is run 3 times:

Before this commit (i.e. using `pip`):
  - Cold pytest cache (after running `rm -rf ~/.cache/pytest/*`):
        - 1m34.633s
        - 1m28.204s
        - 1m37.618s
  - Warm pytest cache:
        - 7.138s
        - 6.732s
        - 7.406s

After this commit (i.e. using `uv` instead of `pip`):
  - Cold `uv` cache + cold pytest cache (after running
    `rm -rf ~/.cache/uv ~/.cache/pytest/*`):
        - 1m28.220s
        - 1m34.373s
        - 1m34.682s
  - Cold pytest cache (after running `rm -rf ~/.cache/pytest/*`):
        - 9.602s
        - 9.077s
        - 9.918s
  - Warm pytest cache:
        - 7.575s
        - 6.600s
        - 6.780s
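
To put a number on the difference, the measurements above can be averaged. The snippet below is only arithmetic on the timings listed in this message; the helper `to_seconds` is introduced here for illustration.

```python
import re
from statistics import mean


def to_seconds(stamp: str) -> float:
    """Convert a `time`-style string such as '1m34.633s' or '9.602s'."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized duration: {stamp!r}")
    minutes, seconds = match.groups()
    return 60 * int(minutes or 0) + float(seconds)


# Cold pytest cache, before (pip) and after (uv with a warm uv cache).
pip_cold = ["1m34.633s", "1m28.204s", "1m37.618s"]
uv_cold_pytest_only = ["9.602s", "9.077s", "9.918s"]

pip_mean = mean(to_seconds(t) for t in pip_cold)
uv_mean = mean(to_seconds(t) for t in uv_cold_pytest_only)
speedup = pip_mean / uv_mean
print(f"pip: {pip_mean:.1f}s, uv: {uv_mean:.1f}s, speedup: {speedup:.1f}x")
```

With a cold pytest cache but a warm `uv` cache, the run is roughly ten times faster than the `pip` baseline.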

When both the `uv` cache and our own pytest-based cache are empty, `pip`
and `uv` essentially have to perform the same work and the run time is
dominated by the time it takes to download and unpack the required
packages.

In the warm cache case we reuse an existing virtualenv from the pytest
cache and `pip`/`uv` is not involved at all.

But in the case where we cannot reuse our pytest cache (e.g. because
some detail of the experiment has changed), then `uv` will take
advantage of its own cache to create the required virtualenvs almost
instantaneously.

In essence, with `uv`, downloaded packages will be cached across test
runs whether or not we implement our own caching. In the future - if we
can _mandate_ `uv` instead of `pip` - we can consider removing our
pytest-based cache with little impact on our test run times.

.github/workflows/tests.yaml: Preserve uv cache across test runs

The previous commit explains why the uv cache has the potential to speed
up the execution of our sample_projects and real_projects test cases.

However, in order for CI to benefit from the same potential speedup, we
need to actually preserve the uv cache across test runs.

Sidestep issues installing projectq with 'uv pip install'

While attempting to set up the Python environment for TheAlgorithms/Python,
`uv pip install` fails to install the `projectq` dependency:

  AssertionError: would build wheel with unsupported tag
  ('cp311', 'cp312', 'linux_x86_64')

This error is reproducible outside of the FawltyDeps context, and seems
to be an issue with either `projectq` itself, `uv`, or the combination
of the two.

Our test suite is NOT about installing projectq, but rather about
running FD on TheAlgorithms/Python, so instead of getting bogged down in
irrelevant details concerning its dependencies, let us abandon the
all_reqs_installed experiment altogether. This experiment was already
limited to Linux only, due to other dependency-related issues on Mac and
Windows.

Also, adjust the some_reqs_customized experiment to avoid having to deal
with the projectq dependency (custom mapping FTW!).

One thing we might want to consider for our real_projects test suite is
to stop relying on properly installing third-party dependencies
(that are currently unpinned and therefore will change without warning),
and instead simulate their installation with our `fake_project` test
fixture.

Prepare to decouple --install-deps from pip

We are about to introduce `uv` as a preferred alternative to `pip` for
automatically installing packages when `--install-deps` is enabled.

Thus, it no longer makes sense for the class implementing this to be
called `TemporaryPipInstallResolver`, so rename it to
`TemporaryAutoInstallResolver`.

Update comments around the codebase accordingly.

TemporaryAutoInstallResolver: Use uv if available

Until now TemporaryAutoInstallResolver has used venv.create() to create
the temporary virtualenv and then run `pip install` to install
dependencies into this venv. This works, but is slow.

So slow, in fact, that our tests (test_resolver in particular) have had
to implement caching of the virtualenv simply to reduce the test runtime
from ~60s to ~15s (runtimes vary wildly since this is a hypothesis test,
but these are averages from multiple runs).

When `uv` is available we can use it to both create the temporary
virtualenv, as well as install packages into it. Due to `uv` itself and
the local cache of packages that it maintains, this is now so fast that
we no longer have to cache the virtualenv in test runs (test_resolver
now takes ~7s without caching, and ~6s with caching).

For now, we don't make this configurable: If `uv` is found in $PATH, we
will use it, otherwise we fall back to venv + pip.
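
A rough sketch of that selection logic, written as a function that only plans the commands rather than running them. The names `venv_python` and `plan_commands` are hypothetical, and the fallback uses `python -m venv` in place of the `venv.create()` call mentioned above; the real TemporaryAutoInstallResolver may be structured differently.

```python
import shutil
import sys
from pathlib import Path
from typing import List, Optional


def venv_python(venv_dir: Path) -> Path:
    """Path of the interpreter inside a virtualenv (POSIX vs Windows)."""
    if sys.platform == "win32":
        return venv_dir / "Scripts" / "python.exe"
    return venv_dir / "bin" / "python"


def plan_commands(
    venv_dir: Path, requirements: List[str], uv_exe: Optional[str]
) -> List[List[str]]:
    """Return the commands to create venv_dir and install requirements."""
    python = str(venv_python(venv_dir))
    if uv_exe is not None:
        # uv handles both venv creation and package installation.
        return [
            [uv_exe, "venv", str(venv_dir)],
            [uv_exe, "pip", "install", "--python", python, *requirements],
        ]
    # Fallback: stdlib venv plus the pip inside the new venv.
    return [
        [sys.executable, "-m", "venv", str(venv_dir)],
        [python, "-m", "pip", "install", *requirements],
    ]


commands = plan_commands(Path("tmp_venv"), ["requests"], shutil.which("uv"))
for cmd in commands:
    print(" ".join(cmd))
```

Passing the result of `shutil.which("uv")` keeps the decision in one place: `None` selects the venv + pip path, anything else selects uv.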

pyproject.toml: Add uv as an "extra" dependency for FawltyDeps

This allows users to install (or depend on) `"fawltydeps[uv]"` which
will then automatically bring in `uv` alongside `fawltydeps`.

This has no effect on our test/developer environments, as we install
`fawltydeps` without extras, but we still get `uv` via our dependency
on `nox[uv]`.

We should consider updating the FawltyDeps GitHub Action to install
`fawltydeps[uv]`, so that we can benefit from `uv` in that scenario as
well.
@jherland jherland force-pushed the jherland/uv-replaces-pip branch from 3978d7d to 57ebc4c Compare July 9, 2024 09:16
mknorps (Collaborator) commented Jul 16, 2024

Thank you @jherland 🫶

I started checking the PR by looking at all usages of "pip". There are some places where we can consider updating the text or a name:

  1. "FawltyDeps into a temporary virtual environment. This will use `pip install`,":
    "This will use pip install" -> "This will use uv pip install and fall back to pip install if uv is not available"
  2. pip show myLibrary

    Should we also include uv instructions? Or should we only apply it in the developer context?
  3. # set the test's env variables so that pip would install from the local repo
    monkeypatch.setenv("PIP_NO_INDEX", "True")
  4. requirements: List[str] # PEP 508 requirements, passed to 'pip install'
  5. def test_resolve_dependencies_install_deps__raises_unresolved_error_on_pip_install_failure(
    caplog,
    does this still hold for uv?

mknorps (Collaborator) left a comment

Thank you @jherland for handling that ❤️

I did a small test before and after installing uv on the detect-waste project and the results are awesome!

(fawltydeps-py3.11) ➜  detect-waste git:(main) time fawltydeps --install-deps
...
For a more verbose report re-run with the `--detailed` option.
fawltydeps --install-deps  23.11s user 6.87s system 88% cpu 34.040 total

vs 🎉

(fawltydeps-py3.11) ➜  detect-waste git:(main) time fawltydeps --install-deps
...
For a more verbose report re-run with the `--detailed` option.
fawltydeps --install-deps  5.21s user 4.54s system 43% cpu 22.191 total

The --install-deps option will be much more usable now, thank you!

I did not meticulously check all the changes you made, but given that you did not have to change any test of our rich test suite, I am confident (99% 😄 ) it works without errors.

@jherland jherland force-pushed the jherland/uv-replaces-pip branch from 17c25ea to f2d359c Compare July 23, 2024 12:54
jherland (Member, Author) commented Jul 23, 2024

> I started checking the PR by looking at all usages of "pip". There are some places where we can consider updating the text or a name:

Agreed. Look at f2d359c for my rewordings.

> monkeypatch.setenv("PIP_NO_INDEX", "True")
> does this still hold for uv?

Actually it does not! 🙀

Furthermore, although uv supports some UV_* environment variables that mirror the PIP_* equivalents, the settings we need here (no-index and find-links) cannot yet be set via the environment. Instead, I found a workaround that relies on setting up a (temporary) config file for uv. Take a look at 2795898 for the details.

jherland added 2 commits July 23, 2024 15:47
The local_pypi fixture is used in tests to provide a reproducible
package installation environment, i.e. without _actually_ fetching
packages from PyPI.

This is done for `pip install` by setting a couple of `PIP_*`
environment variables that force `pip` to only look at a local
directory of package files. However, the new `uv pip install` method
does not obey the same environment variables.

To configure the same for `uv` we need to write a TOML configuration
file and point `uv` to this via `UV_CONFIG_FILE` (because corresponding
`UV_NO_INDEX` and `UV_FIND_LINKS` environment variables do not yet
exist, see astral-sh/uv#1789 for details).

We set up a temporary file containing this TOML config for `uv` and
make sure it is automatically deleted after the test is run.
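
A sketch of such a temporary config, written as a plain context manager rather than a pytest fixture. The helper name `uv_local_index` is hypothetical, and the `[pip]` table with its `no-index` and `find-links` keys is an assumption about uv's config schema that should be checked against the uv settings reference; only `UV_CONFIG_FILE` is taken from the commit message above.

```python
import json
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


@contextmanager
def uv_local_index(packages_dir: Path) -> Iterator[Path]:
    """Point uv at a temporary config that only uses packages_dir."""
    # json.dumps() yields a double-quoted, escaped string, which is also
    # a valid TOML basic string for ordinary paths.
    toml = (
        "[pip]\n"  # assumed table and key names, see the note above
        "no-index = true\n"
        f"find-links = [{json.dumps(str(packages_dir))}]\n"
    )
    fd, name = tempfile.mkstemp(suffix=".toml")
    previous = os.environ.get("UV_CONFIG_FILE")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(toml)
        os.environ["UV_CONFIG_FILE"] = name
        yield Path(name)
    finally:
        # Restore the environment and delete the temporary file.
        if previous is None:
            os.environ.pop("UV_CONFIG_FILE", None)
        else:
            os.environ["UV_CONFIG_FILE"] = previous
        os.unlink(name)


with uv_local_index(Path("/tmp/local_pypi")) as config:
    config_text = config.read_text()
    active = os.environ["UV_CONFIG_FILE"]
config_deleted = not config.exists()
print(config_text)
```

In a pytest fixture, `monkeypatch.setenv("UV_CONFIG_FILE", ...)` would take care of restoring the environment, leaving only the file cleanup to handle.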
In various docs and code comments we still refer to `pip install`, when
we in fact end up calling _either_ `uv pip install`, or falling back to
`pip install` when `uv` is not available.

Fix these references to indicate this extra detail.
@jherland jherland force-pushed the jherland/uv-replaces-pip branch from 76eee5d to eebb84e Compare July 23, 2024 13:56
@jherland jherland merged commit f980f3c into main Jul 23, 2024
63 checks passed
@jherland jherland deleted the jherland/uv-replaces-pip branch July 23, 2024 14:09
Development

Successfully merging this pull request may close these issues.

Meta: Dev/test environment speedups
2 participants