
DEBUG #299

Closed · wants to merge 42 commits
Conversation

h-vetinari (Member) commented on Dec 4, 2024

Demonstrate that the fix in conda-forge/libcufile-feedstock#24 was not sufficient, and that conda-forge/libcufile-feedstock@733742d needs to be reverted.

Debugging run for #298 (which contains the commits for the original goal of this PR) + #304

mgorny and others added 4 commits December 4, 2024 19:13
Upstream keeps all magma-related routines in a separate
libtorch_cuda_linalg library that is loaded dynamically whenever linalg
functions are used.  Since the library is relatively small, splitting it
out makes it possible to provide "magma" and "nomagma" variants that
users can switch between.

Fixes conda-forge#275

Co-authored-by: Isuru Fernando <[email protected]>
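
For context, a minimal sketch of the dynamic-loading behavior this split relies on (a hedged illustration, assuming a CUDA-enabled build with a visible GPU; the snippet is not from this PR):

```bash
# Hedged illustration: libtorch_cuda_linalg is only loaded on first use of a
# linalg routine, which is what makes swapping magma/nomagma variants viable.
python - <<'EOF'
import torch

a = torch.randn(4, 4, device="cuda")
torch.linalg.inv(a)  # the first linalg call triggers loading libtorch_cuda_linalg
print("linalg backend loaded and working")
EOF
```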
Try to speed up magma/nomagma builds a bit.  Rather than rebuilding
the package three times (possibly switching magma → nomagma → magma again),
build it twice at the very beginning and store the built files for later
reuse in the subpackage builds.

While at it, replace the `pip wheel` calls with `setup.py build` to
avoid unnecessarily zipping up and then unpacking the whole thing.
In the end, we only grab a handful of files for the `libtorch*`
packages, and they are in a predictable location in the build directory.
`pip install` is still used for the final `pytorch` builds.
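
For illustration, a rough sketch of the two-pass scheme described above (hedged; the file layout is hypothetical and the actual build scripts in this PR may differ):

```bash
# Hedged sketch of building both variants once, up front, and stashing the
# results so the libtorch* subpackage builds can reuse them instead of
# rebuilding.  ${SRC_DIR} is conda-build's source directory; USE_MAGMA is
# upstream PyTorch's build switch; other paths are illustrative.
for use_magma in 1 0; do
  # `setup.py build` skips the wheel's zip/unzip round-trip that `pip wheel`
  # would do; only a handful of files from the build tree are needed anyway.
  USE_MAGMA=${use_magma} python setup.py build
  cp -a build "${SRC_DIR}/build-magma-${use_magma}"  # stash for subpackage reuse
done

# A later libtorch* subpackage build restores the matching tree instead of
# rebuilding; the final `pytorch` outputs still run `pip install` as before.
cp -a "${SRC_DIR}/build-magma-1" build
```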
conda-forge-admin (Contributor) commented on Dec 4, 2024

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe/meta.yaml) and found it to be in excellent condition.

I do have some suggestions for making it better though...

For recipe/meta.yaml:

  • ℹ️ The recipe is not parsable by parser conda-souschef (grayskull). Your recipe may not receive automatic updates and/or may not be compatible with conda-forge's infrastructure. Please check the logs for more information and ensure your recipe can be parsed.
  • ℹ️ The recipe is not parsable by parser conda-recipe-manager. Your recipe may not receive automatic updates and/or may not be compatible with conda-forge's infrastructure. Please check the logs for more information and ensure your recipe can be parsed.

This message was generated by GitHub Actions workflow run https://github.com/conda-forge/conda-forge-webservices/actions/runs/12387955132. Examine the logs at this URL for more detail.

isuruf (Member) commented on Dec 6, 2024

You need conda-forge/conda-forge-ci-setup-feedstock#368

mgorny added 17 commits December 7, 2024 20:49
Put all the rules in a single file.  build_common.sh has
pytorch-conditional code at the very end anyway, and keeping
the code split like this only makes mistakes harder to notice.
While upstream technically uses 2024.2.0, that version causes some of
the calls to fail with an error:

    RuntimeError: MKL FFT error: Intel oneMKL DFTI ERROR: Inconsistent configuration parameters

Force mkl <2024, which seems to work better.

Fixes conda-forge#301
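
For reference, a minimal way such a failure could be triggered (a hypothetical repro, not taken from the PR; any FFT call routed through oneMKL's DFTI interface could hit it):

```bash
# Hedged repro sketch: with mkl 2024.x installed, an FFT call raised
#   RuntimeError: MKL FFT error: Intel oneMKL DFTI ERROR: Inconsistent configuration parameters
# while the same call succeeds with mkl <2024.
python -c "import torch; print(torch.fft.rfft(torch.randn(16)))"
```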
Enable actually running a fixed random subset (1/5) of the core tests
to check for packaging-related regressions.  We are not running
the complete test suite because it takes too long.
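One way such a fixed subset could be selected (an illustrative sketch only; the selection mechanism actually used in this PR may differ):

```bash
# Hedged sketch: run a deterministic fifth of the core test files.
# A fixed stride (rather than a per-run random choice) keeps the subset
# stable across builds, so packaging regressions stay reproducible.
cd test
selected=$(ls test_*.py | awk 'NR % 5 == 0')  # every fifth file
python -m pytest -v ${selected}
```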
Per `RuntimeError: Ninja is required to load C++ extensions`.
While there still doesn't seem to be clear agreement on which builds
should be preferred, let's prefer "magma" to keep the current behavior
unchanged for end users.
Replace the build number hacks with `track_features` to deprioritize
generic BLAS relative to mkl, and CPU relative to CUDA.  This is mostly
intended to simplify the recipe before trying to port to rattler-build.
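
As background on the mechanism (a hedged note, not a claim about the exact recipe changes): each `track_features` entry penalizes a package during dependency resolution, so the un-penalized variants win by default:

```bash
# Hedged sanity check (not from the PR): since only the generic-BLAS and CPU
# variants carry track_features, a plain solve should now pick the
# un-penalized mkl + CUDA variant without any build-number tricks.
conda create -n solver-check --dry-run pytorch
```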
Remove a leftover `skip` that prevented the CUDA + generic BLAS build
from providing all packages, notably `pytorch`.  While at it, remove
a redundant `[win]` skip.
Tobias-Fischer added commits to baszalmstra/pytorch-cpu-feedstock that referenced this pull request on Dec 17, 2024
h-vetinari changed the title from "WIP: remove workarounds for wrong libcufile metadata" to "DEBUG" on Dec 18, 2024
Tobias-Fischer (Contributor) commented:

Can we close here @h-vetinari?

h-vetinari closed this on Dec 26, 2024
h-vetinari deleted the cufile2 branch on December 26, 2024 at 09:24