Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SuiteSparse_jll] Update to v7.4.0 #52577

Merged
merged 3 commits into from
Jan 6, 2024

Conversation

giordano
Copy link
Contributor

@giordano giordano commented Dec 18, 2023

Usual memo to self for the update:

  • change version number in stdlib/SuiteSparse_jll/Project.toml
  • refresh checksums in deps/checksums/suitesparse with make -f contrib/refresh_checksums.mk -j libsuitesparse. Note: this will also delete the files deps/checksums/SuiteSparse-*.tar.gz/*, but those refer to the stdlib which is the julia wrapper around the binary dependency, those files have to be restored manually, I don't know how to make the makefile behave nicely. You will also have to remove the lines SuiteSparse-*.tar.gz/md5/* added to the file deps/checksums/suitesparse, they are wrong for the same reason
  • update version numbers in deps/libsuitesparse.version.

Judging by the changes in JuliaPackaging/Yggdrasil#7596 we don't need to touch the manual build system.

I have no idea whether SuiteSparse retains API/ABI compatibility within the same major version, I surely hope so because I won't have the strength to try and fix large compatibility errors.

@giordano giordano added building Build system, or building Julia or its dependencies sparse Sparse arrays external dependencies Involves LLVM, OpenBLAS, or other linked libraries JLLs labels Dec 18, 2023
@giordano giordano force-pushed the mg/update-suite-sparse branch 2 times, most recently from 59a7a1f to b3081f8 Compare December 18, 2023 21:33
@giordano
Copy link
Contributor Author

There is only one relevant test failure, of course on 32-bit Windows:

Error in testset SparseArrays/cholmod:
Test Failed at C:\buildkite-agent\builds\win2k22-amdci6-4\julialang\julia-master\julia-b3081f8082\share\julia\stdlib\v1.11\SparseArrays\test\cholmod.jl:947
  Expression: (cholesky(A, perm = 1:n)).p == 1:n
   Evaluated: [12, 25, 5, 1, 2, 3, 4, 6, 7, 8  …  91, 92, 93, 94, 95, 96, 97, 98, 99, 100] == 1:100

I have no idea of what's the problem, so I'd appreciate any help here (also in the form of removing support for 32-bit Windows 🙃)

Everywhere else it's all good. But we also need to merge JuliaSparse/SparseArrays.jl#478 at the same time to clear the warnings about the version of cholmod:

      From worker 10:	┌ Warning: CHOLMOD version incompatibility
      From worker 10:	│
      From worker 10:	│ Julia was compiled with CHOLMOD version 4.0.4. It is
      From worker 10:	│ currently linked with version 5.0.0.
      From worker 10:	│ This might cause Julia to terminate when working with
      From worker 10:	│ sparse matrix factorizations, e.g. solving systems of
      From worker 10:	│ equations with \.
      From worker 10:	│
      From worker 10:	│ It is recommended that you use Julia with the same major
      From worker 10:	│ version of CHOLMOD as the one used during the build, or
      From worker 10:	│ download the generic binaries from www.julialang.org,
      From worker 10:	│ which ship with the correct versions of all dependencies.
      From worker 10:	└ @ SparseArrays.CHOLMOD /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-R17H3W25T9.0/build/default-honeycrisp-R17H3W25T9-0/julialang/julia-master/julia-b3081f8082/share/julia/stdlib/v1.11/SparseArrays/src/solvers/cholmod.jl:206

@gbaraldi
Copy link
Member

Is it adding some garbage to the start of the array?

@giordano giordano marked this pull request as draft December 19, 2023 15:51
@giordano giordano force-pushed the mg/update-suite-sparse branch from 34a56b0 to 41433ce Compare December 19, 2023 15:59
@giordano
Copy link
Contributor Author

No, the number of elements is correct, but 12, 25, 5 for some reasons are at the front: https://buildkite.com/julialang/julia-master/builds/31334#018c82ce-35e2-43c1-a143-eebf240e7b15/12749-13075

F.p = [12, 25, 5, 1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100]

But this is wrong already at https://github.com/JuliaSparse/SparseArrays.jl/blob/8308232f12fe6d28c689e11b2e19b48a484136f6/src/solvers/cholmod.jl#L1320, and as far as I understand (but I may be wrong) the symbolic function is all written in Julia, no call into suitesparse yet: https://github.com/JuliaSparse/SparseArrays.jl/blob/8308232f12fe6d28c689e11b2e19b48a484136f6/src/solvers/cholmod.jl#L1258-L1278

@ViralBShah
Copy link
Member

cc @rayegun. The challenge here is of course how does one even test on 32-bit windows.

@giordano
Copy link
Contributor Author

I'm tempted by marking the test as broken on 32-bit Windows so that we can move on 🥲

@ViralBShah
Copy link
Member

I think it may actually be ok to do so.

@rayegun
Copy link
Member

rayegun commented Dec 21, 2023

😮‍💨, time to find the W32 VM again

@giordano giordano force-pushed the mg/update-suite-sparse branch from 82cdd33 to e9e9897 Compare December 22, 2023 14:25
@giordano
Copy link
Contributor Author

In case someone has a way to debug this on 32-bit Windows, I reduced the failure to

using LinearAlgebra, SparseArrays, SparseArrays.CHOLMOD

n = 100
A = sprand(n,n,5/n) |> t -> t't + I
SA = CHOLMOD.Sparse(A)
F = CHOLMOD.@cholmod_param postorder = false begin
    CHOLMOD.@cholmod_param nmethods = 1 begin
        CHOLMOD.analyze_p(SA, Int[p-1 for p in 1:n])
    end
end
F.p == 1:n # this should be true, but it's false

It can probably be reduced further, but I know absolutely nothing of the SparseArrays code, I think I reached the maximum I could spend on this.

If no one has ideas, I might just mark the test as broken only on that platforms so that we can upgrade the package.

@ViralBShah
Copy link
Member

@DrTimothyAldenDavis Would you have any insight into what might be causing the win32 failure above?

@DrTimothyAldenDavis
Copy link

No idea.

I do try to maintain as much backward compatibility as possible between say a 4.0 and a 5.0 package. I would have to check the details to see what API/API differs. Is that the question?

I do test SuiteSparse in its CI with WIN32 on GitHub and it's passing.

Can you try 7.4.0.betaX where X is 10 I think, but is changing daily? I will release 7.4.0 stable soon.

@ViralBShah
Copy link
Member

That would suggest that there may be an issue in our wrappers.

@DrTimothyAldenDavis
Copy link

Yes. It could be the cholmod 4 vs 5, above.

Or it could be a Christmas bug. The first two numbers are 12,25 after all 😀

@giordano giordano force-pushed the mg/update-suite-sparse branch from e9e9897 to bd5104f Compare December 31, 2023 11:58
@giordano
Copy link
Contributor Author

giordano commented Dec 31, 2023

There are tons of test failures with SuiteSparse v7.4.0, on all platforms, not just 32-bit Windows....

@giordano
Copy link
Contributor Author

What's most worrying is that we're getting loads of

Error: no BLAS/LAPACK library loaded!

errors, which comes from libblastrampoline, but I have no clue of how this is related to upgrading to SuiteSparse 7.4.0

@mmuetzel
Copy link

mmuetzel commented Dec 31, 2023

The build system of SuiteSparse has been completely overhauled for 7.4.0. Many of the configuration flags have changed their names. You'd probably need to adjust your build rules for that change here:
https://github.com/JuliaPackaging/Yggdrasil/blob/master/S/SuiteSparse/SuiteSparse%407/build_tarballs.jl

Maybe, you could also consider building with the root CMakeLists.txt now.

@giordano
Copy link
Contributor Author

giordano commented Jan 2, 2024

Great news: after updating the wrappers using the SuiteSparse header files (PR JuliaSparse/SparseArrays.jl#480 by @rayegun), all SparseArrays tests are passing in this PR, for all platforms, including 32-bit ones (and including the permutation test). The failures reported above were likely due to some changes in SuiteSparse between v7.2.1 and v7.3.0/v7.4.0 which broke ABI compatibility on 32-bit systems. Also @imciner2 identified the change to the type of other_2 in cholmod_common as a potential problem on 32-bit platforms, so that may be the culprit, and what forced us to update the wrappers.

Tomorrow I'll update JuliaSparse/SparseArrays.jl#478 to include the changes in JuliaSparse/SparseArrays.jl#480 and keep everything in one place, but this is now looking good for us, thanks everybody for the help!

@DrTimothyAldenDavis
Copy link

It has become impossible to maintain the SuiteSparse bindings - I have to re-learn the whole build system every few months.

Sorry for the headaches. We're getting feedback from spack, PETsc, linux distros, msys, octave, ... and of course julia, and the feedback doesn't come all at once. We're trying to make the build system work for everybody, but it means that it's an iterative process.

@DrTimothyAldenDavis
Copy link

The failures reported above were likely due to some changes in SuiteSparse between v7.2.1 and v7.3.0/v7.4.0 which broke ABI compatibility on 32-bit systems.

Whenever I change a package, I do keep to the semantic versioning process: CHOLMOD has had a few breaking changes in the past, for example. SuiteSparse itself doesn't track each version change of the packages inside it. I used to bump the major SO of SuiteSparse itself when any of its component packages had a major SO bump, but that caused problems with linux distro maintainers so I don't do that anymore.

So here's how CHOLMOD has bumped in is major SO numbers, recently:

SuiteSparse     CHOLMOD
5.13.0          3.0.14      many cholmod*h include files
6.0.0           4.0.0       single cholmod.h include file, SuiteSparse_long deprecated
7.0.0           4.0.3
7.2.1           4.2.1
7.3.0           5.0.0       added single precision support (xworksize -> xworkbytes)
7.4.0           5.1.0
7.5.0           5.1.1

and you're note that the bump from CHOLMOD 4.x.x to 5.x.x did not cause SuiteSparse itself to bump from 7.2.1 to 8.0.0.

I may have broken the ABI upward compatibility by changing the size_t other_2 to double other_2, however. That was inadvertent. That happened when going from SuiteSparse 7.2.1 to 7.3.0 and CHOLMOD bumped from 4.2.1 to 5.0.0 so at least that is "safe" in the sense that a breaking change to the ABI did get reflected in the major SO number.

@DrTimothyAldenDavis
Copy link

For example, the change from ALLOW_64BIT_BLAS to SUITESPARSE_USE_64BIT_BLAS was from linux distro feedback, or spack (I can't recall exactly), so that all user-facing cmake variables are prefixed with SUITESPARSE_ or Package_ (like UMFPACK_, etc).

@imciner2
Copy link
Contributor

imciner2 commented Jan 3, 2024

Sorry for the headaches.

No worries @DrTimothyAldenDavis. Thanks for the work you are doing to coalesce all these various requests and update the packages, the actual build system changes recently have actually led to a simplification of the build script we use in Yggdrasil, so that is appreciated.

One thing that might be useful going forward for doing the updates to these wrappers is to have a more detailed changelog for the APIs of the various components. For instance, when I started looking at the CHOLMOD 4->5 version bump to see if that was the cause of the errors, there wasn't really a good list of what changes were made to the API, so I ended up having to read through the header file and sometimes do a comparison between the old header and the new header. Some of the changes (like changing the naming/meaning of xdtype from xtype in cholmod_allocate_sparse were more subtle, and weren't obvious at first read through. Having a changelog for the API would make the process of updating easier and flag those types of subtle changes.

@ViralBShah
Copy link
Member

ViralBShah commented Jan 3, 2024

It has become impossible to maintain the SuiteSparse bindings - I have to re-learn the whole build system every few months.

Sorry for the headaches. We're getting feedback from spack, PETsc, linux distros, msys, octave, ... and of course julia, and the feedback doesn't come all at once. We're trying to make the build system work for everybody, but it means that it's an iterative process.

@DrTimothyAldenDavis Apologies - My complaint was not meant for you, but for the way we integrate SuiteSparse in Julia itself and update it in a roundabout way.

@DrTimothyAldenDavis
Copy link

One thing that might be useful going forward for doing the updates to these wrappers is to have a more detailed changelog for the APIs of the various components.

I can work on that.

@giordano giordano force-pushed the mg/update-suite-sparse branch from 832c882 to acb9ecf Compare January 3, 2024 18:55
@giordano
Copy link
Contributor Author

giordano commented Jan 3, 2024

I updated this PR to use latest changes in JuliaSparse/SparseArrays.jl#478. That PR now includes better way to update SuiteSparse_jll to regenerate our bindings, and also improved documentation to explain how to upgrade the version of the JLL package, this should address Viral's concerns about maintenance of our bindings.

If tests are green, as I hope since there should be no functional changes compared to this fully green run (excluding to may-fail jobs), we should be good to go.

@giordano giordano force-pushed the mg/update-suite-sparse branch from 7109421 to aeba6d1 Compare January 5, 2024 18:24
@rayegun
Copy link
Member

rayegun commented Jan 6, 2024

I am not sure off the top of my head how to regenerate the hashes, if no one does it before ~10AM tomorrow I will givve it a shot.

@ViralBShah
Copy link
Member

The checksum refreshing script is in contrib/refresh_checksums.mk. I can probably get to it tomorrow.

Update information in `stdlib/SparseArrays.version` and then run the commands

```
git rm -rf deps/checksums/SparseArrays-*
make -C stdlib -j checksum-SparseArrays
```
@giordano giordano force-pushed the mg/update-suite-sparse branch from aeba6d1 to 2b9634a Compare January 6, 2024 08:44
@giordano giordano marked this pull request as ready for review January 6, 2024 08:44
@giordano
Copy link
Contributor Author

giordano commented Jan 6, 2024

I am not sure off the top of my head how to regenerate the hashes

Update information in stdlib/SparseArrays.version and then run the commands

git rm -rf deps/checksums/SparseArrays-*
make -C stdlib -j checksum-SparseArrays

if no one does it before ~10AM tomorrow I will givve it a shot.

I just did it. This should be good to go now!

@ViralBShah ViralBShah merged commit c5dcf4f into JuliaLang:master Jan 6, 2024
5 of 7 checks passed
@ViralBShah
Copy link
Member

Does anyone know if the BLAS patch we are applying upstreamed to SuiteSparse?

@DrTimothyAldenDavis
Copy link

Does anyone know if the BLAS patch we are applying upstreamed to SuiteSparse?

I could take a look, and consider it for SuiteSparse 7.5.0. Where is it in SuiteSparse_jll?

@giordano giordano deleted the mg/update-suite-sparse branch January 6, 2024 14:49
@giordano
Copy link
Contributor Author

giordano commented Jan 6, 2024

It's a backport of DrTimothyAldenDavis/SuiteSparse#671, it's already in SuiteSparse.

@PallHaraldsson PallHaraldsson mentioned this pull request Jan 6, 2024
33 tasks
@KristofferC
Copy link
Member

KristofferC commented Jan 6, 2024

IIRC, @KristofferC said we can never move SparseArrays.jl out of stdlib because it would be breaking.

If I did I was just reiterating what triage (in its great wisdom) decided #48161 (comment). So according to that we are indeed never allowed to remove stdlibs from Julia. We could make the stdlibs upgradable (so you can upgrade them with the package manager) and ignore the bundled ones. This situation is of course worse for users than if the stdlibs were just properly moved out but hey, it is not "breaking".

staticfloat added a commit that referenced this pull request Jan 7, 2024
staticfloat added a commit that referenced this pull request Jan 8, 2024
staticfloat added a commit that referenced this pull request Jan 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
building Build system, or building Julia or its dependencies external dependencies Involves LLVM, OpenBLAS, or other linked libraries JLLs sparse Sparse arrays
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants