Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rationalize tests for CI vs. releases #1117

Open
ViralBShah opened this issue Nov 27, 2024 · 14 comments
Open

Rationalize tests for CI vs. releases #1117

ViralBShah opened this issue Nov 27, 2024 · 14 comments
Labels
ci Continuous integration

Comments

@ViralBShah
Copy link
Member

ViralBShah commented Nov 27, 2024

Is it possible to rationalize the way we test in CI to use much lesser time than right now?

Maybe have a set of CI tests and a set of complete tests. Perhaps the complete tests can be run during releases of LinearAlgebra.jl into julia master.

I suppose the harder part of all of this is the workflow.

@DilumAluthge Any ideas?

@ViralBShah ViralBShah changed the title Rationalize tests Rationalize tests for CI vs. releases Nov 27, 2024
@andreasnoack
Copy link
Member

Since most time is spent on compilation, it significantly increases to total CPU time when the tests are run as many separate jobs. However, the user time might take a significant hit if run sequentially. Might also be interesting to see the effect to running tests with a low optimization level.

@KristofferC
Copy link
Member

Here is sorted test time.

Test (Worker) Time (s) GC (s) GC % Alloc (MB) RSS (MB)
ldlt (16) 1.95 0.02 1.2 64.84 688.12
factorization (14) 10.86 0.13 1.2 393.02 666.03
abstractq (12) 16.26 0.18 1.1 617.43 711.82
pinv (16) 21.28 0.43 2.0 1107.97 688.12
givens (16) 21.83 0.26 1.2 839.98 623.48
symmetriceigen (16) 42.04 0.78 1.9 1384.64 694.36
schur (14) 53.11 0.56 1.1 2009.62 666.03
adjtrans (16) 73.02 1.19 1.6 3105.21 623.48
lapack (14) 106.15 1.19 1.1 3470.32 650.14
lq (15) 108.49 1.7 1.6 4975.61 618.0
blas (12) 119.79 1.85 1.5 5485.69 525.55
generic (10) 127.14 1.72 1.4 4609.6 715.38
svd (17) 132.8 2.35 1.8 6257.72 543.32
uniformscaling (14) 153.87 2.15 1.4 5505.12 634.76
structuredbroadcast (15) 161.54 2.73 1.7 8148.26 575.81
eigen (12) 193.75 2.65 1.4 6849.69 711.82
hessenberg (16) 199.8 2.92 1.5 8216.44 541.0
qr (10) 231.48 3.63 1.6 9941.32 585.98
tridiag (17) 238.97 3.52 1.5 10335.22 859.79
bunchkaufman (15) 327.26 10.93 3.3 34795.95 1162.19
cholesky (11) 480.62 5.93 1.2 16072.22 655.04
matmul (5) 492.49 6.04 1.2 18103.55 679.82
symmetric (7) 535.37 8.6 1.6 23216.27 802.7
special (9) 535.51 9.67 1.8 31511.59 920.5
lu (13) 536.12 5.34 1.0 15547.96 651.05
dense (6) 579.19 8.61 1.5 22657.19 812.67
diagonal (8) 646.46 9.09 1.4 24207.6 833.98
addmul (3) 801.18 6.25 0.8 16774.03 589.55
bidiag (4) 1222.44 19.73 1.6 59146.33 1316.59
triangular (2) 2910.48 68.6 2.4 206813.92 2560.22

The triangular and bidiag have like triply nested loops over a large set of types which just explodes compilation time in a combinatorial manner.

@dkarrasch
Copy link
Member

The triangular and bidiag have like triply nested loops over a large set of types which just explodes compilation time in a combinatorial manner.

They also have unitful tests. I had already thought about hoisting them out in an extra test file, but haven't got to it.

@andreasnoack
Copy link
Member

An extra data point #1118 (comment)

@ViralBShah
Copy link
Member Author

Do we really need to test Triangular this much? Perhaps we can just test fewer types as well.

@andreasnoack
Copy link
Member

Might be possible to reduce the number of combinations but looping over types did reveal several issues when it was introduced.

@ViralBShah
Copy link
Member Author

Might be possible to reduce the number of combinations but looping over types did reveal several issues when it was introduced.

Yes, that's what I was thinking. It was greatly helpful early on, but presumably for catching regressions, we may only need fewer combinations.

@jishnub
Copy link
Collaborator

jishnub commented Nov 27, 2024

It should be possible to ensure that code coverage isn't reduced by trimming the combinations.

@andreasnoack
Copy link
Member

Code coverage is just an incomplete measure when it comes to some of the possible issues that can show up when mixing types. I'm not saying that we can't reduce the tests. I'm just saying that it is possible to make them less useful while maintaining code coverage.

@KristofferC
Copy link
Member

There should still be some thought behind it instead of "spray and pray" so it is established what is unique about every combination. There are no comments now which makes it hard to know what the thought was.

@andreasnoack
Copy link
Member

Just browsed test/triangular.jl again. It's been a while. I actually don't think it is unreasonable. You can argue if it is reasonable to have all four triangular types fully parametric and allow for operations on mixed element types with "correct" promotion handling. But that was the choice we made, and given that choice, I think the testing here is roughly appropriate. The load balancing is off, though, so we can reduce the test time a lot by splitting the triangular tests in three or four.

@KristofferC
Copy link
Member

KristofferC commented Nov 27, 2024

One should then test things "orthogonally":

  • Test that types promote to each other as they should
  • Test that when mixed types are used the promotion happens

There is absolutely no way you should be required to test a semirandom 5x5x5x5 type combination. You are still missing 99% of custom types from the ecosystem.

Again, I completely reject the notion that you have to spend one hour compiling to test the tridiagonal support in Julia's linear algebra. 😆

@ViralBShah
Copy link
Member Author

I thought back in the day, doing this in deeply nested loops etc. helped discover issues in the compiler.

@andreasnoack
Copy link
Member

I thought back in the day, doing this in deeply nested loops etc. helped discover issues in the compiler.

I actually don't think that many compiler issues surfaced as part of this testing but there were many issues related to promotion. There are just a lot of type combinations related to the triangular types. It's cool that Julia allow you to maintain structures such

UnitLowerTriangular*UnitLowerTriangular = UnitLowerTriangular
UnitLowerTriangular*LowerTriangular = LowerTriangular
LowerTriangular*UpperTriangular = (Some)Matrix

and that backsolve with UpperUnitTriangular doesn't promote much because it doesn't have any division, but it requires a lot of methods to be compiled.

There is absolutely no way you should be required to test a semirandom 5x5x5x5 type combination.

The full cartesian product here is probably not adding much value, though. It's just an easy. Let's see how much of a difference #1123 makes as a first step.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci Continuous integration
Projects
None yet
Development

No branches or pull requests

5 participants