Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Manual graph transposition #107

Merged
merged 6 commits into from
Sep 26, 2024
Merged

Manual graph transposition #107

merged 6 commits into from
Sep 26, 2024

Conversation

amontoison
Copy link
Collaborator

No description provided.

@amontoison amontoison added the benchmark Run benchmarks on PR label Sep 22, 2024
@amontoison amontoison requested a review from gdalle September 22, 2024 20:53
Copy link

codecov bot commented Sep 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (939c9df) to head (f0da765).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff            @@
##              main      #107   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           10        10           
  Lines          778       806   +28     
=========================================
+ Hits           778       806   +28     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

github-actions bot commented Sep 22, 2024

Benchmark Results

main 099de65... main/099de651354c5d...
coloring/nonsymmetric/column/direct/n=1000/p=0.002 0.0887 ± 0.0043 ms 0.0869 ± 0.0044 ms 1.02
coloring/nonsymmetric/column/direct/n=1000/p=0.005 0.194 ± 0.0082 ms 0.191 ± 0.0081 ms 1.02
coloring/nonsymmetric/column/direct/n=1000/p=0.01 0.402 ± 0.014 ms 0.402 ± 0.014 ms 1
coloring/nonsymmetric/column/direct/n=100000/p=0.0001 0.0774 ± 0.016 s 0.095 ± 0.022 s 0.815
coloring/nonsymmetric/column/direct/n=100000/p=2.0e-5 14.3 ± 0.36 ms 13.7 ± 0.39 ms 1.05
coloring/nonsymmetric/column/direct/n=100000/p=5.0e-5 0.0355 ± 0.0037 s 0.0348 ± 0.0015 s 1.02
coloring/nonsymmetric/row/direct/n=1000/p=0.002 0.104 ± 0.005 ms 0.105 ± 0.0052 ms 0.987
coloring/nonsymmetric/row/direct/n=1000/p=0.005 0.219 ± 0.0098 ms 0.216 ± 0.0091 ms 1.01
coloring/nonsymmetric/row/direct/n=1000/p=0.01 0.45 ± 0.016 ms 0.447 ± 0.016 ms 1.01
coloring/nonsymmetric/row/direct/n=100000/p=0.0001 0.0919 ± 0.026 s 0.0982 ± 0.0074 s 0.936
coloring/nonsymmetric/row/direct/n=100000/p=2.0e-5 16.5 ± 0.31 ms 15.5 ± 0.85 ms 1.07
coloring/nonsymmetric/row/direct/n=100000/p=5.0e-5 0.0384 ± 0.0021 s 0.0384 ± 0.0034 s 1
coloring/symmetric/column/direct/n=1000/p=0.002 0.272 ± 0.015 ms 0.275 ± 0.015 ms 0.99
coloring/symmetric/column/direct/n=1000/p=0.005 0.739 ± 0.028 ms 0.742 ± 0.03 ms 0.996
coloring/symmetric/column/direct/n=1000/p=0.01 1.81 ± 0.06 ms 1.81 ± 0.059 ms 1
coloring/symmetric/column/direct/n=100000/p=0.0001 0.485 ± 0.015 s 0.618 ± 0.053 s 0.785
coloring/symmetric/column/direct/n=100000/p=2.0e-5 0.0408 ± 0.003 s 0.0449 ± 0.0041 s 0.908
coloring/symmetric/column/direct/n=100000/p=5.0e-5 0.146 ± 0.011 s 0.182 ± 0.046 s 0.801
coloring/symmetric/column/substitution/n=1000/p=0.002 0.622 ± 0.033 ms 0.622 ± 0.032 ms 1
coloring/symmetric/column/substitution/n=1000/p=0.005 1.61 ± 0.064 ms 1.62 ± 0.062 ms 0.995
coloring/symmetric/column/substitution/n=1000/p=0.01 3.47 ± 0.11 ms 3.55 ± 0.12 ms 0.979
coloring/symmetric/column/substitution/n=100000/p=0.0001 0.806 ± 0.059 s 1.05 ± 0.03 s 0.77
coloring/symmetric/column/substitution/n=100000/p=2.0e-5 0.103 ± 0.015 s 0.125 ± 0.019 s 0.825
coloring/symmetric/column/substitution/n=100000/p=5.0e-5 0.327 ± 0.038 s 0.412 ± 0.035 s 0.793
decompress/nonsymmetric/column/direct/n=1000/p=0.002 3.96 ± 0.44 μs 3.89 ± 0.29 μs 1.02
decompress/nonsymmetric/column/direct/n=1000/p=0.005 7.95 ± 0.79 μs 8.26 ± 0.66 μs 0.962
decompress/nonsymmetric/column/direct/n=1000/p=0.01 17.4 ± 4.6 μs 21.7 ± 7.1 μs 0.801
decompress/nonsymmetric/column/direct/n=100000/p=0.0001 5.28 ± 0.72 ms 5.39 ± 0.4 ms 0.981
decompress/nonsymmetric/column/direct/n=100000/p=2.0e-5 1.1 ± 0.2 ms 1.08 ± 0.14 ms 1.02
decompress/nonsymmetric/column/direct/n=100000/p=5.0e-5 2.59 ± 0.32 ms 2.78 ± 0.38 ms 0.933
decompress/nonsymmetric/row/direct/n=1000/p=0.002 4.02 ± 0.42 μs 4.22 ± 0.3 μs 0.952
decompress/nonsymmetric/row/direct/n=1000/p=0.005 7.45 ± 0.58 μs 8.02 ± 0.55 μs 0.928
decompress/nonsymmetric/row/direct/n=1000/p=0.01 18.3 ± 5.2 μs 15.3 ± 1.2 μs 1.2
decompress/nonsymmetric/row/direct/n=100000/p=0.0001 2.32 ± 0.26 ms 2.05 ± 0.16 ms 1.13
decompress/nonsymmetric/row/direct/n=100000/p=2.0e-5 0.491 ± 0.11 ms 0.418 ± 0.083 ms 1.18
decompress/nonsymmetric/row/direct/n=100000/p=5.0e-5 1.18 ± 0.21 ms 0.931 ± 0.1 ms 1.27
decompress/symmetric/column/direct/n=1000/p=0.002 4.05 ± 0.36 μs 4.32 ± 0.43 μs 0.938
decompress/symmetric/column/direct/n=1000/p=0.005 7.81 ± 0.53 μs 8.1 ± 0.71 μs 0.964
decompress/symmetric/column/direct/n=1000/p=0.01 16.4 ± 2.8 μs 17.2 ± 2.5 μs 0.959
decompress/symmetric/column/direct/n=100000/p=0.0001 4.96 ± 0.73 ms 5.09 ± 0.47 ms 0.974
decompress/symmetric/column/direct/n=100000/p=2.0e-5 1.01 ± 0.2 ms 0.936 ± 0.23 ms 1.08
decompress/symmetric/column/direct/n=100000/p=5.0e-5 2.57 ± 0.25 ms 2.27 ± 0.66 ms 1.13
decompress/symmetric/column/substitution/n=1000/p=0.002 0.0624 ± 0.0035 ms 0.0638 ± 0.0037 ms 0.978
decompress/symmetric/column/substitution/n=1000/p=0.005 0.158 ± 0.0089 ms 0.159 ± 0.0071 ms 0.991
decompress/symmetric/column/substitution/n=1000/p=0.01 0.336 ± 0.014 ms 0.342 ± 0.015 ms 0.98
decompress/symmetric/column/substitution/n=100000/p=0.0001 0.0964 ± 0.0033 s 0.0593 ± 0.0046 s 1.62
decompress/symmetric/column/substitution/n=100000/p=2.0e-5 15.3 ± 1.4 ms 12.2 ± 0.35 ms 1.25
decompress/symmetric/column/substitution/n=100000/p=5.0e-5 0.0436 ± 0.0018 s 29.7 ± 1.4 ms 1.47
time_to_load 0.225 ± 0.0053 s 0.218 ± 0.00076 s 1.03

@gdalle
Copy link
Owner

gdalle commented Sep 23, 2024

I am not a fan of reimplementing what's already in a standard library just to shave off one allocation (the new nzval, if I understand correctly). Are you sure there is no other way?

@amontoison
Copy link
Collaborator Author

I don't see another solution.
But It's just a routine of 50 lines of code. We don't reimplement a sparse Cholesky so I don't see the problem.

@amontoison
Copy link
Collaborator Author

@gdalle What is the command to apply the formater on the code?

@gdalle
Copy link
Owner

gdalle commented Sep 25, 2024

JuliaFormatter.format with a path, or just control shift I in VSCode.
Okay for this custom transposition PR but can you add unit tests?

@amontoison
Copy link
Collaborator Author

@gdalle Can you approve it such that I don't rebase it after #112?

@gdalle
Copy link
Owner

gdalle commented Sep 26, 2024

Actually I wanted to work some more on this one, since the other PR is very short I'll take care of rebasing

@gdalle
Copy link
Owner

gdalle commented Sep 26, 2024

I don't really like this handwritten loop with obscure logic, even though you documented it well I still struggle to understand it. I opened #114 with a much more elegant 3-line implementation, which should still shave off all unneeded allocations but one (haven't benchmarked though). I suggest we merge that one and possibly revisit this afterwards.

@gdalle
Copy link
Owner

gdalle commented Sep 26, 2024

Also, some of the benchmarks show degradation but I think that's just because they're incredibly noisy. Will try to fix them

@gdalle gdalle marked this pull request as draft September 26, 2024 19:42
@amontoison
Copy link
Collaborator Author

amontoison commented Sep 26, 2024

I don't really like this handwritten loop with obscure logic, even though you documented it well I still struggle to understand it. I opened #114 with a much more elegant 3-line implementation, which should still shave off all unneeded allocations but one (haven't benchmarked though). I suggest we merge that one and possibly revisit this afterwards.

If you propose, don't merge it!
I don't find your 3-line implementation elegant.
I prefer this PR and would like this to be merged.
It's a very classic CSR --> CSC implementation, if you don't understand it, it doesn't mean that you should do your own version.

@gdalle gdalle changed the title Add a function transpose_graph Better (manual) graph transposition Sep 26, 2024
@gdalle gdalle marked this pull request as ready for review September 26, 2024 20:45
@gdalle
Copy link
Owner

gdalle commented Sep 26, 2024

If you propose, don't merge it!

You're right, I'm sorry. I merged mine first because the benchmarks showed virtually no difference, but I should have waited for you to weigh in.

I don't find your 3-line implementation elegant. I prefer this PR and would like this to be merged.

We obviously have diverging opinions on this, so let's judge from the benchmarks? I have merged main into this PR, so they will show improvements of your method against mine.

It's a very classic CSR --> CSC implementation, if you don't understand it, it doesn't mean that you should do your own version.

Precisely because it is so classic, my assumption is that there is a version of it in SparseArrays that is better than ours (both in terms of efficiency and reliability), so we should just reuse it instead of adding complexity to our code base.

@gdalle gdalle changed the title Better (manual) graph transposition Manual graph transposition Sep 26, 2024
@gdalle gdalle merged commit 6d0899b into main Sep 26, 2024
7 checks passed
@gdalle gdalle deleted the transpose_graph branch September 26, 2024 21:42
@gdalle gdalle mentioned this pull request Sep 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
benchmark Run benchmarks on PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants