Refactor the implementation #23

Open, 33 commits to merge into base: main
@IvanUkhov
Member

IvanUkhov commented Apr 29, 2024

macOS, ARM64

Main:

test compute_0001000 ... bench:       1,696 ns/iter (+/- 12)
test compute_0010000 ... bench:      16,578 ns/iter (+/- 162)
test compute_0100000 ... bench:     164,988 ns/iter (+/- 2,065)
test compute_1000000 ... bench:   1,649,850 ns/iter (+/- 10,186)
test context_0001000 ... bench:       1,697 ns/iter (+/- 16)
test context_0010000 ... bench:      16,582 ns/iter (+/- 203)
test context_0100000 ... bench:     165,100 ns/iter (+/- 2,997)
test context_1000000 ... bench:   1,652,079 ns/iter (+/- 39,340)

This PR:

test compute_0001000 ... bench:       2,088 ns/iter (+/- 26)
test compute_0010000 ... bench:      20,554 ns/iter (+/- 1,331)
test compute_0100000 ... bench:     204,522 ns/iter (+/- 4,461)
test compute_1000000 ... bench:   2,045,050 ns/iter (+/- 8,309)
test context_0001000 ... bench:       2,226 ns/iter (+/- 17)
test context_0010000 ... bench:      21,832 ns/iter (+/- 449)
test context_0100000 ... bench:     217,379 ns/iter (+/- 6,678)
test context_1000000 ... bench:   2,173,283 ns/iter (+/- 18,401)

Linux, AMD64

Main:

test compute_0001000 ... bench:       2,370 ns/iter (+/- 19)
test compute_0010000 ... bench:      23,144 ns/iter (+/- 4,352)
test compute_0100000 ... bench:     230,255 ns/iter (+/- 1,556)
test compute_1000000 ... bench:   2,299,363 ns/iter (+/- 6,198)
test context_0001000 ... bench:       2,368 ns/iter (+/- 32)
test context_0010000 ... bench:      23,136 ns/iter (+/- 101)
test context_0100000 ... bench:     230,174 ns/iter (+/- 760)
test context_1000000 ... bench:   2,302,334 ns/iter (+/- 8,662)

This PR:

test compute_0001000 ... bench:       1,790 ns/iter (+/- 14)
test compute_0010000 ... bench:      17,532 ns/iter (+/- 106)
test compute_0100000 ... bench:     174,332 ns/iter (+/- 572)
test compute_1000000 ... bench:   1,745,888 ns/iter (+/- 5,770)
test context_0001000 ... bench:       2,163 ns/iter (+/- 13)
test context_0010000 ... bench:      21,208 ns/iter (+/- 274)
test context_0100000 ... bench:     211,190 ns/iter (+/- 2,317)
test context_1000000 ... bench:   2,037,844 ns/iter (+/- 5,788)
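
To make the two platforms easier to compare, the largest-input figures above can be turned into relative changes. The following is only a sketch: the helper name `relative_change_percent` is hypothetical, and the numbers are copied by hand from the results above rather than parsed from the bench output. By this measure the PR is roughly 24% and 32% slower on macOS, ARM64 and roughly 24% and 11% faster on Linux, AMD64 for `compute_1000000` and `context_1000000`, respectively.

```rust
/// Relative change of the PR with respect to main, in percent.
/// Positive means the PR is slower; negative means it is faster.
/// (Hypothetical helper, not part of the crate.)
fn relative_change_percent(main_ns: f64, pr_ns: f64) -> f64 {
    (pr_ns - main_ns) / main_ns * 100.0
}

fn main() {
    // (platform, benchmark, main ns/iter, PR ns/iter), copied from the results above.
    let rows = [
        ("macOS, ARM64", "compute_1000000", 1_649_850.0, 2_045_050.0),
        ("macOS, ARM64", "context_1000000", 1_652_079.0, 2_173_283.0),
        ("Linux, AMD64", "compute_1000000", 2_299_363.0, 1_745_888.0),
        ("Linux, AMD64", "context_1000000", 2_302_334.0, 2_037_844.0),
    ];
    for (platform, name, main_ns, pr_ns) in rows {
        let change = relative_change_percent(main_ns, pr_ns);
        println!("{platform:<13} {name}: {change:+.1}%");
    }
}
```

The sign flips between the two platforms, which is the contradiction discussed in the comments that follow.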

@IvanUkhov
Member Author

@tomhamiltonlambda, not sure how to reconcile this. This morning I tried on a laptop with a different architecture and got numbers that contradict the ones we have been seeing so far. What kind of hardware do you have?

@tomhamiltonlambda
Contributor

I run on Windows, Intel processor.

Looks like it's pretty inconclusive in either direction. Separately, the code itself is now a lot clearer, and I'd like to avoid going back on that. I can spin up various EC2 instances and compare speed there. What else do you want to do?

@IvanUkhov
Member Author

Yeah, it is pointing in different directions. Thanks to you, the new code looks much cleaner and more intelligible, and it would be nice to retain it. I will keep this PR open for now and occasionally try to figure out how to avoid the performance degradation. But please do not spend more time on this. Thank you again!

@sammysheep

sammysheep commented Nov 28, 2024

M4 Max (unbinned model), macOS 15.1.1

Main:

test compute_0001000 ... bench:       1,236.41 ns/iter (+/- 82.48)
test compute_0010000 ... bench:      12,129.44 ns/iter (+/- 908.35)
test compute_0100000 ... bench:     120,752.77 ns/iter (+/- 1,339.80)
test compute_1000000 ... bench:   1,205,737.50 ns/iter (+/- 96,453.37)
test context_0001000 ... bench:       1,237.54 ns/iter (+/- 27.23)
test context_0010000 ... bench:      12,136.26 ns/iter (+/- 130.81)
test context_0100000 ... bench:     120,863.10 ns/iter (+/- 9,001.46)
test context_1000000 ... bench:   1,208,620.80 ns/iter (+/- 41,977.97)

This PR:

test compute_0001000 ... bench:       1,482.34 ns/iter (+/- 13.49)
test compute_0010000 ... bench:      14,654.60 ns/iter (+/- 178.37)
test compute_0100000 ... bench:     145,970.15 ns/iter (+/- 2,459.80)
test compute_1000000 ... bench:   1,460,958.30 ns/iter (+/- 20,525.79)
test context_0001000 ... bench:       1,499.25 ns/iter (+/- 25.57)
test context_0010000 ... bench:      14,776.59 ns/iter (+/- 1,921.14)
test context_0100000 ... bench:     146,528.45 ns/iter (+/- 1,815.32)
test context_1000000 ... bench:   1,465,783.30 ns/iter (+/- 11,994.10)

rustc 1.85.0-nightly (7db7489f9 2024-11-25)

3 participants