Optimize iszero function (3-5x faster) #12881

Merged (2 commits) on Oct 13, 2024

Conversation

simonvandel
Contributor

Which issue does this PR close?

Didn't create an issue beforehand.

Rationale for this change

iszero runs faster using BooleanArray::from_unary.

I think the speedup comes from no longer branching on each value to check whether it is null, which is what the .iter inside make_function_scalar_inputs_return_type does.
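
To make the suspected cause concrete, here is a minimal stdlib-only sketch of the two strategies. It is a simplified model, not the DataFusion or arrow-rs code: `NullableF64`, `iszero_branching`, and `iszero_unary` are hypothetical names, and a `Vec<bool>` stands in for Arrow's packed validity bitmap.

```rust
/// Hypothetical stand-in for a nullable float column: raw values plus a
/// per-slot validity flag (Arrow packs this into a bitmap instead).
struct NullableF64 {
    values: Vec<f64>,
    valid: Vec<bool>,
}

/// Per-element strategy: check validity for every slot before applying
/// the predicate, which puts a branch in the inner loop.
fn iszero_branching(col: &NullableF64) -> Vec<Option<bool>> {
    col.values
        .iter()
        .zip(&col.valid)
        .map(|(v, ok)| if *ok { Some(*v == 0.0) } else { None })
        .collect()
}

/// Unary-kernel strategy: apply the predicate to every slot without
/// looking at validity, then carry the validity mask over unchanged.
fn iszero_unary(col: &NullableF64) -> (Vec<bool>, Vec<bool>) {
    let out: Vec<bool> = col.values.iter().map(|v| *v == 0.0).collect();
    (out, col.valid.clone())
}

fn main() {
    let col = NullableF64 {
        values: vec![0.0, 1.5, 0.0, -0.0],
        valid: vec![true, true, false, true],
    };

    let branching = iszero_branching(&col);
    let (vals, valid) = iszero_unary(&col);

    // Both strategies agree once the validity mask is applied.
    for i in 0..branching.len() {
        let combined = if valid[i] { Some(vals[i]) } else { None };
        assert_eq!(branching[i], combined);
    }
    println!("{:?}", branching);
}
```

The second form leaves a tight loop with no data-dependent branch, which is the kind of loop a compiler can more readily vectorize. Whether that fully explains the measured numbers is an assumption here, as it is in the rationale above.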

What changes are included in this PR?

  • Add benchmark
  • Replace make_function_scalar_inputs_return_type use with BooleanArray::from_unary

Are these changes tested?

Yes, with existing tests.

Are there any user-facing changes?

Faster iszero. I don't expect this function to be a bottleneck anywhere, but at least it's a bit faster now.

iszero f32 array: 1024  time:   [771.48 ns 773.07 ns 774.97 ns]
                        change: [-69.960% -69.618% -69.232%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 16 outliers among 100 measurements (16.00%)
  6 (6.00%) high mild
  10 (10.00%) high severe

iszero f64 array: 1024  time:   [683.35 ns 683.91 ns 684.49 ns]
                        change: [-71.834% -71.698% -71.596%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) high mild
  4 (4.00%) high severe

iszero f32 array: 4096  time:   [2.6036 µs 2.6096 µs 2.6169 µs]
                        change: [-76.842% -76.730% -76.618%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
  4 (4.00%) high severe

iszero f64 array: 4096  time:   [2.2268 µs 2.2291 µs 2.2316 µs]
                        change: [-80.502% -80.449% -80.397%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe

iszero f32 array: 8192  time:   [5.0282 µs 5.0438 µs 5.0620 µs]
                        change: [-79.426% -79.363% -79.282%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
  1 (1.00%) low mild
  4 (4.00%) high mild
  6 (6.00%) high severe

iszero f64 array: 8192  time:   [4.2459 µs 4.2543 µs 4.2643 µs]
                        change: [-82.480% -82.445% -82.411%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

Member

jonahgao left a comment

LGTM👍

Contributor

2010YOUY01 left a comment

👍🏼
This should be applicable in many places, as long as the closure passed to from_unary() can't fail.
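
A rough sketch of why the "can't fail" condition matters, again using plain slices rather than the real arrow-rs types (`unary_kernel` and `try_unary_kernel` are hypothetical helpers): an infallible closure can safely run on every slot, null or not, whereas a fallible one has to skip null slots so that a placeholder value does not trigger a spurious error, and that reintroduces the per-element validity check.

```rust
/// Infallible kernel: `op` runs on every slot, including null ones, so it
/// must be safe for whatever placeholder value a null slot happens to hold.
fn unary_kernel<F: Fn(i32) -> bool>(values: &[i32], op: F) -> Vec<bool> {
    values.iter().map(|v| op(*v)).collect()
}

/// Fallible kernel: `op` may return an error, so null slots are skipped,
/// which brings back a validity check for each element.
fn try_unary_kernel<F: Fn(i32) -> Result<bool, String>>(
    values: &[i32],
    valid: &[bool],
    op: F,
) -> Result<Vec<Option<bool>>, String> {
    values
        .iter()
        .zip(valid)
        .map(|(v, ok)| if *ok { op(*v).map(Some) } else { Ok(None) })
        .collect()
}

fn main() {
    // The last slot is null; its 0 is only a placeholder.
    let values = [0, 7, 0];
    let valid = [true, true, false];

    // A comparison is total, so evaluating it on the null slot is harmless.
    let zeros = unary_kernel(&values, |x| x == 0);
    assert_eq!(zeros, vec![true, false, true]);

    // A checked remainder fails on 0, so it must not see the placeholder.
    let divides = try_unary_kernel(&values[1..], &valid[1..], |x| {
        100i32
            .checked_rem(x)
            .map(|r| r == 0)
            .ok_or_else(|| "division by zero".to_string())
    });
    assert_eq!(divides, Ok(vec![Some(false), None]));
}
```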
