Implement special min/max accumulator for Strings and Binary (10% faster for Clickbench Q28) #12792

Merged
merged 20 commits into from
Oct 13, 2024

Contributor

@alamb alamb commented Oct 7, 2024

Which issue does this PR close?

Closes #6906

Rationale for this change

Ensure we can turn on StringView -- see #6906. As a bonus, this also makes some queries faster.

What changes are included in this PR?

  1. Implement specialized GroupsAccumulator for Strings/Binary

Are these changes tested?

Yes -- new functional tests

Performance

3x faster for some pathological queries

For this query defined in #6906 (comment)

set datafusion.execution.parquet.schema_force_view_types = true;

SELECT REGEXP_REPLACE("Referer", '^https?://(?:www\\.)?([^/]+)/.*$', '\\1') AS k, AVG(length("Referer")) AS l, COUNT(*) AS c, MIN("Referer")
FROM hits_partitioned
WHERE "Referer" <> '' GROUP BY k HAVING COUNT(*) > 100000 ORDER BY l DESC LIMIT 25;

Run via datafusion-cli -f q28.sql

  • main: Elapsed 18.549 seconds.
  • This branch: Elapsed 6.187 seconds.

10% faster for normal ClickBench Q28

Q28 (with MIN("Referer")) also gets slightly faster:

SELECT REGEXP_REPLACE("Referer", '^https?://(?:www\.)?([^/]+)/.*$', '\1') AS k, AVG(length("Referer")) AS l, COUNT(*) AS c, MIN("Referer") FROM hits WHERE "Referer" <> '' GROUP BY k HAVING COUNT(*) > 100000 ORDER BY l DESC LIMIT 25;

Total results:

--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃  main_base ┃ alamb_min_max_strings ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │     2.28ms │                2.20ms │     no change │
│ QQuery 1     │    37.05ms │               38.20ms │     no change │
│ QQuery 2     │    92.34ms │               94.39ms │     no change │
│ QQuery 3     │   100.18ms │              102.79ms │     no change │
│ QQuery 4     │   922.93ms │              917.95ms │     no change │
│ QQuery 5     │  1023.47ms │              989.25ms │     no change │
│ QQuery 6     │    34.02ms │               33.54ms │     no change │
│ QQuery 7     │    41.22ms │               42.80ms │     no change │
│ QQuery 8     │  1403.10ms │             1384.97ms │     no change │
│ QQuery 9     │  1345.01ms │             1310.50ms │     no change │
│ QQuery 10    │   333.22ms │              334.38ms │     no change │
│ QQuery 11    │   388.83ms │              390.41ms │     no change │
│ QQuery 12    │  1078.98ms │             1095.45ms │     no change │
│ QQuery 13    │  1752.46ms │             1661.97ms │ +1.05x faster │
│ QQuery 14    │  1235.38ms │             1238.51ms │     no change │
│ QQuery 15    │  1073.12ms │             1094.49ms │     no change │
│ QQuery 16    │  2536.49ms │             2511.71ms │     no change │
│ QQuery 17    │  2335.03ms │             2338.92ms │     no change │
│ QQuery 18    │  5026.47ms │             5021.49ms │     no change │
│ QQuery 19    │    93.42ms │               94.42ms │     no change │
│ QQuery 20    │  1753.61ms │             1757.55ms │     no change │
│ QQuery 21    │  2022.70ms │             2021.02ms │     no change │
│ QQuery 22    │  5226.57ms │             5197.39ms │     no change │
│ QQuery 23    │ 10480.78ms │            10503.06ms │     no change │
│ QQuery 24    │   575.37ms │              585.16ms │     no change │
│ QQuery 25    │   497.50ms │              479.91ms │     no change │
│ QQuery 26    │   660.02ms │              655.23ms │     no change │
│ QQuery 27    │  2638.55ms │             2573.67ms │     no change │
│ QQuery 28    │ 15846.93ms │            14410.80ms │ +1.10x faster │
│ QQuery 29    │   516.71ms │              528.71ms │     no change │
│ QQuery 30    │  1043.51ms │             1050.48ms │     no change │
│ QQuery 31    │  1082.67ms │             1128.12ms │     no change │
│ QQuery 32    │  4224.18ms │             4305.11ms │     no change │
│ QQuery 33    │  5152.31ms │             5228.72ms │     no change │
│ QQuery 34    │  5262.44ms │             5219.20ms │     no change │
│ QQuery 35    │  1983.76ms │             1937.33ms │     no change │
│ QQuery 36    │   277.84ms │              265.44ms │     no change │
│ QQuery 37    │   119.12ms │              118.07ms │     no change │
│ QQuery 38    │   140.92ms │              139.69ms │     no change │
│ QQuery 39    │   740.00ms │              748.95ms │     no change │
│ QQuery 40    │    60.74ms │               59.34ms │     no change │
│ QQuery 41    │    50.01ms │               48.32ms │     no change │
│ QQuery 42    │    62.75ms │               63.48ms │     no change │
└──────────────┴────────────┴───────────────────────┴───────────────┘
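As a sanity check on the Change column, the speedup is the baseline time divided by the branch time. A minimal Rust sketch follows; the names `speedup`, `round2`, and `reported_timings` are illustrative and not part of the benchmark tooling. It applies the ratio to the Q13 and Q28 rows and to the datafusion-cli timings reported earlier in this description:

```rust
/// Ratio of baseline time to candidate time; values above 1.0 mean faster.
/// Hypothetical helper for illustration only.
fn speedup(baseline: f64, candidate: f64) -> f64 {
    baseline / candidate
}

/// Round to two decimal places, matching the precision shown in the table.
fn round2(x: f64) -> f64 {
    (x * 100.0).round() / 100.0
}

/// Timings copied from the PR description: (label, main, this branch).
fn reported_timings() -> [(&'static str, f64, f64); 3] {
    [
        ("Q13 (ms)", 1752.46, 1661.97),
        ("Q28 (ms)", 15846.93, 14410.80),
        ("q28.sql with view types (s)", 18.549, 6.187),
    ]
}
```

Evaluating `round2(speedup(main, branch))` for each entry of `reported_timings()` gives roughly 1.05, 1.10 and 3.00, consistent with the figures quoted in this description.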

Are there any user-facing changes?

Slightly faster performance

@alamb alamb force-pushed the alamb/min_max_strings branch 2 times, most recently from 65f1b65 to 8cd03ac Compare October 8, 2024 19:24
@alamb alamb force-pushed the alamb/min_max_strings branch from 8cd03ac to 8bc00f8 Compare October 8, 2024 19:32
@alamb alamb changed the title WIP Implement special min/max accumulator for Strings and Binary WIP Implement special min/max accumulator for Strings and Binary (10% faster for Clickbench Q28) Oct 8, 2024
Contributor Author

alamb commented Oct 8, 2024

I need to write some more specific / targeted tests here and we'll be ready for review

fn set_value(&mut self, group_index: usize, new_val: &[u8]) {
    match self.min_max[group_index].as_mut() {
        None => {
            self.min_max[group_index] = Some(new_val.to_vec());
Contributor

I wonder if it makes sense to overallocate a bit here to avoid reallocations, e.g. 2x the size, or using a statistic such as (total_data_bytes / min_max.len()) * factor

Contributor Author

I can try it -- with this change the time spent in this accumulator becomes very small (like < 1% in the traces). I will try it anyways

Contributor Author

Testing in #12845
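To make the overallocation idea in this thread concrete, here is a minimal sketch. It assumes a simplified state type (`MinBytesState` is a hypothetical name, and the `factor` parameter is an assumption for illustration, not the PR's API). On first insert it reserves `factor` times the value length, so a later, longer value for the same group may fit without reallocating:

```rust
/// Hypothetical per-group state: one optional byte buffer per group.
struct MinBytesState {
    min_max: Vec<Option<Vec<u8>>>,
}

impl MinBytesState {
    fn new(num_groups: usize) -> Self {
        Self { min_max: vec![None; num_groups] }
    }

    /// Store `new_val` for `group_index`.
    /// On first insert, reserve `factor` times the value's length (assumed
    /// tuning knob) so that later values are less likely to reallocate.
    fn set_value(&mut self, group_index: usize, new_val: &[u8], factor: usize) {
        match self.min_max[group_index].as_mut() {
            None => {
                let mut buf = Vec::with_capacity(new_val.len() * factor.max(1));
                buf.extend_from_slice(new_val);
                self.min_max[group_index] = Some(buf);
            }
            Some(existing) => {
                // Reuse the existing allocation; it only grows when the new
                // value is longer than the current capacity.
                existing.clear();
                existing.extend_from_slice(new_val);
            }
        }
    }
}
```

Whether the extra reserved memory pays off depends on the workload. As noted in the thread, the accumulator already accounts for a small share of the trace, so any gain would need to be measured.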

/// groups in an individual byte array, which balances allocations and memory
/// fragmentation (aka garbage).
///
/// ```text
Contributor Author

Here is the diagram of this design

@@ -3818,6 +3818,180 @@ DROP TABLE min_bool;
# Min_Max End #
#################



#################
Contributor Author

I will also make another PR to update the AggregateFuzz test added by @Rachelint in #12667 to cover these scenarios as well

@alamb alamb changed the title WIP Implement special min/max accumulator for Strings and Binary (10% faster for Clickbench Q28) Implement special min/max accumulator for Strings and Binary (10% faster for Clickbench Q28) Oct 10, 2024
/// Replaces the nulls in the input array with the given `NullBuffer`
///
/// Can replace when upstreamed in arrow-rs: <https://github.com/apache/arrow-rs/issues/6528>
pub fn set_nulls_dyn(input: &dyn Array, nulls: Option<NullBuffer>) -> Result<ArrayRef> {
Contributor Author

this is supporting code for replacing the null buffers in arrays

@alamb alamb marked this pull request as ready for review October 10, 2024 12:49
Contributor Author

alamb commented Oct 10, 2024

Ok, I think this is now ready to review

Contributor

@jayzhan211 jayzhan211 left a comment

👍

Contributor Author

alamb commented Oct 10, 2024

Thank you for the reviews @jayzhan211 and @Dandandan

I plan to merge this tomorrow to give other people a chance to review it if they would like

Member

@findepi findepi left a comment

(skimming)

/// Replaces the nulls in the input array with the given `NullBuffer`
///
/// Can replace when upstreamed in arrow-rs: <https://github.com/apache/arrow-rs/issues/6528>
pub fn set_nulls_dyn(input: &dyn Array, nulls: Option<NullBuffer>) -> Result<ArrayRef> {
Member

nit: this could be private

Comment on lines +196 to +197
_ => {
    return not_impl_err!("Applying nulls {:?}", input.data_type());
Member

Do we have to support this for any other data types?

Contributor Author

Not necessarily -- I am hoping we put this code upstream in arrow-rs and can remove it entirely from datafusion eventually

Comment on lines +354 to +368
///                  ┌─────────────────────────────────┐
/// ┌─────┐    ┌────▶│Option<Vec<u8>> (["A"])          │───────────▶ "A"
/// │  0  │────┘     └─────────────────────────────────┘
/// ├─────┤          ┌─────────────────────────────────┐
/// │  1  │─────────▶│Option<Vec<u8>> (["Z"])          │───────────▶ "Z"
/// └─────┘          └─────────────────────────────────┘   ...
///   ...               ...
/// ┌─────┐          ┌────────────────────────────────┐
/// │ N-2 │─────────▶│Option<Vec<u8>> (["A"])         │────────────▶ "A"
/// ├─────┤          └────────────────────────────────┘
/// │ N-1 │────┐     ┌────────────────────────────────┐
/// └─────┘    └────▶│Option<Vec<u8>> (["Q"])         │────────────▶ "Q"
///                  └────────────────────────────────┘
///
/// min_max: Vec<Option<Vec<u8>>>
Member

nice!
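The layout in the diagram (one `Option<Vec<u8>>` slot per group) can be sketched as follows. This is a simplified illustration under stated assumptions, not the PR's `MinMaxBytesAccumulator`: the type name `MinBytesPerGroup`, the `update_batch` signature, and the use of `Option<&[u8]>` to model nulls are all hypothetical, and only MIN is shown.

```rust
/// Hypothetical sketch: tracks the minimum byte string seen for each group.
struct MinBytesPerGroup {
    /// One slot per group; `None` until the group sees a non-null value.
    min_max: Vec<Option<Vec<u8>>>,
}

impl MinBytesPerGroup {
    fn new() -> Self {
        Self { min_max: Vec::new() }
    }

    /// Update with a batch where `values[i]` belongs to group
    /// `group_indices[i]`. `None` values model nulls and are skipped.
    fn update_batch(
        &mut self,
        values: &[Option<&[u8]>],
        group_indices: &[usize],
        total_num_groups: usize,
    ) {
        assert_eq!(values.len(), group_indices.len());
        if self.min_max.len() < total_num_groups {
            self.min_max.resize(total_num_groups, None);
        }
        for (val, &group) in values.iter().zip(group_indices) {
            if let Some(new_val) = *val {
                match self.min_max[group].as_mut() {
                    Some(cur) => {
                        // Byte-wise lexicographic comparison of slices.
                        if new_val < cur.as_slice() {
                            cur.clear();
                            cur.extend_from_slice(new_val);
                        }
                    }
                    None => self.min_max[group] = Some(new_val.to_vec()),
                }
            }
        }
    }
}
```

Keeping a separate buffer per group means an update touches only that group's allocation. The cost is one heap allocation per group that has seen a value.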

@alamb alamb merged commit 646f40a into apache:main Oct 13, 2024
24 checks passed
Contributor Author

alamb commented Oct 13, 2024

🚀

hailelagi pushed a commit to hailelagi/datafusion that referenced this pull request Oct 14, 2024
…ter for Clickbench Q28) (apache#12792)

* Implement special min/max accumulator for Strings: `MinMaxBytesAccumulator`

* fix bug

* fix msrv

* move code, handle filters

* simplify

* Add functional tests

* remove unecessary test

* improve docs

* improve docs

* cleanup

* improve comments

* fix diagram

* fix accounting

* Use correct type in memory accounting

* Add TODO comment
Labels
functions sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement fast min/max accumulator for binary / strings (now it uses the slower path)
5 participants