Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase batch size #8388

Closed
wants to merge 6 commits into from
Closed

Conversation

Dandandan
Copy link
Contributor

@Dandandan Dandandan commented Dec 1, 2023

Which issue does this PR close?

Closes #6916

Rationale for this change

We should have some more optimal defaults.

32K seems for this machine relatively optimal. Increasing to 64K does further improve performance on some queries, but also slows down a couple.

Tested on apple m1 (10 core)

--------------------
Benchmark tpch.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃     main ┃ tweak_batch_size2 ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 153.66ms │          154.25ms │     no change │
│ QQuery 2     │  50.78ms │           48.70ms │     no change │
│ QQuery 3     │  70.93ms │           68.99ms │     no change │
│ QQuery 4     │  53.13ms │           49.83ms │ +1.07x faster │
│ QQuery 5     │  82.32ms │           79.74ms │     no change │
│ QQuery 6     │  32.01ms │           31.21ms │     no change │
│ QQuery 7     │ 103.00ms │           97.76ms │ +1.05x faster │
│ QQuery 8     │ 107.53ms │          104.09ms │     no change │
│ QQuery 9     │ 129.14ms │          132.60ms │     no change │
│ QQuery 10    │ 136.39ms │          132.71ms │     no change │
│ QQuery 11    │  37.42ms │           36.17ms │     no change │
│ QQuery 12    │  85.80ms │           81.19ms │ +1.06x faster │
│ QQuery 13    │ 188.78ms │          180.52ms │     no change │
│ QQuery 14    │  51.23ms │           49.17ms │     no change │
│ QQuery 15    │  62.73ms │           57.85ms │ +1.08x faster │
│ QQuery 16    │  58.26ms │           55.36ms │     no change │
│ QQuery 17    │ 106.88ms │          105.24ms │     no change │
│ QQuery 18    │ 209.29ms │          175.27ms │ +1.19x faster │
│ QQuery 19    │ 102.93ms │           97.32ms │ +1.06x faster │
│ QQuery 20    │  56.63ms │           53.63ms │ +1.06x faster │
│ QQuery 21    │ 127.17ms │          114.02ms │ +1.12x faster │
│ QQuery 22    │  38.51ms │           36.83ms │     no change │
└──────────────┴──────────┴───────────────────┴───────────────┘
--------------------
Benchmark tpch_mem.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃     main ┃ tweak_batch_size2 ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │  91.11ms │           85.82ms │ +1.06x faster │
│ QQuery 2     │  26.68ms │           24.94ms │ +1.07x faster │
│ QQuery 3     │  52.38ms │           30.65ms │ +1.71x faster │
│ QQuery 4     │  51.02ms │           27.19ms │ +1.88x faster │
│ QQuery 5     │ 115.14ms │           64.31ms │ +1.79x faster │
│ QQuery 6     │   9.75ms │            5.64ms │ +1.73x faster │
│ QQuery 7     │ 212.57ms │          105.89ms │ +2.01x faster │
│ QQuery 8     │  60.12ms │           41.41ms │ +1.45x faster │
│ QQuery 9     │  59.65ms │           52.97ms │ +1.13x faster │
│ QQuery 10    │ 113.51ms │           63.75ms │ +1.78x faster │
│ QQuery 11    │  19.42ms │           19.36ms │     no change │
│ QQuery 12    │  58.83ms │           28.11ms │ +2.09x faster │
│ QQuery 13    │  54.67ms │           40.07ms │ +1.36x faster │
│ QQuery 14    │  18.18ms │            9.44ms │ +1.93x faster │
│ QQuery 15    │  58.83ms │           24.29ms │ +2.42x faster │
│ QQuery 16    │  21.80ms │           22.11ms │     no change │
│ QQuery 17    │  53.05ms │           48.37ms │ +1.10x faster │
│ QQuery 18    │ 154.05ms │          120.13ms │ +1.28x faster │
│ QQuery 19    │  34.50ms │           26.75ms │ +1.29x faster │
│ QQuery 20    │  62.34ms │           37.43ms │ +1.67x faster │
│ QQuery 21    │ 247.24ms │          121.62ms │ +2.03x faster │
│ QQuery 22    │  14.14ms │           14.25ms │     no change │
└──────────────┴──────────┴───────────────────┴───────────────┘

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) labels Dec 1, 2023
@andygrove
Copy link
Member

Here are my results at sf10 on a 24 core Threadripper, showing a 7% slowdown overall.

Comparing main and tweak_batch_size2
--------------------
Benchmark tpch.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃      main ┃ tweak_batch_size2 ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 1495.93ms │         2319.55ms │  1.55x slower │
│ QQuery 2     │  372.33ms │          351.49ms │ +1.06x faster │
│ QQuery 3     │  720.76ms │         1110.36ms │  1.54x slower │
│ QQuery 4     │  833.01ms │          471.11ms │ +1.77x faster │
│ QQuery 5     │ 1759.97ms │         1730.63ms │     no change │
│ QQuery 6     │  247.42ms │          465.09ms │  1.88x slower │
│ QQuery 7     │ 2054.92ms │         2327.30ms │  1.13x slower │
│ QQuery 8     │ 1809.03ms │         1993.64ms │  1.10x slower │
│ QQuery 9     │ 2796.79ms │         3076.88ms │  1.10x slower │
│ QQuery 10    │ 1739.64ms │         1995.19ms │  1.15x slower │
│ QQuery 11    │  315.52ms │          365.86ms │  1.16x slower │
│ QQuery 12    │  663.13ms │          542.66ms │ +1.22x faster │
│ QQuery 13    │  776.61ms │          798.09ms │     no change │
│ QQuery 14    │  464.86ms │          541.58ms │  1.17x slower │
│ QQuery 15    │  745.22ms │         1011.17ms │  1.36x slower │
│ QQuery 16    │  437.82ms │          477.74ms │  1.09x slower │
│ QQuery 17    │ 3237.00ms │         3033.37ms │ +1.07x faster │
│ QQuery 18    │ 3223.49ms │         3296.59ms │     no change │
│ QQuery 19    │  664.84ms │          950.09ms │  1.43x slower │
│ QQuery 20    │ 1072.52ms │         1127.15ms │  1.05x slower │
│ QQuery 21    │ 3144.50ms │         2967.69ms │ +1.06x faster │
│ QQuery 22    │  944.87ms │          545.98ms │ +1.73x faster │
└──────────────┴───────────┴───────────────────┴───────────────┘

@Dandandan
Copy link
Contributor Author

Here are my results at sf10 on a 24 core Threadripper, showing a 7% slowdown overall.

Comparing main and tweak_batch_size2
--------------------
Benchmark tpch.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃      main ┃ tweak_batch_size2 ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 1495.93ms │         2319.55ms │  1.55x slower │
│ QQuery 2     │  372.33ms │          351.49ms │ +1.06x faster │
│ QQuery 3     │  720.76ms │         1110.36ms │  1.54x slower │
│ QQuery 4     │  833.01ms │          471.11ms │ +1.77x faster │
│ QQuery 5     │ 1759.97ms │         1730.63ms │     no change │
│ QQuery 6     │  247.42ms │          465.09ms │  1.88x slower │
│ QQuery 7     │ 2054.92ms │         2327.30ms │  1.13x slower │
│ QQuery 8     │ 1809.03ms │         1993.64ms │  1.10x slower │
│ QQuery 9     │ 2796.79ms │         3076.88ms │  1.10x slower │
│ QQuery 10    │ 1739.64ms │         1995.19ms │  1.15x slower │
│ QQuery 11    │  315.52ms │          365.86ms │  1.16x slower │
│ QQuery 12    │  663.13ms │          542.66ms │ +1.22x faster │
│ QQuery 13    │  776.61ms │          798.09ms │     no change │
│ QQuery 14    │  464.86ms │          541.58ms │  1.17x slower │
│ QQuery 15    │  745.22ms │         1011.17ms │  1.36x slower │
│ QQuery 16    │  437.82ms │          477.74ms │  1.09x slower │
│ QQuery 17    │ 3237.00ms │         3033.37ms │ +1.07x faster │
│ QQuery 18    │ 3223.49ms │         3296.59ms │     no change │
│ QQuery 19    │  664.84ms │          950.09ms │  1.43x slower │
│ QQuery 20    │ 1072.52ms │         1127.15ms │  1.05x slower │
│ QQuery 21    │ 3144.50ms │         2967.69ms │ +1.06x faster │
│ QQuery 22    │  944.87ms │          545.98ms │ +1.73x faster │
└──────────────┴───────────┴───────────────────┴───────────────┘

Wow, interesting difference!

@alamb alamb marked this pull request as draft December 5, 2023 20:01
@alamb
Copy link
Contributor

alamb commented Dec 5, 2023

Marking as a draft to make it clear this PR isn't waiting on review (I don't think)

@Dandandan Dandandan closed this Mar 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Tweak optimal default batch size
3 participants