Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase the default value of task writer count to 32 for partition case #14553

Merged
merged 6 commits into from
Oct 20, 2022
Merged

Increase the default value of task writer count to 32 for partition case #14553

merged 6 commits into from
Oct 20, 2022

Conversation

gaurav8297
Copy link
Member

@gaurav8297 gaurav8297 commented Oct 10, 2022

Description

Problem:

  • Current default for task writer count is set to 1 for partitioned write which is terrible for performance.
  • Prefer partitioning is disabled if there are no stats which makes the writes unreliable in case there are huge partitions.

Approach:

  • Introduce task.partitioned-writer-count with the default value of 32. This is for tables which are partitioned and when prefer partitioning is used. Users can also change this config through partitioned_task_writer_count session config.
  • Enable prefer partitioning in case stats are absent.

Non-technical explanation

Release notes

( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Oct 10, 2022
@gaurav8297 gaurav8297 requested a review from sopel39 October 10, 2022 21:23
@gaurav8297 gaurav8297 marked this pull request as ready for review October 11, 2022 15:21
@gaurav8297 gaurav8297 requested a review from sopel39 October 11, 2022 16:08
@sopel39 sopel39 requested a review from Dith3r October 12, 2022 07:49
Copy link
Member

@sopel39 sopel39 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm % comments

Copy link
Member

@sopel39 sopel39 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm % comment
% can test be added?

@gaurav8297
Copy link
Member Author

% can test be added?

I don't think we need more. There are already multiple tests for inserts (partitioned and unpartitioned) with different writer counts and for the second last commit, we are already testing that preferred partitioning should be true when stats are not available.

@gaurav8297 gaurav8297 requested a review from sopel39 October 16, 2022 18:48
@gaurav8297 gaurav8297 requested a review from sopel39 October 17, 2022 15:04
Copy link
Member

@sopel39 sopel39 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

% comments

@gaurav8297
Copy link
Member Author

CI hit % #14682

Copy link
Member

@sopel39 sopel39 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Introduce task.partitioned-writer-count with the default
value of 32. This is for tables which are partitioned
and prefer partitioning is used. Users can also change
this config through task_partitioned_writer_count
session config.
This is essential because query could write to
huge amount of partition using unpartitioned
writing route. Therefore, it can lead to out
of memory error since each writer thread could
write to all partitions while each partition
per writer allocates a certain amount of memory
for buffering.
@sopel39 sopel39 merged commit 803d57a into trinodb:master Oct 20, 2022
@github-actions github-actions bot added this to the 401 milestone Oct 20, 2022
@sopel39 sopel39 mentioned this pull request Oct 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

3 participants