Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add upper limit on number of tasks that take part in writing stages #16238

Conversation

radek-kondziolka
Copy link
Contributor

@radek-kondziolka radek-kondziolka commented Feb 23, 2023

Description

The option query.max-writer-node-count was added to limit number of nodes that take part in executing writer stages.
Roughly, it was implemented by some changes in ScaledWriterScheduler (for unpartitioned data) and by changing the AddExchanges rule (mostly for partitioned data).

Release notes

() This is not user-visible or docs only and no release notes are required.
() Release notes are required, please propose a release note for me.
(*) Release notes are required, with the following suggested text:

Add posibility to limit number of writing tasks.

Documentation

There is a change in this PR

@radek-kondziolka
Copy link
Contributor Author

radek-kondziolka commented Feb 23, 2023

I will add more tests after we agree that this approach is ok.

@radek-kondziolka radek-kondziolka force-pushed the rk/add_upper_limit_on_writer_scaling_backup_with_connector branch from 713aae5 to 6906e11 Compare February 23, 2023 15:33
@radek-kondziolka radek-kondziolka force-pushed the rk/add_upper_limit_on_writer_scaling_backup_with_connector branch from 6906e11 to 2bb3511 Compare February 24, 2023 09:40
@gaurav8297
Copy link
Member

There are some relevant CI failures. Please fix them

@gaurav8297
Copy link
Member

Can we also have an integration test in BaseHiveConnectorTest? Something similar to testMultipleWriters where we check the number of files in the end

@radek-kondziolka radek-kondziolka force-pushed the rk/add_upper_limit_on_writer_scaling_backup_with_connector branch from 2bb3511 to f3bd21e Compare February 27, 2023 10:15
@radek-kondziolka
Copy link
Contributor Author

radek-kondziolka commented Feb 27, 2023

Can we also have an integration test in BaseHiveConnectorTest? Something similar to testMultipleWriters where we check the number of files in the end

Sure, as I wrote here: #16238 (comment) I am going to add new tests after we agree this approach is ok.

@radek-kondziolka radek-kondziolka changed the title Rk/add upper limit on writer scaling backup with connector Add upper limit on writer scaling backup with connector Feb 27, 2023
@radek-kondziolka radek-kondziolka changed the title Add upper limit on writer scaling backup with connector Add upper limit on number of tasks that take part in writing stages Feb 27, 2023
@radek-kondziolka radek-kondziolka force-pushed the rk/add_upper_limit_on_writer_scaling_backup_with_connector branch from f3bd21e to 16aded6 Compare February 27, 2023 12:12
@wendigo
Copy link
Contributor

wendigo commented Feb 27, 2023

2023-02-27T12:44:11.0895418Z java.lang.AssertionError: io.trino.plugin.base.classloader.ClassLoaderSafeConnectorMetadata does not override [public default java.util.OptionalInt io.trino.spi.connector.ConnectorMetadata.getMaxWriterTasks(io.trino.spi.connector.ConnectorSession)]

@radek-kondziolka radek-kondziolka force-pushed the rk/add_upper_limit_on_writer_scaling_backup_with_connector branch from 16aded6 to 203784d Compare February 27, 2023 13:59
@radek-kondziolka radek-kondziolka force-pushed the rk/add_upper_limit_on_writer_scaling_backup_with_connector branch from 203784d to 7afb274 Compare February 28, 2023 15:33
@radek-kondziolka radek-kondziolka force-pushed the rk/add_upper_limit_on_writer_scaling_backup_with_connector branch from ae493bb to fdfca2a Compare March 9, 2023 07:36
@radek-kondziolka radek-kondziolka requested a review from sopel39 March 9, 2023 07:38
Map's key in PipelinedQueryScheduler.partitioningCacheMap is built only from
partitioningHandle parameter. If any two stages has same partitioning
but different number of partitions the key does not reflect that.
This is why key was extended with partitionCount.
@radek-kondziolka radek-kondziolka force-pushed the rk/add_upper_limit_on_writer_scaling_backup_with_connector branch from fdfca2a to 7240ab1 Compare March 9, 2023 10:43
Copy link
Member

@sopel39 sopel39 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm % comments

@radek-kondziolka radek-kondziolka force-pushed the rk/add_upper_limit_on_writer_scaling_backup_with_connector branch 6 times, most recently from 1009e95 to c56bc02 Compare March 14, 2023 07:29
Copy link
Member

@sopel39 sopel39 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comments

Limit number of tasks that will be scheduled to execute writing stages
basing on needs of specific connector or global limit -
query.max-writer-nodes-count.

Added an option query.max-writer-nodes-count to QueryManagerConfig and
a session option that limits number of tasks that take part in writing
nodes.
@radek-kondziolka radek-kondziolka force-pushed the rk/add_upper_limit_on_writer_scaling_backup_with_connector branch 7 times, most recently from 884826e to 738e3dc Compare March 16, 2023 09:30
@radek-kondziolka radek-kondziolka force-pushed the rk/add_upper_limit_on_writer_scaling_backup_with_connector branch from 738e3dc to fa9a667 Compare March 16, 2023 19:25
@sopel39 sopel39 merged commit 8750f07 into trinodb:master Mar 17, 2023
@sopel39 sopel39 mentioned this pull request Mar 17, 2023
@sopel39
Copy link
Member

sopel39 commented Mar 17, 2023

Thx!

@github-actions github-actions bot added this to the 411 milestone Mar 17, 2023
@ebyhr ebyhr mentioned this pull request Nov 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

4 participants