Iceberg split generation rate limit #18214

marton-bod · 2023-07-10T14:46:17Z

Description

Hive, Hudi and DeltaLake connectors already use the same approach implemented in this PR, whereby they first asynchronously load (stream) splits into an AsyncQueue (which can be configured with throttling settings), and then when getNextBatch() is periodically called from the SplitSource, splits are borrowed from this queue.

The primary aim is to control the rate at which splits are fetched for scheduling, thereby protecting the storage layer from getting overwhelmed by requests (e.g. namenodes in HDFS). This is achieved by a newly-introduced config flag (already present in Hive, Hudi and Delta): iceberg.max-splits-per-second.

Additional context and related issues

Solves the issue: #17974

The logic for filtering out splits should be unchanged: the pruning logic related to dynamic filter is in getNextBatch() since DF needs to be recomputed periodically. The rest of the pruning logic we can be applied to a split even before enqueueing it, therefore we can perform those checks within acceptScanTask() when we create the split stream.

Note: FileAwareScanTask is a new class introduced to keep metadata for FileScanTasks regarding their parent files. This is needed to replicate the current logic which stops the iteration if we hit enough records already for a LIMIT clause.

Release notes

( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(X) Release notes are required, with the following suggested text:

# Iceberg connector
* Introduce new config property to rate limit split generation in Iceberg table scans: iceberg.max-splits-per-second.

cla-bot · 2023-07-10T20:52:48Z

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Marton Bod.
This is most likely caused by a git client misconfiguration; please make sure to:

check if your git client is configured with an email to sign commits git config --list | grep email
If not, set it up using git config --global user.email [email protected]
Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

marton-bod · 2023-07-17T17:41:34Z

@alexjo2144 @findinpath Just a friendly bump, I would really appreciate it if you could review this PR when you get the chance. Thank you!

mosabua · 2023-07-18T00:27:08Z

Given that this is modelled of implementation in other connectors this seems like a good approach. Could you potentially also look @electrum or @findepi

marton-bod · 2023-07-20T10:17:54Z

Thanks a lot @mosabua for taking a first look. @electrum / @findepi please let me know if you have any thoughts, I do appreciate any input. Thanks a lot!

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergConfig.java

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergSessionProperties.java

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergSplitManager.java

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergSplitSource.java

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergModule.java

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergSplitSource.java

cla-bot · 2023-07-26T16:33:37Z

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Marton Bod.
This is most likely caused by a git client misconfiguration; please make sure to:

check if your git client is configured with an email to sign commits git config --list | grep email
If not, set it up using git config --global user.email [email protected]
Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

cla-bot · 2023-07-27T14:44:29Z

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Marton Bod.
This is most likely caused by a git client misconfiguration; please make sure to:

check if your git client is configured with an email to sign commits git config --list | grep email
If not, set it up using git config --global user.email [email protected]
Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

marton-bod · 2023-07-27T14:54:15Z

...n/trino-iceberg/src/test/java/io/trino/plugin/iceberg/TestIcebergMetadataFileOperations.java

-                        .addCopies(new FileOperation(SNAPSHOT, INPUT_FILE_NEW_STREAM), 1)
-                        .addCopies(new FileOperation(MANIFEST, INPUT_FILE_NEW_STREAM), numberOfFiles + min(icebergManifestPrefetching, numberOfFiles))
-                        .build());
+        assertFileSystemDataFileAccesses("SELECT * FROM test_select_with_limit LIMIT " + limit, min(limit, numberOfFiles));


@findepi the number of manifest accesses is no longer deterministic, because the splits are enqueued async. As soon as it's identified during a getNextBatch call that we have enough splits, the enqueuing will stop, but we can't make any guarantees about the number of manifest accesses.

On the other hand, the number of data file accesses are deterministic and no more data files will be read than necessary. Changed this test to reflect this.

thinking about this

we still want to test manifest accesses; this is important metric

there should probably be a kill-switch for this functionality

so maybe the test should just use that kill-switch?

I've been thinking about this too, but I think adding a kill switch would complicate the code greatly. The iterative split generation approach is fundamentally different from the background async split loading approach, and a kill switch would require maintaining the code for both approaches in parallel and duplicating the logic in some places (and more painfully, porting fixes to both places). Because of this, I think it's impractical to introduce this flag.

I've made some enhancement to the test though, since even with this new approach we can make some guarantees about the number of manifest accesses. I've changed the test to compare the manifest access count against a lower bound and upper bound, and added some code comments to explain the reasoning behind the bounds. Regardless of the number of manifests in the scan, if there's a limit, the number of actually-accessed manifests should stay within a constant, narrow range.

The iterative split generation approach is fundamentally different from the background async split loading approach

kill switches are useful in case a bug is uncovered. if the new code is fundamentally different from the old one, it sounds like it's not safe to proceed, is it?

I phrased it wrong, I think. They're not fundamentally different, since most of the split generation and filtering logic has remained unchanged. It's different only in the sense that hiding it behind a feature flag is not trivial like putting some code in an if-else statement. But it's doable. For example, I could refactor it so that IcebergAsyncSplitSource extends IcebergSplitSource and then IcebergSplitManager can toggle between which implementation to use based on the flag - what do you think?

Can we make the new implementation produce the same behaviour as before if max-outstanding-splits was configurable and set to 1 ?

marton-bod · 2023-07-27T14:57:11Z

@findepi Thanks so much for your time to review this. I've tried answering your questions as best as I could. Please let me know if you have any other questions/improvement ideas. Thanks!

cla-bot · 2023-08-01T11:45:23Z

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Marton Bod.
This is most likely caused by a git client misconfiguration; please make sure to:

check if your git client is configured with an email to sign commits git config --list | grep email
If not, set it up using git config --global user.email [email protected]
Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

vgankidi · 2024-02-26T17:08:27Z

@raunaqmorarka @findepi @findinpath @electrum Would really appreciate reviews here, its been a long pending one. Hope to close this out soon.

vgankidi · 2024-03-13T23:48:23Z

@raunaqmorarka @findepi @findinpath @electrum @mosabua Gentle ping again on this.

bitsondatadev · 2024-03-14T01:26:34Z

@vgankidi I'm just gonna ping and ping until someone can get to this lol. Sorry this has taken so long.

Also, please attend the contributor meeting next week: https://github.com/trinodb/trino/wiki/Contributor-meetings

Bring this up there.

dain · 2024-03-14T18:34:01Z

This feature is pretty anti Trino. Trino is designed to complete queries as fast as possible. Storage systems should be protecting themselves by pushing back and throttling (like all of the cloud ones do). Also split generation is not representitive of actually performing any IO, so even if we wanted to add a throttle, this is the wrong place.

dain

~~I don't think we should be doing this, and it should be discussed (slack, contributors meeting, etc.) before merging.~~

Turns out this is in every other hive style connector for some reason, so I guess this this ship has sailed

dain · 2024-03-15T04:46:46Z

@electrum pointed out this is in the outher Hive style connectors already. I have no idea why this was added because it is a poor stand in for real concurrency control... anyway, I suggest we do not document or recommend this feature.

marton-bod · 2024-03-18T17:23:52Z

@dain

The same approach is already followed by both Hive, Delta Lake and Hudi. As for the "anti-trinoness" of this approach, I would just remind you of this Delta commit: 9956543e, which introduces the Delta connector, which was co-authored by you as well as couple dozen Starburst engineers, signing off on this very same approach.
The point of this change is to protect storage systems which do not have throttling capabilities, like HDFS. I clearly pointed this out in the issue description as well. We have plenty of Iceberg datasets in HDFS. For storage systems with throttling, like S3, I agree this shouldn't be used, but no one suggested it otherwise.

dain · 2024-04-10T21:54:10Z

I suggest we fix this up and merge it, so there is parity across the implementations. Then I suggest we add a real throttling wrapper to the new native file systems to actualy deal with real errors (we can start with a simpel per machine uncoordinated limt). Once that is in we should drop this from all of the connectors.

mosabua · 2024-04-10T22:29:45Z

Awesome @dain .. do you want me to file an issue so we can track this and move it along?

@marton-bod can you rebase and ping us when its ready for review again? Then we can figure out how @findepi @raunaqmorarka @ebyhr and @dain collaborate efficiently to proceed this to merge.

marton-bod · 2024-04-11T19:18:43Z

@mosabua @dain Thanks all! I've rebased this to the latest master, so should be ready again for reviews

github-actions · 2024-05-03T17:07:03Z

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

marton-bod · 2024-05-13T15:12:38Z

Hi @dain, I was wondering if you could please take a look at this to close it out? Thanks so much!

mosabua · 2024-05-24T22:01:46Z

Please rebase @marton-bod

Either @findepi or @dain can then proceed with final review and merge

marton-bod · 2024-05-28T15:53:03Z

@mosabua @dain Rebase done. Thanks!

marton-bod · 2024-07-17T11:35:38Z

@mosabua Shall we just close this PR instead?

mosabua · 2024-07-17T16:48:48Z

@marton-bod @dain .. we kinda agreed to get this merged to get parity for all the object storage connectors on this feature. I still think we should do that .. and also work on a proper solution, maybe on the file system level @electrum

Thoughts?

And also .. sorry for the delays and back and forth.

cla-bot bot added the cla-signed label Jul 10, 2023

github-actions bot added the iceberg Iceberg connector label Jul 10, 2023

cla-bot bot removed the cla-signed label Jul 10, 2023

marton-bod force-pushed the max_splits_iceberg branch from 21f41d4 to c237f9f Compare July 10, 2023 20:54

cla-bot bot added the cla-signed label Jul 10, 2023

marton-bod requested review from alexjo2144, findinpath and findepi July 11, 2023 17:31

findepi changed the title ~~Iceberg: add config flags to rate limit split generation~~ Iceberg: add config flags to rate limit Iceberg split generation Jul 12, 2023

findepi changed the title ~~Iceberg: add config flags to rate limit Iceberg split generation~~ Iceberg split generation rate limit Jul 12, 2023

findepi reviewed Jul 21, 2023

View reviewed changes

marton-bod force-pushed the max_splits_iceberg branch from 11e29b8 to c322bad Compare July 25, 2023 15:38

cla-bot bot removed the cla-signed label Jul 26, 2023

marton-bod force-pushed the max_splits_iceberg branch from ae26fb3 to 6b08d49 Compare July 26, 2023 16:35

cla-bot bot added the cla-signed label Jul 26, 2023

marton-bod force-pushed the max_splits_iceberg branch from 6b08d49 to aa4fca3 Compare July 26, 2023 16:53

cla-bot bot removed the cla-signed label Jul 27, 2023

marton-bod force-pushed the max_splits_iceberg branch from 3f3674a to ab82595 Compare July 27, 2023 14:45

cla-bot bot added the cla-signed label Jul 27, 2023

marton-bod commented Jul 27, 2023

View reviewed changes

cla-bot bot removed the cla-signed label Aug 1, 2023

marton-bod force-pushed the max_splits_iceberg branch from e408de7 to 2fc86ae Compare August 1, 2023 11:46

marton-bod requested review from findepi and raunaqmorarka February 22, 2024 15:57

dain previously requested changes Mar 14, 2024

View reviewed changes

ebyhr removed their request for review March 25, 2024 00:55

marton-bod force-pushed the max_splits_iceberg branch from 09a6c52 to 1bff24d Compare April 11, 2024 11:11

github-actions bot added the stale label May 3, 2024

mosabua added stale-ignore Use this label on PRs that should be ignored by the stale bot so they are not flagged or closed. and removed stale labels May 7, 2024

marton-bod requested a review from dain May 20, 2024 16:56

mosabua assigned dain May 24, 2024

marton-bod added 3 commits May 28, 2024 14:36

Extract initializeIterator method in IcebergSplitSource

3aca87d

Iceberg: add config flags to rate limit split generation

e363693

Empty commit

dedf6a0

marton-bod force-pushed the max_splits_iceberg branch from b913b34 to dedf6a0 Compare May 28, 2024 13:55

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Iceberg split generation rate limit #18214

Iceberg split generation rate limit #18214

marton-bod commented Jul 10, 2023 •

edited

Loading

cla-bot bot commented Jul 10, 2023

marton-bod commented Jul 17, 2023

mosabua commented Jul 18, 2023

marton-bod commented Jul 20, 2023

cla-bot bot commented Jul 26, 2023

cla-bot bot commented Jul 27, 2023

marton-bod Jul 27, 2023

findepi Jul 31, 2023

marton-bod Aug 1, 2023

findepi Aug 1, 2023

marton-bod Aug 1, 2023 •

edited

Loading

raunaqmorarka Dec 11, 2023

marton-bod commented Jul 27, 2023

cla-bot bot commented Aug 1, 2023

vgankidi commented Feb 26, 2024

vgankidi commented Mar 13, 2024

bitsondatadev commented Mar 14, 2024 •

edited

Loading

dain commented Mar 14, 2024 •

edited

Loading

dain left a comment •

edited

Loading

dain commented Mar 15, 2024

marton-bod commented Mar 18, 2024

dain commented Apr 10, 2024

mosabua commented Apr 10, 2024

marton-bod commented Apr 11, 2024

github-actions bot commented May 3, 2024

marton-bod commented May 13, 2024

mosabua commented May 24, 2024

marton-bod commented May 28, 2024

marton-bod commented Jul 17, 2024

mosabua commented Jul 17, 2024 •

edited

Loading

Iceberg split generation rate limit #18214

Are you sure you want to change the base?

Iceberg split generation rate limit #18214

Conversation

marton-bod commented Jul 10, 2023 • edited Loading

Description

Additional context and related issues

Release notes

cla-bot bot commented Jul 10, 2023

marton-bod commented Jul 17, 2023

mosabua commented Jul 18, 2023

marton-bod commented Jul 20, 2023

cla-bot bot commented Jul 26, 2023

cla-bot bot commented Jul 27, 2023

marton-bod Jul 27, 2023

Choose a reason for hiding this comment

findepi Jul 31, 2023

Choose a reason for hiding this comment

marton-bod Aug 1, 2023

Choose a reason for hiding this comment

findepi Aug 1, 2023

Choose a reason for hiding this comment

marton-bod Aug 1, 2023 • edited Loading

Choose a reason for hiding this comment

raunaqmorarka Dec 11, 2023

Choose a reason for hiding this comment

marton-bod commented Jul 27, 2023

cla-bot bot commented Aug 1, 2023

vgankidi commented Feb 26, 2024

vgankidi commented Mar 13, 2024

bitsondatadev commented Mar 14, 2024 • edited Loading

dain commented Mar 14, 2024 • edited Loading

dain left a comment • edited Loading

Choose a reason for hiding this comment

dain commented Mar 15, 2024

marton-bod commented Mar 18, 2024

dain commented Apr 10, 2024

mosabua commented Apr 10, 2024

marton-bod commented Apr 11, 2024

github-actions bot commented May 3, 2024

marton-bod commented May 13, 2024

mosabua commented May 24, 2024

marton-bod commented May 28, 2024

marton-bod commented Jul 17, 2024

mosabua commented Jul 17, 2024 • edited Loading

marton-bod commented Jul 10, 2023 •

edited

Loading

marton-bod Aug 1, 2023 •

edited

Loading

bitsondatadev commented Mar 14, 2024 •

edited

Loading

dain commented Mar 14, 2024 •

edited

Loading

dain left a comment •

edited

Loading

mosabua commented Jul 17, 2024 •

edited

Loading