[Task]: Client-side throttling for BigQueryIO DIRECT_READ mode #30646

Abacn · 2024-03-15T20:26:24Z

Abacn · 2024-03-15T20:39:10Z

This is a known issue, and an earlier approach, #24260, tried to address it. That approach currently takes the form of a pipeline option, --enableStorageReadApiV2, which opts in to BigQuery Storage Read API v2, where read streams are the units of work instead of units of parallelism, and streams, once created, are not split further. This approach still needs stress testing to see whether, and to what extent, it mitigates the issue.

This task tries to resolve the issue through client-side throttling, as an alternative to the approach above.
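For readers unfamiliar with the idea, here is a minimal sketch of client-side throttling in plain Java. It is not the BigQueryIO implementation: the class name, the `QuotaExceededException` type, and the `AtomicLong` standing in for a throttling-time metric are all hypothetical. It only illustrates the general pattern of backing off on a quota error and recording how long the client spent throttled, so that a runner could tell "slow because throttled" apart from "slow because more workers are needed".

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicLong;

public class ThrottledCallSketch {
  // Hypothetical stand-in for a throttling-time metric visible to the runner.
  static final AtomicLong throttledMillis = new AtomicLong();

  // Hypothetical exception representing a quota or "resource exhausted" response.
  static class QuotaExceededException extends RuntimeException {}

  // Retries the call with exponential backoff and records the time spent waiting.
  static <T> T callWithThrottling(Callable<T> call, int maxAttempts, long initialBackoffMillis)
      throws Exception {
    long backoff = initialBackoffMillis;
    for (int attempt = 1; ; attempt++) {
      try {
        return call.call();
      } catch (QuotaExceededException e) {
        if (attempt >= maxAttempts) {
          throw e; // give up after the last allowed attempt
        }
        Thread.sleep(backoff);
        throttledMillis.addAndGet(backoff); // report the wait as throttled time
        backoff *= 2;
      }
    }
  }

  // Simulates a call that is rejected twice and then succeeds.
  // Returns the total throttled time recorded for that call.
  static long demo() {
    throttledMillis.set(0);
    int[] calls = {0};
    try {
      callWithThrottling(
          () -> {
            if (calls[0]++ < 2) {
              throw new QuotaExceededException();
            }
            return "row";
          },
          5,
          10);
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
    return throttledMillis.get();
  }

  public static void main(String[] args) {
    System.out.println("throttled millis: " + demo());
  }
}
```

In the demo the two rejected attempts wait 10 ms and 20 ms, so 30 ms of throttled time is recorded before the third attempt succeeds.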

Abacn · 2024-04-26T20:51:04Z

After #31096, client-side throttling now works with Storage Read API v2 streams (#28778) on the Dataflow legacy runner. There are still many caveats.

For the default Read API v1 stream, it appears that an API call waiting on retry does not temporarily release the concurrent stream quota, so a hasNext call can block for a very long time until the metrics get reported back to the work item thread. The pipeline does not upscale, but it gets stuck indefinitely (probably until retries are exhausted).

Update: the API v1 stream issue is due to an (effective) deadlock between two synchronized blocks at

beam/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryStorageStreamSource.java

Line 238 in 673da54

private synchronized boolean readNextRecord() throws IOException {

beam/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryStorageStreamSource.java

Line 376 in 673da54

synchronized (this) {

Both call hasNext. If splitAtFraction enters its synchronized block first, the work item thread has to wait until splitAtFraction exits that block. The symptom is then:

Operation ongoing in step BigQueryIO.TypedRead/Read(BigQueryStorageTableSource) for at least 05m00s without outputting or completing in state process in thread pool-3-thread-2 with id 28
  at org.apache.beam.sdk.io.gcp.bigquery.BigQueryStorageStreamSource$BigQueryStorageStreamReader.advance(BigQueryStorageStreamSource.java:232)

If readNextRecord gets called first, splitAtFraction will take a very long time (which probably causes issues as well?).

Finally, metrics are not supported in the thread calling splitAtFraction (the worker status update thread), so reportingPendingMetrics there should be removed.
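The contention between the two synchronized sections described above can be reproduced in isolation. The sketch below is not the code from BigQueryStorageStreamSource or from any of the linked PRs; the `Reader` class, its method names, and the timings are hypothetical. It shows a reader thread holding a lock across a slow call, and a split request that uses a timed `tryLock` to decline instead of waiting, which is one possible way to keep the splitting thread from blocking.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class SplitContentionSketch {
  // Hypothetical stand-in for a stream reader whose next-record call can block.
  static class Reader {
    private final ReentrantLock lock = new ReentrantLock();
    private final long slowCallMillis;

    Reader(long slowCallMillis) {
      this.slowCallMillis = slowCallMillis;
    }

    // Plays the role of readNextRecord(): holds the lock for the whole slow call.
    void readNextRecord(CountDownLatch started) throws InterruptedException {
      lock.lock();
      try {
        started.countDown(); // signal that the lock is now held
        Thread.sleep(slowCallMillis); // simulated blocked hasNext()
      } finally {
        lock.unlock();
      }
    }

    // Plays the role of splitAtFraction(): declines if the lock is not free in time.
    boolean trySplit(long waitMillis) throws InterruptedException {
      if (!lock.tryLock(waitMillis, TimeUnit.MILLISECONDS)) {
        return false; // reader is busy, refuse the split rather than block
      }
      try {
        return true; // a real split would happen here
      } finally {
        lock.unlock();
      }
    }
  }

  // Returns true if the split request was declined while the reader held the lock.
  static boolean splitDeclinedWhileBusy() {
    try {
      Reader reader = new Reader(500);
      CountDownLatch started = new CountDownLatch(1);
      Thread worker =
          new Thread(
              () -> {
                try {
                  reader.readNextRecord(started);
                } catch (InterruptedException e) {
                  Thread.currentThread().interrupt();
                }
              });
      worker.start();
      started.await(); // wait until the worker owns the lock
      boolean split = reader.trySplit(50);
      worker.join();
      return !split;
    } catch (InterruptedException e) {
      throw new IllegalStateException(e);
    }
  }

  // Returns true if the split request succeeds when nothing else holds the lock.
  static boolean splitSucceedsWhenIdle() {
    try {
      return new Reader(0).trySplit(50);
    } catch (InterruptedException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("declined while busy: " + splitDeclinedWhileBusy());
    System.out.println("succeeds when idle: " + splitSucceedsWhenIdle());
  }
}
```

With plain `synchronized`, the second thread has no way to give up and simply waits for the monitor, which matches the "Operation ongoing ... for at least 05m00s" symptom when the roles are reversed.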

=====

It also does not seem to be effective on Dataflow runner v2.

Abacn · 2024-06-11T16:15:45Z

A Dataflow runner-side issue was identified and resolved. Closing this as done.

Abacn added the task and awaiting triage labels Mar 15, 2024

Abacn mentioned this issue Mar 15, 2024

[Feature Request]: Design and Implement I/O Connector Throttling components #24743

Open

15 tasks

github-actions bot added the P2 label Mar 15, 2024

liferoad assigned Abacn Mar 15, 2024

github-actions bot removed the awaiting triage label Mar 15, 2024

Abacn mentioned this issue Apr 24, 2024

Fix reporting metrics not supported warning for BigQueryIO Direct read when throttled #31096

Merged

3 tasks

Abacn closed this as completed in #31096 Apr 26, 2024

github-actions bot added this to the 2.57.0 Release milestone Apr 26, 2024

Abacn reopened this Apr 26, 2024

Abacn removed this from the 2.57.0 Release milestone Apr 26, 2024

Abacn mentioned this issue Apr 27, 2024

Suppress BigQuery read stream splitAtFraction when API busy call or timeout #31125

Merged

3 tasks

Abacn closed this as completed Jun 11, 2024

github-actions bot added this to the 2.58.0 Release milestone Jun 11, 2024
