Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BigQueryIO - Can't use DynamicDestination with CREATE_IF_NEEDED for unbounded PCollection and FILE_LOADS #18706

Open
kennknowles opened this issue Jun 3, 2022 · 2 comments

Comments

@kennknowles
Copy link
Member

My workflow : KAFKA -> Dataflow streaming -> BigQuery

Given that having low-latency isn't important in my case, I use FILE_LOADS to reduce the costs. I'm using BigQueryIO.Write with a DynamicDestination, which is a table with the current hour as a suffix.

This BigQueryIO.Write is configured like this :


.withCreateDisposition(CreateDisposition.CREATE_IF_NEEDED)
.withMethod(Method.FILE_LOADS)
.withTriggeringFrequency(triggeringFrequency)
.withNumFileShards(100)

The first table is successfully created and is written to. But then the following tables are never created and I get these exceptions:


(99e5cd8c66414e7a): java.lang.RuntimeException: Failed to create load job with id prefix 5047f71312a94bf3a42ee5d67feede75_5295fbf25e1a7534f85e25dcaa9f4986_00001_00023,
reached max retries: 3, last failed load job: {
  "configuration" : {
    "load" : {
      "createDisposition"
: "CREATE_NEVER",
      "destinationTable" : {
        "datasetId" : "dev_mydataset",
        "projectId"
: "myproject-id",
        "tableId" : "mytable_20180302_16"
      },

The CreateDisposition used is CREATE_NEVER, contrary as CREATE_IF_NEEDED as specified.

Imported from Jira BEAM-3772. Original Jira may contain additional context.
Reported by: benjben.

@ragyabraham
Copy link

@kennknowles any updates on this issue? This is causing problems with us in production

@matthieucham
Copy link

For anyone interested, a workaround is explained here: https://dev.to/stack-labs/tricky-dataflow-ep-1-auto-create-bigquery-tables-in-pipelines-n2k

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants