BigQuery destination runs into 404 error in STANDARD mode (reproduce the error) #19998

Closed
grishick opened this issue Dec 1, 2022 · 3 comments · Fixed by #21144
Assignees
Labels
needs-triage team/destinations Destinations team's backlog type/bug Something isn't working

grishick commented Dec 1, 2022

OnCall issues https://github.com/airbytehq/oncall/issues/1134, https://github.com/airbytehq/oncall/issues/1271

A connection configured to use the STANDARD loading method passed CHECK, but the sync then failed with an error indicating that the dataset either does not exist or is not located in the specified region:

2022-12-01 02:21:54 destination > Selected loading method is set to: STANDARD
2022-12-01 02:21:55 destination > Partitioned table created successfully: GenericData{classInfo=[datasetId, projectId, tableId], {datasetId=public, tableId=_airbyte_tmp_kol_activities}}
2022-12-01 02:21:55 destination > Selected loading method is set to: STANDARD
2022-12-01 02:21:55 destination > Something went wrong in the connector. See the logs for more details.
Stack Trace: com.google.cloud.bigquery.BigQueryException: 404 Not Found
POST https://www.googleapis.com/upload/bigquery/v2/projects/redacted-dataset-name/jobs?uploadType=resumable
{
  "error": {
    "code": 404,
    "message": "Not found: Dataset redacted-dataset-name:redacted-schema-name",
    "errors": [
      {
        "message": "Not found: Dataset redacted-dataset-name:redacted-schema-name",
        "domain": "global",
        "reason": "notFound"
      }
    ],
    "status": "NOT_FOUND"
  }
}

	at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.translate(HttpBigQueryRpc.java:115)
	at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.open(HttpBigQueryRpc.java:655)
	at com.google.cloud.bigquery.TableDataWriteChannel$2.call(TableDataWriteChannel.java:87)
	at com.google.cloud.bigquery.TableDataWriteChannel$2.call(TableDataWriteChannel.java:82)
	at com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:105)
	at com.google.cloud.RetryHelper.run(RetryHelper.java:76)
	at com.google.cloud.RetryHelper.runWithRetries(RetryHelper.java:50)
	at com.google.cloud.bigquery.TableDataWriteChannel.open(TableDataWriteChannel.java:81)
	at com.google.cloud.bigquery.TableDataWriteChannel.<init>(TableDataWriteChannel.java:41)
	at com.google.cloud.bigquery.BigQueryImpl.writer(BigQueryImpl.java:1388)
	at io.airbyte.integrations.destination.bigquery.uploader.BigQueryUploaderFactory.getBigQueryDirectUploader(BigQueryUploaderFactory.java:144)
	at io.airbyte.integrations.destination.bigquery.uploader.BigQueryUploaderFactory.getUploader(BigQueryUploaderFactory.java:66)
	at io.airbyte.integrations.destination.bigquery.BigQueryDestination.putStreamIntoUploaderMap(BigQueryDestination.java:241)
	at io.airbyte.integrations.destination.bigquery.BigQueryDestination.getUploaderMap(BigQueryDestination.java:230)
	at io.airbyte.integrations.destination.bigquery.BigQueryDestination.getStandardRecordConsumer(BigQueryDestination.java:269)
	at io.airbyte.integrations.destination.bigquery.BigQueryDestination.getConsumer(BigQueryDestination.java:201)
	at io.airbyte.integrations.base.IntegrationRunner.runInternal(IntegrationRunner.java:149)
	at io.airbyte.integrations.base.IntegrationRunner.run(IntegrationRunner.java:100)
	at io.airbyte.integrations.destination.bigquery.BigQueryDestination.main(BigQueryDestination.java:327)
Caused by: com.google.api.client.http.HttpResponseException: 404 Not Found

The other strange behavior I noticed in this case is that the connector successfully created a table before attempting to load data. Looking at the destination code, my suspicion is that the connector should specify the dataset location when creating the table in STANDARD mode.
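
For illustration only, here is a minimal sketch of the kind of pre-flight guard this suspicion suggests, assuming the connector can look up a dataset's location before it opens the write channel. `DatasetLocationGuard`, its `check` method, and the in-memory map standing in for the metadata lookup are all hypothetical; none of them exist in the connector or in the BigQuery client.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical pre-flight guard; not part of the Airbyte connector. */
final class DatasetLocationGuard {

    private DatasetLocationGuard() {}

    /**
     * Returns an error message if the dataset is missing or lives in a
     * different location than the configured one; empty if they agree.
     *
     * @param datasetLocations stand-in for a metadata lookup (dataset ID -> location)
     */
    static Optional<String> check(Map<String, String> datasetLocations,
                                  String datasetId,
                                  String configuredLocation) {
        String actual = datasetLocations.get(datasetId);
        if (actual == null) {
            return Optional.of("Dataset " + datasetId + " does not exist");
        }
        if (!actual.equalsIgnoreCase(configuredLocation)) {
            return Optional.of("Dataset " + datasetId + " is in " + actual
                    + " but the destination is configured for " + configuredLocation);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Illustrative values only: one dataset that lives in EU.
        Map<String, String> locations = Map.of("public", "EU");
        System.out.println(check(locations, "public", "US").orElse("ok"));
    }
}
```

A guard like this would turn the opaque 404 into a message that names the mismatch, but whether location is actually the cause still needs to be confirmed.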

grishick added the type/bug (Something isn't working), needs-triage, and team/destinations (Destinations team's backlog) labels on Dec 1, 2022
grishick commented Dec 22, 2022

Notes from grooming: it is not yet clear what is causing this. Dataset location could be the cause, but we need to confirm.
To reproduce, we will need to create a service account and several datasets (likely in different locations).
The estimate covers coming up with a repro scenario.

grishick changed the title from "BigQuery destination runs into 404 error in STANDARD mode" to "BigQuery destination runs into 404 error in STANDARD mode (reproduce the error)" on Dec 22, 2022
etsybaev self-assigned this on Dec 28, 2022
etsybaev commented Jan 4, 2023

I finally managed to reproduce this issue. Basically, it's related to #20561 and #21030

In our case, we check the connection against the provided config (i.e. the provided schema), but the connection then uses "Mirror Source Structure". So the sync tries to write data to a schema that we never checked.

Steps to reproduce this particular issue:

  1. Manually create a custom schema in BigQuery in some location (e.g. EU).
  2. Create a destination connector with a different location (e.g. US) and some schema name.
  3. Create a source connection to a DB whose schema name equals the schema name from step 1.
  4. Create a connection pair (source-destination) with the default "Mirror Source Structure".

Actual result:
All connectors pass the check connection stage, but the sync itself fails.
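
The steps above can be modeled with a toy sketch, assuming (as the 404 suggests, but this is an assumption) that a dataset is only visible to jobs running in the dataset's own location. `MirrorNamespaceRepro` and everything in it is hypothetical: it uses an in-memory map instead of BigQuery, and the dataset names are made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of the reproduction; all names are hypothetical. */
final class MirrorNamespaceRepro {

    /** Dataset ID -> location, standing in for the BigQuery project. */
    private final Map<String, String> datasets = new LinkedHashMap<>();
    private final String configuredDataset;
    private final String configuredLocation;

    MirrorNamespaceRepro(String configuredDataset, String configuredLocation) {
        this.configuredDataset = configuredDataset;
        this.configuredLocation = configuredLocation;
    }

    void createDataset(String id, String location) {
        datasets.put(id, location);
    }

    /** Assumption: a dataset is only found from its own location. */
    private boolean visible(String datasetId) {
        return configuredLocation.equals(datasets.get(datasetId));
    }

    /** CHECK only looks at the dataset named in the destination config. */
    boolean check() {
        return visible(configuredDataset);
    }

    /** With "Mirror Source Structure", the sync targets the source schema instead. */
    boolean sync(String sourceNamespace) {
        return visible(sourceNamespace);
    }

    public static void main(String[] args) {
        MirrorNamespaceRepro repro = new MirrorNamespaceRepro("airbyte_default", "US");
        repro.createDataset("custom_schema", "EU");   // step 1: manual dataset in EU
        repro.createDataset("airbyte_default", "US"); // step 2: configured dataset in US
        System.out.println("check passes: " + repro.check());
        System.out.println("sync succeeds: " + repro.sync("custom_schema")); // steps 3-4
    }
}
```

In this model CHECK passes and the sync does not, because the two stages look at different datasets.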

