Ignore disappeared datasets during listing tables in BigQuery #9954

Merged

Conversation

ebyhr
Member

@ebyhr ebyhr commented Nov 15, 2021

No description provided.

@cla-bot cla-bot bot added the cla-signed label Nov 15, 2021
@ebyhr ebyhr force-pushed the ebi/bigquery-handle-remote-metadata-change branch from cd86096 to 3994e64 Compare November 15, 2021 05:23
Member

@hashhar hashhar left a comment

LGTM % nits

}
catch (BigQueryException e) {
    if (e.getCode() == 404 && e.getMessage().contains("Not found: Dataset")) {
        log.debug("Dataset disappeared during listing operation: %s", remoteSchemaName);
Member

Maybe a comment like in

catch (TableNotFoundException | AccessDeniedException e) {
    // table disappeared during listing operation or user is not allowed to access it
    // these exceptions are ignored because listTableColumns is used for metadata queries (SELECT FROM information_schema)
}
would be helpful here; otherwise swallowing the exception looks "incorrect".
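
For readers following the thread, the skip-on-404 handling in the hunk above can be sketched in isolation. This is only an illustration under stated assumptions: `RemoteException` is a hypothetical stand-in for `BigQueryException`, and `listTables` and the `lister` function are made-up names, not the connector's actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class ListingSketch {
    // Hypothetical stand-in for BigQueryException: an HTTP-style code plus a message.
    public static class RemoteException extends RuntimeException {
        private final int code;

        public RemoteException(int code, String message) {
            super(message);
            this.code = code;
        }

        public int getCode() {
            return code;
        }
    }

    // Lists "schema.table" names, skipping schemas that vanish mid-listing.
    public static List<String> listTables(List<String> schemas, Function<String, List<String>> lister) {
        List<String> result = new ArrayList<>();
        for (String schema : schemas) {
            try {
                for (String table : lister.apply(schema)) {
                    result.add(schema + "." + table);
                }
            }
            catch (RemoteException e) {
                if (e.getCode() == 404 && e.getMessage().contains("Not found: Dataset")) {
                    // dataset disappeared during listing operation;
                    // ignored because listing backs metadata queries
                    continue;
                }
                // any other failure is still surfaced to the caller
                throw e;
            }
        }
        return result;
    }
}
```

Only the specific "dataset not found" case is swallowed; every other error propagates, which is what keeps the narrow catch from hiding real failures.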

Member

BTW this seems to be a pre-existing case in JDBC metadata too. There we ignore only missing tables, not schemas.

Member Author

@ebyhr ebyhr Nov 15, 2021

As far as I could tell when I checked the JDBC module before sending this PR, it doesn't have the same issue. Can you point to the exact place where the issue occurs? The condition in the DefaultJdbcMetadata snippet is a little misleading (the downstream method won't throw TableNotFoundException), but the listing logic as a whole looks correct.

Member

It throws JDBC_ERROR wrapping the original exception that was encountered instead of ignoring failures during listing.

See BaseJdbcClient#getTableNames -

catch (SQLException e) {
    throw new TrinoException(JDBC_ERROR, e);
}
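
To make the contrast with the BigQuery change concrete, here is a rough sketch of the wrap-and-rethrow behavior described above. The names are assumptions for illustration only: `ConnectorException` stands in for `TrinoException`, the string `"JDBC_ERROR"` stands in for the error code constant, and `MetadataCall` is a hypothetical functional interface, not part of `BaseJdbcClient`.

```java
import java.sql.SQLException;
import java.util.List;

public class WrapSketch {
    // Hypothetical stand-in for TrinoException(JDBC_ERROR, e).
    public static class ConnectorException extends RuntimeException {
        public ConnectorException(String errorCode, Throwable cause) {
            super(errorCode, cause);
        }
    }

    // Hypothetical wrapper for a metadata call that may throw SQLException.
    public interface MetadataCall {
        List<String> run() throws SQLException;
    }

    // Any SQLException raised while listing is wrapped and rethrown, not ignored.
    public static List<String> getTableNames(MetadataCall call) {
        try {
            return call.run();
        }
        catch (SQLException e) {
            throw new ConnectorException("JDBC_ERROR", e);
        }
    }
}
```

With this shape, a driver that fails inside the metadata call turns the whole listing into an error, whereas the BigQuery change skips only the one dataset that vanished.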

Member Author

@ebyhr ebyhr Nov 15, 2021

The getTables() call in the try block invokes DatabaseMetaData.getTables(), so as I understand it, it won't throw an exception even if the schema doesn't exist. IdentifierMapping might cause a failure, though.

Member

Sorry if that was unclear. I meant that the existing JDBC code can fail if DatabaseMetaData#getTables fails, which we know can happen with MemSQL, for example. It's purely theoretical, but it can happen IIUC?

// filter ambiguous tables
boolean isAmbiguous = bigQueryClient.toRemoteTable(projectId, remoteSchemaName, table.getTableId().getTable().toLowerCase(ENGLISH), tables)
        .filter(RemoteDatabaseObject::isAmbiguous)
        .isPresent();
Member

Maybe we can use ifPresentOrElse here to avoid assignment and the if...else? Would it look better?
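
As a rough illustration of that suggestion, the sketch below uses `Optional.ifPresentOrElse` (available since Java 9) in place of a boolean plus `if...else`. It assumes the two branches are "skip the ambiguous table" and "keep the table"; `RemoteObject`, `collect`, `kept`, and `skipped` are hypothetical names and not the PR's actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

public class AmbiguousFilterSketch {
    // Hypothetical stand-in for RemoteDatabaseObject.
    public static class RemoteObject {
        private final boolean ambiguous;

        public RemoteObject(boolean ambiguous) {
            this.ambiguous = ambiguous;
        }

        public boolean isAmbiguous() {
            return ambiguous;
        }
    }

    // Routes a table name to `skipped` when its remote object is ambiguous,
    // and to `kept` otherwise (including when no remote object was found).
    public static void collect(String tableName, Optional<RemoteObject> remote, List<String> kept, List<String> skipped) {
        remote.filter(RemoteObject::isAmbiguous)
                .ifPresentOrElse(
                        ambiguous -> skipped.add(tableName),
                        () -> kept.add(tableName));
    }
}
```

Whether this reads better than the explicit boolean is a matter of taste; the chained form avoids the temporary variable but puts both branches inside lambdas.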

@ebyhr ebyhr force-pushed the ebi/bigquery-handle-remote-metadata-change branch 2 times, most recently from a887d19 to bd3ae8d Compare November 15, 2021 07:46
@ebyhr ebyhr force-pushed the ebi/bigquery-handle-remote-metadata-change branch from bd3ae8d to 2768b0d Compare November 15, 2021 07:58
@ebyhr ebyhr merged commit a0d3dc2 into trinodb:master Nov 15, 2021
@ebyhr ebyhr deleted the ebi/bigquery-handle-remote-metadata-change branch November 15, 2021 09:40
@github-actions github-actions bot added this to the 365 milestone Nov 15, 2021
@ebyhr ebyhr mentioned this pull request Nov 23, 2021
12 tasks
2 participants