-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Label BigQuery jobs with Trino query id #16187
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Second commit lgtm
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/BigQueryClient.java
Outdated
Show resolved
Hide resolved
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/BigQueryClient.java
Outdated
Show resolved
Hide resolved
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/BigQueryClient.java
Outdated
Show resolved
Hide resolved
06045c6
to
915be3f
Compare
@ebyhr can you run BQ tests? |
/test-with-secrets sha=915be3f3e808f27aa0c2b684f342147a5a9a7173 |
The CI workflow run with tests that require additional secrets finished as failure: https://github.com/trinodb/trino/actions/runs/4233150724 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The commit order looks a little weird. For instance, 1st commit adds ConnectorSession
to BigQueryClient.query
, and 2nd commit removes it. I'm fine with squashing commits into one.
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/BigQueryMetadata.java
Outdated
Show resolved
Hide resolved
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/BigQueryQueryPageSource.java
Show resolved
Hide resolved
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/BigQueryPageSourceProvider.java
Show resolved
Hide resolved
915be3f
to
ebe9ec2
Compare
ebe9ec2
to
9055938
Compare
@ebyhr PTAL |
9055938
to
7c1ee90
Compare
/test-with-secrets sha=7c1ee905b13cecf03ab56cf27b5405f0b53f4a21 |
The CI workflow run with tests that require additional secrets finished as failure: https://github.com/trinodb/trino/actions/runs/4912226543 |
7c1ee90
to
32618b7
Compare
@ebyhr ptal. I needed to revert to the previous approach. Please see a rationale in the commit message why passing |
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/BigQueryClient.java
Show resolved
Hide resolved
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/BigQueryClient.java
Show resolved
Hide resolved
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/BigQueryClientFactory.java
Outdated
Show resolved
Hide resolved
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/BigQueryClientFactory.java
Outdated
Show resolved
Hide resolved
plugin/trino-bigquery/src/test/java/io/trino/plugin/bigquery/BaseBigQueryConnectorTest.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to have a test for it? To ask BQ what labels were set when Trino did a query?
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/BigQueryClient.java
Outdated
Show resolved
Hide resolved
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/BigQueryClient.java
Outdated
Show resolved
Hide resolved
@kokosing it's not possible to have this test with a current set of privileges that the credential key has. |
Also I don't think it really needs elevated permissions, I see and can query from information schema https://cloud.google.com/bigquery/docs/viewing-labels |
81ff158
to
1fb60a5
Compare
The another commit was pushed since the last review
1fb60a5
to
00b04af
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM % format-interpolation feature
|
||
@Config("bigquery.job.label-format") | ||
@ConfigDescription("Adds `trino_query` label to the BigQuery job with provided value format") | ||
public BigQueryConfig setQueryLabelFormat(String queryLabelFormat) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while it's useful to use common format across connectors for BigQuery it's probably less powerful in the way we use it since all information gets stuffed into single label in BigQuery - I feel this is prematurely adding a feature which would make it harder to evolve over time as people request changes.
What do you think @kokosing @ebyhr - should we not add it at the moment and just label the query id for now?
Alternatively we can convert the config into a boolean toggle and if enabled add one label per each pre-defined placeholder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should allow to define the label key. So people may organize it the way the like.
I am under impression that the current setup make is very customizable so one can use just query id for labelling. The other can format the label as JSON and parse it later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Test | ||
public void testQueryLabel() | ||
{ | ||
String materializedView = "test_query_label" + randomNameSuffix(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a specific reason to use an MV? Add a comment or simplify.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, labels are added to jobs and jobs are used only when querying non-storage-backed relations (i.e. views)
/test-with-secrets sha=00b04af367fe3febd59ae472a4f9fa5447f87652 |
plugin/trino-bigquery/src/test/java/io/trino/plugin/bigquery/BaseBigQueryConnectorTest.java
Show resolved
Hide resolved
lib/trino-plugin-toolkit/src/main/java/io/trino/plugin/base/logging/FormatInterpolator.java
Outdated
Show resolved
Hide resolved
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/BigQueryClient.java
Outdated
Show resolved
Hide resolved
|
||
@Config("bigquery.job.label-format") | ||
@ConfigDescription("Adds `trino_query` label to the BigQuery job with provided value format") | ||
public BigQueryConfig setQueryLabelFormat(String queryLabelFormat) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should allow to define the label key. So people may organize it the way the like.
I am under impression that the current setup make is very customizable so one can use just query id for labelling. The other can format the label as JSON and parse it later.
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/BigQueryConfig.java
Outdated
Show resolved
Hide resolved
plugin/trino-bigquery/src/main/java/io/trino/plugin/bigquery/BigQueryConnectorModule.java
Outdated
Show resolved
Hide resolved
plugin/trino-bigquery/src/test/java/io/trino/plugin/bigquery/BaseBigQueryConnectorTest.java
Outdated
Show resolved
Hide resolved
@kokosing @hashhar airlift/airlift#1066 this would allow this feature to be generic i.e. defining labels as:
Asking the user to input JSON and parsing it is adding a complexity here, not hiding it from a user. |
The CI workflow run with tests that require additional secrets finished as failure: https://github.com/trinodb/trino/actions/runs/4926624071 |
00b04af
to
a68ab2e
Compare
ConnectorSession needs to be passed to query/update methods because BigQueryClient is cached using identityCacheMapping.getRemoteUserCacheKey() which is not taking into account session properties. We need also to access queryId in order to properly label queries but we don't want to cache client per query id.
a68ab2e
to
56af278
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Eventually we should plan to have multiple labels instead of adding all information in a single label.
/test-with-secrets sha=56af27891d3001a0358bfd5b1bb4e3f76dee4c6c |
The CI workflow run with tests that require additional secrets finished as failure: https://github.com/trinodb/trino/actions/runs/4988294291 |
I would prefer to document this as it is today. |
Description
It makes it easier to track the origin of BQ DDL and DML queries.
Release notes
(x) This is not user-visible or docs only and no release notes are required.