Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't return query results from BigQuery DML/DDL statements #2141

Closed
nfallen opened this issue Jun 19, 2018 · 4 comments · Fixed by #2530
Closed

Don't return query results from BigQuery DML/DDL statements #2141

nfallen opened this issue Jun 19, 2018 · 4 comments · Fixed by #2530
Assignees
Labels
api: bigquery Issues related to the BigQuery API. type: question Request for information or clarification. Not an issue.

Comments

@nfallen
Copy link

nfallen commented Jun 19, 2018

This client library uses tabledata.list to fetch query results, and so even for a DML/DDL statement it fetches rows. Instead, you could use the jobs.getQueryResults API, which correctly does not return any rows for DML/DDL.

@JustinBeckwith JustinBeckwith added the triage me I really want to be triaged. label Jun 19, 2018
@quartzmo quartzmo self-assigned this Jun 19, 2018
@quartzmo quartzmo added type: question Request for information or clarification. Not an issue. api: bigquery Issues related to the BigQuery API. labels Jun 19, 2018
@JustinBeckwith JustinBeckwith removed the triage me I really want to be triaged. label Jun 19, 2018
@quartzmo
Copy link
Member

@nfallen Thank you for bringing up this issue. This library originally used getQueryResults, but was changed last year to use tabledata.list for better performance. I'm not sure if the performance difference still exists, however. I'll see what I can find out.

How many rows are you seeing?

@nfallen
Copy link
Author

nfallen commented Jun 20, 2018

If you want to continue using tabledata.list for performance reasons you could use the presence of the field statistics.query.numDmlAffectedRows in the jobs.get response to indicate that the query was DML, and the presence of statistics.query.ddlTargetTable to indicate that the query was DDL.

Since there is no maxResults value provided by default the call to tabledata.list will attempt to fetch all rows, and will only be limited by the size of the table or the maximum response size of tabledata.list.

@quartzmo
Copy link
Member

Thanks @nfallen, this is very helpful!

@quartzmo
Copy link
Member

@nfallen I believe the library actually provides a way to execute DML/DDL without fetching rows, as demonstrated in dataset_ddl_test.rb.

There is no test for DML operations, but I believe it would work similarly. The library does not currently expose statistics.query.numDmlAffectedRows, but I can add it, along with the test.

Does this sound OK to you?

quartzmo added a commit to quartzmo/google-cloud-ruby that referenced this issue Oct 18, 2018
quartzmo added a commit that referenced this issue Oct 24, 2018
* Add #num_dml_affected_rows to QueryJob
* Update QueryJob#data to conditionally return empty Data
* Add DDL/DML attrs to Data
* Add DDL/DML examples to #query_job and #query

[closes #2141]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigquery Issues related to the BigQuery API. type: question Request for information or clarification. Not an issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants