Add maximum_bytes_billed option for BigQuery #2346

Closed
haukeduden opened this issue Apr 21, 2020 · 3 comments · Fixed by #2427
Labels
bigquery enhancement New feature or request

Comments

@haukeduden
Contributor

Describe the feature

BigQuery has the query option maximum_bytes_billed, which caps the number of bytes billed for an individual query and thereby its maximum cost. Any query that would exceed the limit fails automatically. It would be nice to be able to use this option in dbt to guard against potentially expensive mistakes.

I suggest adding this option in the profile settings, where one can already set the BigQuery query priority and other BigQuery specific settings.

Let me know if you would like to get a pull request for this. It is a small change and I'd be happy to implement it.

Describe alternatives you've considered

I have looked for ways to set this option via environment variables or ~/.bigqueryrc. The option can be set in ~/.bigqueryrc, but that only affects the BigQuery command-line tools, not the BigQuery client SDK, so it does not help for dbt.

Additional context

This is BigQuery specific.

Who will this benefit?

Consider a table with a million rows that you join with another table with a million rows. You mess up the join condition and accidentally create a full cross join. Since BigQuery has huge scaling capabilities, it will happily create the resulting table with 10^12 rows. Any scan of that table will be hugely expensive. This option would protect against such mistakes.
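
To put rough numbers on that scenario: a million rows times a million rows is 10^12 rows. The sketch below works through the arithmetic and compares the resulting scan size with a byte cap. The 100 bytes per row and the 1 TB cap are made-up assumptions for illustration, the helper names are hypothetical, and the comparison only stands in for the check BigQuery performs itself.

```python
def cross_join_rows(left_rows: int, right_rows: int) -> int:
    """Row count of a full cross join: every left row pairs with every right row."""
    return left_rows * right_rows


def estimated_scan_bytes(rows: int, bytes_per_row: int) -> int:
    """Very rough size of a full table scan (bytes_per_row is an assumption)."""
    return rows * bytes_per_row


def would_exceed(scan_bytes: int, maximum_bytes_billed: int) -> bool:
    """True if a scan of this size is larger than the configured cap."""
    return scan_bytes > maximum_bytes_billed


rows = cross_join_rows(10**6, 10**6)          # 10**12 rows
scan = estimated_scan_bytes(rows, 100)        # 10**14 bytes, assuming 100 bytes per row
limit = 10**12                                # hypothetical cap of 10**12 bytes (1 TB)
print(rows, scan, would_exceed(scan, limit))
```

Under these assumptions a full scan of the accidental table is about a hundred times larger than the cap, so a query with such a limit would be rejected rather than billed.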

@haukeduden haukeduden added enhancement New feature or request triage labels Apr 21, 2020
@drewbanin drewbanin removed the triage label Apr 21, 2020
@drewbanin
Contributor

hey @haukeduden - cool idea! Sure, please feel free to send through a PR for this change.

I think maximum_bytes_billed would be a good profile config name. If the value is not provided, or if it is 0, then dbt should not configure a value for maximum_bytes_billed in a query.
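
A minimal sketch of that behaviour, not dbt's actual implementation: the function name `build_query_job_kwargs` and the shape of the `profile` dict are hypothetical. It returns the keyword arguments a query job configuration could be built from, and leaves `maximum_bytes_billed` out entirely when the profile value is missing or 0.

```python
from typing import Any, Dict


def build_query_job_kwargs(profile: Dict[str, Any]) -> Dict[str, Any]:
    """Collect query job options from a (hypothetical) profile dict."""
    kwargs: Dict[str, Any] = {}

    # Treat a missing value, None, and 0 the same way: no limit configured.
    max_bytes = int(profile.get("maximum_bytes_billed") or 0)
    if max_bytes > 0:
        kwargs["maximum_bytes_billed"] = max_bytes

    return kwargs


print(build_query_job_kwargs({}))                                # no limit set
print(build_query_job_kwargs({"maximum_bytes_billed": 0}))       # no limit set
print(build_query_job_kwargs({"maximum_bytes_billed": 10**9}))   # limit of 10**9 bytes
```

With the google-cloud-bigquery client, the resulting dict could then be passed on as `QueryJobConfig(**kwargs)`, so that the option is only set on the job when a positive limit was given.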

Thanks for opening this issue!

@haukeduden
Contributor Author

@drewbanin Which branch should I use as the base for the PR? It seems that master is quite outdated ... is feature/dbt-project-v2 the right starting point maybe?

@beckjake
Contributor

@haukeduden dev/octavius-catto should be the "default branch" on github - please branch off of that!
