Describe the feature
BigQuery has the query option maximum_bytes_billed, which caps the number of bytes an individual query may be billed for, and therefore its cost. Any query that would exceed this limit fails automatically. It would be nice if one could use this feature in dbt to guard against potentially expensive mistakes.
I suggest adding this option in the profile settings, where one can already set the BigQuery query priority and other BigQuery specific settings.
Let me know if you would like to get a pull request for this. It is a small change and I'd be happy to implement it.
Describe alternatives you've considered
I have looked into setting this option via environment variables or ~/.bigqueryrc. While the option can be set in ~/.bigqueryrc, that file only affects the BigQuery command-line tools, not the BigQuery client SDK, so it does not help for dbt.
Additional context
This is BigQuery specific.
Who will this benefit?
Consider a table with a million rows that you join with another table with a million rows. You mess up the join condition and accidentally create a full cross join. Since BigQuery has huge scaling capabilities, it will happily create the resulting table with 10^12 rows. Any scan of that table will be hugely expensive. This option would protect against such mistakes.
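To put rough numbers on that scenario, here is a back-of-the-envelope sketch. The 100-byte average row width is a made-up assumption for illustration, not a measurement, and the helper names are hypothetical.

```python
# Rough size of an accidental full cross join between two tables.
ROWS_LEFT = 1_000_000
ROWS_RIGHT = 1_000_000
ASSUMED_BYTES_PER_ROW = 100  # hypothetical average row width


def cross_join_rows(left: int, right: int) -> int:
    """A full cross join pairs every left row with every right row."""
    return left * right


def scan_bytes(rows: int, bytes_per_row: int) -> int:
    """Bytes read by one full scan, assuming a fixed row width."""
    return rows * bytes_per_row


rows = cross_join_rows(ROWS_LEFT, ROWS_RIGHT)
total_bytes = scan_bytes(rows, ASSUMED_BYTES_PER_ROW)
print(f"{rows:,} rows, roughly {total_bytes / 10**12:.0f} TB per full scan")
```

Under that assumption, a maximum_bytes_billed limit set anywhere below the full-scan size would reject a query that scans the whole table.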
hey @haukeduden - cool idea! Sure, please feel free to send through a PR for this change.
I think maximum_bytes_billed would be a good profile config name. If the value is not provided, or if it is 0, then dbt should not configure a value for maximum_bytes_billed in a query.
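A minimal sketch of that rule, assuming the profile is available as a plain dict. The function name job_config_kwargs and the dict shape are hypothetical and not taken from dbt's code base.

```python
from typing import Any, Dict


def job_config_kwargs(profile: Dict[str, Any]) -> Dict[str, Any]:
    """Build query job settings from a profile (hypothetical helper).

    maximum_bytes_billed is only included when the profile gives a
    positive value; a missing value or 0 means "do not set a limit".
    """
    kwargs: Dict[str, Any] = {}
    raw_limit = profile.get("maximum_bytes_billed")
    if raw_limit is not None:
        limit = int(raw_limit)  # tolerate values read as strings
        if limit > 0:
            kwargs["maximum_bytes_billed"] = limit
    return kwargs


print(job_config_kwargs({"priority": "interactive"}))
print(job_config_kwargs({"maximum_bytes_billed": 0}))
print(job_config_kwargs({"maximum_bytes_billed": 10**9}))
```

The resulting dict is meant to be handed to whatever builds the query job configuration; in the google-cloud-bigquery client, QueryJobConfig exposes a maximum_bytes_billed property for this purpose.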
@drewbanin Which branch should I use as the base for the PR? It seems that master is quite outdated ... is feature/dbt-project-v2 the right starting point maybe?