Track BigQuery costs of GVS python VS-480 #7915
Conversation
Codecov Report

@@            Coverage Diff             @@
##           ah_var_store    #7915   +/-   ##
================================================
  Coverage               ?   86.288%
  Complexity             ?     35194
================================================
  Files                  ?      2170
  Lines                  ?    164888
  Branches               ?     17785
================================================
  Hits                   ?    142278
  Misses                 ?     16288
  Partials               ?      6322
@@ -60,7 +61,7 @@ workflow GvsUnified {

     File interval_weights_bed = "gs://broad-public-datasets/gvs/weights/gvs_vet_weights_1kb.bed"

-    String extract_output_file_base_name = filter_set_name
+    String extract_output_file_base_name = sub(filter_set_name, " ", "-")
are there any other characters we should be concerned about?
at one point we were worried about underscores, but I think we ruled that out?
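On the "other characters" question above: WDL's sub() takes a regular expression as its pattern, so the spaces-only replacement could be widened to a whitelist if other characters ever turn out to matter. A minimal sketch of that idea, written in Python for illustration only — the function name and the character class are assumptions, not code from this PR:

    import re

    # Hypothetical whitelist sanitizer: replace anything outside a
    # filename-safe set with a hyphen, instead of replacing spaces only.
    def sanitize_base_name(filter_set_name: str) -> str:
        return re.sub(r"[^A-Za-z0-9_.-]", "-", filter_set_name)

    print(sanitize_base_name("my filter set"))    # my-filter-set
    print(sanitize_base_name("under_scores_ok"))  # underscores pass through unchanged

Note that underscores are inside the whitelist here, consistent with the reply above that they were ruled out as a concern.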
    # populate cost_observability data
    sql = f"""INSERT INTO `{fq_dataset}.cost_observability`
        (call_set_identifier, step, call, shard_identifier, event_key, call_start_timestamp, event_timestamp, event_bytes)
just realizing now that this is writing bytes and the datatype for event_bytes is INTEGER... happily BQ INTEGERs are unusually large at +/- 8 EiB; if we overflow that we have bigger problems
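For concreteness, the arithmetic behind the "+/- 8 EiB" remark (a quick check, not code from the PR): BigQuery's INTEGER is a signed 64-bit type, and 2**63 bytes is exactly 8 EiB.

    # BigQuery INTEGER (INT64) is signed 64-bit: max value is 2**63 - 1.
    int64_max = 2**63 - 1
    eib = 2**60                 # one exbibyte, in bytes
    print(int64_max / eib)      # ~8.0 -> event_bytes can hold roughly +/- 8 EiB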
    # populate cost_observability data
    sql = f"""INSERT INTO `{fq_dataset}.cost_observability`
        (call_set_identifier, step, call, shard_identifier, event_key, call_start_timestamp, event_timestamp, event_bytes)
        VALUES('{call_set_identifier}', '{step}', '{call}', '{shard_identifier}', 'BigQuery Query Billed',
is this supposed to be 'BigQuery Query Scanned' like in George's PR?
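A side note on the quoted snippet: it builds the INSERT by f-string interpolation, which relies on none of the values containing quote characters. A minimal sketch of the same insert using query parameters with the google-cloud-bigquery client — the helper name is hypothetical, and table identifiers cannot be parameterized, so fq_dataset stays in the f-string:

    from google.cloud import bigquery

    def insert_cost_observability_row(client, fq_dataset, call_set_identifier, step,
                                      call, shard_identifier, event_key,
                                      call_start_timestamp, event_timestamp, event_bytes):
        # Only values are parameterized; the table name is still interpolated.
        sql = f"""INSERT INTO `{fq_dataset}.cost_observability`
            (call_set_identifier, step, call, shard_identifier, event_key,
             call_start_timestamp, event_timestamp, event_bytes)
            VALUES (@call_set_identifier, @step, @call, @shard_identifier,
                    @event_key, @call_start_timestamp, @event_timestamp, @event_bytes)"""
        job_config = bigquery.QueryJobConfig(query_parameters=[
            bigquery.ScalarQueryParameter("call_set_identifier", "STRING", call_set_identifier),
            bigquery.ScalarQueryParameter("step", "STRING", step),
            bigquery.ScalarQueryParameter("call", "STRING", call),
            bigquery.ScalarQueryParameter("shard_identifier", "STRING", shard_identifier),
            bigquery.ScalarQueryParameter("event_key", "STRING", event_key),
            bigquery.ScalarQueryParameter("call_start_timestamp", "TIMESTAMP", call_start_timestamp),
            bigquery.ScalarQueryParameter("event_timestamp", "TIMESTAMP", event_timestamp),
            bigquery.ScalarQueryParameter("event_bytes", "INT64", event_bytes),  # INTEGER column
        ])
        client.query(sql, job_config=job_config).result()  # block until the DML completes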
Integration test run: https://app.terra.bio/#workspaces/gvs-dev/GVS%20Integration/job_history/30a2d8ee-13dd-4829-b3a8-4e6a67409705
Closes https://broadworkbench.atlassian.net/browse/VS-480