[ADAP-1016] [Regression] [1.7] python models don't work #1006
Comments
I can confirm this is broken for me running dbt-bigquery with Dataproc for both
I have reverted to an earlier working version of |
Hi! I don't think #1014 completely fixed running python models on Dataproc, by the way. The first batch job can work, but it seems like I'm still having issues running on I believe this error could be due to the way PR #1014 is written: it tags the batch name with |
keeping this open till we've verified it fixes things |
Is there any way that I can temporarily fix this issue by manually specifying the batch_id in the dbt config for now? That way I could just generate the id with like |
@tanghyd
models:
  - name: model_name
    config:
      batch_id: |
        {{ run_started_at.strftime("%Y-%m-%d-%H-%M") }}-modelname-{{ range(0,10000) | random }}
|
Hello! Thank you for your suggestion, but unfortunately that does not reliably work for repeat runs. Sometimes two subsequent calls of the same python model do not generate two different It seems like the As an alternative, I tried assigning this jinja string to the dbt config inside the python file instead of the schema.yml file as follows:
However that also fails with the following error: |
OK, I've done some further testing here. For subsequent runs after an initially successful first run, I've found two possible scenarios:
I've also tried with the following config and the same issue comes up (a duplicate
|
Python models are still broken in v1.7.3 due to ADAP-1063. |
TL;DR: Two subsequent runs of the same python model continue to fail on the second attempt, because the model config's batch_id does not change and Dataproc requires a unique batch_id for each job submission. Python models are still broken in v1.7.3 for the same reason described in my comments above, despite the recently released change introduced in #1020, in dbt/adapters/bigquery/python_submissions.py on lines 128 to 130, as follows:
I tried two sequential runs with a python model running As hypothesised above, if a model has not changed in between runs, the See below for example commands and results where a subsequent run has an identical batch job id (details obfuscated for privacy reasons):
|
✅ FYI for completeness, my error described above has been resolved after updating to the latest dbt-bigquery and not writing the following in the model yaml config (likely missed after #1020)
|
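For anyone landing here with the duplicate batch ID failure: as described above, a second submission that reuses an existing batch ID fails, so the ID has to vary per run. Below is a minimal Python sketch of one way to build such an ID outside dbt. `make_batch_id` is a hypothetical helper, not part of dbt-bigquery, and the character and length limits in the comments are my assumption about Dataproc's rules, so please check them against the Dataproc docs.

```python
import re
import uuid
from datetime import datetime, timezone
from typing import Optional


def make_batch_id(model_name: str, now: Optional[datetime] = None) -> str:
    """Build a batch ID that differs on every call (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    # Assumption: IDs may only contain lowercase letters, digits and hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", model_name.lower()).strip("-") or "model"
    stamp = now.strftime("%Y%m%d-%H%M%S")
    # Random suffix so two runs within the same second still differ.
    suffix = uuid.uuid4().hex[:8]
    # Assumption: IDs are limited to 63 characters, so trim the slug to fit.
    max_slug = 63 - len(stamp) - len(suffix) - 2
    slug = slug[:max_slug].strip("-")
    return f"{slug}-{stamp}-{suffix}"


if __name__ == "__main__":
    print(make_batch_id("my_python_model"))
    print(make_batch_id("my_python_model"))
```

Calling it twice with the same model name gives two different IDs because of the random suffix, which is the property a static `batch_id` in the model config lacks.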
Is this a regression in a recent version of dbt-bigquery?
Current Behavior
all Python models fail to compile with the following compilation error.
sequence item 2: expected str instance, NoneType found
thread from #db-bigquery community Slack
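For context on the message itself: this is the `TypeError` that Python's `str.join` raises when one of the items is `None`. The sketch below reproduces it in isolation. `build_label` and `build_label_checked` are hypothetical names for illustration only, not the adapter's code, and I'm only assuming that something similar happens when a name fragment is unset.

```python
def build_label(parts):
    """Join name fragments with hyphens (hypothetical helper)."""
    return "-".join(parts)


def build_label_checked(parts):
    """Same join, but report which fragment is missing before joining."""
    for index, part in enumerate(parts):
        if part is None:
            raise ValueError(f"fragment {index} is None; set it before joining")
    return "-".join(parts)


try:
    # The third fragment (index 2) is None, as in the reported error.
    build_label(["dbt", "model", None])
except TypeError as exc:
    message = str(exc)
    print(message)  # sequence item 2: expected str instance, NoneType found
```

The "item 2" in the message is the zero-based position of the `None` value in the sequence being joined, which can help narrow down which value was never set.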
Expected/Previous Behavior
the model should run
Steps To Reproduce
dbt seed
dbt run -s +thing
weirdly
dbt compile -s thing
works without issue
Relevant log output
No response
Environment
Additional Context
perhaps related to #681?