[CT-972] [Feature] Block usage of submit_python_job outside of materialization logic #5596
Comments
@jtcohen6 neither approach feels easy to me. Any ideas?
@ChenyuLInx to spike
Talked about this live, the manipulating
@jtcohen6 @lostmygithubaccount I think I found a way to modify the context so that we restrict where this can be called (not saying I like it). See dbt-core/core/dbt/clients/jinja.py, lines 301 to 321 in a1ee348.
The trace_call function there is called for each macro call and all of its sub-macro calls. So we could do something like this: when a macro is called, we figure out from its name whether it is a materialization macro. If it is, we stash the submit_python_job function somewhere and replace it with a no-op or a function that raises an error; then, when we reach the statement macro, we put it back. This way submit_python_job is only usable in materialization -> statement. It is still not a total block, but it is better.
We can't restrict it to only the statement named main, since we sometimes call it when creating the tmp table; otherwise that would be the way to allow only one Python job submission per model. We could do something even funkier, such as adding the language to the results of the statement macro and running some checks afterwards.
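To make the swap-and-restore idea concrete, here is a rough standalone sketch. It is not dbt-core code: the context is modelled as a plain dict, SubmitPythonJobGuard and track_call are made-up names, and recognizing materialization macros by a "materialization_" name prefix is an assumption. It is also a bit stricter than what I described above, because it blocks the function by default and only restores it while statement is running inside a materialization.

```python
from contextlib import contextmanager

# Assumption: materialization macros can be recognized by this name prefix.
MATERIALIZATION_PREFIX = "materialization_"


class SubmitPythonJobGuard:
    """Hypothetical helper, not part of dbt-core."""

    def __init__(self, context, key="submit_python_job"):
        self.context = context
        self.key = key
        self._real = context[key]  # stash the real function
        self._stack = []           # names of the macros currently executing
        context[key] = self._blocked

    def _blocked(self, *args, **kwargs):
        raise RuntimeError(
            "submit_python_job may only be called from materialization -> statement"
        )

    def _allowed(self):
        # Allowed only when the innermost macro is `statement` and some
        # enclosing macro is a materialization.
        if not self._stack or self._stack[-1] != "statement":
            return False
        return any(
            name.startswith(MATERIALIZATION_PREFIX) for name in self._stack[:-1]
        )

    def _refresh(self):
        self.context[self.key] = self._real if self._allowed() else self._blocked

    @contextmanager
    def track_call(self, macro_name):
        # Meant to wrap each macro call, in the spirit of trace_call.
        self._stack.append(macro_name)
        self._refresh()
        try:
            yield
        finally:
            self._stack.pop()
            self._refresh()
```

With this, a call made from my_macro -> statement raises, while materialization_table_default -> statement goes through, and the function is blocked again as soon as statement returns.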
So the situation now is: we can limit this with a less-than-ideal method, and the more restrictions we want, the less ideal the implementation is going to be. How far do we want to go?
@ChenyuLInx Thanks for the investigation!
I'd be happy to proceed with this approach (not a complete block, but a helpful guardrail), so long as it doesn't add too much cruft to this tightly wound part of the codebase. If it seems like a ton of work, we'd need to weigh it against other priorities during the beta period. Tracking calls between macros feels like a thread we might want to pull on more in the future, e.g. to understand which macros are "dirty" / volatile and depend upon introspective queries for their results.
Is this your first time submitting a feature request?
Describe the feature
Throw a clear error during parsing (or later, during dbt run) if adapter.submit_python_job is used anywhere other than the materialization logic.
Relevant information:
adapter.submit_python_job is currently called in the statement macro.
Ideas:
- For every macro outside materialization -> statement, run a regex on it to make sure adapter.submit_python_job is not there. This could be very costly.
- Only expose adapter.submit_python_job in the desired situation. This means we would have to remove it from the context at some point during Jinja compilation, which could be complex.
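The regex idea could look roughly like the sketch below. Everything here is illustrative: find_disallowed_calls is a made-up helper, macros are assumed to be available as a plain name-to-source mapping, and the allow-list assumes the only legitimate call site is the statement macro.

```python
import re

# Matches `adapter.submit_python_job`, tolerating whitespace around the dot.
SUBMIT_CALL = re.compile(r"\badapter\s*\.\s*submit_python_job\b")


def find_disallowed_calls(macro_sources, allowed=("statement",)):
    """Return the sorted names of macros that reference the call but are not allowed to.

    macro_sources: mapping of macro name -> raw Jinja source (hypothetical input shape).
    """
    return sorted(
        name
        for name, source in macro_sources.items()
        if name not in allowed and SUBMIT_CALL.search(source)
    )


macros = {
    "statement": "{% set res = adapter.submit_python_job(model, code) %}",
    "my_macro": "{{ adapter.submit_python_job(model, code) }}",
    "other_macro": "select 1",
}
offenders = find_disallowed_calls(macros)
```

A text scan like this only catches direct references. It would miss indirect ones, for example a macro that first assigns adapter to another variable, which is one more reason it is not a complete block.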