feat(bigquery): add create job method #32
I have a few questions about handling the job_config input argument.
Another is a general question about how to keep this factory method in sync with the *Job classes - I presume that a job type being added or removed should be a very rare occurrence?
Edit:
One more thing - the ticket description contains three checklist items, while this PR only implements the first of them - is there anything else that will be added here?
If not, let's change the PR issue link in the description to "Towards #14" or something similar, so that merging the PR will not close the issue just yet.
google/cloud/bigquery/client.py
Outdated
job_config
)
destination = TableReference.from_api_repr(
job_config["load"]["destinationTable"]
Just a thought - what if the input dict is incorrectly structured? The user might then face rather uninformative KeyErrors.
It seems that it would make sense to use _get_sub_prop() in all potentially risky cases?
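To illustrate the concern, here is a minimal sketch of a tolerant nested lookup that turns a missing key into a descriptive error. The functions `get_sub_prop` and `require_sub_prop` are hypothetical stand-ins written for this example; they are not the library's private `_get_sub_prop()` helper, and the sample config is made up.

```python
from collections.abc import Mapping
from typing import Any, Sequence


def get_sub_prop(container: Any, keys: Sequence[str], default: Any = None) -> Any:
    """Walk nested mappings along `keys`; return `default` if a key is missing."""
    current = container
    for key in keys:
        if not isinstance(current, Mapping) or key not in current:
            return default
        current = current[key]
    return current


def require_sub_prop(container: Any, keys: Sequence[str]) -> Any:
    """Like get_sub_prop, but raise an error that names the missing path."""
    sentinel = object()
    value = get_sub_prop(container, keys, default=sentinel)
    if value is sentinel:
        raise ValueError("job_config is missing required field: " + ".".join(keys))
    return value


# Hypothetical, incorrectly structured input: "destinationTable" is absent.
job_config = {"load": {"sourceUris": ["gs://bucket/file.csv"]}}

try:
    require_sub_prop(job_config, ["load", "destinationTable"])
except ValueError as exc:
    error_message = str(exc)
```

With plain indexing the user would only see the name of the missing key; here the message carries the full path into the config.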
google/cloud/bigquery/client.py
Outdated
source, destination_uris, job_config=extract_job_config, retry=retry
)
elif "query" in job_config:
del job_config["query"]["destinationTable"]
This will modify the dict passed in, which is probably undesired? IIRC there was an earlier issue that addressed a very similar case by operating on a copy of the input argument.
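A small sketch of the side effect and of the copy-based alternative, assuming a plain nested dict as input. The function name `clear_query_destination` and the sample config are hypothetical and are not taken from the PR.

```python
import copy


def clear_query_destination(job_config: dict) -> dict:
    """Return a deep copy of job_config without query.destinationTable."""
    config = copy.deepcopy(job_config)
    # pop() with a default tolerates configs that have no destination table.
    config.get("query", {}).pop("destinationTable", None)
    return config


user_config = {"query": {"query": "SELECT 1", "destinationTable": {"tableId": "t"}}}

# Deleting on a deep copy leaves the caller's dict untouched.
cleaned = clear_query_destination(user_config)

# A shallow copy is not enough: the nested "query" dict is still shared,
# so deleting through the copy also changes the original.
original = {"query": {"destinationTable": {"tableId": "t"}}}
shallow = dict(original)
del shallow["query"]["destinationTable"]
original_was_mutated = "destinationTable" not in original["query"]
```

The deep copy costs a little extra work per call, but job configs are small, and the caller's argument stays as they passed it.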
We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
CLAs look good, thanks!
Looks good, just awaiting feedback on whether or not we are fine with potentially modifying an input argument.
google/cloud/bigquery/client.py
Outdated
destination_uris = _get_sub_prop(job_config, ["extract", "destinationUris"])
return self.extract_table(
source, destination_uris, job_config=extract_job_config, retry=retry
)
elif "query" in job_config:
del job_config["query"]["destinationTable"]
_del_sub_prop(job_config, ["query", "destinationTable"])
This will still potentially modify the input argument in place. Do we mind? (cc: @tswast)
Let's not do that if at all possible. Maybe make a copy before calling _del_sub_prop?
@HemangChothani Creating a copy and operating on the latter sounds good, yes.
If I consider the "clear destination_table if it was a query job" statement mentioned in the issue's description, we need to modify the input argument.
If I understood the ticket description correctly, it is not clear when a client library should clear the destination table property. But if we do clear it (as is the case with query jobs), it is safer to do it in a config copy, because users normally don't expect that their input parameter could be modified.
Fixes #14