-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added collection of Managing Dataset samples #6
Conversation
bigquery/samples/delete_dataset.py
Outdated
# TODO(developer): Set model_id to the ID of the model to fetch. | ||
# dataset_id = 'your-project.your_dataset' | ||
|
||
client.delete_dataset(dataset_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original sample also had an example showing the delete_contents=True
option. I think you could actually do the same, but also show off not_found_ok=True
.
client.delete_dataset(dataset_id)
# Comment about deleting a table that contains tables.
# Comment about tables that might not exist.
client.delete_dataset(dataset_id, delete_contents=True, not_found_ok=True)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok! Added the parameters and notes about what they do.
bigquery/samples/get_dataset.py
Outdated
) | ||
|
||
# View dataset properties | ||
print("Dataset ID: {}".format(dataset_id)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can delete this line since you're printing the ID in the same sample in the print statement above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed!
create_dataset.create_dataset(client, random_dataset_id) | ||
out, err = capsys.readouterr() | ||
assert "Created dataset {}".format(random_dataset_id) in out | ||
|
||
# get dataset | ||
get_dataset.get_dataset(client, random_dataset_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should all be separate tests. That way we can more easily tell when an individual sample is broken. Models API is an exception because it's so slow to create one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All separated out into their own tests. Had to include the create_dataset function for some of the tests or else they wouldn't pass (needs a dataset to make changes to!).
|
||
def update_dataset_default_table_expiration(client, dataset_id): | ||
|
||
# [START bigquery_update_dataset_default_table_expiration] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs to be removed from docs/snippets.py
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, please keep the "region tag" the same. That way we can track changes over time better. It was called bigquery_update_dataset_expiration
in snippets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed, but forgot to push /facepalm. Updated now!
Updated the region tag as directed.
) # API request | ||
|
||
full_dataset_id = "{}.{}".format(dataset.project, dataset.dataset_id) | ||
print("Updated dataset {}".format(full_dataset_id)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For testing purposes, it would be good to print the new dataset.default_table_expiration_ms
also.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the expiration to the print statement.
client, random_dataset_id | ||
) | ||
out, err = capsys.readouterr() | ||
assert "Updated dataset {}".format(random_dataset_id) in out |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please look for the expected expiration str(24 * 60 * 60 * 1000)
in the output too. That way we can be more certain that the expiration was actually updated as expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved the time into it's own fixture so it's easier to be used in the test.
bigquery/samples/tests/conftest.py
Outdated
@@ -25,6 +25,12 @@ def client(): | |||
return bigquery.Client() | |||
|
|||
|
|||
@pytest.fixture | |||
def one_day_ms(client): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't need to be a fixture. Just a normal variable/constant is fine. Fixtures are more for setup & clean-up code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, now that I see where this is used in the sample code, I'd prefer this code stay in the sample. That way it's clear to someone reading the sample what expected values for expiration look like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took this out of the fixture and put it back into the sample and test.
from .. import delete_dataset | ||
|
||
|
||
def test_delete_dataset(capsys, client, random_dataset_id): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend you use the dataset_id
fixture instead of random_dataset_id
, since the dataset_id
fixture will create a dataset for you (to then delete in the sample).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now updated.
from .. import get_dataset | ||
|
||
|
||
def test_get_dataset(capsys, client, random_dataset_id): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto. I recommend you use the dataset_id
fixture instead of random_dataset_id
, since the dataset_id
fixture will create a dataset for you (to then get in the sample).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now updated.
from .. import list_datasets | ||
|
||
|
||
def test_list_datasets(capsys, client, random_dataset_id): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The random_dataset_id
fixture isn't necessary for this test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed!
from .. import update_dataset_access | ||
|
||
|
||
def test_update_dataset_access(capsys, client, random_dataset_id): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto. I recommend you use the dataset_id
fixture instead of random_dataset_id
, since the dataset_id
fixture will create a dataset for you (to then update in the sample).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaced with dataset_id as directed.
|
||
|
||
def test_update_dataset_default_table_expiration( | ||
capsys, client, random_dataset_id, one_day_ms |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto. I recommend you use the dataset_id
fixture instead of random_dataset_id
, since the dataset_id
fixture will create a dataset for you (to then update in the sample).
one_day_ms
should just be a constant in this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated dataset_id and added the constant for one_day_ms from the fixture.
# dataset_id = 'your-project.your_dataset' | ||
|
||
dataset = client.get_dataset(dataset_id) | ||
dataset.default_table_expiration_ms = one_day_ms |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know it's against DRY principles, but In sample code it's better to actually show the actual value rather than create a constant. That way people can see typical values and understand that an integer is expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took out the extra variable as directed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks!
* Added delete dataset function * Added get dataset function * Added list dataset function * Added update dataset description sample * Added update dataset access sample * Added update dataset table expiration sample * Added tests for dataset samples and updated docs * Removing original update dataset access from snippets file. * Moved all dataset tests into own file. Made changes based on feedback. * Made changes based on feedback * Removed unnecessary use of random_dataset_id in tests and removed one_day_ms fixture * Removed unnecessary constant * Stored the math as a constant to make it look cleaner.
* Added delete dataset function * Added get dataset function * Added list dataset function * Added update dataset description sample * Added update dataset access sample * Added update dataset table expiration sample * Added tests for dataset samples and updated docs * Removing original update dataset access from snippets file. * Moved all dataset tests into own file. Made changes based on feedback. * Made changes based on feedback * Removed unnecessary use of random_dataset_id in tests and removed one_day_ms fixture * Removed unnecessary constant * Stored the math as a constant to make it look cleaner.
@engelke