-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update google storage #601
Conversation
Mocking suggestions from @jamessynge
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Partial review, more later
|
||
@pytest.fixture(scope="module") | ||
def auth_key(): | ||
local_config = load_config('pocs_local', ignore_local=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why pocs_local AND ignore_local? If intended to test load_config, then I think the test belongs elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We now rely on authentication being done for the system as a whole. As such, I am removing the ability to to pass an auth_key to the PanStorage
object directly. That also then makes some of this testing infrastructure disappear.
) | ||
|
||
|
||
@pytest.fixture(scope="module") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason to not make this a function scope fixture and have it take the existing config fixture as a param?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
def test_unit_id(storage): | ||
assert storage.unit_id is not None | ||
# TODO(wtgee)Verify the unit id better after #384 is done. | ||
assert re.match(r'PAN\d\d\d', storage.unit_id), storage.logger.error( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is being tested here? Shouldn't we just ensure for testing that we provide a config that has the correct contents?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, test removed as not appropriate.
|
||
|
||
@pytest.fixture(scope="function") | ||
def storage(auth_key): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't help but think that there should be a way that we can have an auth_key that is not in github for all the world to see, but that can be used by each tester for accessing the test bucket without having to use the same credentials that give access to the primary bucket. Of course, maybe you've already addressed this.
If not, travis-ci does support secret values being passed into tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is a generic testing auth_key. I saw reference to that somewhere, will try to find.
@@ -0,0 +1,111 @@ | |||
import pytest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add some comments please?
will not want people to delete things. However it's good to test and we | ||
want to remove the above files | ||
""" | ||
remote_path = '{}/pong.txt'.format(storage.unit_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd very much like to avoid depending on tests running serially and in a specific order, unless there is a fixture that is used to ensure that happens. I sometimes run using the xdist or random plugins in order to try find mistakes where our tests are non-hermetic.
""" | ||
Note: The util wrappers don't provide a way to delete because we generally | ||
will not want people to delete things. However it's good to test and we | ||
want to remove the above files |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both files aren't removed.
pocs/utils/google/storage.py
Outdated
remote_path (str|`google.cloud.storage.blob.Blob`): Blob or path to remote blob. | ||
If just the blob name is given then file will be downloaded. | ||
save_dir (str, optional): Path for saved file, defaults to current directory. | ||
force (bool, optional): Force a download even if file exists locally, default False. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps call this overwrite, which we use elsewhere?
pocs/utils/google/storage.py
Outdated
Returns: | ||
str: Path to local (uncompressed) FITS file | ||
""" | ||
# Get blob object if just a path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Totally a nit: Please end sentences with a period.
pocs/utils/google/storage.py
Outdated
See https://fits.gsfc.nasa.gov/fits_primer.html for overview of FITS format. | ||
|
||
Args: | ||
remote_blob (google.cloud.storage.blob.Blob): Blob object |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason to not allow a string as done in the prior method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call, thanks.
pocs/utils/google/storage.py
Outdated
# Get a header card | ||
start_byte = 2880 * (i - 1) | ||
end_byte = (2880 * i) - 1 | ||
b_string = remote_blob.download_as_string(start=start_byte, end=end_byte) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless you know that the FITS header is usually just one block, I'd be inclined to simply download the first N blocks of size 2880 bytes (e.g. 50 blocks), then operate on each of those blocks (or really treat it as a longer sequence of 80 byte lines. This will very likely be faster than downloading two 2880 blocks.
Or test to see if I'm right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We know that the first block is the compression header information (for .fz
packed files) so we are skipping that block and starting at block 2. After that we don't know how many blocks it will be.
I'll add some comments so it's clear what is trying to be done then we can figure out if that is what actually should be done. :)
…e-google-storage
as we now rely on command line `gcloud` authorization. Tests cleaned up accordingly.
…e-google-storage
* Add convenience method for testing if authenticated * Skip tests if not authenticated
…e-google-storage
Part of #599, depends on #600.
upload
renamed toupload_file
get_file_blob
/get_file_blobs
- returns blob datadownload_file
- get the actual filelookup_fits_header
stream just the fits header from the storageupload_observation_to_bucket
- usesgsutil
rather than the underlying gcloud because that allows for easy parallelization.ToDo: Figure out how to mock tests. Current tests are designed for a live system and actually communicate with the network (requires flag to test).