-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reference implementation of archive format from issue #15 #57
Conversation
Try: `pytest` in the repo root `python setup.py test` in the repo root runs pytest `tox` in the repo root runs pytest in a virtualenv against all intalled interpreters `tox -e coverage` in the repo root produces a test coverage report
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a large PR, but I read through a most of it. Each time I thought I found an issue, I discovered it had actually been handled, hah. Bravo.
I only have one question, possibly applicable in two places, and it's fairly minor.
Thanks so much for the excellent PR, @djanderson, and for keeping the actual implementation up-to-speed with the spec.
sigmf/archive.py
Outdated
else: | ||
has_ext = self.name.endswith(SIGMF_ARCHIVE_EXT) | ||
name = self.name if has_ext else self.name + SIGMF_ARCHIVE_EXT | ||
sigmf_fd = open(name, "wb") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above re: catching permission errors on file open.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could definitely throw an error, so it might be nice to catch it. If this were a command-line utility, I would definitely catch it and print out the error and then exit, but since it's designed to be imported by another program, the polite thing to do is let exceptions raise (rise?). We could catch it and wrap it in one of the SigMFError exception classes I created, but the few cases where I used those I didn't use them to catch and wrap existing exceptions, I just didn't want to raise something generic like RuntimeError. My personal feeling is that just letting the original error raise is a bit cleaner than wrapping it and re-raising. Especially in python3 where you will get the traceback chain with the original error anyways!
>>> t = tempfile.mkstemp()
>>> t
(3, '/tmp/tmpl4smirjf')
>>> f = open(t[1], mode="r")
>>> f.writable()
False
>>> f.write("stuff")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
io.UnsupportedOperation: not writable
>>> try:
... f.write("stuff")
... except Exception as err:
... raise RuntimeError(err) # pretending this is SigMFFileError or something
...
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
io.UnsupportedOperation: not writable
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 4, in <module>
RuntimeError: not writable
>>>
I dunno, considering we will need to raise anyways, what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you have a good point, and I would prefer that we don't create new exception classes for the sole purpose of renaming them to SigMF
exceptions.
My one concern with the above, though, is that if this code is being used more indirectly by a user, it may not be clear to them which file was not writable, and the Python3 traceback isn't immediately useful for identifying which filepath was the problem.
So, how about passing on the built-in exception, but printing a bit more information about which file the user needs to fix?
sigmf_data_filename = archive_name + SIGMF_DATASET_EXT | ||
sigmf_data_path = os.path.join(tmpdir, sigmf_data_filename) | ||
|
||
with open(sigmf_md_path, "w") as mdfile: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we wrap this with try/catch in case we don't have write permissions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, sigmf_md_path
is being created under tmpdir
, which is guaranteed to be "readable, writable, and searchable only by the creating user ID." (mkdtemp), so I think we're 100% safe here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, good call.
If a SigMFFile holds valid metadata and a data_file path, then calling the `archive` method tars metadata and data according to the spec, and returns the path to the archive. This commit also adds a function: `sigmffile.fromarchive`. If passed a valid sigmf archive and a path, extract the archive to that path and return a SigMFFile object holding the archive's metadata and pointing to the extracted `data_file`.
…useful error Also refactor for readability (chop up a few larger functions) and correctness (don't use "fd" when we're working with file-like objects, not file descriptors)
@bhilburn, I addressed the issue by catching the unhelpful error from Since I did some refactoring in the same commit, I'll highlight the main changes: In def _get_output_fileobj(self):
try:
fileobj = self._get_open_fileobj()
except:
if self.fileobj:
e = "fileobj {!r} is not byte-writable".format(self.fileobj)
else:
e = "can't open {!r} for writing".format(self.name)
raise error.SigMFFileError(e)
return fileobj
def _get_open_fileobj(self):
if self.fileobj:
fileobj = self.fileobj
fileobj.write(bytes()) # force exception if not byte-writable
else:
fileobj = open(self.name, "wb")
return fileobj In def test_unwritable_fileobj_throws_fileerror(test_sigmffile):
with tempfile.NamedTemporaryFile(mode="rb") as t:
with pytest.raises(error.SigMFFileError):
test_sigmffile.archive(fileobj=t)
def test_unwritable_name_throws_fileerror(test_sigmffile):
unwritable_file = "/root/unwritable.sigmf" # assumes root is unwritable
with pytest.raises(error.SigMFFileError):
test_sigmffile.archive(name=unwritable_file) |
@bhilburn, was there anything left you wanted me to clean up here? We've been using this code locally and it's working well. |
@djanderson - Nope! Sorry, was just a matter of me getting the time to sit down and focus. In short, it was my fault This PR is excellent, by the way, and represents the first major contribution to the public upstream. Great work. |
This (rather large, sorry!) PR adds a reference implementation for the single file archive format suggested in issue #15.
Since I've added a not-insignificant amount of code, I first beefed up the unit testing capability of the reference implementation. Specifically, I've taken advantage of some features of
pytest
, since the existing tests looked to bepytest
style. I also add atox.ini
config file so that runningtox
in the repo root runspytest
against all supported python versions in a virtualenv.tox -e coverage
gives us an idea of how well the reference implementation is tested (currently 80%).Regarding the actual archive capability, if a
SigMFFile
holds valid metadata anddata_file
is set, then calling the newarchive
method will create a.sigmf
file in accordance with the spec. Thearchive
method takesname
andfileobj
parameters that attempt to be reasonably consistent with python'starfile
module.https://docs.python.org/3.5/library/tarfile.html#tarfile-objects
Lastly, to go in the other direction, I've added the function
sigmffile.fromarchive
. Given a valid sigmf archive path and a directory, extract the sigmf archive to that directory and return aSigMFFile
instance with the archive's metadata loaded anddata_file
pointing to the extracted data.Please review!