Allow bagging to destination #138
bagit.py
Outdated
if not os.path.isdir(dest_dir):
    os.makedirs(dest_dir)
else:
    raise RuntimeError(_("The following directory already exists:\n%s"), dest_dir)
Consider allowing an empty directory as a target, useful e.g. when creating that directory requires different permissions than the user running bagit has.
Good point. Would checking for an empty directory like this be enough?
    raise RuntimeError(_("The following directory already exists:\n%s"), dest_dir)
elif len(os.listdir(dest_dir)) > 0:
    raise RuntimeError(_("The following directory already exists and contains files:\n%s"), dest_dir)
That will work because os.listdir is documented as not including . or .. in its results. I was wondering about performance if someone points it at a directory with a large number of files (the classic Unix answer being os.stat(dest_dir).st_nlink > 2), but I don't think that'll really be an issue in this scenario.
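For reference, a minimal sketch of an emptiness check that does not build the full listing. It assumes Python 3.6+ (for `os.scandir` as a context manager); `is_empty_dir` and `prepare_dest_dir` are hypothetical helper names, not part of bagit.py. Two caveats: on traditional Unix filesystems `st_nlink > 2` only reflects subdirectories, not regular files, so it would not detect a directory that contains only files; and the message here is interpolated with `%`, because passing `dest_dir` as a second argument to `RuntimeError` stores it as a separate argument rather than formatting it into the string.

```python
import os


def is_empty_dir(path):
    """Return True if path is a directory with no entries."""
    # os.scandir yields entries lazily, so we stop after the first one
    # instead of building the full list that os.listdir would return.
    with os.scandir(path) as entries:
        return next(entries, None) is None


def prepare_dest_dir(dest_dir):
    """Create dest_dir if missing; accept it if it exists and is empty."""
    if not os.path.isdir(dest_dir):
        os.makedirs(dest_dir)
    elif not is_empty_dir(dest_dir):
        # Interpolate the path into the message before raising.
        raise RuntimeError(
            "The following directory already exists and contains files:\n%s"
            % dest_dir
        )
```

Calling `prepare_dest_dir` on a missing path creates it, on an empty directory does nothing, and on a non-empty directory raises.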
bagit.py
Outdated
else:
    dest_dir = os.path.abspath(bag_dir)

source_dir = os.path.abspath(bag_dir)
Consider using realpath instead to resolve symlinks (not related to this change, just a general observation).
I use it on line 233 to address an issue with getcwd, but didn't want to break too far from existing practice without more consideration. I don't think it would be an issue. Are there any edge cases that would be good to test for?
I couldn't think of any edge cases, but I did have to invoke realpath in the setup class for the tests.
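To illustrate the difference discussed in this thread, here is a small sketch comparing `os.path.abspath` and `os.path.realpath` on a symlinked directory. The directory names are made up for the example, and it assumes a platform where `os.symlink` is permitted. It also hints at why a test setup might need `realpath`: if the temporary directory itself sits behind a symlink, paths built from it will not compare equal to resolved paths.

```python
import os
import tempfile


def show_paths(path):
    """Return (abspath, realpath) for path so the two can be compared."""
    return os.path.abspath(path), os.path.realpath(path)


# Resolve the temp dir first so the comparison below is not affected
# by a symlinked temporary directory.
base = os.path.realpath(tempfile.mkdtemp())
target = os.path.join(base, "real_bag")      # hypothetical real directory
link = os.path.join(base, "link_to_bag")     # hypothetical symlink to it
os.mkdir(target)
os.symlink(target, link)

absolute, resolved = show_paths(link)
# abspath only normalizes the path string; realpath follows the symlink.
print(absolute == link)
print(resolved == target)
```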
I've incorporated the suggested changes. After using this branch for a few months, I changed the copy mechanism to use a single copytree call. To do this, I had to create the temporary directory name from scratch since copytree will not copy to pre-existing directories. I couldn't think of a way to cover L245-248 with a test.
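A rough sketch of the idea described above, not the code in this branch: generate a destination name that does not exist yet and let `shutil.copytree` create it. The helper name `copy_to_fresh_temp_dir` and the `bagit-tmp-` prefix are assumptions for illustration. Unlike `tempfile.mkdtemp`, building the name by hand is not atomic, so a collision is theoretically possible, though unlikely with a random UUID.

```python
import os
import shutil
import tempfile
import uuid


def copy_to_fresh_temp_dir(source_dir, parent=None):
    """Copy source_dir into a new, not-yet-existing temporary path."""
    parent = parent or tempfile.gettempdir()
    # Build the name ourselves: copytree creates the destination and,
    # by default, fails if that directory already exists.
    temp_dir = os.path.join(parent, "bagit-tmp-" + uuid.uuid4().hex)
    shutil.copytree(source_dir, temp_dir)
    return temp_dir
```

The caller is responsible for moving or removing the returned directory afterwards.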
Based on @runderwood #92, but reimplemented within the current codebase to address #32 and #136
Major difference: this places every source directory within its own bag rather than combining them into a single bag, e.g.
bagit.py --destination foo/ --directory bar/ baz/
will create
which mirrors current behavior that bagit.py --directory bar/ baz/ creates