Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow bagging to destination #138

Closed
wants to merge 10 commits into from
Closed

Allow bagging to destination #138

wants to merge 10 commits into from

Conversation

nkrabben
Copy link
Contributor

@nkrabben nkrabben commented Jul 5, 2019

Based on @runderwood #92, but reimplemented within the current codebase to address #32 and #136

Major difference, this places every source within its own bag rather than combining bag, e.g.
bagit.py --destination foo/ --directory bar/ baz/
will create

foo
├──bar
│  ├── bag-info.txt
│  ├── ...
└──baz
   ├── bag-info.txt
   ├── ...

which mirrors current behavior that bagit.py --directory bar/ baz/ creates

./
├──bar
│  ├── bag-info.txt
│  ├── ...
└──baz
   ├── bag-info.txt
   ├── ...

@coveralls
Copy link

coveralls commented Jul 5, 2019

Coverage Status

Coverage decreased (-1.3%) to 82.019% when pulling 61769cd on nkrabben:master into 4b76c14 on LibraryOfCongress:master.

bagit.py Outdated
if not os.path.isdir(dest_dir):
os.makedirs(dest_dir)
else:
raise RuntimeError(_("The following directory already exists:\n%s"), dest_dir)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider allowing an empty directory as a target, useful e.g. when creating that directory requires different permissions than the user running bagit has.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Would checking for an empty directory like this be enough?

Suggested change
raise RuntimeError(_("The following directory already exists:\n%s"), dest_dir)
elif len(os.listdir(dest_dir)) > 0:
raise RuntimeError(_("The following directory already exists and contains files:\n%s"), dest_dir)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That will work because os.listdir is documented as not including . or .. but I was wondering about performance if someone points it at a directory which a large number of files — the classic Unix answer being os.stat(dest_dir).st_nlink > 2 — but I don't think that'll really be an issue in this scenario.

bagit.py Outdated
else:
dest_dir = os.path.abspath(bag_dir)

source_dir = os.path.abspath(bag_dir)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using realpath instead to resolve symlinks (not related to this change, just a general observation)

Copy link
Contributor Author

@nkrabben nkrabben Jul 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use it on line 233 to address an issue with getcwd, but didn't want to break too far from existing practice without more consideration. I don't think it would be an issue. Are there any edge cases that would be good to test for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't think of any edge cases, but I did have to invoke realpath in the setup class for the tests.

@nkrabben
Copy link
Contributor Author

I've incorporated the suggested changes.

After using this branch for a few months, I changed the copy mechanism to use a single copytree call. To do this, I had to create the temporary directory name from scratch since copytree will not copy to pre-existing directories. I couldn't think of a way to cover L245-248 with a test.

@nkrabben
Copy link
Contributor Author

@acdha is it okay if I add tests to cover the CLI? It would have caught the bug I fixed in 61769cd, but I hadn't seen an approach to testing in the package yet.

@acdha
Copy link
Member

acdha commented Feb 12, 2020

@acdha is it okay if I add tests to cover the CLI? It would have caught the bug I fixed in 61769cd, but I hadn't seen an approach to testing in the package yet.

Definitely – that's a big gap in the test coverage currently

@nkrabben nkrabben closed this by deleting the head repository Aug 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants