--overwrite flag is not respected for <bundles> elements in harvest config #112

Closed
alexdunnjpl opened this issue Dec 8, 2022 · 2 comments · Fixed by #113
Labels
B13.1 bug Something isn't working s.high

Comments

@alexdunnjpl
Contributor

🐛 Describe the bug

When harvesting with a config XML file, re-harvesting an already-registered bundle and its products with

<bundles>
    <bundle dir="/nomount/harvest/.idea/plawton_umd_test_data/pds4-epoxi_mri-v1.0" versions="all" />
</bundles>

does not succeed, even when --overwrite is provided. This contrasts with

<directories>
    <path>/nomount/harvest/.idea/plawton_umd_test_data/pds4-epoxi_mri-v1.0</path>
</directories>

which behaves as expected.

📜 To Reproduce

Steps to reproduce the behavior:

  1. Run harvest to register a bundle and its collections/products, using the <bundles> config element
  2. Repeat execution, using the --overwrite option
  3. Observe that the products are skipped, not overwritten

🕵️ Expected behavior

I expect the --overwrite option to be respected irrespective of the XML element used to target the bundle.

📚 Version of Software Used

v3.8.0-SNAPSHOT

🦄 Related requirements

See NASA-PDS/registry #118

⚙️ Engineering Details

@alexdunnjpl
Contributor Author

The way I'd expect harvest to work is that there would be

  • an enumeration step (enumeration of the products* to be processed)
  • a generation step (generation of the registry documents for each product)
  • a write step (i.e. throwing those documents at the registry database, and handling any errors)

*product/collection/bundle

This way, different enumeration approaches (in this case, <bundles> and <directories>) would be decoupled from everything else and would share a common downstream processing path.
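As a rough sketch of that decoupling (this is not harvest's actual code; `ProductRef`, `ProductEnumerator`, `InMemoryRegistry` and the example LIDVID are all hypothetical names made up for illustration), each config element would only supply an enumerator, and the overwrite flag would be consulted in exactly one place, the shared write step:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class HarvestPipelineSketch {

    /** Hypothetical: one product/collection/bundle found during enumeration. */
    static final class ProductRef {
        final String lidvid;
        final String labelPath;

        ProductRef(String lidvid, String labelPath) {
            this.lidvid = lidvid;
            this.labelPath = labelPath;
        }
    }

    /** Hypothetical enumeration step: one implementation per config element. */
    interface ProductEnumerator {
        List<ProductRef> enumerate();
    }

    /** Hypothetical in-memory stand-in for the registry database. */
    static final class InMemoryRegistry {
        final Map<String, String> docs = new LinkedHashMap<>();
    }

    /** Generation step: build one registry document per enumerated product. */
    static Map<String, String> generate(List<ProductRef> refs) {
        Map<String, String> docs = new LinkedHashMap<>();
        for (ProductRef ref : refs) {
            docs.put(ref.lidvid, "{\"lidvid\":\"" + ref.lidvid
                    + "\",\"label\":\"" + ref.labelPath + "\"}");
        }
        return docs;
    }

    /** Write step: the only code that looks at the overwrite flag. Returns the number written. */
    static int write(InMemoryRegistry registry, Map<String, String> docs, boolean overwrite) {
        int written = 0;
        for (Map.Entry<String, String> doc : docs.entrySet()) {
            if (registry.docs.containsKey(doc.getKey()) && !overwrite) {
                continue; // already registered and overwrite not requested: skip
            }
            registry.docs.put(doc.getKey(), doc.getValue());
            written++;
        }
        return written;
    }

    /** Shared downstream path: identical regardless of which enumerator is used. */
    static int harvest(ProductEnumerator source, InMemoryRegistry registry, boolean overwrite) {
        return write(registry, generate(source.enumerate()), overwrite);
    }

    public static void main(String[] args) {
        // Made-up product; both "sources" resolve to the same product on disk.
        List<ProductRef> found =
                List.of(new ProductRef("urn:example:bundle::1.0", "/data/bundle.xml"));
        ProductEnumerator fromBundles = () -> found;      // stands in for <bundles>
        ProductEnumerator fromDirectories = () -> found;  // stands in for <directories>

        InMemoryRegistry registry = new InMemoryRegistry();
        System.out.println(harvest(fromBundles, registry, false));     // 1: first registration
        System.out.println(harvest(fromBundles, registry, false));     // 0: skipped, no overwrite
        System.out.println(harvest(fromBundles, registry, true));      // 1: overwritten
        System.out.println(harvest(fromDirectories, registry, true));  // 1: same behaviour
    }
}
```

With this shape, a bug like this one could not be specific to one config element, because neither enumerator ever sees the overwrite flag.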

The current situation is that <bundles> is handled by HarvestCmd.processBundles and <directories> by HarvestCmd.processDirectories (files and collections have equivalents), and these execution paths diverge completely depending on the kind of source being processed.

@alexdunnjpl
Contributor Author

alexdunnjpl commented Dec 8, 2022

@tloubrieu-jpl @jordanpadams I've pushed a band-aid fix in issue-112-overwrite-bug. I think this points to a larger architectural flaw that is worth addressing, but doing so risks dragging me into the weeds and taking a long time, given the scope of the changes and my relative unfamiliarity with the harvest codebase.

Your call how you'd like to proceed. My suggestion is to have me merge a PR for the band-aid and open an icebox issue for the larger rework, if you agree that it's necessary/desirable.
