Is your feature request related to a problem? Please describe.
When processing a large volume of packages, it can be easy to upload the same package twice. This could also happen if package deposit is automated or in a variety of other situations. In any case, processing the same package twice is an unnecessary use of compute resources and storage space and should be avoided.
Describe the solution you'd like
I would like Enduro to check whether a package has already been ingested. This could be done by computing a checksum for each package and recording it in a database, so that later packages can be checked against the recorded values. All incoming packages will be compressed, so computing a checksum should be fast. Any checksum that matches one already recorded in the database will trigger a failure.
This should be optional 😉
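To make the idea concrete, here is a rough sketch of the check in Python. It is an illustration only, not Enduro code: the table name `package_checksums`, the function names, the `enabled` flag, and the choice of SHA-256 and SQLite are all assumptions made for the example.

```python
import hashlib
import sqlite3


class DuplicatePackageError(Exception):
    """Raised when a package's checksum is already recorded (hypothetical name)."""


def package_checksum(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def open_store(db_path=":memory:"):
    """Open the checksum store; the schema here is an assumption."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS package_checksums ("
        "checksum TEXT PRIMARY KEY, name TEXT NOT NULL)"
    )
    return conn


def register_package(conn, path, enabled=True):
    """Record the package checksum, failing if it was seen before.

    With enabled=False the check is skipped entirely, which keeps it optional.
    """
    if not enabled:
        return None
    checksum = package_checksum(path)
    try:
        # The PRIMARY KEY constraint rejects a repeated checksum.
        with conn:
            conn.execute(
                "INSERT INTO package_checksums (checksum, name) VALUES (?, ?)",
                (checksum, str(path)),
            )
    except sqlite3.IntegrityError:
        raise DuplicatePackageError(
            f"package {path} already ingested (checksum {checksum})"
        ) from None
    return checksum
```

Relying on the primary-key constraint, rather than a separate lookup followed by an insert, means two identical packages arriving at the same time cannot both pass the check.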
Describe alternatives you've considered
This is a pretty high-level check that will only catch byte-for-byte identical packages, not very similar ones. It would be possible to be much more specific (checking individual file checksums, for example, or checking certain metadata elements), but I think this will suffice for the migration (and therefore MVP).
Additional context