Description
Please describe the problem you'd like to be solved.
As someone transferring information into Archivematica, I'd like to find duplicate content across AIPs so that I can understand whether the content has already been stored for preservation and access, or whether there is excessive redundancy in the direct copies that I am maintaining.
Describe the solution you'd like to see implemented.
I would like a checksum comparison to be available somewhere in the workflow that will allow me to identify duplicates. I can then make decisions based on the information returned.
Describe alternatives you've considered.
I can detect duplicates before transfer using tools that generate checksums, but it is difficult to maintain that state over long periods of time. If I already have many AIPs stored, there is no easy way for me to know whether any stored content is identical to the content that I am transferring.
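To illustrate the kind of comparison I mean, here is a minimal sketch in Python. It is not Archivematica code and does not use any Archivematica API: the function names (`sha256_of`, `index_tree`, `find_duplicates`) are hypothetical, SHA-256 is an assumed choice of algorithm, and it assumes the contents of stored AIPs can be read as a directory tree (in practice the stored index would probably be persisted somewhere rather than recomputed).

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_tree(root):
    """Map each checksum to the relative paths under root that share it."""
    root = Path(root)
    index = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            index[sha256_of(path)].append(path.relative_to(root).as_posix())
    return dict(index)


def find_duplicates(transfer_index, stored_index):
    """Return {checksum: (transfer paths, stored paths)} for shared checksums."""
    return {
        checksum: (paths, stored_index[checksum])
        for checksum, paths in transfer_index.items()
        if checksum in stored_index
    }
```

Calling `find_duplicates(index_tree(transfer_dir), index_tree(stored_dir))` would list every file in the transfer whose bytes already exist in stored content, which is the information I would like the workflow to surface so that I can decide what to do with it.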
For Artefactual use:
Please make sure these steps are taken before moving this issue from Review to Verified in Waffle:
- All PRs related to this issue are properly linked 👍
- All PRs related to this issue have been merged 👍
- Test plan for this issue has been implemented and passed 👍
- Documentation regarding this issue has been written and it has been added to the release notes, if needed 👍