The path is useful as an archival record. Otherwise, we would store directory names as attribute tags and perhaps eventually change the folder structure.
We can easily run a process that deletes the duplicate files and collapses the documents down to one with the other filenames recorded within it. If we do that, the hard bit will be deciding which categorization we keep.
If the categorizations are all valid, these could be added to the document, making the one we choose to keep arbitrary.
For example, if /a/foo.pdf and /b/bar.pdf are identical files, we could end up with one of the following metadata blocks:
This does not consider the possibility that other metadata could differ (e.g. through a spreadsheet import). If that is the case, then both metadata blocks should probably remain (as an optimization, they could be modified to point at the same actual S3 file).
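A minimal sketch of the collapse step described above, assuming each document record is a dict with a content hash and a path. The key names (`path`, `sha256`, `extra`, `other_paths`, `categories`, `storage_key`) and the function `collapse_duplicates` are hypothetical and not taken from the actual schema. Records with identical bytes and identical other metadata are merged into one, keeping every directory name as a category; records whose other metadata differ are all kept but pointed at a single stored file.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def collapse_duplicates(docs):
    """Collapse records that share a content hash.

    Each record is a dict with hypothetical keys:
      "path"   - original file path, e.g. "/a/foo.pdf"
      "sha256" - hash of the file contents
      "extra"  - any other metadata (e.g. from a spreadsheet import)
    """
    by_hash = defaultdict(list)
    for doc in docs:
        by_hash[doc["sha256"]].append(doc)

    result = []
    for digest, group in by_hash.items():
        # Sort so the record we keep is deterministic, though arbitrary.
        group = sorted(group, key=lambda d: d["path"])
        keeper = group[0]

        if all(d["extra"] == keeper["extra"] for d in group):
            # Same bytes, same metadata: keep one record, remember the
            # other filenames, and keep every directory name as a category.
            categories = sorted(
                {PurePosixPath(d["path"]).parent.name for d in group}
            )
            result.append({
                "path": keeper["path"],
                "other_paths": [d["path"] for d in group[1:]],
                "categories": categories,
                "sha256": digest,
                "extra": keeper["extra"],
            })
        else:
            # Same bytes, different metadata: keep every record, but point
            # them all at one stored copy of the file.
            for d in group:
                result.append({**d, "storage_key": keeper["path"]})
    return result
```

With the two files from the example and no differing metadata, this yields a single record for `/a/foo.pdf` that lists `/b/bar.pdf` under `other_paths` and carries both `a` and `b` as categories, so the choice of which path to keep does not lose any categorization.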