Problem: AIP moved to completed when S3 enabled. Breaks S3 upload #339

Open
camlyall opened this issue May 2, 2023 · 2 comments

Comments

@camlyall
Contributor

camlyall commented May 2, 2023

S3 AIP storage methods expect the AIP location to be in currentlyProcessing/ingest/, so we get a "file not found" error when it's moved to completed. You can see the affected code in the verify_aip.py file, lines 229-232:

aip_path = Path(aip_path)
completed_dir = Path(mcpclient_settings.SHARED_DIRECTORY, "completed")
shutil.move(str(aip_path), str(completed_dir))
logger.info("AIP generated: %s", aip_path.name)

You can also check out the S3 storage methods in the a3m_store_aip.py file: https://github.com/artefactual-labs/a3m/blob/main/a3m/client/clientScripts/a3m_store_aip.py
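For illustration, here is a minimal, self-contained sketch of the failure mode. It uses a temporary directory in place of the real shared directory; the file name is a stand-in, and `store_aip` is a hypothetical placeholder for the S3 upload step, not the actual a3m function.

```python
import shutil
import tempfile
from pathlib import Path


def store_aip(aip_path: Path) -> int:
    """Hypothetical stand-in for the S3 upload: it only reads the file size."""
    return aip_path.stat().st_size


def reproduce() -> str:
    """Move an AIP to completed/ and then try to store it from the old path."""
    with tempfile.TemporaryDirectory() as tmp:
        shared = Path(tmp)
        ingest_dir = shared / "currentlyProcessing" / "ingest"
        completed_dir = shared / "completed"
        ingest_dir.mkdir(parents=True)
        completed_dir.mkdir()

        aip_path = ingest_dir / "example.7z"
        aip_path.write_bytes(b"aip")

        # Same move as in verify_aip: the AIP ends up in completed/.
        shutil.move(str(aip_path), str(completed_dir))

        # The store step is still handed the old ingest path.
        try:
            store_aip(aip_path)
        except FileNotFoundError:
            return "file not found"
        return "stored"
```

Calling `reproduce()` takes the `FileNotFoundError` branch, because nothing exists at the ingest path after the move.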

To fix this issue, you can add a check for the S3 enabled setting before moving the AIP to the completed directory. You can find an example of this in my updated verify_aip.py file, lines 229-234:

# Don't move to completed if S3 enabled
if not mcpclient_settings.S3_ENABLED:
    aip_path = Path(aip_path)
    completed_dir = Path(mcpclient_settings.SHARED_DIRECTORY, "completed")
    shutil.move(str(aip_path), str(completed_dir))
    logger.info("AIP generated: %s", aip_path.name)
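The same guard can be sketched as a standalone function that is easy to test outside a3m. This is only an illustration: `finalize_aip` and its parameters are hypothetical names, and the `s3_enabled` flag is passed in directly rather than read from `mcpclient_settings`.

```python
import shutil
from pathlib import Path


def finalize_aip(aip_path: str, shared_directory: str, s3_enabled: bool) -> Path:
    """Move the AIP to completed/ unless S3 is enabled; return its final location."""
    path = Path(aip_path)
    if s3_enabled:
        # Leave the AIP where it is so the S3 store step can still find it.
        return path
    completed_dir = Path(shared_directory, "completed")
    completed_dir.mkdir(parents=True, exist_ok=True)
    # shutil.move returns the destination path of the moved file.
    return Path(shutil.move(str(path), str(completed_dir)))
```

With `s3_enabled=True` the function returns the original path and the file stays under currentlyProcessing/ingest/; with `s3_enabled=False` it returns the new path under completed/.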

Here's an example of the arguments from the logs:
['a3m_store_aip', '5a68bbdb-f5a0-4eb3-bd58-4f9badefb12d', '/home/a3m/.local/share/a3m/share/currentlyProcessing/ingest/5a68bbdb-f5a0-4eb3-bd58-4e9badefb12d/transfer-5a68bbdb-f5a0-4eb3-bd58-4e9badefb12d.7z']

@sevein
Member

sevein commented May 4, 2023

Thanks, @camlyall. If we do this, would the AIP be left in currentlyProcessing?

I was wondering if the right thing to do would be for verify_aip to update the database with the new physical location of the AIP, i.e.:

  1. verify_aip moves to completed,
  2. [CHANGE] verify_aip updates the database with the new path, and
  3. a3m_store_aip receives the proper path.

I haven't tried it, but if I remember correctly the database keeps paths relative to the shared directory, so it should be possible. Not entirely sure though!
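The three steps above could be sketched roughly as follows. This is an untested outline under stated assumptions: a plain dict stands in for the database, paths are stored relative to the shared directory (which, as noted, is from memory), and `move_and_record` and `resolve_aip_path` are hypothetical helper names, not existing a3m functions.

```python
import shutil
from pathlib import Path


def move_and_record(aip_uuid: str, aip_path: Path, shared_directory: Path, db: dict) -> Path:
    """Move the AIP to completed/ and record its new location.

    `db` is a plain dict standing in for the real database (an assumption).
    """
    completed_dir = shared_directory / "completed"
    completed_dir.mkdir(parents=True, exist_ok=True)

    # Step 1: verify_aip moves the AIP to completed/.
    new_path = Path(shutil.move(str(aip_path), str(completed_dir)))

    # Step 2: record the new path, relative to the shared directory.
    db[aip_uuid] = new_path.relative_to(shared_directory).as_posix()
    return new_path


def resolve_aip_path(aip_uuid: str, shared_directory: Path, db: dict) -> Path:
    """Step 3: the store step resolves the recorded path before uploading."""
    return shared_directory / db[aip_uuid]
```

The point of the design is that the store step never relies on the path it was originally given; it always asks for the current location, so the move to completed/ no longer breaks it.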

@camlyall
Contributor Author

camlyall commented Jun 1, 2023

Yes, I think that is the way to go.
We have changed how we use the package and no longer use the S3 functionality, so development on this from our end is currently halted.
