Problem: Changing the configured a3m shared directory breaks the a3m workflow #864

Open
djjuhasz opened this issue Feb 21, 2024 · 6 comments

@djjuhasz
Collaborator

Describe the bug

Changing the value of the a3m "shareDir" configuration setting in the "enduro.toml" configuration file causes the a3m processing workflow to fail with a fatal error.

To Reproduce

Steps to reproduce the behavior:

  1. Configure Enduro to use a3m for preservation
  2. In the "enduro.toml" configuration file, change the default value of the "a3m.shareDir" setting to a different value
  3. Restart the enduro and a3m workers to load the new configuration
  4. Attempt to process a transfer by uploading it to the MinIO "sips" bucket
  5. The processing workflow fails with the error:
     "message": "error creating temporary directory: stat /home/a3m/.local/share/a3m/share: no such file or directory",
     "source": "GoSDK",

Expected behavior

Changing the configured a3m shared directory location to a valid filesystem path should not break the a3m workflow.

Screenshot:
image

Failure message (JSON):

{
  "message": "activity error",
  "source": "GoSDK",
  "stackTrace": "",
  "encodedAttributes": null,
  "cause": {
    "message": "error creating temporary directory: stat /home/a3m/.local/share/a3m/share: no such file or directory",
    "source": "GoSDK",
    "stackTrace": "",
    "encodedAttributes": null,
    "cause": null,
    "applicationFailureInfo": {
      "type": "",
      "nonRetryable": true,
      "details": {
        "payloads": [
          null
        ]
      }
    }
  },
  "activityFailureInfo": {
    "scheduledEventId": "20",
    "startedEventId": "21",
    "identity": "1@enduro-a3m-0@",
    "activityType": {
      "name": "bundle-activity"
    },
    "activityId": "20",
    "retryState": "NonRetryableFailure"
  }
}

Additional context

The a3m share directory path is hardcoded at https://github.com/artefactual-sdps/enduro/blob/main/internal/workflow/processing.go#L321

@djjuhasz djjuhasz self-assigned this Feb 21, 2024
djjuhasz added a commit that referenced this issue Feb 21, 2024
Fixes #864

- Pass the full enduro config to processing workflow, and remove the
  separate taskQueue parameter
- Use config value for A3m shareDir in processing.go
- Use a temporary directory as a shareDir in processing_test.go
@djjuhasz djjuhasz changed the title from "Problem: Changing the configured a3m shared directory breaks the a3m workkflow" to "Problem: Changing the configured a3m shared directory breaks the a3m workflow" Feb 21, 2024
djjuhasz added a commit that referenced this issue Feb 21, 2024
Refs #864

- Pass the full enduro config to processing workflow, and remove the
  separate taskQueue parameter
- Use config value for A3m shareDir in processing.go
- Use a temporary directory as a shareDir in processing_test.go
djjuhasz added a commit that referenced this issue Feb 29, 2024
Refs #864

- Pass the full enduro config to processing workflow, and remove the
  separate taskQueue parameter
- Use config value for A3m shareDir in processing.go
- Use a temporary directory as a shareDir in processing_test.go
@sevein
Contributor

sevein commented Mar 4, 2024

Can this issue be closed?

@djjuhasz
Collaborator Author

djjuhasz commented Mar 4, 2024

@sevein that's a tough question. Changing the shareDir will still break the a3m integration in dev because a3m is still trying to send the AIP back to /home/a3m/.local/share/a3m/share. I couldn't find any way to configure a3m (in the dev env) to send the AIP to a different directory. :(

@djjuhasz
Collaborator Author

djjuhasz commented Mar 4, 2024

I'm copying this from #865 (comment) for better visibility:

Commit eca89b6 fixes a hardcoded a3m shared directory path in Enduro, but the processing workflow still fails if the path is changed from the default. A3m finds the deposited SIP in the new path and successfully creates an AIP, but a3m still saves the final AIP to the default "/home/a3m/.local/share/a3m/share/" directory instead of the new path. A3m doesn't pass the stored AIP path back to Enduro directly, so Enduro assumes the path of the AIP is "/home/a3m/.local/share/a3m/new_share/completed", which is not correct.

Fixing this completely will require changes to a3m.
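
The mismatch described above can be modelled in a few lines. This is only an illustrative sketch in Python, not Enduro's Go code or a3m's code: the function names are hypothetical, and the only facts taken from this thread are the default share path and that Enduro looks in the "completed" subdirectory of its configured shareDir.

```python
from pathlib import PurePosixPath

# Default a3m share directory, as quoted in the error message in this issue.
DEFAULT_A3M_SHARE = PurePosixPath("/home/a3m/.local/share/a3m/share")


def enduro_expected_aip_dir(configured_share_dir: str) -> PurePosixPath:
    """Hypothetical helper: where Enduro assumes the AIP is, i.e.
    <configured shareDir>/completed."""
    return PurePosixPath(configured_share_dir) / "completed"


def aip_under_a3m_share(
    expected: PurePosixPath, a3m_share: PurePosixPath = DEFAULT_A3M_SHARE
) -> bool:
    """True if the directory Enduro checks lies inside the directory
    a3m is actually writing to."""
    return a3m_share in expected.parents


if __name__ == "__main__":
    expected = enduro_expected_aip_dir("/home/a3m/.local/share/a3m/new_share")
    print(expected, aip_under_a3m_share(expected))
```

With the default shareDir the two sides agree; with "new_share" Enduro looks in a directory that a3m never writes to, which matches the failure reported here.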

@djjuhasz
Collaborator Author

djjuhasz commented Mar 6, 2024

I was just reading the a3m documentation and noticed that a3m accepts a shared_directory configuration setting (ref: https://a3m.readthedocs.io/en/latest/settings.html). I'll try setting this value in the Enduro a3m hack config, and see if it solves the problem with Enduro not finding the AIP.

@djjuhasz
Collaborator Author

djjuhasz commented Apr 1, 2024

I've done some investigation on using the a3m shared_directory setting to change the directory shared by Enduro and a3m for file exchange. So far, this is what I've found:

I set an A3M_SHARED_DIRECTORY environment variable in the hack/kube/overlays/dev-a3m/enduro-a3m.yaml file, but this breaks the a3m tmp_directory, processing_directory, and rejected_directory paths. This is a consequence of https://github.com/artefactual-labs/a3m/blob/main/a3m/settings/common.py#L168, which sets all four directory paths (shared, tmp, processing, and rejected) when shared_directory is not explicitly set, but doesn't set any of the paths if shared_directory is set. I think I'll file a bug ticket in a3m about this behaviour, as it's unexpected and undocumented.
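
To make the reported behaviour concrete, here is a small model of it next to what one might expect instead. It is a sketch written for this thread and not a copy of a3m's settings code: the function names and dictionary keys are hypothetical, and the "tmp", "currentlyProcessing", and "rejected" subdirectory names are assumed from the environment values tried later in this comment.

```python
from pathlib import PurePosixPath
from typing import Dict, Optional

# Default share path quoted in this issue.
DEFAULT_SHARE = "/home/a3m/.local/share/a3m/share"


def derive_dirs(shared: str) -> Dict[str, str]:
    """Derive all four directories from one shared directory
    (subdirectory names are assumptions, see the note above)."""
    base = PurePosixPath(shared)
    return {
        "shared": str(base),
        "tmp": str(base / "tmp"),
        "processing": str(base / "currentlyProcessing"),
        "rejected": str(base / "rejected"),
    }


def reported_behaviour(shared: Optional[str]) -> Dict[str, Optional[str]]:
    """Model of the behaviour described in this comment: the derived
    paths are only filled in when shared_directory is NOT set."""
    if shared is None:
        return derive_dirs(DEFAULT_SHARE)
    # shared_directory set explicitly: the other three are not derived.
    return {"shared": shared, "tmp": None, "processing": None, "rejected": None}


if __name__ == "__main__":
    print(reported_behaviour("/home/a3m/share"))
    print(derive_dirs("/home/a3m/share"))
```

The second call shows the behaviour a user would probably expect: setting only the shared directory and having the other three follow it.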

Next I tried setting the environment variable for all four a3m directories:

env:
  - name: A3M_SHARED_DIRECTORY
    value: "/home/a3m/share/"
  - name: A3M_TEMP_DIRECTORY
    value: "/home/a3m/share/tmp/"
  - name: A3M_PROCESSING_DIRECTORY
    value: "/home/a3m/share/currentlyProcessing/"
  - name: A3M_REJECTED_DIRECTORY
    value: "/home/a3m/share/rejected/"

but this fails in a3m at the verify AIP step:

[a3m] 	| =============== JOB
[a3m] 	| verify_aip (exit=1; code=success uuid=abf35e91-53ea-4e80-9f10-a9fe1df23164)
[a3m] 	| =============== STDOUT
[a3m] 	| 
[a3m] 	| =============== END STDOUT
[a3m] 	| =============== STDERR
[a3m] 	| PermissionError(13, 'Permission denied')
[a3m] Error extracting AIP at "/home/a3m/share/currentlyProcessing/ingest/54921778-26ea-46d1-a20e-a6c5ad80f504/small-54921778-26ea-46d1-a20e-a6c5ad80f504.7z"
[a3m] 
[a3m] 	| =============== END STDERR
[a3m] 	| =============== ARGS
[a3m] 	| ['verify_aip', '54921778-26ea-46d1-a20e-a6c5ad80f504', '/home/a3m/share/currentlyProcessing/ingest/54921778-26ea-46d1-a20e-a6c5ad80f504/small-54921778-26ea-46d1-a20e-a6c5ad80f504.7z']
[a3m] 	| =============== END ARGS

I don't yet know why verify AIP is failing, but I did notice that https://github.com/artefactual-labs/a3m/blob/main/a3m/server/shared_dirs.py#L12 is creating some extra directories when the a3m server is started, and it is assuming the "processing directory" is named currentlyProcessing which may be a problem if A3M_PROCESSING_DIRECTORY is set to something else.
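
A quick way to test that naming assumption against a given configuration is sketched below. The helper is hypothetical and only compares paths; it does not reproduce what shared_dirs.py actually creates.

```python
from pathlib import PurePosixPath


def processing_dir_matches_assumption(shared_dir: str, processing_dir: str) -> bool:
    """True if the configured processing directory is exactly
    <shared_dir>/currentlyProcessing, the name the server is said to assume."""
    assumed = PurePosixPath(shared_dir) / "currentlyProcessing"
    return PurePosixPath(processing_dir) == assumed


if __name__ == "__main__":
    # Values from the environment variables tried above.
    print(processing_dir_matches_assumption(
        "/home/a3m/share/", "/home/a3m/share/currentlyProcessing/"
    ))
```

For the values tried above the check passes, so this particular assumption appears to hold in that configuration; it would only fail if A3M_PROCESSING_DIRECTORY pointed somewhere else.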

@djjuhasz
Collaborator Author

djjuhasz commented Apr 1, 2024

The verify AIP error message appears to occur at https://github.com/artefactual-labs/a3m/blob/main/a3m/client/clientScripts/verify_aip.py#L193. I'm not clear where in the code the PermissionError(13, 'Permission denied') originates, but my best guess is https://github.com/artefactual-labs/a3m/blob/main/a3m/client/clientScripts/verify_aip.py#L25. It seems like a3m should have sufficient permissions to create a temporary directory for the extracted AIP, so I'm not sure why permissions are denied. 🤷
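
One way to narrow this down would be to run a small check inside the a3m container, as the a3m user, that tries to create a temporary directory in each configured location. The script below is a diagnostic sketch, not part of a3m: the function names are hypothetical, and it assumes the four A3M_* environment variables shown earlier are the relevant ones.

```python
import os
import tempfile
from typing import Dict, Mapping, Optional, Tuple

A3M_DIR_VARS = [
    "A3M_SHARED_DIRECTORY",
    "A3M_TEMP_DIRECTORY",
    "A3M_PROCESSING_DIRECTORY",
    "A3M_REJECTED_DIRECTORY",
]


def can_create_temp_dir(parent: str) -> Tuple[bool, Optional[str]]:
    """Try to create (and then remove) a temporary directory under parent."""
    try:
        path = tempfile.mkdtemp(dir=parent)
    except OSError as err:  # includes PermissionError and FileNotFoundError
        return False, f"{type(err).__name__}: {err}"
    os.rmdir(path)
    return True, None


def check_a3m_dirs(
    environ: Mapping[str, str] = os.environ,
) -> Dict[str, Tuple[bool, Optional[str]]]:
    """Run the check for every A3M_* directory variable that is set."""
    results = {}
    for name in A3M_DIR_VARS:
        value = environ.get(name)
        results[name] = (False, "not set") if value is None else can_create_temp_dir(value)
    return results


if __name__ == "__main__":
    for name, (ok, err) in check_a3m_dirs().items():
        print(f"{name}: {'ok' if ok else err}")
```

If one of the directories reports a PermissionError here, that would point to ownership or mount permissions on the new share path rather than to the verify_aip script itself.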
