This repository has been archived by the owner on Nov 14, 2023. It is now read-only.

neicnordic/sda-pipeline

Archival notice

⚠️ This repository is no longer maintained. The code has been integrated into https://github.com/neicnordic/sensitive-data-archive, where development continues.

sda-pipeline

Join the chat at https://gitter.im/neicnordic/sda-pipeline

sda-pipeline is part of the NeIC Sensitive Data Archive and implements the components required for data submission. It can be used as part of a Federated EGA or as an isolated Sensitive Data Archive. sda-pipeline supports both S3 and POSIX storage.

Deployment

Recommended provisioning method for production is:

For local development and testing, see the instructions in the dev_utils folder. The README file there has sections on running the pipeline locally using Docker Compose.

Core Components

| Component | Role |
|-----------|------|
| intercept | The intercept service relays messages between the queue provided by the federated service and the local queues. (Required only for the Federated EGA use case.) |
| ingest | The ingest service accepts messages for files uploaded to the inbox, registers the files in the database with their headers, and stores them header-stripped in the archive storage. |
| verify | The verify service reads and decrypts ingested files from the archive storage and sends accession requests. |
| finalize | The finalize service accepts messages with accessionIDs for ingested files and registers them in the database. |
| mapper | The mapper service registers the mapping of accessionIDs (IDs for files) to datasetIDs. |
| backup | The backup service accepts messages with accessionIDs for ingested files and copies them to the second/backup storage. |

Internal Components

| Component | Role |
|-----------|------|
| broker | Package for communicating with the message broker SDA-MQ. |
| config | Package for managing configuration. |
| database | Provides functionality for using the database, as well as high-level functions for working with the SDA-DB. |
| storage | Provides an interface for storage areas such as a regular file system (POSIX) or an S3 object store. |

Documentation

sda-pipeline documentation can be found at: https://neicnordic.github.io/sda-pipeline/pkg/sda-pipeline/

NeIC Sensitive Data Archive documentation can be found at: https://neic-sda.readthedocs.io/en/latest/ along with documentation about other components for data access.

Contributing

We happily accept contributions. Please see our contributing documentation for some tips on getting started.