View the schemas in Swagger UI
The goal of DRS is to create a generic API on top of existing object storage systems so workflow systems can access data in a single, standard way regardless of where it's stored. It's maintained by the GA4GH Cloud Workstream.
The API is split into two sections:
- Data Object management, which enables the creation, updating, deletion, versioning, and unique identification of files and data bundles (flat collections of files); and
- Data Object querying, which can locate data objects across different cloud environments and DRS implementations.
Installing is as easy as:
$ pip install ga4gh-dos-schemas
This will install both a demonstration server and a Python client that will allow you to
manage Data Objects in a local server. You can start the demo server using ga4gh_dos_server
.
This starts a Data Repository Service at http://localhost:8080.
wget http://hgdownload.cse.ucsc.edu/goldenPath/hg38/chromosomes/chr22.fa.gz
md5sum chr22.fa.gz
# 41b47ce1cc21b558409c19b892e1c0d1 chr22.fa.gz
curl -X POST -H 'Content-Type: application/json' \
--data '{"data_object":
{"id": "hg38-chr22",
"name": "Human Reference Chromosome 22",
"checksums": [{"checksum": "41b47ce1cc21b558409c19b892e1c0d1", "type": "md5"}],
"urls": [{"url": "http://hgdownload.cse.ucsc.edu/goldenPath/hg38/chromosomes/chr22.fa.gz"}],
"size": "12255678"}}' http://localhost:8080/ga4gh/dos/v1/dataobjects
# We can then get the newly created Data Object by id
curl http://localhost:8080/ga4gh/dos/v1/dataobjects/hg38-chr22
# Or by checksum!
curl -X GET http://localhost:8080/ga4gh/dos/v1/dataobjects -d checksum=41b47ce1cc21b558409c19b892e1c0d1
For more on getting started, check out the quickstart guide or the rest of the documentation at ReadtheDocs!
The Data Repository Service Schemas are Apache 2 Licensed Open Source software. Please join us in the issues or check out the contributing docs!