DataManagerAPI
Mobyle manages datasets. A dataset is a data element that can contain one or more files, together with meta-information such as its type and format.
The endpoint paths below use the following placeholders:
- {uid}: the dataset identifier, the _id parameter of the Dataset element.
- *file: file path in the dataset structure (as defined in the dataset information)
Several authentication methods are available for accessing restricted resources:
- the user has been authenticated with the web portal
- a token has been created to grant temporary access to a dataset: the apikey parameter must be set in the query string parameters.
- OAuth2 access: after obtaining an access token, the Authorization header must carry the bearer token, as specified in the OAuth2 specification http://tools.ietf.org/html/rfc6749.
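The last two methods can be sketched as follows. This is a minimal illustration, not the reference client: the host in BASE_URL and the two helper names are assumptions, and only the apikey parameter and the Authorization header come from the description above.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://mobyle.example.org"  # hypothetical host, replace with your server


def request_with_apikey(path, apikey):
    """Build a GET request that passes a temporary token in the query string."""
    return Request(f"{BASE_URL}{path}?{urlencode({'apikey': apikey})}")


def request_with_bearer(path, access_token):
    """Build a GET request that carries an OAuth2 access token in the Authorization header."""
    return Request(f"{BASE_URL}{path}", headers={"Authorization": f"Bearer {access_token}"})
```

Either request object can then be passed to urllib.request.urlopen to perform the call.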
The data manager supports the OAuth2 protocol for requesting access to the data. The endpoints are:
- /oauth/v2/authorize: authentication endpoint
- /oauth/v2/token: access endpoint
Available scopes are:
- "user:email": access to the user's email address
- "drive": access to the user's datasets
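As an illustration, the authorization URL for both scopes could be built as below. The host, client_id and redirect_uri values are hypothetical. The parameter names and the space-separated scope list follow RFC 6749; it is an assumption that this server expects exactly that form.

```python
from urllib.parse import urlencode

BASE_URL = "https://mobyle.example.org"  # hypothetical host


def authorize_url(client_id, redirect_uri, scopes):
    """Build the URL the user is sent to in order to grant access (authorization code flow)."""
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),  # RFC 6749: scopes are space-separated
    }
    return f"{BASE_URL}/oauth/v2/authorize?{urlencode(params)}"


url = authorize_url("my-app", "https://client.example.org/cb", ["user:email", "drive"])
```

The code returned to redirect_uri is then exchanged for an access token at /oauth/v2/token.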
GET /my.json
Response example:
[
{"status": 2, "name": "jjjjjdddd", "tags": [], "persistent": false, "project": {"$oid": "5374c4d62e71a8032e9f4447"}, "path": null, "_id": {"$oid": "5375d59e2e71a879abed7497"}, "data": {"files": [{"path": "test.bam", "size": 19}, {"path": "test.bai", "size": 21}], "_type": "StructData", "type": "EDAM_data:0955+EDAM_data:0924", "properties": {"bai_data": {"path": ["test.bai"], "_type": "RefData", "format": "EDAM_format:3327", "type": "EDAM_data:0955", "size": 21}, "bam_data": {"path": ["test.bam"], "_type": "RefData", "format": "EDAM_format:2572", "type": "EDAM_data:0924", "size": 19}}, "format": "EDAM_format:3327+EDAM_format:2572"}, "public": true, "description": "dddd"},
{"status": 2, "name": "seq", "tags": [], "persistent": false, "project": {"$oid": "5374c4d62e71a8032e9f4447"}, "path": null, "_id": {"$oid": "5375d6b02e71a879abed7498"}, "data": {"path": ["seq"], "_type": "RefData", "format": "EDAM_format:2200", "type": "EDAM_data:2044", "size": 1382}, "public": true, "description": "ddd"}
]
This response contains two datasets. The first one has:
"data": {"files": [{"path": "test.bam", "size": 19}, {"path": "test.bai", "size": 21}], "_type": "StructData", ...
This is a "complex" dataset (StructData), i.e. a dataset made of multiple files. The relative path to each file is given in the files parameter.
The second one has:
"data": {"path": ["seq"], "_type": "RefData", "format": "EDAM_format:2200", "type": "EDAM_data:2044", "size": 1382}
This is a base dataset made of a single file (RefData). Its path is given in the path parameter.
The dataset id is in the "_id" parameter: "_id": {"$oid": "5375d59e2e71a879abed7497"}
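The two layouts can be handled as in the sketch below, which maps each dataset id to its file paths. The sample is a trimmed copy of the response above, and the function name is only illustrative.

```python
import json


def list_dataset_files(datasets):
    """Map each dataset id to the relative paths of its files."""
    result = {}
    for dataset in datasets:
        data = dataset["data"]
        if data["_type"] == "StructData":
            # Complex dataset: one entry per file in "files"
            paths = [entry["path"] for entry in data["files"]]
        else:
            # RefData: the paths are listed directly in "path"
            paths = list(data["path"])
        result[dataset["_id"]["$oid"]] = paths
    return result


sample = json.loads("""
[
  {"name": "jjjjjdddd",
   "_id": {"$oid": "5375d59e2e71a879abed7497"},
   "data": {"files": [{"path": "test.bam", "size": 19}, {"path": "test.bai", "size": 21}],
            "_type": "StructData"}},
  {"name": "seq",
   "_id": {"$oid": "5375d6b02e71a879abed7498"},
   "data": {"path": ["seq"], "_type": "RefData", "size": 1382}}
]
""")

files_by_id = list_dataset_files(sample)
```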
GET /public.json
This request is the same as my.json but returns only public datasets.
GET /download/{uid}/*file
Example, for the dataset:
"data": {"path": ["seq"], "_type": "RefData", "format": "EDAM_format:2200", "type": "EDAM_data:2044", "size": 1382}
GET /download/5375d6b02e71a879abed7498/seq
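A small helper can assemble this path from the dataset id and a file path taken from the dataset information. The host is a hypothetical value; percent-encoding the file path while keeping the "/" separators is an assumption about what the server accepts.

```python
from urllib.parse import quote

BASE_URL = "https://mobyle.example.org"  # hypothetical host


def download_url(uid, file_path):
    """Build the download URL for one file of a dataset."""
    # quote() keeps "/" as-is, so nested paths stay readable
    return f"{BASE_URL}/download/{uid}/{quote(file_path)}"


url = download_url("5375d6b02e71a879abed7498", "seq")
```

For a private dataset, the request additionally needs one of the authentication methods described earlier.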
A file upload creates a new dataset from a list of files embedded in the request body (multipart).
POST /data
Mandatory query parameters are:
- type: EDAM type of the data, '|' separator if multiple types
- format: EDAM format of the data, '|' separator if multiple formats
- project: Id of the project where data should be stored
- protocol: one of the supported protocols (ftp, http,...)
Optional query parameters are:
- _id: Id of the dataset to update
- privacy: whether the dataset is public or private (public/private, default private)
- name: name of the dataset
- description: text to describe the dataset
- uncompress: if parameter is present, uncompress the file (default false)
- group: if file has been uncompressed, group the different files in a single dataset (default false, create one dataset per file)
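The query string for an upload could be assembled as below. This sketch only builds the URL: the name of the multipart form field carrying the files is not documented here, so the body is left out. The host and the helper name are assumptions.

```python
from urllib.parse import urlencode

BASE_URL = "https://mobyle.example.org"  # hypothetical host


def upload_url(types, formats, project, protocol, **optional):
    """Build the POST /data URL; multiple types or formats are joined with '|'."""
    params = {
        "type": "|".join(types),
        "format": "|".join(formats),
        "project": project,
        "protocol": protocol,
    }
    # Optional parameters (name, description, privacy, ...) are passed through unchanged
    params.update(optional)
    return f"{BASE_URL}/data?{urlencode(params)}"


url = upload_url(
    ["EDAM_data:0955", "EDAM_data:0924"],
    ["EDAM_format:3327", "EDAM_format:2572"],
    "5374c4d62e71a8032e9f4447",
    "http",
    name="reads",
)
```

Note that urlencode percent-encodes the '|' separator as %7C, which the server decodes back when parsing the query string.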
A remote file upload creates a new dataset from a remote file URL (http, ftp, ...).
POST /remotedata
Mandatory query parameters are:
- rurl: URL of the remote file
- type: EDAM type of the data, '|' separator if multiple types
- format: EDAM format of the data, '|' separator if multiple formats
- project: Id of the project where data should be stored
- protocol: one of the supported protocols (ftp, http,...)
Optional query parameters are:
- _id: Id of the dataset to update
- privacy: whether the dataset is public or private (public/private, default private)
- name: name of the dataset
- description: text to describe the dataset
- uncompress: if parameter is present, uncompress the file (default false)
- group: if file has been uncompressed, group the different files in a single dataset (default false, create one dataset per file)
POST /data/{uid}/*file
Mandatory parameters:
- rurl: URL of the file to download for the update
Optional query parameters:
- description: text to describe the dataset
- msg: message describing the update
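An update request for one file of a dataset might be prepared as follows. This is a sketch under assumptions: the host is hypothetical, the helper name is illustrative, and all parameters (including rurl) are sent in the query string, which the description above does not state explicitly for the mandatory one.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "https://mobyle.example.org"  # hypothetical host


def update_request(uid, file_path, rurl, description=None, msg=None):
    """Build a POST request asking the server to update a file from a remote URL."""
    params = {"rurl": rurl}
    if description is not None:
        params["description"] = description
    if msg is not None:
        params["msg"] = msg
    url = f"{BASE_URL}/data/{uid}/{quote(file_path)}?{urlencode(params)}"
    return Request(url, method="POST")


req = update_request(
    "5375d6b02e71a879abed7498", "seq", "http://example.org/seq.fa", msg="new version"
)
```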