This repository has been archived by the owner on May 28, 2024. It is now read-only.

DataManagerAPI

Olivier Sallou edited this page Aug 28, 2014 · 18 revisions

Public API

Introduction

Mobyle manages datasets. A dataset is a data element that can contain multiple files and carries meta-information such as its type.

The endpoint paths on this page use the following placeholders:

  • {uid}: the dataset identifier, i.e. the _id parameter of the Dataset element.
  • {token}: a token generated to grant temporary access to a dataset.
  • *file: the file path within the dataset structure (as defined in the dataset information).

Authentication access

Several authentication methods are implemented for accessing restricted resources:

  • The user has been authenticated with the web portal.
  • A token has been created to grant temporary access to a dataset: the apikey parameter must be set in the query string.
  • OAuth2 access: after obtaining an access token, the Authorization header must carry the bearer token, as specified in the OAuth specification http://tools.ietf.org/html/rfc6749.
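As a minimal sketch of the last two methods, the following Python builds (but does not send) requests with the standard library. The server address and the token values are placeholders, and the source does not state which endpoints accept the apikey parameter, so the path used for it is only an illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical server address; replace with the real data manager URL.
BASE_URL = "https://mobyle.example.org"


def token_request(path, apikey):
    """Build a GET request carrying a temporary token in the apikey query parameter."""
    return Request(f"{BASE_URL}{path}?{urlencode({'apikey': apikey})}")


def oauth_request(path, access_token):
    """Build a GET request carrying an OAuth2 access token as a bearer token."""
    headers = {"Authorization": f"Bearer {access_token}"}
    return Request(f"{BASE_URL}{path}", headers=headers)


# Placeholder credentials, for illustration only.
by_token = token_request("/data/5375d6b02e71a879abed7498/raw/seq", "TEMPORARY_TOKEN")
by_oauth = oauth_request("/my.json", "ACCESS_TOKEN")
```

Either request object can then be passed to urllib.request.urlopen to perform the call.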

OAuth endpoints

The datamanager supports the OAuth2 protocol for requesting access to data:

  • /oauth/v2/authorize: authentication endpoint
  • /oauth/v2/token: access endpoint

If the accepted content type is application/json, a JSON response is sent instead of a redirect to redirect_uri.

Available scopes are:

  • user:email: access to the user's email address
  • drive: access to the user's datasets
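A hedged sketch of how a client might build the authorization URL. The query parameter names (response_type, client_id, redirect_uri, scope) and the space-separated scope list come from RFC 6749; the use of the authorization code grant, the server address, and the client values are assumptions, since this page does not specify them.

```python
from urllib.parse import urlencode


def authorize_url(base_url, client_id, redirect_uri, scopes):
    """Build the URL of the authorization endpoint for the requested scopes."""
    params = {
        "response_type": "code",      # assumption: authorization code grant
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),    # RFC 6749: space-delimited scope list
    }
    return f"{base_url}/oauth/v2/authorize?{urlencode(params)}"


# Hypothetical server, client id and callback.
url = authorize_url(
    "https://mobyle.example.org",
    "my-client-id",
    "https://client.example.org/callback",
    ["user:email", "drive"],
)
```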

Dataset list

Get user dataset list

GET /my.json

Response example:

[
{"status": 2, "name": "jjjjjdddd", "tags": [], "persistent": false, "project": {"$oid": "5374c4d62e71a8032e9f4447"}, "path": null, "_id": {"$oid": "5375d59e2e71a879abed7497"}, "data": {"files": [{"path": "test.bam", "size": 19}, {"path": "test.bai", "size": 21}], "_type": "StructData", "type": "EDAM_data:0955+EDAM_data:0924", "properties": {"bai_data": {"path": ["test.bai"], "_type": "RefData", "format": "EDAM_format:3327", "type": "EDAM_data:0955", "size": 21}, "bam_data": {"path": ["test.bam"], "_type": "RefData", "format": "EDAM_format:2572", "type": "EDAM_data:0924", "size": 19}}, "format": "EDAM_format:3327+EDAM_format:2572"}, "public": true, "description": "dddd"},
{"status": 2, "name": "seq", "tags": [], "persistent": false, "project": {"$oid": "5374c4d62e71a8032e9f4447"}, "path": null, "_id": {"$oid": "5375d6b02e71a879abed7498"}, "data": {"path": ["seq"], "_type": "RefData", "format": "EDAM_format:2200", "type": "EDAM_data:2044", "size": 1382}, "public": true, "description": "ddd"}
]

This response contains two datasets. The first has:

"data": {"files": [{"path": "test.bam", "size": 19}, {"path": "test.bai", "size": 21}], "_type": "StructData", ...

This is a "complex" dataset (StructData), i.e. a dataset made of multiple files. The relative path to each file is given in the files parameter.

The second has:

"data": {"path": ["seq"], "_type": "RefData", "format": "EDAM_format:2200", "type": "EDAM_data:2044", "size": 1382}

This is a basic dataset made of a single file (RefData). Its path is given in the path parameter.

The dataset id is in the "_id" parameter: "_id": {"$oid": "5375d59e2e71a879abed7497"}
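The two cases above can be handled with a small helper. This is a sketch only: the function name is hypothetical, and the sample is the response example trimmed to the fields the helper reads.

```python
import json


def dataset_files(dataset):
    """Return (dataset id, list of relative file paths) for one dataset entry."""
    uid = dataset["_id"]["$oid"]
    data = dataset["data"]
    if data["_type"] == "StructData":
        # Multi-file dataset: paths are listed in the "files" parameter.
        paths = [entry["path"] for entry in data["files"]]
    else:
        # Single-file dataset (RefData): the path is in the "path" parameter.
        paths = list(data["path"])
    return uid, paths


# Response example from above, trimmed to the relevant fields.
sample = json.loads("""
[
  {"_id": {"$oid": "5375d59e2e71a879abed7497"},
   "data": {"files": [{"path": "test.bam", "size": 19},
                      {"path": "test.bai", "size": 21}],
            "_type": "StructData"}},
  {"_id": {"$oid": "5375d6b02e71a879abed7498"},
   "data": {"path": ["seq"], "_type": "RefData", "size": 1382}}
]
""")

listing = [dataset_files(d) for d in sample]
```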

Get public dataset list

GET /public.json

This request is the same as my.json but returns only public datasets.

Dataset file download

GET /data/{uid}/raw/*file (if public or authenticated or oauth token)

GET /token/{token}/raw/*file (if mobyle access token)

Example, for the dataset:

"data": {"path": ["seq"], "_type": "RefData", "format": "EDAM_format:2200", "type": "EDAM_data:2044", "size": 1382}

GET /data/5375d6b02e71a879abed7498/raw/seq
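A short sketch of building either download URL from a dataset id or a Mobyle access token. The helper name and the server address are hypothetical.

```python
from urllib.parse import quote


def raw_file_url(base_url, file_path, uid=None, token=None):
    """Build the download URL of a dataset file, by dataset id or by access token."""
    if (uid is None) == (token is None):
        raise ValueError("provide exactly one of uid or token")
    prefix = f"/data/{uid}" if uid is not None else f"/token/{token}"
    # quote() escapes unsafe characters but keeps "/" so sub-paths stay intact.
    return f"{base_url}{prefix}/raw/{quote(file_path)}"


# Hypothetical server address; the dataset id is the one from the example above.
url = raw_file_url("https://mobyle.example.org", "seq", uid="5375d6b02e71a879abed7498")
```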

Dataset upload

File upload - Not yet available in JSON

The file upload creates a new dataset from a list of files embedded in the request body (multipart).

POST /data

Mandatory query parameters are:

  • type: EDAM type of the data, '|' separator if multiple types
  • format: EDAM format of the data, '|' separator if multiple formats
  • project: Id of the project where data should be stored
  • protocol: one of the supported protocols (ftp, http,...)

Optional query parameters are:

  • _id: Id of the dataset to update
  • privacy: is dataset public or not (public/private, default private)
  • name: name of the dataset
  • description: text to describe the dataset
  • uncompress: if parameter is present, uncompress the file (default false)
  • group: if file has been uncompressed, group the different files in a single dataset (default false, create one dataset per file)

Remote file upload - Not yet available in JSON

The remote file upload creates a new dataset from a remote file URL (http, ftp, ...).

POST /remotedata

Mandatory query parameters are:

  • rurl: URL of the remote file
  • type: EDAM type of the data, '|' separator if multiple types
  • format: EDAM format of the data, '|' separator if multiple formats
  • project: Id of the project where data should be stored
  • protocol: one of the supported protocols (ftp, http,...)

Optional query parameters are:

  • _id: Id of the dataset to update
  • privacy: is dataset public or not (public/private, default private)
  • name: name of the dataset
  • description: text to describe the dataset
  • uncompress: if parameter is present, uncompress the file (default false)
  • group: if file has been uncompressed, group the different files in a single dataset (default false, create one dataset per file)
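As an illustration of the query parameters for both upload endpoints, the sketch below builds the target URL only; it does not construct the multipart body needed by POST /data. The helper name, the server address, and the example values are assumptions.

```python
from urllib.parse import urlencode


def upload_url(base_url, types, formats, project, protocol, rurl=None, **optional):
    """Build the URL for POST /data, or POST /remotedata when rurl is given."""
    params = {
        "type": "|".join(types),      # '|' separator if multiple types
        "format": "|".join(formats),  # '|' separator if multiple formats
        "project": project,
        "protocol": protocol,
    }
    endpoint = "/data"
    if rurl is not None:
        endpoint = "/remotedata"
        params["rurl"] = rurl
    # Optional parameters: _id, privacy, name, description, uncompress, group.
    params.update(optional)
    return f"{base_url}{endpoint}?{urlencode(params)}"


# Hypothetical remote archive, stored in the project from the response example.
url = upload_url(
    "https://mobyle.example.org",
    ["EDAM_data:0955", "EDAM_data:0924"],
    ["EDAM_format:3327", "EDAM_format:2572"],
    project="5374c4d62e71a8032e9f4447",
    protocol="http",
    rurl="http://example.org/test.tar.gz",
    name="test",
    privacy="private",
)
```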

Update a data file in a dataset

PUT /data/{uid}/raw/*file (if authenticated or oauth token)

PUT /token/{token}/raw/*file (if mobyle access token)

The file upload updates a dataset file with the file embedded in the request body (multipart).

Parameters:

  • msg (mandatory): message describing the update
  • rurl (optional): gets the file from a remote URL instead of embedding it in the request body.

Example:

PUT /data/5375d6b02e71a879abed7498/raw/seq
params: 
        msg="file edited in myeditor.org"
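A minimal sketch of preparing such an update request in Python. The page does not say whether msg and rurl travel in the query string or as form fields; the query string is assumed here. The server address is a placeholder, and the multipart body needed when rurl is not used is left out.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def update_request(base_url, uid, file_path, msg, rurl=None):
    """Build (without sending) a PUT request that updates one file of a dataset."""
    params = {"msg": msg}
    if rurl is not None:
        # Fetch the new content from a remote URL instead of the request body.
        params["rurl"] = rurl
    url = f"{base_url}/data/{uid}/raw/{quote(file_path)}?{urlencode(params)}"
    return Request(url, method="PUT")


# Hypothetical server address; dataset id and message are from the example above.
req = update_request(
    "https://mobyle.example.org",
    "5375d6b02e71a879abed7498",
    "seq",
    "file edited in myeditor.org",
)
```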