forked from mskilab-org/Flow
DESCRIPTION
Package: Flow
Title: Workflow and Task Management for Genomics Pipelines
Version: 0.0.0.9000
Author: Marcin Imielinski <[email protected]>
Maintainer: Marcin Imielinski <[email protected]>
Authors@R: person("Marcin", "Imielinski", , "[email protected]", role = c("aut", "cre"))
Description: Flow is an R package that enables local configuration and execution
    of modules on annotated sets of entities (e.g. pairs, individuals, samples).
    Jobs can be deployed either locally or on LSF, then monitored and managed.
    Once jobs complete, their outputs can be attached back to their respective
    entities as annotations for easy import back into Firehose or other
    databases. As in Firehose
    (http://www.broadinstitute.org/cancer/cga/firehose), a job consists of a
    task run on an entity (e.g. pair, individual, sample). A task wraps around
    a module and binds module arguments either to names of entity-specific
    annotations or to fixed literals, which can represent paths (e.g. a BAM
    file path) or values (e.g. 200). A task also specifies the binding of
    module outputs to output annotations. A job is created by applying a task
    to a set of entities, which correspond to a keyed table of entity-specific
    annotations (e.g. bam_file_wgs, seg_file). Once a job completes, one or
    more output annotations (i.e. paths to output files) are attached to the
    respective entity in an output table. This table can then be merged into a
    flat "master" file or database of entity-specific annotations. Coming
    soon: a Flow object, which will represent a workflow, i.e. a collection of
    Tasks run on entities, with very similar properties (vectorization, run
    control, status updating).
Depends:
    R (>= 3.1.0),
    data.table,
    stringr,
    parallel
Suggests:
    XML,
    testthat
License: GPL-3
LazyData: true
RoxygenNote: 6.0.1.9000