-
Notifications
You must be signed in to change notification settings - Fork 5
/
Copy pathworkflow.yml
392 lines (366 loc) · 12.4 KB
/
workflow.yml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
saladVersion: v1.1
$base: "https://galaxyproject.org/gxformat2/v19_09#"
$namespaces:
gxformat2: "https://galaxyproject.org/gxformat2/v19_09#"
gxformat2common: "https://galaxyproject.org/gxformat2/gxformat2common#"
cwl: "https://w3id.org/cwl/cwl#"
sld: "https://w3id.org/cwl/salad#"
rdfs: "http://www.w3.org/2000/01/rdf-schema#"
$graph:
- name: "WorkflowDoc"
type: documentation
doc:
- |
# Galaxy Workflow Format 2 Description
The traditional Galaxy workflow description (.ga) is not meant to be concise and is neither readily human readable or human writable.
Format 2 addresses all three of these limitations while also converging (where it makes sense without sacrificing these other goals)
with the workflow description with that used by the Common Workflow Language.
This standard is in active development and a moving target in many ways, but we will try to keep what is ingestible by Galaxy
backward-compatible going forward.
- $import: "../common/metaschema/metaschema_base.yml"
- $import: "./Process.yml" # trimmed down version of cwl's Process.yml
- $import: "../common/common.yml"
- name: GalaxyType
type: enum
extends: sld:PrimitiveType
symbols:
- integer
- text
- File
- data
- collection
doc:
- "Extends primitive types with the native Galaxy concepts such datasets and collections."
- "integer: an alias for int type - matches syntax used by Galaxy tools"
- "text: an alias for string type - matches syntax used by Galaxy tools"
- "File: an alias for data - there are subtle differences between a plain file, the CWL concept of 'File', and the Galaxy concept of a dataset - this may have subtly difference semantics in the future"
- "data: a Galaxy dataset"
- "collection: a Galaxy dataset collection"
- name: WorkflowStepType
type: enum
symbols:
- tool
- subworkflow
- pause
doc:
- |
Module types used by Galaxy steps. Galaxy's native format allows additional types such as data_input, data_input_collection, and parameter_type
but these should be represented as ``inputs`` in Format2.
- "tool: Run a tool."
- "subworkflow: Run a subworkflow."
- "pause: Pause computation on this branch of workflow until user allows it to continue."
- name: WorkflowInputParameter
type: record
extends:
- cwl:InputParameter
- gxformat2common:HasStepPosition
docParent: "#GalaxyWorkflow"
fields:
- name: type
type:
- GalaxyType
- "null"
- type: array
items:
- GalaxyType
default: data
jsonldPredicate:
"_id": "sld:type"
"_type": "@vocab"
refScope: 2
typeDSL: True
doc: |
Specify valid types of data that may be assigned to this parameter.
- name: optional
type:
- boolean
- "null"
default: false
doc: |
If set to true, `WorkflowInputParameter` is not required to submit the workflow.
- name: format
doc: |
Specify datatype extension for valid input datasets.
type:
- "null"
- type: array
items: string
- name: collection_type
doc: |
Collection type (defaults to `list` if `type` is `collection`). Nested
collection types are separated with colons, e.g. `list:list:paired`.
type:
- "null"
- string
- name: WorkflowOutputParameter
type: record
extends: cwl:OutputParameter
docParent: "#GalaxyWorkflow"
doc: |
Describe an output parameter of a workflow. The parameter must be
connected to one parameter defined in the workflow that
will provide the value of the output parameter. It is legal to
connect a WorkflowInputParameter to a WorkflowOutputParameter.
fields:
- name: outputSource
doc: |
Specifies workflow parameter that supply the value of to
the output parameter.
# Steps don't reference outputs in gxformat2 (yet anyway).
# Can we just link the step if before the /
#jsonldPredicate:
# "_id": "gxformat2:outputSource"
# "_type": "@id"
# refScope: 0
type:
- string?
- name: type
type: GalaxyType?
default: data
jsonldPredicate:
"_id": "sld:type"
"_type": "@vocab"
refScope: 2
typeDSL: True
doc: |
Specify valid types of data that may be assigned to this parameter.
- name: WorkflowStep
type: record
extends:
- cwl:Identified
- cwl:Labeled
- sld:Documented
- gxformat2common:HasStepPosition
- gxformat2common:ReferencesTool
- gxformat2common:HasStepErrors
- gxformat2common:HasUUID
docParent: "#GalaxyWorkflow"
doc: |
This represents a non-input step a Galaxy Workflow.
# A note about `state` and `tool_state` fields.
Only one or the other should be specified. These are two ways to represent the "state"
of a tool at this workflow step. Both are essentially maps from parameter names to
parameter values.
`tool_state` is much more low-level and expects a flat dictionary with each value a JSON
dump. Nested tool structures such as conditionals and repeats should have all their values
in the JSON dumped string. In general `tool_state` may be present in workflows exported from
Galaxy but shouldn't be written by humans.
`state` can contained a typed map. Repeat values can be represented as YAML arrays. An alternative
to representing `state` this way is defining inputs with default values.
fields:
- name: in
type: WorkflowStepInput[]?
jsonldPredicate:
_id: "gxformat2:in"
mapSubject: id
mapPredicate: source
doc: |
Defines the input parameters of the workflow step. The process is ready to
run when all required input parameters are associated with concrete
values. Input parameters include a schema for each parameter which is
used to validate the input object. It may also be used build a user
interface for constructing the input object.
- name: out
type:
- type: array
items: [string, WorkflowStepOutput]
- "null"
jsonldPredicate:
_id: "gxformat2:out"
mapSubject: id
mapPredicate: source
doc: |
Defines the parameters representing the output of the process. May be
used to generate and/or validate the output object.
This can also be called 'outputs' for legacy reasons - but the resulting
workflow document is not a valid instance of this schema.
- name: state
type: Any?
doc: |
Structured tool state.
jsonldPredicate:
_id: "gxformat2:state"
noLinkCheck: true
- name: tool_state
type: Any?
jsonldPredicate:
_id: "gxformat2:tool_state"
noLinkCheck: true
doc: |
Unstructured tool state.
- name: type
type: WorkflowStepType?
jsonldPredicate:
"_id": "sld:type"
"_type": "@vocab"
refScope: 2
typeDSL: True
default: tool
doc: |
Workflow step module's type (defaults to 'tool').
- name: run
type:
- "null"
- GalaxyWorkflow
jsonldPredicate:
_id: "cwl:run"
_type: "@id"
subscope: run
doc: |
Specifies a subworkflow to run.
- name: runtime_inputs
type:
- "null"
- type: array
items: string
- name: when
type:
- "null"
- string # TODO: cwl defines an enum for this, not sure how that works
jsonldPredicate:
_id: "gxformat2:when"
noLinkCheck: true
doc: |
If defined, only run the step when the expression evaluates to
`true`. If `false` the step is skipped. A skipped step
produces a `null` on each output.
Expression should be an ecma5.1 expression.
- name: Sink
type: record
abstract: true
fields:
- name: source
doc: |
Specifies one or more workflow parameters that will provide input to
the underlying step parameter.
jsonldPredicate:
"_id": "cwl:source"
"_type": "@id"
refScope: 2
type:
- string?
- string[]?
- type: record
name: WorkflowStepInput
extends: [cwl:Identified, Sink, cwl:Labeled]
docParent: "#WorkflowStep"
doc: |
TODO:
fields:
- name: default
type: ["null", Any]
doc: |
The default value for this parameter to use if either there is no
`source` field, or the value produced by the `source` is `null`. The
default must be applied prior to scattering or evaluating `valueFrom`.
jsonldPredicate:
_id: "sld:default"
noLinkCheck: true
- type: record
name: Report
doc: |
Definition of an invocation report for this workflow. Currently the only
field is 'markdown'.
fields:
- name: markdown
type: string
doc: |
Galaxy flavored Markdown to define an invocation report.
- type: record
name: WorkflowStepOutput
docParent: "#WorkflowStep"
extends: cwl:Identified
doc: |
Associate an output parameter of the underlying process with a workflow
parameter. The workflow parameter (given in the `id` field) be may be used
as a `source` to connect with input parameters of other workflow steps, or
with an output parameter of the process.
A unique identifier for this workflow output parameter. This is
the identifier to use in the `source` field of `WorkflowStepInput`
to connect the output value to downstream parameters.
fields:
rename:
type: string?
hide:
type: boolean?
delete_intermediate_datasets:
type: boolean?
change_datatype:
type: string?
set_columns:
type:
- "null"
- type: array
items: string
add_tags:
type:
- "null"
- type: array
items: string
remove_tags:
type:
- "null"
- type: array
items: string
- name: GalaxyWorkflow
type: record
extends:
- cwl:Process
- gxformat2common:HasUUID
specialize:
- specializeFrom: cwl:InputParameter
specializeTo: WorkflowInputParameter
- specializeFrom: cwl:OutputParameter
specializeTo: WorkflowOutputParameter
documentRoot: true
doc: |
A Galaxy workflow description. This record corresponds to the description of a workflow that should be executable
on a Galaxy server that includes the contained tool definitions.
The workflows API or the user interface of Galaxy instances that are of version 19.09 or newer should be able to
import a document defining this record.
## A note about `label` field.
This is the name of the workflow in the Galaxy user interface. This is the mechanism that
users will primarily identify the workflow using. Legacy support - this may also be called 'name' and Galaxy will
consume the workflow document fine and treat this attribute correctly - however in order to validate against this
workflow definition schema the attribute should be called `label`.
fields:
# I can't find a way to override the documentation of label, even though I contextual.
# - name: label
# type:
# - "null"
# - string
# jsonldPredicate: "rdfs:label"
# doc: {$include: ../common/workflow_label_description.txt}
# # Legacy support - this may also be called 'name' and Galaxy will consume the workflow fine and treat this
# # attribute correctly - however in order to validate against this schema the attribute should be called label.
- name: "class"
jsonldPredicate:
"_id": "@type"
"_type": "@vocab"
type: string
- name: steps
doc: {$include: ../common/steps_description.txt}
type:
- type: array
items: "#WorkflowStep"
jsonldPredicate:
mapSubject: id
- name: report
doc: Workflow invocation report template.
type: Report?
- name: tags
type:
- type: array
items: string
- "null"
doc: |
Tags for the workflow.
- name: creator
type: Any?
doc: Can be a schema.org Person (https://schema.org/Person) or Organization (https://schema.org/Organization) entity
- name: license
type: string?
doc: Must be a valid license listed at https://spdx.org/licenses/
- name: release
type: string?
doc: If listed should correspond to the release of the workflow in its source reposiory.