openapi: 3.0.2
info:
title: openEO API
version: 1.1.0
description: |-
The openEO API specification for interoperable cloud-based processing of large Earth observation datasets.
# API Principles
## Language
In this specification, the key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” are to be interpreted as described in [RFC 2119](https://www.rfc-editor.org/rfc/rfc2119.html) and [RFC 8174](https://www.rfc-editor.org/rfc/rfc8174.html).
## Casing
Unless otherwise stated, the API is **case sensitive**.
All names SHOULD be written in snake case, i.e. words are separated with one underscore character (`_`) and no spaces, with all letters lower-cased. Example: `hello_world`. This applies particularly to endpoints and JSON property names. HTTP header fields follow their respective casing conventions, e.g. `Content-Type` or `OpenEO-Costs`, despite being case-insensitive according to [RFC 7230](https://www.rfc-editor.org/rfc/rfc7230.html#section-3.2).
## HTTP / REST
The API uses [HTTP REST](https://en.wikipedia.org/wiki/Representational_state_transfer) [Level 2](https://martinfowler.com/articles/richardsonMaturityModel.html#level2) for communication between client and back-end server.
Public APIs MUST be available via HTTPS only.
Endpoints use meaningful HTTP verbs (e.g. GET, POST, PUT, PATCH, DELETE) whenever technically possible. If there is a need to transfer big chunks of data for a GET request to the back-end, POST requests MAY be used as a replacement as they support sending data via the request body. Unless otherwise stated, PATCH requests are only defined to work on direct (first-level) children of the full JSON object. Therefore, changing a property on a deeper level of the full JSON object always requires sending the whole JSON object defined by the first-level property.
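The first-level PATCH semantics can be illustrated with a minimal sketch. The function name `apply_first_level_patch` and the sample properties (`title`, `budget`, `process`) are assumptions chosen for illustration; the sketch only shows that a first-level property is replaced as a whole and nested objects are not merged.

```python
import copy


def apply_first_level_patch(resource: dict, patch: dict) -> dict:
    """Return a copy of `resource` where each first-level key in `patch`
    replaces the corresponding first-level property as a whole."""
    updated = copy.deepcopy(resource)
    for key, value in patch.items():
        # No deep merge: the nested object is swapped out completely.
        updated[key] = copy.deepcopy(value)
    return updated


# Hypothetical stored resource and PATCH body.
job = {"title": "NDVI", "budget": 100, "process": {"id": "ndvi", "summary": "old"}}
patched = apply_first_level_patch(job, {"process": {"id": "ndvi"}})
print(patched)
```

In this sketch the `summary` inside `process` disappears because the client sent the whole `process` object without it, which is why changing a deeper property requires sending the complete first-level object.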
Naming of endpoints follows the REST principles. Therefore, endpoints are centered around resources. Resource identifiers MUST be named with a noun in plural form except for single actions that cannot be modelled with the regular HTTP verbs. Single actions MUST be single endpoints with a single HTTP verb (POST is RECOMMENDED) and no other endpoints beneath it.
## JSON
The API uses JSON for request and response bodies whenever feasible. Services use JSON as the default encoding. Other encodings can be requested using [Content Negotiation](https://www.w3.org/Protocols/rfc2616/rfc2616-sec12.html). Clients and servers MUST NOT rely on the order in which properties appear in JSON. Collections usually don't include nested JSON objects if that information can be requested from the individual resources.
## Web Linking
The API is designed so that a set of links can be added to most entities (e.g. collections and processes). These can be alternate representations, e.g. data discovery via OGC WCS or OGC CSW, references to a license, references to actual raw data for downloading, detailed information about pre-processing and more. Clients should allow users to follow the links.
Whenever links are utilized in the API, the description explains which relation (`rel` property) types are commonly used.
A [list of standardized link relation types is provided by IANA](https://www.iana.org/assignments/link-relations/link-relations.xhtml) and the API tries to align with it whenever feasible.
Some very common relation types - usually not mentioned explicitly in the description of `links` fields - are:
1. `self`: A link to the location where the resource can be (permanently) found online. This is particularly useful when the data is made available offline, so that the downstream user knows where the data has come from.
2. `alternate`: An alternative representation of the resource, may it be another metadata standard the data is available in or simply a human-readable version in HTML or PDF.
3. `about`: A resource that is related or further explains the resource, e.g. a user guide.
4. `canonical`: This relation type usually points to a publicly accessible and more long-lived URL for a resource that otherwise often requires (Bearer) authentication with a short-lived token.
This way the exposed resources can be used by non-openEO clients without additional authentication steps.
For example, a shared user-defined process or batch job results could be exposed via a canonical link.
If a URL should be publicly available to everyone, it can simply be a user-specific URL, e.g. `https://example.com/processes/john_doe/ndvi`.
For resources that should only be accessible to a certain group of users, a signed URL could be given, e.g. `https://example.com/processes/81zjh1tc2pt52gbx/ndvi`.
Generally, it is RECOMMENDED to add descriptive titles (property `title`) and media type information (property `type`) for a better user experience.
## Error Handling
The success of requests MUST be indicated using [HTTP status codes](https://www.rfc-editor.org/rfc/rfc7231.html#section-6) according to [RFC 7231](https://www.rfc-editor.org/rfc/rfc7231.html).
If the API responds with a status code between 100 and 399, the back-end indicates that the request has been handled successfully.
In general an error is communicated with a status code between 400 and 599. Client errors are defined as a client passing invalid data to the service and the service *correctly* rejecting that data. Examples include invalid credentials, incorrect parameters, unknown versions, or similar. These are generally "4xx" HTTP error codes and are the result of a client passing incorrect or invalid data. Client errors do *not* contribute to overall API availability.
Server errors are defined as the server failing to correctly return in response to a valid client request. These are generally "5xx" HTTP error codes. Server errors *do* contribute to the overall API availability. Calls that fail due to rate limiting or quota failures MUST NOT count as server errors.
### JSON error object
A JSON error object SHOULD be sent with all responses that have a status code between 400 and 599.
``` json
{
  "id": "936DA01F-9ABD-4D9D-80C7-02AF85C822A8",
  "code": "SampleError",
  "message": "A sample error message.",
  "url": "https://example.openeo.org/docs/errors/SampleError"
}
```
Sending `code` and `message` is REQUIRED.
* A back-end MAY add a free-form `id` (unique identifier) to the error response to be able to log and track errors with further non-disclosable details.
* The `code` is either one of the [standardized textual openEO error codes](errors.json) or a proprietary error code.
* The `message` explains the reason the server is rejecting the request. For "4xx" error codes the message explains how the client needs to modify the request.
By default the message MUST be sent in English. Content Negotiation is used to localize the error messages: If an `Accept-Language` header is sent by the client and a translation is available, the message should be translated accordingly and the `Content-Language` header must be present in the response. See "[How to localize your API](http://apiux.com/2013/04/25/how-to-localize-your-api/)" for more information.
* `url` is an OPTIONAL attribute and contains a link to a resource that explains the error and potential solutions in depth.
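As a rough illustration of the rules above, the following sketch assembles an error object with the REQUIRED `code` and `message` fields and the optional `id` and `url` fields. The helper name `build_error` and its parameters are assumptions made for this example and are not part of the API.

```python
import json
import uuid
from typing import Optional


def build_error(code: str, message: str, url: Optional[str] = None,
                with_id: bool = True) -> dict:
    """Assemble a JSON error object; `code` and `message` are always present."""
    error = {"code": code, "message": message}
    if with_id:
        # Free-form identifier that lets the back-end correlate the
        # response with its internal logs.
        error["id"] = str(uuid.uuid4())
    if url is not None:
        error["url"] = url
    return error


sample = build_error("SampleError", "A sample error message.")
print(json.dumps(sample, indent=2))
```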
### Standardized status codes
The openEO API usually uses the following HTTP status codes for successful requests:
- **200 OK**:
Indicates a successful request **with** a response body being sent.
- **201 Created**:
Indicates a successful request that successfully created a new resource. Sends a `Location` header to the newly created resource **without** a response body.
- **202 Accepted**:
Indicates a successful request that successfully queued the creation of a new resource, but it has not been created yet. The response is sent **without** a response body.
- **204 No Content**:
Indicates a successful request **without** a response body being sent.
The openEO API has some commonly used HTTP status codes for failed requests:
- **400 Bad Request**:
The back-end responds with this error code whenever the error has its origin on client side and no other HTTP status code in the 400 range is suitable.
- **401 Unauthorized**:
The client did not provide any authentication details for a resource requiring authentication or the provided authentication details are not correct.
- **403 Forbidden**:
The client provided correct authentication details, but the privileges/permissions of the provided credentials do not allow requesting the resource.
- **404 Not Found**:
The resource specified by the path does not exist, i.e. one of the resources belonging to the specified identifiers is not available at the back-end.
*Note:* Unsupported endpoints MAY also return HTTP status code 501.
- **500 Internal Server Error**:
The error has its origin on server side and no other status code in the 500 range is suitable.
- **501 Not Implemented**:
The requested endpoint is specified by the openEO API, but is not implemented (yet) by the back-end.
*Note:* Unsupported endpoints MAY also return HTTP status code 404.
If an HTTP status code in the 400 range is returned, the client SHOULD NOT repeat the request without modifications. For HTTP status codes in the 500 range, the client MAY repeat the same request later.
All HTTP status codes defined in RFC 7231 in the 400 and 500 ranges can be used as openEO error code in addition to the most used status codes mentioned here. Responding with openEO error codes 400 and 500 SHOULD be avoided in favor of any more specific standardized or proprietary openEO error code.
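A client could turn the status code ranges described in this section into a small decision helper. The sketch below is only an illustration; the function names `classify_status` and `may_retry_unchanged` are assumptions and do not appear in the API.

```python
def classify_status(status: int) -> str:
    """Map an HTTP status code to the categories used in this section."""
    if 100 <= status <= 399:
        return "success"
    if 400 <= status <= 499:
        return "client_error"
    if 500 <= status <= 599:
        return "server_error"
    raise ValueError(f"not a valid HTTP status code: {status}")


def may_retry_unchanged(status: int) -> bool:
    """True only for 5xx responses: the same request MAY be repeated later.
    For 4xx responses the request SHOULD be modified before repeating it."""
    return classify_status(status) == "server_error"


for code in (201, 404, 501):
    print(code, classify_status(code), may_retry_unchanged(code))
```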
## Temporal data
Date, time, intervals and durations are formatted based on ISO 8601 or its profile [RFC 3339](https://www.rfc-editor.org/rfc/rfc3339.html) whenever there is an appropriate encoding available in the standard. All temporal data are specified based on the Gregorian calendar.
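For date-time values, a minimal sketch using only the Python standard library could look as follows. The helper names are assumptions, and normalizing to UTC with a trailing `Z` is one possible choice among the encodings RFC 3339 allows.

```python
from datetime import datetime, timezone


def to_rfc3339(moment: datetime) -> str:
    """Format a timezone-aware datetime as an RFC 3339 string in UTC."""
    if moment.tzinfo is None:
        raise ValueError("a timezone-aware datetime is required")
    text = moment.astimezone(timezone.utc).isoformat(timespec="seconds")
    return text.replace("+00:00", "Z")


def from_rfc3339(text: str) -> datetime:
    """Parse a date-time string such as 2020-01-01T12:00:00Z."""
    # Replacing "Z" keeps this working on Python versions whose
    # fromisoformat() does not accept the "Z" suffix.
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


stamp = to_rfc3339(datetime(2020, 1, 1, 12, 0, tzinfo=timezone.utc))
print(stamp)
```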
# Authentication
The openEO API offers two forms of authentication by default:
* OpenID Connect (recommended) at `GET /credentials/oidc`
* Basic at `GET /credentials/basic`
After authentication with any of the methods listed above, the tokens obtained during the authentication workflows can be sent to protected endpoints in subsequent requests.
Further authentication methods MAY be added by back-ends.
<SecurityDefinitions />
# Cross-Origin Resource Sharing (CORS)
> Cross-origin resource sharing (CORS) is a mechanism that allows restricted resources [...] on a web page to be requested from another domain outside the domain from which the first resource was served. [...]
> CORS defines a way in which a browser and server can interact to determine whether or not it is safe to allow the cross-origin request. It allows for more freedom and functionality than purely same-origin requests, but is more secure than simply allowing all cross-origin requests.
Source: [https://en.wikipedia.org/wiki/Cross-origin_resource_sharing](https://en.wikipedia.org/wiki/Cross-origin_resource_sharing)
openEO-based back-ends are usually hosted on a different domain / host than the client that is requesting data from the back-end. Therefore most requests to the back-end are blocked by all modern browsers. This leads to the problem that the JavaScript library and any browser-based application can't access back-ends. Therefore, all back-end providers SHOULD support CORS to enable browser-based applications to access back-ends. [CORS is a recommendation of the W3C organization](https://www.w3.org/TR/cors/). The following chapters will explain how back-end providers can implement CORS support.
**Tip**: Most servers can send the required headers and the responses to the OPTIONS requests automatically for all endpoints. Otherwise you may also use a proxy server to add the headers and OPTIONS responses.
## CORS headers
The following headers MUST be included with every response:
| Name | Description | Example |
| -------------------------------- | ------------------------------------------------------------ | ------- |
| Access-Control-Allow-Origin | Allowed origin for the request, including protocol, host and port or `*` for all origins. It is RECOMMENDED to return the value `*` to allow requests from browser-based implementations such as the Web Editor. | `*` |
| Access-Control-Expose-Headers | Some endpoints require to send additional HTTP response headers such as `OpenEO-Identifier` and `Location`. To make these headers available to browser-based clients, they MUST be white-listed with this CORS header. The following HTTP headers are white-listed by browsers and MUST NOT be included: `Cache-Control`, `Content-Language`, `Content-Length`, `Content-Type`, `Expires`, `Last-Modified` and `Pragma`. At least the following headers MUST be listed in this version of the openEO API: `Link`, `Location`, `OpenEO-Costs` and `OpenEO-Identifier`. | `Link, Location, OpenEO-Costs, OpenEO-Identifier` |
### Example request and response
Request:
```http
POST /api/v1/jobs HTTP/1.1
Host: openeo.cloudprovider.com
Origin: https://client.org:8080
Authorization: Bearer basic//ZXhhbXBsZTpleGFtcGxl
```
Response:
```http
HTTP/1.1 201 Created
Access-Control-Allow-Origin: *
Access-Control-Expose-Headers: Location, OpenEO-Identifier, OpenEO-Costs, Link
Content-Type: application/json
Location: https://openeo.cloudprovider.com/api/v1/jobs/abc123
OpenEO-Identifier: abc123
```
## OPTIONS method
All endpoints must respond to the `OPTIONS` HTTP method. This is a response for the preflight requests made by web browsers before sending the actual request (e.g. `POST /jobs`). It needs to respond with a status code of `204` and no response body.
**In addition** to the HTTP headers shown in the table above, the following HTTP headers MUST be included with every response to an `OPTIONS` request:
| Name | Description | Example |
| -------------------------------- | ------------------------------------------------------------ | ------- |
| Access-Control-Allow-Headers | Comma-separated list of HTTP headers allowed to be sent with the actual (non-preflight) request. MUST contain at least `Authorization` if any kind of authorization is implemented by the back-end. | `Authorization, Content-Type` |
| Access-Control-Allow-Methods | Comma-separated list of HTTP methods allowed to be requested. Back-ends MUST list all implemented HTTP methods for the endpoint. | `OPTIONS, GET, POST, PATCH, PUT, DELETE` |
| Content-Type | SHOULD return the content type delivered by the request that the permission is requested for. | `application/json` |
### Example request and response
Request:
```http
OPTIONS /api/v1/jobs HTTP/1.1
Host: openeo.cloudprovider.com
Origin: https://client.org:8080
Access-Control-Request-Method: POST
Access-Control-Request-Headers: Authorization, Content-Type
```
Note that the `Access-Control-Request-*` headers are automatically attached to the requests by the browsers.
Response:
```http
HTTP/1.1 204 No Content
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: OPTIONS, GET, POST, PATCH, PUT, DELETE
Access-Control-Allow-Headers: Authorization, Content-Type
Access-Control-Expose-Headers: Location, OpenEO-Identifier, OpenEO-Costs, Link
Content-Type: application/json
```
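The header rules from both tables can be summarized in a framework-independent sketch. The function names `cors_headers` and `preflight_response` and the `(status, headers, body)` return shape are assumptions for illustration; a real back-end would usually configure this in its web server, framework or proxy.

```python
# Headers that MUST at least be exposed in this version of the API.
EXPOSED_HEADERS = ["Link", "Location", "OpenEO-Costs", "OpenEO-Identifier"]


def cors_headers() -> dict:
    """Headers to include with every response."""
    return {
        "Access-Control-Allow-Origin": "*",
        "Access-Control-Expose-Headers": ", ".join(EXPOSED_HEADERS),
    }


def preflight_response(implemented_methods, content_type="application/json"):
    """Build a (status, headers, body) triple for an OPTIONS request."""
    methods = ["OPTIONS"] + [m for m in implemented_methods if m != "OPTIONS"]
    headers = cors_headers()
    headers.update({
        "Access-Control-Allow-Methods": ", ".join(methods),
        "Access-Control-Allow-Headers": "Authorization, Content-Type",
        "Content-Type": content_type,
    })
    # Status 204 and an empty body, as required for preflight responses.
    return 204, headers, b""


status, headers, body = preflight_response(["GET", "POST"])
print(status, headers)
```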
# Processes
A **process** is an operation that performs a specific task on a set of parameters and returns a result. An example is computing a statistical operation, such as mean or median, on selected EO data. A process is similar to a function or method in programming languages. In openEO, processes are used to build a chain of processes ([process graph](#section/Processes/Process-Graphs)), which can be applied to EO data to derive your own findings from the data.
A **predefined process** is a process provided by the *back-end*. There is a set of predefined processes by openEO to improve interoperability between back-ends.
Back-ends SHOULD follow these specifications whenever possible. Not all processes need to be implemented by all back-ends. See the **[process reference](https://processes.openeo.org)** for predefined processes.
A **user-defined process** is a process defined by the *user*. It can directly be part of another process graph or be stored as custom process on a back-end. Internally it is a *process graph* with optional additional metadata.
A **process graph** chains specific process calls from the set of predefined and user-defined processes together. A process graph itself can be stored as a (user-defined) process again. Similarly to scripts in the context of programming, process graphs organize and automate the execution of one or more processes that could alternatively be executed individually. In a process graph, processes need to be specific, i.e. concrete values or "placeholders" for input parameters need to be specified. These values can be scalars, arrays, objects, references to parameters or previous computations or other process graphs.
## Defining Processes
Back-ends and users MAY define new proprietary processes for their domain.
**Back-end providers** MUST follow the schema for predefined processes as in [`GET /processes`](#operation/list-processes) to define new processes. This includes:
* Choosing an intuitive process id, consisting of only letters (a-z), numbers and underscores. It MUST be unique across the predefined processes.
* Defining the parameters and their exact (JSON) schemes.
* Specifying the return value of a process also with a (JSON) schema.
* Providing examples or compliance tests.
* Trying to make the process universally usable so that other back-end providers or openEO can adopt it.
**Users** MUST follow the schema for user-defined processes as in [`GET /process_graphs/{process_graph_id}`](#operation/describe-custom-process) to define new processes. This includes:
* Choosing an intuitive name as process id, consisting of only letters (a-z), numbers and underscores. It MUST be unique across the user-defined processes.
* Defining the algorithm as a process graph.
* Optionally, specifying the additional metadata for processes.
If new processes are potentially useful for other back-ends, the openEO consortium happily accepts [pull requests](https://github.com/Open-EO/openeo-processes/pulls) to include them in the list of predefined processes.
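The identifier rules from the lists above can be checked with a short sketch. Reading "letters (a-z)" literally as lower-case ASCII letters is an assumption of this example, as are the names `PROCESS_ID_PATTERN` and `is_valid_process_id`; the schema definitions elsewhere in the specification remain authoritative.

```python
import re

# Assumption: only lower-case ASCII letters, digits and underscores.
PROCESS_ID_PATTERN = re.compile(r"[a-z0-9_]+")


def is_valid_process_id(process_id: str, existing_ids) -> bool:
    """Check the character set and the uniqueness within one namespace."""
    if PROCESS_ID_PATTERN.fullmatch(process_id) is None:
        return False
    return process_id not in existing_ids


existing = {"ndvi", "absolute"}
print(is_valid_process_id("evi_min", existing))
```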
### Schemas
Each process parameter and the return values of a process define a schema that the value MUST comply with. The schemas are based on [JSON Schema draft-07](http://json-schema.org/).
Three custom keywords have been defined:
* `subtype` for more fine-grained data-types than JSON Schema supports.
* `parameters` to specify the parameters of a process graph (object with subtype `process-graph`).
* `returns` to describe the return value of a process graph (object with subtype `process-graph`).
### Subtypes
JSON Schema allows specifying only a small set of native data types (string, boolean, number, integer, array, object, null).
To support more fine-grained data types, a custom [JSON Schema keyword](https://json-schema.org/draft-07/json-schema-core.html#rfc.section.6.4) has been defined: `subtype`.
It works similarly to the JSON Schema keyword [`format`](https://json-schema.org/draft-07/json-schema-validation.html#rfc.section.7)
and standardizes a number of openEO related data types that extend the native data types, for example:
`bounding-box` (object with at least `west`, `south`, `east` and `north` properties),
`date-time` (string representation of date and time following RFC 3339),
`raster-cube` (the type of data cubes), etc.
The subtypes should be re-used in process schema definitions whenever suitable.
If a general data type such as `string` or `number` is used in a schema, all subtypes with the same parent data type can be passed, too.
Clients should make passing subtypes as easy as passing a general data type.
For example, a parameter accepting strings must also allow passing a string with subtype `date` and thus clients should encourage this by also providing a date-picker.
A list of predefined subtypes is available as JSON Schema in [openeo-processes](https://github.com/Open-EO/openeo-processes).
## Process Graphs
As defined above, a **process graph** is a chain of processes with explicit values for their parameters.
Technically, a process graph is defined to be a graph of connected processes with exactly one node returning the final result:
```
<ProcessGraph> := {
  "<ProcessNodeIdentifier>": <ProcessNode>,
  ...
}
```
`<ProcessNodeIdentifier>` is a unique key within the process graph that is used to reference (the return value of) this process in arguments of other processes. The identifier only needs to be unique within its own process graph, excluding any parent and child process graphs. Process node identifiers are also strictly scoped and cannot be referenced from child or parent process graphs. Circular references are not allowed.
Note: We provide a non-binding [JSON Schema for basic process graph validation](assets/pg-schema.json).
### Processes (Process Nodes)
A single node in a process graph (i.e. a specific instance of a process) is defined as follows:
```
<ProcessNode> := {
  "process_id": <string>,
  "namespace": <string> / null,
  "description": <string>,
  "arguments": <Arguments>,
  "result": true / false
}
```
A process node MUST always contain key-value-pairs named `process_id` and `arguments`. It MAY contain a `description`.
One of the nodes in a map of processes (the final one) MUST have the `result` flag set to `true`, all the other nodes can omit it as the default value is `false`. Having such a node is important as multiple end nodes are possible, but in most use cases it is important to exactly specify the return value to be used by other processes. Each child process graph must also specify a result node similar to the "main" process graph.
`process_id` MUST be a valid process ID in the `namespace` given. Clients SHOULD warn the user if a user-defined process is added with the same identifier as one of the predefined processes.
### Arguments
A process can have an arbitrary number of arguments. Their name and value are specified
in the process specification as an object of key-value pairs:
```
<Arguments> := {
  "<ParameterName>": <string|number|boolean|null|array|object|ResultReference|UserDefinedProcess|ParameterReference>
}
```
**Notes:**
- The specified data types are the native data types supported by JSON, except for `ResultReference`, `UserDefinedProcess` and `ParameterReference`.
- Objects are not allowed to have keys with the following reserved names:
    * `from_node`, except for objects of type `ResultReference`
    * `process_graph`, except for objects of type `UserDefinedProcess`
    * `from_parameter`, except for objects of type `ParameterReference`
- Arrays and objects can also contain a `ResultReference`, a `UserDefinedProcess` or a `ParameterReference`. So back-ends must *fully* traverse the process graphs, including all children.
### Accessing results of other process nodes
A value of type `<ResultReference>` is an object with a key `from_node` and a `<ProcessNodeIdentifier>` as corresponding value:
```
<ResultReference> := {
  "from_node": "<ProcessNodeIdentifier>"
}
```
This tells the back-end that the process expects the result (i.e. the return value) from another process node to be passed as argument.
The `<ProcessNodeIdentifier>` is strictly scoped and can only reference nodes from within the same process graph, not child or parent process graphs.
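Because result references may be nested inside arrays and objects, finding them requires a recursive walk over the arguments. The following sketch shows one way to do that; the function name `collect_result_references` and the sample node identifiers are assumptions. It does not descend into child process graphs, since their node identifiers live in a separate scope.

```python
def collect_result_references(value) -> list:
    """Return the node identifiers referenced via `from_node` in `value`."""
    if isinstance(value, dict):
        if "from_node" in value:
            # `from_node` is a reserved key, so this is a result reference.
            return [value["from_node"]]
        if "process_graph" in value:
            # Child process graph: its references belong to its own scope.
            return []
        refs = []
        for item in value.values():
            refs.extend(collect_result_references(item))
        return refs
    if isinstance(value, list):
        refs = []
        for item in value:
            refs.extend(collect_result_references(item))
        return refs
    return []


arguments = {
    "data": {"from_node": "load1"},
    "values": [{"from_node": "a"}, 1, {"nested": {"from_node": "b"}}],
    "process": {"process_graph": {"abs1": {
        "process_id": "absolute",
        "arguments": {"x": {"from_node": "inner"}},
        "result": True,
    }}},
}
print(collect_result_references(arguments))
```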
### User-defined process
A user-defined process in a process graph is a child process graph, to be evaluated as part of another process.
**Example**: You want to calculate the absolute value of each pixel in a data cube.
This can be achieved in openEO by executing the `apply` process and passing it
a user-defined process as the "operator" to apply to each pixel.
In this simple example, the "child" process graph defining the user-defined process
consists of a single process `absolute`, but it can be arbitrarily complex in general.
A `<UserDefinedProcess>` argument MUST at least consist of an object with a key `process_graph`.
Optionally, it can also be described with the same additional properties available for predefined processes such as an id, parameters, return values etc.
When embedded in a process graph, these additional properties of a user-defined process are usually not used, except for validation purposes.
```
<UserDefinedProcess> := {
  "process_graph": <ProcessGraph>,
  ...
}
```
### Accessing process parameters
A "parent" process that works with a user-defined process can make so-called *process graph parameters*
available to the "child" logic.
Processes in the "child" process graph can access these parameters by passing a `ParameterReference` object as argument.
It is an object with key `from_parameter` specifying the name of the process graph parameter:
```
<ParameterReference> := {
  "from_parameter": "<ParameterReferenceName>"
}
```
The parameter names made available for `<ParameterReferenceName>` are defined and passed to the process graph by one of the parent entities.
The parent could be a process (such as `apply` or `reduce_dimension`) or something else that executes a process graph (a secondary web service for example).
If the parent is a process, the parameters are defined in the [`parameters` property](#section/Processes/Defining-Processes) of the corresponding JSON Schema.
In case of the example given above, the parameter `process` in the process [`apply`](https://processes.openeo.org/#apply) defines two process graph parameters: `x` (the value of each pixel that will be processed) and `context` (additional data passed through from the user).
The process `absolute` expects an argument with the same name `x`.
The process graph for the example would look as follows:
```
{
  "process_id": "apply",
  "arguments": {
    "data": {"from_node": "loadcollection1"},
    "process": {
      "process_graph": {
        "abs1": {
          "process_id": "absolute",
          "arguments": {
            "x": {"from_parameter": "x"}
          },
          "result": true
        }
      }
    }
  }
}
```
`loadcollection1` would be a result from another process, which is not part of this example.
**Important:** `<ParameterReferenceName>` is less strictly scoped than `<ProcessNodeIdentifier>`.
`<ParameterReferenceName>` can be any parameter from the process graph or any of its parents.
The value for the parameter MUST be resolved as follows:
1. In general the most specific parameter value is used. This means the parameter value is resolved starting from the current scope and then checking each parent for a suitable parameter value until a parameter value is found or the "root" process graph has been reached.
2. In case a parameter value is not available, the most unspecific default value from the process graph parameter definitions is used. For example, if default values are available for the root process graph and all children, the default value from the root process graph is used.
3. If no default values are available either, the error `ProcessParameterMissing` must be thrown.
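The three resolution rules above can be sketched as a lookup over a chain of scopes. The data model used here is an assumption made for illustration: `scopes` is a list ordered from the root process graph (first) to the current one (last), and each scope is a dict with optional `values` and `defaults` mappings. Only the error name `ProcessParameterMissing` comes from the text.

```python
class ProcessParameterMissing(Exception):
    """Raised when neither a value nor a default can be found."""


def resolve_parameter(name: str, scopes: list):
    # Rule 1: the most specific value wins, so search from the current
    # scope outwards to the root.
    for scope in reversed(scopes):
        if name in scope.get("values", {}):
            return scope["values"][name]
    # Rule 2: otherwise use the most unspecific default, so search from
    # the root inwards.
    for scope in scopes:
        if name in scope.get("defaults", {}):
            return scope["defaults"][name]
    # Rule 3: nothing available.
    raise ProcessParameterMissing(name)


scopes = [
    {"values": {"context": 1}, "defaults": {"x": 0}},  # root
    {"values": {"x": 5}, "defaults": {"x": 9}},        # current child
]
print(resolve_parameter("x", scopes), resolve_parameter("context", scopes))
```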
### Full example for an EVI computation
Deriving minimum EVI (Enhanced Vegetation Index) measurements over pixel time series of Sentinel 2 imagery. The main process graph is shown in blue, child process graphs in yellow:
![Graph with processing instructions](assets/pg-evi-example.png)
The process graph for the algorithm: [pg-evi-example.json](assets/pg-evi-example.json)
## Data Processing
Processes can run in three different ways:
1. Results can be pre-computed by creating a ***batch job***. They are submitted to the back-end's processing system, but will remain inactive until explicitly put into the processing queue. They will run only once and store results after execution. Results can be downloaded. Batch jobs are typically time-consuming and user interaction is not possible although log files are generated for them. This is the only mode that allows getting an estimate about time, volume and costs beforehand.
2. A more dynamic way of processing and accessing data is to create a **secondary web service**. They allow web-based access using different protocols such as [OGC WMS](http://www.opengeospatial.org/standards/wms), [OGC WCS](http://www.opengeospatial.org/standards/wcs), [OGC API - Features](https://www.ogc.org/standards/ogcapi-features) or [XYZ tiles](https://wiki.openstreetmap.org/wiki/Slippy_map_tilenames). Some protocols such as the OGC WMS or XYZ tiles allow users to change the viewing extent or level of detail (zoom level). Therefore, computations often run *on demand* so that the requested data is calculated during the request. Back-ends should make sure to cache processed data to avoid additional/high costs and reduce waiting times for the user.
3. Processes can also be executed **on-demand** (i.e. synchronously). Results are delivered with the request itself and no job is created. Only lightweight computations, for example previews, should be executed using this approach as timeouts are to be expected for [long-polling HTTP requests](https://www.pubnub.com/blog/2014-12-01-http-long-polling/).
### Validation
Process graph validation is quite a complex task. There's a [JSON schema](assets/pg-schema.json) for basic process graph validation. It checks the general structure of a process graph, but checking against the schema alone does not fully validate a process graph. Note that this JSON Schema is probably good enough for a first version, but should be revised and improved for production. There are further steps to do:
1. Validate whether there's exactly one `result: true` per process graph.
2. Check whether the process names that are referenced in the field `process_id` are actually available in the corresponding `namespace`.
3. Validate all arguments for each process against the JSON schemas that are specified in the corresponding process specifications.
4. Check whether the values specified for `from_node` have a corresponding node in the same process graph.
5. Validate whether the return value and the arguments requesting a return value with `from_node` are compatible.
6. Check the content of arrays and objects. These could include parameter and result references (`from_node`, `from_parameter` etc.).
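
The structural checks for `result: true` and `from_node` could be sketched as follows in Python. This is an illustration only, assuming the process graph is available as a dictionary that maps node identifiers to node objects with `process_id`, `arguments` and an optional `result` flag. The function names are hypothetical and not part of the API. References nested in arrays and objects are followed, while child process graphs are skipped because they need to be validated on their own:

```python
def find_node_refs(value):
    """Yield every `from_node` reference found in an argument value."""
    if isinstance(value, dict):
        if "from_node" in value:
            yield value["from_node"]
        elif "process_graph" in value:
            # Child process graph: has its own scope, validate it separately.
            return
        else:
            for item in value.values():
                yield from find_node_refs(item)
    elif isinstance(value, list):
        for item in value:
            yield from find_node_refs(item)


def check_structure(process_graph):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    results = [k for k, n in process_graph.items() if n.get("result") is True]
    if len(results) != 1:
        problems.append(f"expected exactly one result node, found {len(results)}")
    for node_id, node in process_graph.items():
        for ref in find_node_refs(node.get("arguments", {})):
            if ref not in process_graph:
                problems.append(f"node '{node_id}' references unknown node '{ref}'")
    return problems
```

The remaining steps (process availability, argument schemas, type compatibility) depend on the process specifications offered by the back-end and are not covered by this sketch.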
### Execution
To process the process graph on the back-end, go through all nodes/processes in the list and record for each node which nodes it passes data to and which nodes it expects data from. In another iteration the back-end can find all start nodes for processing by checking for zero dependencies.
You can now start and execute the start nodes (in parallel, if possible). Results can be passed to the nodes that were identified beforehand. For each node that depends on multiple inputs you need to check whether all dependencies have already finished and only execute once the last dependency is ready.
Please be aware that the result node (`result` set to `true`) is not necessarily the last node that is executed. The author of the process graph may choose to make a non-end node the result node!
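
The scheduling described in this section could be sketched as follows in Python. This is an illustration only, not a reference implementation: `run_process` is a hypothetical callback standing in for the back-end's process implementations, the process graph is assumed to be validated already, only top-level `from_node` references are considered, and nodes are executed sequentially instead of in parallel.

```python
from collections import deque


def execute(process_graph, run_process):
    # First pass: record for each node which nodes it expects data from
    # and which nodes it passes data to.
    expects = {node_id: set() for node_id in process_graph}
    passes_to = {node_id: set() for node_id in process_graph}
    for node_id, node in process_graph.items():
        for value in node.get("arguments", {}).values():
            if isinstance(value, dict) and "from_node" in value:
                expects[node_id].add(value["from_node"])
                passes_to[value["from_node"]].add(node_id)

    # Second pass: start nodes are the nodes with zero dependencies.
    queue = deque(n for n, deps in expects.items() if not deps)
    outputs = {}
    while queue:
        node_id = queue.popleft()
        node = process_graph[node_id]
        # Replace references with the results computed so far.
        arguments = {
            name: outputs[value["from_node"]]
            if isinstance(value, dict) and "from_node" in value
            else value
            for name, value in node.get("arguments", {}).items()
        }
        outputs[node_id] = run_process(node["process_id"], arguments)
        # A dependent node becomes ready once its last dependency has finished.
        for successor in passes_to[node_id]:
            expects[successor].discard(node_id)
            if not expects[successor]:
                queue.append(successor)

    # The result node is not necessarily the node that was executed last.
    result_id = next(k for k, n in process_graph.items() if n.get("result") is True)
    return outputs[result_id]
```

Returning the output of the node flagged with `result` instead of the last executed node reflects the note above about non-end result nodes.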
contact:
name: openEO Consortium
url: 'https://openeo.org'
email: [email protected]
license:
name: Apache 2.0
url: 'http://www.apache.org/licenses/LICENSE-2.0.html'
externalDocs:
description: openEO Documentation
url: 'https://openeo.org/documentation/1.0/'
tags:
- name: Capabilities
description: General information about the API implementation and other supported capabilities at the back-end.
- name: Account Management
description: |-
The following endpoints handle user profiles, accounting and authentication. See also [Authentication](#section/Authentication). In general, the openEO API only defines a minimum subset of user management and accounting functionality. It allows you to
* [authenticate and authorize](http://www.differencebetween.net/technology/difference-between-authentication-and-authorization/) a user, which may include [user registration with OpenID Connect](http://openid.net/specs/openid-connect-registration-1_0.html),
* handle storage space limits (disk quota),
* manage billing, which includes to
* query the credit a user has available,
* estimate costs for certain operations (data processing and downloading),
* get information about produced costs,
* limit costs of certain operations.
Therefore, the API leaves some aspects open that have to be handled by the back-ends separately, including
* credentials recovery, e.g. retrieving a forgotten password
* user data management, e.g. changing the user's payment details or email address
* payments, i.e. topping up credits for pre-paid services or paying for post-paid services
* other accounting related tasks, e.g. creating invoices,
* user registration (except for [user registration with OpenID Connect](http://openid.net/specs/openid-connect-registration-1_0.html)).
- name: EO Data Discovery
description: |-
These endpoints allow listing the collections that are available at the back-end and can be used as data cubes for data processing.
## STAC
For data discovery of Earth Observation Collections at the back-ends, openEO strives for compatibility with the specifications [SpatioTemporal Asset Catalog (STAC)](https://stacspec.org/) and [OGC API - Features - Part 1: Core](http://docs.opengeospatial.org/is/17-069r3/17-069r3.html) as far as possible. Implementing the data discovery endpoints of openEO should also produce valid STAC API 0.9.0 and OGC API - Features 1.0 responses, including ([partial](#provide-data-for-download)) compatibility with their APIs.
The data discovery endpoints `GET /collections` and `GET /collections/{collection_id}` are compatible with OGC API - Features and STAC. Both specifications define additional endpoints that need to be implemented to be fully compatible. The additional endpoints can easily be integrated into an openEO API implementation. A rough list of actions for compatibility is available below, but please refer to their specifications to find out the full details.
**Important:** [STAC specification](https://github.com/radiantearth/stac-spec) and [STAC API](https://github.com/radiantearth/stac-api-spec) are different specifications and have different version numbers after version 0.9.0.
The openEO API only implements [STAC API version 0.9.0](https://github.com/radiantearth/stac-spec/blob/v0.9.0/api-spec/README.md), which allows serving all STAC specification versions in the range of 0.9.x and 1.x.x (see the `stac_version` property).
### Content Extensions
STAC has several [extensions](https://stac-extensions.github.io) that can be used to better describe your data. Clients and servers are not required to implement all of them, so be aware that some clients may not be able to read all your metadata.
Some commonly used extensions that are relevant for datasets exposed through the openEO API are:
- Data Cube extension (part of the openEO API)
- [EO (Electro-Optical) extension](https://github.com/stac-extensions/eo)
- [Processing extension](https://github.com/stac-extensions/processing)
- [Raster extension](https://github.com/stac-extensions/raster)
- [SAR extension](https://github.com/stac-extensions/sar)
- [Satellite extension](https://github.com/stac-extensions/sat)
- [Scientific Citation extension](https://github.com/stac-extensions/scientific)
### Provide data for download
If you'd like to provide your data for download in addition to offering the cloud processing service, you can implement the full STAC API. To do so, implement the endpoints `GET /collections/{collection_id}/items` and `GET /collections/{collection_id}/items/{feature_id}` to support retrieval of individual items. To benefit from the STAC ecosystem and allow searching for items you can also implement `POST /search` and `GET /search`. Further information can be found in the [STAC API repository](https://github.com/radiantearth/stac-spec/tree/v0.9.0/api-spec).
- name: Process Discovery
description: |-
These endpoints allow listing the predefined processes that are available at the back-end. To list user-defined processes see '[User-Defined Processes](#tag/User-Defined-Processes)'.
- name: User-Defined Processes
description: These endpoints allow storing and managing user-defined processes with their process graphs at the back-end.
- name: Data Processing
description: Organizes and manages data processing on the back-end, either as synchronous on-demand computation or batch jobs.
- name: Batch Jobs
description: Management of batch processing tasks (jobs) and their results.
- name: Secondary Services
description: On-demand access to data using other web service protocols.
- name: File Storage
description: Management of user-uploaded assets and processed data.
servers:
- url: 'https://localhost/api/{version}'
description: >-
The URL of the API MAY freely be chosen by the back-end providers. The
path, including API versioning, is a *recommendation* only. Nevertheless,
all servers MUST support HTTPS as the authentication methods are not
secure with HTTP only!
variables:
version:
default: v1
description: |-
API versioning is RECOMMENDED. As the openEO API is following
[SemVer](https://semver.org/) only the **major** part of the version
numbers SHOULD be used for API versioning in the URL. To make clear
that it is a version number, it is RECOMMENDED to add the prefix `v`.
Example: API version `1.2.3` is recommended to use `v1`.
The reason to only consider the major part is that backward-incompatible
changes are introduced by major changes only. All changes from minor
and patch releases can usually be integrated without breakages and thus
a change in the URL is not really needed.
The version number in the URL MUST NOT be used by the clients to detect
the version number of the API. Use the version number returned in the
property `api_version` from `GET /` instead.
paths:
/:
get:
summary: Information about the back-end
operationId: capabilities
description: >-
Lists general information about the back-end, including which version
and endpoints of the openEO API are supported. May also include billing
information.
tags:
- Capabilities
security:
- {}
responses:
'200':
description: |-
Information about the API version and supported endpoints /
features.
This endpoint MUST always be available for the API to be valid.
content:
application/json:
schema:
title: Capabilities
type: object
required:
- id
- title
- description
- api_version
- backend_version
- stac_version
- endpoints
- links
properties:
api_version:
type: string
description: >-
Version number of the openEO specification this back-end
implements.
enum:
- 1.1.0
backend_version:
type: string
description: >-
Version number of the back-end implementation.
Every change on back-end side MUST cause a change of the
version number.
example: 1.1.2
stac_version:
$ref: '#/components/schemas/stac_version'
type:
type: string
enum:
- Catalog
description: >-
For STAC versions >= 1.0.0-rc.1 this field is required.
example: Catalog
id:
type: string
description: >-
Identifier for the service.
This field originates from STAC and is used as unique identifier for the STAC catalog available at `/collections`.
example: cool-eo-cloud
title:
type: string
description: The name of the service.
example: Cool EO Cloud
description:
type: string
format: commonmark
description: >-
A description of the service, which allows the service
provider to introduce the user to its service.
[CommonMark 0.29](http://commonmark.org/) syntax MAY be
used for rich text representation.
example: |-
This service is provided to you by [Cool EO Cloud Corp.](http://cool-eo-cloud-corp.com). It implements the full openEO API and allows to process a range of 999 EO data sets, including
* Sentinel 1/2/3 and 5
* Landsat 7/8
A free plan is available to test the service. For further information please contact our customer service at [[email protected]](mailto:[email protected]).
production:
$ref: '#/components/schemas/production'
endpoints:
type: array
description: >-
Lists all supported endpoints. An endpoint is supported if it
is implemented, returns a 2XX or 3XX HTTP status code and is
fully compatible with the API specification.
An entry for this endpoint (path `/` with method `GET`)
SHOULD NOT be listed.
items:
title: Endpoint
type: object
required:
- path
- methods
properties:
path:
description: >-
Path to the endpoint, relative to the URL of this
endpoint. In general the paths MUST follow the paths
specified in the OpenAPI specification as closely as
possible. Therefore, paths MUST be prepended with a
leading slash, but MUST NOT contain a trailing
slash. Variables in the paths MUST be placed in
curly braces and follow the parameter names in the
OpenAPI specification, e.g. `{job_id}`.
type: string
methods:
description: >-
Supported HTTP verbs in uppercase. It is OPTIONAL to
list `OPTIONS` as method (see the [CORS section](#section/Cross-Origin-Resource-Sharing-(CORS))).
type: array
items:
type: string
enum:
- GET
- POST
- PATCH
- PUT
- DELETE
- OPTIONS
example:
- path: /collections
methods:
- GET
- path: '/collections/{collection_id}'
methods:
- GET
- path: /processes
methods:
- GET
- path: /jobs
methods:
- GET
- POST
- path: '/jobs/{job_id}'
methods:
- GET
- DELETE
- PATCH
billing:
title: Billing
description: >-
Billing related data, e.g. the currency used or available
plans to process jobs.
This property MUST be specified if the back-end uses any
billing related API functionalities, e.g. budgeting or
estimates.
The absence of this property doesn't mean the back-end is
necessarily free to use for all. Providers may choose to
bill users outside of the API, e.g. with a monthly fee
that is not depending on individual API interactions.
type: object
required:
- currency
properties:
currency:
description: >-
The currency the back-end is billing in. The currency
MUST be either a valid currency code as defined in
ISO-4217 or a proprietary currency, e.g. tiles or
back-end specific credits. If set to the default value
`null`, budget and costs are not supported by the
back-end and users can't be charged.
type: string
nullable: true
default: null
example: USD
default_plan:
type: string
description: >-
Name of the default plan to use when the user doesn't
specify a plan and no default plan has been assigned
for the user.
example: free
plans:
description: Array of plans
type: array
items:
title: Billing Plan
type: object
required:
- name
- description
- paid
properties:
name:
type: string
description: >-
Name of the plan. It MUST be accepted in a *case
insensitive* manner throughout the API.
example: free
description:
type: string
format: commonmark
description: >-
A description that gives a rough overview of
the plan.
[CommonMark 0.29](http://commonmark.org/) syntax
MAY be used for rich text representation.
example: Free plan for testing.
paid:
type: boolean
description: >-
Indicates whether the plan is a paid plan
(`true`) or a free plan (`false`).
url:
type: string
description: >-
URL to a web page with more details about the
plan.
format: uri
example: 'http://cool-cloud-corp.com/plans/free-plan'
example:
- name: free
description: >-
Free plan. Calculates one tile per second and a
maximum amount of 100 tiles per hour.
url: 'http://cool-cloud-corp.com/plans/free-plan'
paid: false
- name: premium
description: >-
Premium plan. Calculates unlimited tiles and each
calculated tile costs 0.003 USD.
url: 'http://cool-cloud-corp.com/plans/premium-plan'
paid: true
links:
description: |-
Links related to this service, e.g. the homepage of
the service provider or the terms of service.
It is highly RECOMMENDED to provide links with the
following `rel` (relation) types:
1. `version-history`: A link back to the Well-Known URL
(including `/.well-known/openeo`, see the corresponding endpoint for details)
to allow clients to work on the most recent version.
2. `terms-of-service`: A link to the terms of service. If
a back-end provides a link to the terms of service, the
clients MUST provide a way to read the terms of service
and only connect to the back-end after the user agreed to
them. The user interface MUST be designed in a way that
the terms of service are not agreed to by default, i.e.
the user MUST explicitly agree to them.
3. `privacy-policy`: A link to the privacy policy (GDPR).
If a back-end provides a link to a privacy policy, the
clients MUST provide a way to read the privacy policy and
only connect to the back-end after the user agreed to
it. The user interface MUST be designed in a way that
the privacy policy is not agreed to by default, i.e. the
user MUST explicitly agree to it.
4. `service-desc` or `service-doc`: A link to the API definition.
Use `service-desc` for machine-readable API definition and
`service-doc` for human-readable API definition.
Required if full OGC API compatibility is desired.
5. `conformance`: A link to the Conformance declaration
(see `/conformance`).
Required if full OGC API compatibility is desired.
6. `data`: A link to the collections (see `/collections`).
Required if full OGC API compatibility is desired.
7. `create-form`: A link to a user registration page.
8. `recovery-form`: A link to a page where a user can
recover a user account (e.g. to reset the password or send
a reminder about the username to the user's email account).
For additional relation types see also the lists of
[common relation types in openEO](#section/API-Principles/Web-Linking).
type: array
items:
$ref: '#/components/schemas/link'
example:
- href: 'http://www.cool-cloud-corp.com'
rel: about
type: text/html
title: Homepage of the service provider
- href: 'https://www.cool-cloud-corp.com/tos'
rel: terms-of-service
type: text/html
title: Terms of Service
- href: 'https://www.cool-cloud-corp.com/privacy'
rel: privacy-policy
type: text/html
title: Privacy Policy
- href: 'https://www.cool-cloud-corp.com/register'
rel: create-form
type: text/html
title: User Registration
- href: 'https://www.cool-cloud-corp.com/lost-password'
rel: recovery-form
type: text/html
title: Reset Password
- href: 'http://www.cool-cloud-corp.com/.well-known/openeo'
rel: version-history
type: application/json
title: List of supported openEO versions
- href: 'http://www.cool-cloud-corp.com/api/v1.0/conformance'
rel: conformance
type: application/json
title: OGC Conformance Classes
- href: 'http://www.cool-cloud-corp.com/api/v1.0/collections'
rel: data
type: application/json
title: List of Datasets
4XX:
$ref: '#/components/responses/client_error'
5XX:
$ref: '#/components/responses/server_error'
/.well-known/openeo:
get:
summary: Supported openEO versions
operationId: connect
description: |-
Lists all implemented openEO versions supported by the
service provider. This endpoint is the Well-Known URI
(see [RFC 5785](https://www.rfc-editor.org/rfc/rfc5785.html)) for openEO.
This allows a client to easily identify the most recent openEO
implementation it supports. By default, a client SHOULD connect to the
most recent production-ready version it supports. If not available, the
most recent supported version of *all* versions SHOULD be connected to.
Clients MAY let users choose to connect to versions that are not
production-ready or outdated.
The most recent version is determined by comparing the version numbers
according to rules from [Semantic Versioning](https://semver.org/),
especially [§11](https://semver.org/#spec-item-11). Any pair of API
versions in this list MUST NOT be equal according to Semantic Versioning.
The Well-Known URI is the entry point for clients and users, so make
sure it is permanent and easy to use and remember. Clients MUST NOT
require the well-known path (`/.well-known/openeo`) in the URL that is
specified by a user to connect to the back-end. A client MUST request
`https://example.com/.well-known/openeo` if a user tries to connect to
`https://example.com`. If the request to the well-known URI fails, the
client SHOULD try to request the capabilities at `/` from
`https://example.com`.
**This URI MUST NOT be versioned as the other endpoints.** If your API
is available at `https://example.com/api/v1.0`, the Well-Known URI
SHOULD be located at `https://example.com/.well-known/openeo` and the
URI users connect to SHOULD be `https://example.com`.
Clients MAY get additional information (e.g. title or description) about
a back-end from the most recent version that has the `production` flag
set to `true`.
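
A client could pick the version to connect to roughly as sketched below in Python. This is an illustration only: the helper names and the `supported` callback are hypothetical, a missing `production` flag is treated as `false` (an assumption), and the comparison is a simplified reading of the Semantic Versioning precedence rules that ignores build metadata.

```python
def semver_key(version):
    """Sort key for precedence; pre-releases sort below the release."""
    version = version.split("+", 1)[0]  # build metadata has no precedence
    core, _, pre = version.partition("-")
    major, minor, patch = (int(part) for part in core.split("."))
    # Numeric identifiers rank lower than alphanumeric identifiers.
    identifiers = [
        (0, int(p), "") if p.isdigit() else (1, 0, p) for p in pre.split(".")
    ] if pre else []
    # A version without pre-release identifiers has the higher precedence.
    return (major, minor, patch, 0 if pre else 1, identifiers)


def choose_version(versions, supported):
    """Return the entry to connect to, or None if nothing is supported."""
    candidates = [v for v in versions if supported(v["api_version"])]
    # Prefer production-ready versions, otherwise fall back to all versions.
    production = [v for v in candidates if v.get("production", False)]
    pool = production or candidates
    return max(pool, key=lambda v: semver_key(v["api_version"]), default=None)
```

Here `versions` is the `versions` array from the response and `supported` is a function that tells whether the client can handle a given `api_version`.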
tags:
- Capabilities
security:
- {}
servers:
- url: 'https://localhost'
description: >-
The Well-Known URI SHOULD be available directly at
`https://{{domain}}/.well-known/openeo` in contrast to the other
endpoints, which may be versioned and can run on other hosts, ports,
etc.
responses:
'200':
description: >-
List of all available API instances, each with URL and the
implemented openEO API version.
content:
application/json:
schema:
title: Well Known Discovery
type: object
required:
- versions
properties:
versions:
type: array
items:
title: API Instance
type: object
required:
- url
- api_version
properties:
url:
type: string
format: uri
description: '*Absolute* URL to the service.'
example: 'https://example.com/api/v1.0'
production:
$ref: '#/components/schemas/production'
api_version:
type: string
description: >-
Version number of the openEO specification this
back-end implements.
example:
versions:
- url: 'https://example.openeo.org/api/v0.5'
api_version: 0.5.1
- url: 'https://example.openeo.org/api/v1.0'
api_version: 1.0.0
- url: 'https://example.openeo.org/api/v1.1'
production: false
api_version: 1.1.0-beta
4XX:
$ref: '#/components/responses/client_error'
5XX:
$ref: '#/components/responses/server_error'
/file_formats:
get:
summary: Supported file formats
operationId: list-file-types
description: |-
Lists supported input and output file formats.
*Input* file formats specify which file a back-end can *read* from.
*Output* file formats specify which file a back-end can *write* to.
The response to this request is an object listing all available input
and output file formats separately with their parameters and additional
data. This endpoint does not include the supported secondary web
services.
**Note**: Format names and parameters MUST be fully aligned with the
GDAL codes if available, see [GDAL Raster
Formats](http://www.gdal.org/formats_list.html) and [OGR Vector
Formats](http://www.gdal.org/ogr_formats.html). It is OPTIONAL to
support all output format parameters supported by GDAL. Some file