Merge pull request #135 from fission/scatter

Add control flow / utility functions to the workflow engine
fission · May 17, 2018 · 57a1add
2 parents e9a8d24 + 6275f9c
commit 57a1add

Showing 118 changed files with 5,024 additions and 1,318 deletions.
diff --git a/.travis.yml b/.travis.yml
@@ -50,12 +50,15 @@ before_script:
 - glide install -v
 - build/build-linux.sh
 - test/e2e/travis-setup.sh
+# Checks using the built artifacts
 
 script:
 # Unit and Integration tests
 - test/runtests.sh
 # End-to-end tests
 - NOBUILD=y test/e2e/buildtest.sh
+# Use newest wfcli to verify workflows in example dir
+- hack/verify-workflows.sh
 
 after_script:
 - test/e2e/cleanup.sh

diff --git a/Docs/README.md b/Docs/README.md
@@ -4,4 +4,7 @@
 - [Terminology](./terminology.md) 
 - [System Architecture](./architecture.md)
 - [Installation](../INSTALL.md)
+- [Functions](./functions.md)
+- [Data](./data.md)
 - [Roadmap](./roadmap.md)
+- [Deployment Administration](./admin.md)
diff --git a/Docs/admin.md b/Docs/admin.md
@@ -0,0 +1,55 @@
+# Administering Fission Workflows
+
+Fission Workflows is currently in active development,
+which also means that it is still rough around the edges.
+This section describes approaches to debugging and maintaining a
+Fission Workflows deployment, as well as how to diagnose and fix common issues.
+
+## Inspect workflow invocations
+Use the `wfcli` tool to query and inspect workflow invocations.
+
+## View workflow engine logs
+To view the logs of the workflow engine:
+```bash
+# Find and tail the workflows pod for logs
+kubectl -n fission-function get all
+kubectl -n fission-function logs <workflow-pod> workflow -f
+```
+
+If nothing seems to happen when you invoke workflows, also inspect the
+Fission executor and router logs.
+
+## Unresponsive functions/workflows (Fission < 0.7.0)
+The workflow engine maintains a lookup table to match workflow invocations to workflows.
+In Fission < 0.7.0, there are situations (e.g. after a crash) in which the workflow engine
+loses this lookup table, while Fission assumes that the workflow engine is still able to map
+the Fission function to the workflow.
+As a result, requests to the functions/workflows time out.
+
+To fix this, delete and re-create the functions:
+```bash
+fission fn delete --name <workflow-name>
+fission fn create --name <workflow-name> --env workflow --src <workflow-file>
+```    
+
+## Soft reset 
+If you suspect that the engine is not functioning correctly, you can try restarting it.
+When the pod is restarted, the engine restarts and replays the events to return to the current state.
+
+```bash
+kubectl -n fission-function get po
+kubectl -n fission-function delete po <workflow-pod>
+```
+
+## Complete reset
+There might be cases in which you want to completely clear a Fission Workflows deployment.
+```bash
+# Find and delete the workflows deployment 
+helm list
+helm delete --purge <fission-workflows-name>
+
+# State is maintained in the NATS deployment; to clear it, restart the nats-streaming pod
+kubectl -n fission get po
+kubectl -n fission delete po <nats-streaming-pod>
+```
+
diff --git a/Docs/data.md b/Docs/data.md
@@ -0,0 +1,94 @@
+# Input and Output
+
+This document describes how input and output data are handled in Fission Workflows.
+
+## Data Lifecycle
+
+There are two main places where data is converted between internal and external formats.
+
+On incoming request:
+0. read body
+1. attempt to parse body to TypedValue
+2. check if task/workflow
+    3. add dynamic task to
+
+On executing function:
+1. fetch inputs
+2. resolve expressions in function inputs
+3. format inputs to request (infer Content-Type)
+
+## TypedValue
+
+In contrast to the average workflow engine, Workflows attempts to interpret input and output values.
+There are several use cases for this: 
+- Check if the output value of a task is a task in order to support dynamic tasks.
+- Support and resolve [expressions](./expressions.md) in inputs (and, in the future, outputs) for manipulation
+and selection of data.
+- Ensure that values can be stored and exchanged in a variety of formats (e.g. protobuf and JSON).
+
+Typing is completely optional.
+It is done automatically, based on, among other things, the Content-Type header (for example `application/json`).
+If a value cannot be parsed into a specific type, it is assumed to be a binary 'blob'.
+Although a blob cannot be manipulated or inspected as a more specific data type, you can still reference it and
+pass it around as a whole.
+
+In the future, options will be added to prevent type parsing to optimize performance. 
+
+### Implementation
+
+This interpretation is done using an internal construct called `TypedValue`.
+As the name suggests, it is a representation of the data with some added metadata:
+
+```proto
+message TypedValue {
+    string type = 1;
+    bytes value = 2;
+    map<string, string> labels = 3;
+}
+```
+
+Conceptually, it consists of an arbitrary `type` (see Supported types) and the corresponding byte-level representation
+of the `value`.
+Finally, it contains annotations or `labels` that provide metadata about the value.
+This metadata could include the original Content-Type header that was associated with the value, the source of
+evaluated expressions, and so on.
+
+## Supported types
+
+The supported types can be divided into a small set of primitive types, and more complex types. 
+
+### Primitive Types
+
+The primitive types are somewhat analogous to the types in JavaScript. 
+
+As memory consumption of individual values is not very important in the scope of serverless workflows, no attempt
+is made to model the various kinds of integer and float values.
+
+Type       | GoType                         | Description
+-----------|--------------------------------|--------------------------------------------------
+bytes      | []byte                         | The default type (aka the 'blob' type) for data that is not recognized.
+nil        | nil                            | Empty, but defined value.
+boolean    | bool                           | Boolean value.
+number     | float64                        | A numeric value; integers and floats are both represented as float64.
+string     | string                         | A string value.
+
+### Other Types
+
+Type       | GoType                         | Description
+-----------|--------------------------------|--------------------------------------------------
+expression | string                         | An expression that resolves at runtime to a TypedValue.
+task       | *types.TaskSpec                | A specification of a task (used to implement dynamic tasks)
+workflow   | *types.WorkflowSpec            | A specification of a workflow (used to implement dynamic tasks)
+map        | map[string]interface{}         | Map of key-value pairs.
+list       | []interface{}                  | List of values.
+
+## Values and References
+Currently, the workflow engine stores all data received from and sent to functions.
+Although this helps debuggability and simplicity, with data-intensive functions (for example, functions that output
+video or large images) this wastes storage.
+
+As part of our future work, we are looking into a way to avoid copying and storing all data in the workflow engine's
+storage.
+The challenge here is to balance storage usage against the simplicity of the current execution model.
+One promising solution is to have a middleware component that stores large data and replaces it with
+references to the data instead.
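
The automatic typing described in Docs/data.md above (infer a type from the Content-Type header, fall back to a binary blob) can be illustrated with a minimal Go sketch. This is not the engine's actual code: `parseBody` and `typeOf` are hypothetical helpers, the Go struct only mirrors the `TypedValue` message, JSON is assumed to be the only parsed content type, and storing the header under a `Content-Type` label is an assumption. The `expression`, `task`, and `workflow` types are not detected here.

```go
package main

import (
	"encoding/json"
	"strings"
)

// TypedValue mirrors the protobuf message from Docs/data.md
// (hand-written struct for illustration, not generated code).
type TypedValue struct {
	Type   string
	Value  []byte
	Labels map[string]string
}

// typeOf maps a value decoded by encoding/json to one of the
// documented type names. Hypothetical helper.
func typeOf(v interface{}) string {
	switch v.(type) {
	case nil:
		return "nil"
	case bool:
		return "boolean"
	case float64:
		return "number"
	case string:
		return "string"
	case map[string]interface{}:
		return "map"
	case []interface{}:
		return "list"
	default:
		return "bytes"
	}
}

// parseBody attempts to type a request body. Anything that is not
// valid JSON with a JSON Content-Type stays a 'bytes' blob.
// Hypothetical helper; the label key is an assumption.
func parseBody(contentType string, body []byte) TypedValue {
	tv := TypedValue{
		Type:   "bytes",
		Value:  body,
		Labels: map[string]string{"Content-Type": contentType},
	}
	if strings.HasPrefix(strings.ToLower(contentType), "application/json") {
		var v interface{}
		if err := json.Unmarshal(body, &v); err == nil {
			tv.Type = typeOf(v)
		}
	}
	return tv
}
```

In this sketch a body that fails to parse is never rejected; it keeps the `bytes` type, so it can still be referenced and passed along as a whole, as the document describes.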
diff --git a/Docs/input-expressions.md → Docs/expressions.md b/Docs/input-expressions.md → Docs/expressions.md
@@ -1,4 +1,4 @@
-# Input Expressions
+# Input and Output Expressions
 
 Often there are trivial data transformations between subsequent workflow steps.
 For example, you might need to select a field within a larger object or normalize the input text.
@@ -109,7 +109,7 @@ For example `$.Workflow.Status` is valid, whereas `$.workflow.Status` will error
 Note that in the case of `inputs`, if there is a single input without an explicit key defined, it will be stored 
 under the default key: `default`.
 
-### Built-in Functions
+### Built-in Expression Functions
 Besides the standard library of JavaScript, the expression interpreter provides a couple of additional utility 
 functions.
 These functions do not have access to any additional functionality not provided to the user; they are generally