The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 RFC2119 RFC8174 when, and only when, they appear in all capitals, as shown here.
This document is licensed under The Apache License, Version 2.0.
The Open Cognitive Skills (OCS) specification defines a standard, language-agnostic, and platform-agnostic language that describes how to use containerized, serverless applications to composes and executes code, machine learning models, and data. Skills enables data scientists and developers to easily compose, pack, ship, run and scale any machine learning model and code across any design and deployment environment.
The Open Cognitive Skills specification is the first of a family of specifications grouped under the CAMEL initiative. The CAMEL initiative defines a standard, language-agnostic, and platform-agnostic language that describes the composition and orchestration of skills, data, and machine learning models to define and exeucte a Cognitive Agent that is used to augment human intelligence. CAMEL is an acronym that stands for "Cognitive Agent Modeling and Execution Language" and can be thought of as a language for programming Cognitive Agents. The OCS specification focuses on the Cognitive Skill construct which is the key building block of Cognitive Agents.
A document that defines or describes a OCS resource. Skills are the primary type of OCS resource and conform to the common properties and conventions of all CAMEL resource types. The relevant properties and conventions are repeated here in this specification for simplicity.
A Resource is a first class domain object in CAMEL.
- Resources have unique names (Fully Qualified Names or FQN’s)
- Resources are versioned; each modification to a resource creates a new version
- A reference to a Resource ID, is a reference to a specific version of that resource for a specific owner
- A reference to a Resource name must be resolved to a scope (version and ownership)
- Resources have an owner (a user) and privileges (Read, Write, Admin)
- Privileges can be assigned to users or groups
Resources are named using Fully Qualified Names (FQN’s). A FQN consists of:
- Namespace: String
- Name: String
- Version: Integer
- Format:
<ns>/<name>:<version>
The Namespace is required but defaults to default
.
Example namespace: acme
Version is optional and defaults to the latest version.
Example FQN’s:
- acme/sentiment_analysis
- acme/compliance_checker:47
- acme/is_it_green
A containerized, serverless application that composes and executes code, machine learning models, and data. It enables data scientists and developers to easily compose, pack, ship, run and scale any machine learning model and code across any design and deployment environment.
Provide a means to store and retrieve data. Two-dimensional, potentially heterogeneous tabular data structure with labeled axes (rows and columns). Can also have metadata describing schema, organization w.r.t. an industry taxonomy, licensing terms, access rights, etc.
The term Model refers to the model artifact that is created by a training or configuration process. A trained ML model refers to a Model that has been created by training a learning algorithm with training data. Other Model types include linguistic models which may be learning algorithm based, rules based, or some combination. Business rules or heuristics codified in business logic can also be used to create a Model.
Declaration of a type used as input and/or output to a Skills and Agents. Messages contain a list of ordered fields that each have a specific data type (string, boolean, integer, float, etc.).
Allows the definition of input and output data types in a reusuable object that can be referenced from multiple resources. For example, multiple Skills can share a single Schema for a common input or output parameter list.
The OCS Specification is versioned using Semantic Versioning 2.0.0 (semver) and follows the semver specification.
The major
.minor
portion of the semver (for example 1.0
) SHALL designate the OCS feature set. Typically, .patch
versions address errors in this document, not the feature set. Tooling which supports OCS 1.0 SHOULD be compatible with all OCS 1.0.* versions. The patch version SHOULD NOT be considered by tooling, making no distinction between 1.0.0
and 1.0.1
for example.
Subsequent minor version releases of the OCS Specification (incrementing the minor
version number) SHOULD NOT interfere with tooling developed to a lower minor version and same major version. Thus a hypothetical 1.1.0
specification SHOULD be usable with tooling designed for 1.0.0
.
An OCS document compatible with OCS 1.*.*
contains a required camel
field which designates the semantic version of the OCS specification that it uses.
A OCS document that conforms to the OCS Specification is itself a JSON object, which may be represented either in JSON or YAML format.
For example, if a field has an array value, the JSON array representation will be used:
{
"field": [ 1, 2, 3 ]
}
All field names in the specification are case sensitive.
In order to preserve the ability to round-trip between YAML and JSON formats, YAML version 1.2 is RECOMMENDED along with some additional constraints:
- Tags MUST be limited to those allowed by the JSON Schema ruleset.
- Keys used in YAML maps MUST be limited to a scalar string, as defined by the YAML Failsafe schema ruleset.
Primitive data types in OCS are based on the types supported by the JSON Schema Specification Wright Draft 00. Note that integer
as a type is also supported and is defined as a JSON number without a fraction or exponent part. null
is not supported as a type (see nullable
for an alternative solution). Data types are defined using the Parameters, which is an extended subset of JSON Schema Specification Wright Draft 00.
Primitives have an optional modifier property: format
.
OCS uses several known formats to define in fine detail the data type being used.
However, to support documentation needs, the format
property is an open string
-valued property, and can have any value.
Formats such as "email"
, "uuid"
, and so on, MAY be used even though undefined by this specification.
Types that are not accompanied by a format
property follow the type definition in the JSON Schema. Tools that do not recognize a specific format
MAY default back to the type
alone, as if the format
is not specified.
The formats defined by the OCS specification are:
Common Name | type |
format |
Comments |
---|---|---|---|
integer | integer |
int32 |
signed 32 bits |
long | integer |
int64 |
signed 64 bits |
float | number |
float |
|
double | number |
double |
|
string | string |
||
byte | string |
byte |
base64 encoded characters |
binary | string |
binary |
any sequence of octets |
boolean | boolean |
||
date | string |
date |
As defined by full-date - RFC3339 |
dateTime | string |
date-time |
As defined by date-time - RFC3339 |
object | object |
embedded/nested object | |
array | array |
A valid type | embedded/nested array |
An integer
{
"type": "integer"
}
A double
{
"type": "number",
"format": "double"
}
A base64 encoded binary
{
"type": "string",
"format": "byte"
}
A date-time string
{
"type": "string",
"format": "date-time"
}
Embedded object
{
"type": "object"
}
Embedded array of numbers
{
"type": "array",
"format": "number"
}
Throughout the specification description
fields are noted as supporting CommonMark markdown formatting. Where OCS tooling renders rich text it MUST support, at a minimum, markdown syntax as described by CommonMark 0.27. Tooling MAY choose to ignore some CommonMark features to address security concerns.
In the following description, if a field is not explicitly REQUIRED or described with a MUST or SHALL, it can be considered OPTIONAL.
The following fields are common across many OCS resources and will be referenced throughout the specification.
Field Name | Type | Description |
---|---|---|
camel | string |
REQUIRED. This string MUST be the semantic version number of the OCS Specification version that the OCS document uses. The camel field SHOULD be used by tooling specifications and clients to interpret the OCS document. This is not related to the resource version string. |
name | string |
REQUIRED. The fully qualified name of the resource. |
title | string |
OPTIONAL. The human friendly name given to a resource. |
description | string |
OPTIONAL. A free-text account of a resource. MAY include rich text. MAY be a reference to external documentation using a URL Reference Object. |
tags | [Tag Object] | OPTIONAL. An array of tags used to annotate the resource. |
A OCS document MAY include system fields. Any field that starts with a _
is considered a system field. Some common system fields are described below:
The following objects are common across many OCS resources and will be referenced throughout the specification.
The property defintion object is used to declare a configurable property of a resource.
The property value object is used to set a property.
Field Name | Type | Description |
---|---|---|
name | string |
The property name. |
value | any |
The property value. |
The tag object is used to annotate resources with descriptive labels or categories to enable discovery.
Field Name | Type | Description |
---|---|---|
label | string |
The tag label. |
value | string |
The tag value. |
The parameter object is used to define a list of ordered parameters used in a Message.
Primitive types example: User, Item, Rating
parameters:
- name: uid
title: User ID
type: integer
format: int64
required: true
- name: iid
title: Item ID
type: integer
format: int64
required: true
- name: rating
title: Rating
type: number
format: double
required: true
Embedded objects and arrays example: News Article
parameters:
- name: articleId
title: Article ID
type: string
required: true
- name: headline
title: Headline
type: string
required: true
- name: text
title: Article Text
type: string
required: false
- name: imageLinks
title: Image Links
type: array
format: url
required: false
- name: feedInfo
title: Feed Information
type: object
required: false
Field Name | Type | Description |
---|---|---|
name | string |
REQUIRED. The unique name of the parameter. The name SHOULD be a code friendly identifier. |
title | string |
OPTIONAL. See Resource Title. |
description | string |
OPTIONAL. See Resource Description. |
type | string |
REQUIRED. One of the six valid types allowed by OCS: integer, number, boolean, string, object, array. |
format | string |
OPTIONAL. A descriptive format string like date , email , or double . See Data Types for a discussion of the built-in formats. |
required | boolean |
OPTIONAL. Default false . |
The reference object is used to declare a pointer to a resource such as a Dataset or Schema.
Dataset reference, used in "pass-by-value" scenarios:
{
"payload": {
"$ref": "examples/movies_dataset"
}
}
Schema reference, used in parameter declarations:
parameters:
$ref: examples/MovieInformation
Field Name | Type | Description |
---|---|---|
$ref | string |
REQUIRED. The Resource Name, including version if desired, of the resource this object is pointing to. |
The URL reference object is used to declare a hyperlink to an external resource, such as external documentation, and can be used in description
(see Resource Description) fields or other fields that support resource linking.
External documentation reference
{
"description": {
"$url": "http://example.com/external_markdown_doc.md"
}
}
Field Name | Type | Description |
---|---|---|
$url | string |
REQUIRED. A valid RFC 1738 URL pointing to an external resource that should be used as the value for the containing field. |
This is the root document object of a OCS document that contains a Skill definition.
An exmaple "Hello World" Skill is shown below:
camel: 1.0.0
name: default/hello_world
title: Hello World
description: The classic Hello World example.
properties:
-
name: lang
title: Language
description: The language to say hello in.
required: true
type: Enum
defaultValue: en
validValues:
- en
- es
- it
- de
inputs:
-
name: yourName
title: Your Name
parameters:
- name: name
type: string
description: The name to send
required: true
routing:
all:
action: default/hello_world
runtime: cortex/functions
output: greeting
outputs:
-
name: greeting
title: Greeting
parameters:
- name: message
type: string
description: The greeting message
Field Name | Type | Description |
---|---|---|
camel | string |
REQUIRED. See OCS Specification Version. |
name | string |
REQUIRED. See Resource Name. |
title | string |
REQUIRED. See Resource Title. |
description | string |
OPTIONAL. See Resource Description. |
tags | [Tag Object] | OPTIONAL. See Resource Tags. |
properties | [Property Object] | OPTIONAL. |
inputs | [Skill Input Object] | REQUIRED. An array of Input Objects. At least one Input is required. |
outputs | [Skill Output Object] | OPTIONAL. An array of Output Objects. |
models | [Model Defintion Object] | OPTIONAL. An array of Model Definition Objects. |
datasets | [DataSet Reference Object] | OPTIONAL. An array of Dataset Reference Objects. Each reference declares a dependency on a Dataset that must be mapped at runtime. |
This object defines an input message used by the Skill.
Field Name | Type | Description |
---|---|---|
name | string |
REQUIRED. |
title | string |
REQUIRED. |
parameters | [Parameter Object | Reference Object] | REQUIRED. |
routing | Routing Object | REQUIRED. |
This object defines an output message used by the Skill.
Field Name | Type | Description |
---|---|---|
name | string |
REQUIRED. |
title | string |
REQUIRED. |
parameters | [Parameter Object | Reference Object] | REQUIRED. |
This object defines the routing rules for a Skill input. Skills route Messages received on an Input to an Action for processing and then to an Output. A Skill MUST define at least one routing rule for each Input. Messages can be routed based on properties or Message field values. The simpliest form of routing is the all
routing which routes all Messages received on a given Input to a single Action.
The ALL
routing. Routes all messages to a single action.
routing:
all:
action: example/hello_world
output: greeting
Property based routing
routing:
property: model
default:
action: example/sentiment_python_pattern
output: sentiment
rules:
- match: Stanford Sentiment
action: example/sentiment_stanford
output: sentiment
- match: Microsoft Cognitive Services
action: example/sentiment_microsoft
output: sentiment
- match: IBM Watson
action: example/sentiment_watson
output: sentiment
Field based routing
routing:
field: language
default:
action: example/sentiment_english
output: sentiment
rules:
- match: es
action: example/sentiment_spanish
output: sentiment
- match: de
action: example/sentiment_german
output: sentiment
- match: it
action: example/sentiment_italian
output: sentiment
Field Name | Type | Description |
---|---|---|
action | string |
REQUIRED. The Resource Name of the action to route to. |
runtime | string |
OPTIONAL. The Resource Name of the action runtime to use. The default runtime is assumed if not provided. |
output | string |
REQUIRED. The name of the Output to route to. |
Field Name | Type | Description |
---|---|---|
property | string |
REQUIRED. The name of the property to apply routing rules to. |
default | All Routing | OPTIONAL. The default routing rule used if no property matches are made. |
rules | [Routing Rule Object] | REQUIRED. List of routing rules to apply to the specified property value. |
Field Name | Type | Description |
---|---|---|
field | string |
REQUIRED. The name of the Message field to apply routing rules to. |
default | All Routing | OPTIONAL. The default routing rule used if no property matches are made. |
rules | [Routing Rule Object] | REQUIRED. List of routing rules to apply to the specified field value. |
This object defines a routing rule to apply to a value that comes from either a Skill property or Message field value.
Field Name | Type | Description |
---|---|---|
match | string |
The value to match. |
action | string |
REQUIRED. The Resource Name of the action to route to. |
runtime | string |
OPTIONAL. The Resource Name of the action runtime to use. The default runtime is assumed if not provided. |
output | string |
REQUIRED. The name of the Output to route to. |
This object declares a model that is created by or used by a Skill. A Model is considered to be any machine learning model, statistical model, or other model that is trained, versioned, and deployed by a Skill. A Model has the following declarations:
- Metadata describing the model such as functional goal and algorithm used.
- Model verification records to determine if a model is generating results that are consistant with its design (see PMML Model Verification)
- Model quality metrics based on the type of algorithm, function, and modeling assumptions.
Field Name | Type | Description |
---|---|---|
name | string |
REQUIRED. The resource name of the model. MUST be unique within a Skill definition. |
title | string |
REQUIRED. The human friendly display title of the model. |
description | string |
OPTIONAL. The rich text description of the model. |
tags | [Tag] | OPTIONAL. |
functionName | string |
OPTIONAL. The functional description of the model use. For example, classification or regression . MUST be one of the defined Model Functions. |
algorithmName | string |
OPTIONAL. The name of the underlying algorithm. For example, Linear Regression or LSTM . |
This object defines how a Skill declares a dependency on a dataset that must be mapped to it at runtime.
Field Name | Type | Description |
---|---|---|
name | string |
REQUIRED. The name of the reference. |
title | string |
OPTIONAL. A human friendly display name for the reference. |
description | string |
OPTIONAL. A description or documentation of the reference. |
parameters | [Parameter Object | Reference Object] | OPTIONAL. The fields expected in the dataset. |
requiresWrite | boolean |
OPTIONAL. Set to true to indicate that the mapped dataset must support writes. |
This is the root document object of a OCS document that contains a Dataset definition.
An exmaple Dataset is shown below:
camel: 1.0.0
name: default/movie_info
title: Movie Information
description: Contains basic information about thousands of movies.
fields:
- name: movieId
type: integer
- name: movieTitle
type: string
- name: releaseDate
type: string
- name: imdbUrl
type: string
format: url
- name: category
type: string
connections:
default:
name: cortex/content
type: managedContent
query:
- name: key
value: movielens/ML100K-Movies.csv
- name: contentType
value: CSV
- name: csv/delimiter
value: '|'
environments:
- environment: PROD
name: example/moviesdb
type: postgresql
query:
- name: query
value: "SELECT * FROM movies"
Field Name | Type | Description |
---|---|---|
camel | string |
REQUIRED. See OCS Specification Version. |
name | string |
REQUIRED. See Resource Name. |
title | string |
REQUIRED. See Resource Title. |
description | string |
OPTIONAL. See Resource Description. |
tags | [Tag Object] | OPTIONAL. See Resource Tags. |
fields | [Parameters Object] | REQUIRED. Defines the fields included in this dataset. The dataset MAY include additional fields not defined here. |
connections | Connection Configuration Object | REQUIRED. Data store connection configuration for this dataset. |
This object defines the data source connection configuration for a Dataset. A Dataset MUST have at least one connection per deployed environment or MUST provide a default connection configuration that is used in environments where no environment specific connection is defined.
Field Name | Type | Description |
---|---|---|
default | Connection Reference Object | OPTIONAL. The default connection to use in environments where no environment specific connection configuration exists. |
environments | [Connnection Reference Object] | OPTIONAL. A list of environment specific connections to use for this dataset. Each object in this list MUST use the environment property to qualify which environment it applies to. If more than one connection for an environment is defined, only the first connection in the list will be used. |
This object defines the reference to a data source connection for a specific environment. It includes information about the type of connection (e.g. mongo, s3, etc.) as well as the query parameter configuration.
Field Name | Type | Description |
---|---|---|
name | string |
REQUIRED. The fully qualified name of the connection to use. |
type | string |
REQUIRED. The name of the connection type. |
query | [Property Value]] | OPTIONAL. Defines the default connection query parameters to use when reading from this connection. The definitions of these properties is provided by connection type object. |
environment | string |
OPTIONAL. |
This object defines the message format used for skill-to-skill and agent-to-skill communication.
Field Name | Type | Description |
---|---|---|
payload | Payload Object | REQUIRED. |
parameters | [Parameter Object] | OPTIONAL. |
This object is used in Messages to carry a payload sent for processing. Payloads have four different styles demonstrated in examples below:
- Inline Objects
- Inline Records
- Inline DataFrame
- DataSet Reference (e.g. pass-by-reference)
{
"payload": {"text": "Hello World"}
}
{
"payload":
{
"records": [
{},
{},
{}
]
}
}
{
"payload":
{
"columns": []
"values": []
}
}
{
"payload": {"$ref": "default/MyDataSet"}
}
Field Name | Type | Description |
---|---|---|
records | [any ] |
OPTIONAL. An array of JSON objects representing dataset records. The fields of the these records MUST match the fields defined in the dataset. |
columns | [string ] |
OPTIONAL. The column names that match the records containing in the values array. The column names MUST at least match the field names defined by this dataset. |
values | [any ] |
OPTIONAL. A two dimensional array representing payload records and their values. |
$ref | Reference Object | OPTIONAL. if present, MUST refer to a Dataset using its fully qualified name. |