Open Cognitive Skills Specification

Version 1.0.0

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 (RFC 2119, RFC 8174) when, and only when, they appear in all capitals, as shown here.

This document is licensed under The Apache License, Version 2.0.

Introduction

The Open Cognitive Skills (OCS) specification defines a standard, language-agnostic, and platform-agnostic language that describes how to use containerized, serverless applications to compose and execute code, machine learning models, and data. Skills enable data scientists and developers to easily compose, pack, ship, run, and scale any machine learning model and code across any design and deployment environment.

Relationship to Cognitive Agent Modeling and Execution Language (CAMEL)

The Open Cognitive Skills specification is the first of a family of specifications grouped under the CAMEL initiative. The CAMEL initiative defines a standard, language-agnostic, and platform-agnostic language that describes the composition and orchestration of skills, data, and machine learning models to define and execute a Cognitive Agent that is used to augment human intelligence. CAMEL is an acronym that stands for "Cognitive Agent Modeling and Execution Language" and can be thought of as a language for programming Cognitive Agents. The OCS specification focuses on the Cognitive Skill construct, which is the key building block of Cognitive Agents.

Definitions

Open Cognitive Skill Document

A document that defines or describes an OCS resource. Skills are the primary type of OCS resource and conform to the common properties and conventions of all CAMEL resource types. The relevant properties and conventions are repeated in this specification for simplicity.

Resources

A Resource is a first class domain object in CAMEL.

Resources have unique names (Fully Qualified Names, or FQNs)
Resources are versioned; each modification to a resource creates a new version
A reference to a Resource ID is a reference to a specific version of that resource for a specific owner
A reference to a Resource name must be resolved to a scope (version and ownership)
Resources have an owner (a user) and privileges (Read, Write, Admin)
Privileges can be assigned to users or groups

Resource Names

Resources are named using Fully Qualified Names (FQNs). An FQN consists of:

Namespace: String
Name: String
Version: Integer
Format: <ns>/<name>:<version>

Every FQN has a Namespace; when none is given, it defaults to default. Example namespace: acme

Version is optional and defaults to the latest version.

Example FQNs:

acme/sentiment_analysis
acme/compliance_checker:47
acme/is_it_green
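
As an illustration of the naming rules above, the following is a minimal Python sketch of an FQN parser. The names `ResourceName` and `parse_fqn` are hypothetical and not part of this specification, and representing "latest version" as `None` is an assumption made for the example.

```python
from typing import NamedTuple, Optional


class ResourceName(NamedTuple):
    namespace: str
    name: str
    version: Optional[int]  # None stands for "latest version" (an assumption of this sketch)


def parse_fqn(fqn: str) -> ResourceName:
    """Split '<ns>/<name>:<version>' into its parts, applying the stated defaults."""
    rest, colon, version_text = fqn.partition(":")
    # Version is optional; int() raises ValueError if it is not an integer.
    version = int(version_text) if colon else None
    namespace, slash, name = rest.partition("/")
    if not slash:
        # No namespace given: fall back to the 'default' namespace.
        namespace, name = "default", rest
    if not namespace or not name:
        raise ValueError(f"invalid resource name: {fqn!r}")
    return ResourceName(namespace, name, version)
```

With this sketch, `parse_fqn("acme/compliance_checker:47")` yields the namespace `acme`, the name `compliance_checker`, and the version `47`.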

Skill

A containerized, serverless application that composes and executes code, machine learning models, and data. It enables data scientists and developers to easily compose, pack, ship, run and scale any machine learning model and code across any design and deployment environment.

Dataset

A Dataset provides a means to store and retrieve data. It is a two-dimensional, potentially heterogeneous tabular data structure with labeled axes (rows and columns). A Dataset can also have metadata describing its schema, its organization with respect to an industry taxonomy, licensing terms, access rights, etc.

Model

The term Model refers to the model artifact that is created by a training or configuration process. A trained ML model refers to a Model that has been created by training a learning algorithm with training data. Other Model types include linguistic models which may be learning algorithm based, rules based, or some combination. Business rules or heuristics codified in business logic can also be used to create a Model.

Message

Declaration of a type used as input and/or output to Skills and Agents. Messages contain a list of ordered fields that each have a specific data type (string, boolean, integer, float, etc.).

Schema

Allows the definition of input and output data types in a reusable object that can be referenced from multiple resources. For example, multiple Skills can share a single Schema for a common input or output parameter list.

Specification

Versions

The OCS Specification is versioned using Semantic Versioning 2.0.0 (semver) and follows the semver specification.

The major.minor portion of the semver (for example 1.0) SHALL designate the OCS feature set. Typically, .patch versions address errors in this document, not the feature set. Tooling which supports OCS 1.0 SHOULD be compatible with all OCS 1.0.* versions. The patch version SHOULD NOT be considered by tooling, making no distinction between 1.0.0 and 1.0.1 for example.

Subsequent minor version releases of the OCS Specification (incrementing the minor version number) SHOULD NOT interfere with tooling developed to a lower minor version and same major version. Thus a hypothetical 1.1.0 specification SHOULD be usable with tooling designed for 1.0.0.

An OCS document compatible with OCS 1.*.* contains a required camel field which designates the semantic version of the OCS specification that it uses.
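
The versioning rules above can be sketched in Python as follows. This is a hedged illustration only: the function names are hypothetical, the sketch handles plain `major.minor.patch` strings and ignores semver pre-release and build suffixes, and treating "same major version" as the compatibility test is one reading of the SHOULD-level guidance above.

```python
from typing import Tuple


def parse_semver(version: str) -> Tuple[int, int, int]:
    """Parse a plain 'major.minor.patch' string into three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def feature_set(version: str) -> Tuple[int, int]:
    """Return the major.minor pair that designates the OCS feature set."""
    major, minor, _patch = parse_semver(version)
    return major, minor


def tooling_accepts(document_camel: str, tooling_version: str) -> bool:
    """Assume tooling can use any document that shares its major version."""
    return parse_semver(document_camel)[0] == parse_semver(tooling_version)[0]
```

Under these assumptions, `feature_set("1.0.0")` and `feature_set("1.0.1")` are equal, and a document declaring `camel: 1.1.0` is accepted by tooling written for `1.0.0`.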

Format

An OCS document that conforms to the OCS Specification is itself a JSON object, which may be represented either in JSON or YAML format.

For example, if a field has an array value, the JSON array representation will be used:

{
   "field": [ 1, 2, 3 ]
}

All field names in the specification are case sensitive.

In order to preserve the ability to round-trip between YAML and JSON formats, YAML version 1.2 is RECOMMENDED along with some additional constraints:

Tags MUST be limited to those allowed by the JSON Schema ruleset.
Keys used in YAML maps MUST be limited to a scalar string, as defined by the YAML Failsafe schema ruleset.

Data Types

Primitive data types in OCS are based on the types supported by the JSON Schema Specification Wright Draft 00. Note that integer as a type is also supported and is defined as a JSON number without a fraction or exponent part. null is not supported as a type (see nullable for an alternative solution). Data types are defined using the Parameter Object, which is an extended subset of JSON Schema Specification Wright Draft 00.

Primitives have an optional modifier property: format. OCS uses several known formats to define in fine detail the data type being used. However, to support documentation needs, the format property is an open string-valued property, and can have any value.

Formats such as "email", "uuid", and so on, MAY be used even though undefined by this specification.

Types that are not accompanied by a format property follow the type definition in the JSON Schema. Tools that do not recognize a specific format MAY default back to the type alone, as if the format is not specified.

The formats defined by the OCS specification are:

Common Name	`type`	`format`	Comments
integer	`integer`	`int32`	signed 32 bits
long	`integer`	`int64`	signed 64 bits
float	`number`	`float`
double	`number`	`double`
string	`string`
byte	`string`	`byte`	base64 encoded characters
binary	`string`	`binary`	any sequence of octets
boolean	`boolean`
date	`string`	`date`	As defined by `full-date` - RFC3339
dateTime	`string`	`date-time`	As defined by `date-time` - RFC3339
object	`object`		embedded/nested object
array	`array`	A valid type	embedded/nested array

Examples

An integer

{
  "type": "integer"
}

A double

{
  "type": "number",
  "format": "double"
}

A base64 encoded binary

{
  "type": "string",
  "format": "byte"
}

A date-time string

{
  "type": "string",
  "format": "date-time"
}

Embedded object

{
  "type": "object"
}

Embedded array of numbers

{
  "type": "array",
  "format": "number"
}
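
The fallback rule for unrecognized formats can be illustrated with a small Python lookup. This is a sketch under assumptions: `KNOWN_FORMATS` and `common_name` are hypothetical names, and the table simply restates the type and format pairs listed above.

```python
# (type, format) pairs from the formats table, mapped to their common names.
KNOWN_FORMATS = {
    ("integer", "int32"): "integer",
    ("integer", "int64"): "long",
    ("number", "float"): "float",
    ("number", "double"): "double",
    ("string", "byte"): "byte",
    ("string", "binary"): "binary",
    ("string", "date"): "date",
    ("string", "date-time"): "dateTime",
}


def common_name(type_name, format_name=None):
    """Resolve a type/format pair, falling back to the type alone
    when the format is missing or not recognized."""
    return KNOWN_FORMATS.get((type_name, format_name), type_name)
```

For example, `common_name("string", "email")` falls back to `string`, because `email` is permitted but not defined by this specification.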

Rich Text Formatting

Throughout the specification, description fields are noted as supporting CommonMark markdown formatting. Where OCS tooling renders rich text, it MUST support, at a minimum, markdown syntax as described by CommonMark 0.27. Tooling MAY choose to ignore some CommonMark features to address security concerns.

Schema

In the following description, if a field is not explicitly REQUIRED or described with a MUST or SHALL, it can be considered OPTIONAL.

Common Fields

The following fields are common across many OCS resources and will be referenced throughout the specification.

Field Name	Type	Description
camel	`string`	REQUIRED. This string MUST be the semantic version number of the OCS Specification version that the OCS document uses. The `camel` field SHOULD be used by tooling specifications and clients to interpret the OCS document. This is not related to the resource `version` string.
name	`string`	REQUIRED. The fully qualified name of the resource.
title	`string`	OPTIONAL. The human friendly name given to a resource.
description	`string`	OPTIONAL. A free-text account of a resource. MAY include rich text. MAY be a reference to external documentation using a URL Reference Object.
tags	[Tag Object]	OPTIONAL. An array of tags used to annotate the resource.

System Fields

An OCS document MAY include system fields. Any field that starts with a _ is considered a system field. Some common system fields are described below:

Field Name	Type	Description
_version	`integer`	The resource version number.
_createdAt	`dateTime`	Captures the date and time of resource creation.
_updatedAt	`dateTime`	Captures the date and time of last resource update.
_id	`string`	A unique identifier assigned by the system to identify a resource.
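
Because system fields are identified purely by the leading underscore, tooling can separate them from author-supplied fields with a simple prefix test. The helper below is a hypothetical sketch, not part of this specification.

```python
def split_system_fields(document: dict) -> tuple:
    """Return (user_fields, system_fields) for a parsed OCS document.

    A field counts as a system field when its name starts with '_'.
    """
    system = {key: value for key, value in document.items() if key.startswith("_")}
    user = {key: value for key, value in document.items() if not key.startswith("_")}
    return user, system
```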

Common Objects

The following objects are common across many OCS resources and will be referenced throughout the specification.

Property Definition Object

The property definition object is used to declare a configurable property of a resource.

Fixed Fields

Field Name	Type	Description
name	`string`	REQUIRED. The unique name of this property within the resource. Tools and libraries MUST use the name to uniquely identify the property, therefore, it is RECOMMENDED to follow common programming naming conventions.
title	`string`	REQUIRED.
description	`string`	OPTIONAL.
required	`boolean`	OPTIONAL. Is this property required to be set? Default value is `false`.
type	`string`	REQUIRED. MUST be one of `Enum`, `String`, `Boolean`, `Number`.
defaultValue	`any`	OPTIONAL. The default value for this property.
validValues	[`string`]	REQUIRED when `type` is `Enum`. An array of valid values for use with the `Enum` property type. Ignored for other property types.
qualifiedBy	`string`	OPTIONAL. Scopes this property to another property value by name. For example, a property `qualifiedBy=fileType` with name `json/style` will only be active when the fileType property has value `json`.
secure	`boolean`	OPTIONAL. Should this property be encrypted? Default value is `false`.
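
To show how the fields above might interact, here is a minimal Python sketch that resolves the effective value of one property from a Property Definition Object and a list of Property Value Objects. The function name `resolve_property` is hypothetical, the handling of `qualifiedBy` and `secure` is left out, and the choice to raise `ValueError` on invalid input is an assumption of the example.

```python
def resolve_property(definition: dict, property_values: list):
    """Return the effective value of one property, or raise ValueError."""
    name = definition["name"]
    supplied = {pv["name"]: pv["value"] for pv in property_values}
    # Fall back to defaultValue when the property was not set explicitly.
    value = supplied.get(name, definition.get("defaultValue"))
    if value is None:
        if definition.get("required", False):
            raise ValueError(f"property {name!r} is required")
        return None

    kind = definition["type"]
    if kind == "Enum":
        valid = value in definition.get("validValues", [])
    elif kind == "String":
        valid = isinstance(value, str)
    elif kind == "Boolean":
        valid = isinstance(value, bool)
    elif kind == "Number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        valid = isinstance(value, (int, float)) and not isinstance(value, bool)
    else:
        raise ValueError(f"unknown property type: {kind!r}")

    if not valid:
        raise ValueError(f"invalid value {value!r} for {kind} property {name!r}")
    return value
```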

Property Value Object

The property value object is used to set a property.

Fixed Fields

Field Name	Type	Description
name	`string`	The property name.
value	`any`	The property value.

Tag Object

The tag object is used to annotate resources with descriptive labels or categories to enable discovery.

Fixed Fields

Field Name	Type	Description
label	`string`	The tag label.
value	`string`	The tag value.

Parameter Object

The parameter object is used to define a list of ordered parameters used in a Message.

Examples

Primitive types example: User, Item, Rating

parameters:
  - name: uid
    title: User ID
    type: integer
    format: int64
    required: true
  - name: iid
    title: Item ID
    type: integer
    format: int64
    required: true
  - name: rating
    title: Rating
    type: number
    format: double
    required: true

Embedded objects and arrays example: News Article

parameters:
  - name: articleId
    title: Article ID
    type: string
    required: true
  - name: headline
    title: Headline
    type: string
    required: true
  - name: text
    title: Article Text
    type: string
    required: false
  - name: imageLinks
    title: Image Links
    type: array
    format: url
    required: false
  - name: feedInfo
    title: Feed Information
    type: object
    required: false

Fixed Fields

Field Name	Type	Description
name	`string`	REQUIRED. The unique name of the parameter. The name SHOULD be a code friendly identifier.
title	`string`	OPTIONAL. See Resource Title.
description	`string`	OPTIONAL. See Resource Description.
type	`string`	REQUIRED. One of the six valid types allowed by OCS: integer, number, boolean, string, object, array.
format	`string`	OPTIONAL. A descriptive format string like `date`, `email`, or `double`. See Data Types for a discussion of the built-in formats.
required	`boolean`	OPTIONAL. Default `false`.
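
A parameter list can be checked against incoming data along the following lines. This Python sketch is illustrative only: `check_parameters` is a hypothetical helper, it checks the six types and the `required` flag, and it deliberately ignores `format`, relying on the rule that tools may fall back to the type alone.

```python
# One predicate per OCS type; bool is excluded from the numeric checks
# because Python treats it as a subclass of int.
_TYPE_CHECKS = {
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
    "string": lambda v: isinstance(v, str),
    "object": lambda v: isinstance(v, dict),
    "array": lambda v: isinstance(v, list),
}


def check_parameters(parameters: list, data: dict) -> list:
    """Return a list of problems found; an empty list means the data fits."""
    problems = []
    for param in parameters:
        name = param["name"]
        if name not in data:
            if param.get("required", False):
                problems.append(f"missing required parameter: {name}")
            continue
        if not _TYPE_CHECKS[param["type"]](data[name]):
            problems.append(f"parameter {name} is not of type {param['type']}")
    return problems
```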

Reference Object

The reference object is used to declare a pointer to a resource such as a Dataset or Schema.

Examples

Dataset reference, used in "pass-by-reference" scenarios:

{
    "payload": {
        "$ref": "examples/movies_dataset"
    }
}

Schema reference, used in parameter declarations:

parameters:
  $ref: examples/MovieInformation

Fixed Fields

Field Name	Type	Description
$ref	`string`	REQUIRED. The Resource Name, including version if desired, of the resource this object is pointing to.

URL Reference Object

The URL reference object is used to declare a hyperlink to an external resource, such as external documentation, and can be used in description (see Resource Description) fields or other fields that support resource linking.

Examples

External documentation reference

{
    "description": {
        "$url": "http://example.com/external_markdown_doc.md"
    }
}

Fixed Fields

Field Name	Type	Description
$url	`string`	REQUIRED. A valid RFC 1738 URL pointing to an external resource that should be used as the value for the containing field.

Skill Object

This is the root document object of an OCS document that contains a Skill definition.

Skill Object Example

An example "Hello World" Skill is shown below:

camel: 1.0.0
name: default/hello_world
title: Hello World
description: The classic Hello World example.
properties:
  -
    name: lang
    title: Language
    description: The language to say hello in.
    required: true
    type: Enum
    defaultValue: en
    validValues:
      - en
      - es
      - it
      - de
inputs:
  -
    name: yourName
    title: Your Name
    parameters:
      - name: name
        type: string
        description: The name to send
        required: true
    routing:
      all:
        action: default/hello_world
        runtime: cortex/functions
        output: greeting
outputs:
  -
    name: greeting
    title: Greeting
    parameters:
      - name: message
        type: string
        description: The greeting message

Fixed Fields

Field Name	Type	Description
camel	`string`	REQUIRED. See OCS Specification Version.
name	`string`	REQUIRED. See Resource Name.
title	`string`	REQUIRED. See Resource Title.
description	`string`	OPTIONAL. See Resource Description.
tags	[Tag Object]	OPTIONAL. See Resource Tags.
properties	[Property Definition Object]	OPTIONAL. An array of Property Definition Objects.
inputs	[Skill Input Object]	REQUIRED. An array of Input Objects. At least one Input is required.
outputs	[Skill Output Object]	OPTIONAL. An array of Output Objects.
models	[Model Definition Object]	OPTIONAL. An array of Model Definition Objects.
datasets	[Dataset Reference Object]	OPTIONAL. An array of Dataset Reference Objects. Each reference declares a dependency on a Dataset that must be mapped at runtime.

Skill Input Object

This object defines an input message used by the Skill.

Fixed fields

Field Name	Type	Description
name	`string`	REQUIRED.
title	`string`	REQUIRED.
parameters	[Parameter Object | Reference Object]	REQUIRED.
routing	Routing Object	REQUIRED.

Skill Output Object

This object defines an output message used by the Skill.

Fixed fields

Field Name	Type	Description
name	`string`	REQUIRED.
title	`string`	REQUIRED.
parameters	[Parameter Object | Reference Object]	REQUIRED.

Routing Object

This object defines the routing rules for a Skill input. Skills route Messages received on an Input to an Action for processing and then to an Output. A Skill MUST define at least one routing rule for each Input. Messages can be routed based on properties or Message field values. The simplest form of routing is the all routing, which routes all Messages received on a given Input to a single Action.

Examples

The ALL routing. Routes all messages to a single action.

routing:
  all:
    action: example/hello_world
    output: greeting

Property based routing

routing:
  property: model
  default:
    action: example/sentiment_python_pattern
    output: sentiment
  rules:
    - match: Stanford Sentiment
      action: example/sentiment_stanford
      output: sentiment
    - match: Microsoft Cognitive Services
      action: example/sentiment_microsoft
      output: sentiment
    - match: IBM Watson
      action: example/sentiment_watson
      output: sentiment

Field based routing

routing:
  field: language
  default:
    action: example/sentiment_english
    output: sentiment
  rules:
    - match: es
      action: example/sentiment_spanish
      output: sentiment
    - match: de
      action: example/sentiment_german
      output: sentiment
    - match: it
      action: example/sentiment_italian
      output: sentiment

All Routing Fixed fields

Field Name	Type	Description
action	`string`	REQUIRED. The Resource Name of the action to route to.
runtime	`string`	OPTIONAL. The Resource Name of the action runtime to use. The default runtime is assumed if not provided.
output	`string`	REQUIRED. The name of the Output to route to.

Property Routing Fixed fields

Field Name	Type	Description
property	`string`	REQUIRED. The name of the property to apply routing rules to.
default	All Routing	OPTIONAL. The default routing rule used if no property matches are made.
rules	[Routing Rule Object]	REQUIRED. List of routing rules to apply to the specified property value.

Field Routing Fixed fields

Field Name	Type	Description
field	`string`	REQUIRED. The name of the Message field to apply routing rules to.
default	All Routing	OPTIONAL. The default routing rule used if no field matches are made.
rules	[Routing Rule Object]	REQUIRED. List of routing rules to apply to the specified field value.

Routing Rule Object

This object defines a routing rule to apply to a value that comes from either a Skill property or Message field value.

Fixed fields

Field Name	Type	Description
match	`string`	The value to match.
action	`string`	REQUIRED. The Resource Name of the action to route to.
runtime	`string`	OPTIONAL. The Resource Name of the action runtime to use. The default runtime is assumed if not provided.
output	`string`	REQUIRED. The name of the Output to route to.
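
The three routing forms can be resolved with a single function, sketched here in Python. This is not a normative algorithm: `resolve_route` is a hypothetical name, the Message is assumed to expose its fields as a flat dictionary, matching is assumed to be exact string equality, and "first matching rule wins" is an assumption where this specification is silent.

```python
def resolve_route(routing: dict, properties: dict, message_fields: dict) -> dict:
    """Pick the routing entry (action, optional runtime, output) for one Message."""
    if "all" in routing:
        # ALL routing: every Message goes to the same action and output.
        return routing["all"]

    if "property" in routing:
        value = properties.get(routing["property"])
    elif "field" in routing:
        value = message_fields.get(routing["field"])
    else:
        raise ValueError("routing must define 'all', 'property', or 'field'")

    for rule in routing.get("rules", []):
        if rule["match"] == value:
            return rule  # assumed: the first matching rule wins

    if "default" in routing:
        return routing["default"]
    raise LookupError(f"no routing rule matches {value!r} and no default is defined")
```

Applied to the field based routing example above, a Message whose `language` field is `es` resolves to `example/sentiment_spanish`, while an unmatched language falls through to the default `example/sentiment_english`.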

Model Definition Object

This object declares a model that is created by or used by a Skill. A Model is considered to be any machine learning model, statistical model, or other model that is trained, versioned, and deployed by a Skill. A Model has the following declarations:

Metadata describing the model such as functional goal and algorithm used.
Model verification records to determine if a model is generating results that are consistent with its design (see PMML Model Verification)
Model quality metrics based on the type of algorithm, function, and modeling assumptions.

Fixed fields

Field Name	Type	Description
name	`string`	REQUIRED. The resource name of the model. MUST be unique within a Skill definition.
title	`string`	REQUIRED. The human friendly display title of the model.
description	`string`	OPTIONAL. The rich text description of the model.
tags	[Tag Object]	OPTIONAL.
functionName	`string`	OPTIONAL. The functional description of the model use. For example, `classification` or `regression`. MUST be one of the defined Model Functions.
algorithmName	`string`	OPTIONAL. The name of the underlying algorithm. For example, `Linear Regression` or `LSTM`.

Dataset Reference Object

This object defines how a Skill declares a dependency on a dataset that must be mapped to it at runtime.

Fixed fields

Field Name	Type	Description
name	`string`	REQUIRED. The name of the reference.
title	`string`	OPTIONAL. A human friendly display name for the reference.
description	`string`	OPTIONAL. A description or documentation of the reference.
parameters	[Parameter Object | Reference Object]	OPTIONAL. The fields expected in the dataset.
requiresWrite	`boolean`	OPTIONAL. Set to `true` to indicate that the mapped dataset must support writes.

Dataset

This is the root document object of an OCS document that contains a Dataset definition.

Dataset Object Example

An example Dataset is shown below:

camel: 1.0.0
name: default/movie_info
title: Movie Information
description: Contains basic information about thousands of movies.
fields:
  - name: movieId
    type: integer
  - name: movieTitle
    type: string
  - name: releaseDate
    type: string
  - name: imdbUrl
    type: string
    format: url
  - name: category
    type: string
connections:
  default:
    name: cortex/content
    type: managedContent
    query:
      - name: key
        value: movielens/ML100K-Movies.csv
      - name: contentType
        value: CSV
      - name: csv/delimiter
        value: '|'
  environments: 
    - environment: PROD
      name: example/moviesdb
      type: postgresql
      query:
        - name: query
          value: "SELECT * FROM movies"

Fixed Fields

Field Name	Type	Description
camel	`string`	REQUIRED. See OCS Specification Version.
name	`string`	REQUIRED. See Resource Name.
title	`string`	REQUIRED. See Resource Title.
description	`string`	OPTIONAL. See Resource Description.
tags	[Tag Object]	OPTIONAL. See Resource Tags.
fields	[Parameter Object]	REQUIRED. Defines the fields included in this dataset. The dataset MAY include additional fields not defined here.
connections	Connection Configuration Object	REQUIRED. Data store connection configuration for this dataset.

Connection Configuration Object

This object defines the data source connection configuration for a Dataset. A Dataset MUST have at least one connection per deployed environment or MUST provide a default connection configuration that is used in environments where no environment specific connection is defined.

Fixed fields

Field Name	Type	Description
default	Connection Reference Object	OPTIONAL. The default connection to use in environments where no environment specific connection configuration exists.
environments	[Connection Reference Object]	OPTIONAL. A list of environment specific connections to use for this dataset. Each object in this list MUST use the `environment` property to qualify which environment it applies to. If more than one connection for an environment is defined, only the first connection in the list will be used.

Connection Reference Object

This object defines the reference to a data source connection for a specific environment. It includes information about the type of connection (e.g. mongo, s3, etc.) as well as the query parameter configuration.

Fixed fields

Field Name	Type	Description
name	`string`	REQUIRED. The fully qualified name of the connection to use.
type	`string`	REQUIRED. The name of the connection type.
query	[Property Value Object]	OPTIONAL. Defines the default connection query parameters to use when reading from this connection. The definitions of these properties are provided by the connection type object.
environment	`string`	OPTIONAL.
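
The selection rules for connections (first environment specific match, otherwise the default) can be sketched in Python as below. `select_connection` is a hypothetical helper, and raising `LookupError` when neither applies is an assumption of this example.

```python
def select_connection(connections: dict, environment: str) -> dict:
    """Choose the Connection Reference Object to use in a given environment."""
    for candidate in connections.get("environments", []):
        if candidate.get("environment") == environment:
            # Only the first connection listed for an environment is used.
            return candidate
    if "default" in connections:
        return connections["default"]
    raise LookupError(f"no connection configured for environment {environment!r}")
```

With the Dataset example above, `PROD` selects the `postgresql` connection and any other environment selects the `managedContent` default.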

Message

This object defines the message format used for skill-to-skill and agent-to-skill communication.

Fixed fields

Field Name	Type	Description
payload	Payload Object	REQUIRED.
parameters	[Parameter Object]	OPTIONAL.

Payload Object

This object is used in Messages to carry a payload sent for processing. Payloads have four different styles, demonstrated in the examples below:

Inline Objects
Inline Records
Inline DataFrame
Dataset Reference (i.e. pass-by-reference)

Examples

Payload with Inline Object

{
  "payload": {"text": "Hello World"}
}

Payload with Inline Records

{
  "payload": 
    {
      "records": [
        {},
        {},
        {}
      ]
    }
}

Payload with Inline DataFrame

{
  "payload":
    {
      "columns": [],
      "values": []
    }
}

Payload with DataSet Reference

{
  "payload": {"$ref": "default/MyDataSet"}
}

Fixed fields

Field Name	Type	Description
records	[`any`]	OPTIONAL. An array of JSON objects representing dataset records. The fields of these records MUST match the fields defined in the dataset.
columns	[`string`]	OPTIONAL. The column names that match the records contained in the `values` array. The column names MUST at least match the field names defined by this dataset.
values	[`any`]	OPTIONAL. A two dimensional array representing payload records and their values.
$ref	Reference Object	OPTIONAL. If present, MUST refer to a Dataset using its fully qualified name.
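
A consumer can tell the four payload styles apart by the fields that are present. The Python sketch below is one possible approach: the function names and the style labels are hypothetical, the order of the checks is an assumption, and `values` is assumed to hold one inner array per record, in the same order as `columns`.

```python
def payload_style(payload: dict) -> str:
    """Classify a Payload Object by the fixed fields it carries."""
    if "$ref" in payload:
        return "dataset-reference"
    if "records" in payload:
        return "inline-records"
    if "columns" in payload and "values" in payload:
        return "inline-dataframe"
    # Anything else is treated as a plain inline object.
    return "inline-object"


def dataframe_to_records(payload: dict) -> list:
    """Convert an inline DataFrame payload into a list of record objects."""
    return [dict(zip(payload["columns"], row)) for row in payload["values"]]
```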
