This document answers common questions about how to use the OCSF Schema.
Depending on the type of event, a data producer or data mapper should first determine which event class best suits the event. Start with the OCSF category to narrow down the choices. For example, an endpoint security product would likely choose an event class from the System Activity category, such as File System Activity for an AV product. Every event class has an activity_id enumeration which narrows down the intended activity of the event. Sometimes these are simple CRUD activities, but often they are more specific to the class, such as Logon for the Authentication class in the Identity and Access Management category.
Since endpoint security products typically send alert events when malware is detected, the producer or mapper would apply the Security Control profile, which adds important attributes to the File System Activity event class, e.g. a Malware object, a MITRE ATT&CK object, the disposition etc. These profiles have their own attributes that must be populated.
If your endpoint security product also has network security capabilities, you would choose an event class from the Network Activity category, for example the general Network Activity event class. Given that the endpoint product will have information about the host system, you would apply the Host profile, as well as the Security Control profile. The Host profile includes attributes about the device and the actor (e.g. process or user) on the host.
Every OCSF event must have all of its event class Required attributes populated, and should have its Recommended attributes populated, if possible. This includes any of the embedded objects, such as the Malware, Process and Device objects above.
All OCSF events have a set of required classification attributes from the Base Event class: the class_uid, the category_uid, the activity_id, and the derived type_uid. Their associated *_name attributes are optional.
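The derivation of type_uid can be sketched as follows. This is a minimal illustration, assuming the commonly documented formula class_uid * 100 + activity_id. The identifiers used (1 for System Activity, 1001 for File System Activity, activity 1 for Create) are assumed for the example and should be checked against the schema version in use.

```python
def derive_type_uid(class_uid: int, activity_id: int) -> int:
    """Combine the class and activity identifiers into a type_uid.

    Assumes the formula type_uid = class_uid * 100 + activity_id.
    """
    return class_uid * 100 + activity_id


# Illustrative values, assumed for this sketch.
classification = {
    "category_uid": 1,   # System Activity
    "class_uid": 1001,   # File System Activity
    "activity_id": 1,    # Create
}
classification["type_uid"] = derive_type_uid(
    classification["class_uid"], classification["activity_id"]
)
```

Because type_uid is derived, a mapper only needs to choose the class and activity; the remaining classification value follows from them.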
In addition to the classification attributes, a number of other Base Event class attributes are required and must be populated: the time, metadata, and severity attributes. The metadata attribute is an object that itself requires the product and associated version of the reporting event, as well as the version of the OCSF schema adhered to with the event.
Note that the product should be the originating event producer (i.e. not the mapping system, nor any intermediary event processing systems) in order to best represent the origin of the event. The time should be the time that the event actually occurred, assuming that information is known, or the earliest possible time available to the event producer or mapper.
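A quick way to catch missing required base attributes is a small check like the one below. It is only a sketch: the key severity_id is an assumption (the text above calls the attribute severity), the product shown is hypothetical, and the time is assumed to be expressed in milliseconds since the epoch.

```python
import time

# Attribute names follow the text above; "severity_id" is an assumption
# (the text calls the attribute "severity").
REQUIRED_BASE = ("class_uid", "category_uid", "activity_id", "type_uid",
                 "time", "metadata", "severity_id")
REQUIRED_METADATA = ("product", "version")


def missing_required(event: dict) -> list:
    """Return the dotted names of required base attributes that are absent."""
    missing = [name for name in REQUIRED_BASE if name not in event]
    metadata = event.get("metadata", {})
    missing += [f"metadata.{name}" for name in REQUIRED_METADATA
                if name not in metadata]
    return missing


event = {
    "class_uid": 1001,
    "category_uid": 1,
    "activity_id": 1,
    "type_uid": 100101,
    "time": int(time.time() * 1000),  # assumed: milliseconds since epoch
    "severity_id": 1,
    "metadata": {
        # The originating producer, not the mapper (hypothetical product).
        "product": {"name": "Example AV", "version": "4.2.0"},
        "version": "1.1.0",  # OCSF schema version the event adheres to
    },
}
```

Calling missing_required(event) on the example returns an empty list; an event without metadata.product would report that path instead.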
Although the observables array attribute is optional, populating it can make things easier for event consumers and analysts. Each Observable object surfaces an important attribute of the event in a common location in a simple tuple: name, value, type. For example, if the event class has a device, user, and process populated, an array of three Observable objects will refer to them in a location common to all OCSF events.
There are three important attributes of the Observable object, and the Base Event class allows for an array of these objects with the observables attribute: name, type_id, and value. The first two are required attributes, while value is optional. Why it is optional will become clear soon. There can be multiple observables within an event, even of the same type. This is why observables is an array.
The required name attribute of the Observable object should be the fully qualified attribute name within the event, e.g. fingerprint.value, actor.process.file, or actor.process.file.name. In other words, observable.name is the locator of that observable within the instance of the event. Note that the observable attribute can be a scalar, like device.ip, or it can be an object, like actor.process.file.
When the type_id of the observable indicates that the attribute located by the observable's name is of object type, e.g. Fingerprint, the observable's value attribute is not populated. When the type_id indicates that the located attribute is a scalar, e.g. File Hash or File Name, then the observable's value should be populated with the value of that attribute, that is, a copy of the value from the event.
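The rule above (copy the value for scalars, omit it for objects) can be sketched as follows. The helper names are hypothetical, and the type_id numbers are assumed for illustration; consult the Observable object definition for the authoritative enum values.

```python
def resolve(event: dict, path: str):
    """Follow a dotted attribute path such as "actor.process.file.name"."""
    node = event
    for key in path.split("."):
        node = node[key]
    return node


def make_observable(event: dict, name: str, type_id: int) -> dict:
    """Build an Observable; copy the value only when the target is a scalar."""
    observable = {"name": name, "type_id": type_id}
    target = resolve(event, name)
    if not isinstance(target, (dict, list)):
        observable["value"] = str(target)
    return observable


event = {
    "device": {"ip": "10.0.0.5"},
    "actor": {"process": {"file": {"name": "notepad.exe"}}},
}

# type_id values below are assumed for illustration; consult the schema.
IP_ADDRESS, FILE_NAME, FILE = 2, 7, 24

event["observables"] = [
    make_observable(event, "device.ip", IP_ADDRESS),
    make_observable(event, "actor.process.file.name", FILE_NAME),
    make_observable(event, "actor.process.file", FILE),
]
```

The first two observables carry a copied value, while the third only locates the File object through its name.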
A Finding in OCSF represents the result of some type of enrichment, correlation, aggregation, analysis or other processing of one or more events or alerts, producing a derived insight. Most security events and alerts are activity events with a disposition (e.g. Blocked), for example when using the Security Control profile. Findings in OCSF are not always alerts themselves, although alerts may be triggered by findings or findings might be added to an incident further downstream.
For example, an email security product may determine that a user has been phished or an email attachment is malicious. It would send an email activity event (from its standpoint an alert) containing the user and sender, supplemented by the Security Control profile with a disposition of Blocked, and information about the Malware, to its management console which in turn sends it to a SIEM.
The SIEM might receive other related events or alerts, for example for other users in the same circumstance or for general email activity from the same sender. The SIEM might enrich the events with information from a Threat Intelligence Platform or threat feed pertaining to the email sender. The result of the aggregation and enrichment would constitute an OCSF Finding. The SIEM might create an incident that includes or refers to the finding if remediation steps are required.
Note that in a more complex processing architecture, there may be layered findings. That is, the original event may go to product A which eventually triggers a finding. Product B meanwhile may take in a lot of other events and findings (including those from product A) and make its own findings. In the example above, the originating email alert might have been a finding from the producer's standpoint if the event was enriched by its management system before being collected by the SIEM, which then produced a more complete finding.
When an event producer or mapper emits multiple events that have some grouping characteristic, or similarity of any form, it should populate the metadata.correlation_uid attribute with a constant identifier. This allows consumers and analysts of the set of events to more easily aggregate and correlate the events.
A simple example would be a vulnerability scanner that emits events at the start of a scan of a system, at the end of the scan, and separate events for each vulnerability discovered. If these are separate events, they would all have their metadata.correlation_uid set to the same value.
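The scanner example might look like the sketch below. The event shape is deliberately reduced to the attributes relevant to correlation, the function name is hypothetical, and the CVE identifiers are placeholders rather than real entries.

```python
import uuid


def scan_events(host: str, findings: list) -> list:
    """Emit start, per-vulnerability, and end events sharing one correlation_uid."""
    correlation_uid = str(uuid.uuid4())  # constant for the whole scan

    def event(message: str) -> dict:
        # Only the attributes relevant to correlation are shown.
        return {
            "message": message,
            "device": {"hostname": host},
            "metadata": {"correlation_uid": correlation_uid},
        }

    events = [event("scan started")]
    events += [event(f"vulnerability found: {cve}") for cve in findings]
    events.append(event("scan finished"))
    return events


events = scan_events("web-01", ["CVE-0000-0001", "CVE-0000-0002"])
```

A consumer can then group the four events by metadata.correlation_uid without needing to know anything else about the scanner.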
It is possible for an intermediary system to determine the grouping characteristic as well, populating the attribute after collection of the events, although when OCSF events are immutable a copy of the original events would be made with added correlation information. See the next question.
Yes, they are also events with a base class metadata object that can follow the same pattern. E.g. a SIEM that creates findings may have enough knowledge and state to tie multiple findings together with a metadata.correlation_uid.
The Actor object is intended for use in event classes when there is knowledge of an entity that is initiating or causing some action on another entity. For example, if one process spawns another process, or deletes a file, the first process is the actor in the activity.
From a structural standpoint, the actor attribute avoids name collisions with the other end of the activity in cases where a process acts on another process, as those attribute names would be in contention at the same level within the class.
Currently the Actor object has a process and a user attribute, where one or the other is in the role of the actor in the activity. It also has Optional attributes for session, authorizations, idp, and invoked_by.
The idp is populated in IAM category event classes, when the actor's identity provider is known and logged with Authentication and related events.
The authorizations attribute is an array of information pertaining to what permissions and privileges the actor has at the time of the event, if known.
The invoked_by attribute is populated with the name of the service or application through which the actor's activity was initiated.
The session attribute is usually paired with the user attribute. A Session object has information pertaining to a particular user session through which the activity was initiated. User is an entity object that isn't always associated with a session, and isn't always an actor, hence Session isn't part of the User object, but is included with the Actor object for actor semantics.
Related to this, the process attribute of type Process has a User object which represents the user's account that the process runs under or is impersonating. Hence, the Process object also has a session attribute paired with its user attribute.
User and Session objects are paired in many event classes.
The unmapped attribute is a catchall for event producers and mappers when there is data that doesn't populate the more specific attributes of the class. For example, product specific data that is extracted into fields and values from a log but isn't mapped.
The unmapped attribute is best used by a mapper who is mapping events from multiple vendors, where each vendor may have unique fields not common to other vendors for the same type of data source.
However, using unmapped is not recommended for event producers. A native event producer should extend the schema to properly capture the data that can't be mapped. For product specific data, an extension is preferred, using either a vendor developed profile, or in some cases a new event class if the core event class doesn't adequately represent the event due to data that can't be naturally mapped, or activities not captured by the core class.
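For the mapper case, one way to picture this is a mapping table plus a fallback into unmapped. The sketch below assumes a hypothetical vendor record and a hypothetical FIELD_MAP; the target attribute paths are chosen for illustration only.

```python
# Hypothetical mapping from one vendor's field names to OCSF attribute paths.
FIELD_MAP = {
    "src": "src_endpoint.ip",
    "dst": "dst_endpoint.ip",
    "user": "actor.user.name",
}


def set_path(event: dict, path: str, value) -> None:
    """Assign value at a dotted path, creating nested objects as needed."""
    *parents, leaf = path.split(".")
    node = event
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def map_record(record: dict) -> dict:
    """Map known fields; keep everything else under "unmapped"."""
    event = {}
    for field, value in record.items():
        if field in FIELD_MAP:
            set_path(event, FIELD_MAP[field], value)
        else:
            event.setdefault("unmapped", {})[field] = value
    return event


mapped = map_record({"src": "10.0.0.5", "dst": "10.0.0.9",
                     "user": "alice", "vendor_risk_score": 87})
```

Here vendor_risk_score has no home in the mapping table, so it survives under unmapped instead of being dropped.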
Object is the empty complex data type from which all OCSF objects extend with JSON formatted attributes, requirements, and descriptions. Think of unmapped as if it were an OCSF object that you created on the fly. In the Java programming language, it would be like an inner class that doesn't need to be declared externally or globally. That is to say, it is used within the instance of an OCSF class only, and is not part of the schema.
JSON is more free form data, hence the data attribute is of type JSON. It can be anything encapsulated within JSON and does not need to look like an OCSF object. It should not be used for unmapped extracted fields, but rather for other data that may be captured with the event. It is used, for example, within the Enrichment object (the enrichments array attribute of the Base Event) to augment one or more of the mapped or unmapped attributes.
A String type is reserved for unformatted text, such as the raw_data attribute of the Base Event class. Binary data is Base64 encoded in an attribute of bytestring_t type, which is currently not used in the core schema but may be used in extensions or within the unmapped object.
When should I use Authorize Session from Identity and Access Management vs. Web Resource Access Activity from the Application category?
These two event classes are complementary. Changes to a security principal's permissions, privileges, or roles are authorization activities, while the access of web resources by a security principal is logged as Web Resource Access Activity. IAM category authorization or change events are independent of a particular resource access, while enforcement of authorization restrictions is made at access time and is logged as such. For example, when a new Logon session is created, authorization checks are made and, if logged, belong in the Authorize Session class. However, when the user or process that has those permissions accesses a web resource, and access is granted or denied, the Web Resource Access Activity class is used.
HTTP Activity is information focused on the network protocol, and not the gating of the resource. While access to a resource is often requested via a web service or REST APIs, the HTTP Activity is the protocol activity for that access, not the activity of the gating service to the resource, which might be via the HTTP server nevertheless. And of course access activity in general is not uniquely via HTTP: Kerberos and LDAP servers grant and deny access to resources over their respective protocols.
Profiles in OCSF are a way to uniformly add a set of attributes to one or more event classes or objects. Event classes provide the basic structure and type of an event, while objects provide the structure of complex types. Their definitions can indicate that additional attributes may be included with an event instance via profiles specified with the class or object definition. In effect, adding a profile or profiles to the definition gives you the permission to dynamically include those attributes. When constructing an event, you would add an OCSF profile name to the metadata.profiles array to mix in the additional attributes with the event.
An event that has that profile applied is then a kind of that profile, as well as a kind of the event class. For example, if the Host profile was applied to the HTTP Activity class to add the actor.process making a request, the event would be queryable either via metadata.profiles[] as Host or via class_name as HTTP Activity. If querying by Host, other events from System Activity with the same actor could also be returned.
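The two query paths can be sketched over an in-memory list of events. The profile name strings ("host", "security_control") and the helper functions are assumptions made for the example; a real consumer would issue the equivalent query in its own data store.

```python
events = [
    {"class_name": "HTTP Activity",
     "metadata": {"profiles": ["host"]},
     "actor": {"process": {"name": "curl"}}},
    {"class_name": "HTTP Activity",
     "metadata": {"profiles": []}},
    {"class_name": "File System Activity",
     "metadata": {"profiles": ["host", "security_control"]},
     "actor": {"process": {"name": "curl"}}},
]


def with_profile(events: list, profile: str) -> list:
    """Select events that have the given profile applied."""
    return [e for e in events
            if profile in e.get("metadata", {}).get("profiles", [])]


def with_class(events: list, class_name: str) -> list:
    """Select events of the given event class."""
    return [e for e in events if e.get("class_name") == class_name]


host_events = with_profile(events, "host")          # spans both classes
http_events = with_class(events, "HTTP Activity")   # with or without the profile
```

Querying by profile cuts across event classes, which is what lets the same actor be followed from HTTP Activity into System Activity events.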
Not all of the attributes from the profile need be added together. For example, a profile with attributes A, B, C can be defined within the definition of class D and object E. Class D can include A and B, while object E can include attribute C. You can also build in a profile, by adding the attributes of the profile directly into your class, and referencing the profile in your class definition. In this case, as with class and object extensions, the profile-defined requirement, group, or description can be overridden within the definition of the class or object, although this is not recommended. Only the attribute data type and constraints cannot be overridden.
Yes there is, although OCSF is considerably simpler. At a fundamental level LDAP consists of attributes and object classes, while OCSF consists of attributes and event classes. Attributes in LDAP have syntaxes and in OCSF have data types (OCSF objects are complex data types). An event class is similar to an LDAP structural object class; it defines the basic structure of an event, as the LDAP object class defines the structure of an entry. Like LDAP, an OCSF event class can be constructed via extending a super class to inherit attributes. And an OCSF profile is similar to an LDAP auxiliary class which can be applied to a structural object class so that an entry can mix in additional attributes, independent of structural hierarchy of the entry.
These are naming conventions rather than metaschema or data type validation factors. The _id suffix is the convention for OCSF enumerated attributes. These attributes can be integer data types, or string data types, although OCSF favors integer data types with string labels. Every integer enum attribute SHOULD have standard values of 0 for Unknown and 99 for Other. There is no requirement that the integers stay within those bounds, or that they increment by 1. Every enum attribute SHOULD have a string sibling attribute of the same name but without the _id suffix. A sibling is declared within the attribute definition of the enum attribute. When the logged value is not mappable within the enum listed values, Other can be set and a source specific label can populate the sibling attribute. The exception to this convention is when an enum attribute mirrors an external standard, for example with the dns_query object's opcode_id which mirrors the values requested from a resolver. It is recommended that the sibling attribute is populated with the enum label so that human queries can be made against a more easily remembered string, rather than a number.
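The enum and sibling convention can be sketched with status_id and status. The label table and the source strings below are illustrative assumptions; only the 0 for Unknown and 99 for Other values come from the convention described above.

```python
# Illustrative enum for status_id; 0 and 99 follow the convention above.
STATUS_LABELS = {0: "Unknown", 1: "Success", 2: "Failure", 99: "Other"}
# Hypothetical source-specific strings and the enum values they map to.
SOURCE_TO_STATUS_ID = {"ok": 1, "success": 1, "failed": 2, "denied": 2}


def map_status(source_value: str) -> dict:
    """Return the status_id enum value and its status string sibling."""
    status_id = SOURCE_TO_STATUS_ID.get(source_value.lower())
    if status_id is None:
        # Not mappable: use Other and keep the source label in the sibling.
        return {"status_id": 99, "status": source_value}
    return {"status_id": status_id, "status": STATUS_LABELS[status_id]}
```

For a mappable value the sibling carries the enum label, so queries can use either the number or the string; for an unmappable value the sibling preserves what the source actually logged.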
_uid suffix attributes are for unique identifier values within the schema, or external identifier values, e.g. coming from a public cloud resource or similar entity. For this reason _uid suffix attributes are usually strings, in order to accommodate any type of alphanumeric format, but they MAY be integers (or longs). Within OCSF Classification attributes, _uid attributes are integers or longs (see class_uid or type_uid). The sibling for _uid attributes is an attribute of the same base name with the _name suffix (see class_name or type_name). The exception for Classification attributes is activity_id, which is an enum rather than a singular identifier. However, its sibling also has the _name suffix, activity_name, following the convention for _uid attributes of the Classification group.
Note that sibling string attributes can be used standalone, i.e. without an associated enum or unique identifier.
OCSF follows the semver versioning scheme.
From the semver documentation:
Given a version number MAJOR.MINOR.PATCH, increment the:
- MAJOR version when you make incompatible API changes
- MINOR version when you add functionality in a backward compatible manner
- PATCH version when you make backward compatible bug fixes
In terms practical to OCSF users, this means:
- PATCH version increments may change documentation values like description.
- MINOR version increments may add new schemata like attributes or events.
- MAJOR version increments may add and remove anything.
So any version in the 1.x line should be backwards-compatible with previous 1.x versions.
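As a rough sketch of what this means for a consumer, the check below treats an event as readable when it shares the consumer's MAJOR version and was not written against a newer schema. The function names are hypothetical, and any pre-release suffix such as "-dev" is simply ignored rather than ordered according to the full semver precedence rules.

```python
def parse_version(version: str) -> tuple:
    """Split "MAJOR.MINOR.PATCH" into integers, ignoring any "-suffix"."""
    core = version.split("-", 1)[0]
    major, minor, patch = (int(part) for part in core.split("."))
    return major, minor, patch


def can_read(consumer_version: str, event_version: str) -> bool:
    """True when a consumer built for one schema version can read an event
    written against another: same MAJOR, and the event is not newer."""
    consumer, event = parse_version(consumer_version), parse_version(event_version)
    return consumer[0] == event[0] and event <= consumer
```

Under these assumptions a consumer built against 1.3.0 accepts events written for 1.0.0, while the reverse direction is not guaranteed because newer MINOR versions may add attributes or events.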
- The removal of an event, object, attribute, data type, or enum member.
  - The name of an event or object is missing in the NEW schema.
  - The dictionary key of an attribute or data type (its implied name) is missing in the NEW schema.
  - The dictionary key of an enum member (its value) is missing in the NEW schema.
  - The uid or class_uid of an event is missing in the NEW schema.
- Renaming an event, object, attribute, or enum member.
  - A special case of removal in which the same caption belongs to an element with a different name, key, or class_uid; or the same class_uid belongs to an event with a different name.
- Changing the data type of an attribute unless the data type is changing from int to long. This exception is allowed on the basis that nearly all encodings use variable lengths by default, meaning that data written in nearly all encodings as an int can be safely interpreted as a long.
- Changing the requirement value of an attribute from optional or recommended to required.
- Making the constraints of a data type more restrictive.
- Adding a required attribute to an existing event or object.
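Several of the rules in this list can be checked mechanically by comparing the attribute dictionaries of an OLD and a NEW definition. The sketch below assumes a simplified dictionary shape (attribute key mapped to a type and a requirement) that is not the schema's actual file format, and it covers only removal, type changes, and requirement changes; renames and constraint changes are left out.

```python
def breaking_changes(old: dict, new: dict) -> list:
    """Compare two attribute dictionaries of one event or object.

    Each dictionary maps an attribute key to {"type": ..., "requirement": ...}.
    This simplified shape is an assumption, not the schema's file format.
    """
    problems = []
    for key, old_attr in old.items():
        if key not in new:
            problems.append(f"removed: {key}")
            continue
        new_attr = new[key]
        type_change = (old_attr["type"], new_attr["type"])
        if type_change[0] != type_change[1] and type_change != ("int", "long"):
            problems.append(f"type changed: {key}")
        if (new_attr["requirement"] == "required"
                and old_attr["requirement"] != "required"):
            problems.append(f"now required: {key}")
    for key, new_attr in new.items():
        if key not in old and new_attr["requirement"] == "required":
            problems.append(f"added as required: {key}")
    return problems


old = {"count": {"type": "int", "requirement": "optional"},
       "name": {"type": "string", "requirement": "recommended"},
       "legacy": {"type": "string", "requirement": "optional"}}
new = {"count": {"type": "long", "requirement": "optional"},
       "name": {"type": "string", "requirement": "required"},
       "owner": {"type": "string", "requirement": "required"}}
```

In this example the int to long change on count is accepted, while the tightened requirement on name, the removal of legacy, and the new required owner attribute are all reported.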
The convention we try to stick to when authoring OCSF classes and objects is to use status_id and its sibling status for the result of an activity, usually as a class attribute, and to use state_id and its sibling state for the state of an object. The latter might sound obvious, but it may be less obvious that status should not be used for objects. The reasoning is that an object exists independent of time or an activity or action, and therefore it has a state. It could just as easily have had a status over an indeterminate period of time, but we have tried to distinguish between the two situations by reserving status for the point-in-time result of an activity or action.