[RFC] Stage 0: Introduce Entity Field Set into ECS #2434

tinnytintin10 · 2025-01-29T08:22:45Z

Overview

An entity represents a discrete, identifiable component within an IT environment that can be described by a set of attributes and maintains its identity over time. Entities can be physical (like hosts or devices), logical (like containers or processes), or abstract (like applications or services).

Currently, ECS provides specific field sets for certain categories of entities (e.g., host, user, cloud, orchestrator) to capture their metadata. However, as IT infrastructure continues to evolve, we encounter an increasing number of entity types that don't cleanly fit into existing field sets – for example, storage services like S3, database instances like DynamoDB, or various other cloud services and IT-related infrastructure components (both digital and physical).

This RFC proposes a new entity field set that aims to solve this and several other challenges. It is currently at Stage 0 (strawperson), and we are seeking initial feedback on the approach and concept. See /rfcs/text/0049-entity-fields.md for more details.

Sponsor: @MikePaquette & @YulNaumenko
Author: @tinnytintin10

PR Guidelines

Have you signed the contributor license agreement? ✅
Have you followed the contributor guidelines? ✅
For proposing substantial changes or additions to the schema, have you reviewed the [RFC process](https://github.com/elastic/ecs/blob/main/rfcs/README.md)? ✅
If submitting code/script changes, have you verified all tests pass locally using make test? N/A
If submitting schema/fields updates, have you generated new artifacts by running make and committed those changes? N/A
Is your pull request against main? Unless there is a good reason otherwise, we prefer pull requests against main and will backport as needed. ✅
Have you added an entry to the CHANGELOG.next.md? N/A

tinnytintin10 · 2025-01-29T08:29:46Z

@MikePaquette @YulNaumenko, I've drafted the RFC to introduce the entity field set into ECS like we talked about. Before taking it out of draft, I wanted to check with you both to see if there's anything you think should be included or addressed as part of this stage. Lmk 🙏🏾

oren-zohar · 2025-02-11T12:51:22Z

rfcs/text/0049-entity-fields.md

+| entity.name | keyword, text | The human-readable name of the entity. The keyword field enables exact matches for filtering and aggregations, while the text field enables full-text search. For entities with dedicated field sets (e.g., `host`), this field should mirror the corresponding *.name value. |
+| entity.address | keyword | A URI, URL, or other direct reference to access or locate the entity in its source system. This could be an API endpoint, web console URL, or other addressable location. Format may vary by entity type and source system. |
+| entity.Attributes.* | object |  Entity type-specific attributes using capitalized field names to indicate custom field space. The capital `A` in "Attributes" and the capitalization of all subfields (e.g., `entity.Attributes.StorageClass`, `entity.Attributes.EngineVersion`) distinguishes these as custom entity-type-specific fields that won't be enumerated in the ECS schema.  | 
+| entity.metadata.* | flattened | A flexible container for entity metadata that doesn't fit into other structured fields. This field uses the flattened type to allow arbitrary key-value pairs while maintaining searchability. Useful for provider-specific or non-standardized attributes that don't warrant dedicated fields. |


@tinnytintin10 would we still need it?

tinnytintin10 · 2025-02-24

@oren-zohar Can you elaborate? Are you referring to entity.metadata (i.e., do we still need metadata when we have attributes)? Also, see the updated name and description (it is now entity.raw instead of entity.metadata).
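
To make the attributes vs. raw distinction concrete, here is a minimal sketch of what a document for an S3 bucket might look like with the renamed fields. It is an illustration only and not taken from the RFC: the bucket, its values, the keys under entity.raw, and the `dotted_fields` helper are all hypothetical, and the subfield casing under entity.attributes is an assumption.

```python
import json

# Hypothetical S3 bucket entity; every value here is made up for illustration.
entity_doc = {
    "entity": {
        "id": "arn:aws:s3:::example-bucket",
        "name": "example-bucket",
        "category": "bucket",
        # Entity-type-specific attributes (subfield casing is an assumption).
        "attributes": {"StorageClass": "STANDARD"},
        # Catch-all for provider-specific key-value pairs (keys are hypothetical).
        "raw": {"versioning": "Enabled", "owner_team": "data-platform"},
    }
}


def dotted_fields(doc, prefix=""):
    """Flatten a nested dict into dotted field names such as entity.raw.versioning."""
    flat = {}
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(dotted_fields(value, path))
        else:
            flat[path] = value
    return flat


print(json.dumps(dotted_fields(entity_doc), indent=2))
```

In this sketch, values a consumer is expected to query by a known name sit under entity.attributes, while everything else from the provider lands in entity.raw.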

maxcold · 2025-02-21T14:25:56Z

rfcs/text/0049-entity-fields.md

+
+| Field | Type | Description |
+|-------|------|-------------|
+| entity.id | keyword | A unique identifier for the entity. This should be a stable, unique value that persists across different observations of the same entity. For entities with dedicated field sets (e.g., host.id, user.id), this value should match the corresponding *.id field. |


Do we want to solve the problem of different id/name sources for entities, e.g. EC2 instance ID vs. ARN?

tinnytintin10 · 2025-02-24

I've updated the entity.id field description to provide guidance for choosing between multiple identifiers. To be transparent, there will always be some ambiguity in this selection that we can't fully resolve yet, as it depends on specific use cases and contexts. However, we can provide these basic criteria for selecting the primary identifier:

- Persists across the entity's lifecycle
- Ensures uniqueness within its scope
- Is commonly used for queries and correlation
- Is readily available in most observations (events, logs, etc.)

Alternative identifiers are preserved in entity.raw. Wdyt?
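
For illustration, the criteria above could be applied roughly like this. This is a sketch under assumptions, not the RFC's method: the `Candidate` type, the `pick_primary` function, the tie-breaking order, and the `alternate_ids` key under entity.raw are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str                   # e.g. "instance_id" or "arn"
    value: str
    stable: bool                # persists across the entity's lifecycle
    unique_in_scope: bool       # unique within its scope
    used_for_correlation: bool  # commonly used for queries and correlation
    availability: float         # share of observations that carry it (0.0 to 1.0)


def pick_primary(candidates):
    """Pick entity.id from several identifiers and keep the rest in entity.raw."""
    # Stability and uniqueness are treated as hard requirements.
    eligible = [c for c in candidates if c.stable and c.unique_in_scope]
    if not eligible:
        raise ValueError("no stable, unique identifier available")
    # Among eligible identifiers, prefer correlation use, then availability.
    primary = max(eligible, key=lambda c: (c.used_for_correlation, c.availability))
    alternates = {c.name: c.value for c in candidates if c is not primary}
    return {"entity": {"id": primary.value, "raw": {"alternate_ids": alternates}}}


# Hypothetical EC2 instance with two identifiers.
ec2 = [
    Candidate("arn", "arn:aws:ec2:us-east-1:123456789012:instance/i-0abc", True, True, True, 0.4),
    Candidate("instance_id", "i-0abc", True, True, True, 0.95),
]
doc = pick_primary(ec2)
print(doc["entity"]["id"])
```

With these made-up availability numbers the instance id wins and the ARN is kept as an alternate; different sources could reasonably score the two the other way around, which is the ambiguity mentioned above.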

maxcold · 2025-02-25

Yeah, I don't have a good way to solve it tbh. The only thing I can think of is another field, smth like an id type, which could be smth like arn or instance-id, but I'm not sure it's a good idea.

romulets · 2025-02-26

The issue with ids is beyond the scope of this RFC, in my opinion. More important than knowing what type of id is being persisted is having a consistent id for that entity when it is ingested from different sources. I don't think adding an id type field would ensure consistency; it only gives context, and I'm not sure that context is needed.

We need consistent ids and entity resolution when needed. Neither of these solutions touches the RFC right now, as far as I understand.

maxcold · 2025-02-21T14:29:29Z

rfcs/text/0049-entity-fields.md

+|-------|------|-------------|
+| entity.id | keyword | A unique identifier for the entity. This should be a stable, unique value that persists across different observations of the same entity. For entities with dedicated field sets (e.g., host.id, user.id), this value should match the corresponding *.id field. |
+| entity.source | keyword | The module or integration that provided this entity data (similar to event.module). |
+| entity.category | keyword | A standardized high-level classification of the entity type. This provides a normalized way to group similar entities across different providers or systems. Example values: `bucket`, `database`, `container`, `function`, `queue`, `host`, `user`, etc. There will be an allowed set of values maintained for this field to ensure consistency. |


I'm guessing category and type are chosen to be consistent with event.* , but should we be consistent here? I personally always found type and category confusing, especially when it comes to understanding what is higher in the hierarchy. Wouldn't it be simpler to have category and sub-category or smth like that?

JordanSh · 2025-02-23

I've had some offline discussions with Tin on this subject. My suggestion is type as the higher classification and subtype (or anything else) as the secondary one. This will better align with Observability Inventory, which uses entity.type to search, group and filter.

It will also align better with our codebase, which already mentions and uses entity.type quite a bit. Having two different fields will require us to "juggle" between them depending on the part of the code we are working in.

tinnytintin10 · 2025-02-24

I like the suggestion of using more explicitly hierarchical terms. Something like type/sub_type would make the relationship self-evident without needing to know any conventions.

However, I deliberately aligned with ECS's categorization hierarchy (where event.category is higher level than event.type) when coming up with this. I'd love to get input from the ECS team on whether maintaining this consistency is warranted here. While consistency with existing patterns has value, entities serve a different purpose and might benefit from more intuitive naming.

cc @MikePaquette


cla-checker-service · 2025-02-24T02:45:37Z

❌ Author of the following commits did not sign a Contributor Agreement:
c20956e, 8d583c0, 347c3d3

Please, read and sign the above mentioned agreement if you want to contribute to this project


tinnytintin10 · 2025-02-24T03:35:50Z

Reviewed this with @MikePaquette, and we are good to go for broader reviews 🚀

cc @tehilashn @oren-zohar @YulNaumenko

opauloh · 2025-02-25T19:52:25Z

rfcs/text/0049-entity-fields.md

+|-------|------|-------------|
+| entity.id | keyword | A unique identifier for the entity. This should be a stable, unique value that persists across different observations of the same entity. For entities with dedicated field sets (e.g., host.id, user.id), this value should match the corresponding *.id field. |
+| entity.source | keyword | The module or integration that provided this entity data (similar to event.module). |
+| entity.category | keyword | A standardized high-level classification of the entity type. This provides a normalized way to group similar entities across different providers or systems. Example values: `bucket`, `database`, `container`, `function`, `queue`, `host`, `user`, etc. There will be an allowed set of values maintained for this field to ensure consistency. |


> There will be an allowed set of values maintained for this field to ensure consistency.

In that case, will there be an allowed_values property on entity.category, similar to the one on event.category? That would be nice, as it allows us to list the expected values and include a description for every category, such as the example below:

    - name: category
      allowed_values:
        - name: host
          description: >
            Entities in this category represent computing devices such as physical machines,
            virtual machines, or cloud instances. This includes hosts in on-premise data centers,
            cloud providers (AWS EC2, GCP Compute Engine, Azure VM), and edge devices. Events in
            this category may relate to system health, performance metrics, security monitoring,
            and configuration changes.
        - name: user
          description: >
            This category represents human or service identities that interact with systems and
            resources. Users can be defined in directories such as Active Directory, IAM roles in
            cloud providers, or application-specific accounts. Events in this category include
            authentication attempts, role changes, permissions updates, and identity-related
            security incidents.
        ...
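
As a rough illustration of why an enumerated list helps, a producer or a test could check documents against it along these lines. This is only a sketch: the set below contains just the example values from the field description (the final allowed set is not defined yet), and `check_category` is a hypothetical helper, not part of ECS tooling.

```python
# Example values copied from the entity.category description; not a final list.
ALLOWED_CATEGORIES = {"bucket", "database", "container", "function", "queue", "host", "user"}


def check_category(doc):
    """Return an error message if entity.category is missing or not allowed, else None."""
    category = doc.get("entity", {}).get("category")
    if category is None:
        return "entity.category is missing"
    if category not in ALLOWED_CATEGORIES:
        return f"entity.category '{category}' is not an allowed value"
    return None


print(check_category({"entity": {"category": "host"}}))
print(check_category({"entity": {"category": "s3"}}))
```

The first call passes, while the second is rejected because `s3` is a provider-specific name and not one of the normalized categories.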

opauloh · 2025-02-25T21:27:59Z

rfcs/text/0049-entity-fields.md

+
+| Field | Type | Description |
+|-------|------|-------------|
+| entity.risk.* | * | Fields for describing risk score and risk level of entities such as hosts and users. |


Should criticality be included in entity as well, i.e., entity.criticality?

JordanSh · 2025-02-26T14:22:15Z

rfcs/text/0049-entity-fields.md

+
+This approach would allow ECS to accommodate new types of entities without requiring continuous schema expansion through new field sets, while maintaining a consistent structure for entity representation.
+
+## Fields


Shouldn't tags and labels also be included in entity?

tinnytintin10 added 3 commits January 29, 2025 02:55

first draft of entity fieldset strawman ecs rfc

5ceb23b

overview update

ebb691c

included PR link in rfc doc

47951c9

tinnytintin10 requested review from YulNaumenko, MikePaquette, tehilashn, oren-zohar and jaredburgettelastic January 29, 2025 18:40

oren-zohar reviewed Feb 11, 2025


opauloh mentioned this pull request Feb 10, 2025

[Cloud Security] Asset Inventory table flyout controls elastic/kibana#208452

Merged

maxcold reviewed Feb 21, 2025


Updated entity.address to entity.url to follow similar usage by other fields (like event.url)

c20956e

Tinsae Erkailo added 2 commits February 23, 2025 22:06

refine entity.attributes (lowercase A) and replace entity.metadata with entity.raw.

8d583c0

clarify entity.id selection criteria

347c3d3

tinnytintin10 marked this pull request as ready for review February 24, 2025 03:34

tinnytintin10 requested a review from a team as a code owner February 24, 2025 03:34

tinnytintin10 requested review from romulets and hop-dev February 24, 2025 03:36

Merge branch 'main' into add-entity-fields

2fbfa28

opauloh reviewed Feb 25, 2025


JordanSh reviewed Feb 26, 2025



