chore: Add component specification #8858

Merged 19 commits on Aug 30, 2021
Changes from 9 commits
214 changes: 214 additions & 0 deletions docs/specs/component.md
@@ -0,0 +1,214 @@
# Component Specification

This document specifies Vector Component behavior (sources, transforms, and
sinks) for the development of Vector.

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”,
“SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be
interpreted as described in [RFC 2119].

<!-- MarkdownTOC autolink="true" style="ordered" indent=" " -->

1. [Introduction](#introduction)
1. [Scope](#scope)
1. [How to read this document](#how-to-read-this-document)
1. [Configuration](#configuration)
   1. [Options](#options)
      1. [`address`](#address)
      1. [`endpoint(s)`](#endpoints)
1. [Instrumentation](#instrumentation)
   1. [Batching](#batching)
   1. [Events](#events)
      1. [BytesReceived](#bytesreceived)
      1. [EventsReceived](#eventsreceived)
      1. [EventsSent](#eventssent)
      1. [BytesSent](#bytessent)
      1. [Error](#error)

<!-- /MarkdownTOC -->

## Introduction

Vector is a highly flexible observability data pipeline due to its directed
acyclic graph processing model. Each node in the graph is a Vector Component,
and in order to meet our [high user experience expectations] each Component must
adhere to a common set of behavioral rules. This document aims to clearly
outline these rules to guide new component development and ongoing maintenance.

## Scope

This specification addresses _direct_ component development and does not cover
aspects that components inherit "for free". For example, this specification does
not cover global context, such as `component_id`, that all components receive in
their telemetry by nature of being a Vector component.

## How to read this document

This document is written from the broad perspective of a Vector component.
Unless otherwise stated, a section applies to all component types (sources,
transforms, and sinks).

## Configuration

### Options
Member

My opinion is still as stated on #8877 (comment):

  • endpoint(s) for when a URL is used to configure a remote destination that vector is communicating with
  • address for when a simple ip:port combo is used to configure a remote destination
  • bind for any time we are binding to an address

I think bind (or listen) makes it clearer that Vector is binding to the port. address seems, to me, more suitable for outgoing connections.

I think having endpoint separate and only representing URLs is also clearer that you need a full URL and not just a host:port.

Contributor

What endpoint do we have that is totally separate from address?

Member

> What endpoint do we have that is totally separate from address?

Ah, sorry, I just mean the name should be distinct. That is we should call URLs endpoint in the config (like for the http sink). For components that send raw data to an ip:port I think address is a more suitable name.

Contributor

🤷 I'd favor endpoints for both, personally.

Member

👍 I feel more strongly about using bind (or listen) for options where Vector binds the given value than endpoint vs. address. I do think the distinction is useful though, so users immediately know what is expected and are less likely to, for example, think that they should provide a URL scheme to an address value.

Member

While I agree on the endpoint vs address distinction, I am still not a fan of either bind or listen. No existing component uses bind as the configured local address, which means all users configuring one of those components must change their config. Granted, we will be breaking metric names, but that doesn't make a component fail to work after upgrading.

Contributor

I think there will be a fight to the death over this naming

Member

I've yanked the option under dispute so we can move forward on the rest.

Member

Which option was yanked? I didn't see any new commits and it seems the same to me since last time I looked.

> Granted, we will be breaking metric names, but that doesn't make a component fail to work after upgrading.

In both the rename to bind case and the metric name case, I was thinking we'd maintain aliases for backwards compatibility.

Member

Oops, I pushed but missed the error message saying my HEAD was not up to date. Fixed.


#### `address`

When a component binds to an address, it should expose an `address` option that
takes a `string` representing a single address.

#### `endpoint(s)`

When a component sends data to a downstream target, it should expose an
`endpoint(s)` option that takes a `string` representing one or more
comma-separated endpoints.
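
To illustrate the shape of the `endpoint(s)` value, the sketch below splits a comma-separated string into individual endpoints. It is written in Python purely as compact pseudocode (Vector itself is not implemented in Python). The helper name `parse_endpoints`, the whitespace trimming, and the rejection of empty entries are all assumptions made for illustration and are not defined by this specification.

```python
from __future__ import annotations


def parse_endpoints(value: str) -> list[str]:
    """Split a comma-separated option value into a list of endpoints.

    Hypothetical helper: trimming whitespace and rejecting empty entries
    are assumptions, not behavior required by the specification.
    """
    endpoints = [part.strip() for part in value.split(",")]
    if any(not endpoint for endpoint in endpoints):
        raise ValueError(f"empty endpoint in {value!r}")
    return endpoints
```

A single endpoint is just the one-element case, so a component could accept the same option type whether it targets one destination or several.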

## Instrumentation

Vector components MUST be instrumented for optimal observability and monitoring.
This is required to drive various interfaces that Vector users depend on to
manage Vector installations in mission critical production environments.

### Batching

For performance reasons, components SHOULD instrument batches of Vector events
as opposed to individual Vector events. [Pull request #8383] demonstrated
meaningful performance improvements as a result of this strategy.

### Events

Vector implements an event-driven pattern ([RFC 2064]) for internal
instrumentation. This section lists all required and optional events that a
component MUST emit. It is expected that components will emit custom events
beyond those listed here that reflect component-specific behavior.

There is leeway in the implementation of these events:

* Events MAY be augmented with additional component-specific context. For
  example, the `socket` source adds a `mode` attribute as additional context.
* The naming of the events MAY deviate to satisfy implementation. For example,
  the `socket` source may rename the `EventsReceived` event to
  `SocketEventsReceived` to add additional socket-specific context.
* Components MAY emit events for batches of Vector events for performance
  reasons, but the resulting telemetry state MUST be equivalent to emitting
  individual events. For example, emitting the `EventsReceived` event for 10
  events MUST increment the `received_events_total` counter by 10.
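
The batching equivalence can be sketched as follows, using the counter names from the EventsReceived section of this document. The `Telemetry` class is a hypothetical in-memory stand-in for a metrics registry, and the byte sizes are made-up sample values; neither is part of Vector.

```python
from collections import Counter


class Telemetry:
    """Hypothetical in-memory stand-in for a metrics registry."""

    def __init__(self) -> None:
        self.counters = Counter()

    def emit_events_received(self, quantity: int, byte_size: int) -> None:
        # One emission may describe a single event or a whole batch.
        self.counters["received_events_total"] += quantity
        self.counters["received_event_bytes_total"] += byte_size


# Made-up byte sizes for three Vector events.
sizes = [12, 7, 30]

# Emitting once per event...
per_event = Telemetry()
for size in sizes:
    per_event.emit_events_received(1, size)

# ...must leave the same telemetry state as emitting once for the batch.
batched = Telemetry()
batched.emit_events_received(len(sizes), sum(sizes))
```

Both registries end with identical counters, which is the property the rule above asks batch-emitting components to preserve.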

#### BytesReceived

*Sources* MUST emit a `BytesReceived` event immediately after receiving bytes
from the upstream source and before the creation of a Vector event.

* Properties
  * `byte_size`
    * For UDP, TCP, and Unix protocols, the total number of bytes received from
      the socket excluding the delimiter.
    * For HTTP-based protocols, the total number of bytes in the HTTP body, as
      represented by the `Content-Length` header.
    * For files, the total number of bytes read from the file excluding the
      delimiter.
  * `protocol` - The protocol used to send the bytes (e.g., `tcp`, `udp`,
    `unix`, `http`, `https`, `file`).
  * `address` - If relevant, the bound address that the bytes were received
    from. For HTTP, this MUST be the host and path only, excluding the query
    string.
  * `path` - If relevant, the HTTP path, excluding query strings.
  * `socket` - If relevant, the socket number that bytes were received from.
  * `remote_address` - If relevant, the remote IP address of the upstream
    client.
  * `file` - If relevant, the absolute path of the file.
* Metrics
  * MUST increment the `received_bytes_total` counter by the defined
    `byte_size` value with the defined properties as metric tags.
* Logs
  * MUST log a `{byte_size} bytes received.` message at the `trace` level with
    the defined properties as structured data. It MUST NOT be rate limited.
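
As a rough illustration of the `BytesReceived` requirements, the sketch below derives an HTTP `address` without its query string and assembles the metric update and log record described above. The function names `http_address` and `bytes_received`, the returned dictionary layout, keeping the port as part of the host, and leaving `byte_size` out of the metric tags are assumptions for illustration only.

```python
from urllib.parse import urlsplit


def http_address(url: str) -> str:
    """Return the host and path of a URL, dropping scheme and query string.

    Assumption: the port (part of `netloc`) is kept as part of the host.
    """
    parts = urlsplit(url)
    return f"{parts.netloc}{parts.path}"


def bytes_received(byte_size: int, protocol: str, **context: str) -> dict:
    """Describe the telemetry a hypothetical BytesReceived event produces."""
    # Assumption: `byte_size` is the counter increment, not a metric tag.
    tags = {"protocol": protocol, **context}
    return {
        "counter": "received_bytes_total",
        "increment": byte_size,
        "tags": tags,
        "log": {
            "level": "trace",
            "message": f"{byte_size} bytes received.",
            "byte_size": byte_size,
            **tags,
        },
    }
```

For example, `bytes_received(128, "http", address=http_address("http://example.com:8080/ingest?token=abc"))` tags the counter with `example.com:8080/ingest`, so the token in the query string never reaches the telemetry.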

#### EventsReceived

*All components* MUST emit an `EventsReceived` event immediately after creating
or receiving one or more Vector events.

* Properties
  * `quantity` - The quantity of Vector events.
  * `byte_size` - The cumulative byte size of all events in JSON representation.
* Metrics
  * MUST increment the `received_events_total` counter by the defined `quantity`
    property with the other properties as metric tags.
  * MUST increment the `received_event_bytes_total` counter by the defined
    `byte_size` property with the other properties as metric tags.
* Logs
  * MUST log a `{quantity} events received.` message at the `trace` level with
    the defined properties as structured data. It MUST NOT be rate limited.
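
One possible way to compute the `quantity` and `byte_size` properties is sketched below. The specification only says "JSON representation"; measuring compact, UTF-8 encoded JSON is an assumption of this sketch, as are the helper names `json_byte_size` and `events_received` and the use of plain dictionaries to stand in for Vector events.

```python
import json


def json_byte_size(event: dict) -> int:
    """Byte size of one event as compact, UTF-8 encoded JSON (an assumption)."""
    encoded = json.dumps(event, separators=(",", ":"), ensure_ascii=False)
    return len(encoded.encode("utf-8"))


def events_received(events: list) -> dict:
    """Build the properties and log message for a batch of received events."""
    quantity = len(events)
    byte_size = sum(json_byte_size(event) for event in events)
    return {
        "quantity": quantity,
        "byte_size": byte_size,
        "message": f"{quantity} events received.",
    }
```

Because the size is summed per event, the result is the same whether the component emits once per event or once per batch.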

#### EventsSent

*All components* MUST emit an `EventsSent` event immediately before sending the
events downstream. This should happen before any transmission preparation, such
as encoding.

* Properties
  * `quantity` - The quantity of Vector events.
  * `byte_size` - The cumulative byte size of all events in JSON representation.
* Metrics
  * MUST increment the `sent_events_total` counter by the defined `quantity`
    property with the other properties as metric tags.
  * MUST increment the `sent_event_bytes_total` counter by the defined
    `byte_size` property with the other properties as metric tags.
* Logs
  * MUST log a `{quantity} events sent.` message at the `trace` level with the
    defined properties as structured data. It MUST NOT be rate limited.

#### BytesSent

*Sinks* MUST emit a `BytesSent` event immediately after sending bytes to the
downstream target, regardless of whether the transmission was successful.

* Properties
  * `byte_size`
    * For UDP, TCP, and Unix protocols, the total number of bytes placed on the
      socket excluding the delimiter.
    * For HTTP-based protocols, the total number of bytes in the HTTP body, as
      represented by the `Content-Length` header.
    * For files, the total number of bytes written to the file excluding the
      delimiter.
  * `protocol` - The protocol used to send the bytes (e.g., `tcp`, `udp`,
    `unix`, `http`, `https`, `file`).
  * `endpoint` - If relevant, the endpoint that the bytes were sent to. For
    HTTP, this MUST be the host and path only, excluding the query string.
  * `file` - If relevant, the absolute path of the file.
* Metrics
  * MUST increment the `sent_bytes_total` counter by the defined `byte_size`
    value with the defined properties as metric tags.
* Logs
  * MUST log a `{byte_size} bytes sent.` message at the `trace` level with
    the defined properties as structured data. It MUST NOT be rate limited.
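
The ordering rules for `EventsSent` and `BytesSent` can be sketched together: the first is emitted before encoding, the second after the send attempt whether or not it succeeded. The function `send_batch`, the `transport` callable, the `emitted` list used to record events, the newline delimiter, and the hard-coded `tcp` protocol are all hypothetical choices for this illustration.

```python
import json


def send_batch(events: list, transport, emitted: list) -> None:
    """Send a batch while recording telemetry in the order the spec describes."""
    # Per-event JSON, used here only to measure the EventsSent byte size.
    sized = [json.dumps(e, separators=(",", ":")).encode("utf-8") for e in events]

    # EventsSent is emitted before any transmission preparation.
    emitted.append(
        ("EventsSent", {"quantity": len(events), "byte_size": sum(map(len, sized))})
    )

    # Transmission preparation: newline-delimited payload (an assumption).
    payload = b"\n".join(sized)
    try:
        transport(payload)
    finally:
        # BytesSent is emitted even if the transport raised; the byte size
        # excludes the newline delimiters.
        emitted.append(
            ("BytesSent", {"byte_size": sum(map(len, sized)), "protocol": "tcp"})
        )
```

Using `finally` keeps the `BytesSent` emission on the failure path as well, while still letting the transport error propagate to the caller, where a component would typically emit its own error event.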

#### Error

*All components* MUST emit error events when an error occurs, and errors MUST be
named with an `Error` suffix. For example, the `socket` source emits a
`SocketReceiveError` representing any error that occurs while receiving data off
of the socket.

This specification does not list a standard set of errors that components must
implement since errors are specific to the component.

* Properties
  * `error` - The string representation of the error.
  * `stage` - The stage at which the error occurred. MUST be one of `receiving`,
    `processing`, `sending`.
Contributor Author

@binarylogic Aug 25, 2021

This is a property I don't love. The idea is to provide better signal around where the error is happening.

Member

I'm not sure it's even necessary. I think it's a good way of thinking about these errors, that the stage at which they happen is an important piece of information. However, these errors are typically tied to exactly one stage, and so adding it to the event structure is just extraneous.

* Metrics
  * MUST increment the `errors_total` counter by 1 with the defined properties
    as metric tags.
  * MUST increment the `discarded_events_total` counter by the number of Vector
    events discarded if the error resulted in discarding (dropping) events.
* Logs
  * MUST log a `{stage} error: {error}` message at the `error` level with the
    defined properties as structured data. It SHOULD be rate limited to 10
    seconds.
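
A minimal sketch of the error requirements follows. The `ErrorReporter` class is hypothetical, as is its explicit `now` parameter (used instead of a real clock so the behavior is deterministic). Two further assumptions: "rate limited to 10 seconds" is read as at most one log line per stage every 10 seconds, and only `stage` is used as a metric tag to keep the example small.

```python
from collections import Counter

STAGES = {"receiving", "processing", "sending"}


class ErrorReporter:
    """Hypothetical recorder for error events, their metrics, and logs."""

    def __init__(self, rate_limit_secs: float = 10.0) -> None:
        self.counters = Counter()
        self.logs = []
        self._last_logged = {}
        self._rate_limit_secs = rate_limit_secs

    def emit(self, error: str, stage: str, discarded: int = 0, now: float = 0.0) -> None:
        if stage not in STAGES:
            raise ValueError(f"unknown stage: {stage!r}")

        # Metrics are always updated, even when the log line is suppressed.
        self.counters[("errors_total", stage)] += 1
        if discarded:
            self.counters[("discarded_events_total", stage)] += discarded

        # Assumed rate limit: one log line per stage per interval.
        last = self._last_logged.get(stage)
        if last is None or now - last >= self._rate_limit_secs:
            self.logs.append(f"{stage} error: {error}")
            self._last_logged[stage] = now
```

Keeping the counter updates outside the rate-limit check means the metrics stay exact while only the log output is throttled.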

[high user experience expectations]: https://github.com/timberio/vector/blob/master/docs/USER_EXPERIENCE_DESIGN.md
[Pull request #8383]: https://github.com/timberio/vector/pull/8383/
[RFC 2064]: https://github.com/timberio/vector/blob/master/rfcs/2020-03-17-2064-event-driven-observability.md
[RFC 2119]: https://datatracker.ietf.org/doc/html/rfc2119