chore: RFC #1999 - 2020-03-06 - API extensions for lua transform #2000

Merged on Mar 17, 2020 (20 commits)
175 changes: 175 additions & 0 deletions rfcs/2020-03-06-1999-api-extensions-for-lua-transform.md
@@ -0,0 +1,175 @@
# RFC #1999 - 2020-03-06 - API extensions for `lua` transform

This RFC proposes a new API for the `lua` transform.

## Motivation

Currently the [`lua` transform](https://vector.dev/docs/reference/transforms/lua/) has some limitations in its API. In particular, the following features are missing:

* **Nested Fields**

Currently accessing nested fields is possible using the dot notation:

```lua
event["nested.field"] = 5
```

Review comment (Contributor):

Perhaps we can expose this functionality in a `.get` API call or something?

Review comment (Author):

We can. There are some intricacies: for example, if it is added as a method, it would not work with new user-created events, which are tables:

    event = {
      log = {
        -- ...
      }
    }
    emit(event)

Here `event` is not a userdata but a table, so it would not have a `get` method.

However, it might be possible to provide a global function for accessing this, although see #2000 (comment).

However, users expect nested fields to be accessible as native Lua structures, for example like this:

```lua
event["nested"]["field"] = 5
```

See [#706](https://github.com/timberio/vector/issues/706) and [#1406](https://github.com/timberio/vector/issues/1406).

* **Setup Code**

Some scripts require expensive setup steps, for example, loading of modules or invoking shell commands. These steps should not be part of the main transform code.

For example, this code, which adds a custom hostname,

```lua
if event["host"] == nil then
  local f = io.popen("/bin/hostname")
  local hostname = f:read("*a") or ""
  f:close()
  hostname = string.gsub(hostname, "\n$", "")
  event["host"] = hostname
end
```

should be split into two parts, the first part executed just once at initialization:

```lua
local f = io.popen("/bin/hostname")
local hostname = f:read("*a") or ""
f:close()
hostname = string.gsub(hostname, "\n$", "")
```

and the second part executed for each incoming event:

```lua
if event["host"] == nil then
  event["host"] = hostname
end
```

See [#1864](https://github.com/timberio/vector/issues/1864).

* **Control Flow**

It should be possible to define channels for output events, similar to how it is done in the [`swimlanes`](https://vector.dev/docs/reference/transforms/swimlanes/) transform.

See [#1942](https://github.com/timberio/vector/issues/1942).

## Prior Art

The current implementation of the `lua` transform has the following design:

* There is a `source` parameter which takes a string of code.
* When a new event comes in, the global variable `event` is set inside the Lua context and the code from `source` is evaluated.
* After that, Vector reads the global variable `event` as the processed event.
* If the global variable `event` is set to `nil`, then the event is dropped.
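
The steps above can be approximated in plain Lua. This is only a sketch of the execution model, not Vector's code: `make_transform` is a hypothetical helper, events are ordinary tables rather than `userdata`, and the source is compiled once for brevity.

```lua
-- Hypothetical stand-in for the transform's per-event loop; not Vector's actual code.
local load_chunk = loadstring or load  -- Lua 5.1 provides loadstring; 5.2+ uses load

-- Compile the user-supplied source, then run it for every incoming event.
local function make_transform(source)
  local chunk = assert(load_chunk(source))
  return function(input)
    event = input          -- expose the incoming event as the global `event`
    chunk()                -- evaluate the user's code
    local output = event   -- read the global back as the processed event
    event = nil            -- clear the global between events
    return output          -- nil means the event was dropped
  end
end

local transform = make_transform([[
  if event["level"] == "debug" then
    event = nil                 -- drop debug events
  else
    event["processed"] = true   -- mutate everything else
  end
]])

local kept = transform({ level = "info", message = "hello" })
local dropped = transform({ level = "debug", message = "noise" })
```

Here `kept` is the mutated event and `dropped` is `nil`, mirroring the rule that setting `event` to `nil` drops the event.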

Events have type [`userdata`](https://www.lua.org/pil/28.1.html) with custom [metamethods](https://www.lua.org/pil/13.html), so they are views into Vector's events. Passing an event to Lua therefore costs nothing up front: data is copied to Lua only when fields are actually accessed.

The fields are accessed through string indexes using [Vector's dot notation](https://vector.dev/docs/about/data-model/log/#dot-notation).
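
The "view with metamethods" idea can be imitated in plain Lua with a proxy table. The sketch below is an approximation under stated assumptions: `new_event_view` and `split_path` are hypothetical names, `storage` stands in for the event that lives on the Vector side, and dots are never escaped.

```lua
-- Split "a.b.c" into {"a", "b", "c"}. Escaping of dots is not handled.
local function split_path(path)
  local parts = {}
  for part in string.gmatch(path, "[^.]+") do
    parts[#parts + 1] = part
  end
  return parts
end

-- Wrap a nested table in a proxy whose string keys are dot-notation paths.
-- `storage` is a stand-in for the event data owned by Vector.
local function new_event_view(storage)
  return setmetatable({}, {
    __index = function(_, path)
      local node = storage
      for _, part in ipairs(split_path(path)) do
        if type(node) ~= "table" then return nil end
        node = node[part]
      end
      return node
    end,
    __newindex = function(_, path, value)
      local parts = split_path(path)
      local node = storage
      for i = 1, #parts - 1 do
        if type(node[parts[i]]) ~= "table" then
          node[parts[i]] = {}   -- create (or overwrite) intermediate maps
        end
        node = node[parts[i]]
      end
      node[parts[#parts]] = value
    end,
  })
end

local storage = { host = "web-1" }
local event = new_event_view(storage)
event["nested.field"] = 5   -- goes through __newindex and updates `storage`
```

The proxy itself stays empty, so every read and write is forwarded to `storage`; this is the property that lets the real transform defer copying until a field is touched.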

## Guide-level Proposal

### Motivating example


```toml
[transforms.lua]
type = "lua"
inputs = []
# Main code: count each incoming event, then drop it.
source = """
counter = counter + 1
event = nil
"""
[transforms.lua.hooks]
init = """
counter = 0
previous_timestamp = os.time()
event = Event.new_log()
event["message"] = "starting up"
event:set_lane("auxiliary")
"""
shutdown = """
final_stats_event = Event.new_log()
final_stats_event["stats"] = { count = counter, interval = os.time() - previous_timestamp }
final_stats_event["stats.rate"] = final_stats_event["stats"].count / final_stats_event["stats.interval"]

shutdown_event = Event.new_log()
shutdown_event["message"] = "shutting down"
shutdown_event:set_lane("auxiliary")

event = {final_stats_event, shutdown_event}
"""
# Every 10 seconds: emit the accumulated stats and reset the counter.
[[transforms.lua.timers]]
interval = 10
source = """
event = Event.new_log()
event["stats"] = { count = counter, interval = 10 }
event["stats.rate"] = event["stats"].count / event["stats.interval"]
counter = 0
previous_timestamp = os.time()
"""
# Every 60 seconds: emit a heartbeat into the auxiliary lane.
[[transforms.lua.timers]]
interval = 60
source = """
event = Event.new_log()
event["message"] = "heartbeat"
event:set_lane("auxiliary")
"""
```

The code above consumes the incoming events, counts them, and emits statistics about these counts every 10 seconds. In addition, it sends debug logs about its own operation to a separate lane called `auxiliary`.

Review comments on the `hooks` section (`init = """`) of the example:

Review comment (Contributor):

Hmmmm, I originally considered functions (e.g. `start`, `stop`) but this works too!

Review comment (Contributor):

Notably, if we want a consistent interface for Lua, JavaScript, and later WASM, this may not be the best choice. It may be better to let folks specify some functions as hooks instead.

Review comment (Author):

Initially I was thinking about an API looking like this:

    source = """
      function (input, emit)
        input:on_event(function (event)
          -- do something
          emit(event)
        end)
        input:on_start(function ()
          event = -- ...
          emit(event)
        end)
        input:on_stop(function ()
          -- ...
        end)
        input:on_interval(10, function()
          event = -- ...
          emit(event)
        end)
      end
    """

However, if such a script is part of an already large TOML config file, such as the one from #1721 (comment), declaring the hooks using the TOML syntax might make it easier to read and reason about. It limits generality, but in return it makes it easier to both get started and reason about scripts written by other people, especially if the user is not deeply familiar with Lua programming.

If someone needs for some reason to use functions as hooks, that is possible too:

    hooks.start = """
    require "mymodule"
    function timer_handler(emit)
    -- ...
    end
    my_processor = mymodule.create_processor()
    """
    timers = [{ interval = 10, source = "timer_handler(emit)" }]
    source = "my_processor:process(event, emit)"

Review comment (Contributor):

> However, if such a script is a part of an already large TOML config file, such as the one from #1721 (comment), declaring the hooks using the TOML syntax might make it easier to read and reason about.

In the context of the `vector.toml` file I tend to agree. But I could also see users wanting to include separate `.lua` files, which makes this point moot. For example:

    source = """
    require("transform.lua")
    """

That said, what if we allowed the user to specify handlers instead of source code? For example:

    [transforms.lua]
      type = "lua"
      source = """
    counter = 0

    function init()
      -- ...
    end

    function process(event)
      -- ...
    end
    """

      handlers.init = "init" # default
      handlers.process = "process" # default
      handlers.shutdown = "shutdown" # default

      timers = [
        {handler = "flush", interval_secs = 10}
      ]

This would then allow a user to do something like:

    [transforms.lua]
      type = "lua"
      source = "require('transform.lua')"

      handlers.init = "init" # default
      handlers.process = "process" # default
      handlers.shutdown = "shutdown" # default

      timers = [
        {handler = "flush", interval_secs = 10}
      ]

Review comment (Contributor, @Hoverbear, Mar 9, 2020):

@a-rodin Right, so in your example each of those scripts shares a global state? I was under the initial impression they were distinct.

How would this work if the language didn't work well with random mutable global variables, like Rust? Would we just be concatenating them?

Review comment (Author, @ghost, Mar 10, 2020):

@binarylogic I think this approach suits both simple and advanced use cases. I'm going to update the RFC text.

@Hoverbear
I'm not sure what the best approach for Rust is, but one solution might be to define a trait with methods `init`, `process`, `timers`, and `shutdown`, which would then be implemented by the users (or have default implementations for methods like the `init`/`shutdown` hooks). So the execution model could still be the same, although the configuration would have to look somewhat different because it is not practical to write inline Rust in the config files.

On the other hand, as Lua is an interpreted language, for light transforms it would be possible to just write inline Lua functions with the current approach, whereas the trait-like approach would require writing unnecessary boilerplate code and would still not be idiomatic.

### Proposed changes

* Hooks for initialization and shutdown called `init` and `shutdown`. They are defined as strings of Lua code in the `hooks` section of the transform's configuration.
* Timers, which define pieces of code that are executed periodically. They are defined in the `timers` array. Each timer takes two configuration options: `interval`, the execution interval in seconds, and `source`, the code to be executed periodically.
* Support for setting the output lane using the `set_lane` method on the event, which takes a string as its parameter. It should also be possible to read the lane using the `get_lane` method. Downstream sinks can read from a lane by specifying the name of the transform suffixed by a dot and the name of the lane.
* Support for multiple output events by making it possible to set the `event` global variable to a [sequence](https://www.lua.org/pil/11.1.html) of events.
* Support for direct access to the nested fields (in both maps and arrays).
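
To make the lane and multiple-output items concrete, here is a rough plain-Lua sketch of how the value left in `event` might be normalized and routed. Everything in it is an assumption for illustration: the `Event` table is a stand-in built from ordinary tables, `collect_outputs` and `route` are hypothetical helpers, and using the bare transform name for events without a lane is a guess rather than something the RFC specifies.

```lua
-- Stand-in for the proposed Event API; real events would be userdata.
local lanes = setmetatable({}, { __mode = "k" })  -- event -> lane name
local Event = {}
Event.__index = Event

function Event.new_log() return setmetatable({}, Event) end
function Event:set_lane(name) lanes[self] = name end
function Event:get_lane() return lanes[self] end

-- Normalize the `event` global into a list of events:
-- nil -> no output, a single event -> one output, a sequence -> many outputs.
local function collect_outputs(value)
  if value == nil then return {} end
  if getmetatable(value) == Event then return { value } end
  return value  -- assumed to be a sequence of events
end

-- Group events by the name a downstream component would reference:
-- "<transform>.<lane>" when a lane is set, "<transform>" otherwise (assumption).
local function route(transform_name, value)
  local outputs = {}
  for _, ev in ipairs(collect_outputs(value)) do
    local lane = ev:get_lane()
    local name = lane and (transform_name .. "." .. lane) or transform_name
    outputs[name] = outputs[name] or {}
    table.insert(outputs[name], ev)
  end
  return outputs
end

local stats = Event.new_log()
stats["message"] = "stats"

local shutdown = Event.new_log()
shutdown["message"] = "shutting down"
shutdown:set_lane("auxiliary")

local routed = route("lua", { stats, shutdown })
```

With this sketch, `routed["lua"]` holds the stats event and `routed["lua.auxiliary"]` holds the shutdown event, which matches the "transform name, dot, lane name" addressing described in the list.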

## Sales Pitch

The proposal

* gives users more power to create custom transforms;
* does not break backward compatibility (except for `pairs` in the case of nested fields);
* makes it possible to add complexity to the configuration of the transform gradually only when needed.

## Drawbacks

The only drawback is that supporting both dot notation and classical indexing makes it impossible to add escaping of dots in field names. For example, for an incoming event structure like

```json
{
  "field.first": {
    "second": "value"
  }
}
```

accessing `event["field.first"]` would return `nil`.

However, given the nature of observability data, there seems to be no need to support both field names containing dots and nested fields.

Review comment (Contributor):

That's a good point. I'm in favor of not supporting the dot notation within runtimes, even if this is a breaking change. This syntax was one of the reasons that prompted a refactoring of our internal model, and dot notation should only be used in circumstances where a field must be accessed with a string (e.g. not in a runtime).
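
The ambiguity described in this section can be reproduced with ordinary Lua tables. This is only an illustration: `get_by_dot_notation` is a hypothetical helper that mimics a lookup where every dot separates path segments, and the table mirrors the JSON event shown earlier in this section.

```lua
-- Resolve a key the way dot notation does: every "." separates path segments.
local function get_by_dot_notation(event, key)
  local node = event
  for segment in string.gmatch(key, "[^.]+") do
    if type(node) ~= "table" then return nil end
    node = node[segment]
  end
  return node
end

-- The top-level key itself contains a dot.
local event = { ["field.first"] = { second = "value" } }

-- Dot notation looks for event.field.first, which does not exist.
local via_dots = get_by_dot_notation(event, "field.first")

-- A plain Lua table lookup treats the whole string as one key.
local via_raw_key = event["field.first"]
```

Because the same string can only mean one of the two things, a transform that interprets dots as separators has no way to reach the literal `field.first` key without an escaping syntax.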

## Outstanding Questions

* When accessing arrays, should the indexes be 0-based or 1-based? Vector uses 0-based indexing, while in Lua indexing is traditionally 1-based. However, it is technically possible to implement 0-based indexing for arrays stored inside events, as both [`__index`](https://www.lua.org/pil/13.4.1.html) and [`__len`](https://www.lua.org/manual/5.3/manual.html#3.4.7) need custom implementations in any case.

* Is it confusing that the same global variable name, `event`, is also used for outputting multiple events? The alternative, using a different name such as `events`, would raise questions of precedence if both `event` and `events` are set.
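
For the indexing question, a minimal sketch of what 0-based access through metamethods could look like is shown below. It uses a plain proxy table instead of `userdata`, `zero_based` is a hypothetical name, and the `__len` entry is only honored by the `#` operator on tables in Lua 5.2 and later.

```lua
-- Wrap a regular 1-based Lua sequence in a proxy that is indexed from 0.
local function zero_based(items)
  return setmetatable({}, {
    __index = function(_, i)
      if type(i) == "number" then return items[i + 1] end
      return nil
    end,
    __newindex = function(_, i, value)
      assert(type(i) == "number", "array index must be a number")
      items[i + 1] = value
    end,
    -- Honored by `#` for tables in Lua 5.2+; userdata would need it in any case.
    __len = function() return #items end,
  })
end

local tags = zero_based({ "a", "b", "c" })
local first, last = tags[0], tags[2]
tags[3] = "d"   -- appends, because index 3 maps to position 4 of the backing table
```

One cost of this choice is that the standard `ipairs` starts counting at 1, so iterating such a view with the standard library would not visit index 0 without further customization.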

## Plan of Action

- [ ] Add `init` and `shutdown` hooks.
- [ ] Add timers.
- [ ] Implement `set_lane` and `get_lane` methods on the events.
- [ ] Support multiple output events.
- [ ] Implement `Event.new_log()` function.
- [ ] Support direct access to the nested fields in addition to the dot notation.