Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: RFC #1999 - 2020-03-06 - API extensions for lua transform #2000

Merged
20 commits merged into from
Mar 17, 2020

Conversation

ghost
Copy link

@ghost ghost commented Mar 6, 2020

Rendered

Closes #1999


## Drawbacks

The only drawback is that supporting both dot notation and classical indexing makes it impossible to add escaping of dots in field names. For example, for incoming event structure like
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. I'm in favor of not supporting the dot notation within runtimes, even if this is a breaking change. This syntax was one of the reasons that prompted a refactoring of our internal model, and dot notation should only be used in circumstances where a field must be accessed with a string (eg not in a runtime).

Copy link
Contributor

@binarylogic binarylogic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! I am a big fan of the hooks and timers API. It's a very simple progressive UX versus handlers/closures. I have one question:

How are metrics and logs handled? Could you provide an example for each? I know in previous examples you had the log and metric nested under the event (which I liked), but I don't see that here.

@ghost
Copy link
Author

ghost commented Mar 8, 2020

How are metrics and logs handled? Could you provide an example for each? I know in previous examples you had the log and metric nested under the event (which I liked), but I don't see that here.

Initially I wanted to skip metrics from this RFC and to address this problem in subsequent ones and because I was trying to devise an approach alternative to nested logs and metrics which could have not broken backward compatibility. However, if are are going to add breaking change about disabling dot notation, then it makes sense to bundle all of this into a single RFC. I'm going to add it.

@binarylogic
Copy link
Contributor

Yeah. To handle backward compatibility I'm a fan of having users specify an API version. Ex:

[transforms.lua]
  type = "lua"
  version = "2" # defaults to 1

It's interesting to think about this pattern on a per-component basis. I'm not suggesting that we support the old syntax (although we can), but at the very least printing an error that the version attribute is not present which should signal to the user that their script should be updated.

This might be better solved with #1037, but we should pay mind to help users upgrade easily. Even though we detail this in release notes it can still be a frustrating process, especially if you're upgrading multiple minor versions.

@ghost
Copy link
Author

ghost commented Mar 8, 2020

It's interesting to think about this pattern on a per-component basis. I'm not suggesting that we support the old syntax (although we can), but at the very least printing an error that the version attribute is not present which should signal to the user that their script should be updated.

I like the per-component versioning, this seems to be much more flexible than per-config one.


* Should timestamps be automatically inserted to created logs and metrics created as tables inside the transform is they are not present?
* Are there better alternatives to the proposed solution for supporting of the timestamp type?
* Could some users be surprised if the transform which doesn't call `emit` function doesn't output anything?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a possibility, but I think docs are sufficient here.


## Outstanding Questions

* Should timestamps be automatically inserted to created logs and metrics created as tables inside the transform is they are not present?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why they should...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, stdin source inserts timestamps to newly created events automatically.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should not be possible for a source to emit an event without a timestamp field, so the lua transform should not concern itself with this. If a user explicitly removes the timestamp field then they'll deal with problems, but we should not cater to that edgecase.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The question is about newly created events inside the transforms, as the user can forget to add the timestamp to them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. Yes, in that case, we should populate the appropriate schema fields. The names should respect those provided as part of global log_schema setting. I'd argue that both timestamp and host should be automatically set.

-- without calling `emit` function nothing is produced by default
"""
[transforms.lua.hooks]
init = """
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmmm, I originally considered functions (Eg start, stop) but this works too!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notably, if we want a consistent interface for lua, javascript, and later wasm this may not be the best choice. It may be better to let folks specify some functions as hooks instead.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially I was thinking about an API looking like this:

source = """
  function (input, emit)
    input:on_event(function (event)
      -- do something
      emit(event)
    end)
    input:on_start(function ()
      event = -- ...
      emit(event)
    end)
    input:on_stop(function ()
      -- ...
    end)
    input:on_interval(10, function()
      event = -- ...
      emit(event)
    end)
  end
"""

However, if such a script is a part of an already large TOML config file, such as the one from #1721 (comment), declaring the hooks using the TOML syntax might make it easier to read and reason about. It limits generality, but in return it makes it easier to both get started and reason about scripts written by other people, especially if the user is not deeply familiar with Lua programming.

In someone needs for some reason to put functions as hooks, it is possible too:

hooks.start = """
require "mymodule"
function timer_handler(emit)
-- ...
end
my_processor = mymodule.create_processor()
"""
timers = [{ interval = 10, source = "timer_handler(emit)" }]
source = "my_processor:process(event, emit)"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, if such a script is a part of an already large TOML config file, such as the one from #1721 (comment), declaring the hooks using the TOML syntax might make it easier to read and reason about.

In the context of the vector.toml file I tend to agree. But I could also see users wanting to include separate .lua files, which makes this point moot. For example:

source = """
require("transform.lua")
"""

That said, what if allowed the user to specify handlers instead of source code? Ex:

[transforms.lua]
  type = "lua"
  source = """
counter = 0

function init
  # ...
end

function process(event)
  # ...
end
"""

  handlers.init = "init" # default 
  handlers.process = "process" # default
  handlers.shutdown = "shutdown" # default 

  timers = [
    {handler = "flush", interval_secs = 10}
  ]

This would then allow a user to do something like:

[transforms.lua]
  type = "lua"
  source = "require('transform.lua')"

  handlers.init = "init" # default 
  handlers.process = "process" # default
  handlers.shutdown = "shutdown" # default 

  timers = [
    {handler = "flush", interval_secs = 10}
  ]

Copy link
Contributor

@Hoverbear Hoverbear Mar 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@a-rodin Right, so in your example each of those scripts shares a global state? I was under the initial impression they were distinct.

How would this work if the language didn't work well with random mutable global variables, like Rust? Would we just be concatting them?

Copy link
Author

@ghost ghost Mar 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@binarylogic I think this approach suits both simple and advanced use cases. I’m going to update the RFC text.

@Hoverbear
I’m not sure what is the best approach for Rust, but I think that it might be a solution to define a trait with methods init, process, timers, shutdown which would then be implemented by the users (or have default implementations for methods like init/shutdown hooks). So the execution model could still be the same, although the configuration would have to appear somewhat differently because it is not practical to write inline Rust in the config files.

On the other hand, as Lua is an interpreted language, for light transforms it would be possible to just write inline Lua functions with the current approach, and the trait-like approach would require writing unnecessary boilerplate code and still be not idiomatic.

* Should timestamps be automatically inserted to created logs and metrics created as tables inside the transform is they are not present?
* Are there better alternatives to the proposed solution for supporting of the timestamp type?
* Could some users be surprised if the transform which doesn't call `emit` function doesn't output anything?
* `null` might present in the events would be lost because in Lua setting a field to `nil` means deletion. Is it acceptable? If it is not, it is possible to introduce a new kind of `userdata` for representing `null` values.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's your take on this @binarylogic ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's odd, but I think this is fine since it is a common Lua behavior.

Currently accessing nested fields is possible using the dot notation:

```lua
event["nested.field"] = 5
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we can expose this functionality in a .get API call or something?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can. There are some intricacies, for example if it is added as a method, it would not work with new user-created events which are tables:

event = {
  log = {
    -- ...
  }
}
emit(event)

Here event is not a userdata, but a table, so it would not have a get method.

However, it might be possible to provide a global function for accessing this, although see #2000 (comment).

type = "lua"
inputs = []
version = "2"
hooks.process = """
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really like the syntax here 👍

@binarylogic
Copy link
Contributor

I would also like to address backward compatibility in this RFC. I know we're planning to add a version attribute, but are we planning to support the old syntax as well? There is considerable user value in doing so, but I also want to migrate users off of the old version as soon as possible.

@lukesteensen
Copy link
Member

As the transforms become more complex, I want to avoid polluting the global namespace, especially with such a short name as emit. On the other hand, as handlers are functions in any case, it is easy to just pass the emitting function to them.

Functions like emit are exactly what I'd expect to be in the global namespace, not pollution IMO. Passing a function as an argument feels like a rather strange API to expose to users. If we're concerned about name collisions with globals when importing large wasm modules, I'd be more inclined to explore that ecosystem's capabilities for modularity and namespacing. Some globally available module like vector.emit could be another option.

I think about the following:

  • Add a new property for transforms, which would identify that a transform is able to produce events on arbitrary lane.
  • Add a new optional property for events which would identify the target lane.
  • When transforms and sinks are added to the topology, then if their inputs contain names like transform_name.lane_name and the transform with transform_name has this property set, then additional hidden transforms with names transform_name.lane_name are added to the topology (similarly to how hidden transforms are added in swimlanes) and connected to the original transform. These transforms would then do filtering using the events' property identifying the lane.

If I'm understanding this correctly, you'd use the fact that the "lanes" must be defined statically in another component's input to avoid the problem of truly dynamic lanes. This seems possible, but I worry about the complexity of the expansion and analysis we'd need to perform it. I have also been considering some tweaks to how our topology works that could potentially make this more natural. For those reasons, I would propose that we postpone this portion of the RFC. Could a 2-arity emit be something we do in a second phase, after we've thoroughly considered our options around topology?

@binarylogic
Copy link
Contributor

Could a 2-arity emit be something we do in a second phase, after we've thoroughly considered our options around topology?

I’d also be ok postponing this change, but it is definitely something I’d like to do as a follow up.

@ghost
Copy link
Author

ghost commented Mar 12, 2020

@binarylogic

I've added the versions section.

@lukesteensen

Passing a function as an argument feels like a rather strange API to expose to users.

I've updated the RFC to use vector.emit function instead of the one passed as an argument. But this kind of API is used, for example, here.

I think passing it as an argument instead of exposing it globally has a nice property of making it available only when it actually can be used. For example, calling vector.emit outside of a hook or timer handler (which can happen when the transform is just created, but the associated stream is not created yet) would result in an error because at that point the transform is not able to produce events. When the emitting function is passed as an argument to handlers, it is automatically unavailable in other parts of code where it can't be called (such as the source section of the config or the Lua code which creates the handler function).

@github-actions
Copy link

Great PR! Please pay attention to the following items before merging:

Files matching rfcs/**:

  • Have at least 3 team members approved this RFC?

This is an automatically generated QA checklist based on modified files

@binarylogic
Copy link
Contributor

@a-rodin just to be ultra-clear, we will be supporting version 1. In other words, we will not be breaking backward compatibility?

@ghost
Copy link
Author

ghost commented Mar 12, 2020

@a-rodin just to be ultra-clear, we will be supporting version 1. In other words, we will not be breaking backward compatibility?

Yes, we will support version 1. I will also be default, so the backward compatibility would not be broken.

I considered possibility of making version 2 default in distant future, when version 2 is tested in real world and stabilized, while keeping version 1 available for use by explicit specification of the version. On the other hand, one extra line in the config might be not worth breaking someone's config on upgrade, so we can just keep version 1 as the default indefinitely, and instead recommend to always use version 2 in the docs.

Copy link
Member

@lukesteensen lukesteensen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@a-rodin

I think passing it as an argument instead of exposing it globally has a nice property of making it available only when it actually can be used.

This is a good argument. I just want to make sure we're confident that users will find the API relatively easy to understand. We should also think about whether this first-class functions approach is something that'd work well in wasm-based plugins or not (/cc @Hoverbear).


@binarylogic

I’d also be ok postponing this change, but it is definitely something I’d like to do as a follow up.

I definitely see the appeal of the feature and agree we can pursue it. My point is really just that it involves an internal complexity high enough that it's probably better off considered on its own, instead of lumped in with this change. If it can be added in a backwards-compatible way later, that's what I'd recommend.

@binarylogic
Copy link
Contributor

I've updated the RFC to use vector.emit function instead of the one passed as an argument. But this kind of API is used, for example, here.

Given the linked example, I actually prefer passing in the callback, but I do not feel strongly. @Hoverbear you're the tiebreaker! 😄

@Hoverbear
Copy link
Contributor

We should also think about whether this first-class functions approach is something that'd work well in wasm-based plugins or not (/cc @Hoverbear).

Sorry about the delay, based on my experience so far with WASM all communication with the host machine needs to happen over traditional FFI, so any functionality a wasm plugin could access inside of vector would need to be exposed that way. So in the short term, WASM plugins will likely only be able to talk to vector through explicitly exposed extern C functions.

In theory, a RUst plugin and Vector could do some first class function-style system, but that's going to be black magic until we see this stuff standardized.

@binarylogic
Copy link
Contributor

@a-rodin go with whatever ‘emit’ syntax you think is best. I don’t think passing it as an argument is wrong, unless @luke has strong opinions.

@binarylogic binarylogic self-requested a review March 14, 2020 17:36
Copy link
Contributor

@binarylogic binarylogic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nicely done! This all looks great to me. Excited to implement this.

@lukesteensen
Copy link
Member

@a-rodin go with whatever ‘emit’ syntax you think is best. I don’t think passing it as an argument is wrong, unless @luke has strong opinions.

My opinion is not strong, and the argument of allowing emit only where it makes sense is a good one. I agree you should go with whichever you think is best.

@binarylogic
Copy link
Contributor

@lukesteensen would you mind approving the RFC is you don't have any other comments? @Hoverbear could you do the same?

@ghost ghost merged commit c8a5e4f into master Mar 17, 2020
@binarylogic
Copy link
Contributor

Nice work @a-rodin! Really happy with how this came out.

@ghost ghost deleted the rfcs/2020-03-06-1999-api-extensions-for-lua-transform branch March 25, 2020 17:12
@ghost ghost mentioned this pull request Apr 2, 2020
@binarylogic binarylogic added type: feature A value-adding code addition that introduce new functionality. and removed type: new feature labels Jun 16, 2020
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: feature A value-adding code addition that introduce new functionality.
Projects
None yet
4 participants