Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New remap transform v1 #3119

Closed
binarylogic opened this issue Jul 20, 2020 · 4 comments · Fixed by #3341
Closed

New remap transform v1 #3119

binarylogic opened this issue Jul 20, 2020 · 4 comments · Fixed by #3341
Assignees
Labels
domain: transforms Anything related to Vector's transform components type: feature A value-adding code addition that introduce new functionality.

Comments

@binarylogic
Copy link
Contributor

The first version is mostly focused on laying a solid foundation that:

  • Integrates properly with our event model
  • Produces user-friendly error messages
@binarylogic binarylogic added domain: transforms Anything related to Vector's transform components type: feature A value-adding code addition that introduce new functionality. labels Jul 20, 2020
@binarylogic binarylogic changed the title New vicscript transform v1 New remap transform v1 Aug 3, 2020
@Jeffail
Copy link
Contributor

Jeffail commented Aug 17, 2020

There are two major remaining features that ought to be added to the remap transform which in my view would otherwise block it from replacing all the current mapping transforms. Those two features are type coercion and error handling.

Type coercion is straight forward and in order to fit this in nicely with our existing syntaxes I would suggest we start small and add query functions (one for each type) that take a path parameter, it would look like this:

mapping = """
  .id = string(.id)
  .bytes_in = int(.bytes_in)
  .bytes_out = int(.bytes_in)
  .timestamp = timestamp(.timestamp, "%d/%m/%Y:%H:%M:%S %z")
  .status = int(.status)
"""

However, we're now introducing more ways in which our mappings might fail, and therefore it would be nice to be able to specify per-field how (and whether) to recover in the event where these type coercions cannot be performed. I suggest we allow an optional second parameter for these functions that defines the default value to use if for any reason the coercion is going to fail (the field doesn't exist, its value cannot be converted, etc). It would look like this:

mapping = """
  .id = string(.id)
  .bytes_in = int(.bytes_in, 0)
  .bytes_out = int(.bytes_in, 0)
  .timestamp = timestamp(.timestamp, "%d/%m/%Y:%H:%M:%S %z")
  .status = int(.status, 200)
"""

Note that in this example we don't specify a default for the field .id because we want the mapping to fail entirely if the id cannot be parsed. We also won't have defaults for timestamp initially as timestamp literals are probably worthy of their own RFC in order to agree on the syntax. However, you could do something like this as a short-term work around:

  .timestamp = timestamp(string(.timestamp, "11/10/1946:12:00:00 -0400"), "%d/%m/%Y:%H:%M:%S %z")

This syntax would also solve general purpose error handling as it would allow users to specify type assertions and also fallback values in the case where the target paths don't exist.

@jszwedko
Copy link
Member

If it's possible, I think allowing defaulting to null would be nice too for cases where the field cannot be coerced (or is missing). I think this would presumably look something like:

.id = string(.id, null)

I suppose there is also a distinction between null and unassigned that we could make to allow for the .id field to not be set if it could not be parsed and coerced.

@binarylogic
Copy link
Contributor Author

@Jeffail all of that sounds good to me. What do you think about @jszwedko's proposal for nulls? I opened #3479 for this work.

@Jeffail
Copy link
Contributor

Jeffail commented Aug 17, 2020

Sounds good, I think allowing null as an alternative default is reasonable, although we need to make it clear that the field will show in the output. For the value to represent not setting at all we'd need something special, that's where stuff like deleted() comes in, but we can approach that in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain: transforms Anything related to Vector's transform components type: feature A value-adding code addition that introduce new functionality.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants