VRL is_string function does not imply data type in filter transform #9406
Comments
I think this would work, though I'm not sure it's more readable than including
EDIT: fix vrl 😅
condition = '''message = string!(.message); !contains(message, s'test')'''
@JeanMertz or @StephenWakely could confirm that. Sidenote on your minified config.
Thanks for the quick response. I guess you mean
Otherwise, I'm curious about the performance impact of adding the string() function to each contains call. I want to apply this filter to approximately 1 billion lines a day, with 5 contains conditions.
I don't believe there should be much performance overhead to the casting (especially compared to calling I opened vectordotdev/vrl#91 to track this as an enhancement. It is something we've thought about before, but haven't fully discussed or decided on the approach. I'll close out this issue in lieu of that one, but please feel free to add any additional thoughts there. Notably, we have some upcoming work to add the concept of "schemas" to Vector in #9388, which would also help address this by not requiring the type cast.
Just to confirm, it does seem the overhead of calling Local benchmarks:
@jszwedko I imagine those benches are for a single usage of either function, right? Considering I want to use 5 Perhaps an idea that comes up: it would be cool if there was a way to benchmark your Vector configs end to end, for example using a generator or file source and a blackhole sink. This would basically benchmark your whole topology of intermediate transforms at once. I could see myself using this to optimize my configs.
Yeah, these benches are just for calling the individual function. For separate
Yeah, this would be cool. It is definitely possible right now; you just have to wire it up yourself (comment out all sources/sinks and replace them with a generator/blackhole). I could see it being valuable for us to make this easier, though, for sure.
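As a rough sketch of that manual wiring, a benchmark config might look like the following. The component names (bench_in, drop_noise, bench_out), the sample lines, and the count are hypothetical. The generator options shown (format, lines, interval, count) are assumed to match the Vector version in use; newer releases rename this source to demo_logs, so check the docs for your version.

```toml
# Hypothetical end-to-end benchmark topology: generator -> filter -> blackhole.

[sources.bench_in]
type = "generator"          # assumption: called "demo_logs" in newer Vector releases
format = "shuffle"          # emit the lines below in random order
lines = ["a test line", "a regular line"]   # made-up sample data
interval = 0.0              # assumption: no pause between batches
count = 1000000             # hypothetical number of lines to emit

[transforms.drop_noise]
type = "filter"
inputs = ["bench_in"]
# Same condition as suggested earlier in this thread.
condition = '''message = string!(.message); !contains(message, s'test')'''

[sinks.bench_out]
type = "blackhole"          # discards events, so sink cost stays out of the measurement
inputs = ["drop_noise"]
```

Timing a run of this config with and without the string! cast would give a comparison for the whole transform chain rather than for a single function call.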
Vector Version
Vector Configuration File
Debug Output
Expected Behavior
I believe this should work. The runtime error comes from the contains() function expecting .message to be a string.
Actual Behavior
In order to make this work, you need to use, for example,
However, my config is a minified example. I use several contains statements, and adding the string() function to all of them makes the config harder to read, and is potentially wasteful if Vector has to do the same check multiple times.
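One way to avoid repeating the cast, following the suggestion in the comments, is to assign the cast result to a variable once and reuse it in every contains call. This is a minimal sketch, not the reporter's actual config: the component names (drop_noise, in) and the second search string are hypothetical.

```toml
# Sketch: cast .message to a string once, then reuse the variable.
[transforms.drop_noise]      # hypothetical transform name
type = "filter"
inputs = ["in"]              # hypothetical upstream component
condition = '''
# string! aborts the program if .message is not a string,
# so the type check happens once per event.
message = string!(.message)

# The last expression is the boolean result of the condition.
!contains(message, "test") && !contains(message, "debug")
'''
```

If .message is not a string, the program aborts at runtime. How the filter transform treats such events is worth verifying for the Vector version in use.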
Example Data
Additional Context
References