-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sane flush #91
Sane flush #91
Conversation
Conflicts: lib/logstash/outputs/amqp.rb
if gzip | ||
@files[path] = Zlib::GzipWriter.new(@files[path]) | ||
end | ||
class << @files[path] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eek, monkeypatching. Bad mojo. I'd much prefer something that was state held in the plugin instead of in a monkeypatched part of the File or GZipWriter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're going to wrap things and track state like activity, it might be simpler to write your own IO-wrapper class
class MyFile
def initialize(io)
@io = io
end
def write(*args)
@last_write = TIme.now
@io.write(*args)
end
...
end
And delegate flush/etc to @io, but add activity checks and stuff
overall this looks pretty good, I added some in-line comments on specific issues. Thanks for your work! :) |
Erk!!! my mailbox feels like messanger! Thanks for the review. The reason for the flushes is C habits (although $ ruby test_flush.rb $ jruby test_flush.rb my guess is that Java is doing some dark IO magic somewhere... On Sat 28 Jan 2012 01:41:29 AM IST, Jordan Sissel wrote:
|
Righto then. Can you add a big comment explaining what exactly you're monkeypatching and why, then for every method you invoke later (fd.active, etc) put a comment there saying explicitly that you are invoking a monkeypatched method - this'll help future debugging efforts :) |
I removed the monkey patching as per your suggestions. |
Add flush intervals and gzip support.
Thanks a bunch for this :) |
Adds a `config.field_reference.escape_style` option and a companion command-line flag `--field-reference-escape-style` allowing a user to opt into one of two proposed escape-sequence implementations for field reference parsing: - `PERCENT`: URI-style `%`+`HH` hexadecimal encoding of UTF-8 bytes - `AMPERSAND`: HTML-style `&#`+`DD`+`;` encoding of decimal Unicode code-points The default is `NONE`, which does _not_ proccess escape sequences. With this setting a user effectively cannot reference a field whose name contains FieldReference-reserved characters. | ESCAPE STYLE | `[` | `]` | | ------------ | ------- | ------- | | `NONE` | _N/A_ | _N/A_ | | `PERCENT` | `%5B` | `%5D` | | `AMPERSAND` | `&elastic#91;` | `&elastic#93;` |
Adds a `config.field_reference.escape_style` option and a companion command-line flag `--field-reference-escape-style` allowing a user to opt into one of two proposed escape-sequence implementations for field reference parsing: - `PERCENT`: URI-style `%`+`HH` hexadecimal encoding of UTF-8 bytes - `AMPERSAND`: HTML-style `&#`+`DD`+`;` encoding of decimal Unicode code-points The default is `NONE`, which does _not_ proccess escape sequences. With this setting a user effectively cannot reference a field whose name contains FieldReference-reserved characters. | ESCAPE STYLE | `[` | `]` | | ------------ | ------- | ------- | | `NONE` | _N/A_ | _N/A_ | | `PERCENT` | `%5B` | `%5D` | | `AMPERSAND` | `&elastic#91;` | `&elastic#93;` |
Adds a `config.field_reference.escape_style` option and a companion command-line flag `--field-reference-escape-style` allowing a user to opt into one of two proposed escape-sequence implementations for field reference parsing: - `PERCENT`: URI-style `%`+`HH` hexadecimal encoding of UTF-8 bytes - `AMPERSAND`: HTML-style `&#`+`DD`+`;` encoding of decimal Unicode code-points The default is `NONE`, which does _not_ proccess escape sequences. With this setting a user effectively cannot reference a field whose name contains FieldReference-reserved characters. | ESCAPE STYLE | `[` | `]` | | ------------ | ------- | ------- | | `NONE` | _N/A_ | _N/A_ | | `PERCENT` | `%5B` | `%5D` | | `AMPERSAND` | `&elastic#91;` | `&elastic#93;` |
Adds a `config.field_reference.escape_style` option and a companion command-line flag `--field-reference-escape-style` allowing a user to opt into one of two proposed escape-sequence implementations for field reference parsing: - `PERCENT`: URI-style `%`+`HH` hexadecimal encoding of UTF-8 bytes - `AMPERSAND`: HTML-style `&#`+`DD`+`;` encoding of decimal Unicode code-points The default is `NONE`, which does _not_ proccess escape sequences. With this setting a user effectively cannot reference a field whose name contains FieldReference-reserved characters. | ESCAPE STYLE | `[` | `]` | | ------------ | ------- | ------- | | `NONE` | _N/A_ | _N/A_ | | `PERCENT` | `%5B` | `%5D` | | `AMPERSAND` | `&elastic#91;` | `&elastic#93;` |
* add failing tests for Event.new with field that look like field references * fix: correctly handle FieldReference-special characters in field names. Keys passed to most methods of `ConvertedMap`, based on `IdentityHashMap` depend on identity and not equivalence, and therefore rely on the keys being _interned_ strings. In order to avoid hitting the JVM's global String intern pool (which can have performance problems), operations to normalize a string to its interned counterpart have traditionally relied on the behaviour of `FieldReference#from` returning a likely-cached `FieldReference`, that had an interned `key` and an empty `path`. This is problematic on two points. First, when `ConvertedMap` was given data with keys that _were_ valid string field references representing a nested field (such as `[host][geo][location]`), the implementation of `ConvertedMap#put` effectively silently discarded the path components because it assumed them to be empty, and only the key was kept (`location`). Second, when `ConvertedMap` was given a map whose keys contained what the field reference parser considered special characters but _were NOT_ valid field references, the resulting `FieldReference.IllegalSyntaxException` caused the operation to abort. Instead of using the `FieldReference` cache, which sits on top of objects whose `key` and `path`-components are known to have been interned, we introduce an internment helper on our `ConvertedMap` that is also backed by the global string intern pool, and ensure that our field references are primed through this pool. In addition to fixing the `ConvertedMap#newFromMap` functionality, this has three net effects: - Our ConvertedMap operations still use strings from the global intern pool - We have a new, smaller cache of individual field names, improving lookup performance - Our FieldReference cache no longer is flooded with fragments and therefore is more likely to remain performant NOTE: this does NOT create isolated intern pools, as doing so would require a careful audit of the possible code-paths to `ConvertedMap#putInterned`. The new cache is limited to 10k strings, and when more are used only the FIRST 10k strings will be primed into the cache, leaving the remainder to always hit the global String intern pool. NOTE: by fixing this bug, we alow events to be created whose fields _CANNOT_ be referenced with the existing FieldReference implementation. Resolves: #13606 Resolves: #11608 * field_reference: support escape sequences Adds a `config.field_reference.escape_style` option and a companion command-line flag `--field-reference-escape-style` allowing a user to opt into one of two proposed escape-sequence implementations for field reference parsing: - `PERCENT`: URI-style `%`+`HH` hexadecimal encoding of UTF-8 bytes - `AMPERSAND`: HTML-style `&#`+`DD`+`;` encoding of decimal Unicode code-points The default is `NONE`, which does _not_ proccess escape sequences. With this setting a user effectively cannot reference a field whose name contains FieldReference-reserved characters. | ESCAPE STYLE | `[` | `]` | | ------------ | ------- | ------- | | `NONE` | _N/A_ | _N/A_ | | `PERCENT` | `%5B` | `%5D` | | `AMPERSAND` | `[` | `]` | * fixup: no need to double-escape HTML-ish escape sequences in docs * Apply suggestions from code review Co-authored-by: Karol Bucek <[email protected]> * field-reference: load escape style in runner * docs: sentences over semiciolons * field-reference: faster shortcut for PERCENT escape mode * field-reference: escape mode control downcase * field_reference: more s/experimental/technical preview/ * field_reference: still more s/experimental/technical preview/ Co-authored-by: Karol Bucek <[email protected]>
Added flush cycle at configurable interval and inactive files close cycle every 10 seconds.
Logstash is blocking so flushing many files together in the event handler thread may be a problem for heavy setups that write a lot of data to a lot of files. I didn't think this edge case is worth complicating the code so i didn't handle it.