Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ElasticsearchSink 2.0 #394

Closed
5 of 7 tasks
Mpdreamz opened this issue Feb 24, 2015 · 12 comments
Closed
5 of 7 tasks

ElasticsearchSink 2.0 #394

Mpdreamz opened this issue Feb 24, 2015 · 12 comments

Comments

@Mpdreamz
Copy link
Member

Today I met up in person with @mivano, who initially wrote the sink, and @zidad. Together we hashed out some steps to get to a more stable/uniform elasticsearch sink.

  • The sink will take full ownership of serializing exceptions, work on this has already been started so that no matter if (or what) serializer you have configured an exception is always outputted in a uniform manner. The serializer is still handy to serialize context information under fields
  • The advertised elasticsearch template currently is a hotchpotch of an older logstash template and some recommendations on how to configure elasticsearch online from nearly 3 years ago. Much of it no longer applies, we need to resync this with current best practices.
  • The sink will allow you to register this template when configuring. Elasticsearch will need to be running.
  • Better innerexception handling right now innerexceptions are nested N levels deep making it very hard to write uniform queries/dashboards over all your exceptions. The toplevel exception will still reside under exception but also duplicated under exceptions which is a flattened view of the exception and all its inner exceptions up to a configurable depth (default 20?). This allows you to aggregate on exceptions.ClassName uniformly.
"exception" : { },
"exceptions" : [{ }]
  • Remove the old ElasticSearch() obsolete extension methods. Allow a complex object to be initialized from appsettings as well. @nblumhardt I have some idea's on supporting any parameter of type ISinkOptions to be new()'ed and populated by serilog rather then relying on named parameters, is this something you've explored already?
  • Handle bulk insert failures by logging the serilog's selflog, how granular should these events be?
  • Edge case, handle positional formatting sneaking into fields with a fixed string mapping when you proxy 3rd party logging to serilog.

cc @tynor88 @konste @mookid8000 @gmarz

Some of these are breaking and thus the 2.0, something I initially tried hard to avoid but in order to move forward we'll need to.

this ties into #391 and #384

@zidad
Copy link

zidad commented Feb 24, 2015

Where is the 'like' button on github when you need it :)

@Mpdreamz
Copy link
Member Author

A ponder that surfaced while implementing:

I think its best to omit the exception property altogether, exceptions will be a nested object mapping so querying for just the surfaced exception can always be done using exceptions.depth: 0.

@Mpdreamz
Copy link
Member Author

Here's an example of the new uniform output to elasticsearch:

https://gist.github.com/Mpdreamz/9fe1e18aca522f50b702

The ElasticsearchJsonFormatter creates a structured ExceptionMethod object, I'm wondering if its useful to the do same for the stacktraces much like raygun does: https://github.com/MindscapeHQ/raygun4net/blob/96a3a56ef153ef0ca6af40eb3d8def6fc1df8f05/Mindscape.Raygun4Net4/Builders/RaygunErrorMessageBuilder.cs#L57

Thoughts appreciated!

@konste
Copy link

konste commented Feb 24, 2015

I have shared some modification I did to the current sink to make it more usable here: http://1drv.ms/1AVHkVN Let me highlight a few points.

  1. Ability to specify custom JSON Formatter (in my case it is ElasticsearchJsonFormatterMod) derived from ElasticsearchJsonFormatter is a must!
  2. Another thing I consider important is how I override WriteJsonProperty:

protected override void WriteJsonProperty(string propertyName, object propertyValue, ref string precedingDelimiter, TextWriter output)
{
propertyName = ApplyTypeSuffixToPropertyName(propertyName, propertyValue);
base.WriteJsonProperty(propertyName, propertyValue, ref precedingDelimiter, output);
}

Methos ApplyTypeSuffixToPropertyName is also shared. The big point is Elasticsearch chokes if some property first came in as integer (for instance) and later some value for that property cannot be interpreted as integer. This causes ES to throw out the whole log record, which for Serilog purposes is catastrophic. Applying type dependent suffixes to properties names I make sure that if some property is recognized as particular type it stays “compatible”.

  1. (this part is minor – just FYI) In my custom formatter I suppress Exceptions and Renderings like this:
    protected override void WriteException(Exception exception, ref string delim, TextWriter output) { }
    protected override void WriteRenderings(IGrouping<string, PropertyToken>[] tokensWithFormat, IReadOnlyDictionary<string, LogEventPropertyValue> properties, TextWriter output) { }

Regarding exceptions – I don’t want exception object to be treated any special. It is just the regular object we log with destructuring. Look at SerilogExceptionEnricher.cs I have shared. Exception object will be in the output, I just don’t want it to be rendered any different than others, so WriteException is suppressed.
Regarding renderings… frankly I could not even understand what it is and why would I need it, so suppressed it too.

  1. Another important part is handling the output of client.Bulk operation in ElasticsearchSinkMod.cs I’m not sure how exactly it should be propagated to the caller (probably new ElasticsearchSink should allow derivation), but it is important to check if all log input was successfully accepted by Elasticsearch.

Thank you for consideration!

Konstantin

@mivano
Copy link
Member

mivano commented Feb 25, 2015

A ponder that surfaced while implementing:

I think its best to omit the exception property altogether, exceptions will be a nested object mapping so querying for just the surfaced exception can always be done using exceptions.depth: 0.

It will prevent duplicate data and you can indeed easily find the root exception using the depth 0 filtering. Our original idea was to still have the old behaviour with the Exception property to be there, but since it is a version 2 anyway it makes sense to do it right and not store duplicate data.

@mivano
Copy link
Member

mivano commented Feb 25, 2015

Another important part is handling the output of client.Bulk operation in ElasticsearchSinkMod.cs I’m not sure how exactly it should be propagated to the caller (probably new ElasticsearchSink should allow derivation), but it is important to check if all log input was successfully accepted by Elasticsearch.

@konste How much details do you want here? We discussed to log it at least to the selflog of serilog. But the trouble is that you need to explicitly monitor this selflog. Does it make sense to have an event that gets the rejected/faulted events so you can do something else with it? Because I think in your implementation you send it to another logging system. And for that you really need to override the sinks. If there is a way to hook into the error handling of the ES sink we might be able to overcome this.

@mivano
Copy link
Member

mivano commented Feb 25, 2015

Methos ApplyTypeSuffixToPropertyName is also shared. The big point is Elasticsearch chokes if some property first came in as integer (for instance) and later some value for that property cannot be interpreted as integer. This causes ES to throw out the whole log record, which for Serilog purposes is catastrophic. Applying type dependent suffixes to properties names I make sure that if some property is recognized as particular type it stays “compatible”.

Yeah, we noticed the same. Part of it is also that the mappings are not that nice like @Mpdreamz mentioned. We discussed if we can have a better one that we optionally put to ES at startup. And in there we correct at least the indexed properties (like {0} {1} etc) to be a string. Not sure if that is enough (I think in my situation it is), but it is on our agenda to see how to overcome this.

@konste
Copy link

konste commented Feb 25, 2015

@mivano: Using SelfLog for ES failures seems to be the only option and we should use it. I also shared SerilogSelfLogShipper.cs (at the same location). When we initialize primary logging system we do

// Provision for self logging for the logging subsystem.
SelfLog.Out = SerilogSelfLogShipper.Instance;

and with that we get selflog shipped to Seq for analysis. Proved to be way more effective that hunting the exceptions directly in Elasticsearch own logs.

@nblumhardt
Copy link
Member

  • Case all names consistently Elasticsearch (including the namespace and assembly)

@nblumhardt
Copy link
Member

The code is now at https://github.com/serilog/serilog-sinks-elasticsearch

Just adding this note since no CI is set up yet, and hence there's no "2.0" package published. It would be great for eveyrone on this thread to coordinate around making any changes that should go in before it ships! :)

@nblumhardt
Copy link
Member

All, is this one closable now? Cheers!

@mivano
Copy link
Member

mivano commented Apr 9, 2015

I would say yes. If there are any outstanding issues, lets create new issues to track them separately.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants