tracking important errors #1
Comments
You can use the existing events in Grafana. https://github.com/raintank/grafana/blob/master/pkg/events/events.go If enabled, these events will be pushed to a rabbitmq Topic Exchange with the routingKey set to So an event defined as
will use the routingKey The events package will need some updating as it hard codes the event.Priority to 'INFO' |
so do we have consensus that this is the best approach? is it still the right approach when we consider regular grafana users who want to run grafana on their server and have a different way to track errors? they often use something like logstash, so in that case a different eventlistener would be needed that pushes to a logstash queue or something i suppose? or perhaps an event listener that writes events to a text file log? (not something for us to worry too much about now, but good to keep in the back of our mind) |
one solution is to have something external (like logstash), tail log files and push to ES |
but I guess pushing directly to rabbit -> logstash -> ES has some advantages in that you can log more rich data (as json) and have that data indexed and searchable in ES, without going through logfile -> logstash parsing -> ES |
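A rough sketch of what "richer data as json" could look like on the producing side. The `ErrorEvent` type and its field names are made up for illustration and are not taken from events.go; the point is only that a structured payload reaches ES already split into fields, so logstash has no free-text line to parse.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// ErrorEvent is a hypothetical event shape (an assumption, not the
// struct used by the Grafana events package).
type ErrorEvent struct {
	Timestamp time.Time         `json:"timestamp"`
	Priority  string            `json:"priority"`
	Source    string            `json:"source"`
	Message   string            `json:"message"`
	Fields    map[string]string `json:"fields,omitempty"`
}

// encodeEvent serializes the event to JSON so each field can be
// indexed separately downstream.
func encodeEvent(e ErrorEvent) ([]byte, error) {
	return json.Marshal(e)
}

func main() {
	e := ErrorEvent{
		Timestamp: time.Now().UTC(),
		Priority:  "ERROR",
		Source:    "alerting",
		Message:   "failed to load check definitions",
		Fields:    map[string]string{"org_id": "1"},
	}
	b, err := encodeEvent(e)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	// The resulting bytes would be the message body pushed to the queue.
	fmt.Println(string(b))
}
```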
not sure how i should go about the actual log calls. FWIW my current idea is (not tested yet)
|
bus.Publish does not use publishAfterCommit; publishAfterCommit is just a utility function in the sqlstore package. |
not sure I understand your comment "my current idea is"? just looks like code paste from the events.go |
no it's a diff that shows some additions, basically an error type and a way to override the priority |
k i'll use bus.Publish |
if you want to pipe log messages to rabbitmq you could write a rabbitmq log writer |
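A minimal sketch of the log writer idea, assuming the standard library `log` package is used for output. `Publisher`, `QueueWriter`, `memoryPublisher` and the routing key `ERROR.log` are all hypothetical names for illustration; a real version would implement `Publisher` on top of an AMQP client channel, which is left out here so the example stays self-contained.

```go
package main

import (
	"fmt"
	"log"
	"strings"
)

// Publisher is a hypothetical abstraction over a broker connection.
// A real implementation would wrap an AMQP channel.
type Publisher interface {
	Publish(routingKey string, body []byte) error
}

// QueueWriter implements io.Writer, so it can be handed to log.New
// and every log line becomes one published message.
type QueueWriter struct {
	Pub        Publisher
	RoutingKey string // assumed fixed per writer in this sketch
}

func (w *QueueWriter) Write(p []byte) (int, error) {
	// log.Logger appends a newline; strip it from the message body.
	body := []byte(strings.TrimRight(string(p), "\n"))
	if err := w.Pub.Publish(w.RoutingKey, body); err != nil {
		return 0, err
	}
	return len(p), nil
}

// memoryPublisher stands in for a real broker and just records messages.
type memoryPublisher struct {
	messages []string
}

func (m *memoryPublisher) Publish(routingKey string, body []byte) error {
	m.messages = append(m.messages, routingKey+" "+string(body))
	return nil
}

func main() {
	pub := &memoryPublisher{}
	logger := log.New(&QueueWriter{Pub: pub, RoutingKey: "ERROR.log"}, "", 0)
	logger.Println("alerting: failed to load check definitions")
	fmt.Println(pub.messages[0])
}
```

Because the writer only depends on the small `Publisher` interface, the same shape would work for a logstash queue or a plain file, which is the case raised earlier for regular grafana users.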
hm not sure if i'll get to that before end of next week, but i guess that could fairly easily be done by one of you guys if needed. so i'll focus more on the specifics of alerting itself for now. |
Can the messages generated by Litmus go into the same storage backend as the messages that are generated from the Collectors? (so they can be viewed in the events panel from elasticsearch)? |
i don't see why they couldn't, but one thing to keep in mind is decoupling the monitoring system from the system being monitored. if prod ES goes down for whatever reason then we'll want to look at events in a monitoring system which probably should not be the same ES instance. |
@woodsaj to make a call on this potentially loggly, potentially ELK... but all in agreement that centralized logging for * is required |
Given that we now have ELK set up, I think the best approach is to just log warnings and errors. We just need to ensure that the log messages contain all relevant data. |
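One way to keep "all relevant data" in each message is to attach context as key=value pairs in a stable order, so logstash can split them reliably. This is only a sketch; `formatLog` and the example keys are hypothetical and not an existing helper in the codebase.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// formatLog renders a level, a message and its context as a single line.
// Keys are sorted so the same context always produces the same output.
func formatLog(level, msg string, ctx map[string]string) string {
	keys := make([]string, 0, len(ctx))
	for k := range ctx {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	fmt.Fprintf(&b, "[%s] %s", level, msg)
	for _, k := range keys {
		// %q quotes the value so spaces inside it do not break parsing.
		fmt.Fprintf(&b, " %s=%q", k, ctx[k])
	}
	return b.String()
}

func main() {
	line := formatLog("ERROR", "check failed", map[string]string{
		"org":   "1",
		"check": "42",
	})
	fmt.Println(line)
}
```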
yep. do we need to do a big code review to make sure we have the right log calls in all places or are we pretty confident we're in good shape? I'll review the alerting pkg to be sure of that, at least. |
We can do some processing and mutating of the logs as they come into logstash too. I'll have that set up soon. |
Tuesday May 12, 2015 at 22:17 GMT
Originally opened as raintank/grafana#91
do we have any convention on how we will keep tabs on critical/important error events in grafana?
maybe log to a file and then use heka or logstash to shove them into ES?