-
Notifications
You must be signed in to change notification settings - Fork 120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Logging Suddenly Stops on Production #17
Comments
FYI Sending my logs to redis. (instead of UDP directly to logstash) Also making my puma cluster reconnect on worker launch:
So far, so good. |
Nope... still randomly stops logging :( Won't even send a message to redis. |
So when logstash-logger stops logging my rails server output, I decided to make a new controller#action that uses logstash directly to log a message. Such as: def check_if_logstash_logger_still_alive However even that does not work. Needless to say, no other logging works. At this point, we're no longer using lograge. It's simply that logstash-logger will not send redis/udp messages outbound after a random amount of time. |
For Redis: Currently trying to ask @io.client.reconnect if @io.connected? is not true. ... Nope that doesn't matter. Reconnecting does not help. |
If you could answer the following questions, that would help narrow down the possible problems.
|
Also, can you modify your controller to print out |
MRI Ruby 2.1.2
Yes: https://github.com/roidrage/lograge
UDP and Redis so far are all I tested. Both have the issue.
I turned off logstash logger & lograge at the same time. Then put in a instrument listener on "process_action.action_controller" which directly logs to logstash. Works like a charm (minus all the console debug messages...)
Totally can do after the weekend is over :) From what you're saying, the idea of the logger.level getting changed unexpectedly sounds possible. I've grepped my code, and it's nothing I am doing. However... there is admittedly more than just my code. There are lots of gems too. In a way, for me, logging the "process_action.action_controller" is all I really "need". The remaining console messages are helpful, but not my main goal. However, the console messages become much more helpful when I really need to debug something, so there is value to getting this whole thing working. More news next week! Thanks for the insight! |
@pctj101 Did you find anything out yet? |
@dwbutler It looks like I'm hitting the same issue. I'm using MRI 2.2.0, Unicorn 4.5.0, lograge 0.3.1, and logstash-logger 0.8.0. Data is sent to logstash via UDP and also logged to a file:
It looks like you're right about the log level being changed. Really good catch! Printing it reveals that it is raised to 2 for more and more Unicorn workers over time. Restarting the server obviously fixes the problem temporarily. So now I'll try to hunt down who is changing the log level. In case you're interested, here is our "impressive" Gemfile.lock: https://gist.github.com/chopmo/88f10097f94db452089f |
Well that didn't take as long as expected. I'm 99% sure that we found the problem. We're using the Savon gem, and when calling We accidentally passed Simple as that :) |
Ouch... It sounds like Savon is not being a good logging citizen. I'll add a note to the Troubleshooting section of the readme. Thanks for the tip! |
I added a general note about gems changing the log level to the Troubleshooting section and linked to this issue. I think it's been well-established that this issue is caused by other gems. I can't think of any good way to protect one's logger level from getting stomped. So I'm going to close this issue. If anyone finds another gem causing a problem, or has any ideas about how to protect the log level from getting changed, feel free to add more comments to this issue. |
i'm also suffering this. i even followed instructions in a book. the buffer by default is supposed to collect up to 50 messages and flush them every 5 seconds. but it doesn't for my case. anything i'm missing? Rails.application.configure do
config.lograge.enabled = true
config.lograge.custom_options = lambda do |event|
exceptions = %w(controller action format id)
{
params: event.payload[:params].except(*exceptions).to_json,
type: :rails,
environment: Rails.env
}
end
config.lograge.formatter = Lograge::Formatters::Logstash.new
config.lograge.logger = LogStashLogger.new uri: 'udp://logstash-server:1234'
end |
@glaszig I noticed that you are using UDP output. Based on the symptoms you're describing, I believe you are observing normal Ruby behavior. When Ruby writes to an IO device such as a file or socket, it doesn't write immediately. It keeps it's own internal buffer and flushes it "periodically." If you write infrequently enough and don't have enough memory pressure, "periodically" could mean "never." Ruby will, of course, flush all buffers when you exit the program (which is what happens when you restart the server). The only way to control this behavior is to set |
sorry, i didn't know enough about these kind of ruby internals but i was imagining something like that. there is indeed very low traffic on this system. thanks for getting back. |
if anybody else comes across this and is interested in ruby's io buffering, watch this little screencast by @jstorimer. |
Hi, I believe I'm hitting this issue now myself (boo!). Everything is fine on staging, but as soon as I get to production, nada. Are there any tips/tools that would allow me to search through all my gem dependencies to find the culprit or is it just a case of lots of searching? |
So my production server is a puma cluster and it logs perfectly... until recently.
Currently, logging is successful for a number of hours, then suddenly stops.
I traced one cause back to Logstash::Event where Event#to_json was throwing errors on encoding to JSON, and usually that was the last log message. So I upgraded Logstash::Event to the latest master branch, fixed a couple logstash-event bugs, and the system is much more stable now.
[ Ref: https://github.com/pctj101/logstash/tree/event_timestamp ]
But... after a few more hours, logging still stops.
So I don't think it's a Logstash::Event#to_json problem any more.
Granted I use lograge to reduce my logs and format into json, but I don't see any problems there either.
And too be fair, I don't see any problems in logstash-logger either.
So it's not clear what to debug.
So while I'm looking into this I thought I'd ask a few simple questions:
Is there anything in general where a bug could cause some exception to be raised, and permanently disable logstash-logger or Rails.logger until the next server restart?
Any similar problems seen in the past?
Thanks!So my production server is a puma cluster and it logs perfectly... until recently.
Currently, logging is successful for a number of hours, then suddenly stops.
I traced one cause back to Logstash::Event where Event#to_json was throwing errors on encoding to JSON, and usually that was the last log message. So I upgraded Logstash::Event to the latest master branch, fixed a couple logstash-event bugs, and the system is much more stable now.
[ Ref: https://github.com/pctj101/logstash/tree/event_timestamp ]
But... after a few more hours, logging still stops.
So I don't think it's a Logstash::Event#to_json problem any more.
Granted I use lograge to reduce my logs and format into json, but I don't see any problems there either.
And too be fair, I don't see any problems in logstash-logger either.
So it's not clear what to debug.
So while I'm looking into this I thought I'd ask a few simple questions:
Is there anything in general where a bug could cause some exception to be raised, and permanently disable logstash-logger or Rails.logger until the next server restart?
Any similar problems seen in the past?
Thanks!
The text was updated successfully, but these errors were encountered: