Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[POC][WIP] ActiveSupport::Notifications approach to determining slow requests #17958

Conversation

NickLaMuro
Copy link
Member

This is a different approach to #17842

Description

This peers into the private APIs of ActiveSupport and ActionController::LogSubscriber to check the status of the queued up events for 'process_action.action_controller', and see if any are larger than 10 seconds (will probably update this to 1 minute). If they are, it will report them to the Rails.logger.

Pros

  • Zero object allocations and stacklevel weight added to requests
  • Almost zero object allocations in the watching thread

Cons

  • Very "Rails internal API" focused (aka "confusing as F...")
  • Touching internal thread variables that are used elsewhere (more unknowns)

Usage

This can be instrumented in many ways, but the easiest is via a config/initializer for local development use:

 # config/initializers/request_watcher.local.rb
if ENV['WATCH_FOR_LONG_REQUESTS']
 require 'workers/rails_request_monitor'

  Thread.new do
   loop do
     sleep 10
     RailsRequestMonitor.log_long_running_requests
   end
 end.abort_on_exception = true
end

And running a development server by running:

$ WATCH_FOR_LONG_REQUESTS=1 bin/rails s

This will check every 10 seconds, and when long requests are found, they will be output to the Rails.log on level WARN.

This can also be instrumented in MiqWebServerRunnerMixin with relative ease:

 # app/models/mixins/miq_web_server_runner_mixin.rb

+
+ def do_heartbeat_work
+   RailsRequestMonitor.log_long_running_requests
+ end

Or anywhere that it makes sense (signal handlers, etc.)

Links

Steps for Testing/QA

The steps in the BZ from #17842 is a good start, but this is what I did:

  1. Get the code from this branch: git apply <(curl -L https://github.com/ManageIQ/manageiq/pull/17958.path)

  2. Update the API (bundle open manageiq-api) to include this code in app/controllers/api/ping_controller.rb:

module Api
  class PingController < ActionController::API
    def index
      sleep 60 if rand(5) == 0
      render :plain => 'pong'
    end
  end
end
  1. Add the initializer from above to config/initializers/.

  2. Run a rails server: $ WATCH_FOR_LONG_REQUESTS=1 bin/rails s

  3. Hit localhost:3000/api/ping until you get a few requests that take a while a bunch of times and watch the logs

  • continuing to refresh regardless if you get a response is probably the best technique here
  1. You should see [WARN] messages eventually for the requests that don't complete immediately.

This peers into the private APIs of `ActiveSupport` and
`ActionController::LogSubscriber` to check the status of the queued up
events for 'process_action.action_controller', and see if any are larger
than 10 seconds.  If they are, it will report them to the Rails.logger.

Usage
-----

This can be instrumented in many ways, but the easiest is via
config/initializer:

    # config/initializers/request_watcher.local.rb
    if ENV['WATCH_FOR_LONG_REQUESTS']
      require 'workers/rails_request_monitor'

      Thread.new do
        loop do
          sleep 10
          RailsRequestMonitor.log_long_running_requests
        end
      end.abort_on_exception = true
    end

And running a development server by running:

    $ WATCH_FOR_LONG_REQUESTS=1 bin/rails s

This will check every 10 seconds, and when long requests are found, they
will be output to the Rails.log on level WARN.

This can also be instrumented in MiqWebServerRunnerMixin with relative
ease:

    # app/models/mixins/miq_web_server_runner_mixin.rb

    +
    + def do_heartbeat_work
    +   RailsRequestMonitor.log_long_running_requests
    + end

Or anywhere that it makes sense (signal handlers, etc.)

TODO
----

- Update to 1.minute instead of 10.seconds (for quicker testing)
- Tests
- Instrument in `MiqWebServerRunnerMixin` (new commit?)
- Defers `now` and `too_slow` instantiations to happen after the first
  loop happens
- Uses integers for seconds instead of `ActiveSupport`'s `.seconds`
  (this actually ends up creating a ton of objects doing this
- Memoizes the seconds taken in a constant.

This leaves a few object allocations for each time
`.log_long_running_requests` is called without any long requests
currently active.
@NickLaMuro
Copy link
Member Author

To @jrafanie 's comment in my commit (the monster)... he can tell you that the answer is a bit of a 📖...

@miq-bot miq-bot changed the title [POC][WIP] ActiveSupport::Notifications approach to determining slow requests [WIP] [POC][WIP] ActiveSupport::Notifications approach to determining slow requests Sep 7, 2018
@miq-bot miq-bot added the wip label Sep 7, 2018
@miq-bot
Copy link
Member

miq-bot commented Sep 7, 2018

Checked commits NickLaMuro/manageiq@fb7f4bd~...c6c55d1 with ruby 2.3.3, rubocop 0.52.1, haml-lint 0.20.0, and yamllint 1.10.0
1 file checked, 9 offenses detected

lib/workers/rails_request_monitor.rb

@NickLaMuro NickLaMuro changed the title [WIP] [POC][WIP] ActiveSupport::Notifications approach to determining slow requests [POC][WIP] ActiveSupport::Notifications approach to determining slow requests Sep 10, 2018
@NickLaMuro
Copy link
Member Author

@jrafanie going to close this for now since it seems like you went ahead with #17842

@NickLaMuro NickLaMuro closed this Sep 27, 2018
@jrafanie
Copy link
Member

I prefer the simplicity of the initializer in this PR. I just wish some of these API were public. I don't feel confident in that these will likely change underneath us minor or major releases of rails. Perhaps, this can be resurrected if we know APIs we can use that are stable and easy to test against for each rails version. For now though, it feels like the thread watcher thing, #17842, is relying on more stable things.

Honestly, though... this PR, if we can write tests against various rails versions, would be easier to hotfix, so the jury isn't out on this. What's your opinion on this @NickLaMuro ?

@NickLaMuro
Copy link
Member Author

@jrafanie sorry, missed this message with the morning flood of notifications to I sifted through:

Honestly, though... this PR, if we can write tests against various rails versions, would be easier to hotfix, so the jury isn't out on this. What's your opinion on this @NickLaMuro ?

I would almost say maybe it makes sense to make this a Rails feature if we go that route, and either:

a) update some of these methods to have public interfaces
b) include this interface in Rails core so that this can be accessed and tested with Rails releases.

Again, I think what we both do like about this approach is we aren't adding anything on to what is already there, and just including a light monitor to watch that already existing data. What really sucks about it though is we are having to dive deep into some internal API's to get this data, and that isn't good long term.

I will maybe look into seeing what implementing something in Rails core would look like when I have some spare cycles.

@jrafanie
Copy link
Member

Yeah, that summarizes my feelings too. My middleware approach duplicates data that's already there but it controls the CRUD of this data so it's safer to use. If rails provides mechanism to introspect what's already there, we'd get the safety of the other PR without the extra overhead/duplication.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants