Add ability to restart all workers after a time
When we can't be smart about when we restart (i.e. if memory measurements aren't really cutting it) we can do timed restarts. If you see your app is hitting swap after 13 hours of running, restart it at 12 hours. It's that easy.
schneems committed Jul 23, 2015
1 parent 3442c12 commit f80d62d
Showing 7 changed files with 80 additions and 8 deletions.
28 changes: 26 additions & 2 deletions README.md
@@ -9,6 +9,8 @@ If you have a memory leak in your code, finding and plugging it can be a hercule

Puma worker killer can only function if you have enabled cluster mode or hybrid mode (threads + worker cluster). If you are only using threads (and not workers) then puma worker killer cannot help keep your memory in control.

BTW restarting your processes to control memory is like putting a bandaid on a gunshot wound. Try figuring out the reason you're seeing so much memory bloat; [derailed benchmarks](https://github.com/schneems/derailed_benchmarks) can help.


## Install

@@ -45,7 +47,9 @@ PumaWorkerKiller.config do |config|
config.ram = 1024 # mb
config.frequency = 5 # seconds
config.percent_usage = 0.98
config.rolling_restart_frequency = 12 * 3600 # 12 hours in seconds
end
PumaWorkerKiller.start
```

It is important that you tell your code how much RAM is available on your system. The default is 512 mb (the same size as a Heroku 1x dyno). You can change this value like this:
@@ -66,9 +70,29 @@ You may want to tune the worker killer to run more or less often. You can adjust
PumaWorkerKiller.frequency = 20 # seconds
```

## Heroku
You may want to periodically restart all of your workers rather than simply killing your largest. To do that set:

```ruby
PumaWorkerKiller.rolling_restart_frequency = 12 * 3600 # 12 hours in seconds
```

By default PumaWorkerKiller will perform a rolling restart of all your worker processes every 6 hours. To disable, set to `false`.
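
As a rough illustration of why `false` turns the feature off: in this commit, `PumaWorkerKiller.start` only calls `enable_rolling_restart` when `rolling_restart_frequency` is truthy. The sketch below mirrors that guard in a hypothetical helper, `rolling_restart_enabled?`, which is not part of the gem.

```ruby
# Hypothetical helper (not part of the gem): mirrors the
# `if rolling_restart_frequency` guard in PumaWorkerKiller.start.
def rolling_restart_enabled?(frequency)
  frequency ? true : false
end

puts rolling_restart_enabled?(12 * 3600) # a number of seconds enables it
puts rolling_restart_enabled?(false)     # `false` disables it
puts rolling_restart_enabled?(nil)       # `nil` is falsy as well
puts rolling_restart_enabled?(0)         # 0 is truthy in Ruby, so it does NOT disable
```

Because `0` is truthy in Ruby, use `false` rather than `0` to turn rolling restarts off.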

## Only turn on Rolling Restarts

If you're running on a platform like [Heroku where it is difficult to measure RAM from inside of a container accurately](https://github.com/schneems/get_process_mem/issues/7), you may want to disable the "worker killer" functionality and only use the rolling restart. You can do that by running:

```ruby
PumaWorkerKiller.enable_rolling_restart
```

Or you can pass in the restart frequency:

```ruby
PumaWorkerKiller.enable_rolling_restart(12 * 3600) # 12 hours in seconds
```

This gem does not behave as intended on Heroku or other platforms that run workers inside a container. This is because accurate memory usage is not available from inside a container. See https://github.com/schneems/get_process_mem/issues/7.
If you enable only rolling restarts this way, make sure you do not also call `PumaWorkerKiller.start`.

## License

12 changes: 10 additions & 2 deletions lib/puma_worker_killer.rb
@@ -3,10 +3,11 @@
module PumaWorkerKiller
extend self

attr_accessor :ram, :frequency, :percent_usage
attr_accessor :ram, :frequency, :percent_usage, :rolling_restart_frequency
self.ram = 512 # mb
self.frequency = 10 # seconds
self.percent_usage = 0.99 # percent of RAM to use
self.rolling_restart_frequency = 6 * 3600

def config
yield self
@@ -18,10 +19,17 @@ def reaper(ram = self.ram, percent = self.percent_usage)

def start(frequency = self.frequency, reaper = self.reaper)
AutoReap.new(frequency, reaper).start
enable_rolling_restart(rolling_restart_frequency) if rolling_restart_frequency
end

def enable_rolling_restart(frequency = self.rolling_restart_frequency)
frequency = frequency + rand(0..10.0) # so all workers don't restart at the exact same time across multiple machines
AutoReap.new(frequency, RollingRestart.new).start # without .start the reaper is built but never runs
end
end

require 'puma_worker_killer/puma_memory'
require 'puma_worker_killer/reaper'
require 'puma_worker_killer/rolling_restart'
require 'puma_worker_killer/auto_reap'
require 'puma_worker_killer/version'
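
The jitter added in `enable_rolling_restart` can be sketched in isolation. `jittered_frequency` is a hypothetical helper name, and the 10-second upper bound is taken from the diff above; this is an illustration under those assumptions, not the gem's code.

```ruby
# Hypothetical helper: adds up to `max_jitter` seconds of random delay
# so that separate machines do not restart their workers in lockstep.
def jittered_frequency(base_seconds, max_jitter = 10.0)
  base_seconds + rand(0.0..max_jitter)
end

base = 6 * 3600 # seconds
3.times do
  puts jittered_frequency(base) # each value falls between base and base + 10
end
```

A few seconds of jitter is small next to a multi-hour interval, so it spreads restarts out without changing how often they happen in practice.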
2 changes: 1 addition & 1 deletion lib/puma_worker_killer/auto_reap.rb
@@ -18,4 +18,4 @@ def start
end

end
end
21 changes: 21 additions & 0 deletions lib/puma_worker_killer/rolling_restart.rb
@@ -0,0 +1,21 @@
module PumaWorkerKiller
class RollingRestart
def initialize(master = nil)
@cluster = PumaWorkerKiller::PumaMemory.new(master)
end

# used for tests
def get_total_memory
@cluster.get_total_memory
end

def reap(wait_till_next = 60)
return false unless @cluster.running?
@cluster.workers.sort.shuffle.each do |worker, ram|
# @max_ram is never set in this class, so it is left out of the log line
@cluster.master.log "PumaWorkerKiller: Rolling Restart. #{@cluster.workers.count} workers consuming total: #{get_total_memory} mb. Sending TERM to #{worker.inspect}"
worker.term
sleep wait_till_next
end
end
end
end
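
To see the shape of `reap` without a running Puma cluster, here is a minimal sketch using stand-in objects. `FakeWorker` and `rolling_restart` are hypothetical names used only for illustration. A real cluster's master process is assumed to boot a replacement for each terminated worker, which this sketch does not model.

```ruby
# Hypothetical stand-in for a Puma worker handle; it only records
# whether TERM has been "sent" to it.
class FakeWorker
  attr_reader :name

  def initialize(name)
    @name = name
    @termed = false
  end

  def term
    @termed = true
  end

  def term?
    @termed
  end
end

# Terminate workers one at a time, in random order, pausing between
# each so the whole pool is never down at once.
def rolling_restart(workers, wait_till_next)
  workers.shuffle.each do |worker|
    puts "Rolling Restart. Sending TERM to #{worker.name}"
    worker.term
    sleep wait_till_next
  end
end

workers = %w[worker-1 worker-2 worker-3].map { |name| FakeWorker.new(name) }
rolling_restart(workers, 0.01) # the diff above defaults to 60 seconds between workers
```

The pause between workers is what makes the restart "rolling": with a 60-second wait, the remaining workers stay available to serve requests while one is being replaced.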
2 changes: 2 additions & 0 deletions puma_worker_killer.gemspec
@@ -21,4 +21,6 @@ Gem::Specification.new do |gem|
gem.add_dependency "puma", "~> 2.7"
gem.add_dependency "get_process_mem", "~> 0.2"
gem.add_development_dependency "rake", "~> 10.1"
gem.add_development_dependency "test-unit", ">= 0"

end
2 changes: 2 additions & 0 deletions test/fixtures/app.ru
@@ -10,6 +10,8 @@ end
PumaWorkerKiller.start


puts "Frequency: #{PumaWorkerKiller.frequency}" if ENV['PUMA_FREQUENCY']

class HelloWorld
def response
[200, {}, ['Hello World']]
21 changes: 18 additions & 3 deletions test/puma_worker_killer_test.rb
@@ -5,12 +5,12 @@ class PumaWorkerKillerTest < Test::Unit::TestCase
def test_starts
app_path = fixture_path.join("app.ru")
port = 0 # http://stackoverflow.com/questions/200484/how-do-you-find-a-free-tcp-server-port-using-ruby
puma_log = Pathname.new "puma.log"
`rm #{puma_log}; touch #{puma_log}`
pid = Process.spawn("PUMA_FREQUENCY=1 bundle exec puma #{app_path} -t 1:1 -w 5 --preload --debug -p #{port} > #{puma_log}")
puma_log = Pathname.new "#{ SecureRandom.hex }-puma.log"
pid = Process.spawn("PUMA_FREQUENCY=1 bundle exec puma #{ app_path } -t 1:1 -w 5 --preload --debug -p #{ port } > #{puma_log}")
sleep 5
assert_match "PumaWorkerKiller:", puma_log.read
ensure
puma_log.delete
Process.kill('TERM', pid) if pid
end

@@ -59,4 +59,19 @@ def test_kills_memory_leak
cluster.workers.map(&:term)
end


def test_rolling_restart
ram = rand(75..100) #mb
cluster = FakeCluster.new
cluster.add_worker

worker = cluster.workers.first
reaper = PumaWorkerKiller::RollingRestart.new(cluster)
reaper.reap(1)

assert_equal 1, cluster.workers.select {|w| w.is_term? }.count
ensure
cluster.workers.map(&:term)
end
end
