Fork workers no more 🔥 🔥
Remove the conditional that allowed spawn instead of fork, and default to spawn.
Remove the code that supported fork.

Background on WHY:

Fork, while creating processes much faster for us, relies on programs
being copy-on-write (CoW) friendly to take advantage of shared memory.

To do this, your program needs to preload as much of its code ahead of
time as possible and avoid writing to memory locations on OS pages
containing objects shared with other processes.

Ruby's garbage collector is naive in how it allocates objects on the
Ruby heap pages: both old and new objects can be located on the same
OS page.  Because most objects die young, it's wise to keep old objects on
separate OS pages from new objects.  If a parent or child process
allocates or frees memory on an OS page containing shared objects,
the whole OS page is copied, including the shared objects.  This happens
frequently if your heap is fragmented with old and new objects.
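
To make the page-copying argument concrete, here is a toy model (not Ruby's
actual allocator). It assumes a page stays shared only if every slot on it
holds an old, never-written object. The helper name shared_page_fraction and
the 4-slot page size are hypothetical, chosen only for illustration.

```ruby
# Toy model: a page stays shared after fork only if nothing on it is written.
# :old marks a long-lived slot that is never written after fork.
# :new marks a slot that is allocated or freed after fork (a write).
def shared_page_fraction(slots, slots_per_page)
  pages = slots.each_slice(slots_per_page).to_a
  shared = pages.count { |page| page.all? { |slot| slot == :old } }
  shared.to_f / pages.size
end

slots_per_page = 4 # hypothetical; real OS pages hold far more object slots

# Fragmented heap: every page mixes old and new slots.
fragmented = [:old, :old, :old, :new] * 4
# Segregated heap: same slot counts, but old and new slots sit on separate pages.
segregated = ([:old] * 12) + ([:new] * 4)

puts shared_page_fraction(fragmented, slots_per_page) # prints 0.0
puts shared_page_fraction(segregated, slots_per_page) # prints 0.75
```

Both heaps hold the same 12 old and 4 new slots; only the layout differs, and
in this model the layout alone decides how much memory stays shared.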

In addition, Ruby, as of 2.2, has the notion of young and old objects,
and the garbage collector can be more efficient by only traversing young
objects on a minor GC, since most objects die young.
Unfortunately, the age used to track this is stored directly in the
object header.  This means that after surviving 3 GCs, an object is 'touched'
to mark its age as 'old'.  This causes any shared memory on the same
OS page to be copied along with this object. This was mitigated by
https://github.com/ko1/nakayoshi_fork, but only for objects created before
you fork.  Any new objects created after fork could be allocated on an OS
page co-resident with shared objects, causing the whole OS page to be copied.
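
The mitigation described above can be sketched as follows. This is a rough
illustration of the idea (run the GC a few times before forking so survivors
are already marked old in the parent), not the nakayoshi_fork gem's code.
The method names and the default of 4 GC runs are assumptions.

```ruby
# Run several full GCs so that surviving objects are promoted to 'old'
# in the parent, before any pages are shared with a child.
# Returns the number of old objects reported by the GC afterwards.
def age_survivors(gc_runs = 4) # 4 is an assumed margin over the 3 survivals
  gc_runs.times { GC.start }
  GC.stat(:old_objects)
end

# Hypothetical wrapper: age the heap, then fork.
def cow_friendlier_fork(gc_runs: 4, &block)
  age_survivors(gc_runs)
  Process.fork(&block)
end

# fork is not available everywhere, so guard the demo.
if Process.respond_to?(:fork)
  pid = cow_friendlier_fork { exit!(0) }
  Process.wait(pid)
end
```

As the message notes, this only helps objects that exist before the fork;
anything allocated afterwards can still land on a shared page.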

Ultimately, the shared memory for ManageIQ processes was often less than
15% within minutes of the processes starting and this number continued
to drop as more and more shared memory locations were copied on a
"neighboring" write.
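
One rough way to estimate such a percentage on Linux is to total the Shared_*
and Private_* fields of /proc/<pid>/smaps. The sketch below is an assumption
about how this could be measured, not the method used to obtain the figure
above; the helper name and the sample text are made up for illustration.

```ruby
# Compute the percentage of resident memory that is shared, given the text
# of a Linux /proc/<pid>/smaps file. Sizes in that file are reported in kB.
def shared_memory_percent(smaps_text)
  totals = Hash.new(0)
  smaps_text.each_line do |line|
    match = line.match(/\A(Shared_Clean|Shared_Dirty|Private_Clean|Private_Dirty):\s+(\d+) kB/)
    totals[match[1]] += match[2].to_i if match
  end
  shared = totals["Shared_Clean"] + totals["Shared_Dirty"]
  total = shared + totals["Private_Clean"] + totals["Private_Dirty"]
  return 0.0 if total.zero?
  100.0 * shared / total
end

# Made-up sample for a single mapping; a real file has one such group per mapping.
sample = <<~SMAPS
  Rss:                 100 kB
  Shared_Clean:         10 kB
  Shared_Dirty:          5 kB
  Private_Clean:        25 kB
  Private_Dirty:        60 kB
SMAPS

puts shared_memory_percent(sample) # prints 15.0
```

On a live Linux system the input could come from
File.read("/proc/#{Process.pid}/smaps"). Shared pages also include
file-backed mappings such as shared libraries, so treat the result as an
estimate.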

This means that fork was only buying us faster process creation.

As we move towards running ManageIQ in containers, we also need to move
towards running processes in isolation via command lines.
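
A minimal, self-contained sketch of starting a process from a command line
with Kernel.spawn follows. The helper name, the temporary log file, and the
one-line child script are hypothetical stand-ins; the diff in this commit
shows the actual worker code.

```ruby
require "rbconfig"
require "tmpdir"

# Start a child from a command line instead of forking the current process.
# stdout and stderr of the child are appended to log_path.
# Returns the thread created by Process.detach, which reaps the child;
# the child's pid is available as watcher.pid.
def start_via_spawn(argv, log_path)
  pid = Kernel.spawn(*argv, [:out, :err] => [log_path, "a"])
  Process.detach(pid)
end

log_path = File.join(Dir.mktmpdir, "worker.log")
watcher = start_via_spawn([RbConfig.ruby, "-e", "puts 'worker started'"], log_path)

watcher.value # block until the child exits (only needed for this demo)
puts File.read(log_path)
```

Because the child is a fresh interpreter, it shares nothing with the parent
and loads only what its own command line requires.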

We have begun separating out dependencies and selectively loading them
via bundler groups, which drops memory usage for non-fork processes but
makes fork less efficient since less and less code is preloaded and
shared.

Finally, fork is not implemented on every platform (Windows, for example)
or in every Ruby implementation (JRuby, for example), so it was preventing
usage of ManageIQ in those ecosystems.

With all this said, it's time to say goodbye to fork.
jrafanie committed Oct 5, 2017
1 parent c1370cf commit c602d0c
Showing 3 changed files with 2 additions and 46 deletions.
1 change: 0 additions & 1 deletion Gemfile
@@ -41,7 +41,6 @@ gem "manageiq-api-client", "~>0.1.0", :require => false
 gem "manageiq-network_discovery", "~>0.1.2", :require => false
 gem "mime-types", "~>2.6.1", :path => "mime-types-redirector"
 gem "more_core_extensions", "~>3.3"
-gem "nakayoshi_fork", "~>0.0.3" # provides a more CoW friendly fork (GC a few times before fork)
 gem "net-ldap", "~>0.14.0", :require => false
 gem "net-ping", "~>1.7.4", :require => false
 gem "net-ssh", "=3.2.0", :require => false
45 changes: 1 addition & 44 deletions app/models/miq_worker.rb
@@ -304,29 +304,6 @@ def send_message_to_worker_monitor(message, *args)
     )
   end

-  def self.before_fork
-    preload_for_worker_role if respond_to?(:preload_for_worker_role)
-  end
-
-  def self.after_fork
-    close_pg_sockets_inherited_from_parent
-    DRb.stop_service
-    renice(Process.pid)
-  end
-
-  # When we fork, the children inherits the parent's file descriptors
-  # so we need to close any inherited raw pg sockets in the child.
-  def self.close_pg_sockets_inherited_from_parent
-    owner_to_pool = ActiveRecord::Base.connection_handler.instance_variable_get(:@owner_to_pool)
-    owner_to_pool[Process.ppid].values.compact.each do |pool|
-      pool.connections.each do |conn|
-        socket = conn.raw_connection.socket
-        _log.info("Closing socket: #{socket}")
-        IO.for_fd(socket).close
-      end
-    end
-  end
-
   # Overriding queue_name as now some queue names can be
   # arrays of names for some workers not just a singular name.
   # We use JSON.parse as the array of names is stored as a string.
@@ -339,26 +316,6 @@ def queue_name
     end
   end

-  def start_runner
-    if ENV['MIQ_SPAWN_WORKERS'] || !Process.respond_to?(:fork)
-      start_runner_via_spawn
-    else
-      start_runner_via_fork
-    end
-  end
-
-  def start_runner_via_fork
-    self.class.before_fork
-    pid = fork(:cow_friendly => true) do
-      self.class.after_fork
-      self.class::Runner.start_worker(worker_options)
-      exit!
-    end
-
-    Process.detach(pid)
-    pid
-  end
-
   def self.build_command_line(guid)
     command_line = "#{Gem.ruby} #{runner_script} --heartbeat --guid=#{guid} #{name}"
     ENV['APPLIANCE'] ? "nice #{nice_increment} #{command_line}" : command_line
@@ -370,7 +327,7 @@ def self.runner_script
     script
   end

-  def start_runner_via_spawn
+  def start_runner
     pid = Kernel.spawn(self.class.build_command_line(guid), [:out, :err] => [Rails.root.join("log", "evm.log"), "a"])
     Process.detach(pid)
     pid
2 changes: 1 addition & 1 deletion lib/workers/bin/run_single_worker.rb
@@ -55,7 +55,7 @@
 require File.expand_path("../../../config/environment", __dir__)

 worker_class = worker_class.constantize
-worker_class.before_fork
+worker_class.preload_for_worker_role if worker_class.respond_to?(:preload_for_worker_role)
 unless options[:dry_run]
   create_options = {:pid => Process.pid}
   runner_options = {}
