Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Refactor Metric::Processing.process_derived_columns to use less memory #15757

Closed
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Refactore / "Simplfy" process_derived_columns
The complexity of the process_derived_columns method was quite high, and
a lot of the logic for it was simply calling out specific routines
needed to generate the DERIVED_COLS data, and using "types" to do it.

At it's heart, this simply changes DERIVED_COLS to be a hash of the name
of the derived column, and the method necessary to generate it, and then
the code in process_derived_columns is replaced with a simple send to
that method, passing the reference to col, state, attrs, and result to
the method for it to be calculated.

All of the methods for DERIVED_COLS are then private class methods at
the bottom of the file, and a decent amount of them are meta-programmed
in (since many are similar).  The way these are generated are
specifically trying to avoid generating new strings to determine what
methods to call, so `module_eval` statements in this case are preferred
over using define_method.

This then allows:

- The "TYPE_" constants to be removed
- The DERIVED_COLS definition to be simplified
- The process_derived_columns columns method to be greatly simplified
- No real changes to how the data is derived**

The cost of this  is a bit of complexity due to the meta-programmed
methods, some more additions than deletions (sorry LJ...), and a few
more method definitions existing.  The extra method definitions is
probably negligible from a performance front, but the complexity might
be up for debate.

** This a few if statements in generate_derived_memory_used and
generate_cpu_usagemhz_rate_average were reduced/simplified to either
avoid extra calls to helper methods, or remove complex unless
statements, but should be functionally equivalent.
  • Loading branch information
NickLaMuro committed Aug 10, 2017

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
commit 343199a398f1c30369fbf1307c926745caf826d5
207 changes: 121 additions & 86 deletions app/models/metric/processing.rb
Original file line number Diff line number Diff line change
@@ -1,36 +1,24 @@
module Metric::Processing
TYPE_AVAILABLE = "available".freeze
TYPE_ALLOCATED = "allocated".freeze
TYPE_USED = "used".freeze
TYPE_RATE = "rate".freeze
TYPE_RESERVED = "reserved".freeze
TYPE_COUNT = "count".freeze
TYPE_NUMVCPUS = "numvcpus".freeze
TYPE_SOCKETS = "sockets".freeze

DERIVED_COLS = {
:derived_cpu_available => [nil, "cpu", TYPE_AVAILABLE],
:derived_cpu_reserved => [:reserve_cpu, "cpu", TYPE_RESERVED],
:derived_host_count_off => [:host_count_off, "host", TYPE_COUNT],
:derived_host_count_on => [:host_count_on, "host", TYPE_COUNT],
:derived_host_count_total => [:host_count_total, "host", TYPE_COUNT],
:derived_memory_available => [nil, "memory", TYPE_AVAILABLE],
:derived_memory_reserved => [:reserve_mem, "memory", TYPE_RESERVED],
:derived_memory_used => [nil, "memory", TYPE_USED],
:derived_host_sockets => [nil, "host", TYPE_SOCKETS],
:derived_vm_allocated_disk_storage => [:vm_allocated_disk_storage, "vm", TYPE_ALLOCATED],
:derived_vm_count_off => [:vm_count_off, "vm", TYPE_COUNT],
:derived_vm_count_on => [:vm_count_on, "vm", TYPE_COUNT],
:derived_vm_count_total => [:vm_count_total, "vm", TYPE_COUNT],
:derived_cpu_available => :generate_derived_cpu_available,
:derived_cpu_reserved => :generate_derived_cpu_reserved,
:derived_host_count_off => :generate_derived_host_count_off,
:derived_host_count_on => :generate_derived_host_count_on,
:derived_host_count_total => :generate_derived_host_count_total,
:derived_memory_available => :generate_derived_memory_available,
:derived_memory_reserved => :generate_derived_memory_reserved,
:derived_memory_used => :generate_derived_memory_used,
:derived_host_sockets => :generate_derived_host_sockets,
:derived_vm_allocated_disk_storage => :generate_derived_vm_allocated_disk_storage,
:derived_vm_count_off => :generate_derived_vm_count_off,
:derived_vm_count_on => :generate_derived_vm_count_on,
:derived_vm_count_total => :generate_derived_vm_count_total,
# TODO: This is cpu_total_cores and needs to be renamed, but reports depend on the name :numvcpus
:derived_vm_numvcpus => [nil, "vm", TYPE_NUMVCPUS],
:derived_vm_numvcpus => :generate_derived_vm_numvcpus,
# See also #TODO on VimPerformanceState.capture
:derived_vm_used_disk_storage => [:vm_used_disk_storage, "vm", TYPE_USED],
:derived_vm_used_disk_storage => :generate_derived_vm_used_disk_storage,
# TODO(lsmola) as described below, this field should be named derived_cpu_used
# FIXME(nicklamuro) FYI: This doesn't really follow the convention of those
# above it (see above), so I just tried to follow the pattern the others
# have of [method, group, type]
:cpu_usagemhz_rate_average => [nil, "cpu", TYPE_RATE]
:cpu_usagemhz_rate_average => :generate_cpu_usagemhz_rate_average
}.freeze

VALID_PROCESS_TARGETS = [
@@ -59,66 +47,9 @@ def self.process_derived_columns(obj, attrs, ts = nil)

ts = attrs[:timestamp] if ts.nil?
state = obj.vim_performance_state_for_ts(ts)
total_cpu = state.total_cpu || 0
total_mem = state.total_mem || 0
result = {}

have_cpu_metrics = attrs[:cpu_usage_rate_average] || attrs[:cpu_usagemhz_rate_average]
have_mem_metrics = attrs[:mem_usage_absolute_average] || attrs[:derived_memory_used]

DERIVED_COLS.each do |col, val|
method, group, typ = val
next if group == "vm" && obj.kind_of?(Service) && typ != "count"
case typ
when TYPE_AVAILABLE
# Do not derive "available" values if there haven't been any usage
# values collected
if group == "cpu"
result[col] = total_cpu if have_cpu_metrics && total_cpu > 0
else
result[col] = total_mem if have_mem_metrics && total_mem > 0
end
when TYPE_ALLOCATED
result[col] = state.send(method) if state.respond_to?(method)
when TYPE_USED
if group == "cpu"
# TODO: This branch is never called because there isn't a column
# called derived_cpu_used. The callers, such as chargeback, generally
# use cpu_usagemhz_rate_average directly, and the column may not be
# needed, but perhaps should be added to normalize like is done for
# memory. The derivation here could then use cpu_usagemhz_rate_average
# directly if avaiable, otherwise do the calculation below.
result[col] = (attrs[:cpu_usage_rate_average] / 100 * total_cpu) unless total_cpu == 0 || attrs[:cpu_usage_rate_average].nil?
elsif group == "memory"
if attrs[:mem_usage_absolute_average].nil?
# If we can't get percentage usage, just used RAM in MB, lets compute percentage usage
attrs[:mem_usage_absolute_average] = 100.0 / total_mem * attrs[:derived_memory_used] if total_mem > 0 && !attrs[:derived_memory_used].nil?
else
# We have percentage usage of RAM, lets compute consumed RAM in MB
result[col] = (attrs[:mem_usage_absolute_average] / 100 * total_mem) unless total_mem == 0 || attrs[:mem_usage_absolute_average].nil?
end
else
result[col] = state.send(method) if state.respond_to?(method)
end
when TYPE_RATE
if col == :cpu_usagemhz_rate_average && attrs[:cpu_usagemhz_rate_average].blank?
# TODO(lsmola) for some reason, this column is used in chart, although from processing code above, it should
# be named derived_cpu_used. Investigate what is the right solution and make it right. For now lets fill
# the column shown in charts.
result[col] = (attrs[:cpu_usage_rate_average] / 100 * total_cpu) unless total_cpu == 0 || attrs[:cpu_usage_rate_average].nil?
end
when TYPE_RESERVED
result[col] = state.send(method)
when TYPE_COUNT
result[col] = state.send(method)
when TYPE_NUMVCPUS # This is actually logical cpus. See note above.
# Do not derive "available" values if there haven't been any usage
# values collected
result[col] = state.numvcpus if have_cpu_metrics && state.try(:numvcpus).to_i > 0
when TYPE_SOCKETS
result[col] = state.host_sockets
end
end
DERIVED_COLS.each { |col, method| send(method, col, state, attrs, result) }

result[:assoc_ids] = state.assoc_ids
result[:tag_names] = state.tag_names
@@ -187,4 +118,108 @@ def self.interval_name_to_interval(name)
else raise _("unknown interval name: [%{name}]") % {:name => name}
end
end

##### DEFINING METHODS FOR .process_derived_columns #####

# Shared_methods
def self.cpu_metrics?(attrs)
attrs[:cpu_usage_rate_average] || attrs[:cpu_usagemhz_rate_average]
end
private_class_method :cpu_metrics?

def self.mem_metrics?(attrs)
attrs[:mem_usage_absolute_average] || attrs[:derived_memory_used]
end
private_class_method :mem_metrics?

# Defines:
# total_cpu (helper method)
# total_mem (helper method)
# generate_derived_cpu_available
# generate_derived_cpu_reserved (calls state.reserve_cpu)
# generate_derived_memory_available
# generate_derived_memory_reserved (calls state.reserve_mem)
%w[cpu memory].each do |group|
method_def = <<-METHOD_DEF
def self.total_#{group[0..2]}(state)
state.total_#{group[0..2]} || 0
end
private_class_method :total_#{group[0..2]}

def self.generate_derived_#{group}_available(col, state, attrs, result)
result[col] = total_#{group[0..2]}(state) if #{group[0..2]}_metrics?(attrs) && total_#{group[0..2]}(state) > 0
end
private_class_method :generate_derived_#{group}_available

def self.generate_derived_#{group}_reserved(col, state, _, result)
result[col] = state.reserve_#{group[0..2]}
end
private_class_method :generate_derived_#{group}_reserved
METHOD_DEF
eval method_def.split("\n").map(&:strip).join(';')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming we went with the eval, why the need for all of the split/map/join? I thought eval just worked with the original full string...

[1] pry(main)> class Foo
[1] pry(main)*   %w{a b}.each do |x|
[1] pry(main)*     eval <<-EOS
[1] pry(main)* def #{x}
[1] pry(main)*   puts "#{x}"
[1] pry(main)* end
[1] pry(main)*     EOS
[1] pry(main)*   end
[1] pry(main)* end
=> ["a", "b"]
[2] pry(main)> Foo.new.a
a
=> nil
[3] pry(main)> Foo.new.b
b
=> nil

Copy link
Member Author

@NickLaMuro NickLaMuro Jan 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, probably could go without it, but I think I was trying to channel what was done here to a degree (probably not needed though):

rails/rails@44c51fc (see associated issue as well).

I would have to check to see if that even helps though. Honestly, I forget since it has been a while since I have put this code together. Will test out in a bit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Rails commit link there doesn't work.

Copy link
Member Author

@NickLaMuro NickLaMuro Jan 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Fryguy Shoot... sorry, I usually self shorten the SHA's to 7 chars, and I guess that was the one time Rails actually had a conflicting commit SHA with that... updated in that last comment so there isn't a broken link.

Regardless, it was stupid to do that in this case since I wasn't building a DSL, so that probably shouldn't stay. Line numbers would be exactly where we want them in this case. (past @NickLaMuro was being stupid)

end

# Defines:
# generate_derived_host_count_off
# generate_derived_host_count_on
# generate_derived_host_count_total
# generate_derived_vm_count_off
# generate_derived_vm_count_on
# generate_derived_vm_count_total
%w[host vm].each do |group|
%w[off on total].each do |mode|
method_def = <<-METHOD_DEF
def self.generate_derived_#{group}_count_#{mode}(col, state, _, result)
result[col] = state.#{group}_count_#{mode}
end
private_class_method :generate_derived_#{group}_count_#{mode}
METHOD_DEF
eval method_def.split("\n").map(&:strip).join(';')
end
end

# Defines:
# generate_host_count_off
# generate_host_count_on
# generate_host_count_total
%i[host_sockets vm_allocated_disk_storage vm_used_disk_storage].each do |method|
method_def = <<-METHOD_DEF
def self.generate_derived_#{method}(col, state, _, result)
result[col] = state.#{method} if state.respond_to?(:#{method})
end
private_class_method :generate_derived_#{method}
METHOD_DEF
eval method_def.split("\n").map(&:strip).join(';')
end

def self.generate_derived_memory_used(col, state, attrs, result)
if total_mem(state) > 0 # eject early since we can't do anything without this having a value
if attrs[:mem_usage_absolute_average].nil? && !attrs[:derived_memory_used].nil?
# If we can't get percentage usage, just used RAM in MB, lets compute percentage usage
# FIXME: Is this line a bug? Why are we assigning attr here?
attrs[:mem_usage_absolute_average] = 100.0 / total_mem(state) * attrs[:derived_memory_used]
elsif !attrs[:mem_usage_absolute_average].nil?
# We have percentage usage of RAM, lets compute consumed RAM in MB
result[col] = (attrs[:mem_usage_absolute_average] / 100 * total_mem(state))
end
end
end
private_class_method :generate_derived_memory_used

# This is actually logical cpus. See note above for :derived_vm_numvcpus
def self.generate_derived_vm_numvcpus(col, state, attrs, result)
# Do not derive "available" values if there haven't been any usage values collected
result[col] = state.numvcpus if cpu_metrics?(attrs) && state.try(:numvcpus).to_i > 0
end
private_class_method :generate_derived_vm_numvcpus

def self.generate_cpu_usagemhz_rate_average(col, state, attrs, result)
if attrs[:cpu_usagemhz_rate_average].blank? && total_cpu(state) > 0 && !attrs[:cpu_usage_rate_average].nil?
# TODO(lsmola) for some reason, this column is used in chart, although from processing code above, it should
# be named derived_cpu_used. Investigate what is the right solution and make it right. For now lets fill
# the column shown in charts.
result[col] = (attrs[:cpu_usage_rate_average] / 100 * total_cpu(state))
end
end
private_class_method :generate_cpu_usagemhz_rate_average
end