[V2V] Add CPU and network throttling in model #18576

Merged

@ghost ghost commented Mar 20, 2019

Now that we have introduced default values for CPU and network limits in the settings, and the throttling mechanism has been backported into the model, it is time to add the model code that applies the CPU and network limits at the conversion host level.

This pull request adds:

  • New methods in the ConversionHost class to compute the CPU and network limits. They divide the per-host limit by the number of active migration tasks.
  • A new method in the ConversionHost class to apply the limits for a specific task. It creates a throttling file on the conversion host. The path of the file is provided by virt-v2v-wrapper and stored in the task.
  • A new method in ServiceTemplateTransformationPlanTask to apply the limits. It collects all the limits in a hash and asks the conversion host to write them to the throttling file.
  • An extension of the InfraConversionJob class that asks the task to apply the limits. This allows the limits to be revisited on every poll.

Currently all limits are at the conversion host level, so it looks like the apply_virtv2v_limits method should belong to the ConversionHost class, but later we may have limits with different scopes. Building the hash in the task keeps that flexibility.
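
The division described in the list above can be sketched in plain Ruby. This is only an illustration, not the code from the pull request: per_task_limit is a hypothetical helper, and the limit is assumed to be either the string 'unlimited' or a value that converts to an integer.

```ruby
# Hypothetical helper (not part of ManageIQ): splits a per-host limit
# evenly across the active migration tasks on that host.
def per_task_limit(host_limit, active_task_count)
  # 'unlimited' is passed through untouched; there is nothing to divide.
  return host_limit if host_limit == 'unlimited'

  # Integer division, returned as a string.
  (host_limit.to_i / active_task_count).to_s
end

# Example values are made up for the illustration.
limits = {
  :cpu     => per_task_limit('80', 4),
  :network => per_task_limit('unlimited', 4)
}
limits.each { |name, value| puts "#{name}: #{value}" }
```

Because the division is integer division, a limit that does not divide evenly is rounded down for each task.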

Associated RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1690851
Depends On: ManageIQ/manageiq-gems-pending#426

@ghost
Author

ghost commented Mar 20, 2019

@miq-bot add_label transformation, enhancement, hammer/yes, wip
@miq-bot add_reviewer @jameswnl
@miq-bot add_reviewer @djberg96

@miq-bot miq-bot changed the title Add CPU and network throttling in model [WIP] Add CPU and network throttling in model Mar 20, 2019
@miq-bot miq-bot requested review from jameswnl and djberg96 March 20, 2019 11:32
@ghost ghost changed the title [WIP] Add CPU and network throttling in model [V2V] Add CPU and network throttling in model Mar 20, 2019
@ghost ghost changed the title [V2V] Add CPU and network throttling in model [WIP] [V2V] Add CPU and network throttling in model Mar 20, 2019
value == 'unlimited' ? value : "#{value.to_i / active_tasks.size}"
end

def apply_virtv2v_limits(path, limits)
Contributor

Are there any reasonable defaults we can set for path and/or limits?

Author

Nope. Path is stored in the task and specific to a single migration, so no default. And limits is built by the task object from different sources, so the default can be {}.
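
To make the reply above concrete, here is a minimal sketch of a writer that requires a path and defaults limits to {}. The name write_throttling_file is hypothetical, the file is written locally rather than on a remote conversion host, and JSON is an assumed serialization: the discussion does not state the format of the throttling file.

```ruby
require 'json'
require 'tempfile'

# Hypothetical stand-in for the conversion host method.
# Assumption: limits are serialized as JSON; the real format may differ.
# path has no default because it is specific to a single migration.
def write_throttling_file(path, limits = {})
  File.write(path, JSON.generate(limits))
end

Tempfile.create('throttling') do |file|
  # Explicit limits built by the caller.
  write_throttling_file(file.path, { :cpu => '20', :network => 'unlimited' })
  puts File.read(file.path)

  # No limits given: the empty hash default is written.
  write_throttling_file(file.path)
  puts File.read(file.path)
end
```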

@@ -56,6 +56,22 @@ def ipaddress(family = 'ipv4')
resource.ipaddresses.detect { |ip| IPAddr.new(ip).send("#{family}?") }
end

def get_cpu_limit
Contributor

Do we need get_, or can we just redefine the existing method?

Author

I'm not a big fan of get_ either. It's an attribute of ConversionHost, but it should default to Settings.transformation.limits.cpu_limit_per_host, which we implemented in a previous PR. Can we override an attribute with a method?
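
As a plain-Ruby illustration of the question above, a reader method can carry the same name as the attribute and fall back to a default. ConversionHostSketch and DEFAULT_CPU_LIMIT are hypothetical names; the constant stands in for Settings.transformation.limits.cpu_limit_per_host.

```ruby
# Stand-in for the settings value; the real default lives in Settings.
DEFAULT_CPU_LIMIT = 'unlimited'

# Hypothetical class, not the ManageIQ model.
class ConversionHostSketch
  attr_writer :cpu_limit

  # Reader with the same name as the attribute: returns the stored value,
  # or the default when nothing has been set.
  def cpu_limit
    @cpu_limit || DEFAULT_CPU_LIMIT
  end
end

host = ConversionHostSketch.new
puts host.cpu_limit # nothing stored yet, so the default is used
host.cpu_limit = '50'
puts host.cpu_limit # the stored value now takes precedence
```

In an ActiveRecord model the reader is generated for you, so an override would typically call super instead of reading an instance variable; that variant is not shown here.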

@@ -104,6 +104,11 @@ def poll_conversion

case v2v_status
when 'active'
begin
migration_task.apply_virtv2v_limits if migration_task.options.fetch_path(:virtv2v_wrapper, 'throttling_file')
Contributor

Where is throttling_file coming from? Is that a configurable thing? Or is hard coding it ok?

Author

It comes from the output of virt-v2v-wrapper, like state_file. So, pretty much hard coded.

Contributor

@fdupont-redhat, two questions:

  • Are we calling apply_virtv2v_limits on every poll, i.e. every 15s, regardless of whether the limits have changed?
  • So the control flow is infra_conversion_job -> miq_request_task.apply_virtv2v_limits -> conversion_host.apply_virtv2v_limits? That seems convoluted. This is the reason I asked to reconsider where we want to keep the throttling logic.

Contributor

@djberg96 djberg96 Apr 16, 2019

Agree with @jameswnl. Seems like too many hoops.

end

def get_network_limit
value = network_limit || Setting.transformation.limits.network_limit_per_host
Contributor

Settings

@@ -233,6 +233,16 @@ def get_conversion_state
update_options(updates)
end

# Applies the limits for the task.
# CPU and network limits are set at the host level but other limits may not.
def apply_virtv2v_limits
Contributor

I am wondering where this and the forthcoming throttling code should reside.
Maybe a mixin or a sub-module of infra_conversion_job?

Author

For me, the logic should reside in the task, as it has all the knowledge of the execution context. The infra_conversion_job is mainly a state machine, and should remain pretty stupid and delegate any intelligence to the task. As for whether it should be in a mixin, I'm wondering how much of the code could be reused by other classes.

@agrare agrare self-assigned this Mar 20, 2019

:cpu => conversion_host.get_cpu_limit,
:network => conversion_host.get_network_limit
}
conversion_host.apply_virtv2v_limits(options[:virtv2v_wrapper]['throttling_file'], limits)
Contributor

This construction of limits seems redundant. If this self object has no opinion on how the limits are constructed, why not let conversion_host do it internally?

Contributor

I think I kind of agree with @jameswnl. Seems like this could be refactored away or delegated.

@agrare
Member

agrare commented Apr 9, 2019

Hey guys, what is the status of this? I haven't seen any updates for a couple of weeks and it is still WIP.

@mturley
Contributor

mturley commented Apr 16, 2019

@fdupont-redhat @djberg96 @jameswnl any updates on this one? We're holding off on merging ManageIQ/manageiq-v2v#915 until it's ready

@ghost ghost force-pushed the v2v_apply_resource_limits_to_virtv2v branch from 3323486 to a770fe2 Compare April 17, 2019 22:05
@ghost
Author

ghost commented Apr 17, 2019

@djberg96 @jameswnl I took time to revisit and rethink this. I moved the limits calculation into the InfraConversionThrottler class and added a dispatch that updates and applies the limits for all jobs. This simplifies the design. I still need to add the specs.

@ghost
Author

ghost commented Apr 18, 2019

@miq-bot remove-label wip
@djberg96 @agrare could you review, please?

@miq-bot miq-bot changed the title [WIP] [V2V] Add CPU and network throttling in model [V2V] Add CPU and network throttling in model Apr 18, 2019
@miq-bot miq-bot removed the wip label Apr 18, 2019
Contributor

@djberg96 djberg96 left a comment

Nice!

Some comments would be good if you have some time to add them.

@ghost
Author

ghost commented Apr 18, 2019

@miq-bot add-reviewer @agrare

@miq-bot miq-bot requested a review from agrare April 18, 2019 15:52
# Applying the limits is done via the conversion_host which handles the writing.
def self.apply_limits
running_conversion_jobs.each do |ch, jobs|
number_of_jobs = ch.active_tasks.size
Member

Hm, if ConversionHost#active_tasks somehow got out of sync with the running jobs and was 0, this division would blow up. Maybe set this to 1 if it is 0?

Author

Good catch. jobs.size is safer because jobs can't be empty inside the loop.
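
The failure mode raised in this thread is easy to reproduce in plain Ruby: integer division by zero raises ZeroDivisionError. The sketch below shows that, plus the clamp to 1 suggested in the review. safe_divisor and the sample values are hypothetical; the pull request itself switched to jobs.size instead.

```ruby
# Made-up per-host limit for the illustration.
cpu_limit = '80'

# Integer division by zero raises ZeroDivisionError in Ruby.
begin
  result = cpu_limit.to_i / 0
  puts result
rescue ZeroDivisionError => e
  puts "division failed: #{e.message}"
end

# Hypothetical guard matching the review suggestion: never divide by less than 1.
def safe_divisor(count)
  [count, 1].max
end

share = cpu_limit.to_i / safe_divisor(0)
puts "share with guarded divisor: #{share}"
```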

throttling_file_path = migration_task.options.fetch_path(:virtv2v_wrapper, 'throttling_file')
next unless throttling_file_path
limits = {
:cpu => cpu_limit == 'unlimited' ? cpu_limit : (cpu_limit.to_i / number_of_jobs).to_s,
Member

@agrare agrare Apr 18, 2019

It is minor, but this calculation doesn't depend on anything in this loop, so you could calculate it up here.

Something like:

cpu_limit = ch.cpu_limit || Settings.transformation.limits.cpu_limit_per_host
cpu_limit = (cpu_limit.to_i / number_of_jobs).to_s unless cpu_limit == "unlimited"

Author

Moved.
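
A self-contained sketch of the hoisted calculation, under stated assumptions: Host and Job are hypothetical Struct stand-ins for the real models, FALLBACK_CPU_LIMIT stands in for Settings.transformation.limits.cpu_limit_per_host, and grouping jobs by host is assumed to approximate what running_conversion_jobs yields. The per-host share is computed once, before iterating over that host's jobs.

```ruby
# Hypothetical stand-ins for the real models; only the fields used below.
Host = Struct.new(:name, :cpu_limit)
Job  = Struct.new(:id, :host)

# Stand-in for the settings default.
FALLBACK_CPU_LIMIT = 'unlimited'

# Returns a hash of job id => limits for that job.
def limits_per_job(jobs)
  result = {}
  jobs.group_by(&:host).each do |host, host_jobs|
    # A group always holds at least one job, so this is never 0.
    number_of_jobs = host_jobs.size

    # Computed once per host, outside the per-job loop.
    cpu_limit = host.cpu_limit || FALLBACK_CPU_LIMIT
    cpu_limit = (cpu_limit.to_i / number_of_jobs).to_s unless cpu_limit == 'unlimited'

    host_jobs.each { |job| result[job.id] = { :cpu => cpu_limit } }
  end
  result
end

host_a = Host.new('a', '80')
host_b = Host.new('b', nil) # no limit set, falls back to the default
jobs = [Job.new(1, host_a), Job.new(2, host_a), Job.new(3, host_b)]

limits_per_job(jobs).each { |id, limits| puts "job #{id}: cpu=#{limits[:cpu]}" }
```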

@@ -40,16 +40,21 @@ def self.running_conversion_jobs
# Applying the limits is done via the conversion_host which handles the writing.
def self.apply_limits
running_conversion_jobs.each do |ch, jobs|
- number_of_jobs = ch.active_tasks.size
+ number_of_jobs = jobs.size
Member

👍 even better

Member

@agrare agrare left a comment

LGTM

@miq-bot
Member

miq-bot commented Apr 18, 2019

Checked commits fabiendupont/manageiq@5531207~...bb72895 with ruby 2.3.3, rubocop 0.52.1, haml-lint 0.20.0, and yamllint 1.10.0
5 files checked, 0 offenses detected
Everything looks fine. 👍

@agrare agrare merged commit ce7c67c into ManageIQ:master Apr 18, 2019
@agrare agrare added this to the Sprint 110 Ending Apr 29, 2019 milestone Apr 18, 2019
simaishi pushed a commit that referenced this pull request Apr 22, 2019
…ts_to_virtv2v

[V2V] Add CPU and network throttling in model

(cherry picked from commit ce7c67c)

https://bugzilla.redhat.com/show_bug.cgi?id=1702085
@simaishi
Contributor

Hammer backport details:

$ git log -1
commit b34ee8b44080745a2441cbc3bf3be890258637d3
Author: Adam Grare <[email protected]>
Date:   Thu Apr 18 14:15:24 2019 -0400

    Merge pull request #18576 from fdupont-redhat/v2v_apply_resource_limits_to_virtv2v
    
    [V2V] Add CPU and network throttling in model
    
    (cherry picked from commit ce7c67c404127843801f22471f1d286383a9ef17)
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1702085

simaishi added a commit that referenced this pull request May 7, 2019
@simaishi
Contributor

simaishi commented May 7, 2019

Reverted Hammer backport

commit 9d0322d76c646d4f3a15fc36b10caa6972b013b7
Author: Satoe Imaishi <[email protected]>
Date:   Tue May 7 18:26:33 2019 -0400

    Revert "Merge pull request #18576 from fdupont-redhat/v2v_apply_resource_limits_to_virtv2v"
    
    This reverts commit b34ee8b44080745a2441cbc3bf3be890258637d3.
