Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor get_ems_folders to create less strings #17358

Conversation

NickLaMuro
Copy link
Member

@NickLaMuro NickLaMuro commented Apr 27, 2018

Creating some strings in this method is inevitable, since we are generating string paths based on this compiled info, but there were a lot of excessive strings being created because of it:

  • Both of the constants in this method were being created every time this method was called. For the "Datacenter".freeze one, it was just used as a comparison as well
  • Using << where possible will avoid generating new substrings, so we are only creating one new string at most.
  • Deferred the .dup of full_path until a new string was known that it would be generated. This did require creating a new variable to hold on to the existing variable, but that wasn't a big deal and didn't increase the number of allocations.

Split up some ternary operators as well just to make running profilers on this code easier to determine what as causing allocations (granularity in tracing object allocations in ruby is only down to a line by line basis).

Benchmarks

The benchmark script used here basically runs the following:

ManageIQ::Providers::Vmware::InfraManager::ProvisionWorkflow.new.init_from_dialog

And is targeting a fairly large EMS, with about 600 hosts.

ms queries rows
before 15023 1961 48395
after 15017 1961 48395
Before                                                     After
------                                                     -----

Total allocated: 2069091751 bytes (23226536 objects)   |   Total allocated: 2055920344 bytes (22741877 objects)
                                                       |   
allocated objects by gem                               |   allocated objects by gem
-----------------------------------                    |   -----------------------------------
   9665227  pending                                    |      9665227  pending
   5561668  manageiq/app  <<<<<<<<<<                   |      5076997  manageiq/app  <<<<<<<<<<
   3513258  manageiq/lib                               |      3513258  manageiq/lib
   2512007  activerecord-5.0.7                         |      2512007  activerecord-5.0.7
    861270  activesupport-5.0.7  <<<<<<<<<<            |       861282  activesupport-5.0.7  <<<<<<<<<<
    418737  ancestry-2.2.2                             |       418737  ancestry-2.2.2
    278793  activemodel-5.0.7                          |       278793  activemodel-5.0.7
    178419  ruby-2.3.3/lib                             |       178419  ruby-2.3.3/lib
    165577  arel-7.1.4                                 |       165577  arel-7.1.4
     52875  manageiq-providers-vmware-0be2f13a0dc9     |        52875  manageiq-providers-vmware-0be2f13a0dc9
     14424  fast_gettext-1.2.0                         |        14424  fast_gettext-1.2.0
       ...                                             |          ...

Note: The benchmarks for this change do NOT include the changes from other PRs (#17354 for example). Benchmarks of all changes can be found here.

Links

@NickLaMuro
Copy link
Member Author

@miq-bot assign @gmcculloug
@miq-bot add_label performance

@miq-bot
Copy link
Member

miq-bot commented Apr 27, 2018

This pull request is not mergeable. Please rebase and repush.

if path.blank?
path << folder.name.to_s
else
path << FOLDER_PATH_SEPARATOR << folder.name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@NickLaMuro - Since there isn't a ternary here constantly generating new String objects. Within the confines of a conditional Is the CONSTANT still more performant than just including the "/" here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to say "yes", but I would rather have some proof. Let me run some tests quick.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ended up doing by testing against the DATACENTER_FOLDER_TYPE constant and putting my findings there:

https://github.com/ManageIQ/manageiq/pull/17358/files#r188783333

but the same should apply here as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So my tendency is to go the easier to read path, which would mean pulling out the constants and replacing them with string_name.freeze.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmmk, I can go with that. I tend to do constants myself because I can wrap my brain around it a little easier. When it is in the "string".freeze mode, I still think of/read it as "allocate string, then make it immutable", when it really isn't doing the "allocate string" bit at all.

Any who, I will make the change. 👍

else
path << FOLDER_PATH_SEPARATOR << folder.name
end
dh[folder.id] = path unless folder.type == DATACENTER_FOLDER_TYPE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question here.

Copy link
Member Author

@NickLaMuro NickLaMuro May 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, so turns out the CONSTANT isn't necessary, but a .freeze if it isn't a constant definitely is. So with the following:

What we have currently

dh[folder.id] = path unless folder.type == DATACENTER_FOLDER_TYPE

# Total allocated: 2055942638 bytes (22741876 objects)
# 
# allocated objects by gem
# -----------------------------------
#    9665227  pending
#    5076997  manageiq/app
#    3513259  manageiq/lib
#    2512007  activerecord-5.0.7
#     861280  activesupport-5.0.7
#     418737  ancestry-2.2.2
#     278793  activemodel-5.0.7
#     178419  ruby-2.3.3/lib
#     165577  arel-7.1.4
#      52875  manageiq-providers-vmware-e2e2b8029538
#      14424  fast_gettext-1.2.0
#       4115  other
#         75  default_value_for-3.0.5
#         68  concurrent-ruby-1.0.5
#         16  memoist-0.15.0
#          4  rubygems
#          3  manageiq-providers-lenovo-ed197e6bea5b

With a normal string

dh[folder.id] = path unless folder.type == "Datacenter"

# Total allocated: 2066619256 bytes (23009605 objects)
# 
# allocated objects by gem
# -----------------------------------
#    9665227  pending
#    5344726  manageiq/app
#    3513259  manageiq/lib
#    2512007  activerecord-5.0.7
#     861280  activesupport-5.0.7
#     418737  ancestry-2.2.2
#     278793  activemodel-5.0.7
#     178419  ruby-2.3.3/lib
#     165577  arel-7.1.4
#      52875  manageiq-providers-vmware-e2e2b8029538
#      14424  fast_gettext-1.2.0
#       4115  other
#         75  default_value_for-3.0.5
#         68  concurrent-ruby-1.0.5
#         16  memoist-0.15.0
#          4  rubygems
#          3  manageiq-providers-lenovo-ed197e6bea5b

With a frozen string (not a constant though)

dh[folder.id] = path unless folder.type == "Datacenter".freeze

# Total allocated: 2055940416 bytes (22741876 objects)
# 
# allocated objects by gem
# -----------------------------------
#    9665227  pending
#    5076997  manageiq/app
#    3513259  manageiq/lib
#    2512007  activerecord-5.0.7
#     861280  activesupport-5.0.7
#     418737  ancestry-2.2.2
#     278793  activemodel-5.0.7
#     178419  ruby-2.3.3/lib
#     165577  arel-7.1.4
#      52875  manageiq-providers-vmware-e2e2b8029538
#      14424  fast_gettext-1.2.0
#       4115  other
#         75  default_value_for-3.0.5
#         68  concurrent-ruby-1.0.5
#         16  memoist-0.15.0
#          4  rubygems
#          3  manageiq-providers-lenovo-ed197e6bea5b

With a non-frozen constant

# DATACENTER_FOLDER_TYPE = "Datacenter".freeze
DATACENTER_FOLDER_TYPE = "Datacenter"

# ...

dh[folder.id] = path unless folder.type == DATACENTER_FOLDER_TYPE

# Total allocated: 2055902327 bytes (22741876 objects)
# 
# allocated objects by gem
# -----------------------------------
#    9665227  pending
#    5076997  manageiq/app
#    3513259  manageiq/lib
#    2512007  activerecord-5.0.7
#     861280  activesupport-5.0.7
#     418737  ancestry-2.2.2
#     278793  activemodel-5.0.7
#     178419  ruby-2.3.3/lib
#     165577  arel-7.1.4
#      52875  manageiq-providers-vmware-e2e2b8029538
#      14424  fast_gettext-1.2.0
#       4115  other
#         75  default_value_for-3.0.5
#         68  concurrent-ruby-1.0.5
#         16  memoist-0.15.0
#          4  rubygems
#          3  manageiq-providers-lenovo-ed197e6bea5b

So basically is it is a matter of preference between the 1st, 3rd, and 4th options since they all perform the same.

Creating some strings in this method is inevitable, since we are
generating string paths based on this compiled info, but there were a
lot of excessive strings being created because of it:

- Both of the strings in this method were being created every time
  this method was called.  For the "Datacenter".freeze one, it was
  just used as a comparison as well
- Using << where possible will avoid generating new substrings, so we
  are only creating one new string at most.
- Deferred the `.dup` of full_path until a new string was known that it
  would be generated.  This did require creating a new variable to hold
  on to the existing variable, but that wasn't a big deal and didn't
  increase the number of allocations.

Split up some ternary operators as well just to make running profilers
on this code easier to determine what as causing allocations
(granularity in tracing object allocations in ruby is only down to a
line by line basis).
@NickLaMuro NickLaMuro force-pushed the refactor_MiqRequestWorkflow_get_ems_folders branch from c12c75b to 1e0d6af Compare May 17, 2018 21:03
@miq-bot
Copy link
Member

miq-bot commented May 17, 2018

Checked commit NickLaMuro@1e0d6af with ruby 2.3.3, rubocop 0.52.1, haml-lint 0.20.0, and yamllint 1.10.0
1 file checked, 1 offense detected

app/models/miq_request_workflow.rb

  • ❗ - Line 877, Col 5 - Style/SafeNavigation - Use safe navigation (&.) instead of checking if an object exists before calling the method.

Copy link
Contributor

@syncrou syncrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@NickLaMuro - Thanks - LGTM

Copy link
Member

@kbrock kbrock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - just the one possible change (am good either way)

else
path << " / ".freeze << folder.name
end
dh[folder.id] = path unless folder.type == "Datacenter".freeze
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it make sense to have `elseif folder.type != "Datacenter".freeze ?
(not sure if you want to pull out into a const)

Copy link
Member Author

@NickLaMuro NickLaMuro May 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(not sure if you want to pull out into a const)

Per @syncrou 's previous review, he thought it was easier to read when it was not a constant. I already had it as a constant originally.

would it make sense to have `elseif folder.type != "Datacenter".freeze ?

Possibly, though I was mostly trying to keep the logic the same, and just adjusting things to limit object allocations. If you look at the previous code, the if logic is basically exactly the same, I just expanded out the ternary operator since I thought it was easier to read in that form. Figured trying to tweak the logic would add unneeded risk.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aah - you are actually using path below

@gmcculloug gmcculloug merged commit 7060b2d into ManageIQ:master May 22, 2018
@gmcculloug gmcculloug added this to the Sprint 87 Ending Jun 4, 2018 milestone May 22, 2018
@simaishi
Copy link
Contributor

Added fine/yes and gaprindashvili/yes as per https://bugzilla.redhat.com/show_bug.cgi?id=1569626

simaishi pushed a commit that referenced this pull request Jun 21, 2018
…_get_ems_folders

Refactor get_ems_folders to create less strings
(cherry picked from commit 7060b2d)

https://bugzilla.redhat.com/show_bug.cgi?id=1593798
@simaishi
Copy link
Contributor

Fine backport details:

$ git log -1
commit ec917a83d9a8fc2f135fb40c575776391d68b6aa
Author: Greg McCullough <[email protected]>
Date:   Tue May 22 17:27:56 2018 -0400

    Merge pull request #17358 from NickLaMuro/refactor_MiqRequestWorkflow_get_ems_folders
    
    Refactor get_ems_folders to create less strings
    (cherry picked from commit 7060b2df04b5364bf6164a0368b5ea7edab6209e)
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1593798

simaishi pushed a commit that referenced this pull request Jun 21, 2018
…_get_ems_folders

Refactor get_ems_folders to create less strings
(cherry picked from commit 7060b2d)

https://bugzilla.redhat.com/show_bug.cgi?id=1593797
@simaishi
Copy link
Contributor

Gaprindashvili backport details:

$ git log -1
commit 2f85825a773b3d894df8041a271e42217cda8b2f
Author: Greg McCullough <[email protected]>
Date:   Tue May 22 17:27:56 2018 -0400

    Merge pull request #17358 from NickLaMuro/refactor_MiqRequestWorkflow_get_ems_folders
    
    Refactor get_ems_folders to create less strings
    (cherry picked from commit 7060b2df04b5364bf6164a0368b5ea7edab6209e)
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1593797

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants