
Amazon S3 objects inventoring #123

Merged 1 commit into ManageIQ:master on Feb 21, 2017

Conversation

aiperon
Contributor

@aiperon aiperon commented Jan 27, 2017

Work for the Amazon ObjectStorage (S3) provider.

Parses information about objects in S3 buckets and calculates each bucket's size and object count.
Implemented with both the old and the new inventory parser.

@miq-bot add_label enhancement, euwe/no

@miq-bot
Member

miq-bot commented Feb 2, 2017

This pull request is not mergeable. Please rebase and repush.

@aiperon aiperon force-pushed the amazon-s3-objects-parsing branch from 525e236 to e91fc36 on February 19, 2017 18:30
@aiperon aiperon force-pushed the amazon-s3-objects-parsing branch 3 times, most recently from 6dce432 to 47eff74 on February 20, 2017 09:31
@aiperon aiperon changed the title [WIP] Amazon S3 objects listing Amazon S3 objects listing Feb 20, 2017
@aiperon aiperon changed the title Amazon S3 objects listing Amazon S3 objects inventoring Feb 20, 2017
Parsing information about objects on S3 buckets.
Calculating bucket's size and object count.
@aiperon aiperon force-pushed the amazon-s3-objects-parsing branch from 47eff74 to 59a95ae on February 20, 2017 09:34
@miq-bot
Member

miq-bot commented Feb 20, 2017

Checked commit aiperon@59a95ae with ruby 2.2.6, rubocop 0.47.1, and haml-lint 0.20.0
8 files checked, 0 offenses detected
Everything looks good. ⭐

@@ -23,6 +23,8 @@
:values:
- machine
:ignore_terminated_instances: true
:s3:
:inventory_object_refresh: true
Contributor Author

@aiperon aiperon Feb 20, 2017

The old style of inventory refresh is not able to handle a large number of objects.
For example, 100k S3 objects are handled in:

  • 2 minutes with the new inventory object refresher
  • about 2 hours with the old refresher

Since removing the old refresher is only a question of time, we've decided to make the new mechanism the default.
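For reference, the added keys nest under the provider's refresh settings; a hedged sketch of the full fragment (the :ems_refresh parent key is my assumption, only the :s3 lines come from the diff above):

```
:ems_refresh:
  :s3:
    :inventory_object_refresh: true  # default to the new, fast refresher
```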

Contributor

👍 I'll be doing the same for the other managers soon

hm, 2 m vs. 2 h is quite a big difference; I wonder what makes the old refresh so slow

Contributor Author

@aiperon aiperon Feb 21, 2017

Most of the time is spent on the following lines:
https://github.com/ManageIQ/manageiq/blob/master/app/models/ems_refresh/save_inventory_helper.rb#L55
https://github.com/ManageIQ/manageiq/blob/master/app/models/ems_refresh/save_inventory_helper.rb#L69
To reproduce it:

  • change s3_objects_per_bucket_count to "scaling * 10000" in aws_stubs.rb
  • run rspec plugins/manageiq-amazon-provider/spec/models/manageiq/providers/amazon/storage_manager/s3/stubbed_refresher_spec.rb

I've rejected the idea of optimizing the legacy parser's core.

Contributor

yeah, I mean the new refresh is 'the optimization' :-D

but I think that in this case you are hitting a Rails 5.0.1 bug: association.push has a horrible O(n^2) in there, which is already fixed in Rails master. With that fixed, the numbers for the old refresh should not be that big. :-) It should be something around 15 minutes.
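An illustrative sketch of that kind of quadratic behavior (plain Ruby, not the actual Rails internals): if every push into a collection re-checks membership with a linear scan, building the collection is O(n^2) overall, while indexing membership in a Set keeps it O(n).

```ruby
require 'set'

# Naive variant: a linear scan of the collection on every push.
def naive_push_all(records)
  collection = []
  records.each do |r|
    collection << r unless collection.include?(r) # O(n) scan per push
  end
  collection
end

# Indexed variant: a Set answers the membership question in O(1).
def indexed_push_all(records)
  seen = Set.new
  records.each_with_object([]) do |r, collection|
    collection << r if seen.add?(r) # Set#add? returns nil on duplicates
  end
end
```

Both produce the same collection; only the cost differs, which is why the gap shows up only at large object counts.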

Contributor Author

Well, association.push is not the major problem there. Iterating through the hash is the bigger problem, roughly an 80/20 split (hash iteration with find-or-create vs. association push).

Yeah, and thanks for the optimized new refresh! It's complex, but at least pretty fast. :)
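A hedged sketch of that 80/20 split (the names here are made up for illustration): finding an existing record by scanning the whole collection for each fetched item is quadratic, while building a lookup hash keyed by ems_ref once makes every find-or-create O(1).

```ruby
# Slow variant: a full scan of the existing records per fetched item.
def save_with_scan(existing, fetched)
  fetched.map do |attrs|
    found = existing.find { |r| r[:ems_ref] == attrs[:ems_ref] } # O(n) per item
    found ? found.merge(attrs) : attrs.dup
  end
end

# Fast variant: build the index once, then each lookup is O(1).
def save_with_index(existing, fetched)
  index = existing.each_with_object({}) { |r, h| h[r[:ems_ref]] = r }
  fetched.map do |attrs|
    found = index[attrs[:ems_ref]]
    found ? found.merge(attrs) : attrs.dup
  end
end
```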

Contributor

hm, you are right; for me it's 7500 s vs. 60 s with the Rails fix applied. So something is really wrong in the old refresh

@aiperon
Contributor Author

aiperon commented Feb 20, 2017

There is a performance issue with provider deletion: I wasn't able to delete the provider until I had manually deleted all of the CloudObjectStoreObject records (>40k) in the console. Mass deletion of objects should be implemented at the Manager level.

@Ladas
Contributor

Ladas commented Feb 21, 2017

@aiperon a similar issue exists for cloud VMs; it's solved by using :dependent => :nullify instead of :destroy:
https://github.com/Ladas/manageiq/blob/4240b7c1f4d42d7013c1ba11ad6d55eaa84a53bd/app/models/ext_management_system.rb#L37-L37

That then leaves the VMs there as 'archived'.

Having :dependent => :destroy on a large number of records will always be slow, so this is something of a workaround. We should probably just mark the provider as 'being deleted' and do all of this in the background. The difference with VMs is that we never delete them; we always archive them.
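In model code the two options read roughly like this (a sketch, not the actual ManageIQ declarations):

```
class ExtManagementSystem < ActiveRecord::Base
  # per-record destroy callbacks: always slow when there are many rows
  has_many :vms, :dependent => :destroy

  # alternative: detach the rows instead, leaving them 'archived',
  # and purge them later from a background task
  # has_many :vms, :dependent => :nullify
end
```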

@Ladas
Contributor

Ladas commented Feb 21, 2017

@aiperon looks like a great first version 👍 Since the code here got more complex, I think we could use the nice DSL the Ansible provider is using: https://github.com/Ladas/manageiq/blob/35da1fa8e9804259fe133a834a7b48b93245464e/app/models/manageiq/providers/ansible_tower/inventory/parser/automation_manager.rb#L28-L28

The process_inventory_collection helper gets too clunky when there is too much logic.

But this is good to merge; we can do the refactoring in another PR while keeping the same specs passing. :-)

@Ladas Ladas merged commit a2b9fe6 into ManageIQ:master Feb 21, 2017
@aiperon
Contributor Author

aiperon commented Feb 21, 2017

@Ladas
I don't think :dependent => :nullify is the appropriate way; potentially, we could get zombies.
We could set cloud_object_store_objects to :dependent => :delete_all, but they're taggable, and delete_all skips the callbacks that would clean up the tags.
Is it possible to describe a more complex before_destroy for the storage manager?
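A toy model of the zombie concern (plain Ruby, no Rails; the class and method names are made up): a per-record destroy also removes the record's tags, while a bulk delete_all-style removal is fast but leaves the tags behind as orphans.

```ruby
class ToyStore
  attr_reader :objects, :tags

  def initialize(objects, tags)
    @objects = objects # array of object ids
    @tags    = tags    # { object_id => [tag, ...] }
  end

  # Per-record cleanup: slow for many rows, but leaves nothing behind.
  def destroy_all
    @objects.each { |id| @tags.delete(id) }
    @objects.clear
  end

  # Single bulk removal: fast, but skips the tag cleanup entirely.
  def delete_all
    @objects.clear
  end
end
```

After delete_all the tags hash still holds entries for objects that no longer exist, which is exactly the orphaned-record problem with taggable models.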

@Ladas
Contributor

Ladas commented Feb 21, 2017

So the best approach would be to somehow mark the Manager as being deleted, which would prevent other operations on it, and then do the deletion as a background task. The main issue is that it runs in the foreground and blocks the UI, right?

Or we could do :dependent => :nullify with a background cleanup later. But the point of :dependent => :nullify is to keep the zombie: for VMs, we need to keep them so that e.g. reports and events won't miss the association.

@aiperon
Contributor Author

aiperon commented Feb 21, 2017

The main issue is that it runs in the foreground and blocks the UI, right?

I suppose it's already running in the background, but "somehow" it doesn't do the job.

@blomquisg blomquisg added this to the Sprint 55 Ending Feb 27, 2017 milestone Feb 27, 2017