-
Notifications
You must be signed in to change notification settings - Fork 900
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Soft delete instead of disconnect for containers models migrations #15250
Soft delete instead of disconnect for containers models migrations #15250
Conversation
@miq-bot add_label enhancement |
cc @zeari |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good start.
just indexing deleted
will help with using bitmap indexes, but these are not that good.
We need to get the where clause into the actual index that will be used.
So I'm guessing we're going to need to look at how deleted will actually be used and change most of our indexes for these tables.
end | ||
|
||
def disconnect_to_soft_delete(model) | ||
model.where(:ems_id => nil).find_each do |rec| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to move over to a single query?
model.where(:ems_id => nil).update_all('ems_id = old_ems_id AND deleted = true')
0666604
to
6c89530
Compare
@kbrock data migration is using update_all now @cben @kbrock I am using just the :deleted_on column now, but the query cost came up almost 3x times. But I guess it's not that bad in the end, what do you think? The partial index is only slightly better now, so not really worth it. using partial index on deleted_on IS NULL ContainerImage Load (4.8ms) SELECT "container_images".* FROM "container_images" WHERE "container_images"."ems_id" = $1 AND "container_images"."deleted_on" IS NULL AND (id > 1111) LIMIT $2 [["ems_id", 26], ["LIMIT", 1000]]
ContainerImage Inst Including Associations (18.2ms - 1000rows)
EXPLAIN for: SELECT "container_images".* FROM "container_images" WHERE "container_images"."ems_id" = $1 AND "container_images"."deleted_on" IS NULL AND (id > 1111) LIMIT $2 [["ems_id", 26], ["LIMIT", 1000]]
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.29..419.70 rows=1000 width=922)
-> Index Scan using index_container_images_on_deleted_on_null on container_images (cost=0.29..8284.82 rows=19753 width=922)
Index Cond: (deleted_on IS NULL)
Filter: ((id > 1111) AND (ems_id = '26'::bigint))
(4 rows)
ContainerImage Load (27.2ms) SELECT "container_images".* FROM "container_images" WHERE "container_images"."ems_id" = $1 AND "container_images"."deleted_on" IS NULL ORDER BY "container_images"."name" ASC LIMIT $2 [["ems_id", 26], ["LIMIT", 1000]]
ContainerImage Inst Including Associations (21.8ms - 1000rows)
EXPLAIN for: SELECT "container_images".* FROM "container_images" WHERE "container_images"."ems_id" = $1 AND "container_images"."deleted_on" IS NULL ORDER BY "container_images"."name" ASC LIMIT $2 [["ems_id", 26], ["LIMIT", 1000]]
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=9283.32..9285.82 rows=1000 width=922)
-> Sort (cost=9283.32..9333.40 rows=20034 width=922)
Sort Key: name
-> Index Scan using index_container_images_on_deleted_on_null on container_images (cost=0.29..8184.87 rows=20034 width=922)
Index Cond: (deleted_on IS NULL)
Filter: (ems_id = '26'::bigint)
(6 rows)
ContainerImage Load (3.3ms) SELECT "container_images".* FROM "container_images" WHERE "container_images"."deleted_on" IS NULL LIMIT $1 [["LIMIT", 1000]]
ContainerImage Inst Including Associations (26.5ms - 1000rows)
EXPLAIN for: SELECT "container_images".* FROM "container_images" WHERE "container_images"."deleted_on" IS NULL LIMIT $1 [["LIMIT", 1000]]
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.29..202.51 rows=1000 width=922)
-> Index Scan using index_container_images_on_deleted_on_null on container_images (cost=0.29..8084.92 rows=39979 width=922)
Index Cond: (deleted_on IS NULL)
(3 rows)
ContainerImage Load (40.2ms) SELECT "container_images".* FROM "container_images" WHERE "container_images"."deleted_on" IS NULL ORDER BY "container_images"."name" ASC LIMIT $1 [["LIMIT", 1000]]
ContainerImage Inst Including Associations (11.8ms - 1000rows)
=> EXPLAIN for: SELECT "container_images".* FROM "container_images" WHERE "container_images"."deleted_on" IS NULL ORDER BY "container_images"."name" ASC LIMIT $1 [["LIMIT", 1000]]
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=10276.93..10279.43 rows=1000 width=922)
-> Sort (cost=10276.93..10376.88 rows=39979 width=922)
Sort Key: name
-> Index Scan using index_container_images_on_deleted_on_null on container_images (cost=0.29..8084.92 rows=39979 width=922)
Index Cond: (deleted_on IS NULL)
(5 rows) using index on :deleted_on ContainerImage Load (3.6ms) SELECT "container_images".* FROM "container_images" WHERE "container_images"."ems_id" = $1 AND "container_images"."deleted_on" IS NULL AND (id > 1111) LIMIT $2 [["ems_id", 26], ["LIMIT", 1000]]
ContainerImage Inst Including Associations (30.7ms - 1000rows)
EXPLAIN for: SELECT "container_images".* FROM "container_images" WHERE "container_images"."ems_id" = $1 AND "container_images"."deleted_on" IS NULL AND (id > 1111) LIMIT $2 [["ems_id", 26], ["LIMIT", 1000]]
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.29..818.82 rows=1000 width=924)
-> Index Scan using index_container_images_on_deleted_on_null on container_images (cost=0.29..8069.36 rows=9858 width=924)
Index Cond: (deleted_on IS NULL)
Filter: ((id > 1111) AND (ems_id = '26'::bigint))
(4 rows)
ContainerImage Load (29.3ms) SELECT "container_images".* FROM "container_images" WHERE "container_images"."ems_id" = $1 AND "container_images"."deleted_on" IS NULL ORDER BY "container_images"."name" ASC LIMIT $2 [["ems_id", 26], ["LIMIT", 1000]]
ContainerImage Inst Including Associations (15.1ms - 1000rows)
EXPLAIN for: SELECT "container_images".* FROM "container_images" WHERE "container_images"."ems_id" = $1 AND "container_images"."deleted_on" IS NULL ORDER BY "container_images"."name" ASC LIMIT $2 [["ems_id", 26], ["LIMIT", 1000]]
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=8517.52..8520.02 rows=1000 width=924)
-> Sort (cost=8517.52..8542.52 rows=10001 width=924)
Sort Key: name
-> Index Scan using index_container_images_on_deleted_on_null on container_images (cost=0.29..7969.17 rows=10001 width=924)
Index Cond: (deleted_on IS NULL)
Filter: (ems_id = '26'::bigint)
(6 rows)
ContainerImage Load (3.0ms) SELECT "container_images".* FROM "container_images" WHERE "container_images"."deleted_on" IS NULL LIMIT $1 [["LIMIT", 1000]]
ContainerImage Inst Including Associations (10.0ms - 1000rows)
EXPLAIN for: SELECT "container_images".* FROM "container_images" WHERE "container_images"."deleted_on" IS NULL LIMIT $1 [["LIMIT", 1000]]
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.29..196.64 rows=1000 width=924)
-> Index Scan using index_container_images_on_deleted_on_null on container_images (cost=0.29..7868.98 rows=40075 width=924)
Index Cond: (deleted_on IS NULL)
(3 rows)
ContainerImage Load (40.4ms) SELECT "container_images".* FROM "container_images" WHERE "container_images"."deleted_on" IS NULL ORDER BY "container_images"."name" ASC LIMIT $1 [["LIMIT", 1000]]
ContainerImage Inst Including Associations (14.8ms - 1000rows)
=> EXPLAIN for: SELECT "container_images".* FROM "container_images" WHERE "container_images"."deleted_on" IS NULL ORDER BY "container_images"."name" ASC LIMIT $1 [["LIMIT", 1000]]
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=10066.25..10068.75 rows=1000 width=924)
-> Sort (cost=10066.25..10166.44 rows=40075 width=924)
Sort Key: name
-> Index Scan using index_container_images_on_deleted_on_null on container_images (cost=0.29..7868.98 rows=40075 width=924)
Index Cond: (deleted_on IS NULL)
(5 rows) |
@lpichler could you review the migrations? |
6c89530
to
e317c15
Compare
cool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy with these changes.
Once, we change the code for deleted
, then we can see what other indexes are needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Add :deleted to the disconnectable Containers tables, which will allow us to mark soft deleted records.
Change disconnected Containers records to soft deleted, reconnecting the old_ems_id if possible.
Specs for changing disconnected Containers records to soft deleted
Add partial index to :deleted col to speedup queries. It has to be a partial index, otherwise the cost is too high and PG will not pick it.
Fix the comment :deleted => true goes to :ems_id => nil
Optimize data migration queries to run in 1 query per model
@agrare Can you do a first-pass review? |
|
||
add_index :container_definitions, :deleted, | ||
:name => "index_container_definitions_on_deleted_false", | ||
:where => "NOT deleted" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does a partial index on the boolean do? Why have the where clause at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a boolean it's recommended. Since PG would not pick a normal index on a boolean, because the cost is too high.
add_column :container_groups, :deleted, :boolean, :default => false, :null => false | ||
add_column :container_images, :deleted, :boolean, :default => false, :null => false | ||
add_column :container_projects, :deleted, :boolean, :default => false, :null => false | ||
add_column :containers, :deleted, :boolean, :default => false, :null => false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer default_value_for over the :default => false, :null => false
, or does having it in the column contribute to better performance?
Additionally, this PR doesn't handle upgrading existing values in the database to set the default to true where things are already deleted. NVM, didn't see the second follow-up migration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right, I wan't just true/false ensured, so the indexing space is small
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
default_value_for ensures a value and would keep the indexing space small as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is a default_value_for a ruby thing, done on Model? It make sense to have this ensured on the DB layer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not going to have this debate again. 😄 We intentionally avoid DB layer logic (foreign keys, default values, etc), so that we can maintain maximum flexibility for release branches. That is, since we don't allow schema changes / db migrations on release branches, if we make a mistake we can't fix it. (Additionally, when we were on rubyrep, these constraints had always proven to be pratically impossible for replication. Now that we are on pglogical, that argument is probably moot.)
I've been hesitant to allow these through unless they are absolutely necessary (e.g. there'd have to be a compelling performance reason)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, we use a DB level replication, that argument is not valid, but good to know the past reasons. :-)
So saying that, what are the other concrete blockers? Given we want to design it right, this should be the correct way. So I would like to avoid bad DB design, unless there are real concerns.
Btw. I can't think of any reason that would require a later change in this column. Deleted true/false should be the end state.
@Ladas Can you give me a tl;dr on your EXPLAINs there? Maybe a little chart or something that can show me the pertinent details in a glance? |
@Fryguy right, so the explains are testing few SQL queries we are using through the MIQ and it seems that using a partial index on a boolean has the lowest cost. The cost of using :deleted_on => nil was almost 3x as high. The queries were tested on a DB having 1M records + in each table, to make sure sequential scan will not be picked. (1M on a different manager, 40k on a tested manager, where 20k is active and 20k disconnected) |
but 3x on what magnitude? Are we talking 0.001ms to 0.003 ms? 😆 |
I depesz-ed your explains above for readability (I just can't read them haha). however it looks like you just ran them with EXPLAIN instead of EXPLAIN ANALYZE. Can you do the latter, which tends to give more interesting results as you also get the real-world query times? using partial index on deleted_on IS NULL
using index on :deleted_on
ProTip: Open the links for the same query in two separate tabs then click back and forth to see the differences. |
@zakiva could you do it this way then? So we don't have to do more data migrations. |
@Ladas #15351 is targeted to 5.8.1 BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1460401 and we would like to merge this as soon as possible, so I prefer not to create dependency on this PR with all its migrations. |
@Fryguy let me measure the EXPLAIN ANALYZE and get back to you :-) |
@Fryguy this is explain analyze for the partial index on :deleted pp ems = ManageIQ::Providers::ContainerManager.find(26); conl = ->(q) {ActiveRecord::Base.connection.execute("EXPLAIN ANALYZE #{q.to_sql}") }; pp conl.call(ems.container_images.where("id > 1111").limit(1000)).to_a; pp conl.call(ems.container_images.order(:name).limit(1000)).to_a; pp conl.call(ContainerImage.active.limit(1000)).to_a; pp conl.call(ContainerImage.active.order(:name).limit(1000)).to_a
|
@Fryguy and for general index on :deleted_on
|
Would like @simon3z to review before proceeding to make sure this works for him and his team. |
…nnect Migration keep :deleted_on as the only only source of truth for disconnect
b11b3cc
to
c6199d7
Compare
Reword the migration comments
4ce65c3
to
64e655f
Compare
@cben please review/approve, thanks. |
Update specs to work with deleted_on
Checked commits Ladas/manageiq@9f2ce26~...143d67d with ruby 2.2.6, rubocop 0.47.1, and haml-lint 0.20.0 db/migrate/20170530102536_use_deleted_in_containers_tables.rb
|
Database migrations have now been moved to the https://github.com/ManageIQ/manageiq-schema repo. Please see http://talk.manageiq.org/t/new-split-repo-manageiq-schema/2478 for instructions on how to transfer your database migrations. If this PR contains only migrations, I will leave it open for a short time during the transition, after which I will close this if it has not been moved over. |
replaced with ManageIQ/manageiq-schema#18 |
Soft delete instead of disconnect for containers models migrations + data migration specs
Using partial indexes have actually a better cots than indexing :ems_id and using disconnect:
Testing on queries, on 20k deleted and 20k not deleted container images:
pp ems = ManageIQ::Providers::ContainerManager.find(26); pp ems.container_images.where("id > 1111").limit(1000).explain; pp ems.container_images.order(:name).limit(1000).explain; pp ContainerImage.not_deleted.limit(1000).explain; ContainerImage.not_deleted.order(:name).limit(1000).explain
The soft delete with partial index on :deleted => true:
The :ems_id => nil disconnect