Synchronous deletion of small objects #1174
Conversation
Force-pushed from d6844a3 to 3e3f9a6
* Add a configuration item `active_delete_threshold` (default is 0).
* Small objects (UUIDs) smaller than that threshold are synchronously deleted right after being marked as `pending_delete`, while the others are marked as `scheduled_delete` as before.
* The deletion involves invoking `riak_cs_delete_fsm` and removing the UUID and manifest from that object's history.
* This does not change manifest handling semantics for any failure between marking as `pending_delete` and `scheduled_delete`.
* If active deletion fails, the object will be handled the next time it is visited as a `pending_delete` manifest, as before.
* This commit introduces a risk of block leaks with MDC replication, in the case where deleted UUIDs are erased on the sink side without all blocks being deleted. If any RTQ drop is detected, the block leak checker should be run against the sink cluster.
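A minimal sketch of the flow described above, under stated assumptions: `small_enough_for_active_delete/2`, `handle_pending_delete/3`, `delete_uuid_synchronously/2`, and `leave_for_gc/1` are illustrative names, not the PR's actual functions.

```erlang
%% Sketch only: routes a pending_delete manifest either to synchronous
%% deletion (small object) or to the normal GC path. Helper names and the
%% exact comparison are assumptions; assumes riak_cs.hrl for ?MANIFEST.
-spec small_enough_for_active_delete(non_neg_integer(), non_neg_integer()) -> boolean().
small_enough_for_active_delete(_ContentLength, 0) ->
    %% a threshold of 0 (the default) disables active deletion
    false;
small_enough_for_active_delete(ContentLength, Threshold) ->
    ContentLength < Threshold.

handle_pending_delete({_UUID, Manifest} = UUIDManifest, RcPid, Threshold) ->
    ContentLength = Manifest?MANIFEST.content_length,
    case small_enough_for_active_delete(ContentLength, Threshold) of
        true ->
            %% delete blocks right away via riak_cs_delete_fsm, then drop
            %% the UUID and manifest from the object history
            delete_uuid_synchronously(UUIDManifest, RcPid);
        false ->
            %% leave it to the GC daemon; it becomes scheduled_delete later
            leave_for_gc(UUIDManifest)
    end.
```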
Force-pushed from 3e3f9a6 to 4b72102
```erlang
            [{cleanup_manifests, false}]],
    {ok, Pid} = riak_cs_delete_fsm_sup:start_delete_fsm(node(), Args),
    Ref = erlang:monitor(process, Pid),
    receive
```
What happens when delete_fsm dies unexpectedly?
Will be caught by the last clause of the receive...
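For illustration, a sketch of what such a final clause could look like, given the monitor set up in the snippet above; `wait_for_delete_fsm/2`, the timeout, and the return values are assumptions, not the PR's exact code.

```erlang
%% Sketch: the monitor turns an unexpected fsm death into a 'DOWN' message,
%% which the last clause catches. The timeout value is illustrative.
wait_for_delete_fsm(Pid, Ref) ->
    receive
        {Pid, {ok, _} = Reply} ->
            erlang:demonitor(Ref, [flush]),
            Reply;
        {'DOWN', Ref, process, Pid, Reason} ->
            %% delete fsm died before sending a reply
            {error, {delete_fsm_died, Reason}}
    after 60000 ->
            erlang:demonitor(Ref, [flush]),
            {error, timeout}
    end.
```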
Some notes on this: https://github.com/basho/riak_cs/wiki/Active-Deletion-of-Small-Objects
Some corner cases:
I'd like to cover those corner cases in the next pull request, to keep this work small. Fair?
Yup. I must not forget about them 😅
Hmm, but do you think covering them in documentation is enough, like mentioning "use this feature with special care if you combine it with MDC replication"?
The two corner cases above have nothing to do with MDC. They can occur in a single-cluster environment.
Oops, I was confused.
I should have been clearer. The reason I am afraid of manifest resurrection is that it is an end-user-facing issue. For something that just bothers operators, we may be tolerant (depending on the operators). However, for an end-user-facing issue, such a switch should not be turned on for a public storage service based on Riak CS. On the other hand, an empty manifest history is NOT end-user facing, so I consider it minor.
```erlang
-spec maybe_delete_small_objects([cs_uuid_and_manifest()], riak_client(), non_neg_integer()) ->
          {[cs_uuid_and_manifest()], [cs_uuid()]}.
maybe_delete_small_objects(Manifests, RcPid, Threshold) ->
    {ok, BagId} = riak_cs_riak_client:get_manifest_bag(RcPid),
```
Will follow up with multibag code.
Besides the issues below, this is ready for review again, as is basho/riak_cs_multibag#26. Let's discuss the 3 options — how to handle them — after this gets merged.
```erlang
            {PDManifests0, []}
    end,

    PDUUIDs = [UUID || {UUID, _} <- ToGC],
```
`PDUUIDs` is used only in the `ok` branch below and not in the other. Moving this line into the `ok` branch would be better.
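A sketch of the suggested move; the surrounding function, `move_manifests_to_gc_bucket/2`, and the result shapes are assumptions rather than the PR's actual code.

```erlang
%% Sketch: PDUUIDs is computed only in the ok branch, where it is used.
%% The wrapped call and result shapes are illustrative.
maybe_mark_as_scheduled_delete(ToGC, RcPid) ->
    case move_manifests_to_gc_bucket(ToGC, RcPid) of
        ok ->
            PDUUIDs = [UUID || {UUID, _} <- ToGC],
            {ok, PDUUIDs};
        {error, _Reason} = Error ->
            Error
    end.
```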
So we chose Option 1, as the assumption was wrong and all manifests will be collected after the leeway period. I think I've addressed all your comments, @shino. Please take a look again. Also, I'm having problems with my riak_test installation; I'd be happy if you could run it with multibag enabled.
All riak_test cases have passed; it's ready for review now, although it's Friday afternoon ...
Nice summary of the resurrection vs. lingering-manifest trade-off. Agree with adopting Option 1.
Not about the code, but about the wiki page https://github.com/basho/riak_cs/wiki/Active-Deletion-of-Small-Objects, for the section https://github.com/basho/riak_cs/wiki/Active-Deletion-of-Small-Objects#things-to-be-cared-in-production
```
@@ -200,6 +200,16 @@
  hidden
 ]}.

%% @doc Small objects to be synchronously deleted including blocks on
```
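For context, a sketch of what a complete cuttlefish mapping for this option might look like in riak_cs.schema. Only the option name and the default of 0 come from the PR description; the datatype and docstring wording here are assumptions.

```erlang
%% Sketch of a cuttlefish schema entry (assumed shape, not the PR's exact text).
%% @doc Objects smaller than this size have their blocks deleted synchronously
%% when the object is deleted; 0 disables active deletion.
{mapping, "active_delete_threshold", "riak_cs.active_delete_threshold", [
  {default, 0},
  {datatype, bytesize},
  hidden
]}.
```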
After some discussion, only blocks are deleted on the fly, so "including blocks" may lead to misunderstanding.
Should we mention lingering manifest here too?
In multiple delete, the following errors happened.
In riak_cs_gc, if the delete fsm dies before demonitor, this process gets a stray `'DOWN'` message.
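For reference, a common way to avoid being left with a stray `'DOWN'` message is to demonitor with the `flush` option; a minimal sketch, not necessarily how this PR resolves it (the function name is illustrative).

```erlang
%% Sketch: flush removes any 'DOWN' message for Ref already sitting in the
%% mailbox, so a late fsm death cannot confuse later receives in this process.
stop_watching_delete_fsm(Ref) ->
    true = erlang:demonitor(Ref, [flush]),
    ok.
```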
Executed basho_bench with the cs2 driver for this branch and found HTTP 400 responses. The gist [1] includes the details. Execution with redbug shows:
It seems like the commit b4af3a9 fixed the above HTTP 400 response too. Great 😄
Great work! I will run all riak_test cases and +1 after a few remaining comments are reflected 🎆
@shino updated.
```erlang
    {ok, Pid} = riak_cs_delete_fsm_sup:start_delete_fsm(node(), Args),
    Ref = erlang:monitor(process, Pid),
    receive
        {Pid, {ok, _}} ->
```
Pasting the previous comment, which was hidden by another change:
From riak_cs_delete_fsm.erl:
```erlang
deleting({block_deleted, {ok, BlockID}, DeleterPid},
         State=#state{deleted_blocks=DeletedBlocks}) ->
    UpdState = deleting_state_update(BlockID, DeleterPid, DeletedBlocks+1, State),
    ManifestState = UpdState#state.manifest?MANIFEST.state,
    deleting_state_result(ManifestState, UpdState);
deleting({block_deleted, {error, {unsatisfied_constraint, _, BlockID}}, DeleterPid},
         State=#state{deleted_blocks=DeletedBlocks}) ->
    UpdState = deleting_state_update(BlockID, DeleterPid, DeletedBlocks, State),
    ManifestState = UpdState#state.manifest?MANIFEST.state,
    deleting_state_result(ManifestState, UpdState);
```
This shows the fsm does not terminate after `unsatisfied_constraint` errors, but continues to work without incrementing `DeletedBlocks`.
Counterpart GC worker code is like this:
```erlang
handle_delete_fsm_reply({ok, {_, _, _, TotalBlocks, TotalBlocks}},
                        ?STATE{current_files=[CurrentManifest | RestManifests],
                               current_fileset=FileSet,
                               block_count=BlockCount} = State) ->
    ok = continue(),
    UpdFileSet = twop_set:del_element(CurrentManifest, FileSet),
    State?STATE{delete_fsm_pid=undefined,
                current_fileset=UpdFileSet,
                current_files=RestManifests,
                block_count=BlockCount+TotalBlocks};
handle_delete_fsm_reply({ok, {_, _, _, NumDeleted, _TotalBlocks}},
                        ?STATE{current_files=[_CurrentManifest | RestManifests],
                               block_count=BlockCount} = State) ->
    ok = continue(),
    State?STATE{delete_fsm_pid=undefined,
                current_files=RestManifests,
                block_count=BlockCount+NumDeleted};
```
It removes the manifest from the `twop_set` only if `DeletedBlocks` equals `TotalBlocks`.
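A sketch of how the synchronous-delete path could mirror that check on the fsm reply. The reply tuple shape follows the GC worker code above, while the function and accumulator names are assumptions.

```erlang
%% Sketch: treat the UUID as deleted only when every block was deleted,
%% mirroring handle_delete_fsm_reply/2 above. Names are illustrative.
collect_deleted_uuid({ok, {_, _, _, TotalBlocks, TotalBlocks}}, UUID, DeletedAcc) ->
    %% all blocks gone: safe to drop this UUID from the object history
    [UUID | DeletedAcc];
collect_deleted_uuid({ok, {_, _, _, _NumDeleted, _TotalBlocks}}, _UUID, DeletedAcc) ->
    %% some blocks remain: keep the manifest so GC can retry after the leeway period
    DeletedAcc.
```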
Added one comment (actually just a copy :) ). It will be the last.
Force-pushed from 0e88aa6 to fdd94bd
All riak_test passed. A great option for certain users 🎆
Synchronous deletion of small objects Reviewed-by: shino
@borshop merge
Addresses #1138.