VAULT-28677: Fix dangling entity-aliases in MemDB after invalidation #27750

marcboudreau · 2024-07-10T18:24:02Z

Description

This change corrects a regression that was introduced by #27184.

When an entity is modified in a storage bucket such that one or more of its aliases have been removed, the removed aliases were not being deleted from the MemDB table that contains them. This change fixes that by scanning the aliases of each entity found to be modified in the storage bucket and deleting from MemDB any alias that is no longer associated with the entity in the storage bucket.
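
As a rough illustration of the reconciliation described above, the following minimal Go sketch computes which aliases would need to be deleted from MemDB. The Alias and Entity types and the staleAliasIDs helper are hypothetical simplifications made up for this example; they are not Vault's actual types or the code in this PR.

```go
package main

import "fmt"

// Alias and Entity are simplified, hypothetical stand-ins for Vault's
// identity types; only the fields needed for this sketch are included.
type Alias struct {
	ID          string
	CanonicalID string
}

type Entity struct {
	ID      string
	Aliases []*Alias
}

// staleAliasIDs returns the IDs of aliases that are attached to the MemDB
// copy of an entity but are absent from the storage bucket copy.
func staleAliasIDs(memDBEntity, bucketEntity *Entity) []string {
	keep := make(map[string]struct{}, len(bucketEntity.Aliases))
	for _, a := range bucketEntity.Aliases {
		keep[a.ID] = struct{}{}
	}

	var stale []string
	for _, a := range memDBEntity.Aliases {
		if _, ok := keep[a.ID]; !ok {
			stale = append(stale, a.ID)
		}
	}
	return stale
}

func main() {
	memDB := &Entity{ID: "e1", Aliases: []*Alias{{ID: "a1"}, {ID: "a2"}}}
	bucket := &Entity{ID: "e1", Aliases: []*Alias{{ID: "a1"}}}

	// Alias a2 was removed from the entity in storage, so it is stale in MemDB.
	fmt.Println(staleAliasIDs(memDB, bucket)) // prints [a2]
}
```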

TODO only if you're a HashiCorp employee

Labels: If this PR is the CE portion of an ENT change, and that ENT change is
getting backported to N-2, use the new style backport/ent/x.x.x+ent labels
instead of the old style backport/x.x.x labels.
Labels: If this PR is a CE only change, it can only be backported to N, so use
the normal backport/x.x.x label (there should be only 1).
ENT Breakage: If this PR either 1) removes a public function OR 2) changes the signature
of a public function, even if that change is in a CE file, double check that
applying the patch for this PR to the ENT repo and running tests doesn't
break any tests. Sometimes ENT only tests rely on public functions in CE
files.
Jira: If this change has an associated Jira, it's referenced either
in the PR description, commit message, or branch name.
RFC: If this change has an associated RFC, please link it in the description.
ENT PR: If this change has an associated ENT PR, please link it in the
description. Also, make sure the changelog is in this PR, not in your ENT PR.

github-actions · 2024-07-10T18:28:39Z

CI Results:
All Go tests succeeded! ✅

github-actions · 2024-07-10T18:28:51Z

Build Results:
All builds succeeded! ✅

biazmoreira · 2024-07-16T12:52:22Z

vault/identity_store.go

+			}
+
+			// If the entity exists in MemDB it must differ from the entity in
+			// the storage bucket because of above test. Go through all of the


If the entity exists and it differs, are we assuming that the difference between them is always going to be the aliases? Why don't we simply delete the entity from memdb and add it again like the previous implementation did?

If the MemDB entity differs from its corresponding storage bucket entity, it may or may not be the aliases. But this makes me think we could detect if the Aliases slices are the same but that would entail walking the entire set and that's what the current algorithm is doing. I could add a test case where there are no changes to the Aliases to really make sure it works.

"But this makes me think we could detect if the Aliases slices are the same but that would entail walking the entire set and that's what the current algorithm is doing."

Wouldn't it be easier to just detect whether there are changes and replace one entity with the other, deleting from MemDB and adding again, without having to walk a slice and detect changes?

+1 to what Bianca is saying. Just chiming in so I can follow along 😄

I've pushed a change that follows this suggestion.
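
For readers following along, here is a minimal sketch of the difference between upserting alone and the "delete, then add again" approach suggested in this thread. The toyMemDB type and its upsert and replace methods are hypothetical and exist only for this example; the real code works against MemDB transactions in vault/identity_store.go. The one assumption carried over from the PR is that the upsert path only creates or updates aliases and never deletes them.

```go
package main

import "fmt"

// Alias and Entity are hypothetical, trimmed-down stand-ins for the
// identity types discussed in this thread.
type Alias struct{ ID, CanonicalID string }

type Entity struct {
	ID      string
	Aliases []*Alias
}

// toyMemDB is a hypothetical in-memory stand-in for the MemDB entity and
// alias tables.
type toyMemDB struct {
	entities map[string]*Entity
	aliases  map[string]*Alias
}

func newToyMemDB() *toyMemDB {
	return &toyMemDB{entities: map[string]*Entity{}, aliases: map[string]*Alias{}}
}

// upsert creates or updates records but never deletes any (assumption
// mirroring the behavior described in the PR).
func (db *toyMemDB) upsert(e *Entity) {
	db.entities[e.ID] = e
	for _, a := range e.Aliases {
		db.aliases[a.ID] = a
	}
}

// replace drops the existing copy of the entity together with all of its
// aliases, then upserts the new copy.
func (db *toyMemDB) replace(e *Entity) {
	if old, ok := db.entities[e.ID]; ok {
		for _, a := range old.Aliases {
			delete(db.aliases, a.ID)
		}
		delete(db.entities, e.ID)
	}
	db.upsert(e)
}

// seeded returns a table holding entity e1 with aliases a1 and a2.
func seeded() *toyMemDB {
	db := newToyMemDB()
	db.upsert(&Entity{ID: "e1", Aliases: []*Alias{{ID: "a1"}, {ID: "a2"}}})
	return db
}

func main() {
	// The storage bucket copy no longer carries alias a2.
	updated := &Entity{ID: "e1", Aliases: []*Alias{{ID: "a1"}}}

	plain := seeded()
	plain.upsert(updated)
	fmt.Println(len(plain.aliases)) // 2: a2 is left dangling

	fixed := seeded()
	fixed.replace(updated)
	fmt.Println(len(fixed.aliases)) // 1: a2 is gone
}
```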

biazmoreira · 2024-07-23T14:27:36Z

vault/identity_store.go

+			// function does not delete those aliases, it only creates missing
+			// ones.
+			if memDBEntity != nil {
+				if err := i.deleteAliasesInEntityInTxn(txn, memDBEntity, memDBEntity.Aliases); err != nil {


@marcboudreau @elliesterner brought up a valid point while we were talking about entity merge prevention. Do you think not deleting the entity from MemDB might cause an automatic merge to be triggered? If you could write a test for that, that would be awesome. We would like to prevent further merges from happening.

From looking at the code in (*IdentityStore).upsertEntityInTxn, there are 2 circumstances that lead to (*IdentityStore).mergeEntityAsPartOfUpsert being called:

1. previousEntity is not nil, and entity has an alias in its Aliases field that exists in MemDB and whose CanonicalID field is set to the value of previousEntity.ID.

2. entity has an alias in its Aliases field that exists in MemDB and whose CanonicalID field is set to a value that is different than entity.ID.

In the (*IdentityStore).invalidateEntityBucket function, when upsertEntityInTxn is called, the previousEntity argument is always nil, so that rules out circumstance 1.

And by pre-deleting the aliases, we ensure that circumstance 2 cannot happen either.

For the sake of clarity, pre-deleting the entity from MemDB and then calling (*IdentityStore).upsertEntityInTxn won't prevent an entity merge from happening, since the logic that decides that doesn't take into account whether the entity exists in MemDB or not. I think the only way to prevent an entity merge from happening would be to scan each of the aliases associated with the entity (instead of pre-deleting them), search for any alias in MemDB with a matching alias name and mount accessor, and delete those. That would make it impossible for circumstance 2 to happen.
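
To make the two circumstances above concrete, here is a minimal sketch of them as a predicate, plus the "delete matching aliases first" idea. Everything in it is hypothetical: the aliasTable type, the wouldMerge and purgeMatching helpers, and in particular the choice to look aliases up by mount accessor and name are assumptions for illustration, not the logic in (*IdentityStore).upsertEntityInTxn.

```go
package main

import "fmt"

// Alias and Entity are hypothetical, trimmed-down stand-ins.
type Alias struct {
	ID, CanonicalID, Name, MountAccessor string
}

type Entity struct {
	ID      string
	Aliases []*Alias
}

// aliasKey identifies an alias by mount accessor and name (an assumption
// made for this sketch).
type aliasKey struct{ mountAccessor, name string }

// aliasTable is a hypothetical stand-in for the MemDB alias table.
type aliasTable map[aliasKey]*Alias

// wouldMerge reports whether either circumstance from the comment holds.
func wouldMerge(t aliasTable, entity, previousEntity *Entity) bool {
	for _, a := range entity.Aliases {
		existing := t[aliasKey{a.MountAccessor, a.Name}]
		if existing == nil {
			continue
		}
		// Circumstance 1: the alias already belongs to previousEntity.
		if previousEntity != nil && existing.CanonicalID == previousEntity.ID {
			return true
		}
		// Circumstance 2: the alias belongs to some other entity.
		if existing.CanonicalID != entity.ID {
			return true
		}
	}
	return false
}

// purgeMatching deletes every table entry that matches one of the entity's
// aliases by mount accessor and name.
func purgeMatching(t aliasTable, entity *Entity) {
	for _, a := range entity.Aliases {
		delete(t, aliasKey{a.MountAccessor, a.Name})
	}
}

func main() {
	table := aliasTable{
		aliasKey{"auth_github_1", "alice"}: {ID: "a9", CanonicalID: "e2", Name: "alice", MountAccessor: "auth_github_1"},
	}
	entity := &Entity{ID: "e1", Aliases: []*Alias{
		{ID: "a1", CanonicalID: "e1", Name: "alice", MountAccessor: "auth_github_1"},
	}}

	fmt.Println(wouldMerge(table, entity, nil)) // true: circumstance 2 applies
	purgeMatching(table, entity)
	fmt.Println(wouldMerge(table, entity, nil)) // false: no matching alias remains
}
```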

elliesterner

looks great!!

banks · 2024-07-29T16:49:24Z

vault/identity_store.go

+			// If this is a performance secondary, the entity created on
+			// this node would have been cached in a local cache based on
+			// the result of the CreateEntity RPC call to the primary
+			// cluster. Since this invalidation is signaling that the
+			// entity is now in the primary cluster's storage, the locally
+			// cached entry can be removed.
+			if i.localNode.ReplicationState().HasState(consts.ReplicationPerformanceSecondary) && i.localNode.HAState() == consts.Active {
+				if err := i.localAliasPacker.DeleteItem(ctx, bucketEntity.ID+tmpSuffix); err != nil {
+					i.logger.Error("failed to clear local alias entity cache", "error", err, "entity_id", bucketEntity.ID)
+					return


This whole dance is fascinating. I'm a little curious how you discovered it here - it seems like this being missing is an unrelated bug to the regression right?

@biazmoreira you probably know all about this already, is this another place that breaks the mental model of all global writes go to primary because we updated memdb with a global thing outside of replication? I think we saw places like that with standbys but this was new to me that we have perf secondaries updating their memdb outside of replication for replicated state.

For anyone who reads this later, I realised this wasn't new code - just moved down from further up (see lines 739 and on in the before part of this diff).

Marc Boudreau added 2 commits July 10, 2024 11:20

properly cleanup aliases no longer in entity during invalidation

b217021

test: verify proper alias removal from entity in invalidation

a230738

marcboudreau added the backport/ent/1.16.x+ent and backport/1.17.x labels Jul 10, 2024

marcboudreau added this to the 1.17.3 milestone Jul 10, 2024

github-actions bot added the hashicorp-contributed-pr label Jul 10, 2024

add changelog entry

4ba821e

document dangling entity-alias known issue

60210bf

marcboudreau requested a review from a team as a code owner July 10, 2024 19:31

vercel bot deployed to Preview July 10, 2024 19:39 View deployment

Marc Boudreau added 2 commits July 10, 2024 16:10

improve entity-alias delete test

6b5c05d

fixup! document dangling entity-alias known issue

2669a06

vercel bot deployed to Preview July 12, 2024 17:44 View deployment

biazmoreira self-requested a review July 15, 2024 16:04

biazmoreira reviewed Jul 16, 2024

use simpler approach to reconcile entity aliases in invalidation

b2a7812

biazmoreira reviewed Jul 22, 2024

biazmoreira approved these changes Jul 23, 2024

adjust comment to match previous code change

f7e47ff

biazmoreira reviewed Jul 23, 2024

Marc Boudreau added 2 commits July 24, 2024 14:52

add test covering local aliases

eb7df5f

pre-delete changed entity in invalidation

6e7aea6

elliesterner approved these changes Jul 24, 2024

marcboudreau merged commit a41c21b into main Jul 25, 2024
84 checks passed

marcboudreau deleted the marcboudreau/fix-merge-entities-on-invalidate branch July 25, 2024 19:36

hc-github-team-secure-vault-core mentioned this pull request Jul 25, 2024

Backport of VAULT-28677: Fix dangling entity-aliases in MemDB after invalidation into release/1.17.x #27870

Merged

6 tasks

banks reviewed Jul 29, 2024


biazmoreira Jul 16, 2024

marcboudreau Jul 16, 2024

biazmoreira Jul 17, 2024

elliesterner Jul 19, 2024

marcboudreau Jul 22, 2024

biazmoreira Jul 23, 2024

marcboudreau Jul 23, 2024 (edited)

marcboudreau Jul 23, 2024

elliesterner left a comment

banks Jul 29, 2024

banks Sep 9, 2024
