Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Courey/update table stream #2781

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

Courey
Copy link
Contributor

@Courey Courey commented Dec 12, 2024

This change updates the table-stream to account for the past synonymId when a synonym group is deleted.

Testing

This is very challenging to test. It does not work as expected when using the moderator interface locally because table-streams are not supported locally.

I tested it by running the event via the invoker.
The challenge with this method is that I don't think it takes into account the state of dynamodb. This means that during testing with the invoker, the original dynamodb record is still in dynamo. This breaks the functionality for the table-stream because it is checking to make sure nothing still exists in the original synonymId's group. But if the record in dynamodb is not updated, that check will find the existing record and skip the delete.

I added in this line to work around that:
filter((synonym) => synonym.eventId != eventId)

It feels a little odd because in order to trigger this table-stream that original record should have changed already. If I understand correctly, the order of operations is the dynamodb change is made which then triggers the table-stream. So that record should not exist with that synonymId in dynamo when this runs.

It shouldn't hurt to have that filter on there, but I'm unsure if it's necessary because of limitations in local testing.

This is the result of running a modifiy:

{
  body: {
    _index: 'synonym-groups',
    _id: '1aa5892d-4cea-4391-9750-e5cf681bdfd4',
    _version: 2,
    result: 'deleted',
    _shards: { total: 2, successful: 1, failed: 0 },
    _seq_no: 123,
    _primary_term: 1
  },
  statusCode: 200,
  headers: {
    'x-elastic-product': 'Elasticsearch',
    'content-type': 'application/json',
    'content-length': '184'
  },
  meta: {
    context: null,
    request: { params: [Object], options: {}, id: 2 },
    name: 'opensearch-js',
    connection: {
      url: 'http://localhost:9200/',
      id: 'http://localhost:9200/',
      headers: {},
      deadCount: 0,
      resurrectTimeout: 0,
      _openRequests: 0,
      status: 'alive',
      roles: [Object]
    },
    attempts: 0,
    aborted: false
  }
}

EDITED TO ADD:
I'd like to explain the updates so everyone is clear on the cases involved.
originally the synonyms were deleted if everything was removed and when a synonym was removed from a group, no new synonym was created. We changed this when we made it so that every eventId has a synonym record and I didn't update the table sync.

code walk through

cases:

  1. If a synonym is created initially, there are no existing records to update and there would be no previous synonymId involved.
  2. if a synonym is removed from a group, it needs to not only update the new synonym record, but it also needs to be removed from the old synonym record.
  3. If the synonym is the last one remaining in a group and that group has been deleted (so all the members get new synonymIds) then the opensearch record for that group needs to be removed while also creating the new records for the synonyms.

here is a breakdown of the code for the handler:

    const { synonymId, eventId } = unmarshallTrigger(
      dynamodb!.NewImage
    ) as Synonym
    const dynamoSynonyms = await getSynonymsByUuid(synonymId)

the updated or new synonym comes through. We get any other dynamo records that belong in the group

    const opensearchSynonym = await opensearchKeywordSearch({ eventId })

here I am checking to see if there is an existing opensearch record for that eventId. This is because in the update for dynamo, the eventId doesn't change, but the synonymId does. There is nothing retaining the knowledge of if there was a previous group involved in this update and what that previous synonymId would be. So to get that, we have to look at opensearch to see if there is an existing record for this eventId

    const previousSynonymId = opensearchSynonym
      ? opensearchSynonym.synonymId
      : null
    const dynamoPreviousGroup = previousSynonymId
      ? (await getSynonymsByUuid(previousSynonymId)).filter(
          (synonym) => synonym.eventId != eventId
        )
      : []

If there is a previous synonymId, it is stored in the previousSynonym variable. That is used to determine if we need to get the members of that previous group. If there was a previous synonymId involved, we look to see if there are other members of the group left. We do this because we need to update the existing openSearch record from the previous group in addition to updating the current synonym group.

    if (dynamoPreviousGroup.length === 0 && previousSynonymId) {
      await removeIndex(previousSynonymId)
    } else  if (previousSynonymId && dynamoPreviousGroup.length > 0){
      await putIndex({
        synonymId: previousSynonymId,
        eventIds: dynamoPreviousGroup.map((synonym) => synonym.eventId),
        slugs: dynamoPreviousGroup.map((synonym) => synonym.slug),
      })
    }

If there is a previous group, and it has no members left, it needs to remove the previous synonym record from opensearch.
If there is a previous group and it still has members, it needs to be updated to remove the current eventId from the opensearch record.

    if (dynamoSynonyms.length > 0) {
      await putIndex({
        synonymId,
        eventIds: dynamoSynonyms.map((synonym) => synonym.eventId),
        slugs: dynamoSynonyms.map((synonym) => synonym.slug),
      })
    } else {
      await removeIndex(synonymId)
    }
  }
)

This part is updating the current synonym group. If there are any members in the current group, then the opensearch record is updated with the current members.

If there are no members of the current group (which shouldn't really happen since we create a new synonym record when we update a synonymId on a dynamo record) i left this in so if there were no members of the group it would be removed.

test walk through

I updated the unit tests for this. These kinds of tests are heavily mocked, so the test is only as good as the mock is.
There are only five cases under test here.

  1. an initial synonym record is created. So no previous synonym group exists so there is nothing to update except the current group.
  2. an eventId is removed from it's existing group and placed into a new group. In this case it has to do an update to the new group and the old group is removed because it has no members.
  3. moving an eventId into existing synonym group with removal from old group with remaining members, so both are updated
  4. moving an eventId into new synonym group with removal from old group with remaining members so both the new and old opensearch records are updated
  5. insert into new synonym group with removal from old group without remaining members so the new record is updated and the old is deleted.

@Courey Courey requested review from lpsinger and dakota002 December 12, 2024 19:46
Copy link

codecov bot commented Dec 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 6.08%. Comparing base (909af08) to head (3dc65fb).
Report is 4 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff            @@
##            main   #2781      +/-   ##
========================================
- Coverage   6.08%   6.08%   -0.01%     
========================================
  Files        171     171              
  Lines       4340    4342       +2     
  Branches     476     478       +2     
========================================
  Hits         264     264              
- Misses      4074    4075       +1     
- Partials       2       3       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Courey Courey force-pushed the courey/update_table_stream branch from b7b818f to 4e4a9cb Compare December 13, 2024 01:05
@lpsinger
Copy link
Member

This is very challenging to test. It does not work as expected when using the moderator interface locally because table-streams are not supported locally.

This is, indeed, a longstanding pain point. It may be worth investing some effort with helping upstream to add support for table streams: architect/architect#1140

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants