[Event-based entry cache] replayMissedEvents queries DB in loop for every missed ID #5342

chiragk25 · 2024-07-30T22:38:41Z

In the event-based cache implementation, replayMissedEvents queries the DB in a loop, once for every missed ID. Since this is invoked every time we update the cache, i.e. every cacheReloadInterval seconds, it can cause unintended load on the DB if there are a lot of missed events (e.g. due to a canceled transaction).

Ideally we should query all the missed events at once (or batch with pagination) and process them instead of querying for each missed event.
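
A minimal sketch of the batching idea, assuming the missed event IDs are already held in a slice. The name `chunkIDs` and the `uint` ID type are assumptions for illustration, not taken from the SPIRE code. Each returned batch would then be fetched with one query instead of one query per ID.

```go
package main

// chunkIDs splits ids into consecutive batches of at most size elements.
// It is a hypothetical helper: the real cache code may track missed
// events in a different structure. The returned batches alias the input
// slice, so callers should treat them as read-only.
func chunkIDs(ids []uint, size int) [][]uint {
	if size <= 0 {
		return nil
	}
	var batches [][]uint
	for start := 0; start < len(ids); start += size {
		end := start + size
		if end > len(ids) {
			end = len(ids)
		}
		batches = append(batches, ids[start:end])
	}
	return batches
}
```

With five missed IDs and a batch size of two, this yields three batches, so the DB sees three queries per refresh rather than five.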

Version: 01bedb8
Platform: x86_64 GNU/Linux
Subsystem: server

amoore877 · 2024-07-31T01:01:17Z

As a further enhancement, I'd say that the batch query for missed events should also return unique foreign keys, so that the follow-on queries do not make duplicate requests when multiple missed events correlate to the same entity.
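
One way to picture the deduplication, as a sketch only: the `event` struct and its `EventID` and `EntryID` fields are hypothetical stand-ins for whatever the batch query returns, and `uniqueEntryIDs` is an illustrative name.

```go
package main

// event is a hypothetical stand-in for one row returned by the batch
// query: an event ID plus the foreign key of the entity it refers to.
type event struct {
	EventID uint
	EntryID string
}

// uniqueEntryIDs returns each EntryID once, in order of first appearance,
// so that follow-on fetches hit every affected entity a single time.
func uniqueEntryIDs(events []event) []string {
	seen := make(map[string]struct{}, len(events))
	ids := make([]string, 0, len(events))
	for _, e := range events {
		if _, ok := seen[e.EntryID]; ok {
			continue // already scheduled for fetching
		}
		seen[e.EntryID] = struct{}{}
		ids = append(ids, e.EntryID)
	}
	return ids
}
```

The same effect could be had in SQL with a DISTINCT over the foreign key column; doing it in code keeps the per-event rows available if they are still needed.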

stevend-uber · 2024-07-31T21:00:07Z

Spoke to @edwbuck and @sorindumitru over Slack. Edwin is planning on tackling this, optimizing away the multiple transaction loops that the query currently does, by implementing:

SELECT * WHERE EVENT_ID IN (id1, id2, ..., idn)

We should also support @sorindumitru's suggestion:

You could select event ids from a list and go through all of them. Probably with a limit to how many events you want to pull in one go.

This ensures that the list doesn't get too long and create a long-running transaction.
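
A hedged sketch of how such a bounded `IN (...)` query might be assembled for one batch of IDs. The function name `buildEventQuery`, the `table` parameter, and the `event_id` column name are assumptions for illustration. The `?` placeholder style matches MySQL and SQLite; PostgreSQL would need `$1, $2, ...` instead.

```go
package main

import "strings"

// buildEventQuery returns a parameterized query and its arguments for one
// batch of event IDs. The table name is concatenated into the SQL, so it
// must come from trusted code, never from user input. The caller is
// expected to keep len(ids) under whatever limit is chosen so the query
// and its transaction stay short.
func buildEventQuery(table string, ids []uint) (string, []any) {
	if len(ids) == 0 {
		return "", nil
	}
	// One "?" per ID, e.g. "?, ?, ?" for three IDs.
	placeholders := strings.TrimSuffix(strings.Repeat("?, ", len(ids)), ", ")
	args := make([]any, len(ids))
	for i, id := range ids {
		args[i] = id
	}
	query := "SELECT * FROM " + table + " WHERE event_id IN (" + placeholders + ")"
	return query, args
}
```

The returned pair is shaped so it could be passed to `database/sql`'s `db.QueryContext(ctx, query, args...)`.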

edwbuck · 2024-08-02T13:48:49Z

@amoore877 Please disregard the prior deleted post, I misunderstood which items you were referring to.

We are processing events, and it is quite possible that two events validly update one record. For example, I might alter an entry's SPIFFE ID, and then decide to also alter the entry's selectors, after verifying that the SPIFFE ID was updated. In such cases, we expect two DB events, both referring to one SPIRE entry.

Currently, we do not collapse the two events into one update, because the logic to do so would add complexity, and we believe the scenarios where this occurs are rare. With that in mind, I'll create a new issue to cover the scenario, as "multi-issues" are difficult to get scheduled, worked, and completed (the last concern holds up the completed concerns).

For those who might be skimming: this embedded request does not affect the open transaction / skipped DB event ID logic. It is therefore not about optimizing any polling, but about optimizing the single batch of fetches after an event is detected, so that an affected entry is fetched once when it has seen two or more events in one cache refresh cycle.

edwbuck · 2024-08-02T14:01:09Z

@amoore877 #5349 should cover the optimization you requested, please track it there, and let's keep this open for the issue of replaying the event id polling when a transaction is cancelled.

edwbuck · 2024-09-10T15:36:52Z

@amartinezfayo Please assign this issue to me.

amartinezfayo · 2024-11-05T18:29:34Z

We will be looking at the implementation of a different algorithm for the events-based cache event tracking in #5624.
The new algorithm should solve existing issues related to the events-based cache. Once the algorithm described in #5624 is implemented, we can go back to this issue and figure out if any additional work is needed.

MarcosDY added the triage/in-progress label Aug 1, 2024

amoore877 mentioned this issue Aug 2, 2024

Filter cache update reads to only read each entry once, when an entry has two events in the same refresh cycle. #5349

Closed

azdagron added this to the 1.10.2 milestone Aug 6, 2024

azdagron added the priority/backlog label and removed the triage/in-progress label Aug 6, 2024

edwbuck mentioned this issue Aug 16, 2024

Add comments to events based cache code #5327

Merged

amartinezfayo modified the milestones: 1.10.2, 1.11.0 Aug 22, 2024

edwbuck mentioned this issue Sep 6, 2024

Full missed-event-reconciliation based on SQLTransactionTimeout Proposal #5470

Closed

amartinezfayo assigned edwbuck Sep 10, 2024

rturner3 modified the milestones: 1.11.0, 1.11.1 Oct 3, 2024

amartinezfayo removed this from the 1.11.1 milestone Nov 5, 2024
