Added short-term caching of the first column in the channel manifest #180

Merged 2 commits on Jun 20, 2018

Conversation

billkalter
Contributor

GitHub Issue

None

What Are We Doing Here?

An inefficiency was uncovered when the head of the "manifest" table for queues and databus subscriptions has accrued many tombstones that have not yet been compacted away. Since every poll re-reads the manifest from the beginning, the entire poll is slowed down as Cassandra scans over the tombstoned records. This PR briefly caches the oldest known slab ID in the manifest, minus a 1 minute buffer. Future queries can start at this slab ID to bypass any tombstones from older, fully read and deleted slabs.
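
As a rough illustration of the idea only (not the code in this PR), the sketch below keeps a short-lived, per-channel starting point for manifest scans. It assumes a slab's position can be modeled as a millisecond timestamp. The class name `ManifestHeadCache`, its methods, and the TTL passed to the constructor are all hypothetical; only the 1 minute buffer comes from the description above.

```java
import java.util.Map;
import java.util.OptionalLong;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: remembers, per channel, where a manifest scan may start. */
class ManifestHeadCache {
    // Buffer for out-of-order writes; 1 minute, as in the PR description.
    static final long BUFFER_MILLIS = 60_000L;

    private static final class Entry {
        final long startMillis;      // oldest slab time minus the buffer
        final long expiresAtMillis;  // entry is ignored from this time on

        Entry(long startMillis, long expiresAtMillis) {
            this.startMillis = startMillis;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final long ttlMillis; // assumed value; the PR only says "briefly"

    ManifestHeadCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    /** Returns the scan start for a channel, or empty to scan from the beginning. */
    OptionalLong startFor(String channel, long nowMillis) {
        Entry e = entries.get(channel);
        if (e == null || nowMillis >= e.expiresAtMillis) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(e.startMillis);
    }

    /** Records the oldest slab seen by a scan unless an unexpired entry exists. */
    void recordOldestSlab(String channel, long oldestSlabMillis, long nowMillis) {
        entries.compute(channel, (k, current) ->
                (current != null && nowMillis < current.expiresAtMillis)
                        ? current
                        : new Entry(oldestSlabMillis - BUFFER_MILLIS,
                                    nowMillis + ttlMillis));
    }
}
```

Because an entry expires after a short time, a poll periodically falls back to a full scan from the head of the manifest, so a stale starting point cannot hide manifest rows indefinitely.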

How to Test and Verify

There is no test specifically for this condition. The most important test is regression.

Risk

This is a fairly low-risk update. Even though it is at the heart of databus and queue channels, the caching should serve as an optimization without risking that any manifest data goes completely unread.

Level

Medium

Required Testing

Regression

Code Review Checklist

  • Tests are included. If not, make sure you leave us a line or two for the reason.

  • Pulled down the PR and performed verification of at least being able to
    build and run.

  • Well documented, including updates to any necessary markdown files. When
    we inevitably come back to this code it will only take hours to figure out, not
    days.

  • Consistent/Clear/Thoughtful? We are better with this code. We also aren't
    a victim of rampaging consistency, and should be using this course of action.
    We don't have coding standards out yet for this project, so please make sure to address any feedback regarding STYLE so the codebase remains consistent.

  • PR has a valid summary, and a good description.

try {
    // Subtract 1 minute from the slab ID to allow for a reasonable window of out-of-order writes while
    // constraining the number of tombstones read to 1 minute's worth of rows.
    _oldestSlab.get(channel, () ->
Contributor

@sujithvaddi sujithvaddi Jun 20, 2018

@billkalter should it be put() here instead of get()?

Contributor Author

Agreed, this is very confusing and I tried to put a comment to clarify. With the newer Java interfaces like ConcurrentMap you can use computeIfAbsent() or putIfAbsent(). The Guava cache interface doesn't have a similar method, but if you do a get() this way it does the same thing: caches the new value only if there is no current un-expired version in the cache. From their docs:

This method provides a simple substitute for the conventional "if cached, return; otherwise create, cache and return" pattern.
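
To make the "cache only if absent" behavior concrete, here is a minimal sketch using `ConcurrentMap.computeIfAbsent()` from the JDK, which the comment above names as the analogous method. Guava's `Cache.get(key, Callable)` is not used here so that the snippet runs without extra dependencies, and this map has no expiry. The class `GetOrLoadDemo` and its members are hypothetical names for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical demo of "if cached, return; otherwise create, cache and return". */
class GetOrLoadDemo {
    final AtomicInteger loads = new AtomicInteger(); // counts loader invocations
    final ConcurrentMap<String, Long> cache = new ConcurrentHashMap<>();

    /** Returns the cached value for the channel, storing the candidate only on a miss. */
    long getOrLoad(String channel, long candidate) {
        return cache.computeIfAbsent(channel, k -> {
            loads.incrementAndGet();
            return candidate;
        });
    }
}
```

Calling `getOrLoad("q", 1L)` and then `getOrLoad("q", 2L)` on the same instance returns 1 both times and runs the loader once. A plain `put()` would instead overwrite the first value on the second call.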

Contributor

@sujithvaddi sujithvaddi left a comment

@billkalter looks good.

@billkalter billkalter merged commit 98d6195 into bazaarvoice:master Jun 20, 2018
@billkalter billkalter deleted the cache-manifest-head branch June 20, 2018 20:39
2 participants