Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

release-20.1: stats: don't delete stale cache entries, update them asynchronously instead #52191

Conversation

rytaft
Copy link
Collaborator

@rytaft rytaft commented Jul 31, 2020

Backport:

Please see individual PRs for details.

/cc @cockroachdb/release

rytaft added 5 commits July 31, 2020 10:05
…nstead

Prior to this commit, any time new statistics were added for a table, the
cache entry for that table was deleted so that the new statistics would be
fetched by the next query. This had the effect of forcing queries to wait
unnecessarily to get the freshest stats, even if slightly stale stats would
have produced the same query plan. For workloads with frequent stats updates,
contention on the system.table_statistics table could result in latency spikes.

This commit fixes the problem by performing a "refresh" rather than an
"invalidation" of the cache. Any time new stats are available, the freshest
stats are fetched asynchronously from the database and added to the cache when
they are available in memory. In the mean time, other queries can continue
using the stale stats without blocking.

Fixes cockroachdb#36365

Release note (performance improvement): Queries no longer block during planning
if cached table statistics have become stale and the new statistics have not
yet been loaded. Instead, the stale statistics are used for planning until the
new statistics have been loaded. This improves performance because it prevents
latency spikes that may occur if there is a delay in loading the new
statistics.
This commit fixes a flake caused by the new stats cache update mechanism,
which refreshes the stats asynchronously. The solution is to retry the
test query until the new stats are available in the cache.

Fixes cockroachdb#50863

Release note: None
This commit fixes a race condition that was introduced in cockroachdb#51616.
The race condition existed because it was possible for two refreshes
to be triggered in close succession, and there was no mechanism to
prevent the first refresh from overwriting the second if it was
delayed.

We fix the race condition by introducing two boolean flags into each
cache entry: refreshing and mustRefreshAgain. If refreshing is true,
that indicates that the current statistics for this table are stale,
and we are in the process of fetching the updated stats from the database.
In the mean time, other callers can use the stale stats and do not need to
wait.

If a goroutine tries to perform a refresh when a refresh is already
in progress, it will see that refreshing=true and will set the
mustRefreshAgain flag to true before returning. When the original
goroutine that was performing the refresh returns from the database and
sees that mustRefreshAgain=true, it will trigger another refresh.

Informs cockroachdb#50863

Release note: None
This commit fixes a flake in TestCacheWait, which was introduced
by cockroachdb#51616. That PR changed the call to InvalidateTableStats in
TestCacheWait to RefreshTableStats, but that change should not have
been made. The purpose of the test is to test that the cache invalidation
and waiting mechanisms work correctly, and therefore it must call
InvalidateTableStats, not RefreshTableStats.

Fixes cockroachdb#51712

Release note: None
This commit fixes a flake in TestQueryCache/group/statschange,
which was introduced by cockroachdb#51616. That PR made updates to the stats
cache asynchronous, so we can no longer expect the query cache to
be invalidated immediately after a stats update. This commit fixes
the problem by introducing a retry mechanism into the test.

Fixes cockroachdb#51693

Release note: None
@rytaft rytaft requested a review from RaduBerinde July 31, 2020 15:07
@rytaft rytaft requested a review from a team as a code owner July 31, 2020 15:07
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but let's not merge until Monday when 20.1.4 is out (just in case).

@rytaft
Copy link
Collaborator Author

rytaft commented Jul 31, 2020

Sounds good. TFTR!

@RaduBerinde RaduBerinde merged commit 039834c into cockroachdb:release-20.1 Aug 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants