option to disable caching field names to avoid schema mismatches #5572
Signed-off-by: Paul Hemberger <[email protected]>
I've been wanting to make this behavior permanent. Is there a way you can benchmark to see if there's a performance difference with vs without field caching? I think the difference will be negligible enough that we can run without caching always. The win will be the avoidance of yet another flag.
If you don't have the bandwidth, I'll approve this PR.
We don't have formal benchmarks, but we're planning on rolling it out internally on our end this week. We'll update with how the broader before / after numbers look.
If the benchmarks are inconclusive, you could change the default of this option so that caching is disabled. If we don't hear any negative feedback for a release, we can remove the option. The unfortunate part about this option is that users will not know it is available unless they have a lot of expertise.
Sounds good -- I also realized that this particular flag conflicts with the consolidator code path (if this flag is off, the consolidator is off). We have also disabled the consolidator so that doesn't affect us, but for this PR we'll want to untangle those options.
If the benchmarks are inconclusive, you could change the default of this
option so that caching is disabled. If we don't hear any negative feedback
for a release, we can remove the option
As a general rule, we should not change any default behavior that might
affect performance in fundamentally unknowable ways, without a significant
period of time and explicit clear deprecation warnings to give users time
to try out the change in their environment.
I'm not opposed to a plan that moves us towards deprecation eventually, but
we should definitely not change the default option to disabled initially.
IMO the best approach would be to add the option, announce the deprecation,
then later remove the underlying cache (keeping the flag as a no-op for
backwards compatibility), then eventually remove the flag altogether.
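As a rough illustration of the "keep the flag as a no-op" step, here is a minimal Go sketch using the standard flag package. The flag name -enable-field-cache and the parseArgs helper are hypothetical and are not taken from this PR or from the Vitess codebase; the point is only that the flag keeps parsing, has no effect, and warns when someone still sets it.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseArgs accepts a hypothetical -enable-field-cache flag that no longer
// controls anything. It returns one warning for each deprecated flag that
// the caller explicitly set.
func parseArgs(args []string) ([]string, error) {
	fs := flag.NewFlagSet("tablet", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text out of the way in this sketch

	// The value is parsed and then ignored: the flag is a no-op.
	_ = fs.Bool("enable-field-cache", true, "DEPRECATED: no-op, kept for backwards compatibility")

	if err := fs.Parse(args); err != nil {
		return nil, err
	}

	var warnings []string
	// Visit only walks flags that were actually set on the command line.
	fs.Visit(func(f *flag.Flag) {
		if f.Name == "enable-field-cache" {
			warnings = append(warnings, "-enable-field-cache is deprecated and has no effect")
		}
	})
	return warnings, nil
}

func main() {
	warnings, err := parseArgs([]string{"-enable-field-cache=false"})
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, w := range warnings {
		fmt.Println("warning:", w)
	}
}
```

Existing deployments that still pass the flag keep starting up, and the warning gives them a release or two to drop it before it is removed altogether.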
I agree with @demmer's comment. Although introducing the new flag is more convoluted, it's better to be more cautious about breaking changes.
We've had several incidents where schema migrations + the query plan cache can combine to return results with the wrong column/field names until the cache is refreshed.
ex:
1. SELECT * FROM table WHERE id = something is executed and cached with field names (foo, bar, baz)
2. ptosc or ghost migration completes to drop bar
3. SELECT * FROM table WHERE id = something is executed again, and the tablet grabs the field names from the cache, which includes bar. However, MySQL only returned data for two columns, (foo, baz)
From my understanding, this mismatch can happen when:
ghost or ptosc
The simplest & most expedient thing I could think of to circumvent this problem entirely is to just stop caching field names with the query plans, which is what this PR enables. Looking for feedback on the changes, and I'm curious if others have run into these issues as well.
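To make the failure mode concrete, here is a small self-contained Go sketch of the scenario above. It is a toy model, not the tablet's actual code: fakeMySQL, planCache, the cacheFields switch, and fieldsAfterDrop are all hypothetical names invented for illustration. With caching on, the second execution reuses the field names recorded before the migration; with caching off, the field names are read fresh each time.

```go
package main

import "fmt"

// fakeMySQL stands in for the database: it only tracks the table's current columns.
type fakeMySQL struct {
	columns []string
}

// describe returns a copy of the current column names, as a fresh query would.
func (db *fakeMySQL) describe() []string {
	return append([]string(nil), db.columns...)
}

// planCache is a hypothetical per-query cache. cacheFields mimics the kind of
// option this PR proposes: when false, field names are never cached.
type planCache struct {
	cacheFields bool
	fields      map[string][]string
}

func newPlanCache(cacheFields bool) *planCache {
	return &planCache{cacheFields: cacheFields, fields: map[string][]string{}}
}

// fieldsFor returns the field names that would be attached to the query's result.
func (c *planCache) fieldsFor(db *fakeMySQL, query string) []string {
	if !c.cacheFields {
		return db.describe() // always ask the database
	}
	if cached, ok := c.fields[query]; ok {
		return cached // may be stale after an out-of-band migration
	}
	fresh := db.describe()
	c.fields[query] = fresh
	return fresh
}

// fieldsAfterDrop runs the query, drops column bar behind the cache's back,
// runs the query again, and returns the field names of the second run.
func fieldsAfterDrop(cacheFields bool) []string {
	db := &fakeMySQL{columns: []string{"foo", "bar", "baz"}}
	cache := newPlanCache(cacheFields)
	const query = "SELECT * FROM table WHERE id = something"

	cache.fieldsFor(db, query)          // first execution populates the cache
	db.columns = []string{"foo", "baz"} // migration completes and drops bar
	return cache.fieldsFor(db, query)   // second execution
}

func main() {
	fmt.Println("caching on: ", fieldsAfterDrop(true))
	fmt.Println("caching off:", fieldsAfterDrop(false))
}
```

In this model the cached run still reports bar even though the table now has only foo and baz, which is the mismatch described above; the uncached run reports the two remaining columns.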
Signed-off-by: Paul Hemberger [email protected]