planner: improve row count estimation for index range containing correlated columns #9738
/run-all-tests
Codecov Report
@@               Coverage Diff               @@
##             master     #9738       +/-   ##
===============================================
- Coverage   67.2181%   67.2101%   -0.008%
===============================================
  Files           381        381
  Lines         79846      79851        +5
===============================================
- Hits          53671      53668        -3
- Misses        21389      21393        +4
- Partials       4786       4790        +4
planner/core/logical_plans.go
Outdated
hist, ok := ds.statisticTable.Columns[col.ID]
var ndv float64
if ok && hist.Count > 0 {
	factor := float64(ds.statisticTable.Count) / float64(hist.Count)
Why ds.statisticTable.Count? This factor does not consider the selectivity of previous index conditions.
} else {
	profile.Cardinality[i] = profile.RowCount * distinctFactor
}
profile.Cardinality[i] = ds.getColumnNDV(col.ID)
this seems incorrect, should it be:
hist, ok := ds.statisticTable.Columns[colID]
if ok && hist.Count > 0 {
profile.Cardinality[i] = float64(hist.Count)
} else {
profile.Cardinality[i] = ds.statisticTable.Count
}
No, cardinality stores NDV, not the total row count.
it's better to s/Cardinality/NDV/ to eliminate ambiguity
LGTM
LGTM
What problem does this PR solve?
Fix #9722
What is changed and how it works?
After appending col = correlation_col to the access conditions, adjust the estimated row count by multiplying it by a factor for each appended condition; the factor is computed as 1 / NDV.
Check List
Tests
Code changes
N/A
Side effects
N/A
Related changes
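
To illustrate the adjustment described under "What is changed and how it works?", here is a minimal standalone sketch of multiplying an estimated row count by 1 / NDV for each appended condition. It is not the code from this PR: adjustRowCount and its parameters are hypothetical names chosen for illustration, and skipping columns with a non-positive NDV is an assumption about how missing statistics might be handled.

```go
package main

import "fmt"

// adjustRowCount is a hypothetical helper, not part of TiDB. It scales an
// estimated row count by 1/NDV once for every appended
// `col = correlation_col` condition. ndvs holds the number of distinct
// values of each appended column.
func adjustRowCount(rowCount float64, ndvs []float64) float64 {
	for _, ndv := range ndvs {
		if ndv <= 0 {
			// Assumption: without usable statistics for this column,
			// leave the estimate unchanged rather than divide by zero.
			continue
		}
		rowCount /= ndv
	}
	return rowCount
}

func main() {
	// The original index range is estimated at 10000 rows; two conditions
	// are appended on columns with 100 and 5 distinct values.
	fmt.Println(adjustRowCount(10000, []float64{100, 5}))
}
```

With these inputs the estimate shrinks from 10000 to 10000 / 100 / 5 = 20 rows. Because the factor uses the NDV rather than the column's total row count, it matches the point made in the review thread that cardinality stores NDV.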