planner: improve row count estimation for index range containing correlated columns #9738

Merged: 3 commits, Mar 19, 2019

Conversation

eurekaka
Contributor

What problem does this PR solve?

Fix #9722

What is changed and how it works?

After col = correlation_col conditions are appended to the access conditions, the estimated row count is adjusted by multiplying it by one factor per appended condition. Each factor is computed as 1 / NDV, where NDV is the number of distinct values of the column in that condition.
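
The adjustment described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the function name adjustRowCount and the sample numbers are assumptions made for the example.

```go
package main

import "fmt"

// adjustRowCount is a hypothetical helper (not taken from the PR). It scales
// an estimated row count by 1/NDV once for every appended
// "col = correlation_col" condition. ndvs holds one NDV per appended
// condition; non-positive NDVs are skipped to avoid dividing by zero.
func adjustRowCount(rowCount float64, ndvs []float64) float64 {
	for _, ndv := range ndvs {
		if ndv > 0 {
			rowCount /= ndv
		}
	}
	return rowCount
}

func main() {
	// Assumed numbers: the index range is estimated at 10,000 rows before
	// the correlated conditions, and two conditions are appended on columns
	// with NDV 100 and NDV 20.
	fmt.Println(adjustRowCount(10000, []float64{100, 20}))
}
```

With these assumed inputs the estimate drops from 10,000 rows to 10,000 / 100 / 20 = 5 rows.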

Check List

Tests

  • Integration test

Code changes

N/A

Side effects

N/A

Related changes

  • Need to cherry-pick to the release branch

eurekaka added the type/bugfix (This PR fixes a bug.) and sig/planner (SIG: Planner) labels on Mar 14, 2019
@eurekaka
Contributor Author

/run-all-tests

@codecov

codecov bot commented Mar 14, 2019

Codecov Report

Merging #9738 into master will decrease coverage by 0.0079%.
The diff coverage is 90.909%.

@@               Coverage Diff               @@
##             master      #9738       +/-   ##
===============================================
- Coverage   67.2181%   67.2101%   -0.008%     
===============================================
  Files           381        381               
  Lines         79846      79851        +5     
===============================================
- Hits          53671      53668        -3     
- Misses        21389      21393        +4     
- Partials       4786       4790        +4

hist, ok := ds.statisticTable.Columns[col.ID]
var ndv float64
if ok && hist.Count > 0 {
	factor := float64(ds.statisticTable.Count) / float64(hist.Count)
Contributor

Why ds.statisticTable.Count? This factor does not consider the selectivity of previous index conditions.
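
One way to read this concern, sketched with made-up numbers: a table-wide NDV can exceed the number of rows that survive the earlier index conditions, which pushes the estimate below one row. The cap shown here is only an illustrative assumption; the thread does not say it is what the PR adopted, and both function names are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// estimateWithTableNDV applies the 1/NDV factor using the table-wide NDV,
// ignoring how many rows earlier conditions already filtered out.
func estimateWithTableNDV(rowsAfterPrev, ndv float64) float64 {
	return rowsAfterPrev / ndv
}

// estimateWithCappedNDV is an illustrative alternative (an assumption, not
// the PR's code): the NDV within the surviving rows cannot exceed the number
// of surviving rows, so the NDV is capped before computing the factor.
func estimateWithCappedNDV(rowsAfterPrev, ndv float64) float64 {
	effective := math.Max(1, math.Min(ndv, rowsAfterPrev))
	return rowsAfterPrev / effective
}

func main() {
	// Assumed numbers: earlier index conditions leave 200 rows, while the
	// column has 50,000 distinct values across the whole table.
	fmt.Println(estimateWithTableNDV(200, 50000))
	fmt.Println(estimateWithCappedNDV(200, 50000))
}
```

Under these assumptions the table-wide factor yields 200 / 50,000 = 0.004 rows, while the capped variant yields 200 / 200 = 1 row.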

eurekaka requested a review from alivxxx on March 18, 2019 07:17
planner/core/stats.go Outdated
} else {
	profile.Cardinality[i] = profile.RowCount * distinctFactor
}
profile.Cardinality[i] = ds.getColumnNDV(col.ID)
Member

This seems incorrect. Should it be:

hist, ok := ds.statisticTable.Columns[colID]
if ok && hist.Count > 0 {
	profile.Cardinality[i] = float64(hist.Count)
} else {
	profile.Cardinality[i] = float64(ds.statisticTable.Count)
}

@winoros

Member

No, Cardinality stores the NDV (number of distinct values), not the total row count.

Member

It would be better to rename Cardinality to NDV (s/Cardinality/NDV/) to eliminate the ambiguity.

Contributor

alivxxx left a comment

LGTM

alivxxx added the status/LGT1 (Indicates that a PR has LGTM 1.) label on Mar 19, 2019
Member

zz-jason left a comment

LGTM

alivxxx added the status/LGT2 (Indicates that a PR has LGTM 2.) label and removed the status/LGT1 (Indicates that a PR has LGTM 1.) label on Mar 19, 2019
zz-jason merged commit 0b28f30 into pingcap:master on Mar 19, 2019
kolbe pushed a commit to kolbe/tidb that referenced this pull request Mar 20, 2019
eurekaka deleted the corr_range branch on March 20, 2019 03:00
eurekaka added a commit to eurekaka/tidb that referenced this pull request Mar 28, 2019
zz-jason pushed a commit that referenced this pull request Mar 29, 2019
Labels
  • sig/planner (SIG: Planner)
  • status/LGT2 (Indicates that a PR has LGTM 2.)
  • type/bugfix (This PR fixes a bug.)
4 participants