Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

planner, statistics: maintain histogram for inner join #8097

Closed
wants to merge 20 commits into from

Conversation

winoros
Copy link
Member

@winoros winoros commented Oct 29, 2018

What problem does this PR solve?

Initial commit for maintaining histogram for inner join.
Don't consider use index to calculate join's StatsInfo currently.

What is changed and how it works?

Following the description in #7605

Check List

Tests

  • Unit test
  • Integration test

Not enough. I'll add more, but maybe not in this pr.

Code changes

  • Has exported function/method change

Side effects

  • Possible performance regression
  • Increased code complexity

This change is Reviewable

@winoros winoros added type/enhancement The issue or PR belongs to an enhancement. sig/planner SIG: Planner labels Oct 29, 2018
statistics/histogram.go Outdated Show resolved Hide resolved
statistics/histogram.go Outdated Show resolved Hide resolved
statistics/histogram.go Outdated Show resolved Hide resolved
statistics/histogram.go Outdated Show resolved Hide resolved
statistics/histogram.go Outdated Show resolved Hide resolved
statistics/histogram.go Show resolved Hide resolved
planner/core/stats.go Show resolved Hide resolved
@winoros
Copy link
Member Author

winoros commented Nov 14, 2018

all addressed
@lamxTyler @zz-jason @eurekaka PTAL

planner/core/stats.go Outdated Show resolved Hide resolved
planner/core/stats.go Show resolved Hide resolved
planner/core/stats.go Show resolved Hide resolved
planner/core/stats.go Outdated Show resolved Hide resolved
statistics/histogram.go Outdated Show resolved Hide resolved
statistics/histogram.go Outdated Show resolved Hide resolved
@winoros
Copy link
Member Author

winoros commented Nov 19, 2018

I'll add more tests to cover the case that the count of HistColl is not equal to the histogram's count.

statistics/dump.go Outdated Show resolved Hide resolved
planner/core/stats_test.go Show resolved Hide resolved
planner/core/stats_test.go Show resolved Hide resolved
statistics/histogram.go Outdated Show resolved Hide resolved
planner/core/stats.go Outdated Show resolved Hide resolved
@winoros winoros requested review from alivxxx and eurekaka January 7, 2019 05:11
Copy link
Contributor

@alivxxx alivxxx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@alivxxx alivxxx added the status/LGT1 Indicates that a PR has LGTM 1. label Jan 11, 2019
@codecov-io
Copy link

codecov-io commented Jan 14, 2019

Codecov Report

Merging #8097 into master will increase coverage by 0.06%.
The diff coverage is 87.16%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #8097      +/-   ##
==========================================
+ Coverage   67.16%   67.23%   +0.06%     
==========================================
  Files         371      371              
  Lines       76393    76543     +150     
==========================================
+ Hits        51311    51461     +150     
- Misses      20494    20496       +2     
+ Partials     4588     4586       -2
Impacted Files Coverage Δ
statistics/feedback.go 73.28% <100%> (ø) ⬆️
statistics/dump.go 73.64% <75%> (-0.56%) ⬇️
statistics/histogram.go 81.45% <83.33%> (+0.1%) ⬆️
statistics/table.go 82.75% <85.71%> (ø) ⬆️
planner/core/stats.go 89.59% <89.28%> (+2.26%) ⬆️
statistics/scalar.go 85.13% <90.56%> (+0.45%) ⬆️
executor/join.go 77.92% <0%> (-0.52%) ⬇️
executor/executor.go 67.08% <0%> (+0.14%) ⬆️
planner/core/planbuilder.go 50.37% <0%> (+0.41%) ⬆️
planner/core/exhaust_physical_plans.go 92.57% <0%> (+0.47%) ⬆️
... and 5 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update dca815c...41b5a4b. Read the comment docs.

@@ -281,6 +281,9 @@ func (p *LogicalJoin) DeriveStats(childStats []*property.StatsInfo) (*property.S
leftKeys = append(leftKeys, eqCond.GetArgs()[0].(*expression.Column))
rightKeys = append(rightKeys, eqCond.GetArgs()[1].(*expression.Column))
}
if p.JoinType == InnerJoin && p.ctx.GetSessionVars().OptimizerSelectivityLevel >= 1 {
return p.deriveInnerJoinStatsWithHist(leftKeys, rightKeys)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we pass the childStats parameter down to deriveInnerJoinStatsWithHist? so this DeriveStats function can be used by cascades planner as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, using childStats in new commits. But it uses the schema of join's children inside this method. Seems that i need some way to not rely on it.

@qw4990 qw4990 self-requested a review March 4, 2019 06:36
leftCol := &statistics.Column{Info: leftHist.Info, Histogram: *newHist}
rightCol := &statistics.Column{Info: rightHist.Info, Histogram: *newHist}
lIncreaseFactor := leftHist.GetIncreaseFactor(leftProfile.HistColl.Count)
// The factor is used to scale the NDV. When it's higher than one. NDV doesn't need to be changed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we change NDV if this factor is larger than one?

keyNdv = float64(newHist.NDV) * lIncreaseFactor * rIncreaseFactor
lPosNew := p.schema.ColumnIndex(leftKeys[i])
rPosNew := p.schema.ColumnIndex(rightKeys[i])
cardinality[lPosNew] = float64(newHist.NDV)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not multiply this cardinality by lIncreaseFactor like keyNdv.

@winoros
Copy link
Member Author

winoros commented Jul 25, 2019

closed for a while. will reopen it in the future.

@winoros winoros closed this Jul 25, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/planner SIG: Planner status/LGT1 Indicates that a PR has LGTM 1. type/enhancement The issue or PR belongs to an enhancement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants