executor: introduce Allocator to manage memory used by executors #10580

Closed
qw4990 wants to merge 2 commits

@qw4990 commented May 23, 2019

What problem does this PR solve?

This is the first step toward fixing #9396.
It introduces an Allocator to manage the memory used by chunks in executors.

What is changed and how it works?

  1. Introduce an Allocator interface.
  2. Introduce two structs that implement Allocator.
  3. Add a Release method to Chunk.

Check List

Tests

  • Unit test

Side effects

  • Increased code complexity

@qw4990 added the sig/execution label May 23, 2019

qw4990 commented May 23, 2019

/rebuild

qw4990 commented May 23, 2019

Here are the sysbench results.
The tps gain is about 4.0% to 4.6%.

I ran two rounds of the sysbench oltp_read_only test with 4 tables, a table size of 100000, and 128 threads.

The first round:
before:

[ 10s ] thds: 128 tps: 857.21 qps: 13792.18 (r/w/o: 12066.47/0.00/1725.71) lat (ms,95%): 173.58 err/s: 0.00 reconn/s: 0.00
[ 20s ] thds: 128 tps: 862.83 qps: 13832.88 (r/w/o: 12107.52/0.00/1725.36) lat (ms,95%): 170.48 err/s: 0.00 reconn/s: 0.00
[ 30s ] thds: 128 tps: 858.68 qps: 13719.83 (r/w/o: 12002.56/0.00/1717.27) lat (ms,95%): 173.58 err/s: 0.00 reconn/s: 0.00
[ 40s ] thds: 128 tps: 852.17 qps: 13785.62 (r/w/o: 12060.98/0.00/1724.64) lat (ms,95%): 173.58 err/s: 0.00 reconn/s: 0.00
[ 50s ] thds: 128 tps: 849.44 qps: 13626.50 (r/w/o: 11926.43/0.00/1700.08) lat (ms,95%): 176.73 err/s: 0.00 reconn/s: 0.00
[ 60s ] thds: 128 tps: 851.50 qps: 13622.00 (r/w/o: 11919.70/0.00/1702.30) lat (ms,95%): 176.73 err/s: 0.00 reconn/s: 0.00
[ 70s ] thds: 128 tps: 854.45 qps: 13654.27 (r/w/o: 11945.37/0.00/1708.90) lat (ms,95%): 176.73 err/s: 0.00 reconn/s: 0.00
[ 80s ] thds: 128 tps: 849.65 qps: 13611.15 (r/w/o: 11911.46/0.00/1699.69) lat (ms,95%): 176.73 err/s: 0.00 reconn/s: 0.00
[ 90s ] thds: 128 tps: 850.22 qps: 13570.98 (r/w/o: 11873.05/0.00/1697.94) lat (ms,95%): 176.73 err/s: 0.00 reconn/s: 0.00
[ 100s ] thds: 128 tps: 836.05 qps: 13391.99 (r/w/o: 11717.79/0.00/1674.20) lat (ms,95%): 179.94 err/s: 0.00 reconn/s: 0.00

after:

[ 10s ] thds: 128 tps: 876.46 qps: 14110.00 (r/w/o: 12346.00/0.00/1764.00) lat (ms,95%): 176.73 err/s: 0.00 reconn/s: 0.00
[ 20s ] thds: 128 tps: 909.81 qps: 14550.99 (r/w/o: 12731.47/0.00/1819.52) lat (ms,95%): 167.44 err/s: 0.00 reconn/s: 0.00
[ 30s ] thds: 128 tps: 890.89 qps: 14273.10 (r/w/o: 12489.93/0.00/1783.18) lat (ms,95%): 173.58 err/s: 0.00 reconn/s: 0.00
[ 40s ] thds: 128 tps: 886.20 qps: 14164.98 (r/w/o: 12394.07/0.00/1770.91) lat (ms,95%): 176.73 err/s: 0.00 reconn/s: 0.00
[ 50s ] thds: 128 tps: 883.95 qps: 14160.23 (r/w/o: 12391.04/0.00/1769.19) lat (ms,95%): 173.58 err/s: 0.00 reconn/s: 0.00
[ 60s ] thds: 128 tps: 882.29 qps: 14099.32 (r/w/o: 12335.24/0.00/1764.08) lat (ms,95%): 173.58 err/s: 0.00 reconn/s: 0.00
[ 70s ] thds: 128 tps: 883.13 qps: 14133.13 (r/w/o: 12366.68/0.00/1766.45) lat (ms,95%): 173.58 err/s: 0.00 reconn/s: 0.00
[ 80s ] thds: 128 tps: 887.83 qps: 14209.15 (r/w/o: 12433.58/0.00/1775.57) lat (ms,95%): 173.58 err/s: 0.00 reconn/s: 0.00
[ 90s ] thds: 128 tps: 879.13 qps: 14064.53 (r/w/o: 12307.38/0.00/1757.15) lat (ms,95%): 176.73 err/s: 0.00 reconn/s: 0.00
[ 100s ] thds: 128 tps: 886.66 qps: 14198.12 (r/w/o: 12423.90/0.00/1774.21) lat (ms,95%): 173.58 err/s: 0.00 reconn/s: 0.00

avg(before) = 852.22,
avg(after) = 886.635,
gain = 4.04%.

The second round:
before:

[ 10s ] thds: 128 tps: 853.53 qps: 13740.08 (r/w/o: 12021.63/0.00/1718.45) lat (ms,95%): 176.73 err/s: 0.00 reconn/s: 0.00
[ 20s ] thds: 128 tps: 856.28 qps: 13702.70 (r/w/o: 11990.24/0.00/1712.46) lat (ms,95%): 173.58 err/s: 0.00 reconn/s: 0.00
[ 30s ] thds: 128 tps: 862.01 qps: 13800.70 (r/w/o: 12076.78/0.00/1723.93) lat (ms,95%): 173.58 err/s: 0.00 reconn/s: 0.00
[ 40s ] thds: 128 tps: 856.01 qps: 13695.81 (r/w/o: 11983.38/0.00/1712.43) lat (ms,95%): 176.73 err/s: 0.00 reconn/s: 0.00
[ 50s ] thds: 128 tps: 858.04 qps: 13739.59 (r/w/o: 12022.92/0.00/1716.67) lat (ms,95%): 173.58 err/s: 0.00 reconn/s: 0.00
[ 60s ] thds: 128 tps: 856.30 qps: 13680.26 (r/w/o: 11967.75/0.00/1712.51) lat (ms,95%): 173.58 err/s: 0.00 reconn/s: 0.00
[ 70s ] thds: 128 tps: 840.39 qps: 13460.79 (r/w/o: 11780.52/0.00/1680.27) lat (ms,95%): 179.94 err/s: 0.00 reconn/s: 0.00
[ 80s ] thds: 128 tps: 810.95 qps: 12973.82 (r/w/o: 11352.03/0.00/1621.79) lat (ms,95%): 186.54 err/s: 0.00 reconn/s: 0.00
[ 90s ] thds: 128 tps: 836.47 qps: 13376.46 (r/w/o: 11703.03/0.00/1673.43) lat (ms,95%): 183.21 err/s: 0.00 reconn/s: 0.00
[ 100s ] thds: 128 tps: 831.68 qps: 13302.81 (r/w/o: 11640.15/0.00/1662.66) lat (ms,95%): 183.21 err/s: 0.00 reconn/s: 0.00

after:

[ 10s ] thds: 128 tps: 889.70 qps: 14320.46 (r/w/o: 12530.48/0.00/1789.98) lat (ms,95%): 173.58 err/s: 0.00 reconn/s: 0.00
[ 20s ] thds: 128 tps: 882.17 qps: 14118.14 (r/w/o: 12353.60/0.00/1764.54) lat (ms,95%): 176.73 err/s: 0.00 reconn/s: 0.00
[ 30s ] thds: 128 tps: 886.17 qps: 14171.29 (r/w/o: 12397.64/0.00/1773.65) lat (ms,95%): 173.58 err/s: 0.00 reconn/s: 0.00
[ 40s ] thds: 128 tps: 878.14 qps: 14084.97 (r/w/o: 12328.28/0.00/1756.68) lat (ms,95%): 176.73 err/s: 0.00 reconn/s: 0.00
[ 50s ] thds: 128 tps: 891.94 qps: 14251.45 (r/w/o: 12468.07/0.00/1783.38) lat (ms,95%): 173.58 err/s: 0.00 reconn/s: 0.00
[ 60s ] thds: 128 tps: 880.07 qps: 14084.96 (r/w/o: 12325.61/0.00/1759.34) lat (ms,95%): 173.58 err/s: 0.00 reconn/s: 0.00
[ 70s ] thds: 128 tps: 875.24 qps: 13992.97 (r/w/o: 12241.60/0.00/1751.37) lat (ms,95%): 179.94 err/s: 0.00 reconn/s: 0.00
[ 80s ] thds: 128 tps: 886.61 qps: 14219.37 (r/w/o: 12446.25/0.00/1773.12) lat (ms,95%): 173.58 err/s: 0.00 reconn/s: 0.00
[ 90s ] thds: 128 tps: 888.27 qps: 14182.50 (r/w/o: 12406.95/0.00/1775.55) lat (ms,95%): 173.58 err/s: 0.00 reconn/s: 0.00
[ 100s ] thds: 128 tps: 891.88 qps: 14259.02 (r/w/o: 12474.77/0.00/1784.25) lat (ms,95%): 173.58 err/s: 0.00 reconn/s: 0.00

avg(before) = 846.166,
avg(after) = 885.019,
gain = 4.59%.
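
The averages and gains quoted for both rounds can be reproduced from the tps column of the sysbench output. The following is a minimal sketch rather than part of the PR; the helper names `mean` and `gainPercent` are made up for illustration, and the samples are the tps values of the second round.

```go
package main

import "fmt"

// mean returns the arithmetic mean of xs (0 for an empty slice).
func mean(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

// gainPercent returns the relative improvement of after over before, in percent.
func gainPercent(before, after float64) float64 {
	return (after - before) / before * 100
}

func main() {
	// tps samples copied from the second round above.
	before := []float64{853.53, 856.28, 862.01, 856.01, 858.04, 856.30, 840.39, 810.95, 836.47, 831.68}
	after := []float64{889.70, 882.17, 886.17, 878.14, 891.94, 880.07, 875.24, 886.61, 888.27, 891.88}

	b, a := mean(before), mean(after)
	fmt.Printf("avg(before)=%.3f avg(after)=%.3f gain=%.2f%%\n", b, a, gainPercent(b, a))
}
```

The gain is computed relative to the "before" average, which matches the 4.04% and 4.59% figures reported above.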

@zhouqiang-cl

/rebuild

@tiancaiamao left a comment

May I suggest that this PR only add the library code and not use it in the executor yet?

@@ -84,6 +84,7 @@ type baseExecutor struct {
	children []Executor
	retFieldTypes []*types.FieldType
	runtimeStats *execdetails.RuntimeStats
	allocator chunk.Allocator

Why do we need to store it in the executor? We can always get it from ctx.SessionVars().

@@ -908,6 +910,7 @@ func (e *SelectionExec) Open(ctx context.Context) error {

// Close implements plannercore.Plan Close interface.
func (e *SelectionExec) Close() error {
	e.childResult.Release()
	e.childResult = nil
	e.selected = nil
	return e.baseExecutor.Close()

How about putting the Release() call in baseExecutor.Close()?
It's easy to leak resources if we have to add Release() to every executor...
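
One way to read this suggestion is sketched below. It is an illustration only, not the PR's code: the `chunk` stand-in type, the `owned` slice, and the `newChunk` helper are all hypothetical. The idea is that the base executor records every chunk it hands out, so a single `Close` releases them all and concrete executors cannot forget.

```go
package main

import "fmt"

// chunk is a stand-in for chunk.Chunk; only Release matters here.
type chunk struct {
	released bool
}

// Release marks the chunk as returned to its allocator.
func (c *chunk) Release() { c.released = true }

// baseExecutor records every chunk created through it (hypothetical design),
// so that Close can release them in one place.
type baseExecutor struct {
	owned []*chunk
}

// newChunk is a hypothetical helper: it creates a chunk and tracks it.
func (e *baseExecutor) newChunk() *chunk {
	c := &chunk{}
	e.owned = append(e.owned, c)
	return c
}

// Close releases every tracked chunk exactly once.
func (e *baseExecutor) Close() error {
	for _, c := range e.owned {
		c.Release()
	}
	e.owned = nil
	return nil
}

// selectionExec embeds baseExecutor and never calls Release itself.
type selectionExec struct {
	baseExecutor
	childResult *chunk
}

// Close drops the executor's own references and defers to the base Close.
func (e *selectionExec) Close() error {
	e.childResult = nil
	return e.baseExecutor.Close()
}

func main() {
	e := &selectionExec{}
	e.childResult = e.newChunk()
	c := e.childResult

	if err := e.Close(); err != nil {
		fmt.Println("close failed:", err)
	}
	fmt.Println("released:", c.released)
}
```

The trade-off is that the base executor now holds a reference to each chunk until Close, which is acceptable only if chunks live as long as the executor anyway.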

@@ -0,0 +1,247 @@
package chunk

Missing LICENSE header.

type Allocator interface {
	Alloc(l, c int) []byte
	Free(buf []byte)
	SetParent(a Allocator)

What's the purpose of SetParent?


// Free releases memory.
func (b *multiBufChan) Free(buf []byte) {
	b.allocators[atomic.AddUint32(&b.freeIndex, 1)%b.numAllocators].Free(buf)

So a buffer taken from one allocator could be given back to a different one?
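
The concern can be shown with a small sketch. Only the `Free` index expression comes from the diff above; the `roundRobin` type and the assumption that `Alloc` advances its own independent counter in the same way are hypothetical. Because the two counters move independently, the shard that receives a freed buffer need not be the one that produced it.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// roundRobin is a hypothetical stand-in for multiBufChan that only tracks
// which shard index an operation is routed to.
type roundRobin struct {
	numShards  uint32
	allocIndex uint32
	freeIndex  uint32
}

// Alloc routes to the next shard by an atomic counter (assumed behavior)
// and returns the buffer together with the shard index it came from.
func (r *roundRobin) Alloc(size int) ([]byte, uint32) {
	shard := atomic.AddUint32(&r.allocIndex, 1) % r.numShards
	return make([]byte, size), shard
}

// Free mirrors the index expression in the diff: the target shard depends
// only on freeIndex, not on where buf was allocated.
func (r *roundRobin) Free(buf []byte) uint32 {
	return atomic.AddUint32(&r.freeIndex, 1) % r.numShards
}

func main() {
	r := &roundRobin{numShards: 4}
	_, fromA := r.Alloc(16)
	bufB, fromB := r.Alloc(16)

	// Free the second buffer first: it lands in the shard chosen by freeIndex.
	toB := r.Free(bufB)
	fmt.Printf("A from shard %d, B from shard %d, B freed to shard %d\n", fromA, fromB, toB)
}
```

Whether this mismatch matters depends on whether the shards are interchangeable pools; if each shard only caches buffers by capacity, cross-shard returns mostly affect balance rather than correctness.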

pad = make([]byte, (1<<uint(capIndexBit))+1)
)

func getCapIndex(bitN uint) ([]int, []int) {

Could you add a comment explaining this function?


type bufChan struct {
	maxCap int
	bufList []chan []byte

This struct creates too many small objects, which is bad for GC.

// Close closes this allocator.
func (b *bufChan) Close() {
	for _, ch := range b.bufList {
		for len(ch) > 0 {

for buf := range ch {
	...
}
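
A `range` over a channel takes a single loop variable and only terminates once the channel has been closed, so a range-based loop would need a `close(ch)` beforehand. As an alternative, the following sketch drains whatever is currently buffered without blocking and without requiring the channel to be closed. The `drain` helper is hypothetical and not taken from the PR.

```go
package main

import "fmt"

// drain removes every buffer currently queued in ch without blocking and
// returns how many were removed. It also stops if ch has been closed.
func drain(ch chan []byte) int {
	n := 0
	for {
		select {
		case _, ok := <-ch:
			if !ok {
				// Channel closed and empty: nothing more to receive.
				return n
			}
			n++
		default:
			// Channel open but currently empty.
			return n
		}
	}
}

func main() {
	ch := make(chan []byte, 4)
	ch <- make([]byte, 8)
	ch <- make([]byte, 8)

	fmt.Println("drained:", drain(ch), "remaining:", len(ch))
}
```

Compared with `for len(ch) > 0 { <-ch }`, the `select` form cannot block if another goroutine receives the last element between the length check and the receive.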

	for _, ch := range b.bufList {
		for len(ch) > 0 {
			buf := <-ch
			b.parent.Free(buf)

Taking concurrency into consideration, could a buf be returned to an allocator that has already been closed?
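
One possible way to make a late Free safe is sketched below, under stated assumptions: the `heapAllocator` and `cachingAllocator` types and the `closed` flag are hypothetical, and the interface is reduced to the two methods needed here. A mutex guards the closed state, and any buffer freed after Close is forwarded to the parent instead of being cached.

```go
package main

import (
	"fmt"
	"sync"
)

// Allocator is a reduced version of the interface in the diff (no SetParent).
type Allocator interface {
	Alloc(l, c int) []byte
	Free(buf []byte)
}

// heapAllocator is a hypothetical root: it allocates from the Go heap and
// only counts frees, leaving reclamation to the GC.
type heapAllocator struct {
	mu    sync.Mutex
	freed int
}

func (h *heapAllocator) Alloc(l, c int) []byte { return make([]byte, l, c) }

func (h *heapAllocator) Free(buf []byte) {
	h.mu.Lock()
	h.freed++
	h.mu.Unlock()
}

// cachingAllocator keeps freed buffers for reuse until it is closed.
type cachingAllocator struct {
	mu     sync.Mutex
	closed bool
	cache  [][]byte
	parent Allocator
}

// Alloc reuses a cached buffer with enough capacity, else asks the parent.
// Callers must pass l <= c; reused buffers are not zeroed.
func (a *cachingAllocator) Alloc(l, c int) []byte {
	a.mu.Lock()
	for i, buf := range a.cache {
		if cap(buf) >= c {
			a.cache = append(a.cache[:i], a.cache[i+1:]...)
			a.mu.Unlock()
			return buf[:l]
		}
	}
	a.mu.Unlock()
	return a.parent.Alloc(l, c)
}

// Free caches buf while open; after Close it forwards buf to the parent.
func (a *cachingAllocator) Free(buf []byte) {
	a.mu.Lock()
	if !a.closed {
		a.cache = append(a.cache, buf)
		a.mu.Unlock()
		return
	}
	a.mu.Unlock()
	a.parent.Free(buf)
}

// Close marks the allocator closed and hands all cached buffers to the parent.
func (a *cachingAllocator) Close() {
	a.mu.Lock()
	a.closed = true
	cached := a.cache
	a.cache = nil
	a.mu.Unlock()
	for _, buf := range cached {
		a.parent.Free(buf)
	}
}

func main() {
	root := &heapAllocator{}
	child := &cachingAllocator{parent: root}

	child.Free(child.Alloc(0, 8)) // cached by child
	child.Close()                 // cached buffer goes to root
	child.Free(make([]byte, 0, 8)) // late Free is forwarded, not cached

	fmt.Println("buffers returned to root:", root.freed)
}
```

This trades the lock-free channel design for a mutex, so it addresses the correctness question but not necessarily the performance goal of the PR.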

	col.elemBuf = nil
} else {
	col.elemBuf = a.Alloc(elemLen, elemLen)
	col.data = a.Alloc(0, cap*elemLen)

While the chunk is in use, data/elemBuf/nullBitmap will grow automatically through append, so it is improper to return that memory to the allocator...

		c.nullBitmap = append(c.nullBitmap, b)
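
The effect of `append` on a pooled buffer can be demonstrated in isolation. In this sketch the `sameBacking` helper is hypothetical and `pooled` merely plays the role of a buffer obtained from an allocator. Once an append exceeds the capacity, the Go runtime allocates a new backing array, so the slice held by the chunk is no longer the memory the allocator handed out.

```go
package main

import "fmt"

// sameBacking reports whether a and b start at the same backing array element.
// Both slices must have capacity of at least 1.
func sameBacking(a, b []byte) bool {
	return cap(a) > 0 && cap(b) > 0 && &a[:1][0] == &b[:1][0]
}

func main() {
	// pooled stands in for a buffer obtained from an allocator.
	pooled := make([]byte, 0, 4)
	data := pooled

	// Appends that fit within the capacity reuse the pooled array.
	data = append(data, 1, 2, 3, 4)
	fmt.Println("after 4 appends, same backing array:", sameBacking(pooled, data))

	// One more append exceeds the capacity, so the runtime allocates a new array.
	data = append(data, 5)
	fmt.Println("after 5 appends, same backing array:", sameBacking(pooled, data))
}
```

After the fifth append, freeing `data` would hand the allocator a runtime-allocated array of a different capacity, while the original pooled array is silently dropped.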

qw4990 commented Jun 21, 2019

Here is a post about how to recycle memory buffers in Golang.
We can do the same test to see the effect of this PR on GC.

qw4990 commented Jul 22, 2019

Closing this because it increases the complexity of memory management, and a gain of about 4.5% is not worth that cost.
We should use some other approach to manage our memory.

@qw4990 closed this Jul 22, 2019
Labels: sig/execution
Linked issue (may be closed by merging this pull request): make newFirstChunk more effective
3 participants