Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate performance regression in development #142

Closed
domodwyer opened this issue Apr 11, 2018 · 1 comment
Closed

Investigate performance regression in development #142

domodwyer opened this issue Apr 11, 2018 · 1 comment

Comments

@domodwyer
Copy link

domodwyer commented Apr 11, 2018

NOTE: This issue is for the development branch, and does not effect the production release.


Commit 860240e (for #116) introduces a performance regression of ~5k/s reads (and probably overall, though mongo is the bottleneck for write operations).

A comparison between 91cf46c (the previous commit) and 860240e shows that the average throughput on the same instance dropped from ~64,000 reads/s to ~59,500 reads/s - this is due to an increase in request latency variance (99 percentile increased from 7ms to 12.5ms).

This might not be an issue, the feature added is pretty useful however a change of ~6% should be looked at - there might be a way to optimise the fast path.

throughput-overview

Throughput:

x => 91cf46c3ef1013857ec915403d99a9007f84d06b-select-zipfian-throughput.log
y => 860240ea96bd411b6f6165bb634ac6209981f2c5-select-zipfian-throughput.log

     n   min   max  median  average   stddev      p99
x 3600 52778 70224 63914.0 63945.67 2320.809 68475.48
y 3600 49235 64286 59562.5 59583.09 2117.059 63637.03

       55000                 60000                 65000                  70000
 |-------+---------------------+---------------------+----------------------+--|
                                          +------+-----+
1                        -----------------|      |     |------------------
                                          +------+-----+
                       +-----+-----+
2    ------------------|     |     |---------------
                       +-----+-----+
Legend: 1=data$x, 2=data$y

At 95% probablitiy:
===> average is statistically significant     (p=0.000000, diff ~4362.578056)
===> variance is statistically significant    (p=0.000000)

Latency:

x => 91cf46c3ef1013857ec915403d99a9007f84d06b-select-zipfian-latency.log
y => 860240ea96bd411b6f6165bb634ac6209981f2c5-select-zipfian-latency.log

          n min    max median average   stddev   p99
x 214499384 0.5   7.05      4 3.89542 1.353147  7.05
y 214499384 0.5 351.65      4 4.57718 3.109341 12.50

   0                        5                        10
 |-+------------------------+-------------------------+------------------------|
                  +----+----+
1    -------------|    |    |-----------
                  +----+----+
             +---------+---------------+
2    --------|         |               |----------------------------------
             +---------+---------------+
Legend: 1=data$x, 2=data$y

At 95% probablitiy:
===> average is statistically significant     (p=0.000000, diff ~0.681760)
===> variance is statistically significant    (p=0.000000)
domodwyer added a commit that referenced this issue Apr 19, 2018
#116 adds a much needed ability to shrink the connection pool, but requires
tracking the last-used timestamp for each socket after every operation. Frequent
calls to time.Now() in the hot-path reduced read throughput by ~6% and increased
the latency (and variance) of socket operations as a whole.

This PR adds a periodically updated time value to amortise the cost of the last-
used bookkeeping, restoring the original throughput at the cost of approximate
last-used values (configured to be ~25ms of potential skew).

On some systems (currently including FreeBSD) querying the time counter also
requires a syscall/context switch.

Fixes #142.
domodwyer added a commit that referenced this issue Apr 19, 2018
#116 adds a much needed ability to shrink the connection pool, but requires
tracking the last-used timestamp for each socket after every operation. Frequent
calls to time.Now() in the hot-path reduced read throughput by ~6% and increased
the latency (and variance) of socket operations as a whole.

This PR adds a periodically updated time value to amortise the cost of the last-
used bookkeeping, restoring the original throughput at the cost of approximate
last-used values (configured to be ~25ms of potential skew).

On some systems (currently including FreeBSD) querying the time counter also
requires a syscall/context switch.

Fixes #142.
domodwyer added a commit that referenced this issue Apr 19, 2018
#116 adds a much needed ability to shrink the connection pool, but requires
tracking the last-used timestamp for each socket after every operation. Frequent
calls to time.Now() in the hot-path reduced read throughput by ~6% and increased
the latency (and variance) of socket operations as a whole.

This PR adds a periodically updated time value to amortise the cost of the last-
used bookkeeping, restoring the original throughput at the cost of approximate
last-used values (configured to be ~25ms of potential skew).

On some systems (currently including FreeBSD) querying the time counter also
requires a syscall/context switch.

Fixes #142.
domodwyer added a commit that referenced this issue Apr 19, 2018
#116 adds a much needed ability to shrink the connection pool, but requires
tracking the last-used timestamp for each socket after every operation. Frequent
calls to time.Now() in the hot-path reduced read throughput by ~6% and increased
the latency (and variance) of socket operations as a whole.

This PR adds a periodically updated time value to amortise the cost of the last-
used bookkeeping, restoring the original throughput at the cost of approximate
last-used values (configured to be ~25ms of potential skew).

On some systems (currently including FreeBSD) querying the time counter also
requires a syscall/context switch.

Fixes #142.
@domodwyer
Copy link
Author

See #149

libi pushed a commit to libi/mgo that referenced this issue Dec 1, 2022
globalsign#116 adds a much needed ability to shrink the connection pool, but requires
tracking the last-used timestamp for each socket after every operation. Frequent
calls to time.Now() in the hot-path reduced read throughput by ~6% and increased
the latency (and variance) of socket operations as a whole.

This PR adds a periodically updated time value to amortise the cost of the last-
used bookkeeping, restoring the original throughput at the cost of approximate
last-used values (configured to be ~25ms of potential skew).

On some systems (currently including FreeBSD) querying the time counter also
requires a syscall/context switch.

Fixes globalsign#142.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant