
perf: Update/upsert performance regression 1.0.4 -> master #17557

Closed
bdarnell opened this issue Aug 9, 2017 · 11 comments

@bdarnell
Contributor

bdarnell commented Aug 9, 2017

The test in #17507 showed that update and upsert performance got significantly slower (from ~30ms to ~60ms) since 1.0.4. Full details are in that issue, but I'm filing this issue to separately track investigation of the performance regression (as opposed to the difference between schemas with and without column families which is the main focus of #17507).

@petermattis petermattis added this to the 1.1 milestone Aug 9, 2017
@jordanlewis
Member

I've reproduced the regression using kv:

1.0.4:

[13:33:2]% ./kv --read-percent=0 --cycle-length=1                                                 
_elapsed___errors__ops/sec(inst)___ops/sec(cum)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)
      1s        0          285.6          285.6      3.7    268.4    385.9    453.0
      2s        0          262.2          273.9      3.8    453.0    704.6   1275.1
      3s        0          270.9          272.9      3.7    402.7    604.0    906.0
      4s        0          261.0          269.9      3.8    369.1    738.2    973.1
      5s        0          259.0          267.8      3.9    486.5   1073.7   1879.0
      6s        0          261.0          266.6      3.8    352.3    704.6   1040.2
      7s        0          245.0          263.5      4.2    335.5    738.2   1275.1
      8s        0          256.0          262.6      3.7    318.8    771.8   1476.4
      9s        0          260.0          262.3      3.7    335.5    671.1   1208.0
     10s        0          253.0          261.4      3.8    335.5   1040.2   1342.2
     11s        0          252.0          260.5      3.8    402.7    805.3   2013.3
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)
   11.1s        0           2893          260.4     59.4      3.8    369.1    838.9   2013.3

BenchmarkBlocks     2893           3840472.8 ns/op

master:

[13:33]% ./kv --read-percent=0 --cycle-length=1                      
_elapsed___errors__ops/sec(inst)___ops/sec(cum)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)
      1s        0           73.9           73.9     22.0    604.0    838.9   1006.6
      2s        0           65.9           69.9     22.0    939.5   1879.0   1946.2
      3s        0           65.2           68.3     22.0   1140.9   1610.6   2281.7
      4s        0           65.0           67.5     23.1   1811.9   2415.9   3758.1
      5s        0           68.0           67.6     22.0    838.9   1610.6   3087.0
      6s        0           66.0           67.3     23.1    805.3   1543.5   2550.1
      7s        0           64.0           66.8     23.1    671.1   1476.4   2147.5
      8s        0           65.9           66.7     22.0   1476.4   2080.4   2147.5
      9s        0           69.1           67.0     23.1    939.5   1744.8   1879.0
     10s        0           65.0           66.8     22.0   1140.9   2147.5   2147.5
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)
   10.2s        0            683           66.7    226.0     22.0   1140.9   2147.5   3758.1

BenchmarkBlocks      683          14991670.2 ns/op

Now to figure out what's going on.

@petermattis
Collaborator

This is a highly concurrent workload. The problem is likely in the guts of storage and is reminiscent of (or the same as) #15797.

Curious that master is so much slower than 1.0.4, though.

@petermattis
Collaborator

And by high concurrency, I mean concurrency on a single row.

@jordanlewis
Member

I thought the workload that @robert-s-lee produced to demonstrate the upsert/insert slowdown was pretty much the same as what the above kv command produces, but I was mistaken. In his workload, each thread repeatedly updates its own row, local to that thread, so there should be no contention.

I'll make a more accurate reproduction now.
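
For context, here is a minimal Go sketch of that kind of per-thread workload: each worker repeatedly upserts its own row, so workers never contend on the same key (unlike `./kv --read-percent=0 --cycle-length=1` above, which drives every worker at a single key). The table name, connection string, and worker/op counts are assumptions for illustration, not taken from the original script.

```go
package main

import (
	"database/sql"
	"fmt"
	"log"
	"sync"

	_ "github.com/lib/pq" // Postgres-wire driver; CockroachDB speaks the same protocol.
)

func main() {
	// Assumes a local insecure cluster and a simple two-column table:
	//   CREATE TABLE IF NOT EXISTS kv (k INT PRIMARY KEY, v INT);
	db, err := sql.Open("postgres", "postgresql://root@localhost:26257/test?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	const workers = 16
	const opsPerWorker = 1000

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			// Each worker writes only its own key, so there is no row
			// contention between workers.
			for i := 0; i < opsPerWorker; i++ {
				if _, err := db.Exec(`UPSERT INTO kv (k, v) VALUES ($1, $2)`, id, i); err != nil {
					log.Fatal(err)
				}
			}
		}(w)
	}
	wg.Wait()
	fmt.Println("done")
}
```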

@jordanlewis
Member

Bisecting the high-concurrency case.

@jordanlewis
Member

The regression seems to have been introduced in 82cbb49.

@petermattis
Collaborator

Can you try setting rocksdb.min_wal_sync_interval = 0ms?
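
In case it helps anyone reproducing this, the setting can be changed at runtime with `SET CLUSTER SETTING`, e.g. from the `cockroach sql` shell. A minimal Go sketch follows; the connection string (local insecure cluster) and the string form of the duration ('0ms') are assumptions for illustration.

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/lib/pq" // Postgres-wire driver; CockroachDB speaks the same protocol.
)

func main() {
	db, err := sql.Open("postgres", "postgresql://root@localhost:26257/?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Remove the enforced minimum interval between RocksDB WAL syncs.
	if _, err := db.Exec(`SET CLUSTER SETTING rocksdb.min_wal_sync_interval = '0ms'`); err != nil {
		log.Fatal(err)
	}
}
```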

@jordanlewis
Member

That cluster setting removes the performance issue. Is that a setting that we can recommend? Is there a more balanced default that we can pick?

@petermattis
Collaborator

This is the second time this setting has come up. The effect on heavily loaded systems was present, but small. Can you send a PR to change the default to 0ms?

Thanks for tracking this down.

@irfansharif
Contributor

why does/did rocksdb.min_wal_sync_interval = 0ms work?

@petermattis
Collaborator

why does/did rocksdb.min_wal_sync_interval = 0ms work?

The setting specifies a minimum interval between WAL syncs and thus adds a minimum latency to write operations when there is no concurrency. In the test @jordanlewis was performing, there were concurrent workers, but they were all hitting the same key so there was no concurrency during syncing.
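
To make the mechanism concrete, here is a deliberately simplified Go sketch (a toy model, not CockroachDB's actual syncer). A background loop enforces a minimum interval between physical WAL syncs: concurrent writers on independent keys get batched into one sync and share the wait, while writers serialized behind a single key pay roughly the full interval on every write.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// syncer models a WAL syncer that enforces a minimum interval between
// physical syncs, in the spirit of rocksdb.min_wal_sync_interval. It is a
// toy: the "sync" itself is just closing the waiters' channels.
type syncer struct {
	minInterval time.Duration
	requests    chan chan struct{} // each pending write sends a "done" channel
}

func newSyncer(minInterval time.Duration) *syncer {
	s := &syncer{minInterval: minInterval, requests: make(chan chan struct{}, 128)}
	go s.loop()
	return s
}

func (s *syncer) loop() {
	var lastSync time.Time
	for {
		// Wait for at least one write that needs a sync.
		first := <-s.requests
		pending := []chan struct{}{first}

		// Enforce the minimum interval since the previous sync. While
		// waiting, absorb any other requests; they all ride on the same
		// upcoming sync.
		timer := time.After(time.Until(lastSync.Add(s.minInterval)))
	absorb:
		for {
			select {
			case r := <-s.requests:
				pending = append(pending, r)
			case <-timer:
				break absorb
			}
		}

		lastSync = time.Now() // a real implementation would fsync the WAL here
		for _, r := range pending {
			close(r)
		}
	}
}

// sync blocks the calling writer until its write has been "synced".
func (s *syncer) sync() {
	done := make(chan struct{})
	s.requests <- done
	<-done
}

func main() {
	s := newSyncer(10 * time.Millisecond)

	// Serialized writers (everyone blocked behind the same key): each write
	// after the first waits out roughly the full interval.
	start := time.Now()
	for i := 0; i < 5; i++ {
		s.sync()
	}
	fmt.Println("5 serial writes:    ", time.Since(start))

	// Concurrent writers on independent keys: their syncs are batched, so
	// they share one interval instead of paying it one after another.
	start = time.Now()
	var wg sync.WaitGroup
	for i := 0; i < 5; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.sync()
		}()
	}
	wg.Wait()
	fmt.Println("5 concurrent writes:", time.Since(start))
}
```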
