Prepare for removal of the CMS garbage collector #46973

Closed
4 tasks done
danielmitterdorfer opened this issue Sep 23, 2019 · 7 comments
Labels: :Core/Infra/Core, >enhancement, Meta


@danielmitterdorfer
Member

danielmitterdorfer commented Sep 23, 2019

Context

According to JEP 363 (originally draft JEP 8229049), the CMS garbage collector will be removed from OpenJDK (discussion of the patch and commit). The corresponding discussion on the OpenJDK mailing list mentions JDK 14 as the targeted release. There is currently no release date for JDK 14, but based on the past release cadence we should expect it in late March 2020. As Elasticsearch currently recommends the CMS garbage collector, we should be prepared for its removal. I'm raising this as a meta-issue to collect our thoughts on what is missing to run Elasticsearch out of the box with a garbage collector other than CMS.

Prior work

We have already investigated G1GC as an alternative garbage collector for Elasticsearch in #33685 and have recently adjusted the out-of-the-box settings in #46169 to improve the effectiveness of Elasticsearch's real-memory circuit breaker with G1GC.

Tasks

The following list may be incomplete; please add new tasks as needed:

danielmitterdorfer added the >enhancement and :Core/Infra/Core labels on Sep 23, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-core-infra

@ebadyano
Contributor

ebadyano commented Oct 24, 2019

I did a few experiments with jdk13 and G1GC, running the nyc_taxis 1-node benchmark in our original nightly benchmarks environment:

  1. With 4G heap, I was previously triggering the circuit breaker when running with jdk12 and G1GC enabled; with jdk13 the benchmark runs to completion with both the default (1G) and 4G heaps, although the total accumulated GC pauses are slightly higher than with CMS. Indexing throughput is regressed by ~5%.

  2. With 8G heap, I see an improvement in GC pauses, and interestingly indexing throughput seems to improve by ~8%, at least with nyc_taxis-1-node.

  3. I tried running with -XX:+UseStringDeduplication + G1GC, and for some reason the circuit breaker is triggered with 4G heap (didn't try with 1G heap). With 8G heap I see slightly higher accumulated GC pauses with -XX:+UseStringDeduplication, and indexing throughput is regressed by ~6% compared to G1GC without -XX:+UseStringDeduplication.

  4. I also tried http_logs with jdk13 and 8G heap: indexing throughput for G1GC is more or less on par with CMS, sometimes slightly higher, sometimes slightly lower
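
The heap sizes and collector flags varied across these runs can be written down as JVM option strings. The following is a minimal sketch, not the benchmark harness actually used: the helper name `jvm_opts` is hypothetical, and it assumes the options are handed to Elasticsearch through the `ES_JAVA_OPTS` environment variable.

```python
def jvm_opts(heap: str, collector_flag: str, string_dedup: bool = False) -> str:
    """Build a JVM option string for one experiment variant (hypothetical helper)."""
    # Pin minimum and maximum heap to the same size, e.g. "4g" or "8g".
    opts = [f"-Xms{heap}", f"-Xmx{heap}", collector_flag]
    if string_dedup:
        # String deduplication is a G1 feature in the JDK versions discussed here.
        opts.append("-XX:+UseStringDeduplication")
    return " ".join(opts)


g1_4g = jvm_opts("4g", "-XX:+UseG1GC")
g1_8g_dedup = jvm_opts("8g", "-XX:+UseG1GC", string_dedup=True)

# Print the values in the form they would take as an environment variable.
print(f'ES_JAVA_OPTS="{g1_4g}"')
print(f'ES_JAVA_OPTS="{g1_8g_dedup}"')
```

Keeping each variant as a single string makes it easier to record exactly which flags a given benchmark run used.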

@ebadyano
Contributor

ebadyano commented Nov 18, 2019

Logs for the nyc_taxis benchmark with 4g heap + G1GC on jdk12 (intermittently fails with circuit breaker; logs cover both failed and completed runs) and on jdk13:
https://drive.google.com/open?id=16clDwJohf50-KA1YFuLzrc7WooUwi9Zg

@ebadyano
Contributor

ebadyano commented Dec 6, 2019

Tried -XX:+UseParallelOldGC with default and 1g heap on http_logs; it keeps triggering the circuit breaker. With indices.breaker.total.limit: 99% it takes a bit longer to trigger the circuit breaker, but it still gets there during indexing. With indices.breaker.total.limit: 99.9% it triggers the circuit breaker intermittently during queries.
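
One way to read these results is to look at how little heap remains above the parent breaker limit on a small heap. The sketch below is only arithmetic, under two assumptions: `indices.breaker.total.limit` is interpreted as a percentage of the JVM heap, and the heap is exactly 1 GiB. The function name is hypothetical.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def breaker_headroom_bytes(heap_bytes: int, limit_percent: float) -> float:
    """Heap (in bytes) between the breaker limit and the full heap size."""
    return heap_bytes * (100.0 - limit_percent) / 100.0


# The two limits tried above, applied to an assumed 1 GiB heap.
for limit in (99.0, 99.9):
    headroom_mib = breaker_headroom_bytes(1 * GIB, limit) / MIB
    print(f"limit {limit}% -> about {headroom_mib:.2f} MiB of headroom")
```

Under these assumptions, 99% leaves roughly 10 MiB and 99.9% roughly 1 MiB, so the collector has to keep heap usage below the limit with very little slack.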

@ebadyano
Contributor

ebadyano commented Dec 6, 2019

When Parallel GC does finish, indexing and queries seem to be faster with 1g heap:
http_logs G1GC vs Parallel:

|                                                        Metric |                                        Task |   Baseline |   Contender |     Diff |    Unit |
|--------------------------------------------------------------:|--------------------------------------------:|-----------:|------------:|---------:|--------:|
|                    Cumulative indexing time of primary shards |                                             |    128.667 |     105.868 | -22.7986 |     min |
|             Min cumulative indexing time across primary shard |                                             |     0.4068 |      0.3474 |  -0.0594 |     min |
|          Median cumulative indexing time across primary shard |                                             |    1.26412 |      0.9528 | -0.31132 |     min |
|             Max cumulative indexing time across primary shard |                                             |     18.991 |     15.6193 | -3.37172 |     min |
|           Cumulative indexing throttle time of primary shards |                                             |          0 |           0 |        0 |     min |
|    Min cumulative indexing throttle time across primary shard |                                             |          0 |           0 |        0 |     min |
| Median cumulative indexing throttle time across primary shard |                                             |          0 |           0 |        0 |     min |
|    Max cumulative indexing throttle time across primary shard |                                             |          0 |           0 |        0 |     min |
|                       Cumulative merge time of primary shards |                                             |    111.338 |     105.281 | -6.05687 |     min |
|                      Cumulative merge count of primary shards |                                             |        526 |         523 |       -3 |         |
|                Min cumulative merge time across primary shard |                                             |  0.0459833 |   0.0406833 |  -0.0053 |     min |
|             Median cumulative merge time across primary shard |                                             |   0.279417 |      0.2484 | -0.03102 |     min |
|                Max cumulative merge time across primary shard |                                             |    22.1314 |     21.0968 | -1.03457 |     min |
|              Cumulative merge throttle time of primary shards |                                             |    68.1129 |     65.1518 | -2.96117 |     min |
|       Min cumulative merge throttle time across primary shard |                                             |          0 |           0 |        0 |     min |
|    Median cumulative merge throttle time across primary shard |                                             |          0 |           0 |        0 |     min |
|       Max cumulative merge throttle time across primary shard |                                             |     14.112 |     14.4817 |  0.36973 |     min |
|                     Cumulative refresh time of primary shards |                                             |    13.0514 |     13.4064 |  0.35507 |     min |
|                    Cumulative refresh count of primary shards |                                             |       2130 |        2293 |      163 |         |
|              Min cumulative refresh time across primary shard |                                             |    0.04335 |      0.0387 | -0.00465 |     min |
|           Median cumulative refresh time across primary shard |                                             |     0.1217 |    0.139367 |  0.01767 |     min |
|              Max cumulative refresh time across primary shard |                                             |    1.97307 |     1.96867 |  -0.0044 |     min |
|                       Cumulative flush time of primary shards |                                             |    4.22462 |       4.856 |  0.63138 |     min |
|                      Cumulative flush count of primary shards |                                             |        117 |         115 |       -2 |         |
|                Min cumulative flush time across primary shard |                                             |      0.001 |     0.00415 |  0.00315 |     min |
|             Median cumulative flush time across primary shard |                                             |  0.0108167 |   0.0139667 |  0.00315 |     min |
|                Max cumulative flush time across primary shard |                                             |   0.840933 |      0.9375 |  0.09657 |     min |
|                                            Total Young Gen GC |                                             |    105.258 |      88.501 |  -16.757 |       s |
|                                              Total Old Gen GC |                                             |          0 |      21.663 |   21.663 |       s |
|                                                    Store size |                                             |    19.1857 |     18.9982 | -0.18753 |      GB |
|                                                 Translog size |                                             | 1.7928e-06 |  1.7928e-06 |        0 |      GB |
|                                        Heap used for segments |                                             |    105.664 |     102.275 | -3.38937 |      MB |
|                                      Heap used for doc values |                                             | 0.00476456 |  0.00476456 |        0 |      MB |
|                                           Heap used for terms |                                             |    98.7018 |     95.3662 | -3.33561 |      MB |
|                                           Heap used for norms |                                             | 0.00213623 |  0.00213623 |        0 |      MB |
|                                          Heap used for points |                                             |          0 |           0 |        0 |      MB |
|                                   Heap used for stored fields |                                             |    6.95544 |     6.90169 | -0.05376 |      MB |
|                                                 Segment count |                                             |         35 |          35 |        0 |         |
|                                                Min Throughput |                                index-append |     150090 |      161244 |  11153.6 |  docs/s |
|                                             Median Throughput |                                index-append |     163942 |      170634 |  6692.13 |  docs/s |
|                                                Max Throughput |                                index-append |     195448 |      199597 |  4149.01 |  docs/s |
|                                       50th percentile latency |                                index-append |    234.193 |     208.972 | -25.2208 |      ms |
|                                       90th percentile latency |                                index-append |     359.03 |     339.721 | -19.3083 |      ms |
|                                       99th percentile latency |                                index-append |    1038.53 |     989.105 | -49.4283 |      ms |
|                                     99.9th percentile latency |                                index-append |    1858.92 |     1756.26 |  -102.66 |      ms |
|                                    99.99th percentile latency |                                index-append |    2853.51 |     2383.23 | -470.275 |      ms |
|                                      100th percentile latency |                                index-append |    2880.77 |     2675.52 | -205.252 |      ms |
|                                  50th percentile service time |                                index-append |    234.193 |     208.972 | -25.2208 |      ms |
|                                  90th percentile service time |                                index-append |     359.03 |     339.721 | -19.3083 |      ms |
|                                  99th percentile service time |                                index-append |    1038.53 |     989.105 | -49.4283 |      ms |
|                                99.9th percentile service time |                                index-append |    1858.92 |     1756.26 |  -102.66 |      ms |
|                               99.99th percentile service time |                                index-append |    2853.51 |     2383.23 | -470.275 |      ms |
|                                 100th percentile service time |                                index-append |    2880.77 |     2675.52 | -205.252 |      ms |
|                                                    error rate |                                index-append |          0 |           0 |        0 |       % |
|                                                Min Throughput |                                     default |    8.01299 |     8.01308 |    9e-05 |   ops/s |
|                                             Median Throughput |                                     default |    8.01403 |     8.01398 |   -6e-05 |   ops/s |
|                                                Max Throughput |                                     default |    8.01519 |     8.01524 |    5e-05 |   ops/s |
|                                       50th percentile latency |                                     default |    5.36415 |     5.00933 | -0.35483 |      ms |
|                                       90th percentile latency |                                     default |     5.6521 |     5.43838 | -0.21372 |      ms |
|                                       99th percentile latency |                                     default |    7.69529 |     5.78511 | -1.91017 |      ms |
|                                      100th percentile latency |                                     default |    8.21218 |     7.35293 | -0.85925 |      ms |
|                                  50th percentile service time |                                     default |    5.06884 |     4.70835 | -0.36049 |      ms |
|                                  90th percentile service time |                                     default |    5.35518 |     5.12218 | -0.23299 |      ms |
|                                  99th percentile service time |                                     default |    7.38002 |     5.47012 |  -1.9099 |      ms |
|                                 100th percentile service time |                                     default |    7.89408 |     7.03901 | -0.85508 |      ms |
|                                                    error rate |                                     default |          0 |           0 |        0 |       % |
|                                                Min Throughput |                                        term |    50.0332 |     50.0609 |  0.02773 |   ops/s |
|                                             Median Throughput |                                        term |    50.0368 |      50.061 |  0.02415 |   ops/s |
|                                                Max Throughput |                                        term |    50.0405 |      50.061 |  0.02057 |   ops/s |
|                                       50th percentile latency |                                        term |    12.0435 |     6.72208 | -5.32139 |      ms |
|                                       90th percentile latency |                                        term |    12.6216 |      7.9926 | -4.62904 |      ms |
|                                       99th percentile latency |                                        term |    16.6301 |     11.4386 | -5.19152 |      ms |
|                                      100th percentile latency |                                        term |    20.4803 |     11.9234 | -8.55687 |      ms |
|                                  50th percentile service time |                                        term |    11.8239 |     6.58407 |  -5.2398 |      ms |
|                                  90th percentile service time |                                        term |    12.3903 |     7.85069 |  -4.5396 |      ms |
|                                  99th percentile service time |                                        term |    16.4101 |     11.2991 | -5.11099 |      ms |
|                                 100th percentile service time |                                        term |    20.2541 |     11.7813 | -8.47277 |      ms |
|                                                    error rate |                                        term |          0 |           0 |        0 |       % |
|                                                Min Throughput |                                       range |    1.00494 |     1.00494 |    1e-05 |   ops/s |
|                                             Median Throughput |                                       range |    1.00658 |     1.00658 |    1e-05 |   ops/s |
|                                                Max Throughput |                                       range |    1.00983 |     1.00985 |    2e-05 |   ops/s |
|                                       50th percentile latency |                                       range |    16.4738 |     15.6901 |  -0.7837 |      ms |
|                                       90th percentile latency |                                       range |    17.1529 |      15.946 | -1.20687 |      ms |
|                                       99th percentile latency |                                       range |    19.1744 |     16.3289 | -2.84554 |      ms |
|                                      100th percentile latency |                                       range |    19.4498 |     19.1965 | -0.25334 |      ms |
|                                  50th percentile service time |                                       range |    15.2952 |     14.5162 |   -0.779 |      ms |
|                                  90th percentile service time |                                       range |    15.9749 |       14.76 | -1.21491 |      ms |
|                                  99th percentile service time |                                       range |    17.9938 |     15.1449 | -2.84887 |      ms |
|                                 100th percentile service time |                                       range |    18.2629 |     18.0331 | -0.22988 |      ms |
|                                                    error rate |                                       range |          0 |           0 |        0 |       % |
|                                                Min Throughput |                                  hourly_agg |   0.200472 |    0.200461 |   -1e-05 |   ops/s |
|                                             Median Throughput |                                  hourly_agg |   0.200628 |    0.200614 |   -1e-05 |   ops/s |
|                                                Max Throughput |                                  hourly_agg |   0.200941 |    0.200935 |   -1e-05 |   ops/s |
|                                       50th percentile latency |                                  hourly_agg |    2637.86 |     2671.83 |  33.9609 |      ms |
|                                       90th percentile latency |                                  hourly_agg |     2661.4 |     2707.89 |  46.4864 |      ms |
|                                       99th percentile latency |                                  hourly_agg |    2675.39 |     2725.14 |  49.7542 |      ms |
|                                      100th percentile latency |                                  hourly_agg |    2676.09 |     2730.02 |  53.9239 |      ms |
|                                  50th percentile service time |                                  hourly_agg |    2635.29 |     2669.32 |  34.0258 |      ms |
|                                  90th percentile service time |                                  hourly_agg |    2658.85 |     2705.36 |  46.5122 |      ms |
|                                  99th percentile service time |                                  hourly_agg |    2672.83 |     2722.64 |  49.8052 |      ms |
|                                 100th percentile service time |                                  hourly_agg |     2673.5 |     2727.54 |  54.0336 |      ms |
|                                                    error rate |                                  hourly_agg |          0 |           0 |        0 |       % |
|                                                Min Throughput |                                      scroll |    25.0149 |      25.014 | -0.00091 | pages/s |
|                                             Median Throughput |                                      scroll |    25.0315 |     25.0239 | -0.00756 | pages/s |
|                                                Max Throughput |                                      scroll |    25.1304 |     25.1191 | -0.01125 | pages/s |
|                                       50th percentile latency |                                      scroll |    777.278 |     810.028 |  32.7497 |      ms |
|                                       90th percentile latency |                                      scroll |    819.226 |     830.047 |  10.8214 |      ms |
|                                       99th percentile latency |                                      scroll |    829.322 |     844.551 |  15.2294 |      ms |
|                                      100th percentile latency |                                      scroll |    842.365 |     849.221 |  6.85621 |      ms |
|                                  50th percentile service time |                                      scroll |    776.894 |     809.656 |  32.7618 |      ms |
|                                  90th percentile service time |                                      scroll |    818.816 |     829.658 |   10.842 |      ms |
|                                  99th percentile service time |                                      scroll |    828.862 |     844.167 |  15.3056 |      ms |
|                                 100th percentile service time |                                      scroll |    841.959 |     848.848 |  6.88915 |      ms |
|                                                    error rate |                                      scroll |          0 |           0 |        0 |       % |
|                                                Min Throughput |                         desc_sort_timestamp |   0.501638 |    0.501645 |    1e-05 |   ops/s |
|                                             Median Throughput |                         desc_sort_timestamp |   0.501969 |    0.501972 |        0 |   ops/s |
|                                                Max Throughput |                         desc_sort_timestamp |   0.502457 |    0.502459 |        0 |   ops/s |
|                                       50th percentile latency |                         desc_sort_timestamp |    34.9101 |     32.3529 | -2.55717 |      ms |
|                                       90th percentile latency |                         desc_sort_timestamp |    36.8483 |     35.2537 | -1.59454 |      ms |
|                                       99th percentile latency |                         desc_sort_timestamp |    40.2969 |       37.66 | -2.63687 |      ms |
|                                      100th percentile latency |                         desc_sort_timestamp |    41.9122 |     38.7657 | -3.14649 |      ms |
|                                  50th percentile service time |                         desc_sort_timestamp |    32.7502 |     30.2327 | -2.51751 |      ms |
|                                  90th percentile service time |                         desc_sort_timestamp |    34.6973 |     33.0975 | -1.59983 |      ms |
|                                  99th percentile service time |                         desc_sort_timestamp |    38.1306 |     35.5072 | -2.62336 |      ms |
|                                 100th percentile service time |                         desc_sort_timestamp |    39.8577 |     38.3515 | -1.50618 |      ms |
|                                                    error rate |                         desc_sort_timestamp |          0 |           0 |        0 |       % |
|                                                Min Throughput |                          asc_sort_timestamp |    0.50162 |    0.501645 |    3e-05 |   ops/s |
|                                             Median Throughput |                          asc_sort_timestamp |   0.501942 |    0.501971 |    3e-05 |   ops/s |
|                                                Max Throughput |                          asc_sort_timestamp |    0.50241 |    0.502459 |    5e-05 |   ops/s |
|                                       50th percentile latency |                          asc_sort_timestamp |    62.5424 |     32.2299 | -30.3125 |      ms |
|                                       90th percentile latency |                          asc_sort_timestamp |    70.6123 |     34.3233 | -36.2889 |      ms |
|                                       99th percentile latency |                          asc_sort_timestamp |    73.8659 |     37.8268 | -36.0391 |      ms |
|                                      100th percentile latency |                          asc_sort_timestamp |    80.2303 |     39.4474 | -40.7829 |      ms |
|                                  50th percentile service time |                          asc_sort_timestamp |    60.4461 |     30.0738 | -30.3724 |      ms |
|                                  90th percentile service time |                          asc_sort_timestamp |    68.4823 |     32.1535 | -36.3288 |      ms |
|                                  99th percentile service time |                          asc_sort_timestamp |    71.7583 |     35.6573 | -36.1009 |      ms |
|                                 100th percentile service time |                          asc_sort_timestamp |    78.1952 |     37.2704 | -40.9248 |      ms |
|                                                    error rate |                          asc_sort_timestamp |          0 |           0 |        0 |       % |
|                                                Min Throughput | desc-sort-timestamp-after-force-merge-1-seg |   0.501474 |    0.501463 |   -1e-05 |   ops/s |
|                                             Median Throughput | desc-sort-timestamp-after-force-merge-1-seg |   0.501778 |    0.501825 |    5e-05 |   ops/s |
|                                                Max Throughput | desc-sort-timestamp-after-force-merge-1-seg |   0.502238 |    0.502286 |    5e-05 |   ops/s |
|                                       50th percentile latency | desc-sort-timestamp-after-force-merge-1-seg |    226.611 |     167.949 | -58.6622 |      ms |
|                                       90th percentile latency | desc-sort-timestamp-after-force-merge-1-seg |    236.714 |     174.961 | -61.7524 |      ms |
|                                       99th percentile latency | desc-sort-timestamp-after-force-merge-1-seg |    242.694 |     215.673 | -27.0209 |      ms |
|                                      100th percentile latency | desc-sort-timestamp-after-force-merge-1-seg |    244.922 |     267.275 |  22.3528 |      ms |
|                                  50th percentile service time | desc-sort-timestamp-after-force-merge-1-seg |    224.641 |     165.933 | -58.7081 |      ms |
|                                  90th percentile service time | desc-sort-timestamp-after-force-merge-1-seg |    234.722 |      172.93 | -61.7924 |      ms |
|                                  99th percentile service time | desc-sort-timestamp-after-force-merge-1-seg |    240.715 |     213.656 | -27.0583 |      ms |
|                                 100th percentile service time | desc-sort-timestamp-after-force-merge-1-seg |    242.962 |     265.244 |  22.2815 |      ms |
|                                                    error rate | desc-sort-timestamp-after-force-merge-1-seg |          0 |           0 |        0 |       % |
|                                                Min Throughput |  asc-sort-timestamp-after-force-merge-1-seg |   0.501561 |    0.501605 |    4e-05 |   ops/s |
|                                             Median Throughput |  asc-sort-timestamp-after-force-merge-1-seg |   0.501875 |    0.501926 |    5e-05 |   ops/s |
|                                                Max Throughput |  asc-sort-timestamp-after-force-merge-1-seg |   0.502342 |    0.502398 |    6e-05 |   ops/s |
|                                       50th percentile latency |  asc-sort-timestamp-after-force-merge-1-seg |    122.138 |     79.3329 | -42.8048 |      ms |
|                                       90th percentile latency |  asc-sort-timestamp-after-force-merge-1-seg |    129.641 |     83.0229 | -46.6184 |      ms |
|                                       99th percentile latency |  asc-sort-timestamp-after-force-merge-1-seg |     143.98 |     87.5347 | -56.4454 |      ms |
|                                      100th percentile latency |  asc-sort-timestamp-after-force-merge-1-seg |    145.287 |     87.9334 | -57.3537 |      ms |
|                                  50th percentile service time |  asc-sort-timestamp-after-force-merge-1-seg |    120.054 |     77.2313 | -42.8229 |      ms |
|                                  90th percentile service time |  asc-sort-timestamp-after-force-merge-1-seg |    127.579 |     80.8976 | -46.6814 |      ms |
|                                  99th percentile service time |  asc-sort-timestamp-after-force-merge-1-seg |    141.903 |      85.767 | -56.1359 |      ms |
|                                 100th percentile service time |  asc-sort-timestamp-after-force-merge-1-seg |    143.198 |     85.8129 |  -57.385 |      ms |
|                                                    error rate |  asc-sort-timestamp-after-force-merge-1-seg |          0 |           0 |        0 |       % |
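
The Diff column above is absolute (contender minus baseline). To compare metrics with different scales, a relative change can be derived from the same two columns. A small sketch using two rows copied from the table, assuming the baseline is G1GC and the contender is Parallel, as the heading order suggests; the helper name is hypothetical.

```python
def relative_change_percent(baseline: float, contender: float) -> float:
    """Relative change of contender versus baseline, in percent."""
    if baseline == 0:
        raise ValueError("relative change is undefined for a zero baseline")
    return (contender - baseline) / baseline * 100.0


# (metric, task, baseline, contender), values taken from the table above.
rows = [
    ("Median Throughput", "index-append", 163942, 170634),
    ("50th percentile latency", "asc_sort_timestamp", 62.5424, 32.2299),
]

for metric, task, baseline, contender in rows:
    change = relative_change_percent(baseline, contender)
    print(f"{metric} ({task}): {change:+.2f}%")
```

Rows with a zero baseline, such as Total Old Gen GC, have no meaningful relative change, which is why the helper rejects them instead of dividing by zero.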


nyc_taxis 1G heap G1GC vs Parallel:

|                                                        Metric |                Task |    Baseline |   Contender |     Diff |   Unit |
|--------------------------------------------------------------:|--------------------:|------------:|------------:|---------:|-------:|
|                    Cumulative indexing time of primary shards |                     |     237.357 |     208.174 | -29.1829 |    min |
|             Min cumulative indexing time across primary shard |                     |     237.357 |     208.174 | -29.1829 |    min |
|          Median cumulative indexing time across primary shard |                     |     237.357 |     208.174 | -29.1829 |    min |
|             Max cumulative indexing time across primary shard |                     |     237.357 |     208.174 | -29.1829 |    min |
|           Cumulative indexing throttle time of primary shards |                     |           0 |           0 |        0 |    min |
|    Min cumulative indexing throttle time across primary shard |                     |           0 |           0 |        0 |    min |
| Median cumulative indexing throttle time across primary shard |                     |           0 |           0 |        0 |    min |
|    Max cumulative indexing throttle time across primary shard |                     |           0 |           0 |        0 |    min |
|                       Cumulative merge time of primary shards |                     |     76.6012 |      73.989 | -2.61217 |    min |
|                      Cumulative merge count of primary shards |                     |         236 |         245 |        9 |        |
|                Min cumulative merge time across primary shard |                     |     76.6012 |      73.989 | -2.61217 |    min |
|             Median cumulative merge time across primary shard |                     |     76.6012 |      73.989 | -2.61217 |    min |
|                Max cumulative merge time across primary shard |                     |     76.6012 |      73.989 | -2.61217 |    min |
|              Cumulative merge throttle time of primary shards |                     |     16.5041 |     16.8715 |   0.3674 |    min |
|       Min cumulative merge throttle time across primary shard |                     |     16.5041 |     16.8715 |   0.3674 |    min |
|    Median cumulative merge throttle time across primary shard |                     |     16.5041 |     16.8715 |   0.3674 |    min |
|       Max cumulative merge throttle time across primary shard |                     |     16.5041 |     16.8715 |   0.3674 |    min |
|                     Cumulative refresh time of primary shards |                     |      1.4489 |     1.16713 | -0.28177 |    min |
|                    Cumulative refresh count of primary shards |                     |         105 |         104 |       -1 |        |
|              Min cumulative refresh time across primary shard |                     |      1.4489 |     1.16713 | -0.28177 |    min |
|           Median cumulative refresh time across primary shard |                     |      1.4489 |     1.16713 | -0.28177 |    min |
|              Max cumulative refresh time across primary shard |                     |      1.4489 |     1.16713 | -0.28177 |    min |
|                       Cumulative flush time of primary shards |                     |     2.11675 |     2.26642 |  0.14967 |    min |
|                      Cumulative flush count of primary shards |                     |          28 |          33 |        5 |        |
|                Min cumulative flush time across primary shard |                     |     2.11675 |     2.26642 |  0.14967 |    min |
|             Median cumulative flush time across primary shard |                     |     2.11675 |     2.26642 |  0.14967 |    min |
|                Max cumulative flush time across primary shard |                     |     2.11675 |     2.26642 |  0.14967 |    min |
|                                            Total Young Gen GC |                     |     151.055 |     101.125 |   -49.93 |      s |
|                                              Total Old Gen GC |                     |           0 |      44.269 |   44.269 |      s |
|                                                    Store size |                     |     25.3155 |     25.2885 | -0.02698 |     GB |
|                                                 Translog size |                     | 5.12227e-08 | 5.12227e-08 |        0 |     GB |
|                                        Heap used for segments |                     |      57.188 |     56.0933 | -1.09465 |     MB |
|                                      Heap used for doc values |                     |   0.0374985 |   0.0377731 |  0.00027 |     MB |
|                                           Heap used for terms |                     |     52.9576 |     51.8433 | -1.11429 |     MB |
|                                           Heap used for norms |                     |           0 |           0 |        0 |     MB |
|                                          Heap used for points |                     |           0 |           0 |        0 |     MB |
|                                   Heap used for stored fields |                     |     4.19288 |     4.21225 |  0.01937 |     MB |
|                                                 Segment count |                     |          32 |          34 |        2 |        |
|                                                Min Throughput |               index |       73110 |       81399 |  8289.03 | docs/s |
|                                             Median Throughput |               index |     76809.5 |     85367.3 |  8557.86 | docs/s |
|                                                Max Throughput |               index |     84554.2 |     94064.6 |  9510.38 | docs/s |
|                                       50th percentile latency |               index |     944.755 |     854.654 | -90.1011 |     ms |
|                                       90th percentile latency |               index |     1638.11 |     1517.59 | -120.521 |     ms |
|                                       99th percentile latency |               index |     3025.67 |     2792.63 |  -233.04 |     ms |
|                                     99.9th percentile latency |               index |     7393.48 |     5759.41 | -1634.06 |     ms |
|                                    99.99th percentile latency |               index |     11106.5 |      8920.5 | -2186.04 |     ms |
|                                      100th percentile latency |               index |     11211.6 |     9027.56 | -2184.04 |     ms |
|                                  50th percentile service time |               index |     944.755 |     854.654 | -90.1011 |     ms |
|                                  90th percentile service time |               index |     1638.11 |     1517.59 | -120.521 |     ms |
|                                  99th percentile service time |               index |     3025.67 |     2792.63 |  -233.04 |     ms |
|                                99.9th percentile service time |               index |     7393.48 |     5759.41 | -1634.06 |     ms |
|                               99.99th percentile service time |               index |     11106.5 |      8920.5 | -2186.04 |     ms |
|                                 100th percentile service time |               index |     11211.6 |     9027.56 | -2184.04 |     ms |
|                                                    error rate |               index |           0 |           0 |        0 |      % |
|                                                Min Throughput |             default |     3.01982 |     3.01976 |   -7e-05 |  ops/s |
|                                             Median Throughput |             default |     3.02937 |     3.02932 |   -5e-05 |  ops/s |
|                                                Max Throughput |             default |      3.0573 |     3.05632 | -0.00098 |  ops/s |
|                                       50th percentile latency |             default |      10.227 |     10.3453 |   0.1182 |     ms |
|                                       90th percentile latency |             default |        11.5 |      11.346 | -0.15395 |     ms |
|                                       99th percentile latency |             default |     13.8499 |     12.6521 | -1.19781 |     ms |
|                                      100th percentile latency |             default |     13.8753 |      13.911 |  0.03564 |     ms |
|                                  50th percentile service time |             default |     9.70619 |     9.82953 |  0.12333 |     ms |
|                                  90th percentile service time |             default |     10.9778 |     10.8312 | -0.14668 |     ms |
|                                  99th percentile service time |             default |     13.3356 |     12.1355 | -1.20017 |     ms |
|                                 100th percentile service time |             default |     13.3526 |     13.4993 |  0.14666 |     ms |
|                                                    error rate |             default |           0 |           0 |        0 |      % |
|                                                Min Throughput |               range |     1.00262 |     1.00283 |  0.00021 |  ops/s |
|                                             Median Throughput |               range |     1.00393 |     1.00427 |  0.00034 |  ops/s |
|                                                Max Throughput |               range |     1.00876 |     1.00839 | -0.00037 |  ops/s |
|                                       50th percentile latency |               range |     609.087 |     573.153 | -35.9342 |     ms |
|                                       90th percentile latency |               range |     613.388 |     580.881 | -32.5071 |     ms |
|                                       99th percentile latency |               range |     637.052 |     586.129 | -50.9224 |     ms |
|                                      100th percentile latency |               range |     642.315 |     608.689 | -33.6253 |     ms |
|                                  50th percentile service time |               range |     608.499 |     572.526 | -35.9734 |     ms |
|                                  90th percentile service time |               range |     612.795 |     580.258 | -32.5368 |     ms |
|                                  99th percentile service time |               range |     636.462 |     585.509 | -50.9529 |     ms |
|                                 100th percentile service time |               range |     641.723 |     608.066 |  -33.657 |     ms |
|                                                    error rate |               range |           0 |           0 |        0 |      % |
|                                                Min Throughput | distance_amount_agg |      2.0134 |     2.01334 |   -6e-05 |  ops/s |
|                                             Median Throughput | distance_amount_agg |     2.01994 |     2.01995 |    1e-05 |  ops/s |
|                                                Max Throughput | distance_amount_agg |     2.03945 |     2.03944 |   -1e-05 |  ops/s |
|                                       50th percentile latency | distance_amount_agg |     6.47976 |     6.39307 | -0.08669 |     ms |
|                                       90th percentile latency | distance_amount_agg |     6.94396 |     6.70241 | -0.24155 |     ms |
|                                       99th percentile latency | distance_amount_agg |     7.78218 |     7.41149 | -0.37069 |     ms |
|                                      100th percentile latency | distance_amount_agg |     7.79633 |     11.7932 |  3.99688 |     ms |
|                                  50th percentile service time | distance_amount_agg |     5.79209 |     5.70696 | -0.08513 |     ms |
|                                  90th percentile service time | distance_amount_agg |       6.257 |     6.01192 | -0.24508 |     ms |
|                                  99th percentile service time | distance_amount_agg |     7.09178 |     6.71757 | -0.37421 |     ms |
|                                 100th percentile service time | distance_amount_agg |     7.11111 |     11.1103 |  3.99915 |     ms |
|                                                    error rate | distance_amount_agg |           0 |           0 |        0 |      % |
|                                                Min Throughput |       autohisto_agg |     1.50269 |     1.50184 | -0.00085 |  ops/s |
|                                             Median Throughput |       autohisto_agg |     1.50467 |     1.50336 | -0.00131 |  ops/s |
|                                                Max Throughput |       autohisto_agg |     1.50976 |     1.50661 | -0.00315 |  ops/s |
|                                       50th percentile latency |       autohisto_agg |     446.781 |     517.428 |  70.6471 |     ms |
|                                       90th percentile latency |       autohisto_agg |     481.706 |     541.319 |  59.6129 |     ms |
|                                       99th percentile latency |       autohisto_agg |     492.219 |     558.418 |  66.1988 |     ms |
|                                      100th percentile latency |       autohisto_agg |     493.117 |     560.282 |  67.1646 |     ms |
|                                  50th percentile service time |       autohisto_agg |      446.39 |     517.088 |  70.6973 |     ms |
|                                  90th percentile service time |       autohisto_agg |     481.297 |     540.975 |   59.678 |     ms |
|                                  99th percentile service time |       autohisto_agg |     491.891 |     558.076 |  66.1854 |     ms |
|                                 100th percentile service time |       autohisto_agg |       492.7 |     559.941 |  67.2413 |     ms |
|                                                    error rate |       autohisto_agg |           0 |           0 |        0 |      % |
|                                                Min Throughput |  date_histogram_agg |     1.50311 |     1.50189 | -0.00122 |  ops/s |
|                                             Median Throughput |  date_histogram_agg |     1.50456 |     1.50343 | -0.00113 |  ops/s |
|                                                Max Throughput |  date_histogram_agg |     1.50859 |     1.50718 | -0.00141 |  ops/s |
|                                       50th percentile latency |  date_histogram_agg |     455.839 |     507.236 |  51.3972 |     ms |
|                                       90th percentile latency |  date_histogram_agg |     491.334 |     540.334 |  48.9998 |     ms |
|                                       99th percentile latency |  date_histogram_agg |      497.47 |     547.114 |  49.6444 |     ms |
|                                      100th percentile latency |  date_histogram_agg |     497.498 |      549.48 |  51.9821 |     ms |
|                                  50th percentile service time |  date_histogram_agg |     455.453 |     506.898 |  51.4447 |     ms |
|                                  90th percentile service time |  date_histogram_agg |     490.931 |     539.977 |  49.0459 |     ms |
|                                  99th percentile service time |  date_histogram_agg |     497.062 |     546.891 |  49.8297 |     ms |
|                                 100th percentile service time |  date_histogram_agg |     497.082 |     549.136 |  52.0533 |     ms |
|                                                    error rate |  date_histogram_agg |           0 |           0 |        0 |      % |


@ebadyano
Contributor

1G heap experiments:
http_logs: with Parallel GC, indexing throughput is ~4% higher than with G1GC. Latency is the same for most queries, except for the *force-merge-1-seg queries, where latency is 35% better.
nyc_taxis: query latency is more or less the same, and indexing throughput is 11% better.
It seems Parallel GC shows an advantage over G1GC on smaller heaps, but even with indices.breaker.total.limit: 99.9% we still sometimes trigger the circuit breaker with Parallel GC, while we do not with G1GC. @danielmitterdorfer Do we want to investigate further if we want to use Parallel on smaller heaps?
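
For anyone who wants to re-derive the percentages from the tables, here is a minimal sketch of the arithmetic. It assumes baseline = G1GC and contender = Parallel GC, as the table title suggests. The function names and the selection of rows are illustrative only and are not part of Rally.

```python
# Values copied from the "nyc_taxis 1G heap G1GC vs Parallel" table
# (assumed: baseline = G1GC, contender = Parallel GC).
ROWS = {
    "index median throughput (docs/s)": (76809.5, 85367.3),
    "index 99th percentile latency (ms)": (3025.67, 2792.63),
    "autohisto_agg 50th percentile latency (ms)": (446.781, 517.428),
    "date_histogram_agg 50th percentile latency (ms)": (455.839, 507.236),
}


def relative_change(baseline: float, contender: float) -> float:
    """Return the contender's change relative to the baseline, in percent."""
    return (contender - baseline) / baseline * 100.0


def summarize(rows):
    """Map each metric name to its relative change, rounded to one decimal."""
    return {name: round(relative_change(b, c), 1) for name, (b, c) in rows.items()}


if __name__ == "__main__":
    for name, pct in summarize(ROWS).items():
        print(f"{name}: {pct:+.1f}%")
```

For the median indexing throughput this gives roughly 11%, which matches the summary above. Note that for throughput a positive change is an improvement, while for latency a negative change is; in this run the two histogram aggregations come out roughly 11 to 16% slower with Parallel GC.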

@danielmitterdorfer
Member Author

> @danielmitterdorfer Do we want to investigate further if we want to use Parallel on smaller heaps?

Thanks for doing these experiments, Evgenia! IMHO we should be good for now with what we have, and we can tackle this question separately at a later point. I'd expect that ParallelGC would also require thorough investigation on other fronts to ensure cluster stability, as this collector works quite differently from CMS and G1 (it's blocking instead of concurrent).

@ebadyano ebadyano closed this as completed Jan 8, 2020
SivagurunathanV pushed a commit to SivagurunathanV/elasticsearch that referenced this issue Jan 23, 2020
jrodewig added a commit that referenced this issue Dec 2, 2020
jrodewig added a commit that referenced this issue Dec 2, 2020
2lambda123 pushed a commit to 2lambda123/elastic-elasticsearch that referenced this issue May 2, 2024
2lambda123 pushed a commit to 2lambda123/elastic-elasticsearch that referenced this issue May 3, 2024