perf: Tokio compat #1922

Merged
merged 6 commits into from
Mar 18, 2020


MOZGIII
Contributor

@MOZGIII MOZGIII commented Feb 25, 2020

This PR follows the proposal for #1142

It switches our runtime to tokio 0.2 with very few changes, and the value in this is that we can run benchmarks and performance tests to ensure there's no degradation from upgrading to the new tokio reactor.

Addresses #1695 and #1696.

Current state (updated as we go):

There's a tokio-compat-debug branch where I keep a throwaway version of the code, altered with extensive debugging. I only use it to run the code against CI, since my local setup doesn't reproduce the issues. That branch isn't supposed to be merged; we'll just take the end results from it, if there are any.

@MOZGIII
Contributor Author

MOZGIII commented Feb 26, 2020

I've got a test that failed once, and didn't fail the second time I ran cargo test:

     Running target/debug/deps/buffering-769c046a37c114ce

running 4 tests
test test_reclaim_disk_space ... ignored
test test_buffering ... ok
test test_max_size ... FAILED
test test_max_size_resume ... ok

failures:

---- test_max_size stdout ----
thread 'test_max_size' panicked at 'assertion failed: `(left == right)`
  left: `500`,
 right: `0`', tests/buffering.rs:182:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.


failures:
    test_max_size

test result: FAILED. 2 passed; 1 failed; 1 ignored; 0 measured; 0 filtered out

error: test failed, to rerun pass '--test buffering'

This is going to be fun.

@MOZGIII MOZGIII force-pushed the tokio-compat branch 2 times, most recently from 7f6637a to 647b5d1 Compare February 26, 2020 15:14
@LucioFranco
Contributor

@MOZGIII the max size test does fail occasionally but we should investigate that.

As for the statsd test, we should enable tracing in CI via crate::test_util::trace_init and TEST_LOG=debug; that should show us the error it's failing on.

@MOZGIII
Contributor Author

MOZGIII commented Feb 26, 2020

@MOZGIII the max size test does fail occasionally but we should investigate that.

Ok! For the purposes of this PR I'm ignoring it.

As for the statsd test, we should enable tracing in CI via crate::test_util::trace_init and TEST_LOG=debug; that should show us the error it's failing on.

Do you want this only for this branch, or everywhere?

@MOZGIII
Contributor Author

MOZGIII commented Feb 26, 2020

So, the plan for this PR is the following:

After that, we should be ready to merge!

@MOZGIII
Contributor Author

MOZGIII commented Feb 26, 2020

#1935 this is a candidate for merging in after we fix file sink.

@LucioFranco
Contributor

Do you want this only for this branch, or everywhere?

I think it's fine to add the trace init anywhere we have issues; it's also fine to leave it in, since it can be useful.

@MOZGIII
Contributor Author

MOZGIII commented Feb 26, 2020

Now sinks::statsd::test::test_send_to_statsd passes in CI 😕

@MOZGIII
Contributor Author

MOZGIII commented Feb 27, 2020

sinks::statsd::test::test_send_to_statsd failed again! I restarted the CI job manually, and it got stuck this time.

Successful run: https://circleci.com/gh/timberio/vector/86075
Failed run: https://circleci.com/gh/timberio/vector/86365 (I manually retriggered the successful run)

@MOZGIII
Contributor Author

MOZGIII commented Feb 27, 2020

I got lucky and reproduced the fds error locally.

Found 2 outliers among 10 measurements (20.00%)
  2 (20.00%) high mild

Benchmarking interconnected: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 15.9s.
Benchmarking interconnected: Collecting 10 samples in estimated 15.912 s (55 iterations)thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Other, message: "Too many open files" }', src/libcore/result.rs:1188:5
stack backtrace:
   0: backtrace::backtrace::libunwind::trace
             at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.40/src/backtrace/libunwind.rs:88
   1: backtrace::backtrace::trace_unsynchronized
             at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.40/src/backtrace/mod.rs:66
   2: std::sys_common::backtrace::_print_fmt
             at src/libstd/sys_common/backtrace.rs:84
   3: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
             at src/libstd/sys_common/backtrace.rs:61
   4: core::fmt::write
             at src/libcore/fmt/mod.rs:1025
   5: std::io::Write::write_fmt
             at src/libstd/io/mod.rs:1426
   6: std::sys_common::backtrace::_print
             at src/libstd/sys_common/backtrace.rs:65
   7: std::sys_common::backtrace::print
             at src/libstd/sys_common/backtrace.rs:50
   8: std::panicking::default_hook::{{closure}}
             at src/libstd/panicking.rs:193
   9: std::panicking::default_hook
             at src/libstd/panicking.rs:210
  10: std::panicking::rust_panic_with_hook
             at src/libstd/panicking.rs:471
  11: rust_begin_unwind
             at src/libstd/panicking.rs:375
  12: core::panicking::panic_fmt
             at src/libcore/panicking.rs:84
  13: core::result::unwrap_failed
             at src/libcore/result.rs:1188
  14: vector::test_util::count_receive
  15: bench::benchmark_interconnected::{{closure}}::{{closure}}
             at benches/bench.rs:300
  16: criterion::Bencher<M>::iter_batched
             at /home/mozgiii/.cargo/registry/src/github.com-1ecc6299db9ec823/criterion-0.3.1/src/lib.rs:504
  17: criterion::Bencher<M>::iter_with_setup
             at /home/mozgiii/.cargo/registry/src/github.com-1ecc6299db9ec823/criterion-0.3.1/src/lib.rs:393
  18: bench::benchmark_interconnected::{{closure}}
             at benches/bench.rs:271
  19: criterion::benchmark::Benchmark<M>::with_function::{{closure}}
             at /home/mozgiii/.cargo/registry/src/github.com-1ecc6299db9ec823/criterion-0.3.1/src/benchmark.rs:274
  20: <criterion::routine::Function<M,F,T> as criterion::routine::Routine<M,T>>::bench::{{closure}}
             at /home/mozgiii/.cargo/registry/src/github.com-1ecc6299db9ec823/criterion-0.3.1/src/routine.rs:206
  21: core::iter::adapters::map_fold::{{closure}}
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/libcore/iter/adapters/mod.rs:704
  22: core::iter::traits::iterator::Iterator::fold::ok::{{closure}}
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/libcore/iter/traits/iterator.rs:1829
  23: core::iter::traits::iterator::Iterator::try_fold
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/libcore/iter/traits/iterator.rs:1710
  24: core::iter::traits::iterator::Iterator::fold
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/libcore/iter/traits/iterator.rs:1832
  25: <core::iter::adapters::Map<I,F> as core::iter::traits::iterator::Iterator>::fold
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/libcore/iter/adapters/mod.rs:737
  26: core::iter::traits::iterator::Iterator::for_each
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/libcore/iter/traits/iterator.rs:632
  27: <alloc::vec::Vec<T> as alloc::vec::SpecExtend<T,I>>::spec_extend
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/liballoc/vec.rs:2041
  28: <alloc::vec::Vec<T> as alloc::vec::SpecExtend<T,I>>::from_iter
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/liballoc/vec.rs:2024
  29: <alloc::vec::Vec<T> as core::iter::traits::collect::FromIterator<T>>::from_iter
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/liballoc/vec.rs:1911
  30: core::iter::traits::iterator::Iterator::collect
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/libcore/iter/traits/iterator.rs:1494
  31: <criterion::routine::Function<M,F,T> as criterion::routine::Routine<M,T>>::bench
             at /home/mozgiii/.cargo/registry/src/github.com-1ecc6299db9ec823/criterion-0.3.1/src/routine.rs:202
  32: criterion::routine::Routine::sample
             at /home/mozgiii/.cargo/registry/src/github.com-1ecc6299db9ec823/criterion-0.3.1/src/routine.rs:134
  33: criterion::analysis::common
             at /home/mozgiii/.cargo/registry/src/github.com-1ecc6299db9ec823/criterion-0.3.1/src/analysis/mod.rs:105
  34: <criterion::benchmark::Benchmark<M> as criterion::benchmark::BenchmarkDefinition<M>>::run
             at /home/mozgiii/.cargo/registry/src/github.com-1ecc6299db9ec823/criterion-0.3.1/src/benchmark.rs:323
  35: criterion::Criterion<M>::bench
             at /home/mozgiii/.cargo/registry/src/github.com-1ecc6299db9ec823/criterion-0.3.1/src/lib.rs:1506
  36: bench::benchmark_interconnected
             at benches/bench.rs:268
  37: bench::benches
             at ./<::criterion::macros::criterion_group macros>:7
  38: bench::main
             at ./<::criterion::macros::criterion_main macros>:5
  39: std::rt::lang_start::{{closure}}
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/libstd/rt.rs:67
  40: std::rt::lang_start_internal::{{closure}}
             at src/libstd/rt.rs:52
  41: std::panicking::try::do_call
             at src/libstd/panicking.rs:292
  42: __rust_maybe_catch_panic
             at src/libpanic_unwind/lib.rs:78
  43: std::panicking::try
             at src/libstd/panicking.rs:270
  44: std::panic::catch_unwind
             at src/libstd/panic.rs:394
  45: std::rt::lang_start_internal
             at src/libstd/rt.rs:51
  46: main
  47: __libc_start_main
  48: _start
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
error: bench failed

I looked at the bench, and it looks like the bench itself is causing the error, not tokio. But I don't feel certain about it, please confirm if you too think the bench code is problematic.

@MOZGIII
Contributor Author

MOZGIII commented Feb 27, 2020

I got to a point where I see some consistency: the bench on the tokio-compat branch fails at interconnected, as seen above. I spun up an EC2 VM, and there the bench command fails immediately after start, at a different bench.

@LucioFranco
Contributor

14: vector::test_util::count_receive looks like it's failing to create a TcpConnection?

@LucioFranco
Contributor

sinks::statsd::test::test_send_to_statsd failed again! I restarted the CI job manually, and it got stuck this time.

This might have to do with CI noisiness?

@MOZGIII
Contributor Author

MOZGIII commented Feb 28, 2020

sinks::statsd::test::test_send_to_statsd failed again! I restarted the CI job manually, and it got stuck this time.

This might have to do with CI noisiness?

Locally I have observed the same before - sometimes it worked, sometimes it failed. Can't reproduce it locally anymore. 😞

14: vector::test_util::count_receive looks like it's failing to create a TcpConnection?

It is the reason the process crashes, however it's not necessarily the root cause. Something else might be eating up all the fds.
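One way to check whether something is eating up descriptors is to sample the process's open fd count before and after a batch of iterations. The following is only a rough sketch under stated assumptions: it is Linux-only (it reads /proc/self/fd), the names `open_fd_count` and `fd_growth` are hypothetical helpers rather than anything from Vector or tokio, and plain file handles stand in for whatever resource is actually leaking.

```rust
use std::fs::{self, File};
use std::io;

/// Counts the file descriptors currently open in this process.
/// Linux-only: relies on the /proc/self/fd directory.
fn open_fd_count() -> io::Result<usize> {
    // Each entry in /proc/self/fd is one open descriptor. The directory
    // handle used for the listing is counted too, but it adds the same
    // constant to every sample, so differences stay meaningful.
    Ok(fs::read_dir("/proc/self/fd")?.count())
}

/// Runs `iteration` `rounds` times and returns how many descriptors were
/// gained (positive) or released (negative) over the whole run.
fn fd_growth<F: FnMut()>(rounds: usize, mut iteration: F) -> io::Result<i64> {
    let before = open_fd_count()? as i64;
    for _ in 0..rounds {
        iteration();
    }
    let after = open_fd_count()? as i64;
    Ok(after - before)
}

fn main() -> io::Result<()> {
    // A well-behaved iteration: the handle is dropped (and closed) each round.
    let clean = fd_growth(50, || {
        let _file = File::open("/proc/self/status").unwrap();
    })?;

    // A leaky iteration: handles are kept alive, as a stand-in for the leak.
    let mut kept = Vec::new();
    let leaky = fd_growth(50, || {
        kept.push(File::open("/proc/self/status").unwrap());
    })?;

    println!("clean growth: {clean}, leaky growth: {leaky}");
    Ok(())
}
```

A growth that scales with the number of rounds points at a leak inside the iteration; a growth of zero suggests the crash site is only where the limit happens to be hit.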

@MOZGIII
Contributor Author

MOZGIII commented Feb 28, 2020

I ran the process under the debugger properly. This is the lsof capture taken right when it panics.
lsof.txt

Definitely looks like it's the pipes that are leaking.

Looking at the output of the bench command, this makes a lot of sense.

Benchmarking pipe: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 10.0s.
pipe                    time:   [172.13 ms 175.84 ms 179.82 ms]               
                        thrpt:  [53.036 MiB/s 54.236 MiB/s 55.403 MiB/s]
                 change:
                        time:   [-0.1389% +2.7502% +5.6913%] (p = 0.10 > 0.05)
                        thrpt:  [-5.3848% -2.6766% +0.1391%]
                        No change in performance detected.

Benchmarking pipe_with_tiny_lines: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 8.5s.
pipe_with_tiny_lines    time:   [149.74 ms 151.97 ms 154.53 ms]                               
                        thrpt:  [631.95 KiB/s 642.60 KiB/s 652.15 KiB/s]
                 change:
                        time:   [-1.6871% +0.6380% +3.0342%] (p = 0.63 > 0.05)
                        thrpt:  [-2.9449% -0.6340% +1.7160%]
                        No change in performance detected.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high severe

Benchmarking pipe_with_huge_lines: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 55.5s.
pipe_with_huge_lines    time:   [973.54 ms 987.96 ms 1.0061 s]                                
                        thrpt:  [189.58 MiB/s 193.06 MiB/s 195.92 MiB/s]
                 change:
                        time:   [+12.523% +14.183% +16.047%] (p = 0.00 < 0.05)
                        thrpt:  [-13.828% -12.421% -11.130%]
                        Performance has regressed.

Benchmarking pipe_with_many_writers: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 8.3s.
pipe_with_many_writers  time:   [148.02 ms 151.04 ms 154.03 ms]                                 
                        thrpt:  [61.915 MiB/s 63.141 MiB/s 64.428 MiB/s]
                 change:
                        time:   [+0.9937% +4.4664% +7.3335%] (p = 0.02 < 0.05)
                        thrpt:  [-6.8324% -4.2754% -0.9839%]
                        Change within noise threshold.

Benchmarking interconnected: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 15.6s.
Benchmarking interconnected: Collecting 10 samples in estimated 15.648 s (55 iterations)thread 'tokio-runtime-worker' panicked at 'Os { code: 24, kind: Other, message: "Too many open files" }', src/test_util.rs:362:22
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.

This doesn't necessarily mean the pipes are special - maybe TCP sockets leak too, but at least with pipes we could observe this. I'll try to test whether it's specifically the pipes that cause us these issues.
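To tell pipes apart from sockets without an external lsof run, the same breakdown can be approximated from inside the process. This is a minimal sketch, assuming Linux, where each entry under /proc/self/fd is a symlink whose target names the kind of descriptor; `fds_by_kind` and `classify` are hypothetical names used for illustration only.

```rust
use std::collections::BTreeMap;
use std::fs;
use std::io;

/// Maps a /proc/self/fd link target to a coarse descriptor kind.
fn classify(target: &str) -> &'static str {
    // Targets look like "pipe:[1234]", "socket:[5678]",
    // "anon_inode:[eventpoll]", or a plain filesystem path.
    if target.starts_with("pipe:") {
        "pipe"
    } else if target.starts_with("socket:") {
        "socket"
    } else if target.starts_with("anon_inode:") {
        "anon_inode"
    } else {
        "file"
    }
}

/// Groups this process's open descriptors by kind. Linux-only.
fn fds_by_kind() -> io::Result<BTreeMap<String, usize>> {
    let mut counts = BTreeMap::new();
    for entry in fs::read_dir("/proc/self/fd")? {
        let path = entry?.path();
        let target = match fs::read_link(&path) {
            Ok(t) => t.to_string_lossy().into_owned(),
            // The descriptor was closed while we were listing; skip it.
            Err(_) => continue,
        };
        *counts.entry(classify(&target).to_string()).or_insert(0) += 1;
    }
    Ok(counts)
}

fn main() -> io::Result<()> {
    for (kind, count) in fds_by_kind()? {
        println!("{kind:<12} {count}");
    }
    Ok(())
}
```

Printing this breakdown at a few points during the bench would show which kind keeps growing, which is the question the lsof capture answers after the fact.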

@MOZGIII
Contributor Author

MOZGIII commented Feb 28, 2020

@LucioFranco I have this view:

image

Would it be useful to us to team up and have a call to look at the stacks? You know what to expect from tokio, maybe you'll be able to spot something that's off.

@MOZGIII
Contributor Author

MOZGIII commented Feb 28, 2020

I managed to isolate the issue! 🎉 https://github.com/timberio/tokio-fds-issue

The code in that repo demonstrates exactly the same behavior:

  • with tokio-compat with enable_all it drains fds and crashes;
  • with tokio-compat patched with enable_time it stabilizes at 98 fds and completes successfully.

@MOZGIII
Contributor Author

MOZGIII commented Feb 28, 2020

Created tokio-rs/tokio-compat#27

@LucioFranco
Contributor

@MOZGIII sounds good, feel free to set this PR to ready for review and I'll give it a review :)

@binarylogic
Contributor

Very happy about that. Thanks for pushing through. Looking forward to finishing off this migration.

@MOZGIII
Contributor Author

MOZGIII commented Mar 17, 2020

I think we should double-check the data we have for 0.8.2 in the test harness archive. It looks like it was collected for some other version, because it shows stats that are way too high compared with my 0.8.x runs.

I also think we should change our approach to these benchmarking tests. Switching to a bare-metal server should produce more consistent results. @Hoverbear mentioned PingCap have had issues with benching in the cloud. I can second that: I've also had inconsistencies when benching on different on-premise datacenter nodes, so I agree that using a single bare-metal server (or a couple) for benches is a better way.

@github-actions

github-actions bot commented Mar 17, 2020

Great PR! Please pay attention to the following items before merging:

Files matching Cargo.lock:

  • Has at least one other team member approved the dependency changes?

Files matching src/**:

  • For each failure path, is there sufficient context logged for users to investigate the issue?
  • Do the tests ensure that behavior is sane for inputs that don't meet normal assumptions (e.g. missing field, non-string, etc)?
  • Did you add adequate documentation?

This is an automatically generated QA checklist based on modified files

@MOZGIII MOZGIII marked this pull request as ready for review March 17, 2020 22:04
@MOZGIII MOZGIII requested a review from a user March 17, 2020 22:04
@binarylogic
Contributor

binarylogic commented Mar 17, 2020

We can discuss that separately. I personally think that is overkill for our current needs. I haven't seen large inconsistencies in the results. If you can demonstrate the problem then that can serve as the basis for the discussion. Let's work with actual data and examples.

The reason the test harness didn't contain the results is that the partitions were not discovered. This requires an MSCK REPAIR... query which takes multiple minutes to run since it scans all of S3. Hence why we don't run it every time. There are problems running this query immediately after a bin/test because tests are run on a separate AWS account. Setting all of that aside, here are the results after discovering the partitions:

➜ aws-vault exec vector -- ./bin/compare -s vector -t tcp_to_tcp_performance -v dev-tokio-compat-7-c3edeb6-1 -v 0.8.2

                                   __   __  __
                                   \ \ / / / /
                                    \ V / / /
                                     \_/  \/

                                   V E C T O R

--------------------------------------------------------------------------------
Test Comparison
Test name: tcp_to_tcp_performance
Test configuration: default
Subject: vector
Versions: dev-tokio-compat-7-c3edeb6-1 0.8.2
--------------------------------------------------------------------------------
Metric          | 0.8.2              | dev-tokio-compat-7-c3edeb6-1
----------------|--------------------|-----------------------------
Test count      | 2                  | 1                           
Duration (avg)  | 72.5s              | 69s                         
Duration (max)  | 73s                | 69s                         
CPU sys (max)   | 5.8 W              | 7.6 (+32%)                  
CPU usr (max)   | 96.5 (+342%)       | 21.8 W                      
Load 1m (avg)   | 0.9 (+446%)        | 0.2 W                       
Mem used (max)  | 194.6 MiB W        | 3.3 gib (+1646%)            
Disk read (avg) | 500.3 kib/s W      | 55.3 kib/s (-88%)           
Disk read (sum) | 35.4 MiB (+850%)   | 3.7 MiB W                   
Disk writ (sum) | 7.9 MiB (+381410%) | 2.1 kib W                   
Net recv (avg)  | 43.3 MiB/s W       | 244.4 kib/s (-99%)          
Net recv (sum)  | 3.1 gib W          | 16.5 MiB (-99%)             
Net send (sum)  | 3.1 gib            | 51 b                        
TCP estab (avg) | 441                | 256                         
TCP syn (avg)   | 1                  | 29                          
TCP close (avg) | 2                  | 140                         
--------------------------------------------------------------------------------
W = winner
vector = 0.8.2
vector = dev-tokio-compat-7-c3edeb6-1                    

This is quite a bit slower. I'm going to try and test a result off of this branch.

@MOZGIII
Contributor Author

MOZGIII commented Mar 17, 2020

We can discuss that separately.

I agree!

Net recv (avg)  | 43.3 MiB/s W       | 244.4 kib/s (-99%)          
Net recv (sum)  | 3.1 gib W          | 16.5 MiB (-99%)             
Net send (sum)  | 3.1 gib            | 51 b  

Oh wow, clearly dev-tokio-compat-7-c3edeb6-1 wasn't in a good shape. Is that the comparison to the very first one we ran? Actually, I think this dev-tokio-compat-7-c3edeb6-1 is the version I just ran manually (hence the -1 at the end).

@binarylogic
Contributor

Is this the correct version?

•8% ➜ aws-vault exec vector -- ./bin/compare -s vector -t tcp_to_tcp_performance -v dev-vector-pr-dummy-4e51cba-1 -v 0.8.2

                                   __   __  __
                                   \ \ / / / /
                                    \ V / / /
                                     \_/  \/

                                   V E C T O R

--------------------------------------------------------------------------------
Test Comparison
Test name: tcp_to_tcp_performance
Test configuration: default
Subject: vector
Versions: dev-vector-pr-dummy-4e51cba-1 0.8.2
--------------------------------------------------------------------------------
Metric          | 0.8.2            | dev-vector-pr-dummy-4e51cba-1
----------------|------------------|------------------------------
Test count      | 2                | 2                            
Duration (avg)  | 72.5s            | 61s                          
Duration (max)  | 73s              | 61s                          
CPU sys (max)   | 5.8 W            | 6.9 (+19%)                   
CPU usr (max)   | 96.5 W           | 97 (+0%)                     
Load 1m (avg)   | 0.9 W            | 1.1 (+18%)                   
Mem used (max)  | 194.6 MiB (+0%)  | 193.9 MiB W                  
Disk read (avg) | 500.3 kib/s W    | 445.5 kib/s (-10%)           
Disk read (sum) | 35.4 MiB (+33%)  | 26.5 MiB W                   
Disk writ (sum) | 7.9 MiB W        | 23.4 MiB (+194%)             
Net recv (avg)  | 43.3 MiB/s (-1%) | 44.2 MiB/s W                 
Net recv (sum)  | 3.1 gib W        | 2.6 gib (-14%)               
Net send (sum)  | 3.1 gib          | 2.6 gib                      
TCP estab (avg) | 441              | 436                          
TCP syn (avg)   | 1                | 0                            
TCP close (avg) | 2                | 2                            
--------------------------------------------------------------------------------
W = winner

@MOZGIII
Contributor Author

MOZGIII commented Mar 17, 2020

No, that would be a run from vectordotdev/vector-test-harness-github-actions-test-repo#3

@MOZGIII
Contributor Author

MOZGIII commented Mar 17, 2020

After rebase:

dstat net recv:
  tokio-01-check-release        3206964633600
  tokio-01-check                3209309069312
  tokio-02-check-release        3206584020992
  tokio-02-check                3207505895424
  tokio-compat-check-release    3206339022848
  tokio-compat-check            3207427837952
  vector-tokio-01-release       3206977159168
  vector-tokio-01               3208219553792
  vector-tokio-compat-release   3206756052992
  vector-tokio-compat           3207896813568
dstat net tcp syn:
  tokio-01-check-release        0
  tokio-01-check                0
  tokio-02-check-release        0
  tokio-02-check                0
  tokio-compat-check-release    0
  tokio-compat-check            1
  vector-tokio-01-release       0
  vector-tokio-01               0
  vector-tokio-compat-release   0
  vector-tokio-compat           0
tcp test server message_count:
  tokio-01-check-release        606642928
  tokio-01-check                601513349
  tokio-02-check-release        610203759
  tokio-02-check                602828073
  tokio-compat-check-release    610267357
  tokio-compat-check            599830882
  vector-tokio-01-release       62751651
  vector-tokio-01               2806304
  vector-tokio-compat-release   81763730
  vector-tokio-compat           3212955

results.tar.gz
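To make the message_count rows easier to compare, the relative change between the tokio 0.1 and tokio-compat runs can be computed directly. This is just an illustrative sketch: `percent_change` is a hypothetical helper, and the figures are copied from the listing above.

```rust
/// Relative change from `base` to `new`, in percent.
fn percent_change(base: u64, new: u64) -> f64 {
    (new as f64 - base as f64) / base as f64 * 100.0
}

fn main() {
    // (label, tokio-01 message_count, tokio-compat message_count),
    // taken from the "tcp test server message_count" listing.
    let rows = [
        ("vector-*-release", 62_751_651u64, 81_763_730u64),
        ("vector-*", 2_806_304, 3_212_955),
        ("*-check-release", 606_642_928, 610_267_357),
        ("*-check", 601_513_349, 599_830_882),
    ];
    for (label, base, new) in rows {
        println!("{label:<18} {:+.1}%", percent_change(base, new));
    }
}
```

By this measure the vector-tokio-compat-release run handled roughly 30% more messages than vector-tokio-01-release, while the check runs differ by well under 1%.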

Contributor

@LucioFranco LucioFranco left a comment

code looks 👍, just need to rebase against master and we can merge!

@MOZGIII
Contributor Author

MOZGIII commented Mar 18, 2020

I'm about to merge this!

Yesterday the test harness was broken, but it's now fixed and I've gathered some results with it too for completeness.

$ aws-vault exec vector -- bin/compare -t tcp_to_tcp_performance -c default -s vector -v dev-tokio-compat-base-mike-1 -v dev-tokio-compat-mike-1 -r

                                   __   __  __
                                   \ \ / / / /
                                    \ V / / /
                                     \_/  \/

                                   V E C T O R

--------------------------------------------------------------------------------
Test Comparison
Test name: tcp_to_tcp_performance
Test configuration: default
Subject: vector
Versions: dev-tokio-compat-base-mike-1 dev-tokio-compat-mike-1
--------------------------------------------------------------------------------
Metric          | dev-tokio-compat-base-mike-1 | dev-tokio-compat-mike-1
----------------|------------------------------|------------------------
Test count      | 1                            | 1                      
Duration (avg)  | 68s                          | 68s                    
Duration (max)  | 68s                          | 68s                    
CPU sys (max)   | 4 W                          | 4.4 (+10%)             
CPU usr (max)   | 97.5 (+22%)                  | 79.5 W                 
Load 1m (avg)   | 0.9 (+7%)                    | 0.9 W                  
Mem used (max)  | 207.8 MiB (+6%)              | 195.3 MiB W            
Disk read (avg) | 523.4 kib/s W                | 513.8 kib/s (-1%)      
Disk read (sum) | 34.8 MiB (+1%)               | 34.1 MiB W             
Disk writ (sum) | 4.4 MiB (+44%)               | 3.1 MiB W              
Net recv (avg)  | 28 MiB/s W                   | 23.4 MiB/s (-16%)      
Net recv (sum)  | 1.9 gib W                    | 1.6 gib (-16%)         
Net send (sum)  | 1.8 gib                      | 1.6 gib                
TCP estab (avg) | 435                          | 435                    
TCP syn (avg)   | 0                            | 0                      
TCP close (avg) | 1                            | 0                      
--------------------------------------------------------------------------------
W = winner
vector = dev-tokio-compat-base-mike-1
vector = dev-tokio-compat-mike-1

dev-tokio-compat-mike-1 is this branch (6cd58d63bb5d73465f4cabd2eb36e8054d0b4d35);
dev-tokio-compat-base-mike-1 is this branch's base (master at 56bad9d02543fa0861df5262afcb59b09f12580c).

Note how the base performance is much worse than 0.8.0 and 0.8.2 from above. This makes me think we were doing the test harness comparison wrong all along. It does show some degradation from master to this PR branch, but it's not really that bad; there's also some unrelated degradation in current master since the last release, which we didn't account for.

@MOZGIII MOZGIII merged commit ccd3cf7 into master Mar 18, 2020
@binarylogic binarylogic deleted the tokio-compat branch April 24, 2020 20:38
@binarylogic binarylogic added type: enhancement A value-adding code change that enhances its existing functionality. domain: performance Anything related to Vector's performance and removed type: performance labels Aug 6, 2020
Labels
domain: performance Anything related to Vector's performance type: enhancement A value-adding code change that enhances its existing functionality.
Development

Successfully merging this pull request may close these issues.

  • Verify that tokio 0.2 fixes concurrency bottleneck
  • Upgrade to tokio v0.2 and std::future::Future
5 participants