Skip to content
This repository has been archived by the owner on Sep 12, 2023. It is now read-only.

Flaky Rollover test #2274

Closed
bonomat opened this issue Jun 21, 2022 · 7 comments · Fixed by #2275
Closed

Flaky Rollover test #2274

bonomat opened this issue Jun 21, 2022 · 7 comments · Fixed by #2275
Labels
bug Something isn't working

Comments

@bonomat
Copy link
Collaborator

bonomat commented Jun 21, 2022

I believe this is not the first time that I've seen the rollover test failing on CI so I thought I create this ticket:

https://github.com/itchysats/itchysats/runs/6978234506?check_suite_focus=true
https://github.com/itchysats/itchysats/runs/6978501105?check_suite_focus=true

logs:

2022-06-21T05:07:30.890106ZTRACEwire: Received msg_name="MakerToTaker::CurrentOffers"
thread 'rollover_an_open_cfd_maker_going_short' panicked at 'assertion failed: nanoseconds <= 0', /Users/runner/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/time-0.3.10/src/duration.rs:205:13
2022-06-21T05:07:33.186718ZTRACEwire: measuring heartbeat pulse
2022-06-21T05:07:35.114795ZERRORdaemon::rollover::protocol: Rollover failed: End of stream while receiving rollover decision from maker order_id=039c4a12-7ae9-4c99-8077-b7bb99d7e7cc
2022-06-21T05:07:35.117854Z WARNmodel::cfd: Rollover failed: End of stream while receiving rollover decision from maker order_id=039c4a12-7ae9-4c99-8077-b7bb99d7e7cc
2022-06-21T05:07:35.581490ZTRACEwire: Sending taker_id=c3c2a220e6a527389bc81f42531a752033b1d9486f2342b6945629e92e329b5a msg_name="MakerToTaker::Heartbeat"
2022-06-21T05:07:35.582253ZTRACEwire: Received msg_name="MakerToTaker::Heartbeat"
2022-06-21T05:07:35.776453ZTRACEwire: measuring heartbeat pulse
2022-06-21T05:07:38.310996ZTRACEwire: measuring heartbeat pulse
2022-06-21T05:07:40.583780ZTRACEwire: Sending taker_id=c3c2a220e6a527389bc81f42531a752033b1d9486f2342b6945629e92e329b5a msg_name="MakerToTaker::Heartbeat"
2022-06-21T05:07:40.584497ZTRACEwire: Received msg_name="MakerToTaker::Heartbeat"
2022-06-21T05:07:40.875338ZTRACEwire: measuring heartbeat pulse
2022-06-21T05:07:43.495683ZTRACEwire: measuring heartbeat pulse
2022-06-21T05:07:45.655280ZTRACEwire: Sending taker_id=c3c2a220e6a527389bc81f42531a752033b1d9486f2342b6945629e92e329b5a msg_name="MakerToTaker::Heartbeat"
2022-06-21T05:07:45.656029ZTRACEwire: Received msg_name="MakerToTaker::Heartbeat"
2022-06-21T05:07:46.011020ZTRACEwire: measuring heartbeat pulse
2022-06-21T05:07:48.614530ZTRACEwire: measuring heartbeat pulse
2022-06-21T05:07:50.657671ZTRACEwire: Sending taker_id=c3c2a220e6a527389bc81f42531a752033b1d9486f2342b6945629e92e329b5a msg_name="MakerToTaker::Heartbeat"
2022-06-21T05:07:50.658498ZTRACEwire: Received msg_name="MakerToTaker::Heartbeat"
2022-06-21T05:07:51.175912ZTRACEwire: measuring heartbeat pulse
2022-06-21T05:07:53.734707ZTRACEwire: measuring heartbeat pulse
2022-06-21T05:07:55.709514ZTRACEwire: Sending taker_id=c3c2a220e6a527389bc81f42531a752033b1d9486f2342b6945629e92e329b5a msg_name="MakerToTaker::Heartbeat"
2022-06-21T05:07:55.710251ZTRACEwire: Received msg_name="MakerToTaker::Heartbeat"
2022-06-21T05:07:56.265887ZTRACEwire: measuring heartbeat pulse
2022-06-21T05:07:58.811785ZTRACEwire: measuring heartbeat pulse
2022-06-21T05:08:00.851667ZTRACEwire: Sending taker_id=c3c2a220e6a527389bc81f42531a752033b1d9486f2342b6945629e92e329b5a msg_name="MakerToTaker::Heartbeat"
2022-06-21T05:08:00.852488ZTRACEwire: Received msg_name="MakerToTaker::Heartbeat"
2022-06-21T05:08:01.312877ZTRACEwire: measuring heartbeat pulse
2022-06-21T05:08:03.870419ZTRACEwire: measuring heartbeat pulse
2022-06-21T05:08:05.875651ZTRACEwire: Sending taker_id=c3c2a220e6a527389bc81f42531a752033b1d9486f2342b6945629e92e329b5a msg_name="MakerToTaker::Heartbeat"
2022-06-21T05:08:05.876511ZTRACEwire: Received msg_name="MakerToTaker::Heartbeat"
2022-06-21T05:08:06.373971ZTRACEwire: measuring heartbeat pulse
2022-06-21T05:08:08.936527ZTRACEwire: measuring heartbeat pulse
2022-06-21T05:08:10.931388ZTRACEwire: Sending taker_id=c3c2a220e6a527389bc81f42531a752033b1d9486f2342b6945629e92e329b5a msg_name="MakerToTaker::Heartbeat"
2022-06-21T05:08:10.932520ZTRACEwire: Received msg_name="MakerToTaker::Heartbeat"
2022-06-21T05:08:11.535411ZTRACEwire: measuring heartbeat pulse
2022-06-21T05:08:14.082159ZTRACEwire: measuring heartbeat pulse
2022-06-21T05:08:16.019100ZTRACEwire: Sending taker_id=c3c2a220e6a527389bc81f42531a752033b1d9486f2342b6945629e92e329b5a msg_name="MakerToTaker::Heartbeat"
2022-06-21T05:08:16.019826ZTRACEwire: Received msg_name="MakerToTaker::Heartbeat"
2022-06-21T05:08:16.584303ZTRACEwire: measuring heartbeat pulse
2022-06-21T05:08:19.189022ZTRACEwire: measuring heartbeat pulse
2022-06-21T05:08:21.063304ZTRACEwire: Sending taker_id=c3c2a220e6a527389bc81f42531a752033b1d9486f2342b6945629e92e329b5a msg_name="MakerToTaker::Heartbeat"
2022-06-21T05:08:21.064147ZTRACEwire: Received msg_name="MakerToTaker::Heartbeat"
2022-06-21T05:08:21.812936ZTRACEwire: measuring heartbeat pulse
2022-06-21T05:08:24.393736ZTRACEwire: measuring heartbeat pulse
2022-06-21T05:08:26.206833ZTRACEwire: Sending taker_id=c3c2a220e6a527389bc81f42531a752033b1d9486f2342b6945629e92e329b5a msg_name="MakerToTaker::Heartbeat"
2022-06-21T05:08:26.207675ZTRACEwire: Received msg_name="MakerToTaker::Heartbeat"
2022-06-21T05:08:26.919567ZTRACEwire: measuring heartbeat pulse
2022-06-21T05:08:29.562996ZTRACEwire: measuring heartbeat pulse
thread 'rollover_an_open_cfd_maker_going_short' panicked at 'called `Result::unwrap()` on an `Err` value: Value channel did not satisfy predicate within 60 seconds
@bonomat
Copy link
Collaborator Author

bonomat commented Jun 21, 2022

Maybe related to : #2263

@bonomat bonomat added the bug Something isn't working label Jun 21, 2022
@klochowicz
Copy link
Contributor

Seems like we have hit this bug: time-rs/time#481

@klochowicz
Copy link
Contributor

klochowicz commented Jun 21, 2022

A bit more analysis on the bug: it's not just mere panic (because it's not happening in production), but it means that there's a bug in the time, and the result might at times not be what we expected. It would be funny if this turns out to be cause of the rollovers diverging!

@da-kami
Copy link
Contributor

da-kami commented Jun 21, 2022

With my branch rollover_refactor rebased on master I am able to (sometimes) reproduce this locally:

2022-06-21T22:32:31.404426Z TRACE wire: Sending taker_id=5f6147d246000364ed824a46ffd4b7e661cea0a066506e402213ced844578e53 msg_name="MakerToTaker::CurrentOffers"
2022-06-21T22:32:31.407120Z TRACE wire: Received msg_name="MakerToTaker::CurrentOffers"

called `Result::unwrap()` on an `Err` value: Value channel did not satisfy predicate within 60 seconds

Caused by:
    deadline has elapsed
thread 'double_rollover_an_open_cfd' panicked at 'assertion failed: nanoseconds <= 0', /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/time-0.3.10/src/duration.rs:205:13
stack backtrace:
   0: rust_begin_unwind
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/std/src/panicking.rs:584:5
   1: core::panicking::panic_fmt
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/panicking.rs:143:14
   2: core::panicking::panic
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/panicking.rs:48:5
   3: time::duration::Duration::new_unchecked
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/time-0.3.10/src/duration.rs:205:13
   4: <time::time::Time as core::ops::arith::Sub>::sub
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/time-0.3.10/src/time.rs:751:9
   5: <time::primitive_date_time::PrimitiveDateTime as core::ops::arith::Sub>::sub
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/time-0.3.10/src/primitive_date_time.rs:933:34
   6: <time::offset_date_time::OffsetDateTime as core::ops::arith::Sub>::sub
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/time-0.3.10/src/offset_date_time.rs:1218:9
   7: model::cfd::Cfd::hours_to_extend_in_rollover_based_on_event
             at /Users/dakami/CoBloX/repos/hermes/model/src/cfd.rs:1531:26
   8: model::cfd::Cfd::accept_rollover_proposal
             at /Users/dakami/CoBloX/repos/hermes/model/src/cfd.rs:918:17
   9: <daemon::rollover::maker::Actor as xtra::Handler<daemon::rollover::maker::Accept>>::handle::{{closure}}::{{closure}}::{{closure}}
             at /Users/dakami/CoBloX/repos/hermes/daemon/src/rollover/maker.rs:174:83
  10: daemon::command::Executor::execute::{{closure}}
             at /Users/dakami/CoBloX/repos/hermes/daemon/src/command.rs:38:26
  11: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/future/mod.rs:84:19
  12: <daemon::rollover::maker::Actor as xtra::Handler<daemon::rollover::maker::Accept>>::handle::{{closure}}::{{closure}}
             at /Users/dakami/CoBloX/repos/hermes/daemon/src/rollover/maker.rs:182:27
  13: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/future/mod.rs:84:19
  14: tokio_tasks::tasks::Tasks::add_fallible::{{closure}}
             at /Users/dakami/CoBloX/repos/hermes/tokio-tasks/src/tasks.rs:35:20
  15: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/future/mod.rs:84:19
  16: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::future::future::Future>::poll
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/panic/unwind_safe.rs:289:9
  17: <futures_util::future::future::catch_unwind::CatchUnwind<Fut> as core::future::future::Future>::poll::{{closure}}
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/futures-util-0.3.21/src/future/future/catch_unwind.rs:36:42
  18: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/panic/unwind_safe.rs:271:9
  19: std::panicking::try::do_call
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/std/src/panicking.rs:492:40
  20: ___rust_try
  21: std::panicking::try
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/std/src/panicking.rs:456:19
  22: std::panic::catch_unwind
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/std/src/panic.rs:137:14
  23: <futures_util::future::future::catch_unwind::CatchUnwind<Fut> as core::future::future::Future>::poll
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/futures-util-0.3.21/src/future/future/catch_unwind.rs:36:9
  24: <futures_util::future::future::remote_handle::Remote<Fut> as core::future::future::Future>::poll
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/futures-util-0.3.21/src/future/future/remote_handle.rs:105:29
  25: <core::pin::Pin<P> as core::future::future::Future>::poll
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/future/future.rs:123:9
  26: tokio::runtime::task::core::CoreStage<T>::poll::{{closure}}
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.14.1/src/runtime/task/core.rs:161:17
  27: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.14.1/src/loom/std/unsafe_cell.rs:14:9
  28: tokio::runtime::task::core::CoreStage<T>::poll
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.14.1/src/runtime/task/core.rs:151:13
  29: tokio::runtime::task::harness::poll_future::{{closure}}
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.14.1/src/runtime/task/harness.rs:461:19
  30: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/panic/unwind_safe.rs:271:9
  31: std::panicking::try::do_call
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/std/src/panicking.rs:492:40
  32: ___rust_try
  33: std::panicking::try
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/std/src/panicking.rs:456:19
  34: std::panic::catch_unwind
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/std/src/panic.rs:137:14
  35: tokio::runtime::task::harness::poll_future
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.14.1/src/runtime/task/harness.rs:449:18
  36: tokio::runtime::task::harness::Harness<T,S>::poll_inner
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.14.1/src/runtime/task/harness.rs:98:27
  37: tokio::runtime::task::harness::Harness<T,S>::poll
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.14.1/src/runtime/task/harness.rs:53:15
  38: tokio::runtime::task::raw::poll
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.14.1/src/runtime/task/raw.rs:113:5
  39: tokio::runtime::task::raw::RawTask::poll
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.14.1/src/runtime/task/raw.rs:70:18
  40: tokio::runtime::task::LocalNotified<S>::run
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.14.1/src/runtime/task/mod.rs:343:9
  41: tokio::runtime::basic_scheduler::Inner<P>::block_on::{{closure}}::{{closure}}
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.14.1/src/runtime/basic_scheduler.rs:297:52
  42: tokio::coop::with_budget::{{closure}}
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.14.1/src/coop.rs:106:9
  43: std::thread::local::LocalKey<T>::try_with
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/std/src/thread/local.rs:413:16
  44: std::thread::local::LocalKey<T>::with
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/std/src/thread/local.rs:389:9
  45: tokio::coop::with_budget
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.14.1/src/coop.rs:99:5
  46: tokio::coop::budget
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.14.1/src/coop.rs:76:5
  47: tokio::runtime::basic_scheduler::Inner<P>::block_on::{{closure}}
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.14.1/src/runtime/basic_scheduler.rs:297:29
  48: tokio::runtime::basic_scheduler::enter::{{closure}}
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.14.1/src/runtime/basic_scheduler.rs:349:29
  49: tokio::macros::scoped_tls::ScopedKey<T>::set
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.14.1/src/macros/scoped_tls.rs:61:9
  50: tokio::runtime::basic_scheduler::enter
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.14.1/src/runtime/basic_scheduler.rs:349:5
  51: tokio::runtime::basic_scheduler::Inner<P>::block_on
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.14.1/src/runtime/basic_scheduler.rs:230:9
  52: tokio::runtime::basic_scheduler::InnerGuard<P>::block_on
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.14.1/src/runtime/basic_scheduler.rs:501:9
  53: tokio::runtime::basic_scheduler::BasicScheduler<P>::block_on
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.14.1/src/runtime/basic_scheduler.rs:186:24
  54: tokio::runtime::Runtime::block_on
             at /Users/dakami/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.14.1/src/runtime/mod.rs:461:46
  55: happy_path::double_rollover_an_open_cfd
             at ./tests/happy_path.rs:412:5
  56: happy_path::double_rollover_an_open_cfd::{{closure}}
             at ./tests/happy_path.rs:408:7
  57: core::ops::function::FnOnce::call_once
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/ops/function.rs:227:5
  58: core::ops::function::FnOnce::call_once
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/ops/function.rs:227:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
2022-06-21T22:32:32.388208Z TRACE wire: measuring heartbeat pulse
2022-06-21T22:32:34.843965Z ERROR daemon::rollover::protocol: Rollover failed: End of stream while receiving rollover decision from maker order_id=aa0ee60b-6ea3-4101-afc4-4614d8ce3463
2022-06-21T22:32:34.849558Z  WARN model::cfd: Rollover failed: End of stream while receiving rollover decision from maker order_id=aa0ee60b-6ea3-4101-afc4-4614d8ce3463
2022-06-21T22:32:34.883705Z TRACE wire: Sending taker_id=5f6147d246000364ed824a46ffd4b7e661cea0a066506e402213ced844578e53 msg_name="MakerToTaker::Heartbeat"

It does look like we run into the debug_assert in time, but I am not sure that it was this bump #2267 that is causing this...?

@da-kami
Copy link
Contributor

da-kami commented Jun 21, 2022

time panics when creating the resulting Duration at the subtraction in this line:

https://github.com/itchysats/itchysats/blob/2f31cd76ab51c88ade8bb88aea9eda6b252fa736/model/src/cfd.rs#L1531

I think the problem is that the current settlement time might be in the past. But this should work - we expect that the Duration can be negative at this point, see two lines below:

https://github.com/itchysats/itchysats/blob/2f31cd76ab51c88ade8bb88aea9eda6b252fa736/model/src/cfd.rs#L1535-L1539

Previously we saw Rolling over a contract that can be settled non-collaboratively log in the test.

This is bad 😅 - We should potentially go back to an older time version. I am surprised this happens after the bump to 0.3.10 (#2267) and not earlier.

@da-kami
Copy link
Contributor

da-kami commented Jun 21, 2022

I can confirm that pinning the time version to =0.3.9 fixes the problem on my branch.

@bonomat
Copy link
Collaborator Author

bonomat commented Jun 21, 2022

let hours_left = current_settlement_time - now;

can you deterministicly reproduce this? If so, what are the values for current_settlement_time and now?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants