Ballista standalone mode tests fail: context::tests::test_task_stuck_when_referenced_task_failed #25

Open
alamb opened this issue Feb 15, 2022 · 3 comments
Labels: bug, help wanted

Comments

@alamb
Contributor

alamb commented Feb 15, 2022

Describe the bug
The following Ballista test is failing. I am not sure when it started failing, since the tests weren't run in CI until apache/datafusion#1839.

---- context::tests::test_task_stuck_when_referenced_task_failed stdout ----
Found object store LocalFileSystem for path /Users/alamb/Software/arrow-datafusion/parquet-testing/data/single_nan.parquet
thread 'context::tests::test_task_stuck_when_referenced_task_failed' panicked at 'called `Result::unwrap()` on an `Err` value: Execution("Job RcB8xKy failed: Task failed due to Tokio error: DataFusion error: Execution(\"ArrowError(ParseError(\\\"Error parsing line 2: Error(UnequalLengths { pos: Some(Position { byte: 104, line: 3, record: 2 }), expected_len: 2, len: 1 })\\\"))\")")', ballista/rust/client/src/context.rs:541:42
stack backtrace:
   0: rust_begin_unwind
             at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/panicking.rs:498:5
   1: core::panicking::panic_fmt
             at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/core/src/panicking.rs:107:14
   2: core::result::unwrap_failed
             at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/core/src/result.rs:1613:5
   3: core::result::Result<T,E>::unwrap
             at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/core/src/result.rs:1295:23
   4: ballista::context::tests::test_task_stuck_when_referenced_task_failed::{{closure}}
             at ./src/context.rs:541:23
   5: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/core/src/future/mod.rs:80:19
   6: <core::pin::Pin<P> as core::future::future::Future>::poll
             at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/core/src/future/future.rs:119:9
   7: tokio::runtime::basic_scheduler::CoreGuard::block_on::{{closure}}::{{closure}}::{{closure}}
             at /Users/alamb/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.16.1/src/runtime/basic_scheduler.rs:516:48
   8: tokio::coop::with_budget::{{closure}}
             at /Users/alamb/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.16.1/src/coop.rs:102:9
   9: std::thread::local::LocalKey<T>::try_with
             at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/thread/local.rs:399:16
  10: std::thread::local::LocalKey<T>::with
             at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/thread/local.rs:375:9
  11: tokio::coop::with_budget
             at /Users/alamb/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.16.1/src/coop.rs:95:5
  12: tokio::coop::budget
             at /Users/alamb/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.16.1/src/coop.rs:72:5
  13: tokio::runtime::basic_scheduler::CoreGuard::block_on::{{closure}}::{{closure}}
             at /Users/alamb/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.16.1/src/runtime/basic_scheduler.rs:516:25
  14: tokio::runtime::basic_scheduler::Context::enter
             at /Users/alamb/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.16.1/src/runtime/basic_scheduler.rs:374:19
  15: tokio::runtime::basic_scheduler::CoreGuard::block_on::{{closure}}
             at /Users/alamb/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.16.1/src/runtime/basic_scheduler.rs:515:36
  16: tokio::runtime::basic_scheduler::CoreGuard::enter::{{closure}}
             at /Users/alamb/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.16.1/src/runtime/basic_scheduler.rs:582:57
  17: tokio::macros::scoped_tls::ScopedKey<T>::set
             at /Users/alamb/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.16.1/src/macros/scoped_tls.rs:61:9
  18: tokio::runtime::basic_scheduler::CoreGuard::enter
             at /Users/alamb/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.16.1/src/runtime/basic_scheduler.rs:582:27
  19: tokio::runtime::basic_scheduler::CoreGuard::block_on
             at /Users/alamb/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.16.1/src/runtime/basic_scheduler.rs:506:9
  20: tokio::runtime::basic_scheduler::BasicScheduler::block_on
             at /Users/alamb/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.16.1/src/runtime/basic_scheduler.rs:182:24
  21: tokio::runtime::Runtime::block_on
             at /Users/alamb/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-1.16.1/src/runtime/mod.rs:475:46
  22: ballista::context::tests::test_task_stuck_when_referenced_task_failed
             at ./src/context.rs:542:9
  23: ballista::context::tests::test_task_stuck_when_referenced_task_failed::{{closure}}
             at ./src/context.rs:473:11
  24: core::ops::function::FnOnce::call_once
             at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/core/src/ops/function.rs:227:5
  25: core::ops::function::FnOnce::call_once
             at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/core/src/ops/function.rs:227:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

To Reproduce
Get the code from apache/datafusion#1839 and run

cd arrow-datafusion/ballista
cargo test --no-default-features --features standalone -- --ignored

Expected behavior
Test should pass

@Ted-Jiang
Member

I can reproduce this locally.

@Ted-Jiang
Member

Ted-Jiang commented Feb 16, 2022

@gaojun2048 I think the test_task_stuck_when_referenced_task_failed unit test is expected to hit an error. Is that intentional?
I think the error is returned by the call to collect(). Is there something I am missing?

andygrove transferred this issue from apache/datafusion on May 19, 2022
@EricJoy2048
Member

@gaojun2048 I think the test_task_stuck_when_referenced_task_failed unit test is expected to hit an error. Is that intentional? I think the error is returned by the call to collect(). Is there something I am missing?

Sorry for the late reply.
Yes, I have a PR related to this issue: apache/datafusion#1654
The problem occurs when a referenced task fails. Before that PR, the query would get stuck. After the PR was merged, the query returns a failure instead.
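
For anyone reading along, here is a minimal sketch of what "return a failure instead of getting stuck" could look like as a test assertion. It does not use Ballista's API. `run_failing_query` and `run_with_timeout` are hypothetical names made up for illustration, and the 5 second timeout is an arbitrary assumption. The idea is only that the test should check that the query finishes and that it finishes with an `Err`, rather than calling `unwrap()` on the result.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for a query whose referenced task fails.
/// This is NOT Ballista's API; it only mimics "collect() returns an error".
fn run_failing_query() -> Result<Vec<i64>, String> {
    Err("Job failed: Task failed due to a parse error".to_string())
}

/// Runs `query` on a worker thread and waits at most `timeout` for a result.
/// Returns None if no result arrived in time, i.e. the query appears stuck
/// (or its worker thread panicked before sending anything).
fn run_with_timeout<T, F>(query: F, timeout: Duration) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Ignore a send error: the receiver may already have given up.
        let _ = tx.send(query());
    });
    rx.recv_timeout(timeout).ok()
}

fn main() {
    // Assumed timeout of 5 seconds; pick whatever suits the test.
    let outcome = run_with_timeout(run_failing_query, Duration::from_secs(5));

    // The query must finish rather than hang ...
    let result = outcome.expect("query appears to be stuck");

    // ... and it must finish with an error rather than with rows.
    assert!(result.is_err());
    println!("query failed as expected: {}", result.unwrap_err());
}
```

With this shape, a hang and an unexpected success both fail the test, while the expected error passes it.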
