Improve the error message and UX of tpch benchmark program #1800

Merged (1 commit) on Feb 10, 2022

Conversation

alamb
Contributor

@alamb alamb commented Feb 9, 2022

Which issue does this PR close?

Closes #1799

Rationale for this change

When I run the command as suggested in

https://github.com/apache/arrow-datafusion/blob/alamb%2Fbetter_bench_ux/benchmarks/README.md#L1

it fails with the error below, which does not say which file it is looking for, so I don't know how to fix the problem:

cargo run --bin tpch -- benchmark datafusion -o /tmp -p ~/Software/tpch_data/SF1 -q 1 --format tbl
...
     Running `/Users/alamb/Software/df-target/debug/tpch benchmark datafusion -o /tmp -p /Users/alamb/Software/tpch_data/SF1 -q 1 --format tbl`
Running benchmarks with the following options: DataFusionBenchmarkOpt { query: 1, debug: false, iterations: 3, partitions: 2, batch_size: 8192, path: "/Users/alamb/Software/tpch_data/SF1", file_format: "tbl", mem_table: false, output_path: Some("/tmp") }
[2022-02-09T18:44:26Z DEBUG datafusion::execution::memory_manager] Creating memory manager with initial size 11744051.2 TB
thread 'main' panicked at 'failed to read query: Os { code: 2, kind: NotFound, message: "No such file or directory" }', benchmarks/src/bin/tpch.rs:566:42
...

The issue is that I am running from the arrow-datafusion directory, but the program looks for a file like queries/1.sql, while the actual location is datafusion/benchmarks/queries/1

What changes are included in this PR?

This PR adds additional search paths for the query files and produces a clearer error, listing every path that was tried, if they aren't found.

Example Error:

alamb@MacBook-Pro-2 Software % ./df-target/debug/tpch benchmark datafusion -o /tmp -p ~/Software/tpch_data/SF1 -q 1 --format tbl
Running benchmarks with the following options: DataFusionBenchmarkOpt { query: 1, debug: false, iterations: 3, partitions: 2, batch_size: 8192, path: "/Users/alamb/Software/tpch_data/SF1", file_format: "tbl", mem_table: false, output_path: Some("/tmp") }
Error: Plan("invalid query. Could not find query: [\"queries/q1.sql: No such file or directory (os error 2)\", \"benchmarks/queries/q1.sql: No such file or directory (os error 2)\"]")
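
For readers who want a picture of the approach, here is a minimal sketch of the "try several locations, then report every failure" idea, using only the Rust standard library. It is not the code from this PR: the function name `read_query_sql`, its parameters, and the `Result<String, String>` return type are assumptions made for illustration, and the real benchmark reports the failure as a DataFusion `Plan` error, as shown in the output above.

```rust
use std::fs;
use std::path::Path;

/// Hypothetical helper (not the PR's actual function): look for `q<N>.sql`
/// in each candidate directory and return the first file that can be read.
fn read_query_sql(query: usize, search_dirs: &[&str]) -> Result<String, String> {
    let mut errors = Vec::new();
    for dir in search_dirs {
        let path = Path::new(dir).join(format!("q{}.sql", query));
        match fs::read_to_string(&path) {
            // Stop at the first location that works.
            Ok(sql) => return Ok(sql),
            // Otherwise remember which path failed and why.
            Err(e) => errors.push(format!("{}: {}", path.display(), e)),
        }
    }
    // Every candidate failed: name all attempted paths in the message so the
    // user can see where the program looked.
    Err(format!("invalid query. Could not find query: {:?}", errors))
}
```

Calling something like `read_query_sql(1, &["queries", "benchmarks/queries"])` from a directory that contains neither location would return an `Err` that names both attempted paths, which is the information missing from the original panic message.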

Are there any user-facing changes?

Yes: fewer errors when running from the repository root, and better error messages when the query files cannot be found.

@alamb alamb requested a review from andygrove February 9, 2022 18:56
@alamb
Contributor Author

alamb commented Feb 9, 2022

Found this while testing out #1766

Member

@xudong963 xudong963 left a comment


A good way to solve this problem. In fact, I also ran into it, but I just switched to the benchmarks directory and ran the benchmark command from there.

@alamb alamb merged commit 52f9dff into apache:master Feb 10, 2022
@alamb alamb deleted the alamb/better_error_ui branch August 8, 2023 20:12