Sort metrics alphabetically in EXPLAIN ANALYZE output #12568

Merged
merged 1 commit into from
Sep 22, 2024

Conversation

progval
Contributor

@progval progval commented Sep 21, 2024

Which issue does this PR close?

Closes #12567.

Rationale for this change

From ParquetExec metrics, with predicate pushdown enabled:

metrics=[output_rows=0, elapsed_compute=96ns, row_groups_matched_bloom_filter=0, row_groups_matched_statistics=21050, file_scan_errors=0, pushdown_rows_matched=0, row_groups_pruned_statistics=173576, row_groups_pruned_bloom_filter=21050, file_open_errors=0, num_predicate_creation_errors=0, bytes_scanned=25023432248, pushdown_rows_pruned=0, page_index_rows_pruned=0, predicate_evaluation_errors=0, page_index_rows_matched=0, time_elapsed_scanning_until_data=16.622753ms, time_elapsed_processing=72.280463608s, pushdown_eval_time=382ns, page_index_eval_time=3.177676ms, time_elapsed_scanning_total=16.661811ms, time_elapsed_opening=102.989073638s]

For example, pushdown_rows_matched and pushdown_rows_pruned are far away from each other, even though they refer to roughly the same thing.

The unstable sort also makes it hard to compare multiple EXPLAIN ANALYZE results.

After this change, metrics for the same query look like this:

metrics=[output_rows=0, elapsed_compute=96ns, bytes_scanned=25023432248, file_open_errors=0, file_scan_errors=0, num_predicate_creation_errors=0, page_index_rows_matched=0, page_index_rows_pruned=0, predicate_evaluation_errors=0, pushdown_rows_matched=0, pushdown_rows_pruned=0, row_groups_matched_bloom_filter=0, row_groups_matched_statistics=21050, row_groups_pruned_bloom_filter=21050, row_groups_pruned_statistics=173576, page_index_eval_time=2.882359ms, pushdown_eval_time=382ns, time_elapsed_opening=104.629010525s, time_elapsed_processing=73.86660138s, time_elapsed_scanning_total=97.929057ms, time_elapsed_scanning_until_data=97.893962ms]

What changes are included in this PR?

Refinement of the partial order used in MetricsSet::sorted_for_display
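
To illustrate the idea, here is a minimal, self-contained sketch of a two-level ordering: metrics are first grouped by kind, then sorted by name within each group. This is not the DataFusion implementation. The `Kind` enum, the `Metric` struct, the `priority` helper, and the group order (output rows, elapsed compute, counters, timers) are assumptions inferred from the sample output above; only the name `sorted_for_display` is borrowed from the PR.

```rust
/// Hypothetical metric kinds; the real DataFusion types are richer.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    OutputRows,
    ElapsedCompute,
    Count,
    Time,
}

/// Hypothetical stand-in for a metric: just a name and a kind.
#[derive(Debug, Clone)]
struct Metric {
    name: &'static str,
    kind: Kind,
}

/// Assumed group order, inferred from the "after" output in this PR.
fn priority(kind: Kind) -> u8 {
    match kind {
        Kind::OutputRows => 0,
        Kind::ElapsedCompute => 1,
        Kind::Count => 2,
        Kind::Time => 3,
    }
}

/// Sort by group first, then alphabetically by name within a group.
/// Breaking ties by name makes the result independent of insertion order.
fn sorted_for_display(mut metrics: Vec<Metric>) -> Vec<Metric> {
    metrics.sort_by(|a, b| {
        priority(a.kind)
            .cmp(&priority(b.kind))
            .then_with(|| a.name.cmp(b.name))
    });
    metrics
}
```

Because every pair of metrics with distinct names now compares as unequal, the displayed order no longer depends on the order in which metrics were registered, which is what makes two EXPLAIN ANALYZE outputs easy to compare side by side.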

Are these changes tested?

Yes

Are there any user-facing changes?

More readable output. There doesn't seem to be any snippet in the documentation that needs to be updated to match the new behavior.

@github-actions github-actions bot added the physical-expr Physical Expressions label Sep 21, 2024
Contributor

@alamb alamb left a comment


Thank you @progval -- this is a very nice improvement in my mind

Member

@Weijun-H Weijun-H left a comment


LGTM! Thanks @progval

@Weijun-H Weijun-H merged commit 3bd41bc into apache:main Sep 22, 2024
24 checks passed
bgjackma pushed a commit to bgjackma/datafusion that referenced this pull request Sep 25, 2024
Labels
physical-expr Physical Expressions
Development

Successfully merging this pull request may close these issues.

EXPLAIN ANALYZE metrics should be sorted alphabetically
3 participants