Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup ExternalSorter metrics (#5885) #6364

Merged
merged 1 commit into from
May 17, 2023

Conversation

tustvold
Copy link
Contributor

Which issue does this PR close?

Part of #5885
Part of #5108

Rationale for this change

In preparation for improving the memory accounting in ExternalSorter / SortPreservingMerge I first wanted to sanitise what already existed. I will call out the various changes in the PR

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

mem_used, spill_count and spilled_bytes are no longer included in BaselineMetrics (as they were only used by ExternalSorter)

CompositeMetricsSet and MemTrackingMetrics have been removed

@tustvold tustvold added the api change Changes the API exposed to users of the crate label May 16, 2023
@github-actions github-actions bot added the core Core DataFusion crate label May 16, 2023
Comment on lines -54 to -61
/// count of spills during the execution of the operator
spill_count: Count,

/// total spilled bytes during the execution of the operator
spilled_bytes: Count,

/// current memory usage for the operator
mem_used: Gauge,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These were only used by ExternalSorter and so I figure aren't really Baseline

@tustvold tustvold force-pushed the cleanup-sort-metrics branch from 93b24f0 to 09a798a Compare May 16, 2023 14:24
/// multiple in-mem sort metrics and final merge-sort metrics from `SortPreservingMergeStream`.
/// Therefore, We need a separation of metrics for which are final metrics (for output_rows accumulation),
/// and which are intermediate metrics that we only account for elapsed_compute time.
pub struct CompositeMetricsSet {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BaselineMetrics::intermediate replaces the need for this


/// Wraps a [`BaselineMetrics`] and records memory usage on a [`MemoryReservation`]
#[derive(Debug)]
pub struct MemTrackingMetrics {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this construction may date from an earlier iteration of the memory tracking, as it stands now it makes little sense.

The memory reported by the mem_used metric will be a somewhat arbitrary value based on the last value at the point the plan finished. Additionally there isn't any way to use the MemoryReservation in a fallible manner. It felt easier to just separate the concerns of reporting plan metrics from tracking runtime memory usage.

@tustvold tustvold force-pushed the cleanup-sort-metrics branch from 09a798a to eafefe4 Compare May 16, 2023 14:29
@@ -56,6 +55,27 @@ use tempfile::NamedTempFile;
use tokio::sync::mpsc::{Receiver, Sender};
use tokio::task;

struct ExternalSorterMetrics {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it is much nicer to put sorting metrics on the sorter rather than BaselineMetrics

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great to me. Thank you @tustvold

I will also run the sort benchmark to make sure this doesn't cause a regression (I don't expect it will) and report back

cc @yjshen as I believe he contributed an early version of these metrics

@alamb
Copy link
Contributor

alamb commented May 17, 2023

My benchmark run shows no changes in performance, as expected

--------------------
Benchmark sort.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃       sort ┃       sort ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ Qsort utf8   │ 63572.72ms │ 65037.97ms │     no change │
│ Qsort int    │ 77433.88ms │ 75017.20ms │     no change │
│ Qsort        │ 65095.90ms │ 63913.09ms │     no change │
│ decimal      │            │            │               │
│ Qsort        │ 83230.11ms │ 80413.45ms │     no change │
│ integer      │            │            │               │
│ tuple        │            │            │               │
│ Qsort utf8   │ 64949.20ms │ 61437.19ms │ +1.06x faster │
│ tuple        │            │            │               │
│ Qsort mixed  │ 74037.84ms │ 70042.36ms │ +1.06x faster │
│ tuple        │            │            │               │

@tustvold tustvold merged commit cf81117 into apache:main May 17, 2023
richox pushed a commit to richox/arrow-datafusion that referenced this pull request May 29, 2023
richox pushed a commit to richox/arrow-datafusion that referenced this pull request May 29, 2023
richox pushed a commit to richox/arrow-datafusion that referenced this pull request Jul 5, 2023
richox pushed a commit to richox/arrow-datafusion that referenced this pull request Jul 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api change Changes the API exposed to users of the crate core Core DataFusion crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants