Improve SQLMetric APIs, port existing metrics #908

Merged
merged 1 commit into from
Aug 24, 2021

Conversation

alamb

@alamb alamb commented Aug 19, 2021

Which issue does this PR close?

Closes #679. See #901 for an earlier version with feedback from @tustvold and @andygrove

Rationale for this change

See the description on #679 (comment) for the full rationale, but the TLDR version is:

  1. Better align the metric data model with industry best practice to ease integration in other metric systems (e.g. prometheus, influxdb, etc)
  2. Ability to get per-partition metrics
  3. Ability to get current metric values during execution without allocation

What changes are included in this PR?

  1. Update the SQLMetric API to be in its own module, have labels, know about partitions, and allow for real time inspection
  2. Rename SQLMetric --> Metric
  3. Update uses of metrics in DataFusion and Ballista to the new API
  4. Functionality to aggregate (sum) metrics via predicate and via partition (as requested by @Dandandan in #679 (comment) and @andygrove in #679 (comment))
  5. Rename metric names to snake case (output_rows) rather than camel 🐫 case (outputRows) to conform to Rust expectations (see the note from @andygrove on why the names were camelCase to begin with: #901 (comment))
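
To make the list above concrete, here is a minimal sketch of a metric that carries a name, an optional partition, and labels, and whose value can be read while execution is still running. It is an illustration under stated assumptions, not the code in this PR: the `Count`, `Metric`, `sum_where`, and `example_metrics` definitions below, and the `operator` label, are hypothetical.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical counter: cloning shares the same atomic, so a reader can
/// observe the current value without copying or allocating anything.
#[derive(Debug, Clone, Default)]
pub struct Count {
    value: Arc<AtomicUsize>,
}

impl Count {
    pub fn new() -> Self {
        Self::default()
    }

    /// Add `n` to the counter (called by the operator while it runs).
    pub fn add(&self, n: usize) {
        self.value.fetch_add(n, Ordering::Relaxed);
    }

    /// Read the current value (safe to call during execution).
    pub fn value(&self) -> usize {
        self.value.load(Ordering::Relaxed)
    }
}

/// Hypothetical metric record: snake_case name, optional partition, labels.
#[derive(Debug, Clone)]
pub struct Metric {
    pub name: &'static str,
    pub partition: Option<usize>,
    pub labels: Vec<(String, String)>,
    pub count: Count,
}

/// Sum the current values of every metric that satisfies `pred`.
pub fn sum_where<F>(metrics: &[Metric], pred: F) -> usize
where
    F: Fn(&Metric) -> bool,
{
    metrics
        .iter()
        .filter(|m| pred(*m))
        .map(|m| m.count.value())
        .sum()
}

/// Build one per-partition metric with an illustrative label.
fn metric(name: &'static str, partition: usize, rows: usize) -> Metric {
    let count = Count::new();
    count.add(rows);
    Metric {
        name,
        partition: Some(partition),
        labels: vec![("operator".to_string(), "FilterExec".to_string())],
        count,
    }
}

/// Sample data: two partitions of `output_rows`, one of `input_rows`.
pub fn example_metrics() -> Vec<Metric> {
    vec![
        metric("output_rows", 0, 10),
        metric("output_rows", 1, 20),
        metric("input_rows", 0, 5),
    ]
}
```

With this shape, `sum_where(&metrics, |m| m.name == "output_rows")` sums across partitions, and `sum_where(&metrics, |m| m.partition == Some(0))` sums everything recorded for a single partition.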

Are there any user-facing changes?

YES!

The SQLMetric / Metric API is now totally different, so any code that creates or uses SQLMetrics will have to be updated. The updates are fairly mechanical, as you can see in this PR.

Notes

In keeping with Rust's tradition of static typing, I also changed to using more strongly typed versions of the metric values to avoid mistakes such as adding a "time" to a counter value, as well as allowing other counter specific operations.
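
A small sketch of the idea, assuming two hypothetical newtypes named `Count` and `Time` (these are illustrative and may not match the types in this PR): because `Add` is only implemented between values of the same type, mixing a time with a counter is rejected at compile time.

```rust
use std::ops::Add;
use std::time::Duration;

/// Hypothetical counter value (for example, a number of rows).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Count(pub usize);

/// Hypothetical elapsed-time value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Time(pub Duration);

impl Add for Count {
    type Output = Count;

    /// Counts can only be added to other counts.
    fn add(self, rhs: Count) -> Count {
        Count(self.0 + rhs.0)
    }
}

impl Add for Time {
    type Output = Time;

    /// Times can only be added to other times.
    fn add(self, rhs: Time) -> Time {
        Time(self.0 + rhs.0)
    }
}

// There is deliberately no `impl Add<Time> for Count`, so an expression
// such as `Count(1) + Time(Duration::from_secs(1))` does not compile.
```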

Also, as the metrics aren't specific to SQL (they apply to any ExecutionPlan, even if that plan was created via the DataFrame API or the LogicalPlanBuilder) I renamed SQLMetric --> Metric

Not included in this PR:

  1. Ensure that all operators have reasonable metrics. I plan this in a follow-on PR for #866 (Add "baseline" metrics to all built in operators), using this API.
  2. Support for a global "operator id" as described by @andygrove in #679 (comment)

@alamb alamb added the api change Changes the API exposed to users of the crate label Aug 19, 2021
@github-actions github-actions bot added ballista datafusion Changes in the datafusion crate labels Aug 19, 2021
@alamb alamb marked this pull request as ready for review August 19, 2021 20:21

alamb commented Aug 23, 2021

Here is an example of what is needed to use the new API in IOx: https://github.com/influxdata/influxdb_iox/pull/2385


@alamb alamb left a comment

@andygrove @Dandandan @houqp I am sorry for the large PR -- though it is mostly doc comments / tests.

This PR incorporates the first round of feedback from @tustvold and @andygrove (on #901).

FYI @returnString (who wrote the initial metrics API)

/// that had the same name and partition=`Some(..)` have been
/// aggregated together. The resulting `MetricsSet` has all
/// metrics with `Partition=None`
pub fn aggregate_by_partition(&self) -> Self {

Here is the code that can aggregate metrics by partition (and sum above is also useful)
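
For readers without the diff open, a rough sketch of what aggregating by partition can look like follows. It is a simplified illustration, not the implementation in this PR: the `Metric` and `MetricsSet` definitions below are hypothetical stand-ins with a plain `usize` value, and the sketch assumes that metrics with `partition = None` are folded into the total for their name as well.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified metric: a name, an optional partition, a value.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Metric {
    pub name: String,
    pub partition: Option<usize>,
    pub value: usize,
}

/// Hypothetical, simplified collection of metrics for one operator.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct MetricsSet {
    pub metrics: Vec<Metric>,
}

impl MetricsSet {
    /// Return a new set in which metrics with the same name have been
    /// summed together and every resulting metric has `partition = None`.
    pub fn aggregate_by_partition(&self) -> Self {
        // Accumulate one total per metric name (sorted by name).
        let mut totals: BTreeMap<&str, usize> = BTreeMap::new();
        for m in &self.metrics {
            *totals.entry(m.name.as_str()).or_insert(0) += m.value;
        }

        // Rebuild the set with the partition information dropped.
        let metrics = totals
            .into_iter()
            .map(|(name, value)| Metric {
                name: name.to_string(),
                partition: None,
                value,
            })
            .collect();

        Self { metrics }
    }
}
```

The sketch uses a `BTreeMap` only so that the output order is deterministic; any map keyed by metric name would do.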


@andygrove andygrove left a comment

This looks very good. Thanks @alamb


alamb commented Aug 24, 2021

Thanks @andygrove -- I plan to merge this in later today unless anyone objects or would like more time to review.


@returnString returnString left a comment

This looks awesome, huge usability improvement 👍

@@ -2172,6 +2172,8 @@ async fn csv_explain_analyze() {
let formatted = arrow::util::pretty::pretty_format_batches(&actual).unwrap();
let formatted = normalize_for_explain(&formatted);

println!("ANALYZE EXPLAIN:\n{}", formatted);

Stray debug logging?

@alamb (Contributor Author) replied:

Really it is more like a development tool. I plan to remove it in a follow-on PR.

@alamb alamb merged commit bd49b86 into apache:master Aug 24, 2021
@alamb alamb deleted the alamb/metrics2 branch August 24, 2021 15:05

alamb commented Aug 24, 2021

For anyone else who is updating code that uses the existing SQLMetric API, here is an example PR that updates IOx, in case those patterns are helpful: https://github.com/influxdata/influxdb_iox/pull/2385


houqp commented Aug 24, 2021

Late to the party, this is really great work, thanks @alamb !

Labels
api change Changes the API exposed to users of the crate datafusion Changes in the datafusion crate
Development

Successfully merging this pull request may close these issues.

Improved features and interoperability for SQLMetrics
4 participants