Add scheduler span under query that aggregates all stage spans #20096

Merged: 3 commits into trinodb:master from wweiss/add-runner-span on Jan 19, 2024

Conversation

wweiss-starburst
Contributor

@wweiss-starburst wweiss-starburst commented Dec 13, 2023

Description

The Trino spans below the Query span are Dispatch, Analyzer, Planner, and then a sequence of Stage spans. Placing the set of Stage spans under a new Scheduler span allows all execution spans to be grouped together. In addition, new Exchange spans are added to instrument the exchanges that operate across stages.
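
As a rough illustration of the resulting hierarchy, the sketch below builds the span tree described above using a plain-Java stand-in. `SketchSpan`, `SpanHierarchySketch`, and the lowercase span names are hypothetical and are not the OpenTelemetry API or Trino's actual classes. Exchange spans are left out because the description does not say which span they are parented to.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a tracing span; this is NOT the OpenTelemetry API.
final class SketchSpan
{
    private final String name;
    private final SketchSpan parent;
    private final List<SketchSpan> children = new ArrayList<>();
    private boolean ended;

    SketchSpan(String name, SketchSpan parent)
    {
        this.name = name;
        this.parent = parent;
        if (parent != null) {
            parent.children.add(this);
        }
    }

    String name()
    {
        return name;
    }

    SketchSpan parent()
    {
        return parent;
    }

    List<SketchSpan> children()
    {
        return List.copyOf(children);
    }

    void end()
    {
        ended = true;
    }

    boolean isEnded()
    {
        return ended;
    }

    // Renders the subtree rooted at this span, one span per line, two spaces per level.
    String render()
    {
        StringBuilder out = new StringBuilder();
        render(out, 0);
        return out.toString();
    }

    private void render(StringBuilder out, int depth)
    {
        out.append("  ".repeat(depth)).append(name).append('\n');
        for (SketchSpan child : children) {
            child.render(out, depth + 1);
        }
    }
}

final class SpanHierarchySketch
{
    // Builds query -> (dispatch, analyzer, planner, scheduler -> stages).
    // Span names are illustrative assumptions, not the names Trino emits.
    static SketchSpan buildQueryTrace(int stageCount)
    {
        SketchSpan query = new SketchSpan("query", null);
        new SketchSpan("dispatch", query).end();
        new SketchSpan("analyzer", query).end();
        new SketchSpan("planner", query).end();

        // All stage spans are grouped under a single scheduler span.
        SketchSpan scheduler = new SketchSpan("scheduler", query);
        for (int i = 0; i < stageCount; i++) {
            new SketchSpan("stage " + i, scheduler).end();
        }
        scheduler.end();
        query.end();
        return query;
    }
}
```

Calling `SpanHierarchySketch.buildQueryTrace(2).render()` yields an indented tree in which both stage spans sit one level below the scheduler span, so a trace viewer can collapse all execution work into one node.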

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Dec 13, 2023
@wweiss-starburst wweiss-starburst force-pushed the wweiss/add-runner-span branch 3 times, most recently from 8702503 to 3838e1a Compare December 13, 2023 16:35
Member

@losipiuk losipiuk left a comment

Looks good - mostly style related comments.

@@ -835,6 +844,7 @@ public void run()
failure = closeAndAddSuppressed(failure, nodeAllocator);

failure.ifPresent(queryStateMachine::transitionToFailed);
schedulerSpan.end();
Member

should we attach failure to schedulerSpan if present?

Contributor Author

I added a new event that writes the exception's getMessage() to a new FAILURE attribute key.
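
A minimal sketch of that idea, assuming a stand-in span type rather than the real OpenTelemetry classes: if a failure is present, its message is written to an event attribute before the span is ended. `RecordingSpan`, `FAILURE_KEY`, `EVENT_NAME`, and `finish` are hypothetical names chosen for illustration and do not come from the PR. With the OpenTelemetry API the equivalent call would be `Span.addEvent(name, attributes)`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class FailureEventSketch
{
    // Hypothetical attribute key and event name; the PR's actual values are not shown in this thread.
    static final String FAILURE_KEY = "failure";
    static final String EVENT_NAME = "scheduler failure";

    // Simple holder for one recorded event.
    static final class Event
    {
        private final String name;
        private final Map<String, String> attributes;

        Event(String name, Map<String, String> attributes)
        {
            this.name = name;
            this.attributes = Map.copyOf(attributes);
        }

        String name()
        {
            return name;
        }

        Map<String, String> attributes()
        {
            return attributes;
        }
    }

    // Stand-in for a span that remembers its events; NOT the OpenTelemetry Span interface.
    static final class RecordingSpan
    {
        private final List<Event> events = new ArrayList<>();
        private boolean ended;

        void addEvent(String name, Map<String, String> attributes)
        {
            if (ended) {
                throw new IllegalStateException("span already ended");
            }
            events.add(new Event(name, attributes));
        }

        void end()
        {
            ended = true;
        }

        boolean isEnded()
        {
            return ended;
        }

        List<Event> events()
        {
            return List.copyOf(events);
        }
    }

    // Records the failure message (if any) as an event, then always ends the span.
    static void finish(RecordingSpan span, Optional<? extends Throwable> failure)
    {
        try {
            // String.valueOf guards against a null message, which Map.of would reject.
            failure.ifPresent(throwable ->
                    span.addEvent(EVENT_NAME, Map.of(FAILURE_KEY, String.valueOf(throwable.getMessage()))));
        }
        finally {
            span.end();
        }
    }
}
```

Recording the event before `end()` matters in this sketch because the stand-in span rejects events once it has ended. The `finally` block ensures the span is closed even if recording the event throws.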

@wweiss-starburst wweiss-starburst force-pushed the wweiss/add-runner-span branch 2 times, most recently from 25cb072 to 938be6d Compare December 18, 2023 15:04
@Experimental(eta = "2023-09-01")
-public class ExchangeContext
+public interface ExchangeContext
Member

Based on the current usages, we could leave ExchangeContext as a concrete class.
What's the reason for making it an interface?

Contributor Author

Adding the Span to ExchangeContext forces the OpenTelemetry dependencies to change scope from runtime to provided. That change ripples into all the other pom files and makes the PR much larger.

Contributor

I don't follow

Contributor Author

@wweiss-starburst wweiss-starburst Dec 28, 2023

When adding a span to ExchangeContext, it creates a new dependency on opentelemetry-context. The build error is:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.11.0:compile (default-compile) on project trino-spi: Compilation failure
[ERROR] /Users/walter.weiss/git/opensource/os_trino/core/trino-spi/src/main/java/io/trino/spi/exchange/ExchangeContext.java:[33,27] cannot access io.opentelemetry.context.ImplicitContextKeyed
[ERROR] class file for io.opentelemetry.context.ImplicitContextKeyed not found

To get around this, I changed the dependency scope of opentelemetry-context from runtime to provided. However, that forces a change in scope for nearly all the other projects as well, so to minimize the blast radius it seemed more straightforward to change ExchangeContext to an interface. If you have an alternative suggestion, I am open to it.

Comment on lines +21 to +29
default OpenTelemetry getOpenTelemetry()
{
    throw new UnsupportedOperationException();
}

default Tracer getTracer()
{
    throw new UnsupportedOperationException();
}
Member

remove default implementations. the implementations are supposed to be complete

Member

This is the same pattern as ConnectorContext. The reason for having this as an interface (and for migrating ExchangeContext to an interface) is that otherwise trino-spi would need to depend on opentelemetry-context, which is part of the OpenTelemetry implementation, and not only on opentelemetry-api.
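
The interface-with-throwing-defaults pattern discussed here can be sketched roughly as follows. All type names (`SketchTracer`, `SketchExchangeContext`, `EngineExchangeContext`, `ExchangeContextSketch`) are hypothetical stand-ins, not the real Trino SPI or OpenTelemetry types. The idea is that the SPI side only declares the accessor, while the engine side supplies the object that actually holds tracing state.

```java
// Hypothetical stand-in for a tracer type; NOT io.opentelemetry.api.trace.Tracer.
interface SketchTracer
{
    String name();
}

// SPI side: only declares what is available. The throwing default lets
// implementations that predate tracing (for example test doubles) keep compiling.
interface SketchExchangeContext
{
    String getQueryId();

    default SketchTracer getTracer()
    {
        throw new UnsupportedOperationException();
    }
}

// Engine side: the concrete implementation that owns the tracing state.
final class EngineExchangeContext
        implements SketchExchangeContext
{
    private final String queryId;
    private final SketchTracer tracer;

    EngineExchangeContext(String queryId, SketchTracer tracer)
    {
        this.queryId = queryId;
        this.tracer = tracer;
    }

    @Override
    public String getQueryId()
    {
        return queryId;
    }

    @Override
    public SketchTracer getTracer()
    {
        return tracer;
    }
}

final class ExchangeContextSketch
{
    // Reports whether the given context overrides the throwing default.
    static boolean supportsTracing(SketchExchangeContext context)
    {
        try {
            context.getTracer();
            return true;
        }
        catch (UnsupportedOperationException e) {
            return false;
        }
    }
}
```

The trade-off raised in the review still applies to this sketch: a throwing default keeps older implementations source-compatible, but it moves the failure from compile time to run time for any implementation that forgets to override the method.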

@findepi
Member

findepi commented Dec 19, 2023

cc @nineinchnick @wendigo

@wweiss-starburst wweiss-starburst force-pushed the wweiss/add-runner-span branch 2 times, most recently from aefa4a6 to abd4cb0 Compare January 5, 2024 17:05
losipiuk and others added 3 commits January 9, 2024 11:58
JSON serializability of ExchangeSourceHandles is required in some
contexts.
The Trino spans below the Query span are Dispatch, Analyzer, Planner
and then a sequence of Stage spans. Placing the set of Stage spans under
a new Scheduler span allows all execution spans to be grouped together.
In addition, new Exchange spans are also being added to instrument the
exchanges that operate across stages.

Normalize getter in ExchangeContext interface.
The experimental designation for exchanges had an ETA of 9/1/23 and can
now be removed.
@losipiuk losipiuk force-pushed the wweiss/add-runner-span branch from abd4cb0 to 1bf2ee6 Compare January 9, 2024 10:59
@losipiuk losipiuk merged commit 292c101 into trinodb:master Jan 19, 2024
90 of 91 checks passed
@github-actions github-actions bot added this to the 437 milestone Jan 19, 2024
@colebow
Member

colebow commented Jan 24, 2024

Does this need a release note?

@wendigo
Contributor

wendigo commented Jan 24, 2024

@colebow no

5 participants