Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update trace example for Kafka messaging #1807

Conversation

kenfinnigan
Copy link
Member

Minor update to the Kafka messaging example for tracing, as after seeing a real-life example of this situation it didn't look right. I now realize it makes more sense for the second producer to be a child of the receiver, as opposed to no parent and a link to the first producer.

@kenfinnigan kenfinnigan requested review from a team July 8, 2021 13:07
@arminru arminru added area:semantic-conventions Related to semantic conventions spec:trace Related to the specification/trace directory labels Jul 8, 2021
@arminru
Copy link
Member

arminru commented Jul 8, 2021

Related to #1799.

@kenfinnigan kenfinnigan force-pushed the update-kafka-trace-messaging-example branch from 7b4ebc9 to 87eb392 Compare July 14, 2021 14:54
@@ -263,8 +263,8 @@ Process CB: | Span Rcv2 |
| Field or Attribute | Span Prod1 | Span Rcv1 | Span Proc1 | Span Prod2 | Span Rcv2
|-|-|-|-|-|-|
| Span name | `"T1 send"` | `"T1 receive"` | `"T1 process"` | `"T2 send"` | `"T2 receive`" |
| Parent | | Span Prod1 | Span Rcv1 | | Span Prod2 |
| Links | | | | Span Prod1 | |
| Parent | | Span Prod1 | Span Rcv1 | Span Rcv1 | Span Prod2 |
Copy link
Member

@Oberon00 Oberon00 Jul 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#1799's suggestion of

Suggested change
| Parent | | Span Prod1 | Span Rcv1 | Span Rcv1 | Span Prod2 |
| Parent | | Span Prod1 | Span Rcv1 | Span Proc1| Span Prod2 |

seems to make more sense:

Your suggestion here is:

  • Prod1
    • Rcv1
      • Proc1
      • Prod2
        • Rcv2

Assuming the second message is produced when processing the first, and not as part of just receiving the first message, #1799 seems intuitively correct and also a natural outcome as long as the processing span properly sets its Span as "current":

  • Prod1
    • Rcv1
      • Proc1
        • Prod2
          • Rcv2

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's where the issue is, for most message processing with Quarkus, Spring Boot, likely others, when the library/framework handles the production of a new message, the method doing the processing has already been completed.

It's also possible for post method processing evaluation by the library/framework to determine the outcome is to negatively acknowledge the consumed message and not produce a new outgoing message.

From this perspective it's more natural to say producing a new message is a subsequent step after processing an incoming message, as opposed to a direct child of the processing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for most message processing with Quarkus, Spring Boot, likely others, when the library/framework handles the production of a new message, the method doing the processing has already been completed

Interesting, can you maybe give me a link to some example code? I would have expected the processing method simply does something like call myMessagingSystemConnection.connectToQueue("foo").send(myMessage).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what you said, maybe we need to expand the definition of message receiving to also optionally include processing by a framework. Or we should do the actual business logic processing in a nested processing span /or even an INTERNAL span)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Quarkus there is https://quarkus.io/guides/kafka#the-price-converter as an example of a processing business method that has no control over creating a PRODUCER span for @Outgoing.

For Spring there is https://docs.spring.io/spring-kafka/reference/html/#reply-message

It's possible I've misinterpreted the current spec for message tracing as well. I understood the operation name "receive" to just be related to receiving the message from some other system, and "process" was the business logic that acted on the received message. Maybe I'm wrong in my understanding.

If actual business logic was inside an INTERNAL span, what would a "process" operation relate to?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how things being different with JMS comes into play.

Weren't we discussing the difference between receive/consume and the hypothetical dispatch span?

So not directly related to this example, only insofar that I don't think the Kafka case can be modeled well at all with the current semantic conventions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, sorry. Thinking on it some more I'm not sure the possible dispatch span really adds anything, as the framework handling of the receive, executing a method with the message, and possibly producing a new outgoing message, is fairly "thin" and adding further spans in that slice can have large impacts on the performance of the reactive stream.

I still think the following is the best representation of messaging with Kafka:

  • Prod1
    • Rcv1
      • Proc1
      • Prod2
        • Rcv2

However, if there's further work required to define a more appropriate span representation of messaging with Kafka, an alternative is the removal of the example completely until it's rectified.

My main goal is to fix the current example, as it's definitely wrong. Whether that's fixing it how I've described, or removing it for later re-evaluation I don't mind

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and adding further spans in that slice can have large impacts on the performance

I did not suggest to add additional spans but to replace the receive span with a dispatch span.
Receiving should only be receiving, not the follow-up processing. A receive span as defined currently cannot even have the producer as parent because more or less by definition it starts before we can read anything from the message, as it tracks the time between asking the messaging system for a new message and getting a response. Anything that already has the message as input is a "process" span, not "receive" according to the current definitions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, actually I remembered that wrong, the spec says:

The receive span is [to] be used to track the time used for receiving the message(s), whereas the process span(s) track the time for processing the message(s). Note that one or multiple Spans with messaging.operation = process may often be the children of a Span with messaging.operation = receive. The distinction between receiving and processing of messages is not always of particular interest or sometimes hidden away in a framework (see the Message consumption section above) and therefore the attribute can be left out.

So actually, I think using "Recv1" as name is fine here, but it would maybe be better to leave out the messaging.operation entirely.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's reasonable, let me update the example to not have messaging.operation set

@kenfinnigan kenfinnigan force-pushed the update-kafka-trace-messaging-example branch from 87eb392 to 469544c Compare July 15, 2021 17:56
@kenfinnigan
Copy link
Member Author

Any additional questions/comments?

@kenfinnigan
Copy link
Member Author

Any feedback @arminru?

@kenfinnigan
Copy link
Member Author

Any comments or feedback?

@kenfinnigan kenfinnigan force-pushed the update-kafka-trace-messaging-example branch from 469544c to f6b6a7d Compare July 29, 2021 19:54
@kenfinnigan
Copy link
Member Author

@Oberon00 With the example updated to remove message.operation, is it aligned with what you were thinking or needs further work?

I'm on PTO next week so wanted to see if this can be resolved before

@Oberon00
Copy link
Member

Oberon00 commented Aug 5, 2021

I think the example data looks fine, but you need to add some explanation summarizing the outcome of our discussion, for why the example (parent/child, messaging.operation) is the way it is now.
Also, the example should maybe be renamed to "Apache Kafka with Quarkus or Spring Boot Example"

@kenfinnigan
Copy link
Member Author

@Oberon00 Let me know how that sounds, and whether it needs some wordsmithing

Thanks for all the discussion and feedback

Frameworks such as Quarkus and Spring Boot separate processing of a received message from producing subsequent messages out.
For this reason, receiving (Span Rcv1) is the parent of both processing (Span Proc1) and producing a new message (Span Prod2).
The span representing message receiving (Span Rcv1) should not set `messaging.operation` to `receive`,
as the work is hidden away by frameworks such as Quarkus and Spring Boot.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me, the reasoning would be "as it does not only receive the message but also converts the input message to something suitable for the processing operation to consume and creates the output message from the result of processing."

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made the update to this as it's clearer than what I had. Thanks

@github-actions
Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Aug 14, 2021
@kenfinnigan
Copy link
Member Author

Could someone else kindly review, and hopefully approve, this?

@carlosalberto carlosalberto merged commit bc39933 into open-telemetry:main Aug 16, 2021
@kenfinnigan kenfinnigan deleted the update-kafka-trace-messaging-example branch September 13, 2021 18:10
carlosalberto pushed a commit to carlosalberto/opentelemetry-specification that referenced this pull request Oct 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:semantic-conventions Related to semantic conventions spec:trace Related to the specification/trace directory Stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants