[Feature Request] Store misc information within TraceContext which are not propagated #929

alonbl · 2019-06-26T10:05:42Z

Feature:
Store misc information within TraceContext which are not propagated

Rational
Currently, based on experiments and reading the source code, the TraceContext.extra() is limited to a pre-defined key/value set that is designed to be propagated. However, it may be usable to store additional data which is local to the trace context which are not going to be propagated but consumed by Propagation.Factory.decorate(TraceContext).

Example Scenario
Provided there is HTTP header that contains a JWT of the identity of the user used to create a flow in system, this HTTP header is marked to be propagated to all calls that are part of this transaction.

There are claims in the JWT such as sub, name, email which should be written to logs per that transaction, exactly the same as the trace id and span id. These are calculated values out of JWT. There is no reason to propagate these as we already propagate the JWT, and there is no reason to calculate them over and over.

The solution we thought is to parse the token at a filter and add the required claims into the TraceContext.extra, implement Propagation.Factory.decorate(TraceContext) to load the information into MDC. However, the TraceContext.extra is not friendly for custom/generic input as implementation expects a concrete classes and structure [NOTE: it may well that I do not fully understand the multiple class handling within the list, however, it is not simple to register a new class post construction].

Will you consider a patch that will enable a generic Map<String, Object> that can be attached to the TraceContext for this kind of use case?

Prior Art

More interoperable propagation via extra stuff in TraceContext #390 - not sure if relevant
WIP TraceContext propagation handler #693 - not sure if relevant

Thanks!

codefromthecrypt · 2019-06-26T10:13:47Z

if not propagating and the only reason is MDC.. I am wondering why you wouldnt just use an MDC filter instead of using brave to do this .

…

On Wed, Jun 26, 2019, 6:05 PM Alon Bar-Lev ***@***.***> wrote: *Feature:* Store misc information within TraceContext which are not propagated *Rational* Currently, based on experiments and reading the source code, the TraceContext.extra() is limited to a pre-defined key/value set that is designed to be propagated. However, it may be usable to store additional data which is local to the trace context which are not going to be propagated but consumed by Propagation.Factory.decorate(TraceContext). *Example Scenario* Provided there is HTTP header that contains a JWT of the identity of the user used to create a flow in system, this HTTP header is marked to be propagated to all calls that are part of this transaction. There are claims in the JWT such as sub, name, email which should be written to logs per that transaction, exactly the same as the trace id and span id. These are calculated values out of JWT. There is no reason to propagate these as we already propagate the JWT, and there is no reason to calculate them over and over. The solution we thought is to parse the token at a filter and add the required claims into the TraceContext.extra, implement Propagation.Factory.decorate(TraceContext) to load the information into MDC. However, the TraceContext.extra is not friendly for custom/generic input as implementation expects a concrete classes and structure [NOTE: it may well that I do not fully understand the multiple class handling within the list, however, it is not simple to register a new class post construction]. Will you consider a patch that will enable a generic Map<String, Object> that can be attached to the TraceContext for this kind of use case? *Prior Art* - #390 <#390> - not sure if relevant - #693 <#693> - not sure if relevant Thanks! — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#929?email_source=notifications&email_token=AAAPVV3UEJ4IBZOZHQKO34TP4M5PPA5CNFSM4H3QSXDKYY3PNVWWK3TUL52HS4DFUVEXG43VMWVGG33NNVSW45C7NFSM4G3YOQKA>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAAPVV7WL7CHMTEMFKAT6KLP4M5PPANCNFSM4H3QSXDA> .

alonbl · 2019-06-26T10:23:03Z

if not propagating and the only reason is MDC.. I am wondering why you wouldnt just use an MDC filter instead of using brave to do this .

Hi!

Thanks for the quick response.

We need to fetch the information from somewhere... As it has no knowledge of the original thread, and has only the TraceConext, also the MDC is taking values from the context and put them into MDC (restoring it later).

Maybe I do not follow what "MDC filter" you refer to...

Brave is already keep track the transaction and the context delegation, it looks like attaching information to the trace context will provide consistency of the information, in most similar context based implementations there is a opaque map to store additional data to allow consumer to attach misc information, it is the same as brave design just that there is the limit of what can be stored.

codefromthecrypt · 2019-06-26T10:38:30Z

this should probably move to gitter but anyway. to clarify what you mean by not propagated can help. maybe choose one of.. * I need this data on the same thread that took the server request and enclosed by the processing of that request (single span) * I need this data from the initial request including any intermediate steps (in process propagation), but not downstream (remote propagation) if the former there are a number of ways to do this depending on what framework you use. for example something as simple as servlet filter which looks at that header and uses the MDC api directly. brave is not arbitrary middleware so I am trying to explore options in case you dont need anything tracing specific. some of this depends on how you need the data. for example in our grpc implementation we make available the calling method name for lookup anywhere.. I am trying to figure out if this is similar or not.

…

On Wed, Jun 26, 2019, 6:23 PM Alon Bar-Lev ***@***.***> wrote: if not propagating and the only reason is MDC.. I am wondering why you wouldnt just use an MDC filter instead of using brave to do this . Hi! Thanks for the quick response. We need to fetch the information from somewhere... As it has no knowledge of the original thread, and has only the TraceConext, also the MDC is taking values from the context and put them into MDC (restoring it later). Maybe I do not follow what "MDC filter" you refer to... Brave is already keep track the transaction and the context delegation, it looks like attaching information to the trace context will provide consistency of the information, in most similar context based implementations there is a opaque map to store additional data to allow consumer to attach misc information, it is the same as brave design just that there is the limit of what can be stored. — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#929?email_source=notifications&email_token=AAAPVV35KD65YAT2XD6O3MTP4M7QPA5CNFSM4H3QSXDKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODYTCF3Y#issuecomment-505815791>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAAPVV4JHXTF7FTDIGITDULP4M7QPANCNFSM4H3QSXDA> .

alonbl · 2019-06-26T10:54:16Z

OT: gitter requires more permissions that I agree to provide it... Looking into all my organizations in github, extracting members and have full access to by gitlab APIs... not sure why plain old mailing list is not available? Or at least plain user/password?

The problem is not to put the information at MDC at initial process, but to carry this information to all threads that belongs to the transaction... just like having the traceid carried to all threads automatically based on the TraceContext, I do not see any difference as all is related to the transaction scope.

I do think this is relating to tracing specific as the information consistency is per the tracing within the process.

If you think otherwise feel free to close this, however, them we will have to mimic the transaction tracing mechanism almost to the bit in order to implement the ability to have access to arbitrary data that belongs to a trace context... Which is exactly why we have brave :)

Thanks!

codefromthecrypt · 2019-06-26T11:10:14Z

From what I understand you want the ability to add and remove values such as exist with our extra fields propagation feature, except to not have these sent remotely. The primary use case is synchronization of mdc. I also understand that it could be fine to predefine the field names of these "locally propagated" attributes. This would make things easier in general if acceptable. Finally, I understand that you want this populated from a header with exactly the same value of the header also. In other words the literal header becomes the MDC key and the literal header value as the value. Do I understand? PS on the alternative to gitter we do have a plain old mailing list also. [email protected] https://groups.google.com/forum/#!forum/zipkin-user

…

On Wed, Jun 26, 2019, 6:54 PM Alon Bar-Lev ***@***.***> wrote: OT: gitter requires more permissions that I agree to provide it... Looking into all my organizations in github, extracting members and have full access to by gitlab APIs... not sure why plain old mailing list is not available? Or at least plain user/password? The problem is not to put the information at MDC at initial process, but to carry this information to all threads that belongs to the transaction... just like having the traceid carried to all threads automatically based on the TraceContext, I do not see any difference as all is related to the transaction scope. I do think this is relating to tracing specific as the information consistency is per the tracing within the process. If you think otherwise feel free to close this, however, them we will have to mimic the transaction tracing mechanism almost to the bit in order to implement the ability to have access to arbitrary data that belongs to a trace context... Which is exactly why we have brave :) Thanks! — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#929?email_source=notifications&email_token=AAAPVVYWEDZA5MRTBA62UGDP4NDFRA5CNFSM4H3QSXDKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODYTEPUI#issuecomment-505825233>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAAPVV4YISJ6II3UPMWSA2DP4NDFRANCNFSM4H3QSXDA> .

alonbl · 2019-06-26T11:21:43Z

st with our extra fields propagation feature, except to not have these sent remotely. The primary use case is synchronization of mdc.

Correct! not only mdc, but let's take mdc as an example, the context data is valid in any scenario, whenever this trace is communicate to, can it be different thread, green thread or bean.

I also understand that it could be fine to predefine the field names of these "locally propagated" attributes. This would make things easier in general if acceptable.

Yes, if mandatory for the solution I can settle to have them pre-defined and generate the required initialization. However, I prefer this to be free style.

Finally, I understand that you want this populated from a header with exactly the same value of the header also. In other words the literal header becomes the MDC key and the literal header value as the value.

No... the values can be taken from any source, I just gave an example of a scenario in which there is a header with JWT format, and the information I would like in the context are selected claims, so I would like parse the header one time (even get it from baggage) and put the calculated data (in this case it is the claims) in context. Also the restriction of having string only in context is very limited for this scenario, but it is not a show stopper.

PS on the alternative to gitter we do have a plain old mailing list also.

These are not published anywhere :)
I would have used it.

This lets you use the extra field propagation mechanism to propagate in, but not out of process. This can be used to correlate properties with logs in a way that doesn't add network overhead. This design was chosen as it incurs very little in-process overhead and has the least burden on the code base. It can be used as a stop-gap until a more fully featured correlation context feature is developed. Fixes #929 See also #577

codefromthecrypt · 2019-06-26T12:50:41Z

PS I don't really want to mention or encourage use of this for passing around security related data. However, the following change should work for the use case you mention I think #931

alonbl · 2019-06-26T13:07:52Z

The root cause and security context in distributed transaction is essential for auditing and logging. The sleuth/brave mechanism enables that quite nicely.

The #931 is good enough solution, thanks for making that happen!

Please correct me if I am wrong... In sleuth words....

A new patch is required:

Introduce new variables spring.sleuth.redacted-propagation-keys of type list.
Have another loop at[1] to call addRedactedField per each entry in the environment.

Application will:
The use the existing ExtraFieldPropagation.set() to set these additional values which are added as redacted, these will be available in TraceContext but will not be sent over the wire.

BTW: We can avoid sleuth change if we have somekind of "-key" as add with redacted and "key" as existing behavior... not sure you like it but it will keep the API intact and have the ability to leverage the new feature without any change in consumer... :)

[1] https://github.com/spring-cloud/spring-cloud-sleuth/blob/v2.1.1.RELEASE/spring-cloud-sleuth-core/src/main/java/org/springframework/cloud/sleuth/autoconfig/TraceAutoConfiguration.java#L153

codefromthecrypt · 2019-06-26T13:17:07Z

The root cause and security context in distributed transaction is essential for auditing and logging. The sleuth/brave mechanism enables that quite nicely. The #931 is good enough solution, thanks for making that happen!

coolio

Please correct me if I am wrong... In sleuth words.... A new patch is required: Introduce new variables spring.sleuth.redacted-propagation-keys of type list. Have another loop at[1] to call addRedactedField per each entry in the environment. Application will: The use the existing ExtraFieldPropagation.set() to set these additional values which are added as redacted, these will be available in TraceContext but will not be sent over the wire.

yeah the above would be in a sleuth issue for integration

BTW: We can avoid sleuth change if we have somekind of "-key" as add with redacted and "key" as existing behavior... not sure you like it but it will keep the API intact and have the ability to leverage the new feature without any change in consumer... :) [1] https://github.com/spring-cloud/spring-cloud-sleuth/blob/v2.1.1.RELEASE/spring-cloud-sleuth-core/src/main/java/org/springframework/cloud/sleuth/autoconfig/TraceAutoConfiguration.java#L153

technically you can do this now without a code change. This mentioned change is just more direct and efficient. The way to do it without changing current code is via composition. You could wrap ExtraFieldPropagation with one that monkeys with the setter to ensure things aren't leaked like so: @OverRide public <C> Injector<C> injector(Setter<C, K> setter) { return delegate.injector((carrier, key, value) -> { if (key.equals("redacted key")) return; setter.put(carrier, key, value); }); }

alonbl · 2019-06-26T13:22:37Z

Oh.. I mean to have the addField check if fieldName has '-' at start or end and either behave as addField or addRedactedField without introducing a new method, so that all sleuth will require to do is use the newer version of brave to enjoy this new feature.

codefromthecrypt · 2019-06-26T13:28:16Z

Oh.. I mean to have the addField check if fieldName has '-' at start or end and either behave as addField or addRedactedField without introducing a new method, so that all sleuth will require to do is use the newer version of brave to enjoy this new feature.

heh clever idea but we don't usually do things like that. it is easier to maintain the codebase when there aren't interesting expressions.

alonbl · 2019-06-26T13:33:11Z

You cannot blame me for trying :)

Thanks!!!

The minor gaps I still have are:

Ability to attach "object" and not only plain strings.
Ability to attach arbitrary keys, can be solved by having the opposite BitSet... mark these that are regular so that everything else is redacted.

However, even without these two I can progress with this solution for the immediate term.

Thank you again.

codefromthecrypt · 2019-06-26T13:38:11Z

we need to switch to email

…

On Wed, Jun 26, 2019 at 9:33 PM Alon Bar-Lev ***@***.***> wrote: You cannot blame me for trying :) Thanks!!! The minor gaps I still have are: Ability to attach "object" and not only plain strings. Ability to attach arbitrary keys, can be solved by having the opposite BitSet... mark these that are regular so that everything else is redacted. However, even without these two I can progress with this solution for the immediate term. Thank you again. — You are receiving this because you commented. Reply to this email directly, view it on GitHub, or mute the thread.

alonbl · 2019-06-26T14:11:10Z

we need to switch to email

Done[1]

[1] https://groups.google.com/forum/#!topic/zipkin-user/C0caGZ5B_eg

This lets you use the extra field propagation mechanism to propagate in, but not out of process. This can be used to correlate properties with logs in a way that doesn't add network overhead. This design was chosen as it incurs very little in-process overhead and has the least burden on the code base. It can be used as a stop-gap until a more fully featured correlation context feature is developed. Fixes #929 See also #577

codefromthecrypt mentioned this issue Jun 26, 2019

Adds ability to redact fields from downstream propagation #931

Merged

codefromthecrypt closed this as completed in #931 Jul 3, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Feature Request] Store misc information within TraceContext which are not propagated #929

[Feature Request] Store misc information within TraceContext which are not propagated #929

alonbl commented Jun 26, 2019

codefromthecrypt commented Jun 26, 2019 via email

alonbl commented Jun 26, 2019

codefromthecrypt commented Jun 26, 2019 via email

alonbl commented Jun 26, 2019

codefromthecrypt commented Jun 26, 2019 via email

alonbl commented Jun 26, 2019 •

edited

Loading

codefromthecrypt commented Jun 26, 2019

alonbl commented Jun 26, 2019

codefromthecrypt commented Jun 26, 2019 via email

alonbl commented Jun 26, 2019

codefromthecrypt commented Jun 26, 2019 via email

alonbl commented Jun 26, 2019

codefromthecrypt commented Jun 26, 2019 via email

alonbl commented Jun 26, 2019

[Feature Request] Store misc information within TraceContext which are not propagated #929

[Feature Request] Store misc information within TraceContext which are not propagated #929

Comments

alonbl commented Jun 26, 2019

codefromthecrypt commented Jun 26, 2019 via email

alonbl commented Jun 26, 2019

codefromthecrypt commented Jun 26, 2019 via email

alonbl commented Jun 26, 2019

codefromthecrypt commented Jun 26, 2019 via email

alonbl commented Jun 26, 2019 • edited Loading

codefromthecrypt commented Jun 26, 2019

alonbl commented Jun 26, 2019

codefromthecrypt commented Jun 26, 2019 via email

alonbl commented Jun 26, 2019

codefromthecrypt commented Jun 26, 2019 via email

alonbl commented Jun 26, 2019

codefromthecrypt commented Jun 26, 2019 via email

alonbl commented Jun 26, 2019

alonbl commented Jun 26, 2019 •

edited

Loading