Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for Cloudwatch logging via trace ID injection #188

Open
kut opened this issue Dec 3, 2019 · 16 comments
Open

Support for Cloudwatch logging via trace ID injection #188

kut opened this issue Dec 3, 2019 · 16 comments

Comments

@kut
Copy link

kut commented Dec 3, 2019

Hi - with the release of ServiceLens (which looks very helpful), I was wondering if you were considering adding trace ID injection somehow? This is required to match Cloudwatch logs to traces in XRay (and thus ServiceLens), see here.

This works in the XRay Java SDK by injecting the trace ID into the MDC (see here), which means that each log is given a suffix that includes the trace ID (can then be picked up by the ServiceLens/XRay console to tie traces to logs), e.g. :

2019-09-10 18:58:30.844 [nio-5000-exec-4] AWS-XRAY-TRACE-ID: 1-5d77f256-19f12e4eaa02e3f76c78f46a WARN 1 - Your logging message here

Any thoughts on supporting this in the way that the Java SDK does? I don't know much about the MDC but I've only seen it in Java.

@willarmiros
Copy link
Contributor

willarmiros commented Dec 11, 2019

Hi @kut,
Python uses a single built-in library for logging, which makes things much less complicated than they were in Java. It appears this library also supports context-specific information, which is what the MDC does in Java. Specifically, it seems like logging filters would be the way to go. They're already integrated with thread.local which we use under the hood. Lastly, It's very easy to format messages since the library has a built in Formatter object that does that.

So basically, all the ingredients to add trace ID injection are there. We might need to add hooks into beginSegment and endSegment to do these injections depending on how exactly filters work. Overall this seems very doable. I will add it to the backlog and hopefully we can get it prioritized soon!

@kut
Copy link
Author

kut commented Dec 12, 2019

Awesome @willarmiros thanks! ServiceLens/XRay/Cloudwatch interplay is going to save us a whole lot of pain.

@kapilt
Copy link
Contributor

kapilt commented Mar 27, 2020

interesting, so to generate servicelens support, we just need to add a logging formatter that attaches the current_segment into the log line output? is there a separate mechanism for structured logging (i assume a well known key).

@Giaco9
Copy link

Giaco9 commented May 9, 2020

Hi all. Is @kapilt right? Any news on this? ServiceLens looks amazing! Great job :)

@willarmiros
Copy link
Contributor

@kapilt That would work for to enable Trace ID injection, and there is not a specifically required key for structured logs (though we use AWS-XRAY-TRACE-ID) because we query by Trace ID rather than by key. However to enable full log correlation you also need to have the CloudWatch log group in the body of the segment so that ServiceLens knows which log group(s) to query. See the java SDK where this is done automatically via plugins.

We are actively discussing our roadmap for native ServiceLens support across all our SDKs so that integration works out of the box.

@Giaco9 thanks for the feedback :)

@kut
Copy link
Author

kut commented Nov 16, 2020

Hey @willarmiros - any update on whether this is on roadmap? Thanks!

@sstoops
Copy link

sstoops commented Nov 16, 2020

@kut Your comment reminded me I needed to mention here how I got this working, even without official SDK support. I first added a small utility to my codebase:

"""Additional tools for configuring X-Ray in a Python process.

Borrowed from: https://github.com/garyd203/xraysink/blob/master/src/xraysink/config.py

"""

from aws_xray_sdk.core import xray_recorder


def set_xray_log_group(log_group: str):
    """Set the CloudWatch Logs log group used by this process.

    The log group is used in AWS ServiceLens to help link a trace with the
    related log records. The caller is still responsible for putting the
    trace ID into every log record (there is no standard, it simply needs to
    appear verbatim somewhere in the log message, whether structured or
    unstructured)

    This function interacts with the AWS segment metadata that is set by the
    SDK plugins. This means that if you reset the plugin data (by setting the
    plugins to `()`), then the log group will be reset also. It also means
    that if any plugin also sets the log group, then the most recently set
    value will persist.

    """

    log_resources = xray_recorder._aws_metadata.setdefault("cloudwatch_logs", [{}])
    log_resources[0]["log_group"] = log_group

I execute this utility upon initial application setup. For example, in a Django application, I put the following at the bottom of my settings file:

AWS_XRAY_SDK_ENABLED = os.environ.get("AWS_XRAY_SDK_ENABLED", False)
AWS_LOG_GROUP = os.environ.get("AWS_LOG_GROUP", None)
if AWS_XRAY_SDK_ENABLED == "True" and AWS_LOG_GROUP:
    set_xray_log_group(AWS_LOG_GROUP)

I then created a custom logging filter that adds the trace ID to my log records:

import logging
import os

from aws_xray_sdk.core import xray_recorder

AWS_XRAY_SDK_ENABLED = os.environ.get("AWS_XRAY_SDK_ENABLED", False)


class TraceIdFilter(logging.Filter):
    """Adds the AWS X-Ray Trace ID to the log record.

    Since this filter checks for the current X-Ray segment, it can cause log messages
    to be generated. We need to make sure we ignore these so we don't get into an
    infinite recursion situation.

    """

    if AWS_XRAY_SDK_ENABLED == "True":

        def filter(self, record):
            if (
                record.name == "aws_xray_sdk.core.context"
                and record.funcName == "handle_context_missing"
            ):
                return True
            trace = xray_recorder.get_trace_entity()
            if trace and trace.trace_id:
                record.__dict__.update(traceId=trace.trace_id)
            else:
                record.__dict__.setdefault("traceId", "none")
            return True

    else:

        def filter(self, record):
            return True

Then you'll need to configure your logging to utilize this filter. I send all my logs to CloudWatch as json structures, so the trace id just becomes a field in that object.

I believe this is all I did to get this working. It's been a couple months since I put this all together so hopefully I have not forgotten anything.

@willarmiros
Copy link
Contributor

@Brutimus thanks for sharing that workaround! @kut our plans have been shifted slightly so we haven't been able to add trace ID injection in Python to our roadmap. We have been devoting our resources to establishing X-Ray Support in the OpenTelemetry SDK. It is possible that we or another member of the community will introduce support for Trace ID into logs injection over there as well. This issue will be updated when support is added to either.

@kut
Copy link
Author

kut commented Nov 21, 2020

Thanks for the update @willarmiros, curious to see how OT evolves.

@Brutimus this is awesome many thanks!

@kornicameister
Copy link

Hi,

has anyone used this library: https://pypi.org/project/aws-lambda-logging/

?

@TarekAS
Copy link

TarekAS commented Nov 22, 2021

Is this feature still on the roadmap or am I better off migrating to OpenTelemetry?

@NathanielRN
Copy link
Contributor

Hi @TarekAS,

The X-Ray team has been doing a lot of work to make OpenTelemetry compatible with AWS X-Ray so you should be able to find lots of useful features if you migrate to using that project!

In regards to this specific feature, it seems that we are only partially complete towards getting full CloudWatch ServceLens <-> X-Ray Segment Correlation. There are 2 parts we need, but only 1 is fully implemented:

1 - (TODO) X-Ray Segment (created from OTel Span) Trace ID Injection into Application Logs

This feature requires OpenTelemetry to have a package that will read OTel Python spans, format them in AWS X-Ray format, and inject them into logs generated using the Python logging library. We still need to create this feature, so I created an issue for this on the aws-otel-python repo: aws-observability/aws-otel-python#46

2 - (DONE) Application Log Group injection into X-Ray Segment (created from OTel Span)

This feature is complete! As long as you use OTel Python with the ADOT Collector (a subset of the OTel Collector) you'll be able to set the aws.log.group.names attribute on the Application Resource and have it appear on all exported X-Ray Segments.

In OTel Python this would look something like this:

resource = Resource(
    {
        ResourceAttributes.AWS_LOG_GROUP_NAMES: "my-group",
    }
)

This resource object is then used to uniquely identify your app as you instrument with OpenTelemetry Python. You can learn more about this by viewing our example application instrumented with OTel Python on the aws-otel-python repo.

Putting the log groups on the Segment is what helps ServiceLens know which log groups to look through when correlating X-Ray Segments to CloudWatch Logs (which have the Trace ID injected).

Conclusion

The TL;DR is that we need aws-observability/aws-otel-python#46 before OTel is ready to support this feature, but hopefully you'll find many other useful features on OTel if you do have the chance to make the switch to it!

@TarekAS
Copy link

TarekAS commented Dec 6, 2021

Hi @NathanielRN

Right now we are manually injecting the Xray Trace ID into the logs for each request, as well as adding the cloudwatch_logs group name to the trace resource metadata (as shown by @Brutimus, thanks).

It's working fine so far, except for one issue with integrating Cloudwatch/Service Lens: The View logs button is greyed out in the Service Lens UI and in the trace details the All logs for this trace section shows an error: An error occurred fetching your data. This affects the usability significantly. Could we be doing something wrong?

@NathanielRN
Copy link
Contributor

Hi @TarekAS ! I haven't personally tried out the solution above, but I would only ask if you can make sure the logging format is as in the docs that I linked in the issue above. Can you confirm that your logs have a format like the following? AWS-XRAY-TRACE-ID: 1-5d77f256-19f12e4eaa02e3f76c78f46a@1ce7df03252d99e1.

Next, can you manually search the X-Ray console for that trace ID and see that Trace there?

It sounds like you are having trouble going from the Service Lens UI -> CloudWatch Logs & Service Lens UI -> X-Ray Console. If that is the case I can follow up with my team and try to get you in contact with the right Service Lens maintainers 🙂

@TarekAS
Copy link

TarekAS commented Dec 12, 2021

Hi @NathanielRN

I fixed it by properly injecting the cloudwatch logs group name into the trace segment (I was using {} instead of [{}] for the cloudwatch_logs metadata). The logs appear correctly under the Traces view in CloudWatch under All logs for this tace.

However, the View Logs button is still greyed out in the ServiceLens Map view for some reason, even though the logs are there when I click on an individual trace under View traces

@NathanielRN
Copy link
Contributor

@TarekAS Nice I’m glad you made some progress!

Hm I’m not completely familiar with ServiceLens so I might be off here. It sounds like you did Part 1 in my comment above already; you injected the Trace ID into the log.

Can you confirm you also did Part 2? Did you inject the log group into the X-Rat Trace Segment? Specifically, can you see the log group of the logs you’re looking for under the aws.log.group.names key of the trace?

It could not be this but just an idea to cover all our bases 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants