Improve Level detection #12645
We should also consider using another field like …
I am working on Alloy modules for log processing, porting what was done for module v1 in the Agent to Alloy modules v2. This is what I'm currently doing to detect/default a log level, if it helps: …
Looking forward to the upcoming improvements. Currently we have some JSON logs containing … If so, could it be configurable?
@cyriltovena Hi, sorry for the trouble. I am using fluent-bit to send logs to Loki 3.0.0. I decided to try a non-released version of Loki (due to fixes in the loki-bloom feature), but noticed that I do not have … Just in case, I have enabled log level discovery: … version: … Is it expected?
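For anyone wondering where that toggle lives: log level discovery is controlled per tenant in the Loki configuration. A minimal sketch, assuming the discover_log_levels limit this feature ships with (check the limits_config docs for your release, since names can change between versions):

```yaml
# Sketch only: assumes the per-tenant discover_log_levels limit controls
# automatic level detection; verify against the docs for your Loki version.
limits_config:
  discover_log_levels: true   # set to false to disable automatic level detection
```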
I see Loki 3.1 just landed, and indeed … Is there going to be a way to reconfigure this to …? Should I open a separate issue?
As far as I can see there is a task in Grafana: grafana/grafana#87564. To be honest, it is a bit disappointing that this was changed in Loki but the Grafana code was not changed accordingly. I have to wait for grafana/grafana#87564, and only after that can I update to version 3.1.
@svennergr can we do something for Grafana to fall back on the detected level?
Since this is sort of a breaking change, is there a way to optionally revert this behavior until Grafana is updated?
I'd love the label name to be configurable myself. Even though I completely understand why it was changed, I'm partial to … AFAICT it's just a single-line change, so I'm almost tempted to do our own internal build, but it's hard to claim that's sane from a long-term support perspective. I'd be happy to submit a PR if someone from Grafana Labs could assure me that wouldn't be wasted time.
1up on this
A further difference I see between level and detected_level: detected_level is not usable in a stream selector, only as a label filter expression in the log pipeline:
{source=~".+", level="debug"} | detected_level="debug"
That isn't a change. The auto-detected log level was always structured metadata, not a log stream label, so label filter expressions were always required. If you had a … Possibly the difference here is that in Loki 3.0, if a …
Thanks for the clarification @jtackaberry. Indeed, in my environment the label level was always populated in both Loki 3.0 and 3.1, and therefore not auto-detected in either version. The label detected_level was never populated and is therefore auto-detected in version 3.1.
Thanks @lko23. This is another unfortunate consequence of the rename to … We're in the same boat: a small handful of our services define a … We haven't updated to Loki 3.1 yet, but I'm starting to feel we'll need a custom build that reverts …
@jtackaberry hi, sorry for the disruption this has caused you. So, the reason for moving to … The reason we introduced this feature was that people had …
@trevorwhitney, thanks for the thoughtful comment.
I'm a bit confused by this, and I think this might indicate we have different goals for this functionality, possibly influenced by future integrations Grafana has planned, or there are some nuances I don't understand yet. Also ...
... except we can't always rely on this, at least not universally, because it's configurable, so it can be disabled per deployment. For me, the ideal outcome takes into account the following:
Given that, IMO the most desirable solution is:
This way, whether the level is auto-detected by Loki or set by the log collector, users searching logs only have to know about one label, and when I upgrade away from Loki 3.0, I don't need to ask people to update their saved queries/dashboards/wikis. And I'm in a position to dictate to everyone shipping logs to the platform what labels and structured metadata are valid and what aren't, and police their values to ensure compliance.
Ultimately I guess I don't see why this guarantee needs to exist, at least for values of "everyone" that mean all Loki users across all deployments. OTOH, if "everyone" means all Loki users of a given deployment or at a given organization, I believe this can be accomplished by the proposed solution above, because organizations are in a position to enforce consistency across the services under their operation. I suppose that may be wishful thinking for some large organizations, although in those egregious cases there's nothing preventing some smartass from configuring a label or structured metadata field called …
@jtackaberry fully agree with your proposed solution. For log lines that already have a label or metadata field called level, Loki should not try to auto-detect any further level information. Making the names for auto-detected fields configurable is also a good idea.
I think the first auto-detected field with an underscore introduced by Loki was service_name, and the aesthetics bothered me even then. I would have preferred service or servicename.
Got it, thanks for the feedback and helping me understand the use case. I think you nailed it that we might have different goals here. This was added largely to support Explore Logs, our new, LogQL-free UI for interacting with Loki. This experience is largely driven by metric queries, visualizing the logs over time by various aspects (levels, index labels, structured log lines, etc.). A lot of these charts are broken up by level, as that's the number one aspect users tend to break their logs up by. In order to do this, Explore Logs needs to know where to find the level. Similarly, the logs histogram in Grafana has a similar problem, as it too attempts to visualize by level. So, this is what I mean when I say "what we needed was a label we can always rely on being present, in a place we can always count on it being". Sure, you can turn off level detection completely, but when it's on, these UIs need a consistent place to look for it. So if we exposed a configuration to change the structured metadata field it's in, we would also have to expose an endpoint the UIs could hit to get that. Currently, if the log line has an indexed level (there are a few label names we accept for this), we skip detection and just copy that value to the known …
Wow, how has it been a whole month? :/
Aha, yes, things have clicked into place for me here. I understand the use case and the challenges. Thanks @trevorwhitney
Presumably another concern isn't just that the level can be reliably found in a specific place, but also one of normalization: if … But … Ultimately what I don't like about it is:
My personal preference would be to rule with an iron fist:
Default configuration could be … Presumably you will need to do something similar to the first bullet anyway, to deal with the corner case where … Is that too impractical? Is that heroics just because I really like the conciseness of …?
FWIW, we just fixed this behavior in Grafana so that …
Reading this thread and the docs, it seems to me the level detection uses hard-coded values. This is expected, but I am unsure about the best way to proceed here to have this extra log level saved as "cleared" in detected_level, aside from adding it as an option in the Go code. Should I add a pipeline stage to parse just this specific message and add a …
@camrossi the values are only hardcoded when we try to detect the level from the log line. If you send an indexed or structured metadata label that matches the allowed label names, we will use the raw value for the level (which we then store in structured metadata as …).
@trevorwhitney thanks! Unfortunately I can't control the log format, and the format is something along these lines: …
I ended up doing this to find lines with my levels (that also don't match the ones hard-coded in Grafana for the logs graph colours), and it has been working perfectly. I hope it is a sensible way to do it!
@camrossi where did you implement this, in Loki or the collector?
I put it in the promtail ScrapeConfig.
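For reference, a minimal sketch of what such a promtail scrape config could look like. This is not the actual config from above (it wasn't captured here); the job name, path, regex, and the "cleared"/"raised" level values are assumptions for illustration. Per the earlier comment, as long as the raw level is shipped as an indexed label or structured metadata under an accepted name such as level, Loki uses that value instead of its hard-coded detection:

```yaml
scrape_configs:
  - job_name: custom_levels                    # hypothetical job name
    static_configs:
      - targets: [localhost]
        labels:
          job: custom_levels
          __path__: /var/log/example/*.log     # hypothetical path
    pipeline_stages:
      # Extract the non-standard level token (e.g. "cleared") from the raw line;
      # the expression is a placeholder, since the real log format wasn't shared.
      - regex:
          expression: '(?i)severity[=:]\s*"?(?P<level>cleared|raised|warning|error|info)"?'
      # Ship it as structured metadata under the accepted name "level" so Loki
      # copies the raw value into detected_level instead of re-detecting it.
      - structured_metadata:
          level:
```

Swapping the structured_metadata stage for a labels stage would index the level instead, which also satisfies the allowed-name check mentioned above.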
We've seen a case where the log contains Warn as the level and an error message, but it would still eventually end up as error because we don't properly search for Warn with a capitalized first letter. Fixing that is the very least we should do. We might want to investigate supporting logfmt and JSON level/severity keys to make this more robust.
cc @shantanualsi @sandeepsukhani