
Space time #14

Closed
wants to merge 1 commit into from

Conversation

@jalvz commented May 17, 2021

This PR contains a hodgepodge of ideas to discuss and consider. It is not meant to be merged, but anything worthwhile could be extracted into a separate PR. The code here was rushed out to try different things and is not very DRY.

In no particular order:

  • Introduced a new formatter that renders the log message in a different color depending on the log level (the log level itself is not printed). This could be useful for the simple and/or compact formatters, but it needs a simple coloring scheme that doesn't make you stop to work out which color means which level.

  • Introduced an --include-fields option that prints only the given fields. This can be interesting for logs with many fields, like apm events (more on that below).

  • Shorter format for the timestamp. I think this is quite valuable, at least for the compact and simple formatters; we generally don't need as much precision as, e.g., apm-server produces.

  • Convert the timestamp to the local timezone. This changes the contents of the output, so I think it should be opt-in and never the default.

  • When logs are read from multiple files, prepend the filename to each record (like grep).

  • When logs are read from multiple files, interleave them so that they are printed in chronological order (see the sketch after this list). While probably a niche feature, it would be very useful for me :) E.g., debugging Fleet, I would want to read the fleet server, agent and apm server logs this way to find when and where something goes out of whack.
    I ran out of time and the implementation here is not complete, but I think this would be the most important addition.

  • While general JSON support is not a goal, I would consider extending this to support any kind of Elastic logs, instead of just ECS. Then we could use it for Kibana, Elasticsearch, etc. It would introduce some complexity to handle the differences between products, but it could be interesting.

  • Added support for local config files that take precedence over the global config file in the home dir. I am unsure about this, but it might be useful if, e.g., you want different default experiences for different projects.

  • Tried a different formatter and painter for apm data. Not terribly useful on its own, but it could be complemented with service-name or event-type filters, for instance. Done well, it could be great, e.g., for analyzing logs from an SDH, where we can't just use Kibana.

  • Tried to generate GitHub links out of the file name and line number in the log line. The approach used won't work for high-volume logs, and yet it introduces too much complexity for my taste. I call it a failed experiment, but I think I now have a somewhat better idea of how to go about it. I'm pushing the code for documentation purposes, not intending to do anything with it.
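
For the interleaving item above, a rough sketch of the shape I have in mind (to be clear, this is not the code in this branch): a k-way merge keyed on @timestamp, assuming each input file is already sorted and the timestamps are UTC so they compare lexicographically.

```go
// Sketch only: chronological interleaving of several ecs-logging files.
// Assumes each file is already sorted by @timestamp and the timestamps are
// UTC (RFC 3339), so string comparison gives chronological order.
package main

import (
	"bufio"
	"container/heap"
	"encoding/json"
	"fmt"
	"os"
)

type head struct {
	ts   string // the record's @timestamp
	line string
	scan *bufio.Scanner
	file string
}

type mergeHeap []head

func (h mergeHeap) Len() int            { return len(h) }
func (h mergeHeap) Less(i, j int) bool  { return h[i].ts < h[j].ts }
func (h mergeHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *mergeHeap) Push(x interface{}) { *h = append(*h, x.(head)) }
func (h *mergeHeap) Pop() interface{} {
	old := *h
	x := old[len(old)-1]
	*h = old[:len(old)-1]
	return x
}

// next reads lines until it finds one with an @timestamp, or hits EOF.
func next(scan *bufio.Scanner, file string) (head, bool) {
	for scan.Scan() {
		var rec struct {
			TS string `json:"@timestamp"`
		}
		line := scan.Text()
		if json.Unmarshal([]byte(line), &rec) == nil && rec.TS != "" {
			return head{ts: rec.TS, line: line, scan: scan, file: file}, true
		}
	}
	return head{}, false
}

func main() {
	h := &mergeHeap{}
	for _, name := range os.Args[1:] {
		f, err := os.Open(name)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			continue
		}
		defer f.Close()
		if hd, ok := next(bufio.NewScanner(f), name); ok {
			heap.Push(h, hd)
		}
	}
	for h.Len() > 0 {
		hd := heap.Pop(h).(head)
		fmt.Printf("%s: %s\n", hd.file, hd.line) // grep-style filename prefix
		if nxt, ok := next(hd.scan, hd.file); ok {
			heap.Push(h, nxt)
		}
	}
}
```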


apm events, apm formatter:

[Screenshot 2021-05-17 at 15 38 20]

apm-server logs, experimental formatter:

[Screenshot 2021-05-17 at 15 40 44]

GitHub links (sometimes):

[Screenshot 2021-05-17 at 15 42 28]

multiple inputs, sorted timestamps:

[Screenshot 2021-05-17 at 16 55 36]

@jalvz force-pushed the experimental-spacetime branch from 68e48d8 to c082fde on May 17, 2021 at 15:44
@trentm (Owner) commented May 18, 2021

I'm responding just to your comments. I haven't looked at the code yet. Thanks for doing this; this is great.

  • Introduced a new formatter that renders the log message in a different color depending on the log level (the log level itself is not printed). This could be useful for the simple and/or compact formatters, but it needs a simple coloring scheme that doesn't make you stop to work out which color means which level.

This could be interesting. It would make errors, for example, stand out better. However, I'd personally be wary of not having the level names, because then the user (a) cannot search for the level (i.e. search for "ERROR" to quickly jump through a long log output) and (b) has to know the level colours. For some use cases, however, this could be good.

  • Introduced an --include-fields option that prints only the given fields. This can be interesting for logs with many fields, like apm events (more on that below).

I haven't looked at the code or tried it yet. Did you run into problems with, e.g., rendering of the title line if the user hasn't included any of the title-line fields? The "--exclude-fields" work may have already dealt with some of that. For example, changes were needed in the rendering so it doesn't blow up if there is no "@timestamp" field.

I think this would be a good feature, so if/when you are able, I'd love to see a separate PR for this one.
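
In case it helps the discussion, here is a rough sketch (not from your branch, and the function names are made up) of how an --include-fields allow-list could interact with the title line when using valyala/fastjson, with nil-guards so rendering doesn't assume "@timestamp" or "message" survived the filter:

```go
package sketch

import "github.com/valyala/fastjson"

// filterFields drops every top-level field that is not in the allow-list.
func filterFields(rec *fastjson.Value, include map[string]bool) {
	obj, err := rec.Object()
	if err != nil {
		return // not a JSON object; leave it alone
	}
	var drop []string
	obj.Visit(func(k []byte, v *fastjson.Value) {
		if !include[string(k)] {
			drop = append(drop, string(k))
		}
	})
	for _, k := range drop { // delete after Visit to avoid mutating mid-iteration
		rec.Del(k)
	}
}

// renderTitle builds the title line without assuming any field is present.
func renderTitle(rec *fastjson.Value) string {
	title := ""
	if ts := rec.GetStringBytes("@timestamp"); ts != nil {
		title += string(ts) + " "
	}
	if msg := rec.GetStringBytes("message"); msg != nil {
		title += string(msg)
	}
	return title
}
```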

  • Shorter format for the timestamp. I think this is quite valuable, at least for the compact and simple formatters; we generally don't need as much precision as, e.g., apm-server produces.

Interesting, because I feel that I want the precision side of things more than the year and day ;)
But for logs that don't have a lot of sub-second interaction, then yes, definitely.

I think this and other points you've made suggest that it might be very useful to have easily defined user formatting via some sort of templating. Perhaps that would be overkill. Not sure.

  • Convert the timestamp to the local timezone. This changes the contents of the output, so I think it should be opt-in and never the default.

Bunyan has the following option for this:

-L, --time local

    Display the time field in local time, rather than the default UTC time.
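
Roughly, an opt-in "local time" option would amount to something like this per record (a sketch, not anything in ecslog today; it assumes the @timestamp parses as RFC 3339 and passes anything else through untouched):

```go
package sketch

import "time"

// toLocalTime re-renders an RFC 3339 timestamp in the local timezone.
func toLocalTime(ts string) string {
	t, err := time.Parse(time.RFC3339Nano, ts)
	if err != nil {
		return ts // leave non-RFC 3339 timestamps as-is
	}
	return t.Local().Format(time.RFC3339Nano)
}
```
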
  • When logs are read from multiple files, prepend the filename to each record (like grep).

I think this would be useful in cases where one is merging files from separate services (as in your example below). In cases, say, where I'm rendering three hourly logs from the same service, it becomes noise.

Bunyan does not do this; however, the bunyan log format requires that every record has a "name" -- typically the service name. That meant that when merging logs from separate services, the title line always disambiguated the source.

It might be nice for this filename to be a sort of "meta"-field that the various output formats could place as they see fit -- but not a requirement.

  • When logs are read from multiple files, interleave them so that they are printed in chronological order. While probably a niche feature, it would be very useful for me :) E.g., debugging Fleet, I would want to read the fleet server, agent and apm server logs this way to find when and where something goes out of whack.
    I ran out of time and the implementation here is not complete, but I think this would be the most important addition.

I agree this would be super useful. Bunyan has this (it was added by a co-worker at my previous job) and it is super-useful for the use case you describe. I'd be keen to get a finished PR from you on this sometime. ;)

  • While general JSON support is not a goal, I would consider extending this to support any kind of Elastic logs, instead of just ECS. Then we could use it for Kibana, Elasticsearch, etc. It would introduce some complexity to handle the differences between products, but it could be interesting.

Perhaps. I think the focus on a single format is useful, but I understand the desire. The README currently explicitly lists this as a non-goal :)

Non-goals:
- An ambitious CLI that handles multiple formats (e.g. also bunyan format, pino format, etc). Leave that for a separate science project.

Isn't ES close to ECS format already? Hrm, ES 7.12 logs:

{"type": "server", "timestamp": "2021-05-18T00:11:48,030Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "docker-cluster", "node.name": "451e2b5c325c", "message": "closing ...", "cluster.uuid": "1tRECNfaQuWHTl_AE_43Kw", "node.id": "VflIGPgHQ0Sx59XBzqdTkQ"  }

So, not ES 7.x. However, ES 8.x is there (or close) as of elastic/elasticsearch#47105:

{"@timestamp":"2021-05-18T02:50:09.146Z", "log.level":"DEPRECATION",  "data_stream.dataset":"deprecation.elasticsearch", "data_stream.namespace":"default", "data_stream.type":"logs", "ecs.version":"1.7", "elasticsearch.event.category":"indices", "event.code":"index_name_starts_with_dot", "message":"index name [.monitoring-es-7-2021.05.18] starts with a dot '.', in the next major version, index names starting with a dot are reserved for hidden indices and system indices" , "service.name":"ES_ECS","process.thread.name":"elasticsearch[70f5fffde98a][masterService#updateTask][T#1]","log.logger":"org.elasticsearch.deprecation.cluster.metadata.MetadataCreateIndexService","event.dataset":"deprecation.elasticsearch","elasticsearch.cluster.uuid":"V8F-2U8AQFCQItxOcXWF_A","elasticsearch.node.id":"Ut6yzwswR_GWoUlKwSc2vg","elasticsearch.node.name":"70f5fffde98a","elasticsearch.cluster.name":"docker-cluster"}
{"@timestamp":"2021-05-18T02:50:09.334Z", "log.level": "INFO", "message":"[.monitoring-es-7-2021.05.18] creating index, cause [auto(bulk api)], templates [.monitoring-es], shards [1]/[0]", "service.name":"ES_ECS","process.thread.name":"elasticsearch[70f5fffde98a][masterService#updateTask][T#1]","log.logger":"org.elasticsearch.cluster.metadata.MetadataCreateIndexService","event.dataset":"elasticsearch.server","elasticsearch.cluster.uuid":"V8F-2U8AQFCQItxOcXWF_A","elasticsearch.node.id":"Ut6yzwswR_GWoUlKwSc2vg","elasticsearch.node.name":"70f5fffde98a","elasticsearch.cluster.name":"docker-cluster"}

One issue is that ES 8.x logs often don't have the "ecs.version" field. I think we could make ecslog lenient on this.

I think this is worth a separate ecslog issue to discuss:

  1. being lenient on ecs.version for ES 8.x logs,
  2. discussing with others if ES 8.x should grow to have "ecs.version" all the time or if the ecs-logging spec wants to consider relaxing the spec on this, and
  3. discussing whether we want ecslog to explicitly support the ES 7.x format just because.

Would you like to open one or more?

For Kibana, this ticket elastic/kibana#90406 suggests Kibana (as of 7.13?) logs (when "json logs are turned on" -- I don't know the config var for that) might be ecs-logging compatible. Again, a separate ticket to look into this would be worthwhile.

  • Added support for local config files that take precedence over the global config file in the home dir. I am unsure about this, but it might be useful if, e.g., you want different default experiences for different projects.

Yah, I'm not sure about this. I think the use case would have to be fairly strong. Currently the settings supported in the config file are more about personal preference than something specific to a project. It is possible that changes. However, if/when we support this, ecslog startup forever after needs to stat many directories, all the way up to the user's home dir or the root dir, looking for config files.
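
To make that cost concrete, a project-local config implies a walk like the following on every startup (a sketch; ".ecslog.toml" is just a stand-in name, not the actual config file):

```go
package sketch

import (
	"os"
	"path/filepath"
)

// findLocalConfig stats every parent directory from startDir up to $HOME (or
// the filesystem root) looking for a project-local config file.
func findLocalConfig(startDir string) (string, bool) {
	home, _ := os.UserHomeDir()
	dir := startDir
	for {
		candidate := filepath.Join(dir, ".ecslog.toml")
		if _, err := os.Stat(candidate); err == nil {
			return candidate, true
		}
		parent := filepath.Dir(dir)
		if dir == home || parent == dir { // stop at $HOME or the root
			return "", false
		}
		dir = parent
	}
}
```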

  • Tried a different formatter and painter for apm data. Not terribly useful on its own, but it could be complemented with service-name or event-type filters, for instance. Done well, it could be great, e.g., for analyzing logs from an SDH, where we can't just use Kibana.

I'm not sure what you mean here. If this is clear from the code then cool. Otherwise, could you show some examples of what you mean?

  • Tried to generate GitHub links out of the file name and line number in the log line. The approach used won't work for high-volume logs, and yet it introduces too much complexity for my taste. I call it a failed experiment, but I think I now have a somewhat better idea of how to go about it. I'm pushing the code for documentation purposes, not intending to do anything with it.

Cool. Yah, I'm not sure what a great answer would be here.
I like your formatting of the source file and line number into the title line.

@jalvz (Author) commented May 18, 2021

Thanks for the comments!

However, I'd personally be wary of not having the level names, because then the user (a) cannot search for the level

That's a good point. I can make a proposal PR with the messages colored but the error level still printed, proposing an alternative coloring scheme so that the screen is not too rainbow-y. WDYT?
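
Something along these lines is what I have in mind (only a sketch of the proposal, not code from this branch): color the message only, keep printing the level name for warn/error so it stays searchable, and leave info in the default color so the screen isn't too rainbow-y.

```go
package sketch

import "strings"

// ANSI colors per level; info intentionally has no entry and stays uncolored.
var levelColor = map[string]string{
	"error": "\x1b[31m", // red
	"warn":  "\x1b[33m", // yellow
	"debug": "\x1b[90m", // dim gray
}

func paintMessage(level, msg string) string {
	lvl := strings.ToLower(level)
	prefix := ""
	if lvl == "error" || lvl == "warn" {
		prefix = strings.ToUpper(lvl) + ": " // keep a grep-able level name
	}
	c, ok := levelColor[lvl]
	if !ok {
		return prefix + msg
	}
	return c + prefix + msg + "\x1b[0m"
}
```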

I feel that I want the precision side of things more than the year and day ;)

apm-server produces 2021-05-10T13:41:55.695+0200, and Kibana the same with one more character. I propose 2021-05-10 13:41:55, which is more readable IMO and probably all that is needed for the simple formatter (and maybe compact too)?

the bunyan log format requires that every record has a "name" -- typically the service name. That meant that when merging logs from separate services, the title line always disambiguated the source

We can't make that assumption here, though. Two options: opt in via an input flag; or try to be smart and deduce from the file names whether we are taking logs from one or several services (maybe calculate the Hamming distance or Jaccard distance). Does either appeal?
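
To illustrate the second option (purely a sketch, nothing from this branch): tokenize the input file names and measure their overlap; low similarity across inputs would hint that the logs come from different services, so the filename prefix is worth showing.

```go
package sketch

import (
	"path/filepath"
	"strings"
	"unicode"
)

// nameTokens splits a file name into lowercase alphanumeric tokens.
func nameTokens(path string) map[string]bool {
	set := map[string]bool{}
	split := func(r rune) bool { return !unicode.IsLetter(r) && !unicode.IsDigit(r) }
	for _, t := range strings.FieldsFunc(filepath.Base(path), split) {
		set[strings.ToLower(t)] = true
	}
	return set
}

// jaccard returns the intersection-over-union of two token sets.
func jaccard(a, b map[string]bool) float64 {
	inter := 0
	for t := range a {
		if b[t] {
			inter++
		}
	}
	union := len(a) + len(b) - inter
	if union == 0 {
		return 1 // two empty names count as identical
	}
	return float64(inter) / float64(union)
}
```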

I'm not sure what you mean here. If this is clear from the code then cool.

So let me explain 2 use cases:

  • When logs are in debug mode, apm-server prints events as they are indexed into Elasticsearch. Sometimes we get log dumps from customers in debug mode, and analyzing them with traditional tools (grep, etc.) is a bit cumbersome (for me). Forget what I said about custom filters, though; we could probably use KQL for everything!

  • apm-server has a flag to output logs to the console instead of Elasticsearch. That is useful when we want to benchmark apm-server on its own, without network, dependencies, etc. An apm formatter with some stats flags could help complement benchmark results.

I am aware this is very apm-server specific, though.

Otherwise, I agree that per-project config files don't have a strong use case. For the rest I will create separate issues.

Thanks!

@trentm (Owner) commented May 19, 2021

I feel that I want the precision side of things more than the year and day ;)

apm-server produces 2021-05-10T13:41:55.695+0200, and Kibana the same with one more character. I propose 2021-05-10 13:41:55, which is more readable IMO and probably all that is needed for the simple formatter (and maybe compact too)?

I agree the "+0200" TZ takes up more space than one would want. By "precision side" I meant that sometimes the milliseconds, e.g. .695, are more useful when reading logs than, say, the year.

I don't object to some formats rendering timestamps differently. If you are suggesting this as the default, then that is a separate conversation. I think the default should probably be "show exactly the data as it is in the log record"... and then there are config options for different time formats. A middle ground might be a best-effort "trim down an RFC 3339-like timestamp in your format, if possible".

2021-05-10 13:41:55 which is more readable IMO

I don't object to this as an optional abbreviated format.
Perhaps a timeFormat config var or something. However, I don't have good suggestions for doing that while keeping formatting quick. E.g., if we supported the full Go time.Format (https://golang.org/pkg/time/#Time.Format), then ecslog would have to parse and re-format every log record's @timestamp, which might be slow?
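
Concretely, the per-record cost I'm worried about is something like this (a sketch of the idea, not anything in ecslog today): parse the @timestamp and re-render it with a user-supplied Go layout on every record.

```go
package sketch

import "time"

// reformatTimestamp parses an RFC 3339 timestamp and re-renders it with the
// given Go layout, falling back to the raw value if parsing fails.
func reformatTimestamp(ts, layout string) string {
	t, err := time.Parse(time.RFC3339Nano, ts)
	if err != nil {
		return ts
	}
	return t.Format(layout) // e.g. layout = "2006-01-02 15:04:05"
}
```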

the bunyan log format requires that every record has a "name" -- typically the service name. That meant that when merging logs from separate services, the title line always disambiguated the source

We can't make that assumption here, though. Two options: opt in via an input flag; or try to be smart and deduce from the file names whether we are taking logs from one or several services (maybe calculate the Hamming distance or Jaccard distance). Does either appeal?

Yup, agreed we cannot make that assumption.

Hrm, what about this: If processing multiple files, then ecslog adds the input file name as log.file.path (https://www.elastic.co/guide/en/ecs/current/ecs-log.html#field-log-file-path) to the log record (if it is not already set). There is a config option to turn this behaviour off. Then we could update one or more of the output formats to include log.file.path in the title line somewhere.
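
In fastjson terms (which ecslog already uses), the core of it could look roughly like this; a real change would likely go through the internal jsonutils helpers and handle nested {"log": {"file": {"path": ...}}} records properly, so treat this as a sketch:

```go
package sketch

import "github.com/valyala/fastjson"

// addLogFilePath sets log.file.path to the input file name, but never
// clobbers a value the record already carries.
func addLogFilePath(rec *fastjson.Value, arena *fastjson.Arena, path string) {
	if rec.Exists("log", "file", "path") || rec.Exists("log.file.path") {
		return
	}
	rec.Set("log.file.path", arena.NewString(path))
}
```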

Obviously this is more work, however. A first pass could be to just have a flag to turn it on or off. E.g. grep has:

     -H      Always print filename headers with output lines.

     -h, --no-filename
             Never print filename headers (i.e. filenames) with output lines.

So perhaps on by default and --no-filename to turn it off. I don't want to use up -H and -h short opts for this. :)

Tried a different formatter and painter for apm data ...

I'm not sure what you mean here. If this is clear from the code then cool.

So let me explain 2 use cases: ...

I would love to see examples of input and output here, but I think I understand what you mean. I think things like this would be very useful. I'm not yet sure how to work them into ecslog in a nice and general way. For generic things -- like formatting an HTTP request or response -- I think it is fine to build it in and have a formatting option of some sort to render it. For application-specific cases... it would be nice to have some config-file-based method for it, but that might be too much of a pain.

@trentm (Owner) commented May 19, 2021

I needed this patch to hack around ES 8.x and Kibana 8.x logs not quite being conformant ecs-logging:

diff --git a/internal/ecslog/ecslog.go b/internal/ecslog/ecslog.go
index 2bfac8e..4eb7c40 100644
--- a/internal/ecslog/ecslog.go
+++ b/internal/ecslog/ecslog.go
@@ -183,16 +183,21 @@ func (r *Renderer) isECSLoggingRecord(rec *fastjson.Value) bool {
                return false
        }

-       ecsVersion := jsonutils.LookupValue(rec, "ecs", "version")
-       if ecsVersion == nil || ecsVersion.Type() != fastjson.TypeString {
-               return false
-       }
+       jsonutils.LookupValue(rec, "ecs", "version")
+       // XXX hack for ES 8.x log records that don't have ecs.version.
+       // ecsVersion := jsonutils.LookupValue(rec, "ecs", "version")
+       // if ecsVersion == nil || ecsVersion.Type() != fastjson.TypeString {
+       //      return false
+       // }

        logLevel := jsonutils.LookupValue(rec, "log", "level")
-       if logLevel == nil || logLevel.Type() != fastjson.TypeString {
-               return false
+       // XXX hack for Kibana 8.x logs: no log.level. Ah "tags.info" ... this is hapi.
+       // if logLevel == nil || logLevel.Type() != fastjson.TypeString {
+       //      return false
+       // }
+       if logLevel != nil {
+               r.logLevel = string(logLevel.GetStringBytes())
        }
-       r.logLevel = string(logLevel.GetStringBytes())

        return true
 }

@jalvz (Author) commented May 26, 2021

I saw you added the ECS lenient option already, and I created issues and a PR for the other topics, so I'm closing this. Thanks!
