Make "nestings" in ecs_flat.yml support deep nesting/reuse better #796

Closed
webmat opened this issue Mar 23, 2020 · 6 comments · Fixed by #803

@webmat
Contributor

webmat commented Mar 23, 2020

Since we've added the ability to nest field sets deeper within the field hierarchy, the "nestings" array in ecs_flat.yml is now misleading, as it doesn't fully capture what is nested where (more specifically how deep).

This led to bug #784 creeping into 1.5.0. There, for example, the nesting of interface under observer was listed as observer.interface.* instead of the intended observer.ingress.interface.* and observer.egress.interface.*.

This bug has not caused problems with the generated Beats field definitions, the CSV, or the sample Elasticsearch templates, as none of these rely on the "nestings" array.
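
To make the failure mode concrete, here is a minimal sketch of how a nestings list that only records the top-level parent loses depth. The `reuses` structure and both function names are hypothetical and chosen for illustration; they are not the actual schema format or generator code.

```python
# Hypothetical reuse declarations: which field set is reused, and the full
# dotted path it is nested under. Illustrative only, not the real schema.
reuses = [
    {"fieldset": "interface", "at": "observer.ingress"},
    {"fieldset": "interface", "at": "observer.egress"},
]


def shallow_nestings(reuses):
    """Keep only the top-level parent, which discards how deep the nesting is."""
    return sorted({f"{r['at'].split('.')[0]}.{r['fieldset']}" for r in reuses})


def full_nestings(reuses):
    """Keep the complete path, so every nesting location stays distinct."""
    return sorted(f"{r['at']}.{r['fieldset']}" for r in reuses)


print(shallow_nestings(reuses))  # ['observer.interface']
print(full_nestings(reuses))     # ['observer.egress.interface', 'observer.ingress.interface']
```

Recording the full path keeps the two reuse locations apart, whereas the shallow form collapses them into a single, misleading entry.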

@marshallmain
Contributor

I can work on fixing this as the bug was introduced by changes that I made. This is also a good opportunity to start refactoring the document rendering to operate directly on the intermediate data structure.

Right now it's a fairly complex chain of data transformations: the data is loaded from YAML files, converted to the intermediate structure, then flattened in two different ways, and the flattened structures are used to render other output files in different places. If we consistently use the intermediate structure to render all the output files, the overall complexity should decrease (and hopefully help avoid introducing more bugs like this).
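
As a rough sketch of that idea, every renderer could walk one intermediate structure through a shared helper instead of relying on separately flattened copies. The shape of `intermediate` and the names `iter_leaves`, `render_csv`, and `render_template_properties` are assumptions made for this example, not the existing code.

```python
# Hypothetical intermediate structure: a node with a "fields" key is a
# container, anything else is a leaf field. The real structure may differ.
intermediate = {
    "observer": {
        "fields": {
            "ingress": {"fields": {"zone": {"type": "keyword"}}},
            "vendor": {"type": "keyword"},
        }
    }
}


def iter_leaves(nodes, prefix=""):
    """Yield (dotted_name, details) for every leaf field, at any depth."""
    for name, child in nodes.items():
        path = f"{prefix}.{name}" if prefix else name
        if "fields" in child:
            yield from iter_leaves(child["fields"], path)
        else:
            yield path, child


def render_csv(intermediate):
    """One output: 'name,type' lines sorted by field name."""
    leaves = sorted(iter_leaves(intermediate), key=lambda item: item[0])
    return [f"{name},{details['type']}" for name, details in leaves]


def render_template_properties(intermediate):
    """Another output built from the same walk: name -> mapping type."""
    return {name: {"type": d["type"]} for name, d in iter_leaves(intermediate)}


print(render_csv(intermediate))
```

Because both renderers share the same traversal, a change to how deep nesting is represented only needs to be handled in one place.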

@gen0cide

@webmat - can you provide any feedback about the intentions behind ecs_flat.yml and ecs_nested.yml:

  • what are their differences?
  • what are the recommended use cases for each?
  • how do they differ from the definitions in the schema/ directory?

None of these items are clear to me.

@webmat
Contributor Author

webmat commented Mar 25, 2020

Thanks for offering @marshallmain. Yes, if you want to tackle this, that would be very welcome :-)

@webmat
Contributor Author

webmat commented Mar 25, 2020

@gen0cide The purpose of both of these files is to offer a fully fleshed out rendering of ECS, e.g. with defaults made explicit. The goal is to simplify the development of various artifact generators. When starting from these files (vs starting from schemas/*.yml), you can assume that whatever cleanup, error checking & whatnot is already done.

I also personally use them to visualize the deeply nested structure of arrays and dicts that generator.py passes to the various generators here. I'm too dumb to remember by heart ;-)

The difference between "flat" and "nested" is that "flat" doesn't contain details about field sets, it only contains leaf fields. So it's easier to breeze through for a simple generator like CSV, but it's not as complete as "nested".
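
To illustrate the difference, here is a small sketch with made-up data. It assumes that "nested" groups fields under their field set alongside field set details, and that "flat" is keyed by full field name only; the sample values and the `flatten` helper are hypothetical, not the real file contents.

```python
# Assumed shape of "nested": field set details plus the fields they contain.
nested = {
    "host": {
        "title": "Host",
        "fields": {
            "host.name": {"type": "keyword"},
        },
    },
    "observer": {
        "title": "Observer",
        "fields": {
            "observer.ingress.interface.name": {"type": "keyword"},
            "observer.vendor": {"type": "keyword"},
        },
    },
}


def flatten(nested):
    """Keep only leaf fields; field set details such as 'title' are dropped."""
    flat = {}
    for fieldset in nested.values():
        flat.update(fieldset["fields"])
    return flat


flat = flatten(nested)
print(sorted(flat))
# ['host.name', 'observer.ingress.interface.name', 'observer.vendor']
```

A CSV-style generator only needs `flat`, while anything that documents field sets themselves needs the extra details kept in `nested`.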

And yes, due to resource constraints, things are sometimes getting nasty and haven't been fixed yet, where for example the asciidoc generator requires both 🤦‍♂ 😂

@webmat
Contributor Author

webmat commented Mar 25, 2020

@marshallmain Assigning you for now. If you can't get to it, please make sure to let me know.

@gen0cide

gen0cide commented Mar 25, 2020

Thanks @webmat for the answers. I'm working on a generator here. The goal is to be able to generate various outputs based on the same "model" of ECS. As of now, those "outputs" for my use cases are:

  • Go code (similar to gocodegen, but more idiomatic Go - json struct tags, etc.)
  • A different Go code implementation that is 100% type safe, pointer safe, and free from reflection.
  • A JSON Schema output.

I'm using ecs_flat.yml for now because it's easier to parse the tree via dot notation fields than the nested ones. Also, it was easier to account for all possible fields within the YAML because of its flat structure.
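
As a minimal sketch of what parsing the tree from dot notation can look like (shown in Python for brevity; the `build_tree` helper and the sample field names are illustrative assumptions, not code from ecsgen):

```python
def build_tree(flat_names):
    """Turn dotted field names into a nested dict, one level per path segment."""
    tree = {}
    for name in flat_names:
        node = tree
        for part in name.split("."):
            # Reuse the existing branch if present, otherwise create it.
            node = node.setdefault(part, {})
    return tree


names = [
    "observer.ingress.interface.name",
    "observer.egress.interface.name",
    "observer.vendor",
]
tree = build_tree(names)
print(sorted(tree["observer"]))  # ['egress', 'ingress', 'vendor']
```

Leaves end up as empty dicts here; a real generator would presumably attach the field's type and other details at that point.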

If you need help with use cases or feedback, please reach out.

*EDIT: mistyped the URL to ecsgen.


3 participants