[RFC] data_stream fields #980
updated with your comments addressed, thanks @ruflin ! |
@roncohen @ruflin we have an RFC to categorise datasources. Do you think we could add an appropriate field under data_stream that categorises the datasource? |
@jamiehynds Good question. Definitely possible, but I would like to decouple it from this RFC so the RFC can get in first without the category field and not be blocked on it. One issue I see is that these fields describe the data stream itself, and if I understand https://github.com/elastic/ecs/pull/958/files#r492541198 correctly, the category describes where the data is coming from. Could different sources match different categories and end up in the same data stream? I know @jpountz was also thinking about categories in the past in relation to EQL. I guess it would be worth filing a separate issue or discussing it directly in #958? |
@ruflin Totally agree that the categorisation topic shouldn't block the data_stream field. Will move the discussion to #958. The issue above is certainly something we'll need to consider. Another point is when a data source produces several data streams: do we then have nested categories for each data stream? I'll move the discussion to my issue and go from there. I believe @seth-goodwin is also working on categories for detection rules, which may be related. |
Thanks @roncohen for putting together this proposal and submitting!
The current proposal focuses on how the data_stream.* fields have been adopted as part of the new indexing strategy for Elastic Agent. I think this makes great sense given the tight relationship that the data_stream.* fields have in that larger strategy.
Should we also note any guidance on how the data_stream.* fields should or could be used by other data sources?
For example, a couple of questions I had after reviewing the proposal:
- Should the data_stream.* fields only be used alongside data streams?
- If not, what cases exist where the data_stream.* fields could be present, but the index isn't backing a data stream?
Should any of the value limitations/restrictions discussed in elastic/kibana#75846 be included in this proposal as well? |
@ebeahan Good catch, I think yes. |
Thanks again for the comments! I added a note on the restrictions on the values and on the option to use the fields with other sources. Please let me know if I missed anything! |
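For readers following the thread, here is a rough sketch of what restrictions on the values could look like in practice. It assumes the data stream name is built as type-dataset-namespace. The specific rules (length cap, lowercase only, forbidden characters) and the helper names are illustrative assumptions, not quoted from the RFC or from elastic/kibana#75846.

```python
# Hypothetical helpers: the rules below are assumptions for illustration only,
# not the restrictions actually specified in the RFC.
MAX_LEN = 100                        # assumed length cap per part
FORBIDDEN = set('-\\/*?"<>| ,#:')    # assumed forbidden characters


def validate_part(name: str, value: str) -> None:
    """Raise ValueError if a single data_stream.* value breaks the assumed rules."""
    if not value:
        raise ValueError(f"{name} must not be empty")
    if len(value) > MAX_LEN:
        raise ValueError(f"{name} is longer than {MAX_LEN} characters")
    if value != value.lower():
        raise ValueError(f"{name} must be lowercase")
    bad = sorted(set(value) & FORBIDDEN)
    if bad:
        raise ValueError(f"{name} contains forbidden characters: {bad}")


def data_stream_name(type_: str, dataset: str, namespace: str) -> str:
    """Validate the three parts and join them as type-dataset-namespace."""
    for name, value in (
        ("data_stream.type", type_),
        ("data_stream.dataset", dataset),
        ("data_stream.namespace", namespace),
    ):
        validate_part(name, value)
    return f"{type_}-{dataset}-{namespace}"


print(data_stream_name("logs", "nginx.access", "default"))
# -> logs-nginx.access-default
```

The hyphen is treated as forbidden inside each part because it is the separator in this sketch: a dataset such as nginx-access would make the combined name ambiguous to split back into its three parts.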
Additionally, I have a question unrelated to this RFC. Has the team thought of a name for this indexing strategy that will better stand the test of time? This indexing strategy is new now, but it will no longer be in 2 years :-) I call this the “New Beetle” naming issue. 😉 |
I took another round on this. Please have at it :)
I don't think we have. I assume it'll just be named "the indexing strategy" eventually ;) |
Or what about "The recommended indexing strategy"? ;-) |
Co-authored-by: Mathieu Martin <[email protected]>
OK, ready for another round! When Nicolas' doc is ready, we can link to it from here. |
Thanks for the adjustments, Ron.
As discussed in #980 (comment), I think a call out to this inconsistency would be appropriate.
Additionally, I've two more small things noted below as review comments. Then I think we can merge as stage 1 and move on to stage 2.
We can link to the indexing strategy docs when it becomes available. I'll make sure this doesn't get lost between stage 1 & 2 PRs.
We may have to adjust the naming restrictions a bit, after elastic/elasticsearch#63987 |
@webmat Why? . indices are currently not supported by the indexing strategy. |
@ruflin I guess you're right. I keep conflating using data streams in general with the use of this field set, which happens to use data streams, but is actually only for one specific use case / indexing strategy. This gets back to my main concern, that I raised months ago. This field set is called As I said back then, I get the feeling that naming these two somewhat unrelated things the same way will be confusing to people. Me first, apparently 😄 Come to think of it, this actual concern on the naming is not captured in the RFC. Could we add a section in "Concerns" to mention this? We don't need to change anything else, but at least document that this potential confusion was noted. |
@webmat I look at this a bit different. |
I see what you mean. The way I think of it, data stream fields are a convention we've decided upon for use in the new indexing strategy, which relies on data streams (it cannot be used with indices). As always with conventions, you can decide to use it in places it wasn't intended for, or you can choose not to use it in cases where it does fit. Separately, it's my hope that we'll use this convention even with hidden time series data streams. |
Alright, I'm good with the RFC as it stands now. Thanks for all the discussion and adjustments 👍
I'd like an Approve review from Observability as well, to confirm nothing's missing from your point of view.
We can merge afterwards
thanks folks! |
Co-authored-by: Mathieu Martin <[email protected]> Co-authored-by: Eric Beahan <[email protected]>
Very early draft on an RFC for the data_stream fields.