Report stats related to new sizing guidance #86639
Pinging @elastic/es-data-management (Team:Data Management)
Adds measures of the total size of all mappings and the total number of fields in the cluster (both before and after deduplication). Relates elastic#86639 Relates elastic#77466
#87556 addresses the cluster-wide stats. Adding the node-level stats is a little bit tricky because today everything is done at the shard level, and yet mappers are shared across all shards in an index on each node.
I briefly discussed this feature with @original-brownbear, and he gave me a few tips on getting started.
@original-brownbear, I see that somebody can also call …
Although there's no deduplication of mappers today, I expect this will be subject to improvement in a future version (see #86440). It would be good if the solution on which we land doesn't assume "no deduplication" too fundamentally.
Definitely yes to node-level and no to shard-level, but I think it would make sense to report it at the index level too. That way users can see which indices are contributing most to this memory usage, which will help them address it.
Thanks @DaveCTurner! I have an implementation underway that exposes the field mapper stats in the _nodes/stats API at both the node and index level, but not at the shard level. A PR should be up in the next couple of days. It will look roughly like this:
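A rough sketch of what such a fragment of the _nodes/stats response could look like; the section and field names here (mappings, total_count, total_estimated_overhead_in_bytes) and the numbers are illustrative assumptions rather than the final output:

```
GET _nodes/stats/indices

{
  "nodes": {
    "<node_id>": {
      "indices": {
        "mappings": {
          "total_count": 35,
          "total_estimated_overhead_in_bytes": 35840
        }
      }
    }
  }
}
```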
And we should be able to extend it with more stats about deduplicated fields. The only minor complication would be adding the deduplicated stats only at the node level and not at the index level (since, if my reasoning is correct, deduplication would happen across indices). But that should be an implementation complication to handle rather than an API one.
Sounds good.
I'm not sure we will need any extra stats when deduplication lands, as long as the total estimated overhead is updated to match the new implementation. Indeed, that kind of future-proofing is why I think it better to report the actual bytes of estimated overhead, even if today it's just the field count multiplied by a fixed 1 kB estimate.
Are the labels here correct? I just realized the team label is Data Management, but I see @DaveCTurner handled a part of this in a PR with different labels, and I had also thought so far that this was for the distributed team.
It could be owned by several teams - the distrib team broadly own the recent sizing guidance work (#77466) even if this particular sizing concern could arguably be handled under a Data Management label.
OK, thanks. Just to note: I am working on the node-level field mappings stats and have a PR in the works.
@DaveCTurner just to be clear on this point: I discussed with @original-brownbear and we think for now we can add the new field mapper stats to the nodes stats API only, and not to the indices stats API.
I see some value in having these stats in the indices stats API too, because I'd expect that if you told users they had too many fields in the node stats, they'd mostly use the indices stats API to investigate further. But I see that this is quite a different thing from the changes to the nodes stats, and not quite so important, so I'm OK with leaving this out for now.
Indeed, I thought the same way, but @original-brownbear and I discussed that it's a bit of a chicken-and-egg problem: one needs to actually create the indices to get the estimated overhead of the field mappers on the data nodes, and then potentially retry a different allocation to get a new estimate, and so on. Ideally one would need an offline tool that, given the total dataset (or indices), estimates how many data nodes would be ideal. So for the moment we will expose them in the node stats. If in the future we also want to expose them in index stats, that will be possible with some further implementation work.
For more info about the deduplication effort of mappings on a data node, see here. I see the mention of field mappers in this issue, and I wanted to raise that while indexed fields have both an instance of a field mapper and a mapped field type, runtime fields don't have a corresponding field mapper but only a mapped field type. Should the new stats also target runtime fields, and in that case be less specific about "field mappers" in the API response? Should we rather have a higher-level estimation of the overhead of a MappingLookup for a certain index, which is shared across the shards of the same index that are allocated on the same data node?
Hi, thanks for the conversation, I am also picking up more of the nuances and terminology :) For the moment, the implementation I am trying counts the field mappers. Here is an example of what my in-progress implementation does: given an index whose mapping contains regular leaf fields, a nested field, and a runtime field, it flattens the field mappers into a list and counts them (see the reduced sketch below).
For my example mapping this comes to a total of 21 fields, and thus an estimated overhead of 21 * 1 KiB = 21 KiB. As you can see, it does not include the runtime field, and I have a couple of questions about whether it should.
Thanks!
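As a reduced, hypothetical illustration (deliberately much smaller than the 21-field example above, with made-up field names), a mapping along these lines combines the relevant cases: plain leaf fields, a nested object, and a runtime field.

```
PUT sketch-index
{
  "mappings": {
    "properties": {
      "title": { "type": "text" },
      "user": {
        "type": "nested",
        "properties": {
          "name": { "type": "keyword" },
          "age": { "type": "integer" }
        }
      }
    },
    "runtime": {
      "day_of_week": {
        "type": "keyword",
        "script": { "source": "emit('monday')" }
      }
    }
  }
}
```

Counting only the field mappers flattens this to title, user.name and user.age, i.e. 3 fields and roughly 3 KiB at 1 KiB per field, while the user nested mapper and the day_of_week runtime field are left out, which is exactly the question discussed in the following comments.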
My take on runtime fields is that they should be taken into account, and the same applies to object fields, which are not among the field mappers. I don't see a reason why they should be treated differently. I think a more accurate way to count would be: number of mapped field types + number of object mappers. I'm afraid the former is not currently exposed, though.
Thanks @javanna! When you refer to object fields, do you mean the nested field like the one I have in the example in my comment above? If that's the case, then my current implementation takes them into account (you will see in the output that they are included in the flattened list). For the runtime fields, would you be able to give me a hint about where to find them in the code so that I can count them?
OK, I researched the code a bit more. I understand that if I count a bit differently, going through all the mapped fields rather than just the field mappers, this will include field mappers + runtime fields + the flattened list of object fields, so I believe everything we need, @javanna. Now, if we take runtime fields into account as well, then the name I had chosen for the stats is no longer accurate, and I am thinking about a more general name for the new node stats.
What do you think, @javanna, @DaveCTurner, @original-brownbear, about including runtime fields, and about the naming?
I'm not sure about that name. The important numbers from a sizing perspective are the total field count and the total estimated overhead in bytes.
Hi @DaveCTurner. I see there's already an existing stats section with a similar name. Would a different name for the new stats work better?
Objects are not counted in the field type lookup: objects have their own object mapper that takes memory too, and they contribute to the total fields limit, hence I was thinking you may want to count them too. When you have some structure in your docs and hundreds of leaf fields, it's very common to end up with tens if not hundreds of objects. You can count them by inspecting the object mappers in the mapping lookup.
That's OK I think, these are also mapping stats at the node level.
@javanna are you sure? I just debugged the example I mentioned above, and from what I see the fields under the objects do show up in the lookup.
@DaveCTurner, oh, I did not see mapping stats at the node level. Where are they in the code and in the API? Still, I would suggest we use a different name; it sounds like future trouble to just reuse the existing one.
Hi again! For the moment, I am still thinking about the naming of the stats, and also about what exactly to use for the counting.
The …
No no, I mean we're adding node-level stats about mappings here, so we should just call them mappings.
Hi @DaveCTurner, OK, I will name them mappings then. @javanna, ah, I see what you mean: I should also account for the top-level field of each object. OK, I made a further example with a mapping that contains plain (non-nested) objects:
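A hypothetical mapping along those lines (field names made up for illustration), with one object nested inside another:

```
PUT sketch-objects
{
  "mappings": {
    "properties": {
      "message": { "type": "text" },
      "user": {
        "properties": {
          "name": { "type": "keyword" },
          "address": {
            "properties": {
              "city": { "type": "keyword" },
              "zip": { "type": "keyword" }
            }
          }
        }
      }
    }
  }
}
```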
And indeed, when I debug this, I can see the object mappers alongside the leaf field mappers. I understand now that the final number I'm looking for (for the field count) is the sum of the number of field mappers and the number of object mappers.
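If the counting works as described, the hypothetical mapping sketched above would give 4 leaf field mappers (message, user.name, user.address.city, user.address.zip) plus 2 object mappers (user and user.address), for a total of 6 fields, i.e. roughly 6 KiB of estimated overhead at the 1 KiB-per-field heuristic.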
So that they are visible in NodeIndicesStats only at the node and index (but not shard) levels. Also visible in the _cat/nodes table. Relates to issue elastic#86639
Hi @DaveCTurner, @javanna, I opened a PR: #89807. Should I invite you as reviewers as well (apart from @original-brownbear)? Or feel free to add yourselves as reviewers and provide feedback. Thanks!
PR #89807 got merged, but there is a remaining task in this ticket: "Update sizing guidance docs to refer to new stats". In the PR we briefly updated the documentation (see it here) to say to consult the new mapping stats. @original-brownbear, @DaveCTurner, do you have some recommendations on how to rephrase the section, or on what you would expect it to mention? I can try working out a first PR and then take suggestions.
I'd like the guidance to be in terms of the stats we expose, and ideally include some instructions about how to obtain those stats (i.e. which APIs to call and which parts of the response to look at).
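For instance, something along these lines; this is a sketch that assumes the new node-level stats end up under a mappings section of the node stats indices output and that cluster stats exposes a corresponding mappings section, so the exact paths depend on the final PRs:

```
GET _nodes/stats/indices?filter_path=nodes.*.indices.mappings
GET _cluster/stats?filter_path=indices.mappings
```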
Thanks @DaveCTurner! No worries, I can give it a try to give you a head start, and then you can modify the PR directly if you'd like :)
Now that we have the estimated field mappings heap overhead in nodes stats, we can refer to them in the guide for sizing data nodes appropriately. Relates to #86639
Closed with the latest PR, #90274.
In #86223 we adjusted our guidance about node capacity to be more in line with actual resource usage and constraints given recent changes (#77466). The metrics mentioned in the guidance are a little difficult to understand since they depend on opaque mechanisms like mapping deduplication. We should report the relevant stats directly to avoid any misunderstandings.
In particular, in nodes stats we should report the total estimated overhead of field mappers on each node, and in cluster stats we should report the total number of fields in the cluster (after deduplication).
Edited (2022-06-09) to change "total number of field mappers" to "total estimated overhead of field mappers", because I expect we'll want to refine our "1kB-per-field" guidance in the future, which we can do if we build this guidance into ES too. The total number of field mappers is also likely still useful.
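For the cluster-wide part, a rough sketch of what the cluster stats addition could look like; the names below (total_field_count, total_deduplicated_field_count) are illustrative, not final:

```
GET _cluster/stats?filter_path=indices.mappings

{
  "indices": {
    "mappings": {
      "total_field_count": 12000,
      "total_deduplicated_field_count": 3000
    }
  }
}
```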