Report stats related to new sizing guidance #86639

DaveCTurner · 2022-05-10T19:36:16Z

In #86223 we adjusted our guidance about node capacity to be more in line with actual resource usage and constraints given recent changes (#77466). The metrics mentioned in the guidance are a little difficult to understand since they depend on opaque mechanisms like mapping deduplication. We should report the relevant stats directly to avoid any misunderstandings.

In particular, in nodes stats we should report the total ~~number~~ estimated overhead of field mappers on each node, and in cluster stats we should report the total number of fields in the cluster (after deduplication).

Edited (2022-06-09) to change "total number of field mappers" to "total estimated overhead of field mappers", because I expect we'll want to refine our "1kB-per-field" guidance in the future, which we can do if we build this guidance into ES too. Total number of field mappers is also likely still useful.

Estimated overhead of mappers in node stats: Introduce node mappings stats #89807
Field counts & mapping sizes in cluster stats: Report overall mapping size in cluster stats #87556
Update sizing guidance docs to refer to new stats: Redefine section on sizing data nodes #90274

elasticmachine · 2022-05-10T19:36:19Z

Pinging @elastic/es-data-management (Team:Data Management)

Adds measures of the total size of all mappings and the total number of fields in the cluster (both before and after deduplication). Relates elastic#86639 Relates elastic#77466

DaveCTurner · 2022-06-09T12:26:23Z

#87556 addresses the cluster-wide stats. Adding the node-level stats is a little bit tricky because today everything is done at the shard level, and yet mappers are shared across all shards in an index on each node.

kingherc · 2022-08-22T13:43:44Z

I briefly discussed this feature with @original-brownbear , who gave me the following tips on getting started:

We can introduce a new mappers field under NodeIndicesStats with two values, one for the total field count (note that there's no deduplication for data nodes) and one for the estimated overhead (just multiply the field count by 1KiB).
In the org.elasticsearch.indices.IndicesService#statsByShard function, for each index, use the IndexService to get the MapperService, and from that get the count of the fieldMappers. There's a small doubt whether this count contains nested ones, but I will verify.
Maybe add a test for NodeStats similar to ClusterStatsIT.

kingherc · 2022-08-25T12:25:31Z

@original-brownbear , I see that one can also call _nodes/stats?level=shards for shard-level info. Reading @DaveCTurner 's comment above, I see that mappers are shared across all shards in an index on each node. Thus, I understand that the total number of fields we want to expose should only be part of the node level (under nodes > indices in the final json) and not part of the indices (I mean under nodes > indices > indices in the final json) nor shards level. Feel free to correct me if I am wrong.

DaveCTurner · 2022-09-05T09:16:41Z

> note that there's no deduplication for data nodes

Although there's no deduplication of mappers today I expect this will be subject to improvement in a future version (see #86440). It would be good if the solution on which we land doesn't assume "no deduplication" too fundamentally.

> the total number of fields we want to expose should only be part of the node level ... and not part of the indices ... nor shards level

Definitely yes to node-level and no to shard-level, but I think it would make sense to report it at the index level too. That way users can see which indices are contributing most to this memory usage which will help them address it.

kingherc · 2022-09-05T09:31:19Z

Thanks @DaveCTurner ! I have an implementation underway that exposes the field mapper stats in the _nodes/stats API at both the node and index levels, but not at the shard level. A PR should follow in the next couple of days. It will look like this:

"field_mappers": {
  "total_count": 58,
  "total_estimated_overhead": "58kb",
  "total_estimated_overhead_in_bytes": 59392
}
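As a rough illustration of how the three values above relate, here is a minimal Python sketch. It assumes the 1 KiB-per-field rule of thumb discussed in this thread; the function name `estimate_overhead` and the simplistic `kb` formatting are hypothetical and are not taken from the Elasticsearch code.

```python
# Assumption: 1 KiB of estimated heap overhead per field, as discussed in this thread.
BYTES_PER_FIELD = 1024


def estimate_overhead(total_count: int, bytes_per_field: int = BYTES_PER_FIELD) -> dict:
    """Build a stats object shaped like the proposed "field_mappers" section."""
    overhead_in_bytes = total_count * bytes_per_field
    return {
        "total_count": total_count,
        # Simplified human-readable form: whole kibibytes only (hypothetical formatting).
        "total_estimated_overhead": f"{overhead_in_bytes // 1024}kb",
        "total_estimated_overhead_in_bytes": overhead_in_bytes,
    }


stats = estimate_overhead(58)
print(stats)
```

With 58 fields this gives 58 * 1024 = 59392 bytes, matching the example response.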

And we should be able to extend it with more stats about deduplicated fields. The only minor complication would be adding the deduplicated stats at the node level only and not at the index level (since I guess the deduplication would happen across indices if my reasoning is correct). But that should be an implementation complication to handle, rather than an API one.

DaveCTurner · 2022-09-05T10:47:22Z

Sounds good.

> I guess the deduplication would happen across indices if my reasoning is correct

~~Maybe, although intra-index deduplication would be enough, and perhaps easier to implement.~~ Edit: I was chatting with Luca and it turns out that the search folks tried some ideas for intra-index deduplication and didn't get the gains they expected, so it seems you're right 😁

I'm not sure we will need any extra stats when deduplication lands, as long as the total estimated overhead is updated to match the new implementation. Indeed that kind of future-proofing is why I think it better to report the actual bytes of estimated overhead even if it's just total_count * 1kiB today.

kingherc · 2022-09-05T15:26:31Z

Are labels here correct? I just realized the team is data management. But I see @DaveCTurner handled a part in this PR with different labels, and I also thought so far it was for the distributed team.

DaveCTurner · 2022-09-05T16:25:50Z

It could be owned by several teams - the distrib team broadly own the recent sizing guidance work (#77466) even if this particular sizing concern could arguably be handled under :Search/Mapping, and the data management team own stats in general.

kingherc · 2022-09-05T16:52:19Z

OK, thanks. Just to note: I am working on the node-level stats for field mappings stats. I have a PR in the works.

> Definitely yes to node-level and no to shard-level, but I think it would make sense to report it at the index level too.

@DaveCTurner just to be clear on this point. I discussed with @original-brownbear and we think for now we can add the new field mapper stats to the _nodes/stats API only. It will be available at both the ?level=node and ?level=indices but not at ?level=shards. For the moment we think there is no usefulness to add these stats in the cluster-wide index/_stats API.

DaveCTurner · 2022-09-06T08:17:44Z

> For the moment we think there is no usefulness to add these stats in the cluster-wide index/_stats API.

I see some value in having these stats in the indices stats API too, because I'd expect if you told users that they had too many fields in the node stats then they'd mostly use the indices stats API to investigate further. But I see that this is quite a different thing from the changes to the nodes stats and not quite so important so I'm ok with leaving this out for now.

kingherc · 2022-09-06T08:32:27Z

Indeed I also thought the same way, but we discussed with @original-brownbear that it's a bit of a chicken-and-egg problem: one needs to actually create the indices to get the estimated overhead of the field mappers on the data nodes, and then potentially re-try a different allocation or something to again get a new estimation and so on. I think ideally one would need an offline tool to give an estimation of the total dataset (or indices) and then how many data nodes would be ideal to have.

So indeed for the moment we will expose them in the node stats. But if in the future we also want to expose them in index stats, that will be also possible with some further implementation.

javanna · 2022-09-06T08:58:30Z

For more info about the deduplication effort of mappings in a data node, see here. I see the mention of field mappers in this issue, and I wanted to raise that while indexed fields have both an instance of a field mapper as well as a mapped field type, runtime fields don't have a corresponding field mapper but only a mapped field type. Should the new stats target also runtime fields, and in that case be less specific about "field mappers" in the API response? Should we rather have a higher-level estimation of the overhead of a MappingLookup for a certain index, which is shared across shards of the same index that are allocated on the same data node?

kingherc · 2022-09-06T09:46:15Z

Hi, thanks for the conversation as I am also getting more nuances and terminology :)

For the moment, the implementation I am trying counts the field mappers using basically the line indexService.mapperService().mappingLookup().fieldMappers().size(). This does not account for runtime fields at the moment. I am not sure we would want to consider runtime fields, since we would like to estimate the overhead of mapped fields in data nodes.

Here is an example of what my in-progress implementation does. If an index has the following mapping:

  "mappings": {
    "runtime": {
      "day_of_week": {
        "type": "keyword",
        "script": {
          "source": "emit(doc['@timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))",
          "lang": "painless"
        }
      }
    },
    "properties": {
      "@timestamp": {
        "type": "date"
      },
      "authors": {
        "properties": {
          "company": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "first_name": {
            "type": "keyword"
          },
          "full_name": {
            "type": "text"
          },
          "last_name": {
            "type": "keyword"
          }
        }
      },
      "title": {
        "type": "text"
      },
      "url": {
        "type": "keyword"
      }
    }
  }

It will then calculate the following fields:

1. authors.last_name - {"last_name":{"type":"keyword"}}
2. _data_stream_timestamp - {}
3. _routing - {}
4. _feature - {}
5. authors.full_name - {"full_name":{"type":"text"}}
6. _source - {}
7. _id - {}
8. @timestamp - {"@timestamp":{"type":"date"}}
9. _version - {}
10. url - {"url":{"type":"keyword"}}
11. title - {"title":{"type":"text"}}
12. authors.company - {"company":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}}}
13. _index - {}
14. authors.first_name - {"first_name":{"type":"keyword"}}
15. _seq_no - {}
16. _nested_path - {}
17. authors.company.keyword - {"keyword":{"type":"keyword","ignore_above":256}}
18. _tier - {}
19. _ignored - {}
20. _field_names - {}
21. _doc_count - {}

Meaning a total of 21 fields and thus an estimated overhead of 21*1KiB = 21KiB. As you see it does not include the runtime field. I have two questions:

Would we need to include the runtime field(s) in the calculation?
Note also that it does include a lot of other "artificial" fields that the user did not define (e.g., _tier), and I am a bit unaware of those. Should they not be counted?

Thanks!
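To make the second question concrete, the 21 counted names can be split into metadata fields and user-defined fields. The sketch below is only an illustration in Python: treating a leading underscore as the marker of a metadata field is an assumption made here, not how Elasticsearch classifies fields.

```python
# The 21 field names listed above, in the same order.
counted_fields = [
    "authors.last_name", "_data_stream_timestamp", "_routing", "_feature",
    "authors.full_name", "_source", "_id", "@timestamp", "_version", "url",
    "title", "authors.company", "_index", "authors.first_name", "_seq_no",
    "_nested_path", "authors.company.keyword", "_tier", "_ignored",
    "_field_names", "_doc_count",
]


def split_fields(names):
    """Split names into (metadata, user_defined).

    Assumption: a leading underscore marks a metadata field (a heuristic only).
    """
    metadata = [name for name in names if name.startswith("_")]
    user_defined = [name for name in names if not name.startswith("_")]
    return metadata, user_defined


metadata, user_defined = split_fields(counted_fields)
print(len(metadata), "metadata fields,", len(user_defined), "user-defined fields")
```

Under that heuristic, 13 of the 21 counted fields are metadata fields and 8 come from the mapping itself.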

javanna · 2022-09-06T10:33:46Z

My take on runtime fields is that they should be taken into account, and the same applies to object fields which are not among field mappers. I don't see a reason that they should be treated differently. I think a more accurate way to count would be: number of mapped field types + number of object mappers. I'm afraid the former is not currently exposed though.

kingherc · 2022-09-06T11:21:58Z

Thanks @javanna ! When you refer to object fields, do you mean the nested field like the one I have in the example in my comment above? If that's the case, then my current implementation takes them into account (you will see in the output that they are included in the flattened list).

For the runtime fields, would you be able to give me a hint of where to find them in the code so that I can count them?

kingherc · 2022-09-06T12:29:56Z

OK, I looked into the code a bit more. I understand that I could use the following instead:

indexService.mapperService().mappingLookup().fieldTypeLookup.fullNameToFieldType.size()

This will include both field mappers + runtime fields + flattened list of object fields, so I believe everything we need @javanna .

Now, if we take into account the runtime fields as well, then the name I have chosen for the stats is not accurate. Maybe I can name the new node stats like this:

"mapping_lookup": {
  "total_field_count": 58,
  "total_estimated_overhead": "58kb",
  "total_estimated_overhead_in_bytes": 59392
}

What do you think @javanna , @DaveCTurner , @original-brownbear on including runtime fields and the naming?

DaveCTurner · 2022-09-06T12:43:01Z

I'm not sure mapping_lookup means anything to end-users. How about just mappings?

The important numbers from a sizing perspective are total_estimated_overhead (and total_estimated_overhead_in_bytes) - I expect we might refine the computation of these values in future versions but the fields themselves seem pretty future-proof. total_field_count also makes sense to me and explains the overhead computation (for now) and we will likely add more stats in future.

kingherc · 2022-09-06T12:51:45Z

Hi @DaveCTurner . I see there's already an org.elasticsearch.action.admin.cluster.stats.MappingStats which is exposed as mappings in Cluster stats. Those mappings seem to be more exact than the ones we try to estimate here in the node stats. For this reason, I would suggest having a slightly different name to differentiate between the two mapping stats (even though they appear in different APIs).

Would any of these work?

field_mappings
mapped_fields
mapped_types
mapped_field_types

javanna · 2022-09-06T12:54:46Z

Objects are not counted in field type lookup: objects have their own object mapper that takes memory too, and they contribute to the total fields limit, hence I was thinking you may want to count them too. When you have some structure in your docs and hundreds of leaf fields, it's very common to end up with tens if not hundreds of objects. You can count them by inspecting MappingLookup#objectMappers.

DaveCTurner · 2022-09-06T13:00:06Z

> I see there's already a org.elasticsearch.action.admin.cluster.stats.MappingStats which is exposed as mappings in Cluster stats

That's ok I think, these are also mapping stats at the node level.

kingherc · 2022-09-06T13:00:36Z

@javanna are you sure? I just debugged the above example I mentioned and I see the following:

This shows that the authors is indeed an objectMapper but it also appears flattened in fieldMappers and also in fieldTypeLookup.fullNameToFieldType. Moreover, the latter seems to also have the runtime field of the example. So that is why I was thinking that finally counting the fields of fieldTypeLookup.fullNameToFieldType would be fine. Right?

kingherc · 2022-09-06T13:02:49Z

> That's ok I think, these are also mapping stats at the node level.

@DaveCTurner , oh I did not see mapping stats at the node level. Where are they in the code and in the API?

Still I would suggest we have a different name. Sounds like future trouble to just have it mappings while it's different stats with a different underlying object.

kingherc · 2022-09-06T15:59:13Z

Hi again! For the moment, I am thinking of naming the stats field_mappings which sounds a bit more generic than field_mappers. What do you think?

Also, in using fieldTypeLookup.fullNameToFieldType to count the mappings, I realize this will also include alias fields. Would we want to exclude alias fields from the calculations? I guess not, since I see them taking a place in the MappingLookup structures.

javanna · 2022-09-06T19:22:52Z

The authors object does not appear in the field type lookup nor in the field mappers, only the leaves that belong to it do. In your case you have a single object, hence the count should be what you have + 1, but with many objects the difference is bigger (every level that contains sub-fields is a separate object).

DaveCTurner · 2022-09-07T08:02:29Z

> oh I did not see mapping stats at the node level

Nono I mean we're adding node-level stats about mappings here, we should just call them mappings. The cluster-level mappings stats don't make sense at the node level so I don't see how this might cause future problems.

kingherc · 2022-09-08T14:15:19Z

Hi @DaveCTurner , OK, I will name them as mappings in the node-level stats if you are certain :)

@javanna , ah I see what you mean: I should also account for the top-level field of each object. OK, I made a further example with the following mapping:

  "mappings": {
    "runtime": {
      "day_of_week": {
        "type": "keyword",
        "script": {
          "source": "emit(doc['@timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))",
          "lang": "painless"
        }
      }
    },
    "properties": {
      "@timestamp": {
        "type": "date"
      },
      "authors": {
        "properties": {
          "age": {
            "type": "long"
          },
          "company": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "name": {
            "properties": {
              "first_name": {
                "type": "keyword"
              },
              "full_name": {
                "type": "text"
              },
              "last_name": {
                "type": "keyword"
              }
            }
          }
        }
      },
      "link": {
        "type": "alias",
        "path": "url"
      },
      "title": {
        "type": "text"
      },
      "url": {
        "type": "keyword"
      }
    }
  }

And indeed I see the following:

I understand now that the final number I'm looking for (for field count) is summing the fieldMappers.size(), objectMappers.size(), and runtimeFieldMappersCount of the MappingLookup object. Please tell me if I am mistaken.
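The sum described above can be approximated from the mapping JSON alone. The following Python sketch walks the second example mapping and counts leaf fields (including multi-fields and the alias), object fields, and runtime fields. It is an illustration only: the function name `count_mapping_entries` is hypothetical, it does not read the actual MappingLookup structures, and it ignores the metadata fields (such as _id or _source) that the real field mappers also include.

```python
def count_mapping_entries(mappings: dict) -> dict:
    """Count leaf fields, objects and runtime fields in a mapping definition."""
    counts = {
        "leaf_fields": 0,
        "objects": 0,
        "runtime_fields": len(mappings.get("runtime", {})),
    }

    def walk(properties: dict) -> None:
        for definition in properties.values():
            if "properties" in definition:
                # A field with sub-properties is an object; recurse into it.
                counts["objects"] += 1
                walk(definition["properties"])
            else:
                # A leaf field, plus any multi-fields declared under "fields".
                counts["leaf_fields"] += 1 + len(definition.get("fields", {}))

    walk(mappings.get("properties", {}))
    counts["total"] = counts["leaf_fields"] + counts["objects"] + counts["runtime_fields"]
    return counts


# The second example mapping from this comment (runtime script body omitted).
mappings = {
    "runtime": {"day_of_week": {"type": "keyword"}},
    "properties": {
        "@timestamp": {"type": "date"},
        "authors": {
            "properties": {
                "age": {"type": "long"},
                "company": {
                    "type": "text",
                    "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
                },
                "name": {
                    "properties": {
                        "first_name": {"type": "keyword"},
                        "full_name": {"type": "text"},
                        "last_name": {"type": "keyword"},
                    }
                },
            }
        },
        "link": {"type": "alias", "path": "url"},
        "title": {"type": "text"},
        "url": {"type": "keyword"},
    },
}

counts = count_mapping_entries(mappings)
print(counts)
```

For this mapping the sketch finds 10 leaf fields, 2 objects (authors and authors.name) and 1 runtime field. The number reported by Elasticsearch would be higher, because it also counts the metadata field mappers.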

So that they are visible in NodeIndicesStats only at the node and index (but not shard) levels. Also visible in the _cat/nodes table. Relates to issue elastic#86639

kingherc · 2022-09-08T17:42:19Z

Hi @DaveCTurner , @javanna , opened a PR at #89807 . Should I invite you as reviewers as well (apart from @original-brownbear )? Or feel free to add yourselves as reviewers and provide feedback. Thanks!

kingherc · 2022-09-22T15:17:53Z

PR #89807 got merged, but one task remains in this ticket: "Update sizing guidance docs to refer to new stats". In the PR we briefly updated the documentation (see it here) to tell readers to consult the new mappings node stats for the estimation. On the PR, @DaveCTurner had mentioned "I think we should consider rephrasing this whole section in terms of these stats now that they're available. We can however do that in a followup - it's the third item on the list for #86639."

@original-brownbear , @DaveCTurner do you have some recommendations on how to rephrase the section, or what would you expect it to mention (and I can try working out a first PR and then receive suggestions)?

DaveCTurner · 2022-09-22T16:02:57Z

I'd like the guidance to be in terms of the stats we expose, and ideally include some instructions about how to obtain those stats (GET _nodes/stats?filter_path=nodes.*.mappings.total_estimated_overhead* I guess) and compare them to the overall heap size on each node (also available in nodes stats at some other ?filter_path). I'd probably start from those instructions and then rework the surrounding prose so it all fits in. I'm happy to take on the docs change if you'd prefer.
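The comparison described above could be sketched as follows in Python, working on an already-parsed nodes stats response. The response layout is an assumption: the overhead is read from nodes.<id>.indices.mappings (as discussed earlier in this thread) and the heap size from nodes.<id>.jvm.mem.heap_max_in_bytes. The function name `mapping_overhead_ratios` and the sample values are hypothetical.

```python
def mapping_overhead_ratios(nodes_stats: dict) -> dict:
    """Return, per node, estimated mapping overhead as a fraction of max heap.

    Assumed layout (not verified against a live cluster):
      nodes.<id>.indices.mappings.total_estimated_overhead_in_bytes
      nodes.<id>.jvm.mem.heap_max_in_bytes
    """
    ratios = {}
    for node_id, node in nodes_stats.get("nodes", {}).items():
        overhead = node["indices"]["mappings"]["total_estimated_overhead_in_bytes"]
        heap_max = node["jvm"]["mem"]["heap_max_in_bytes"]
        ratios[node_id] = overhead / heap_max
    return ratios


# Hypothetical sample response for one node with 58 fields and a 1 GiB heap.
sample_response = {
    "nodes": {
        "node-1": {
            "indices": {
                "mappings": {
                    "total_field_count": 58,
                    "total_estimated_overhead_in_bytes": 59392,
                }
            },
            "jvm": {"mem": {"heap_max_in_bytes": 1073741824}},
        }
    }
}

ratios = mapping_overhead_ratios(sample_response)
for node_id, ratio in ratios.items():
    print(f"{node_id}: mappings use about {ratio:.4%} of max heap")
```

In practice the input would come from the nodes stats API with a suitable ?filter_path, and the resulting ratio would be compared against whatever threshold the sizing guidance recommends.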

kingherc · 2022-09-22T16:57:30Z

Thanks @DaveCTurner ! No worries, I can give it a try to give you a head start, and then you can modify the PR directly if you'd like :)

Now that we have the estimated field mappings heap overhead in nodes stats, we can refer to them in the guide for sizing data nodes appropriately. Relates to elastic#86639

kingherc · 2022-10-03T11:49:03Z

Closed with latest PR #90274

DaveCTurner added >enhancement :Data Management/Stats Statistics tracking and retrieval APIs labels May 10, 2022

elasticmachine added the Team:Data Management Meta label for data/management team label May 10, 2022

This was referenced May 10, 2022

Fix Large Shard Count Scalability Issues #77466

Open

Remove shards per gb of heap guidance #86223

Merged

DaveCTurner mentioned this issue Jun 9, 2022

Report overall mapping size in cluster stats #87556

Merged

DaveCTurner added a commit that referenced this issue Jun 14, 2022

Report overall mapping size in cluster stats (#87556)

fcf293f

Adds measures of the total size of all mappings and the total number of fields in the cluster (both before and after deduplication). Relates #86639 Relates #77466

kingherc self-assigned this Aug 25, 2022

kingherc mentioned this issue Sep 5, 2022

Add NodeIndicesStats to NodeStatsTests #89798

Merged

kingherc mentioned this issue Sep 5, 2022

Introduce node mappings stats #89807

Merged

kingherc mentioned this issue Sep 22, 2022

Redefine section on sizing data nodes #90274

Merged

kingherc added a commit that referenced this issue Sep 30, 2022

Redefine section on sizing data nodes (#90274)

ad8d064

Now that we have the estimated field mappings heap overhead in nodes stats, we can refer to them in the guide for sizing data nodes appropriately. Relates to #86639

kingherc closed this as completed Oct 3, 2022
