Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should mappings keep using CopyOnWriteHashMap? #80072

Closed
jpountz opened this issue Oct 29, 2021 · 5 comments
Closed

Should mappings keep using CopyOnWriteHashMap? #80072

jpountz opened this issue Oct 29, 2021 · 5 comments
Labels
>refactoring :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch

Comments

@jpountz
Copy link
Contributor

jpountz commented Oct 29, 2021

CopyOnWriteHashMap was introduced for mappings at a time when some Logging use-cases would introduce many (as in several thousands) fields dynamically and applying mapping updates would be the bottleneck due to the time it would take to apply mapping updates.

However several things changed since then:

  • Mapping updates are now applied on the master node before indexing the document, which makes dynamic mapping updates slow by nature (mapping updates used to be performed locally first, and then asynchronously propagated to the master node, which would cause big issues when different shards of the same index would make different decisions).
  • We introduced a soft limit of 1,000 fields per index.
  • ECS introduced standardization of field names, making it less likely to have different documents use different field names for the same information.

This CopyOnWriteHashMap optimizes for mapping updates to the detriment of lookups, which can be slower than with regular hash maps, especially in case of hash collisions. We should look into moving back to regular maps that we would fully copy upon mapping updates.

@jpountz jpountz added :Search Foundations/Mapping Index mappings, including merging and defining field types >refactoring labels Oct 29, 2021
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Oct 29, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

@dakrone
Copy link
Member

dakrone commented Nov 19, 2021

@jpountz any ideas for the best way to test something like this performance-wise? Do we have a Rally track that tests a large number of mapping updates?

@jpountz
Copy link
Contributor Author

jpountz commented Nov 19, 2021

We don't. This is in-part due to the fact that we moved away from folding everything into the same index / data stream. While in the past you could have 10s of different data types that would end up in the same index and you would get dynamic mapping updates everytime a new type of data would be indexed for the first time, nowadays dynamic mappings are usually set on the first document that gets indexed and then almost never get modified.

I'd suggest running an artificial benchmark that triggers the worst-case scenario, such as indexing 1,000 documents that all introduce a new field, to make sure that index requests do not become absurdly slow.

@romseygeek
Copy link
Contributor

We no longer use CopyOnWriteHashMap in mappers after #81449. In fact we have only one place in the codebase that uses it now, in AutoFollowCoordinator and I'm not sure that it's necessary there either so we may be able to remove it entirely.

@romseygeek
Copy link
Contributor

Fixed by #83040

@javanna javanna added Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch and removed Team:Search Meta label for search team labels Jul 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>refactoring :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch
Projects
None yet
Development

No branches or pull requests

5 participants