Consider merging word_delimiter and word_delimiter_graph #37474
Labels: >deprecation; :Search Relevance/Analysis (how text is split into tokens); Team:Search Relevance (meta label for the Search Relevance team in Elasticsearch)
We have an open PR (#29216) deprecating the `word_delimiter` filter in favour of the `word_delimiter_graph` filter. Similarly, we also have the `synonym` and `synonym_graph` filters, the first of which ought to be deprecated in favour of the second.

The difference in both of these cases between the deprecated and non-deprecated versions is that the `_graph` filters correctly assign position lengths to their output, creating properly-formed graphs. However, as Lucene does not store position lengths in the index, this makes no difference at index time, only at query time. For this reason (and because I don't think anybody intentionally wants a badly-formed query), maybe we should simply map both forms to the `_graph` implementation under the hood? This would also save having to re-index when upgrading with mappings that use `word_delimiter` or `synonym`.
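
To make the index-time versus query-time point concrete, here is a minimal Python sketch. It is a toy model, not Lucene or Elasticsearch code: the `Token` class, the helper functions, and the token output for the text "wi-fi router" are all hypothetical, chosen only to show what changes when position lengths are dropped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    term: str
    position: int             # absolute start position in the stream
    position_length: int = 1  # number of positions the token spans


# Hypothetical filter output for "wi-fi router" with catenation enabled.
# The only difference is the position length of the catenated token "wifi".
flat = [Token("wifi", 0, 1), Token("wi", 0, 1), Token("fi", 1, 1), Token("router", 2, 1)]
graph = [Token("wifi", 0, 2), Token("wi", 0, 1), Token("fi", 1, 1), Token("router", 2, 1)]


def index_postings(tokens):
    """Keep only what this model's index stores: term and position."""
    return sorted((t.term, t.position) for t in tokens)


def paths(tokens):
    """Enumerate every term sequence through the token graph."""
    end = max(t.position + t.position_length for t in tokens)

    def walk(pos):
        if pos == end:
            return [[]]
        found = []
        for t in tokens:
            if t.position == pos:
                for rest in walk(pos + t.position_length):
                    found.append([t.term] + rest)
        return found

    return walk(0)


# Index time: position lengths are discarded, so both streams look the same.
print(index_postings(flat) == index_postings(graph))

# Query time: only the graph stream yields the well-formed alternatives.
print(paths(flat))
print(paths(graph))
```

In this model the flat stream produces the path `wifi fi router`, which is the kind of badly-formed query the proposal would avoid, while the graph stream produces `wifi router` and `wi fi router`.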