search as you type fieldmapper #35600

Merged

@andyb-elastic andyb-elastic commented Nov 15, 2018

Adds a new field type, search_as_you_type, that acts like a text field optimized for as-you-type search completion. It creates a couple of subfields that analyze the indexed terms as shingles, against which full terms are queried, and a prefix subfield that analyzes terms as the largest shingle size used and as edge n-grams, against which partial terms are queried.
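
For readers unfamiliar with the two analysis techniques named above, here is a minimal plain-Java sketch of what shingles and edge n-grams are. It is an illustration only, not the analyzer chain this PR adds: the class and method names (`ShingleSketch`, `shingles`, `edgeNGrams`) and the `maxGram` parameter are hypothetical, and the real subfields rely on the index analyzer rather than on pre-split term lists.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Elasticsearch or Lucene.
final class ShingleSketch {

    // Returns every run of `size` adjacent terms, joined with a single space.
    static List<String> shingles(List<String> terms, int size) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + size <= terms.size(); i++) {
            out.add(String.join(" ", terms.subList(i, i + size)));
        }
        return out;
    }

    // Returns the leading prefixes of `text`, from 1 character up to maxGram characters.
    static List<String> edgeNGrams(String text, int maxGram) {
        List<String> out = new ArrayList<>();
        int limit = Math.min(maxGram, text.length());
        for (int n = 1; n <= limit; n++) {
            out.add(text.substring(0, n));
        }
        return out;
    }
}
```

With the terms "quick", "brown", "fox", `shingles(terms, 2)` yields "quick brown" and "brown fox", and `edgeNGrams("fox", 20)` yields "f", "fo", "fox". Indexing every leading prefix is what lets a partially typed term match with a plain term lookup.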

Adds a boolean_prefix query type that creates a boolean clause with a term query for each term except the last, for which it creates a boolean clause with a prefix query. This will be the recommended query type for querying search_as_you_type fields, although other text queries will be supported as well.
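
The clause construction described above can be sketched in a few lines of plain Java. This is a hedged illustration rather than the query builder from the PR: `BoolPrefixSketch` and `clauses` are hypothetical names, clauses are represented as strings instead of Lucene query objects, and whitespace splitting stands in for the field's analyzer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not the query builder added by this PR.
final class BoolPrefixSketch {

    // Builds one clause per term: "term:<t>" for every term except the last,
    // and "prefix:<t>" for the last (possibly partially typed) term.
    static List<String> clauses(String text) {
        List<String> out = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return out; // no terms, no clauses
        }
        // Assumption: whitespace tokenization stands in for real analysis.
        String[] terms = trimmed.split("\\s+");
        for (int i = 0; i < terms.length; i++) {
            boolean last = (i == terms.length - 1);
            out.add((last ? "prefix:" : "term:") + terms[i]);
        }
        return out;
    }
}
```

For the input "quick brown fo", this produces `term:quick`, `term:brown`, and `prefix:fo`; a single-term input produces only a prefix clause.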

This is for #33160

@andyb-elastic andyb-elastic added the >feature, WIP, and :Search Relevance/Suggesters ("Did you mean" and suggestions as you type) labels Nov 15, 2018
@andyb-elastic andyb-elastic requested a review from jimczi November 15, 2018 19:26
@elasticmachine
Collaborator

Pinging @elastic/es-search-aggs

@andyb-elastic
Contributor Author

@jimczi, following our discussion yesterday I:

  • set default max_shingle_size to 3
  • capped it at 5
  • removed configurability for min_gram and max_gram
  • added tests for the configuration options exposed

I think this is ready for a first look, unless there's anything else you'd like me to add.

Contributor

@jimczi jimczi left a comment

It's a great start, @andyb-elastic. I left some comments regarding how we should interact with this new mapper.

maxShingleSize = XContentMapValues.nodeIntegerValue(fieldNode);
iterator.remove();
}
}
Contributor

Should we throw an error if the map is not empty (that is, it still has some fields that we don't support)?

Contributor Author

I assumed whatever calls this method did that afterwards; it looks like at least some callers do. Should I throw one here too?

Mapper.Builder<?,?> fieldBuilder = typeParser.parse(realFieldName, propNode, parserContext);
for (int i = fieldNameParts.length - 2; i >= 0; --i) {
    ObjectMapper.Builder<?, ?> intermediate = new ObjectMapper.Builder<>(fieldNameParts[i]);
    intermediate.add(fieldBuilder);
    fieldBuilder = intermediate;
}
objBuilder.add(fieldBuilder);
propNode.remove("type");
DocumentMapperParser.checkNoRemainingFields(fieldName, propNode, parserContext.indexVersionCreated());
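
The pattern under discussion (the type parser removes each parameter it recognizes, and the caller then rejects whatever is left) can be sketched in isolation as follows. This is a simplified stand-in, not the Elasticsearch code: `MapperParseSketch` and both methods are hypothetical, the default of 3 and the cap of 5 come from the earlier comment in this thread, and throwing when the cap is exceeded is an assumption about how the cap is enforced.

```java
import java.util.Iterator;
import java.util.Map;

// Hypothetical illustration class; the real logic lives in the mapper's TypeParser
// and in DocumentMapperParser.checkNoRemainingFields.
final class MapperParseSketch {
    static final int DEFAULT_MAX_SHINGLE_SIZE = 3; // default mentioned in the thread
    static final int MAX_SHINGLE_SIZE_CAP = 5;     // cap mentioned in the thread

    // Consumes the parameter this parser understands and removes it from the node.
    static int parseMaxShingleSize(Map<String, Object> node) {
        int maxShingleSize = DEFAULT_MAX_SHINGLE_SIZE;
        Iterator<Map.Entry<String, Object>> iterator = node.entrySet().iterator();
        while (iterator.hasNext()) {
            Map.Entry<String, Object> entry = iterator.next();
            if (entry.getKey().equals("max_shingle_size")) {
                maxShingleSize = ((Number) entry.getValue()).intValue();
                iterator.remove(); // mark the parameter as handled
            }
        }
        // Assumption: values above the cap are rejected rather than clamped.
        if (maxShingleSize > MAX_SHINGLE_SIZE_CAP) {
            throw new IllegalArgumentException(
                "max_shingle_size must be at most " + MAX_SHINGLE_SIZE_CAP + ", got " + maxShingleSize);
        }
        return maxShingleSize;
    }

    // Caller-side check: anything still in the node was not recognized by any parser.
    static void checkNoRemainingFields(String fieldName, Map<String, Object> node) {
        if (!node.isEmpty()) {
            throw new IllegalArgumentException(
                "unsupported parameters for field [" + fieldName + "]: " + node.keySet());
        }
    }
}
```

Because the leftover check runs in the caller, the type parser itself only needs to remove what it consumes, which is the behavior the snippet above relies on.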

Contributor

@jimczi jimczi left a comment

Thanks for writing the docs @andyb-elastic , I left some comments.

> When going through which multi_match parameters are supported for bool_prefix, it looks like the operator, auto_generate_synonyms_phrase_query, and the fuzziness family of settings are applied to the query in the multi_match context, so it seems like we should support them in the single-query form too

I agree, we should at least support operator and fuzziness.

@andyb-elastic
Contributor Author

> I agree, we should at least support operator and fuzziness.

Sounds good, I think I'd like to add them in a follow up

@jimczi
Contributor

jimczi commented Mar 25, 2019

> Sounds good, I think I'd like to add them in a follow up

Sure as you like. Let's not forget it though ;)

@andyb-elastic
Contributor Author

I'm adding those parameters here since it's not a big change

@andyb-elastic
Contributor Author

I pushed changes that rename the query to match_bool_prefix and add support for operator, fuzziness, and the fuzziness-related parameters.

Contributor

@jimczi jimczi left a comment

This change looks good to me, let's merge!

@andyb-elastic andyb-elastic changed the title from "Feature search as you type fieldmapper" to "search as you type fieldmapper" Mar 27, 2019
@andyb-elastic andyb-elastic merged commit 6bba9fc into elastic:master Mar 27, 2019
andyb-elastic added a commit to andyb-elastic/elasticsearch that referenced this pull request Mar 27, 2019
Adds the search_as_you_type field type, which acts like a text field optimized
for as-you-type search completion. It creates a couple of subfields that analyze
the indexed terms as shingles, against which full terms are queried, and a
prefix subfield that analyzes terms as the largest shingle size used and as
edge n-grams, against which partial terms are queried.

Adds a match_bool_prefix query type that creates a boolean clause with a term
query for each term except the last, for which a boolean clause with a prefix
query is created.

The match_bool_prefix query is the recommended way of querying a
search_as_you_type field. It boils down to term queries for each shingle of the
input text on the appropriate shingle field, plus the final (possibly partial)
term as a term query on the prefix field. This field type also supports phrase
and phrase prefix queries.
andyb-elastic added a commit that referenced this pull request Mar 27, 2019
@andyb-elastic
Contributor Author

master - 6bba9fc
7.x - 23395a9

- match: { hits.total: 1 }
- match: { hits.hits.0._source.a_field: "quick brown fox jump lazy dog" }
- match: { hits.hits.0._source.text_field: "quick brown fox jump lazy dog" }
- match: { hits.hits.0.highlight.a_field.0: "quick <em>brown</em> fox jump lazy dog" }
Contributor

@telendt telendt Mar 18, 2020

@andyb-elastic Is it unreasonable to expect quick <em>brown</em> <em>fox</em> jump lazy dog highlight here?

(That's what match_bool_prefix: "brown fo" on text_field would give).

Contributor

I know it's not nice to comment on an already merged PR, which is why I opened this issue:
#53744

Labels
>feature, :Search Relevance/Suggesters ("Did you mean" and suggestions as you type), v7.2.0

5 participants