-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
semantic_text field mapping #107519
semantic_text field mapping #107519
Conversation
Fix the merging of the object field within the semantic_text mapper, the merge context should be set at the parent level (was at the object/child level before merging).
…-text-field-mapping-specifics # Conflicts: # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/SemanticTextFeature.java
Pinging @elastic/es-search (Team:Search) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass complete. Need to go through the tests now :)
@@ -22,6 +22,7 @@ | |||
import java.util.Arrays; | |||
import java.util.List; | |||
import java.util.Map; | |||
import java.util.Set; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a test here that ensures the InferenceFieldMapper
logic is correct? Generally this is done by making a small test only plugin that provides a mapper that satisfies this InferenceFieldMapper
interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. I've added them in a new test class - MappingLookupInferenceFieldMapperTests
. LMKWYT!
@@ -17,6 +17,7 @@ | |||
requires org.apache.httpcomponents.httpasyncclient; | |||
requires org.apache.httpcomponents.httpcore.nio; | |||
requires org.apache.lucene.core; | |||
requires org.elasticsearch.logging; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this inclusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch - it seems we're not using logging. I'm removing it 👍
...ugin/inference/src/main/java/org/elasticsearch/xpack/inference/mapper/SemanticTextField.java
Outdated
Show resolved
Hide resolved
args -> new InferenceResult((String) args[0], (ModelSettings) args[1], (List<Chunk>) args[2]) | ||
); | ||
|
||
@SuppressWarnings("unchecked") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SuppressWarnings("unchecked") |
This doesn't need the suppression
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
...ugin/inference/src/main/java/org/elasticsearch/xpack/inference/mapper/SemanticTextField.java
Show resolved
Hide resolved
public class SemanticTextFieldMapper extends FieldMapper implements InferenceFieldMapper { | ||
public static final String CONTENT_TYPE = "semantic_text"; | ||
|
||
private static final Logger logger = LogManager.getLogger(SemanticTextFieldMapper.class); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logger is unused?
...nference/src/main/java/org/elasticsearch/xpack/inference/mapper/SemanticTextFieldMapper.java
Outdated
Show resolved
Hide resolved
...nference/src/main/java/org/elasticsearch/xpack/inference/mapper/SemanticTextFieldMapper.java
Outdated
Show resolved
Hide resolved
if (current == null) { | ||
conflicts.addConflict("model_settings", ""); | ||
return false; | ||
} | ||
conflicts.addConflict("model_settings", ""); | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two branches are exactly the same, is this on purpose?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes no sense, I'm simplifying it
...nference/src/main/java/org/elasticsearch/xpack/inference/mapper/SemanticTextFieldMapper.java
Show resolved
Hide resolved
...lugin/inference/src/yamlRestTest/java/org/elasticsearch/xpack/inference/InferenceRestIT.java
Outdated
Show resolved
Hide resolved
@elasticsearchmachine update branch |
server/src/main/java/org/elasticsearch/index/mapper/Mapping.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't get through all of this yet but left a couple of things that should get fixed IMO. I'll continue later
...ugin/inference/src/main/java/org/elasticsearch/xpack/inference/mapper/SemanticTextField.java
Outdated
Show resolved
Hide resolved
...nference/src/main/java/org/elasticsearch/xpack/inference/mapper/SemanticTextFieldMapper.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments on the tests. But I think this is almost ready to go :)
...e/src/yamlRestTest/resources/rest-api-spec/test/inference/10_semantic_text_field_mapping.yml
Outdated
Show resolved
Hide resolved
...s/rest-api-spec/test/inference/20_semantic_text_field_mapping_incompatible_field_mapping.yml
Outdated
Show resolved
Hide resolved
@@ -0,0 +1,105 @@ | |||
setup: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add some tests for the following cases? Yaml tests give us a ton more coverage.
- Multi-field (itself as a multi-field and multi-fields UNDER it)
- nested fields
- copy to
- Missing inference chunk text, missing inference results (you already have good tests around bad params or mixed types, huzzah!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Ben, those were definitely good tests to have:
Multi-field (itself as a multi-field and multi-fields UNDER it)
- It can't be used as a multifield (added test)
- For now, we're disabling multi-fields for semantic_text and provide support later (added check and test). It would imply checking the field content to understand if it's coming from a reindex or an index operation in order to access the appropriate content.
nested fields
Added test to check it can be used as a nested field
copy to
- For the same reasons than multi-fields, disabling to use it as the origin of copy_to.
- It can be used as a copy_to target, but this support will be added for the inference PR to come next.
Missing inference chunk text, missing inference results
Added them 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the copy-to check will be interesting, we need to be sure to handle it correctly :)
…-text-field-mapping-specifics
@elasticmachine update branch |
@elasticmachine update branch |
This PR contains changes related to just the
semantic_text
field mapping, including some specific YAML mapping tests.Automatic Inference on ingestion will be included as a separate PR. This change deals with just the mapping side:
semantic_text
dense_vector
orsparse_vector
fields internally