Don't detect source's XContentType in DocumentParser.parseDocument() #26880

tlrx · 2017-10-04T09:08:30Z

DocumentParser.parseDocument() auto detects the XContentType of the
document to parse, but this information is already provided by SourceToParse.

This PR also removes an unused parameter and avoids splitting field names twice in the getMapper() method.

cbuescher

@tlrx thanks, removing the unnecessary xContent type detection and the unused atRoot flag looks good to me. I added a question about the use of the third change though.

cbuescher · 2017-10-05T10:47:52Z

core/src/main/java/org/elasticsearch/index/mapper/DocumentParser.java

@@ -481,14 +482,13 @@ private static void parseObjectOrField(ParseContext context, Mapper mapper) thro
    private static void parseObject(final ParseContext context, ObjectMapper mapper, String currentFieldName) throws IOException {
        assert currentFieldName != null;

-        Mapper objectMapper = getMapper(mapper, currentFieldName);
+        final String[] paths = splitAndValidatePath(currentFieldName);


Can you explain what moving the call to splitAndValidatePath() out of getMapper() saves here? As far as I understand, it seems better to centralize this in getMapper(), but I might be missing something.

I came across this while stepping through this portion of the code in a debugger. I was surprised that the field name was split and validated twice: once in getMapper() and again right after in the else branch. I can revert this if you prefer.

Okay, I understand this now. Makes sense, although it seems nice to hide the call to splitAndValidatePath in the getMapper method.
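To make the trade-off in this thread concrete, here is a minimal standalone sketch of the "split once, pass the result down" shape. It is not the Elasticsearch code: `Node`, `resolve`, and the `splitCalls` counter are hypothetical stand-ins, and `getMapper` is simplified to take only the pre-split path. The only point illustrated is that the caller splits the dotted field name a single time and reuses the array in both branches.

```java
import java.util.HashMap;
import java.util.Map;

/** Standalone sketch; none of these types are the real Elasticsearch classes. */
public class SplitOnceSketch {

    /** Counts calls so the example can show each field name is split only once. */
    static int splitCalls = 0;

    /** Hypothetical stand-in for an object mapper: a named node with children. */
    static final class Node {
        final String name;
        final Map<String, Node> children = new HashMap<>();

        Node(String name) {
            this.name = name;
        }

        /** Returns the child with the given name, creating it if needed. */
        Node child(String childName) {
            return children.computeIfAbsent(childName, Node::new);
        }
    }

    /** Splits a dotted field name and rejects empty path elements. */
    static String[] splitAndValidatePath(String fullFieldPath) {
        splitCalls++;
        String[] parts = fullFieldPath.split("\\.", -1);
        for (String part : parts) {
            if (part.trim().isEmpty()) {
                throw new IllegalArgumentException(
                    "field name [" + fullFieldPath + "] has an empty path element");
            }
        }
        return parts;
    }

    /** Walks the tree along the pre-split path; returns null if a step is missing. */
    static Node getMapper(Node root, String[] subfields) {
        Node current = root;
        for (String subfield : subfields) {
            current = current.children.get(subfield);
            if (current == null) {
                return null;
            }
        }
        return current;
    }

    /** The caller splits once and reuses the array in both branches. */
    static String resolve(Node root, String fieldName) {
        final String[] paths = splitAndValidatePath(fieldName);
        Node found = getMapper(root, paths);
        if (found != null) {
            return "existing:" + found.name;
        }
        // The fallback branch needs the leaf name, taken from the same array.
        return "dynamic:" + paths[paths.length - 1];
    }

    public static void main(String[] args) {
        Node root = new Node("_doc");
        root.child("user").child("name");
        System.out.println(resolve(root, "user.name"));
        System.out.println(resolve(root, "user.age"));
        System.out.println("split calls: " + splitCalls);
    }
}
```

The cost of this shape is the one raised in the review: the caller now has to know about splitAndValidatePath() instead of having it hidden inside the lookup.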

javanna

good catch @tlrx I think that is a leftover of #22691. We already do auto-detection when we don't know the content-type (we don't have it in the translog) before calling this method, so we should not do auto-detection within the method.
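The division of responsibility described here can be sketched in a few lines. This is a simplified standalone illustration, not the Elasticsearch API: `Source` is a hypothetical stand-in for SourceToParse, `ContentType` for XContentType, and the byte sniffing in `detect` is deliberately naive. The assumption shown is that detection happens once at the boundary, only when the type is unknown, and the parsing method trusts the type it is given.

```java
import java.nio.charset.StandardCharsets;
import java.util.Objects;

/** Standalone sketch; none of these types are the real Elasticsearch classes. */
public class ContentTypeSketch {

    /** Hypothetical stand-in for XContentType, reduced to two formats. */
    enum ContentType { JSON, YAML }

    /** Counts detection calls so the example can show when sniffing happens. */
    static int detectCalls = 0;

    /** Hypothetical stand-in for SourceToParse: the bytes plus their known type. */
    static final class Source {
        final byte[] bytes;
        final ContentType type;

        Source(byte[] bytes, ContentType type) {
            this.bytes = Objects.requireNonNull(bytes);
            this.type = Objects.requireNonNull(type);
        }
    }

    /** Naive sniffing, meant only for callers that do not know the type. */
    static ContentType detect(byte[] bytes) {
        detectCalls++;
        for (byte b : bytes) {
            if (b == '{') {
                return ContentType.JSON;
            }
            if (b != ' ' && b != '\n' && b != '\r' && b != '\t') {
                break;
            }
        }
        return ContentType.YAML;
    }

    /** Boundary helper: detect once, then carry the type along with the bytes. */
    static Source fromUntypedBytes(byte[] bytes) {
        return new Source(bytes, detect(bytes));
    }

    /** Parsing relies on the supplied type and never sniffs the bytes again. */
    static String parseDocument(Source source) {
        return "parsed " + source.bytes.length + " bytes as " + source.type;
    }

    public static void main(String[] args) {
        byte[] json = "{\"a\":1}".getBytes(StandardCharsets.UTF_8);
        // Type known by the caller: no detection at all.
        System.out.println(parseDocument(new Source(json, ContentType.JSON)));
        // Type unknown: detection runs once, before parseDocument is called.
        System.out.println(parseDocument(fromUntypedBytes(json)));
        System.out.println("detect calls: " + detectCalls);
    }
}
```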

cbuescher

LGTM. I left a tiny suggestion. The current CI failure in PercolatorQuerySearchIT reproduces for me (but not on the previous commit), so checking that before merging would also be good.

cbuescher · 2017-10-07T09:56:56Z

core/src/main/java/org/elasticsearch/index/mapper/DocumentParser.java

@@ -929,8 +929,7 @@ private static void parseCopy(String field, ParseContext context) throws IOExcep
    }

    // looks up a child mapper, but takes into account field names that expand to objects
-    static Mapper getMapper(ObjectMapper objectMapper, String fieldName) {
-        String[] subfields = splitAndValidatePath(fieldName);
+    static Mapper getMapper(ObjectMapper objectMapper, String fieldName, String[] subfields) {


nit: it seems this can be made private.

cbuescher · 2017-10-07T10:07:25Z

core/src/main/java/org/elasticsearch/index/mapper/DocumentParser.java

@@ -481,14 +482,13 @@ private static void parseObjectOrField(ParseContext context, Mapper mapper) thro
    private static void parseObject(final ParseContext context, ObjectMapper mapper, String currentFieldName) throws IOException {
        assert currentFieldName != null;

-        Mapper objectMapper = getMapper(mapper, currentFieldName);
+        final String[] paths = splitAndValidatePath(currentFieldName);


Okay, I understand this now. Makes sense, although it seems nice to hide the call to splitAndValidatePath in the getMapper method.

tlrx · 2017-10-10T13:47:41Z

Thanks @javanna and @cbuescher. This has been backported to 6.0 and 6.x.

tlrx added the :Search Foundations/Mapping, >enhancement, review, and v7.0.0 labels Oct 4, 2017

cbuescher self-assigned this Oct 5, 2017

cbuescher reviewed Oct 5, 2017

javanna approved these changes Oct 5, 2017

cbuescher approved these changes Oct 7, 2017

tlrx added 2 commits October 10, 2017 13:43

Don't detect source's XContentType in DocumentParser.parseDocument()

11d71d1

DocumentParser.parseDocument() auto detects the XContentType of the document to parse, but this information is already provided by SourceToParse.

Fix PercolatorQuerySearchIT

af7e2d9

tlrx force-pushed the use-contenttype-in-doc-parser branch from 6ee200e to af7e2d9 October 10, 2017 11:46

tlrx merged commit 6658ff0 into elastic:master Oct 10, 2017

tlrx added the v6.1.0 label Oct 10, 2017

tlrx added the v6.0.0 label Oct 10, 2017

tlrx deleted the use-contenttype-in-doc-parser branch October 10, 2017 13:48

lcawl added v6.0.0-rc2 and removed v6.0.0 labels Oct 30, 2017

lcawl removed the v6.1.0 label Dec 12, 2017

colings86 added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019
