Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable strict duplicate checks for JSON content #22073

Merged
Show file tree
Hide file tree
Changes from 6 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -45,17 +45,44 @@ public class JsonXContent implements XContent {
public static XContentBuilder contentBuilder() throws IOException {
return XContentBuilder.builder(jsonXContent);
}

private static final JsonFactory jsonFactory;

public static final JsonXContent jsonXContent;

/*
* NOTE: This comment is only meant for maintainers of the Elasticsearch code base and is intentionally not a Javadoc comment as it
* describes an undocumented system property.
*
*
* Determines whether the JSON parser will always check for duplicate keys in JSON content. This behavior is enabled by default but
* can be disabled by setting the otherwise undocumented system property "es.json.strict_duplicate_detection" to "false".
*
* Before we've enabled this mode, we had custom duplicate checks in various parts of the code base. As the user can still disable this
* mode and fall back to the legacy duplicate checks, we still need to keep the custom duplicate checks around and we also need to keep
* the tests around.
*
* If this fallback via system property is removed one day in the future you can remove all tests that call this method and also remove
* the corresponding custom duplicate check code.
*
*/
public static boolean isStrictDuplicateDetectionEnabled() {
// Don't allow duplicate keys in JSON content by default but let the user opt out
return Boolean.valueOf(System.getProperty("es.json.strict_duplicate_detection", "true"));
}

static {
jsonFactory = new JsonFactory();
jsonFactory.configure(JsonGenerator.Feature.QUOTE_FIELD_NAMES, true);
jsonFactory.configure(JsonParser.Feature.ALLOW_COMMENTS, true);
jsonFactory.configure(JsonFactory.Feature.FAIL_ON_SYMBOL_HASH_OVERFLOW, false); // this trips on many mappings now...
// Do not automatically close unclosed objects/arrays in com.fasterxml.jackson.core.json.UTF8JsonGenerator#close() method
jsonFactory.configure(JsonGenerator.Feature.AUTO_CLOSE_JSON_CONTENT, false);
try {
jsonFactory.configure(JsonParser.Feature.STRICT_DUPLICATE_DETECTION, isStrictDuplicateDetectionEnabled());
} catch (Exception ex) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this exception still a problem?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it is. A user could still pass -Des.json.strict_duplicate_detection=complete_bogus and provoke an exception when we try to convert the system property to a boolean. I don't want to be lenient and terminate early then.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If they specify a non-boolean I think they'll get false.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we should only support true and false though.

// don't be lenient if the user specified an invalid value
throw new ExceptionInInitializerError(ex);
}
jsonXContent = new JsonXContent();
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -67,7 +67,7 @@ public void testFieldsParsing() throws Exception {
assertThat(request.getIndexConstraints()[3].getComparison(), equalTo(LTE));
assertThat(request.getIndexConstraints()[4].getField(), equalTo("field5"));
assertThat(request.getIndexConstraints()[4].getValue(), equalTo("2"));
assertThat(request.getIndexConstraints()[4].getProperty(), equalTo(MAX));
assertThat(request.getIndexConstraints()[4].getProperty(), equalTo(MIN));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For those following along at home, this is required because the example data had a duplicate key and @danielmitterdorfer switched it to min in the file.

assertThat(request.getIndexConstraints()[4].getComparison(), equalTo(GT));
assertThat(request.getIndexConstraints()[5].getField(), equalTo("field5"));
assertThat(request.getIndexConstraints()[5].getValue(), equalTo("9"));
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@
import org.elasticsearch.ElasticsearchParseException;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.settings.SettingsException;
import org.elasticsearch.common.xcontent.json.JsonXContent;
import org.elasticsearch.test.ESTestCase;

import static org.hamcrest.CoreMatchers.containsString;
Expand All @@ -48,6 +49,10 @@ public void testSimpleJsonSettings() throws Exception {
}

public void testDuplicateKeysThrowsException() {
if (JsonXContent.isStrictDuplicateDetectionEnabled()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about assumeTrue("Test only makes sense if json parser doesn't have strict duplicate checks enabled and can be removed if we force it to be enabled", JsonXContent.isStrictDuplicateDetectionEnabled());?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I didn't know about assumeTrue / assumeFalse and have only noticed the approach that I've used in the code base. Btw, I think this should be assumeFalse. But this makes sense and I'll change the tests as you've suggested. Thanks for the pointer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. I think I did have it backwards.

logger.info("Skipping test as it uses a custom duplicate check that is obsolete when strict duplicate checks are enabled.");
return;
}
final String json = "{\"foo\":\"bar\",\"foo\":\"baz\"}";
final SettingsException e = expectThrows(SettingsException.class, () -> Settings.builder().loadFromSource(json).build());
assertEquals(e.getCause().getClass(), ElasticsearchParseException.class);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@
import org.elasticsearch.common.ParsingException;
import org.elasticsearch.common.bytes.BytesReference;
import org.elasticsearch.common.xcontent.AbstractObjectParser.NoContextParser;
import org.elasticsearch.common.xcontent.json.JsonXContent;
import org.elasticsearch.test.ESTestCase;
import org.hamcrest.Matcher;

Expand Down Expand Up @@ -168,6 +169,10 @@ public void testMissingFirstConstructorArgButNotRequired() throws IOException {
}

public void testRepeatedConstructorParam() throws IOException {
if (JsonXContent.isStrictDuplicateDetectionEnabled()) {
logger.info("Skipping test as it uses a custom duplicate check that is obsolete when strict duplicate checks are enabled.");
return;
}
XContentParser parser = XContentType.JSON.xContent().createParser(
"{\n"
+ " \"vegetable\": 1,\n"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonGenerator;

import com.fasterxml.jackson.core.JsonParseException;
import org.elasticsearch.common.xcontent.BaseXContentTestCase;
import org.elasticsearch.common.xcontent.XContentType;

Expand All @@ -39,4 +40,15 @@ public void testBigInteger() throws Exception {
JsonGenerator generator = new JsonFactory().createGenerator(os);
doTestBigInteger(generator, os);
}

public void testChecksForDuplicates() throws Exception {
if (JsonXContent.isStrictDuplicateDetectionEnabled() == false) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assumeFalse here too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'll change all affected tests accordingly.

logger.info("Skipping test as it is only effective is strict duplicate detection is enabled");
return;
}

JsonParseException pex = expectThrows(JsonParseException.class,
() -> XContentType.JSON.xContent().createParser("{ \"key\": 1, \"key\": 2 }").map());
assertEquals("Duplicate field 'key'", pex.getMessage());
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -83,8 +83,8 @@ public void testCopyToFieldsParsing() throws Exception {
assertThat(copyTestMap.get("type").toString(), is("text"));
List<String> copyToList = (List<String>) copyTestMap.get("copy_to");
assertThat(copyToList.size(), equalTo(2));
assertThat(copyToList.get(0).toString(), equalTo("another_field"));
assertThat(copyToList.get(1).toString(), equalTo("cyclic_test"));
assertThat(copyToList.get(0), equalTo("another_field"));
assertThat(copyToList.get(1), equalTo("cyclic_test"));

// Check data parsing
BytesReference json = jsonBuilder().startObject()
Expand Down Expand Up @@ -312,44 +312,43 @@ public void testCopyToFieldMerge() throws Exception {
public void testCopyToNestedField() throws Exception {
IndexService indexService = createIndex("test");
DocumentMapperParser parser = indexService.mapperService().documentMapperParser();
for (boolean mapped : new boolean[] {true, false}) {
XContentBuilder mapping = jsonBuilder().startObject()
.startObject("type")
.startObject("properties")
.startObject("target")
.field("type", "long")
.field("doc_values", false)
.endObject()
.startObject("n1")
.field("type", "nested")
.startObject("properties")
.startObject("target")
.field("type", "long")
.field("doc_values", false)
XContentBuilder mapping = jsonBuilder().startObject()
.startObject("type")
.startObject("properties")
.startObject("target")
.field("type", "long")
.field("doc_values", false)
.endObject()
.startObject("n1")
.field("type", "nested")
.startObject("properties")
.startObject("target")
.field("type", "long")
.field("doc_values", false)
.endObject()
.startObject("n2")
.field("type", "nested")
.startObject("properties")
.startObject("target")
.field("type", "long")
.field("doc_values", false)
.endObject()
.startObject("source")
.field("type", "long")
.field("doc_values", false)
.startArray("copy_to")
.value("target") // should go to the root doc
.value("n1.target") // should go to the parent doc
.value("n1.n2.target") // should go to the current doc
.endArray()
.endObject()
.endObject()
.startObject("n2")
.field("type", "nested")
.startObject("properties")
.startObject("target")
.field("type", "long")
.field("doc_values", false)
.endObject()
.startObject("source")
.field("type", "long")
.field("doc_values", false)
.startArray("copy_to")
.value("target") // should go to the root doc
.value("n1.target") // should go to the parent doc
.value("n1.n2.target") // should go to the current doc
.endArray()
.endObject();
for (int i = 0; i < 3; ++i) {
if (mapped) {
mapping = mapping.startObject("target").field("type", "long").field("doc_values", false).endObject();
}
mapping = mapping.endObject().endObject();
}
mapping = mapping.endObject();
.endObject()
.endObject()
.endObject()
.endObject()
.endObject()
.endObject();

DocumentMapper mapper = parser.parse("type", new CompressedXContent(mapping.string()));

Expand All @@ -376,39 +375,38 @@ public void testCopyToNestedField() throws Exception {
.endArray()
.endObject();

ParsedDocument doc = mapper.parse("test", "type", "1", jsonDoc.bytes());
assertEquals(6, doc.docs().size());

Document nested = doc.docs().get(0);
assertFieldValue(nested, "n1.n2.target", 7L);
assertFieldValue(nested, "n1.target");
assertFieldValue(nested, "target");

nested = doc.docs().get(2);
assertFieldValue(nested, "n1.n2.target", 5L);
assertFieldValue(nested, "n1.target");
assertFieldValue(nested, "target");

nested = doc.docs().get(3);
assertFieldValue(nested, "n1.n2.target", 3L);
assertFieldValue(nested, "n1.target");
assertFieldValue(nested, "target");

Document parent = doc.docs().get(1);
assertFieldValue(parent, "target");
assertFieldValue(parent, "n1.target", 7L);
assertFieldValue(parent, "n1.n2.target");

parent = doc.docs().get(4);
assertFieldValue(parent, "target");
assertFieldValue(parent, "n1.target", 3L, 5L);
assertFieldValue(parent, "n1.n2.target");

Document root = doc.docs().get(5);
assertFieldValue(root, "target", 3L, 5L, 7L);
assertFieldValue(root, "n1.target");
assertFieldValue(root, "n1.n2.target");
}
ParsedDocument doc = mapper.parse("test", "type", "1", jsonDoc.bytes());
assertEquals(6, doc.docs().size());

Document nested = doc.docs().get(0);
assertFieldValue(nested, "n1.n2.target", 7L);
assertFieldValue(nested, "n1.target");
assertFieldValue(nested, "target");

nested = doc.docs().get(2);
assertFieldValue(nested, "n1.n2.target", 5L);
assertFieldValue(nested, "n1.target");
assertFieldValue(nested, "target");

nested = doc.docs().get(3);
assertFieldValue(nested, "n1.n2.target", 3L);
assertFieldValue(nested, "n1.target");
assertFieldValue(nested, "target");

Document parent = doc.docs().get(1);
assertFieldValue(parent, "target");
assertFieldValue(parent, "n1.target", 7L);
assertFieldValue(parent, "n1.n2.target");

parent = doc.docs().get(4);
assertFieldValue(parent, "target");
assertFieldValue(parent, "n1.target", 3L, 5L);
assertFieldValue(parent, "n1.n2.target");

Document root = doc.docs().get(5);
assertFieldValue(root, "target", 3L, 5L, 7L);
assertFieldValue(root, "n1.target");
assertFieldValue(root, "n1.n2.target");
}

public void testCopyToDynamicNestedObjectParsing() throws Exception {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@
import org.elasticsearch.common.xcontent.XContentBuilder;
import org.elasticsearch.common.xcontent.XContentFactory;
import org.elasticsearch.common.xcontent.XContentType;
import org.elasticsearch.common.xcontent.json.JsonXContent;
import org.elasticsearch.search.internal.SearchContext;
import org.elasticsearch.test.AbstractQueryTestCase;
import org.hamcrest.Matchers;
Expand Down Expand Up @@ -343,6 +344,10 @@ public void testUnknownQueryName() throws IOException {
* test that two queries in object throws error
*/
public void testTooManyQueriesInObject() throws IOException {
if (JsonXContent.isStrictDuplicateDetectionEnabled()) {
logger.info("Skipping test as it uses a custom duplicate check that is obsolete when strict duplicate checks are enabled.");
return;
}
String clauseType = randomFrom("must", "should", "must_not", "filter");
// should also throw error if invalid query is preceded by a valid one
String query = "{\n" +
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@
import org.apache.lucene.search.ConstantScoreQuery;
import org.apache.lucene.search.Query;
import org.elasticsearch.common.ParsingException;
import org.elasticsearch.common.xcontent.json.JsonXContent;
import org.elasticsearch.search.internal.SearchContext;
import org.elasticsearch.test.AbstractQueryTestCase;

Expand Down Expand Up @@ -65,6 +66,10 @@ public void testFilterElement() throws IOException {
* test that multiple "filter" elements causes {@link ParsingException}
*/
public void testMultipleFilterElements() throws IOException {
if (JsonXContent.isStrictDuplicateDetectionEnabled()) {
logger.info("Skipping test as it uses a custom duplicate check that is obsolete when strict duplicate checks are enabled.");
return;
}
String queryString = "{ \"" + ConstantScoreQueryBuilder.NAME + "\" : {\n" +
"\"filter\" : { \"term\": { \"foo\": \"a\" } },\n" +
"\"filter\" : { \"term\": { \"foo\": \"x\" } },\n" +
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,7 @@
import org.elasticsearch.common.lucene.search.function.WeightFactorFunction;
import org.elasticsearch.common.unit.DistanceUnit;
import org.elasticsearch.common.xcontent.XContentType;
import org.elasticsearch.common.xcontent.json.JsonXContent;
import org.elasticsearch.index.query.AbstractQueryBuilder;
import org.elasticsearch.index.query.MatchAllQueryBuilder;
import org.elasticsearch.index.query.QueryBuilder;
Expand Down Expand Up @@ -722,6 +723,10 @@ public void testMalformedQueryMultipleQueryObjects() throws IOException {
}

public void testMalformedQueryMultipleQueryElements() throws IOException {
if (JsonXContent.isStrictDuplicateDetectionEnabled()) {
logger.info("Skipping test as it uses a custom duplicate check that is obsolete when strict duplicate checks are enabled.");
return;
}
String json = "{\n" +
" \"function_score\":{\n" +
" \"query\":{\n" +
Expand Down
Loading