Allow doc-values only search on date types #82602

Merged
merged 2 commits into from
Jan 17, 2022
3 changes: 2 additions & 1 deletion docs/reference/mapping/params/doc-values.asciidoc
@@ -17,7 +17,8 @@ makes this data access pattern possible. They store the same values as the
sorting and aggregations. Doc values are supported on almost all field types,
with the __notable exception of `text` and `annotated_text` fields__.

<<number,Numeric types>>, such as `long` and `double`, can also be queried
<<number,Numeric types>>, such as `long` and `double`, and <<date,Date types>>
can also be queried
when they are not <<mapping-index,indexed>> but only have doc values enabled.
Query performance on doc values is much slower than on index structures, but
offers an interesting tradeoff between disk usage and query performance for
4 changes: 3 additions & 1 deletion docs/reference/mapping/types/date.asciidoc
@@ -137,7 +137,9 @@ The following parameters are accepted by `date` fields:

<<mapping-index,`index`>>::

Should the field be searchable? Accepts `true` (default) and `false`.
Should the field be quickly searchable? Accepts `true` (default) and
`false`. Date fields that only have <<doc-values,`doc_values`>>
enabled can also be queried, albeit slower.

<<null-value,`null_value`>>::

3 changes: 2 additions & 1 deletion docs/reference/query-dsl.asciidoc
@@ -33,7 +33,8 @@ the stability of the cluster. Those queries can be categorised as follows:

* Queries that need to do linear scans to identify matches:
** <<query-dsl-script-query,`script` queries>>
** queries on <<number,numeric fields>> that are not indexed but have <<doc-values,doc values>> enabled
** queries on <<number,numeric>> and <<date,date>> fields that are not indexed
but have <<doc-values,doc values>> enabled

* Queries that have a high up-front cost:
** <<query-dsl-fuzzy-query,`fuzzy` queries>> (except on
Original file line number Diff line number Diff line change
@@ -83,6 +83,9 @@ setup:
type: long
date:
type: date
non_indexed_date:
type: date
index: false
geo:
type: keyword
object:
@@ -210,6 +213,18 @@ setup:

- match: {fields.object\.nested1.long.searchable: true}

---
"Field caps for date field with only doc values":
  - skip:
      version: " - 8.0.99"
      reason: "doc values search was added in 8.1.0"
  - do:
      field_caps:
        index: 'test1,test2,test3'
        fields: non_indexed_date

  - match: {fields.non_indexed_date.date.searchable: true}

---
"Get object and nested field caps":

Original file line number Diff line number Diff line change
@@ -10,6 +10,10 @@ setup:
created_at:
type: date
format: "yyyy-MM-dd"
created_at_not_indexed:
type: date
index: false
format: "yyyy-MM-dd"
- do:
indices.create:
index: index_2
@@ -21,6 +25,10 @@ setup:
created_at:
type: date_nanos
format: "yyyy-MM-dd"
created_at_not_indexed:
type: date
index: false
format: "yyyy-MM-dd"
- do:
indices.create:
index: index_3
@@ -32,6 +40,10 @@ setup:
created_at:
type: date
format: "yyyy-MM-dd"
created_at_not_indexed:
type: date
index: false
format: "yyyy-MM-dd"


---
@@ -222,3 +234,53 @@ setup:
- length: { hits.hits: 1 }
- match: {hits.hits.0._id: "3" }
- length: { aggregations.idx_terms.buckets: 3 }

---
"prefilter on non-indexed date fields":
  - skip:
      version: "- 8.0.99"
      reason: "doc values search was added in 8.1.0"

  - do:
      index:
        index: index_1
        id: 1
        body: { "created_at_not_indexed": "2016-01-01"}
  - do:
      index:
        index: index_2
        id: 2
        body: { "created_at_not_indexed": "2017-01-01" }

  - do:
      index:
        index: index_3
        id: 3
        body: { "created_at_not_indexed": "2018-01-01" }
  - do:
      indices.refresh: {}


  - do:
      search:
        rest_total_hits_as_int: true
        body: { "size" : 0, "query" : { "range" : { "created_at_not_indexed" : { "gte" : "2016-02-01", "lt": "2018-02-01"} } } }

  - match: { _shards.total: 3 }
  - match: { _shards.successful: 3 }
  - match: { _shards.skipped: 0 }
  - match: { _shards.failed: 0 }
  - match: { hits.total: 2 }

  # this is a case where we would normally skip due to rewrite but we can't because we only have doc values
  - do:
      search:
        rest_total_hits_as_int: true
        pre_filter_shard_size: 1
        body: { "size" : 0, "query" : { "range" : { "created_at_not_indexed" : { "gte" : "2016-02-01", "lt": "2018-02-01"} } } }

  - match: { _shards.total: 3 }
  - match: { _shards.successful: 3 }
  - match: { _shards.skipped : 0 }
  - match: { _shards.failed: 0 }
  - match: { hits.total: 2 }
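
Reviewer note: the test above expects `_shards.skipped: 0` even with `pre_filter_shard_size: 1`, because a date field that only has doc values cannot cheaply report its minimum and maximum value per shard. The following is a minimal sketch of that decision in plain Java, not Elasticsearch code. The names `PrefilterSketch`, `ShardField`, `relation`, `skippedShards` and `threeShards` are hypothetical, and dates are encoded as `yyyyMMdd` numbers purely for readability.

```java
import java.util.List;

// Hypothetical model of the shard pre-filter decision; none of these names are Elasticsearch APIs.
final class PrefilterSketch {
    enum Relation { WITHIN, INTERSECTS, DISJOINT }

    /** One shard's date field; min/max stand in for the bounds a point index could report. */
    static final class ShardField {
        final boolean indexed;
        final boolean docValues;
        final long min;
        final long max;

        ShardField(boolean indexed, boolean docValues, long min, long max) {
            this.indexed = indexed;
            this.docValues = docValues;
            this.min = min;
            this.max = max;
        }
    }

    /** Relation between a shard's values and the half-open query range [gte, lt). */
    static Relation relation(ShardField f, long gte, long lt) {
        if (!f.indexed && f.docValues) {
            // Doc values only: no cheap min/max lookup, so assume the shard may match.
            return Relation.INTERSECTS;
        }
        if (f.max < gte || f.min >= lt) {
            return Relation.DISJOINT;
        }
        if (f.min >= gte && f.max < lt) {
            return Relation.WITHIN;
        }
        return Relation.INTERSECTS;
    }

    /** Only shards proven DISJOINT can be skipped before the query phase. */
    static long skippedShards(List<ShardField> shards, long gte, long lt) {
        return shards.stream().filter(s -> relation(s, gte, lt) == Relation.DISJOINT).count();
    }

    /** Three single-document shards holding 2016-01-01, 2017-01-01 and 2018-01-01. */
    static List<ShardField> threeShards(boolean indexed) {
        return List.of(
            new ShardField(indexed, true, 20160101L, 20160101L),
            new ShardField(indexed, true, 20170101L, 20170101L),
            new ShardField(indexed, true, 20180101L, 20180101L)
        );
    }

    public static void main(String[] args) {
        long gte = 20160201L;
        long lt = 20180201L;
        System.out.println("indexed: skipped=" + skippedShards(threeShards(true), gte, lt));
        System.out.println("doc values only: skipped=" + skippedShards(threeShards(false), gte, lt));
    }
}
```

In this model the indexed variant can rule out the 2016-01-01 shard up front, while the doc-values-only variant has to visit all three shards and lets the range query itself filter out the non-matching document.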
Original file line number Diff line number Diff line change
@@ -32,6 +32,10 @@ setup:
short:
type: short
index: false
date:
type: date
format: yyyy/MM/dd
index: false

- do:
index:
@@ -45,6 +49,7 @@ setup:
integer: 1
long: 1
short: 1
date: "2017/01/01"

- do:
index:
@@ -58,6 +63,7 @@ setup:
integer: 2
long: 2
short: 2
date: "2017/01/02"

- do:
indices.refresh: {}
@@ -196,3 +202,21 @@ setup:
index: test
body: { query: { range: { short: { gte: 0 } } } }
- length: { hits.hits: 2 }

---
"Test match query on date field where only doc values are enabled":

  - do:
      search:
        index: test
        body: { query: { match: { date: { query: "2017/01/01" } } } }
  - length: { hits.hits: 1 }

---
"Test range query on date field where only doc values are enabled":

  - do:
      search:
        index: test
        body: { query: { range: { date: { gte: "2017/01/01" } } } }
  - length: { hits.hits: 2 }
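
Reviewer note: in this PR `termQuery` delegates to `rangeQuery(value, value, true, true, ...)`, so a match on a date value is evaluated as an inclusive range over doc values. The sketch below illustrates that idea with `java.time` only. It assumes that a day-precision value such as `2017/01/01` expands to the whole UTC day, by analogy with the `instant, instant + 999` expectation in the unit test further down; the class and method names (`DateTermAsRangeSketch`, `dayBounds`, `countMatches`) are hypothetical and this is not how Elasticsearch parses dates internally.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration of "term query as inclusive range"; not Elasticsearch code.
final class DateTermAsRangeSketch {
    private static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("yyyy/MM/dd");

    /** Returns {lower, upper} epoch-millisecond bounds, both inclusive, covering the given UTC day. */
    static long[] dayBounds(String text) {
        LocalDate day = LocalDate.parse(text, FORMAT);
        long lower = day.atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
        long upper = day.plusDays(1).atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli() - 1;
        return new long[] { lower, upper };
    }

    /** Linear scan over per-document values, which is how a doc-values-only query has to work. */
    static int countMatches(long[] docValues, long lower, long upper) {
        int hits = 0;
        for (long value : docValues) {
            if (value >= lower && value <= upper) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // The two documents indexed in the test setup above.
        long[] docs = { dayBounds("2017/01/01")[0], dayBounds("2017/01/02")[0] };

        long[] term = dayBounds("2017/01/01");
        System.out.println("match hits: " + countMatches(docs, term[0], term[1]));
        System.out.println("range hits: " + countMatches(docs, term[0], Long.MAX_VALUE));
    }
}
```

The linear scan in `countMatches` is the reason the documentation change lists these queries among those that "need to do linear scans to identify matches".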
Original file line number Diff line number Diff line change
@@ -367,7 +367,7 @@ public static final class DateFieldType extends MappedFieldType {

public DateFieldType(
String name,
boolean isSearchable,
boolean isIndexed,
boolean isStored,
boolean hasDocValues,
DateFormatter dateTimeFormatter,
@@ -376,7 +376,7 @@ public DateFieldType(
FieldValues<Long> scriptValues,
Map<String, String> meta
) {
super(name, isSearchable, isStored, hasDocValues, TextSearchInfo.SIMPLE_MATCH_WITHOUT_TERMS, meta);
super(name, isIndexed, isStored, hasDocValues, TextSearchInfo.SIMPLE_MATCH_WITHOUT_TERMS, meta);
this.dateTimeFormatter = dateTimeFormatter;
this.dateMathParser = dateTimeFormatter.toDateMathParser();
this.resolution = resolution;
@@ -388,6 +388,10 @@ public DateFieldType(String name) {
this(name, true, false, true, DEFAULT_DATE_TIME_FORMATTER, Resolution.MILLISECONDS, null, null, Collections.emptyMap());
}

public DateFieldType(String name, boolean isIndexed) {
this(name, isIndexed, false, true, DEFAULT_DATE_TIME_FORMATTER, Resolution.MILLISECONDS, null, null, Collections.emptyMap());
}

public DateFieldType(String name, DateFormatter dateFormatter) {
this(name, true, false, true, dateFormatter, Resolution.MILLISECONDS, null, null, Collections.emptyMap());
}
@@ -464,6 +468,11 @@ private String format(long timestamp, DateFormatter formatter) {
return formatter.format(dateTime);
}

@Override
public boolean isSearchable() {
return isIndexed() || hasDocValues();
}

@Override
public Query termQuery(Object value, @Nullable SearchExecutionContext context) {
return rangeQuery(value, value, true, true, ShapeRelation.INTERSECTS, null, null, context);
@@ -480,7 +489,7 @@ public Query rangeQuery(
@Nullable DateMathParser forcedDateParser,
SearchExecutionContext context
) {
failIfNotIndexed();
failIfNotIndexedNorDocValuesFallback(context);
if (relation == ShapeRelation.DISJOINT) {
throw new IllegalArgumentException("Field [" + name() + "] of type [" + typeName() + "] does not support DISJOINT ranges");
}
@@ -496,14 +505,18 @@ public Query rangeQuery(
parser = forcedDateParser;
}
return dateRangeQuery(lowerTerm, upperTerm, includeLower, includeUpper, timeZone, parser, context, resolution, (l, u) -> {
Query query = LongPoint.newRangeQuery(name(), l, u);
if (hasDocValues()) {
Query dvQuery = SortedNumericDocValuesField.newSlowRangeQuery(name(), l, u);
query = new IndexOrDocValuesQuery(query, dvQuery);

if (context.indexSortedOnField(name())) {
query = new IndexSortSortedNumericDocValuesRangeQuery(name(), l, u, query);
Query query;
if (isIndexed()) {
query = LongPoint.newRangeQuery(name(), l, u);
if (hasDocValues()) {
Query dvQuery = SortedNumericDocValuesField.newSlowRangeQuery(name(), l, u);
query = new IndexOrDocValuesQuery(query, dvQuery);
}
} else {
query = SortedNumericDocValuesField.newSlowRangeQuery(name(), l, u);
}
if (hasDocValues() && context.indexSortedOnField(name())) {
query = new IndexSortSortedNumericDocValuesRangeQuery(name(), l, u, query);
}
return query;
});
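
Reviewer note: the hunk above changes which query is built depending on whether the field is indexed, has doc values, or both. The following self-contained sketch mirrors that branching with strings in place of Lucene `Query` objects so it can run without any dependencies. `RangeQueryPlanSketch`, `plan`, and the label strings are hypothetical names chosen for illustration; the error message is the one asserted in the unit tests of this PR.

```java
// Hypothetical, dependency-free model of the branching in DateFieldType#rangeQuery after this PR.
final class RangeQueryPlanSketch {

    /** Mirrors the new isSearchable() override: an index or doc values is enough. */
    static boolean isSearchable(boolean indexed, boolean docValues) {
        return indexed || docValues;
    }

    /** Returns a textual description of the query that would be built for the given field flags. */
    static String plan(boolean indexed, boolean docValues, boolean indexSortedOnField) {
        if (!indexed && !docValues) {
            throw new IllegalArgumentException(
                "Cannot search on field [field] since it is not indexed nor has doc values."
            );
        }
        String query;
        if (indexed) {
            query = "PointRange";
            if (docValues) {
                // Either structure can answer the query; the cheaper one is chosen at execution time.
                query = "IndexOrDocValues(" + query + ", SlowDocValuesRange)";
            }
        } else {
            // Doc values only: fall back to the slow, scan-based range query.
            query = "SlowDocValuesRange";
        }
        if (docValues && indexSortedOnField) {
            // The index-sort optimisation needs doc values but no longer requires the point index.
            query = "IndexSortRange(" + query + ")";
        }
        return query;
    }

    public static void main(String[] args) {
        System.out.println(plan(true, true, false));
        System.out.println(plan(true, false, false));
        System.out.println(plan(false, true, false));
        System.out.println(plan(false, true, true));
    }
}
```

Note the behavioural change this makes visible: before the PR the index-sort wrapper was only applied inside the `hasDocValues()` branch of an indexed field, whereas now a doc-values-only field on a sorted index also gets it, as `testRangeQueryWithIndexSort` checks.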
@@ -593,6 +606,10 @@ public Relation isFieldWithinQuery(
DateMathParser dateParser,
QueryRewriteContext context
) throws IOException {
if (isIndexed() == false && hasDocValues()) {
// we don't have a quick way to run this check on doc values, so fall back to default assuming we are within bounds
return Relation.INTERSECTS;
}
byte[] minPackedValue = PointValues.getMinPackedValue(reader, name());
if (minPackedValue == null) {
// no points, so nothing matches
Original file line number Diff line number Diff line change
@@ -2025,6 +2025,9 @@ public ShardLongFieldRange getTimestampRange() {
if (mappedFieldType instanceof DateFieldMapper.DateFieldType == false) {
return ShardLongFieldRange.UNKNOWN; // field missing or not a date
}
if (mappedFieldType.isIndexed() == false) {
return ShardLongFieldRange.UNKNOWN; // range information missing
}

final ShardLongFieldRange rawTimestampFieldRange;
try {
Original file line number Diff line number Diff line change
@@ -63,8 +63,19 @@ public void testIsFieldWithinRangeEmptyReader() throws IOException {
);
}

public void testIsFieldWithinRangeOnlyDocValues() throws IOException {
QueryRewriteContext context = new QueryRewriteContext(parserConfig(), writableRegistry(), null, () -> nowInMillis);
IndexReader reader = new MultiReader();
DateFieldType ft = new DateFieldType("my_date", false);
// in case of only doc-values, we can't establish disjointness
assertEquals(
Relation.INTERSECTS,
ft.isFieldWithinQuery(reader, "2015-10-12", "2016-04-03", randomBoolean(), randomBoolean(), null, null, context)
);
}

public void testIsFieldWithinQueryDateMillis() throws IOException {
DateFieldType ft = new DateFieldType("my_date", Resolution.MILLISECONDS);
DateFieldType ft = new DateFieldType("my_date");
isFieldWithinRangeTestCase(ft);
}

@@ -192,19 +203,23 @@ public void testTermQuery() {
);
assertEquals(expected, ft.termQuery(date, context));

ft = new DateFieldType("field", false);
expected = SortedNumericDocValuesField.newSlowRangeQuery("field", instant, instant + 999);
assertEquals(expected, ft.termQuery(date, context));

MappedFieldType unsearchable = new DateFieldType(
"field",
false,
false,
true,
false,
DateFieldMapper.DEFAULT_DATE_TIME_FORMATTER,
Resolution.MILLISECONDS,
null,
null,
Collections.emptyMap()
);
IllegalArgumentException e = expectThrows(IllegalArgumentException.class, () -> unsearchable.termQuery(date, context));
assertEquals("Cannot search on field [field] since it is not indexed.", e.getMessage());
assertEquals("Cannot search on field [field] since it is not indexed nor has doc values.", e.getMessage());
}

public void testRangeQuery() throws IOException {
@@ -245,6 +260,10 @@ public void testRangeQuery() throws IOException {
);
assertEquals(expected, ft.rangeQuery(date1, date2, true, true, null, null, null, context).rewrite(new MultiReader()));

MappedFieldType ft2 = new DateFieldType("field", false);
Query expected2 = SortedNumericDocValuesField.newSlowRangeQuery("field", instant1, instant2);
assertEquals(expected2, ft2.rangeQuery(date1, date2, true, true, null, null, null, context).rewrite(new MultiReader()));

instant1 = nowInMillis;
instant2 = instant1 + 100;
expected = new DateRangeIncludingNowQuery(
@@ -255,11 +274,14 @@ public void testRangeQuery() throws IOException {
);
assertEquals(expected, ft.rangeQuery("now", instant2, true, true, null, null, null, context));

expected2 = new DateRangeIncludingNowQuery(SortedNumericDocValuesField.newSlowRangeQuery("field", instant1, instant2));
assertEquals(expected2, ft2.rangeQuery("now", instant2, true, true, null, null, null, context));

MappedFieldType unsearchable = new DateFieldType(
"field",
false,
false,
true,
false,
DateFieldMapper.DEFAULT_DATE_TIME_FORMATTER,
Resolution.MILLISECONDS,
null,
@@ -270,7 +292,7 @@ public void testRangeQuery() throws IOException {
IllegalArgumentException.class,
() -> unsearchable.rangeQuery(date1, date2, true, true, null, null, null, context)
);
assertEquals("Cannot search on field [field] since it is not indexed.", e.getMessage());
assertEquals("Cannot search on field [field] since it is not indexed nor has doc values.", e.getMessage());
}

public void testRangeQueryWithIndexSort() {
@@ -321,6 +343,10 @@ public void testRangeQueryWithIndexSort() {
new IndexOrDocValuesQuery(pointQuery, dvQuery)
);
assertEquals(expected, ft.rangeQuery(date1, date2, true, true, null, null, null, context));

ft = new DateFieldType("field", false);
expected = new IndexSortSortedNumericDocValuesRangeQuery("field", instant1, instant2, dvQuery);
assertEquals(expected, ft.rangeQuery(date1, date2, true, true, null, null, null, context));
}

public void testDateNanoDocValues() throws IOException {