-
Notifications
You must be signed in to change notification settings - Fork 25.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Factor out sort values from InternalSearchHit #22080
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few minor things, LGTM though.
sortValues = new Object[0]; | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you move writeTo
up here? I like to get reading and writing on the same page if possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++
@Override | ||
public void writeTo(StreamOutput out) throws IOException { | ||
out.writeVInt(sortValues.length); | ||
if (sortValues.length > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this if
is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++
} | ||
} | ||
|
||
public boolean xContentEquals(ToXContentToBytes other) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd prefer to leave this out and do assertEquals(this.toString(), other.toString())
in the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left a few comments but I like it
sortValues = new Object[0]; | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++
assert parser.currentToken() == XContentParser.Token.FIELD_NAME; | ||
assert parser.currentName().equals(Fields.SORT); | ||
XContentParser.Token token = parser.nextToken(); | ||
assert token == XContentParser.Token.START_ARRAY; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall these be exceptions? I don't have a strong opinion but I got this same comment from Tanguy in another PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either way is fine. I will change it to what you did in #22082.
} | ||
|
||
public SearchSortValues(Object[] sortValues, DocValueFormat[] sortValueFormats) { | ||
this.sortValues = Arrays.copyOf(sortValues, sortValues.length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we check that sortValueFormats is not null? also sortValues?
@@ -86,7 +82,7 @@ | |||
|
|||
private Map<String, HighlightField> highlightFields = null; | |||
|
|||
private Object[] sortValues = EMPTY_SORT_VALUES; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like this empty array could be useful, shall we rather have a SearchSortValues.EMPTY default ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can do.
@Override | ||
public void writeTo(StreamOutput out) throws IOException { | ||
out.writeVInt(sortValues.length); | ||
if (sortValues.length > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++
} | ||
} | ||
|
||
public boolean xContentEquals(ToXContentToBytes other) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed
valueSuppliers.add(() -> randomByte()); | ||
valueSuppliers.add(() -> randomShort()); | ||
valueSuppliers.add(() -> randomBoolean()); | ||
valueSuppliers.add(() -> frequently() ? randomAsciiOfLengthBetween(1, 30) : randomRealisticUnicodeOfCodepointLength(30)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
interesting, more elegant than a switch. I may steal this :)
@nik9000 @javanna I added a commit that addresses your last comments. For now I exclude SMILE from the xcontent types being tested because it makes roundtrip testing very hard, even if we only compare the json string representation of the object before and after parsing xContent. The problem is that when SMILE sends float values and we don't explicitely cast them back to floats, they are parsed as |
SearchSortValues sortValues = createTestItem(); | ||
// exclude SMILE: its difficult to compare exact float values after parsing | ||
// they get returned as doubles with a slightly different value if not cast back to float (which we cannot do due to | ||
// losing type information on the rest layer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did quite some digging around these, I did find that the different parsers for SMILE, CBOR and JSON and YAML have a different way of parsing float, either as float, as double, or as double but with double precision. This makes comparisons a bit cumbersome. The issue you were finding with SMILE is that once you parse a float, it is parsed into a double variable, which has double precision. When you print that out it becomes a double, which reparsed it is not a float anymore. The solution isn't to disable SMILE testing though, I resorted to comparing the map representations, essentially by parsing them back into maps. That also solves eventual ordering issue, as json keys ordering doesn't matter. see https://github.com/elastic/elasticsearch/pull/22082/files#diff-3e73f4760da40a9d55345c94edf833c2R108 . I think that we have to move that assertion to some common place, not sure yet where.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed this using the common helpers you pointed me to, left TODOs where we need to remove those once they are moved to a common class.
builder.startObject(); // we need to wrap xContent output in proper object to create a parser for it | ||
builder = sortValues.toXContent(builder, ToXContent.EMPTY_PARAMS); | ||
builder.endObject(); | ||
return builder.string(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we use Strings.toString instead? is pretty printing important here?
builder = sortValues.toXContent(builder, ToXContent.EMPTY_PARAMS); | ||
builder.endObject(); | ||
|
||
XContentParser parser = xcontentType.xContent().createParser(builder.bytes()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in general, we should rather use createParser from ESTestCase (it was added yesterday I think)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
} | ||
parser.nextToken(); | ||
if (parser.currentToken() != XContentParser.Token.START_ARRAY) { | ||
throw new ParsingException(parser.getTokenLocation(), "expected [" + XContentParser.Token.START_ARRAY + "] token"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use the static methods in XContentParserUtils instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
adda43b
to
a0fede6
Compare
@javanna I hope I adressed your last comments, using two (slightly modified) helper methods from your own PR now but leaving a note to remove these once they are available elsewhere. Not sure if we should hold this PR until those are merged. |
public static SearchSortValues fromXContent(XContentParser parser) throws IOException { | ||
XContentParserUtils.ensureFieldName(parser, parser.currentToken(), Fields.SORT); | ||
parser.nextToken(); | ||
XContentParserUtils.ensureType(XContentParser.Token.START_ARRAY, parser.currentToken(), () -> parser.getTokenLocation()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can just do parser.nextToken() as part of this line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about it but decided against it for better readability, I found advancing of the token gets hidden too much otherwise. Matter of taste I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's ok with me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe you could then save the returned token rather than retrieve it back from the parser? super minor nit though, not even sure what it helps
XContentParserUtils.ensureFieldName(parser, parser.currentToken(), Fields.SORT); | ||
parser.nextToken(); | ||
XContentParserUtils.ensureType(XContentParser.Token.START_ARRAY, parser.currentToken(), () -> parser.getTokenLocation()); | ||
return new SearchSortValues(parser.list().toArray()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parser.list also supports nested arrays and objects. can they appear as part of sort values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we may have the same "problem" that we have in get with stored fields, where we print out potentially anything, and when parsing it back we have to figure out what formats we actually support. Note also that parser list supports binary objects, but it returns byte[] for them, which makes it cumbersome to test them, hence I am assuming that part does not get tested at the moment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is somewhat limited here due to what is possible to go through writeTo/readFrom. Here this is essential single values, they should all be covered by the tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, but with serialization we can always read whatever we wrote in the same format. With parsing on the other hand, we can write many different types but we can parse only a few e.g. strings, numbers and not much more. That is why the generic methods hide complexity some times, especially because we end up supporting stuff that may not be needed and is not tested. Maybe it's ok in this case though. Can you double check if byte[] can be a potential value here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had missed that we know exactly which formats are supported thanks to the serialization methods that don't use a generic write/read method. So no byte[], no objects etc. I do wonder if it's cleaner to read the list manually and call parser.objectText() to read each value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some comments, the main point is about formats supported when parsing stuff back, I think we have to figure out exactly what we support and test it rather than support anything which seems to be what we do at the moment.
This adds fromXContent method and unit test for sort values that are part of InternalSearchHit. In order to centralize serialisation and xContent parsing and rendering code, move all relevant parts to a new class which can be unit tested much better in isolation.This is part of the preparation for parsing search responses on the client side.
d5b9cbf
to
668bfcd
Compare
668bfcd
to
6e9a583
Compare
@javanna I updated this using the helper methods introduced in |
it was removed as there were no usages for it. Feel free to add it back if it's going to be useful in other places too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left a small comment, LGTM otherwise
*/ | ||
public static void ensureFieldName(XContentParser parser, Token token, String fieldName) throws IOException { | ||
ensureExpectedToken(Token.FIELD_NAME, token, parser::getTokenLocation); | ||
String current = parser.currentName() != null ? parser.currentName() : "<null>"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does it make sense to compare to something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean? current
is either the current tokens field name or <null>
which I thought serves for printing the error message. I might be missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if the currentName is null we have a bigger problem. why do we do the comparison even? have you tried passing in fieldName set to ? if the current token is a field name, currentName should not be null ever. I would treat that corner case differently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, got it. Makes sense.
This adds fromXContent method and unit test for sort values that are part of InternalSearchHit. In order to centralize serialisation and xContent parsing and rendering code, move all relevant parts to a new class which can be unit tested much better in isolation.This is part of the preparation for parsing search responses on the client side.
On 5.x with 9404411 |
* master: Simplify Unicast Zen Ping (elastic#22277) Replace IndicesQueriesRegistry (elastic#22289) Fixed document mistake and fit for 5.1.1 API [TEST] improve error message in ESTestCase#assertWarnings [TEST] remove deleted test classes from checkstyle suppressions [TEST] make ESSingleNodeTestCase tests repeatable (elastic#22283) Link for setting page in elasticsearch.yml is outdated Factor out sort values from InternalSearchHit (elastic#22080) Add ID for percolate query to Java API docs x_refresh.yaml tests should use unique index names and doc ids to ease debugging IndicesStoreIntegrationIT should not use start recovery sending as an indication that the recovery started Added base class for testing aggregators and some initial tests for `terms`, `top_hits` and `min` aggregations. Add link to foreach processor to ingest-attachment.asciidoc
This adds fromXContent method and unit test for sort values that are part of
InternalSearchHit. In order to centralize serialisation and xContent parsing and
rendering code, move all relevant parts to a new class which can be unit tested
much better in isolation.This is part of the preparation for parsing search
responses on the client side.