Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make date parsing more flexible for linedocsfile (europarl, enwiki) #13075

Merged
merged 4 commits into from
Feb 5, 2024

Conversation

dweiss
Copy link
Contributor

@dweiss dweiss commented Feb 5, 2024

Fixes #13073.

@dweiss dweiss self-assigned this Feb 5, 2024
@dweiss
Copy link
Contributor Author

dweiss commented Feb 5, 2024

This makes date parsing accept both europarl and enwiki. The tests still assume the docs come from europarl though and fail on the large enwiki.random.lines.txt:

gradlew -p lucene\backward-codecs test -Dtests.linedocsfile=/../enwiki.random.lines.txt

…another plain term query right below it (body:the) so I don't think it hurts much. apache#13073
@dweiss dweiss marked this pull request as ready for review February 5, 2024 10:36
@dweiss dweiss requested a review from s1monw February 5, 2024 10:36
Copy link
Contributor

@uschindler uschindler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine. Why did you create a Function<String,LocalDateTime> for parsing instead of a simple static method? I think you did this to hide the formatter instances?

@uschindler
Copy link
Contributor

P.S.: Maybe add a quick test for the parsing logic to check that both formats are accepted.

@dweiss
Copy link
Contributor Author

dweiss commented Feb 5, 2024

Looks fine. Why did you create a Function<String,LocalDateTime> for parsing instead of a simple static method? I think you did this to hide the formatter instances?

Yes, correct.

@dweiss
Copy link
Contributor Author

dweiss commented Feb 5, 2024

P.S.: Maybe add a quick test for the parsing logic to check that both formats are accepted.

Maybe it'd be better to move this logic into LineFileDocs so that the date field's value is normalized somehow? Then we can add TestLineFileDocs and test it there. Adding a test method to TestIndexSortBackwardsCompatibility will multiply it over all the parameters, which doesn't make sense.

@dweiss
Copy link
Contributor Author

dweiss commented Feb 5, 2024

I've moved that utility function to LineFileDocs and added a basic test case to TestLineFileDocs (good idea). I feel tempted to normalize the date field's value in LineFileDocs but this may break other stuff so better to leave it for a follow-up issue.

@dweiss
Copy link
Contributor Author

dweiss commented Feb 5, 2024

I'll merge this in so that we can avoid jenkins failures. If there has to be a follow-up, I'll open another issue.

@dweiss dweiss merged commit 635d090 into apache:main Feb 5, 2024
4 checks passed
@s1monw
Copy link
Member

s1monw commented Feb 12, 2024

LGTM thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

java.lang.AssertionError in backward compat tests ("failed to parse ... as date")
3 participants