Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sort on multi-value unsigned long field is incorrect #16698

Open
bugmakerrrrrr opened this issue Nov 21, 2024 · 5 comments · May be fixed by #16732
Open

Sort on multi-value unsigned long field is incorrect #16698

bugmakerrrrrr opened this issue Nov 21, 2024 · 5 comments · May be fixed by #16732
Labels
bug Something isn't working Search Search query, autocomplete ...etc

Comments

@bugmakerrrrrr
Copy link
Contributor

Describe the bug

Recently, I worked on supporting the reuse of doc values to reconstruct the source field and avoid storing it. Then, I found that the number fields are ordered when fetching from sorted numeric doc values except the unsigned long field. This is because Lucene treats it as a signed number. IMO, this is not a problem in many situations, but if an unsigned long field has multiple values per doc, the sort result may be wrong, because in the current implementation, the sort mode uses signed long to pick the value.

Related component

Search

To Reproduce

  1. create index
{
  "mappings": {
    "properties": {
      "field1": {
        "type": "unsigned_long"
      }
    }
  }
}
  1. index some docs
{"index": {"_id": 1}}
{"field1": [13835058055282163712, 3]}
{"index": {"_id": 2}}
{"field1": [13835058055282163713, 2]}
{"index": {"_id": 3}}
{"field1": [13835058055282163714, 1]}
  1. search with sort
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "field1": {
        "order": "desc",
        "mode": "max"
      }
    }
  ]
}

Expected behavior

The sort result should be correct.

Possible solution: create a dedicated UnsignedLongMultiValueMode for unsigned long, then convert the normal mode to unsigned long mode and use the converted mode to pick the sort value in UnsignedLongValuesComparatorSource.

Additional Details

Plugins
Please list all plugins currently enabled.

Screenshots
If applicable, add screenshots to help explain your problem.

Host/Environment (please complete the following information):

  • OS: [e.g. iOS]
  • Version [e.g. 22]

Additional context
Add any other context about the problem here.

@bugmakerrrrrr bugmakerrrrrr added bug Something isn't working untriaged labels Nov 21, 2024
@github-actions github-actions bot added the Search Search query, autocomplete ...etc label Nov 21, 2024
@bugmakerrrrrr
Copy link
Contributor Author

@reta You may be interested in this. Any thoughts?

@bugmakerrrrrr
Copy link
Contributor Author

And if we want to make values ordered in one doc, maybe we can flip the sign bit before sending the value to Lucene, and re-flip the sign bit when reading from sorted numeric doc value, this can benefit max & min multi-value mode. But this is a breaking change, we can do it according to the index version.

@reta
Copy link
Collaborator

reta commented Nov 21, 2024

Thanks @bugmakerrrrrr , I should be able to get into it next week, unless you want to pick it up? :-)

@bugmakerrrrrr
Copy link
Contributor Author

@reta I'd be happy to fix this, but I'm wondering if the above solution would work for you.

@bugmakerrrrrr bugmakerrrrrr linked a pull request Nov 27, 2024 that will close this issue
3 tasks
@reta
Copy link
Collaborator

reta commented Nov 27, 2024

@reta I'd be happy to fix this, but I'm wondering if the above solution would work for you.

@bugmakerrrrrr my apologies for not replying, haven't had time to look at it yet but you already have a pull request, let's work on it, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Search Search query, autocomplete ...etc
Projects
Status: 🆕 New
Development

Successfully merging a pull request may close this issue.

3 participants