Possible focus point query improvements #1205

Open
orangejulius opened this issue Oct 11, 2018 · 0 comments

Summary

Today we calculate focus point bias by adding the score for text match quality to a score for distance from the focus point. Because of the way Elasticsearch calculates scores, these two values are often of completely different magnitudes. This can result in one of two problems:

  • A poor text match that is slightly closer to the focus point is shown before a much better text match.
  • A result that is much farther from the focus point but is a slightly better text match is shown first.
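
As a rough illustration of both failure modes, the sketch below sums a text score and a distance score for two candidates. Every number is invented for illustration and is not taken from real Elasticsearch output; `additive_score` and `best` are hypothetical helpers that only mimic the summing behavior described above.

```python
def additive_score(text_score, distance_score):
    """Mimic the current behavior: the two scores are simply summed."""
    return text_score + distance_score


def best(candidates):
    """Return the name of the highest-scoring (name, text, distance) candidate."""
    return max(candidates, key=lambda c: additive_score(c[1], c[2]))[0]


# Case 1: distance scores are much larger than text scores (made-up values).
# A 3x better text match loses to a result that is only slightly closer.
distance_dominates = [
    ("good text match, slightly farther", 3.0, 85.0),
    ("poor text match, slightly closer", 1.0, 90.0),
]

# Case 2: text scores are much larger than distance scores (made-up values).
# A far-away result wins on a marginally better text match.
text_dominates = [
    ("slightly better text match, far away", 300.0, 0.1),
    ("slightly worse text match, nearby", 290.0, 0.9),
]

print(best(distance_dominates))
print(best(text_dominates))
```

In both cases the larger-magnitude score decides the ordering almost on its own, which is the behavior described in the two bullets above.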

Details

A rough outline of our current autocomplete query structure when using a focus point is as follows:

{
  "bool": {
    "must": {
      // "multiple queries of text match logic"
    },
    "should": [{
      "function_score": {
        "query": {
          // "ONLY ONE of the potentially multiple queries for text matching"
        },
        "functions": [{
          // "`center_point` query to handle focus point"
        }],
        "boost_mode": "replace"
      }
    }, {
      // "other should queries for text matching, popularity, etc"
    }]
  }
}

Comments represent placeholders for complicated query logic.

Elasticsearch offers guidance that scores from different queries generally cannot be compared.

Although the clauses are composed into a single query, for the purposes of scoring each sub-clause is effectively its own query, and the resulting scores are then added together. I believe this leads to incorrect results.

Potential solution

Instead, I think it would be valuable (and much simpler) if the function_score query wrapped all the text matching query clauses. For example, something like this:

{
  "function_score": {
    "query": {
      "bool": {
        "must": [{
          // "any required text matching queries"
        }],
        "should": [{
          // "any optional queries such as text matching, popularity boosts, etc"
        }]
      }
    },
    "functions": [{
      // "`center_point` query to handle focus point"
    }],
    "boost_mode": "multiply"
  }
}

The primary change is that the focus point function score has been brought to the top level, and the boost_mode has been changed to multiply.
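
To make the proposal a little more concrete, here is a minimal sketch of what such a query body could look like, written as a Python dict. The `build_focus_query` and `multiplied_rank` helpers, the `exp` decay function, the `scale` and `decay` values, and the `match` clause are all assumptions for illustration; the real query would use whatever text matching clauses and decay settings the API already has. The second helper uses made-up scores to show one property of `multiply`: rescaling every text score by the same factor does not change the ordering.

```python
def build_focus_query(must, should, lat, lon, scale="50km"):
    """Wrap all text-matching clauses in one top-level function_score query."""
    return {
        "function_score": {
            "query": {"bool": {"must": must, "should": should}},
            "functions": [
                # Exponential decay on distance from the focus point.
                # The field name, scale and decay values are assumptions.
                {"exp": {"center_point": {
                    "origin": {"lat": lat, "lon": lon},
                    "scale": scale,
                    "decay": 0.5,
                }}}
            ],
            # Final score = text score * decay value.
            "boost_mode": "multiply",
        }
    }


def multiplied_rank(candidates, text_scale=1.0):
    """Order (name, text_score, decay) candidates by text_score * decay."""
    ranked = sorted(candidates, key=lambda c: c[1] * text_scale * c[2], reverse=True)
    return [name for name, _, _ in ranked]


query = build_focus_query(
    must=[{"match": {"name.default": "union square"}}],  # placeholder clause
    should=[],
    lat=40.7359,
    lon=-73.9911,
)

# Made-up scores: rescaling every text score leaves the ordering unchanged.
candidates = [("a", 3.0, 0.85), ("b", 1.0, 0.90), ("c", 2.0, 0.40)]
assert multiplied_rank(candidates) == multiplied_rank(candidates, text_scale=100.0)
```

This only shows that the ordering no longer depends on the absolute magnitude of the text scores. Whether the resulting balance between text quality and distance is the right one would still need to be checked against real queries.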

Examples

At the moment, I don't have any "off the shelf" examples that I believe come down entirely to this issue, but I will update the issue if I find any. When I've observed this, it has been during development while tweaking other parameters, so it has been hard to reproduce.

I think pelias/pelias#862 and pelias/pelias#849 are currently masking cases where this will become an issue once they are solved. We should probably look at those first.

orangejulius added a commit to pelias/acceptance-tests that referenced this issue May 14, 2019
They all seem to be connected to popularity or focus points.

Perhaps related to pelias/api#1205
@orangejulius orangejulius changed the title Possible autocomplete focus point query improvements Possible focus point query improvements May 4, 2020