Added section for painless scripting #381

VijayanB · 2020-12-20T01:39:34Z

Define the specialized APIs that are allowed in scripting.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

VijayanB · 2020-12-20T01:41:15Z

This will be merged as part of odfe 1.13.0.
Corresponding feature PR: opendistro-for-elasticsearch/k-NN#281

vamshin · 2020-12-23T06:52:39Z

docs/knn/index.md

+## Use Similarity methods in painless scripting
+
+Sometimes users would like to go beyond Elasticsearch’s built-in features for scoring and might want to customize the search scores in more complex ways.
+Elasticsearch provides script_score, the ability to provide custom scores for returned documents.


Elasticsearch provides script_score, the ability to provide custom scores for desired documents. ?

vamshin · 2020-12-23T06:59:16Z

docs/knn/index.md

+}
+
+```
+Since l2Squared function is a distance function, unlike cosine function, we need to reverse the output.


How about we add this line instead:

"The smaller the distance, the more relevant the document is to the query vector. To bring documents with smaller distances to the top of the results, we invert the distance returned by the l2Squared function."

vamshin

LGTM!
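For anyone following the inversion discussed above, here is a minimal Python sketch of the idea. It is not code from the PR: the helper names `l2_squared` and `l2_score` are hypothetical, and the `1 / (1 + distance)` form is borrowed from the script quoted later in this thread.

```python
def l2_squared(query, doc):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((q - d) ** 2 for q, d in zip(query, doc))


def l2_score(query, doc):
    """Invert the distance so that closer vectors score higher.

    The 1 in the denominator keeps the division defined when the
    document vector equals the query vector (distance 0).
    """
    return 1.0 / (1.0 + l2_squared(query, doc))


query = [1.0, 2.0]
docs = {
    "exact": [1.0, 2.0],  # distance 0
    "near": [1.0, 3.0],   # distance 1
    "far": [4.0, 6.0],    # distance 25
}

scores = {name: l2_score(query, vec) for name, vec in docs.items()}
# Highest score first, which is the same as smallest distance first.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Sorting by the inverted value puts the exact match first and the farthest vector last, which is the ordering the suggested sentence describes.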

VijayanB · 2021-01-04T18:26:24Z

@ashwinkumar12345 @aetter Can you please review? This feature will be released as part of odfe 1.13.0. Thanks.

aetter

Overall looks really good, but could use some light cleanup.

aetter · 2021-01-04T22:14:07Z

docs/knn/index.md

@@ -246,3 +246,98 @@ All parameters are required.
 The standard KNN query and custom scoring option perform differently. Test using a representative set of documents to see if the search results and latencies match your expectations.

 Custom scoring works best if the initial filter reduces the number of documents to no more than 20,000. Increasing shard count can improve latencies, but be sure to keep shard size within [the recommended guidelines](../elasticsearch/#primary-and-replica-shards).
+
+
+## Use Similarity methods in painless scripting


Use similarity functions in Painless scripts

From the Painless docs, it looks like they are functions rather than methods. We just want to choose the right term and be consistent.

aetter · 2021-01-04T22:41:48Z

docs/knn/index.md

+
+### Cosine Similarity
+This function calculates the measure of cosine similarity between a given query vector and document vectors.
+Optionally accepts normQueryVector, to avoid repeated calculation of normalization for query vector for every filtered documents.  


Optionally accepts normQueryVector to avoid...

Can you be more specific about why you'd add this parameter and maybe some sample values/outcomes?
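To illustrate what such an optional parameter could save, here is a rough Python sketch. It assumes normQueryVector is the magnitude (L2 norm) of the query vector, which the diff excerpt does not state outright; `magnitude`, `cosine_similarity`, and `norm_query` are hypothetical names, not the plugin's API.

```python
import math


def magnitude(vec):
    """L2 norm of a vector."""
    return math.sqrt(sum(x * x for x in vec))


def cosine_similarity(query, doc, norm_query=None):
    """Cosine similarity; norm_query stands in for normQueryVector (assumed)."""
    if norm_query is None:
        # Without the optional argument, the query norm is recomputed
        # for every document that the filter lets through.
        norm_query = magnitude(query)
    dot = sum(q * d for q, d in zip(query, doc))
    return dot / (norm_query * magnitude(doc))


query = [3.0, 4.0]
docs = [[3.0, 4.0], [4.0, 3.0], [0.0, 1.0]]

# Compute the query norm once and reuse it for every document.
norm_query = magnitude(query)
with_norm = [cosine_similarity(query, d, norm_query) for d in docs]
without_norm = [cosine_similarity(query, d) for d in docs]
print(with_norm)
```

Both calls return the same similarities; the only difference is how many times the query norm is computed, which matters more as the number of filtered documents grows.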

aetter · 2021-01-04T22:42:54Z

docs/knn/index.md

+  }
+}
+```
+The above script adds 1.0 to the cosine similarity to keep score positive.


Ditto here. What's the range of values, and why is the 1.0 necessary? An example or two would likely help.
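As a rough illustration of the range question, the Python sketch below computes cosine similarity for three simple cases. It is a hypothetical stand-in, not the plugin's implementation, and it assumes the 1.0 offset exists because Elasticsearch does not accept negative scores from a script.

```python
import math


def cosine_similarity(query, doc):
    """Plain cosine similarity; the result lies in [-1, 1]."""
    dot = sum(q * d for q, d in zip(query, doc))
    norm_q = math.sqrt(sum(q * q for q in query))
    norm_d = math.sqrt(sum(d * d for d in doc))
    return dot / (norm_q * norm_d)


query = [1.0, 0.0]

# Adding 1.0 shifts the range from [-1, 1] to [0, 2].
same = 1.0 + cosine_similarity(query, [2.0, 0.0])         # same direction
orthogonal = 1.0 + cosine_similarity(query, [0.0, 3.0])   # perpendicular
opposite = 1.0 + cosine_similarity(query, [-1.0, 0.0])    # opposite direction

print(same, orthogonal, opposite)
```

Without the offset, the opposite-direction case would yield -1.0, so the shift keeps every score at or above zero while preserving the ordering.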

aetter · 2021-01-04T22:55:05Z

docs/knn/index.md

+Also, when a document vector matches the query vector, we needed to add 1 in the denominator to avoid divide by zero error.
+
+####Constraints
+1. If a document’s knn vector field has different dimensions from the query, an error(IllegalArgumentException) will be thrown.


If a document’s knn_vector field has different dimensions than the query, the function throws an IllegalArgumentException.

If a vector field doesn't have a value, the function throws an IllegalStateException.

You can avoid this situation by first checking if a document has a value for the field:

"source": "doc[params.field].size() == 0 ? 0 : 1 / (1 + l2Squared(params.query_value, doc[params.field]))",

Since scores can only be positive, this script ranks documents with vector fields higher than those without.
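The guard can be sketched outside Painless as well. The Python below mirrors the ternary in the script above under stated assumptions: an empty list stands in for a document with no value in the vector field, `ValueError` stands in for the IllegalArgumentException, and `guarded_score` is a hypothetical name.

```python
def l2_squared(query, doc):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((q - d) ** 2 for q, d in zip(query, doc))


def guarded_score(query, doc_vector):
    """Mirror of: size() == 0 ? 0 : 1 / (1 + l2Squared(query, doc))."""
    if len(doc_vector) == 0:
        # No value in the field: return the lowest allowed score
        # instead of letting the distance function fail.
        return 0.0
    if len(doc_vector) != len(query):
        # Stand-in for the dimension-mismatch error described above.
        raise ValueError("query and document vectors differ in dimension")
    return 1.0 / (1.0 + l2_squared(query, doc_vector))


query = [1.0, 2.0]
print(guarded_score(query, []))          # document without a vector
print(guarded_score(query, [1.0, 2.0]))  # document matching the query
```

Because any document with a vector scores above zero, documents that lack the field sort to the bottom rather than failing the whole query.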

VijayanB · 2021-01-15T21:19:35Z

@aetter can you take another look at this? Thanks.

aetter

Still a few minor formatting concerns, but we'll just tweak it post-merge. LGTM.

vamshin reviewed Dec 23, 2020

vamshin approved these changes Dec 23, 2020

aetter added the "upcoming release" label (Don't merge until the version or feature is available) Jan 4, 2021

aetter reviewed Jan 4, 2021

VijayanB added 3 commits January 11, 2021 17:02

Added section for painless scripting

6f8fa38

Define specialized api that are allowed to be used in scripting.

Update constraints

244ba84

Add resolution for constraint to avoid error.

Fixed review comments

8de6f7c

VijayanB force-pushed the scoring-methods branch from 93014c2 to 274e0b3 Compare January 12, 2021 01:02

VijayanB requested a review from aetter January 12, 2021 01:03

VijayanB force-pushed the scoring-methods branch from 274e0b3 to 56aad37 Compare January 12, 2021 01:04

Fixed review comments

3fbd358

VijayanB force-pushed the scoring-methods branch from 56aad37 to 3fbd358 Compare January 12, 2021 01:07

aetter approved these changes Jan 15, 2021

jmazanec15 mentioned this pull request Feb 8, 2021

Refactor k-NN documentation #396

Merged

aetter closed this Feb 15, 2021
