Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add reset for vector #237

Merged
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .github/workflows/links.yml
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ jobs:
id: lychee
uses: lycheeverse/lychee-action@master
with:
args: --accept=200,403,429 "**/*.html" "**/*.md" "**/*.txt" "**/*.json"
args: --accept=200,403,429 **/*.html **/*.md **/*.txt **/*.json
env:
GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}
- name: Fail if there were link errors
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -432,6 +432,9 @@ protected void parseCreateField(ParseContext context, int dimension) throws IOEx

vector.add(value);
context.parser().nextToken();
} else if (token == XContentParser.Token.VALUE_NULL) {
context.path().remove();
return;
}

if (dimension != vector.size()) {
Expand Down
24 changes: 24 additions & 0 deletions src/test/java/org/opensearch/knn/KNNRestTestCase.java
Original file line number Diff line number Diff line change
Expand Up @@ -79,6 +79,8 @@
public class KNNRestTestCase extends ODFERestTestCase {
public static final String INDEX_NAME = "test_index";
public static final String FIELD_NAME = "test_field";
private static final String DOCUMENT_FIELD_SOURCE = "_source";
private static final String DOCUMENT_FIELD_FOUND = "found";

@AfterClass
public static void dumpCoverage() throws IOException, MalformedObjectNameException {
Expand Down Expand Up @@ -486,6 +488,28 @@ protected void deleteKnnDoc(String index, String docId) throws IOException {
RestStatus.fromCode(response.getStatusLine().getStatusCode()));
}

/**
* Retrieve document by index and document id
*/
protected Map<String, Object> getKnnDoc(final String index, final String docId) throws IOException {
final Request request = new Request(
"GET",
"/" + index + "/_doc/" + docId
);
final Response response = client().performRequest(request);

final Map<String, Object> responseMap =
createParser(XContentType.JSON.xContent(), EntityUtils.toString(response.getEntity())).map();

assertNotNull(responseMap);
assertTrue((Boolean) responseMap.get(DOCUMENT_FIELD_FOUND));
assertNotNull(responseMap.get(DOCUMENT_FIELD_SOURCE));

final Map<String, Object> docMap = (Map<String, Object>) responseMap.get(DOCUMENT_FIELD_SOURCE);

return docMap;
}

/**
* Utility to update settings
*/
Expand Down
49 changes: 49 additions & 0 deletions src/test/java/org/opensearch/knn/index/OpenSearchIT.java
Original file line number Diff line number Diff line change
Expand Up @@ -399,4 +399,53 @@ public void testExistsQuery() throws IOException {

assertEquals(3, parseTotalSearchHits(EntityUtils.toString(response.getEntity())));
}

public void testIndexingVectorValidation_updateVectorWithNull() throws Exception {
Settings settings = Settings.builder()
.put(getKNNDefaultIndexSettings())
.build();

createKnnIndex(INDEX_NAME, settings, createKnnIndexMapping(FIELD_NAME, 4));

// valid case with 4 dimension
VijayanB marked this conversation as resolved.
Show resolved Hide resolved
final Float[] vectorForDocumentOne = {6.0f, 7.0f, 8.0f, 9.0f};
final String docOneId = "1";
addKnnDoc(INDEX_NAME, docOneId, FIELD_NAME, vectorForDocumentOne);

final Float[] vectorForDocumentTwo = {2.0f, 1.0f, 3.8f, 2.5f};
final String docTwoId = "2";
addKnnDoc(INDEX_NAME, docTwoId, FIELD_NAME, vectorForDocumentTwo);

//checking that one of two documents in retrievable based on closer similarity to query vector
int k = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can just make k = 2 and then:

  1. Make sure 2 docs are returned by the k-NN query
  2. Update doc 2 to null
  3. Assert the value returned from retrieving doc is null
  4. Make sure 1 doc is returned by the k-NN query when k = 2 (doc id 1)
  5. Update doc 2 back to the vector
  6. Make sure retrieving doc 2 gives the correct vector
  7. Make sure 2 docs are returned by the k-NN query

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline, changing test to use k equals 2 instead of 1 for knn searches

float[] queryVector = {5.0f, 6.0f, 7.0f, 10.0f};
final KNNQueryBuilder knnQueryBuilder = new KNNQueryBuilder(FIELD_NAME, queryVector, k);
final Response response = searchKNNIndex(INDEX_NAME, knnQueryBuilder, k);
final List<KNNResult> results = parseSearchResponse(EntityUtils.toString(response.getEntity()), FIELD_NAME);
assertEquals(1, results.size());
assertEquals(docOneId, results.get(0).getDocId());

//retrieving document by id
final Map<String, Object> knnDocMap = getKnnDoc(INDEX_NAME, docOneId);
assertNotNull(knnDocMap.get(FIELD_NAME));
final Float[] vectorInDocument = ((List<Double>) knnDocMap.get(FIELD_NAME)).stream()
.map(Double::floatValue).toArray(Float[]::new);
assertEquals(vectorForDocumentOne.length, vectorInDocument.length);
assertArrayEquals(vectorForDocumentOne, vectorInDocument);

// update vector value to null
updateKnnDoc(INDEX_NAME, docOneId, FIELD_NAME, null);

//retrieving updated document by id, vector should be null
final Map<String, Object> knnDocMapUpdated = getKnnDoc(INDEX_NAME, docOneId);
assertNull(knnDocMapUpdated.get(FIELD_NAME));

//checking that first document one is no longer discoverable by knn plugin
//expected result is the document two which had originally less similarity score
final Response updatedResponse = searchKNNIndex(INDEX_NAME, knnQueryBuilder, k);
final List<KNNResult> updatedResults =
parseSearchResponse(EntityUtils.toString(updatedResponse.getEntity()), FIELD_NAME);
assertEquals(1, updatedResults.size());
assertEquals(docTwoId, updatedResults.get(0).getDocId());
}
}