
[BUG] Inaccuracy in IVF-Flat Search Results with Large Number of Queries #1756

Closed
tarang-jain opened this issue Aug 19, 2023 · 3 comments · Fixed by #1764
Labels: bug (Something isn't working), FAISS, Vector Search

Comments

@tarang-jain (Contributor)

Describe the bug
IVF-Flat Search gives inconsistent results with different batch sizes when the number of queries is very large. This is a blocker in the FAISS IVF-Flat integration. cc @achirkin @cjnolet @tfeher

Steps/Code to reproduce bug
Context: With the FAISS integration work underway, the following FAISS test fails: LargeBatch
This test runs 100,000 search queries on an IVF-Flat index and compares the resulting indices and distances with a FAISS CPU IVF-Flat Index.

Expected behavior
I tried modifying the batch size by changing this line to
const uint32_t max_queries = std::min<uint32_t>(n_queries, 10000);
and the test then passes. I also tried other values less than 32768 and came to the conclusion that whenever the max_queries defined here is greater than kMaxGridY the results are incorrect, and whenever it is less than kMaxGridY the FAISS test passes.
In other words, whenever the kernel here runs more than once, the test fails.
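For readers less familiar with the kernel layout, here is a minimal illustrative sketch of why max_queries > kMaxGridY means the scan kernel launches more than once. The names are hypothetical and kMaxGridY = 32768 is assumed from the numbers in this report, not read from the raft source:

```cpp
#include <cstdint>

// Illustrative sketch only: the interleaved-scan launcher covers queries along
// the CUDA grid's Y dimension, so a query batch larger than the grid-Y cap has
// to be split into several kernel launches.
constexpr uint32_t kMaxGridY = 32768;  // assumed cap, inferred from this report

uint32_t num_scan_launches(uint32_t max_queries)
{
  // Ceiling division: one launch per kMaxGridY-sized chunk of the query batch.
  return (max_queries + kMaxGridY - 1) / kMaxGridY;
}
// e.g. num_scan_launches(10000) == 1 (test passes),
//      num_scan_launches(65536) == 2 (test fails).
```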

@tarang-jain added the bug (Something isn't working), FAISS, and Vector Search labels on Aug 19, 2023
@achirkin (Contributor)

Thanks for the report. Apparently, we do test this use case in raft. I've tried adding a dozen other parameter combinations but consistently get a recall of 1.0; that is, I haven't been able to reproduce this in raft.
Could you please report the indexing and search parameters you use in FAISS, so that we can try to reproduce it here?
Could you also elaborate on how inconsistent the results are? Is the recall just a bit smaller than 1.0, or are the results completely wrong (e.g. recall < 0.5)?

So far, the only hypothesis that comes to mind is a concurrency issue: maybe when raft runs more than one batch iteration, the submitted GPU work piles up and does not finish before the results are submitted for evaluation. Do you set the CUDA stream in raft::resources (raft_handle) to be the same as the stream FAISS uses under the hood, or do you synchronize between them / the device?
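In case it helps rule that hypothesis out, a rough sketch of the two ways the streams could be reconciled. The raft calls are written from memory and may differ between raft versions, so treat the exact names as assumptions:

```cpp
#include <cuda_runtime.h>
#include <raft/core/device_resources.hpp>
#include <rmm/cuda_stream_view.hpp>

// Rough sketch (API names from memory): keep raft and FAISS work ordered on
// the GPU before comparing their results on the host.
void reconcile_streams(cudaStream_t faiss_stream)
{
  // Option 1: build the raft handle on the stream FAISS already uses, so all
  // raft kernels are enqueued on the same stream as the FAISS work.
  raft::device_resources handle{rmm::cuda_stream_view{faiss_stream}};

  // ... run the raft ivf_flat search through `handle` here ...

  // Option 2: if the handle uses its own stream, synchronize both streams
  // before copying indices/distances back for the CPU-side comparison.
  handle.sync_stream();                 // wait for work on the handle's stream
  cudaStreamSynchronize(faiss_stream);  // wait for work on the FAISS stream
}
```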

@tarang-jain (Contributor, Author)

Thanks for pointing out the large query batch test.
Steps to reproduce the inconsistent results:

  1. Add the following input to the gtests (see the annotated reading after this list):
    {100000, 8712, 3, 10, 51, 66, raft::distance::DistanceType::L2Expanded, false}
  2. Set the kmeans_trainset_fraction to 1.0 here
  3. Run the test for float inputs. The recall should be ~0.83
  4. Change this line to const uint32_t max_queries = std::min<uint32_t>(n_queries, 32000);
  5. Run the same test again and notice that the recall is now much higher (=1.000).

This set of inputs is the same as the one being run in FAISS.
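For anyone reading along, here is a hedged field-by-field reading of the input tuple in step 1. The struct below is a hypothetical mirror written only to make the tuple readable; the field order reflects my understanding of the ann_ivf_flat gtest inputs struct and should be verified against the test source:

```cpp
#include <cstdint>
#include <raft/distance/distance_types.hpp>

// Hypothetical mirror of the gtest inputs struct, for annotation only;
// the real struct (and its index type) lives in the raft test code.
struct IvfFlatTestInputs {
  uint32_t num_queries;
  uint32_t num_db_vecs;
  uint32_t dim;
  uint32_t k;
  uint32_t nprobe;
  uint32_t nlist;
  raft::distance::DistanceType metric;
  bool adaptive_centers;
};

// My reading of the reproducer: 100k queries over 8712 three-dimensional
// vectors, k=10, 51 probes out of 66 lists, L2Expanded, non-adaptive centers.
constexpr IvfFlatTestInputs kReproducer{
  100000, 8712, 3, 10, 51, 66, raft::distance::DistanceType::L2Expanded, false};
```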

@achirkin (Contributor)

achirkin commented Aug 23, 2023

Thanks for the very helpful reproducer! The PR is ready.

rapids-bot pushed a commit that referenced this issue on Aug 23, 2023
Fix the cluster probes (coarse_index) not being advanced when batching.

Thanks @tarang-jain for the precise reproducer.

Closes: #1756

Authors:
  - Artem M. Chirkin (https://github.com/achirkin)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1764
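For readers skimming the thread, a hypothetical sketch (not the actual diff in #1764) of the class of bug the commit message describes: when the query set is processed in chunks, the cluster-probe array (coarse_index) has to be advanced together with the query pointer, otherwise every chunk after the first scans the probes selected for the first chunk.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch of per-batch pointer advancement; names and signature
// are illustrative, not the real raft code or the #1764 diff.
void search_in_batches(const float* queries,          // n_queries * dim
                       const uint32_t* coarse_index,  // n_queries * n_probes probe ids
                       uint32_t n_queries,
                       uint32_t dim,
                       uint32_t n_probes,
                       uint32_t max_batch)
{
  for (uint32_t offset = 0; offset < n_queries; offset += max_batch) {
    const uint32_t batch = std::min(max_batch, n_queries - offset);

    // Both per-query inputs must move with the batch offset. The reported bug
    // matches the case where the probe pointer stays at the start, so batches
    // after the first scan clusters chosen for the first batch's queries.
    const float* queries_batch =
      queries + static_cast<std::size_t>(offset) * dim;
    const uint32_t* probes_batch =
      coarse_index + static_cast<std::size_t>(offset) * n_probes;

    // launch_interleaved_scan(queries_batch, probes_batch, batch, ...);  // illustrative
    (void)queries_batch;
    (void)probes_batch;
    (void)batch;
  }
}
```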