Trade-off between performance and speed #74
Which index are you using (IVFFlat or IVFPQ)?
If you want to verify nprobe against the number of centroids, then for each number of centroids you choose, try power-of-two values of nprobe such as 1, 2, 4, 8, ... Accuracy should increase with higher nprobe, but it will eventually hit a limit (a sketch of such a sweep is below). However, if 9000 vectors is all you have, the IVF probably doesn't make much sense (see 3).
For 9000 vectors total (in the database), you are in the regime where IndexFlat would make the most sense anyway; these are small datasets.
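A minimal Python sketch of such an nprobe sweep with IndexIVFFlat, assuming the standard faiss Python bindings and using random data as a stand-in for the real vectors:

```python
import numpy as np
import faiss  # assumes the faiss Python bindings are installed

d = 512                                         # dimension from the question
xb = np.random.rand(9000, d).astype('float32')  # random stand-in for the real vectors

nlist = 256                                     # one of the centroid counts tried above
quantizer = faiss.IndexFlatL2(d)
index = faiss.IndexIVFFlat(quantizer, d, nlist, faiss.METRIC_L2)
index.train(xb)
index.add(xb)

k = 50
for nprobe in (1, 2, 4, 8, 16, 32):
    index.nprobe = nprobe             # number of inverted lists scanned per query
    D, I = index.search(xb[:100], k)  # query a subset against the database
    # compare I against exact IndexFlatL2 results to measure recall at this nprobe
```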
@wickedfoo Thanks a lot! I've tried IndexFlat and it is incredibly fast without much accuracy loss. Although the CPU version is already fast enough, I am wondering whether the GPU version (GpuIndexFlat) is available to use. I tried to follow the tests in the gpu folder but had no success. It seems to me that this project does not currently contain an example of how to use GpuIndexFlat, does it?
IndexFlat is exact brute-force nearest neighbor, not approximate, so there is no accuracy loss (up to the order of floating-point reductions, of course). Are you using Python or C++? It should be fairly straightforward; the only addition is that a GpuResources object has to be provided to the GPU index.
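A minimal Python sketch of that, assuming a GPU-enabled build of faiss; StandardGpuResources is the resources object, and the data here is a random stand-in:

```python
import numpy as np
import faiss  # requires a GPU-enabled build of faiss

d = 512
xb = np.random.rand(9000, d).astype('float32')  # random stand-in for the real vectors

res = faiss.StandardGpuResources()    # the extra GPU resources object
index = faiss.GpuIndexFlatL2(res, d)  # exact (brute-force) L2 search on the GPU
index.add(xb)
D, I = index.search(xb, 10)           # top-10 exact neighbours; queries == database
```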
@wickedfoo Thanks. I have it configured and working now, so I am going to close this issue.
First, thank you for releasing this wonderful work. I have three questions about the performance of this software. In one retrieval task with the L2 metric, I query about 9000 vectors of dimension 512 against themselves (so the queries are the same as the database). However, the mAP drops from 68 to around 40, which is a huge loss in retrieval performance. So my questions are:
1. Is there a way to fine-tune the trade-off between speed and accuracy? I've tried setting different numbers of centroids (from 16 to 256 to 4*sqrt(9000), as suggested by the demo program), but it makes little difference. Is it possible to do exact L2 matching without employing approximate nearest-neighbour search? (A sketch of exact search follows these questions.)
2. When I set the number of candidates k larger, I sometimes do not get full result lists. For example, past a certain rank, say 50, the returned candidates are -1. This seems to be related to the number of centroids: with fewer centroids, larger values of k work.
3. In your technical paper, I see no comparison between this method and hashing-based methods. For smaller vector databases of a few tens of thousands of entries, which would be faster, quantization-based or hashing-based? Would you mind commenting on that? Thanks.
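For reference, a minimal sketch of exact L2 matching with IndexFlatL2 (question 1), using random data as a stand-in for the real 9000 x 512 vectors; with this exact index the result lists are always fully populated (question 2):

```python
import numpy as np
import faiss

d = 512
xb = np.random.rand(9000, d).astype('float32')  # random stand-in for the 9000 x 512 vectors

index = faiss.IndexFlatL2(d)  # brute-force exact L2; no training, no centroids
index.add(xb)
D, I = index.search(xb, 100)  # database queried against itself, k = 100
# With IndexFlat every row of I is fully populated; -1 entries appear only with IVF
# indexes when the probed inverted lists contain fewer than k candidates.
```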