Trade-off between performance and speed #74
Which index are you using (IVFFlat or IVFPQ)?
If you want to verify nprobe against the number of centroids, then for each number of centroids you choose, try power-of-two values of nprobe such as 1, 2, 4, 8, ... Accuracy should increase with higher nprobe, but it will eventually hit a limit (a sketch of such a sweep is below). However, if 9000 vectors is all you have, the IVF probably doesn't make much sense (see 3).
For 9000 vectors total (in the database), you are in the regime where IndexFlat would make the most sense anyway; these are small datasets.
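A minimal Python sketch of such an nprobe sweep with IndexIVFFlat, assuming the standard faiss Python bindings and using random data as a stand-in for the real vectors:

```python
import numpy as np
import faiss  # assumes the faiss Python bindings are installed

d = 512                                         # dimension from the question
xb = np.random.rand(9000, d).astype('float32')  # random stand-in for the real vectors

nlist = 256                                     # one of the centroid counts tried above
quantizer = faiss.IndexFlatL2(d)
index = faiss.IndexIVFFlat(quantizer, d, nlist, faiss.METRIC_L2)
index.train(xb)
index.add(xb)

k = 50
for nprobe in (1, 2, 4, 8, 16, 32):
    index.nprobe = nprobe             # number of inverted lists scanned per query
    D, I = index.search(xb[:100], k)  # query a subset against the database
    # compare I against exact IndexFlatL2 results to measure recall at this nprobe
```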
@wickedfoo Thanks a lot! I've tried IndexFlat and it is incredibly fast without much accuracy loss. Although the CPU version is already fast enough, I am wondering whether the GPU version (GpuIndexFlat) is available to use. I tried to follow the tests in the gpu folder but had no success. It seems to me that this project does not currently contain an example of how to use GpuIndexFlat, does it?
IndexFlat is exact brute-force nearest neighbor, not approximate, so there is no accuracy loss (up to the order of floating-point reductions, of course). Are you using Python or C++? It should be fairly straightforward; the only addition is that a GpuResources object has to be provided to the GPU index.
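A minimal Python sketch of that, assuming a GPU-enabled build of faiss; StandardGpuResources is the resources object, and the data here is a random stand-in:

```python
import numpy as np
import faiss  # requires a GPU-enabled build of faiss

d = 512
xb = np.random.rand(9000, d).astype('float32')  # random stand-in for the real vectors

res = faiss.StandardGpuResources()    # the extra GPU resources object
index = faiss.GpuIndexFlatL2(res, d)  # exact (brute-force) L2 search on the GPU
index.add(xb)
D, I = index.search(xb, 10)           # top-10 exact neighbours; queries == database
```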
@wickedfoo Thanks. I have it configured and working now, so I am going to close this issue.
First, thank you for releasing this wonderful work. I have three questions about the performance of this software. In one retrieval task with the L2 metric, I query about 9000 vectors of dimension 512 against themselves (so the queries are the same as the database). However, the mAP drops from 68 to around 40, which is a huge loss in retrieval performance. So my questions are:
1. Is there a way to fine-tune the trade-off between speed and accuracy? I've tried setting different numbers of centroids (from 16 to 256 to 4*sqrt(9000), as suggested by the demo program), but it makes little difference. Is it possible to do exact L2 matching without employing approximate nearest-neighbour search? (A sketch of exact search follows these questions.)
2. When I set the number of candidates k larger, I sometimes do not get full result lists. For example, past a certain rank, say 50, the returned candidates are -1. This seems to be related to the number of centroids: with fewer centroids, larger values of k work.
3. In your technical paper, I see no comparison between this method and hashing-based methods. For smaller vector databases of a few tens of thousands of entries, which would be faster, quantization-based or hashing-based? Would you mind commenting on that? Thanks.
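For reference, a minimal sketch of exact L2 matching with IndexFlatL2 (question 1), using random data as a stand-in for the real 9000 x 512 vectors; with this exact index the result lists are always fully populated (question 2):

```python
import numpy as np
import faiss

d = 512
xb = np.random.rand(9000, d).astype('float32')  # random stand-in for the 9000 x 512 vectors

index = faiss.IndexFlatL2(d)  # brute-force exact L2; no training, no centroids
index.add(xb)
D, I = index.search(xb, 100)  # database queried against itself, k = 100
# With IndexFlat every row of I is fully populated; -1 entries appear only with IVF
# indexes when the probed inverted lists contain fewer than k candidates.
```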