Get Fashion Mnist 96% recall up to 200 queries/second #611

alexklibisz · 2023-11-29T15:58:56Z

I'd like to optimize Elastiknn such that the Fashion Mnist benchmark performance exceeds 200 qps at 96% recall. Currently it's at 180 qps. So this would be about an 11% improvement. There are already several issues under the performance label with ideas towards this goal. I've already merged a few PRs. I'm just opening this issue to formalize the effort and to aggregate PRs that don't otherwise have an issue.

alexklibisz · 2023-12-07T16:51:18Z

We're now above 200 qps. Last thing left is to just submit a PR to ann-benchmarks w/ the updated versions.

alexklibisz · 2024-03-26T16:48:34Z

Status update here, after releasing 8.12.2.1.

The non-containerized benchmark is reliably over 200qps @ 96% recall, around 210 qps. Latest update here: ddf637a

The containerized benchmark (running ann-benchmarks and elastiknn in the same container) has improved from ~160qps to ~180qps.

Here are the results using 8.6.2.0:

Model	Parameters	Recall	Queries per Second
eknn-l2lsh	L=100 k=4 w=1024 candidates=500 probes=0	0.378	304.111
eknn-l2lsh	L=100 k=4 w=1024 candidates=1000 probes=0	0.445	246.319
eknn-l2lsh	L=100 k=4 w=1024 candidates=500 probes=3	0.635	245.977
eknn-l2lsh	L=100 k=4 w=1024 candidates=1000 probes=3	0.716	201.608
eknn-l2lsh	L=100 k=4 w=2048 candidates=500 probes=0	0.767	265.545
eknn-l2lsh	L=100 k=4 w=2048 candidates=1000 probes=0	0.846	218.673
eknn-l2lsh	L=100 k=4 w=2048 candidates=500 probes=3	0.921	184.178
eknn-l2lsh	L=100 k=4 w=2048 candidates=1000 probes=3	0.960	160.437

Here are the results using 8.12.2.1:

Model	Parameters	Recall	Queries per Second
eknn-l2lsh	L=100 k=4 w=1024 candidates=500 probes=0	0.378	314.650
eknn-l2lsh	L=100 k=4 w=1024 candidates=1000 probes=0	0.446	247.659
eknn-l2lsh	L=100 k=4 w=1024 candidates=500 probes=3	0.634	258.834
eknn-l2lsh	L=100 k=4 w=1024 candidates=1000 probes=3	0.716	210.380
eknn-l2lsh	L=100 k=4 w=2048 candidates=500 probes=0	0.767	271.442
eknn-l2lsh	L=100 k=4 w=2048 candidates=1000 probes=0	0.846	221.127
eknn-l2lsh	L=100 k=4 w=2048 candidates=500 probes=3	0.921	199.353
eknn-l2lsh	L=100 k=4 w=2048 candidates=1000 probes=3	0.960	171.614

alexklibisz · 2024-08-30T01:48:30Z

Latest update here: the non-containerized benchmark is hovering around ~195 qps. It dropped below 200 qps when I re-ran the benchmark for Elasticsearch 8.15.0: bbbaeea

I've tried several other ideas to accelerate the ArrayHitCounter. Some examples: #721, #615, #598. None of it really makes a dent.

I'm thinking a major issue might be that the current LSH parameters end up matching the vast majority of documents in the index. When I sample the 0.96 benchmark in VisualVM, it's spending ~30% of its time in countHits:

elastiknn/elastiknn-lucene/src/main/java/org/apache/lucene/search/MatchHashesAndScoreQuery.java

Lines 52 to 85 in 923fb22

    
           private HitCounter countHits(LeafReader reader) throws IOException { 
        
               Terms terms = reader.terms(field); 
        
               // terms seem to be null after deleting docs. https://github.com/alexklibisz/elastiknn/issues/158 
        
               if (terms == null) { 
        
                   return new EmptyHitCounter(); 
        
               } else { 
        
                   TermsEnum termsEnum = terms.iterator(); 
        
                   PostingsEnum docs = null; 
        
                   HitCounter counter = new ArrayHitCounter(reader.maxDoc()); 
        
                   for (HashAndFreq hf : hashAndFrequencies) { 
        
                       // We take two different paths here, depending on the frequency of the current hash. 
        
                       // If the frequency is one, we avoid checking the frequency of matching docs when 
        
                       // incrementing the counter. This yields a ~5% to ~10% speedup. 
        
                       // See https://github.com/alexklibisz/elastiknn/pull/612 for details. 
        
                       if (hf.freq == 1) { 
        
                           if (termsEnum.seekExact(new BytesRef(hf.hash))) { 
        
                               docs = termsEnum.postings(docs, PostingsEnum.NONE); 
        
                               while (docs.nextDoc() != DocIdSetIterator.NO_MORE_DOCS) { 
        
                                   counter.increment(docs.docID()); 
        
                               } 
        
                           } 
        
                       } else { 
        
                           if (termsEnum.seekExact(new BytesRef(hf.hash))) { 
        
                               docs = termsEnum.postings(docs, PostingsEnum.NONE); 
        
                               while (docs.nextDoc() != DocIdSetIterator.NO_MORE_DOCS) { 
        
                                   counter.increment(docs.docID(), (short) min(hf.freq, docs.freq())); 
        
                               } 
        
                           } 
        
                       } 
        
                   } 
        
                   return counter; 
        
               } 
        
           }

A good chunk of that is spend in seekExact.

So I think I see two possible paths for the next speedup:

Using Lucene more efficiently.
Find some better parameters for the LSH model. Maybe there's some area of the parameter space that avoids matching the vast majority of documents in the index.

alexklibisz · 2024-08-30T15:15:24Z

I ran a grid search and found some promising parameters. Verified these on AWS:

Model	Parameters	Recall	Queries per Second
eknn-l2lsh	L=96 k=8 w=4096 candidates=1024 probes=0	0.905	250.333
eknn-l2lsh	L=150 k=8 w=4000 candidates=800 probes=0	0.929	264.922
eknn-l2lsh	L=128 k=8 w=4096 candidates=1024 probes=0	0.935	255.407
eknn-l2lsh	L=150 k=8 w=4000 candidates=1000 probes=0	0.942	249.797

Some other parameters that were promising but I haven't verified on AWS:

eknn-l2lsh-L=96-k=4-w=4096_candidates=1024_probes=0        0.954       68.276
eknn-l2lsh-L=64-k=4-w=2048_candidates=1024_probes=4        0.942       66.007
eknn-l2lsh-L=64-k=4-w=4096_candidates=1024_probes=0        0.913       85.683
eknn-l2lsh-L=96-k=4-w=2048_candidates=1024_probes=2        0.944       66.841
eknn-l2lsh-L=96-k=2-w=2048_candidates=1024_probes=0        0.906       62.746
eknn-l2lsh-L=64-k=8-w=4096_candidates=1024_probes=2        0.936       97.321
eknn-l2lsh-L=96-k=8-w=4096_candidates=1024_probes=0        0.905      138.512
eknn-l2lsh-L=128-k=8-w=4096_candidates=512_probes=2        0.949       58.870
eknn-l2lsh-L=96-k=2-w=1024_candidates=1024_probes=2        0.910       62.992
eknn-l2lsh-L=128-k=4-w=2048_candidates=512_probes=2        0.925       61.073
eknn-l2lsh-L=128-k=4-w=4096_candidates=1024_probes=0        0.971       60.615
eknn-l2lsh-L=128-k=8-w=4096_candidates=1024_probes=0        0.935      119.305
eknn-l2lsh-L=96-k=8-w=4096_candidates=1024_probes=2        0.964       68.093
eknn-l2lsh-L=64-k=4-w=2048_candidates=1024_probes=2        0.906      112.086
eknn-l2lsh-L=128-k=4-w=4096_candidates=512_probes=0        0.925       63.552
eknn-l2lsh-L=128-k=8-w=4096_candidates=1024_probes=2        0.978       53.779
eknn-l2lsh-L=96-k=8-w=4096_candidates=512_probes=2        0.926       84.319
eknn-l2lsh-L=96-k=8-w=4096_candidates=512_probes=4        0.950       57.320
eknn-l2lsh-L=96-k=4-w=2048_candidates=512_probes=4        0.934       61.600
eknn-l2lsh-L=64-k=8-w=4096_candidates=1024_probes=4        0.959       69.827
eknn-l2lsh-L=64-k=8-w=4096_candidates=512_probes=4        0.915       92.171

alexklibisz · 2024-08-31T16:18:36Z

I managed to find some parameters that get 239 QPS at 0.96 recall. There are a ton of results in this commit: c7efcf1

alexklibisz · 2024-08-31T17:40:46Z

The fully-dockerized ann-benchmarks results are still quite pitiful:

Model	Parameters	Recall	Queries per Second
eknn-l2lsh	L=175 k=7 w=3900 candidates=100 probes=0	0.607	233.326
eknn-l2lsh	L=175 k=7 w=3900 candidates=500 probes=0	0.921	200.319
eknn-l2lsh	L=175 k=7 w=3900 candidates=1000 probes=0	0.962	169.238

I went ahead and opened a PR to get the latest parameters and Elastiknn version into ann-benchmarks: erikbern/ann-benchmarks#544

alexklibisz added the performance label Nov 29, 2023

alexklibisz mentioned this issue Mar 26, 2024

Bump version to 8.12.2.1 #658

Merged

This was referenced Aug 29, 2024

Performance: avoid sorting the hashes in MatchHashesAndScoreQuery #724

Merged

Performance: attempt to optimize the ArrayHitCounter by maintaining some state while updating the counter #721

Draft

This was referenced Aug 30, 2024

Performance: keep a local copy of the ann-benchmarks config.yml file #726

Merged

Performance: picked better parameters for Fashion MNIST dataset (239QPS @ 0.96 recall) #727

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Get Fashion Mnist 96% recall up to 200 queries/second #611

Get Fashion Mnist 96% recall up to 200 queries/second #611

alexklibisz commented Nov 29, 2023

alexklibisz commented Dec 7, 2023

alexklibisz commented Mar 26, 2024

alexklibisz commented Aug 30, 2024

alexklibisz commented Aug 30, 2024 •

edited

Loading

alexklibisz commented Aug 31, 2024

alexklibisz commented Aug 31, 2024 •

edited

Loading

Get Fashion Mnist 96% recall up to 200 queries/second #611

Get Fashion Mnist 96% recall up to 200 queries/second #611

Comments

alexklibisz commented Nov 29, 2023

alexklibisz commented Dec 7, 2023

alexklibisz commented Mar 26, 2024

alexklibisz commented Aug 30, 2024

alexklibisz commented Aug 30, 2024 • edited Loading

alexklibisz commented Aug 31, 2024

alexklibisz commented Aug 31, 2024 • edited Loading

alexklibisz commented Aug 30, 2024 •

edited

Loading

alexklibisz commented Aug 31, 2024 •

edited

Loading