I timed the individual search steps in the source code and have 2 questions:
The inclusive_scan call takes the most time of all the steps, about 90% of the total. Is this normal?
I have a 20 GB index stored on T4 GPUs in 2 shards, and I search it from code integrated with fairseq. inclusive_scan behaves differently on the two shard GPUs: on the GPU that also runs the fairseq inference code it takes about 3 ms, and on the other it takes about 7 ms. The difference is large and unexpected. Without the fairseq integration, the search time on the two shard GPUs is the same, about 3-4 ms.
I also ran the bench_gpu_sift1m.py script on a single T4 GPU with timing prints added. With nprobe=8, the time spent in thrust::inclusive_scan alternates between two values:
#ms device_id
#0.330 0
#1.074 0
#0.330 0
#1.074 0
#0.330 0
#1.074 0
......
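For reference, an inclusive scan is a running sum: output element i is the sum of input elements 0 through i. The following is a minimal CPU sketch in Python, not the faiss GPU code. The list lengths are made-up values, and the assumption that the scan turns per-list result counts into output offsets for each query should be checked against IVFUtils.cu.

```python
from itertools import accumulate


def inclusive_scan(values):
    """Return running totals: out[i] = values[0] + ... + values[i]."""
    return list(accumulate(values))


# Hypothetical sizes of the inverted lists probed by one query (nprobe=8).
probed_list_lengths = [120, 0, 45, 300, 7, 0, 88, 15]

# End offset of each list's results in a flat output buffer.
end_offsets = inclusive_scan(probed_list_lengths)

# Start offset of each list is the previous end offset (0 for the first list).
start_offsets = [0] + end_offsets[:-1]

print(end_offsets)
print(start_offsets)
```

The scan does very little arithmetic per element, so a measurement dominated by this step may be worth double-checking against how and where the timer is started and stopped.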
Summary
inclusive_scan takes a long time, and its duration varies between GPUs and between calls
Platform
OS: Ubuntu18.04
Faiss version: 1.7.2
Installed from: compiled
Faiss compilation options:
cmake -B build \
-DFAISS_ENABLE_GPU=ON \
-DFAISS_ENABLE_C_API=ON \
-DFAISS_ENABLE_PYTHON=ON \
-DBUILD_TESTING=ON \
-DCMAKE_CUDA_FLAGS="-gencode arch=compute_75,code=sm_75" \
-DPython_EXECUTABLE=/usr/bin/python3.6 \
.
Running on:
Interface:
Reproduction instructions
The timings above were measured in the following code:
the inclusive_scan code in IVFUtils.cu
the fairseq search code