This page shows the benchmark data of GenAIExamples. More data for different examples will be submitted in the future release.
Docker Images for Test |
opea/embedding-tei:v0.9 |
ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 |
opea/llm-tgi:v0.9 |
ghcr.io/huggingface/tgi-gaudi:2.0.1 |
opea/dataprep-redis:v0.9 |
redis/redis-stack:7.2.0-v9 |
opea/reranking-tei:v0.9 |
opea/tei-gaudi:v0.9 |
opea/retriever-redis:v0.9 |
opea/chatqna:v0.9 |
System Summary:
1-node, 2x Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz, 40 cores, 270W TDP, HT On, Turbo On, NUMA 2, Integrated Accelerators Available [used]: DLB 0 [0], DSA 0 [0], IAA 0 [0], QAT 0 [0], Total Memory 1024GB (32x32GB DDR4 3200 MT/s [3200 MT/s]), BIOS ETM02, microcode 0xd0003b9, 8x Habana Labs Ltd., 4x MT28800 Family [ConnectX-5 Ex], 4x 7T INTEL SSDPF2KX076TZ, 2x 894.3G SAMSUNG MZ1L2960HCJR-00A07, Ubuntu 22.04.3 LTS, 5.15.0-92-generic. Software: WORKLOAD+VERSION, COMPILER, LIBRARIES, OTHER_SW. Test by Intel as of 08/20/24.
1Node E2E Performance (Sec) |
Gaudi nodes |
Concurrency |
Input |
Output |
Average Latency |
P90 Total latency |
OOB w/o Reranking |
1 |
128 |
128 |
128 |
5.597 |
7.59 |
OOB w/ Reranking |
1 |
128 |
128 |
128 |
6.003 |
8.123 |
2Nodes E2E Performance (Sec) |
Gaudi nodes |
Concurrency |
Input |
Output |
Average Latency |
P90 Total latency |
OOB w/o Reranking |
2 |
256 |
128 |
128 |
7.05 |
9.122 |
OOB w/ Reranking |
2 |
256 |
128 |
128 |
7.26 |
9.239 |
4Nodes E2E Performance (Sec) |
Gaudi nodes |
Concurrency |
Input |
Output |
Average Latency |
P90 Total latency |
OOB w/o Reranking |
4 |
512 |
128 |
128 |
16.293 |
21.169 |
OOB w/ Reranking |
4 |
512 |
128 |
128 |
17.22 |
21.942 |
Go to Benchmark README for reproduce steps, tuned performance data will be released soon.
Test Case |
Hits@10 |
Hits@4 |
MAP@10 |
MRR@10 |
Retrieval w/o Reranking |
66.16% |
49.80% |
17.62% |
39.75% |
Retrieval w/ Reranking |
72.28% |
63.24% |
24.97% |
56.79% |
Go to Accuracy README for reproduce steps.