OPEA Release Data

This page presents benchmark data for GenAIExamples. Data for additional examples will be added in future releases.

ChatQnA

Docker Images for Test
opea/embedding-tei:v0.9
ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
opea/llm-tgi:v0.9
ghcr.io/huggingface/tgi-gaudi:2.0.1
opea/dataprep-redis:v0.9
redis/redis-stack:7.2.0-v9
opea/reranking-tei:v0.9
opea/tei-gaudi:v0.9
opea/retriever-redis:v0.9
opea/chatqna:v0.9

System Summary:
1-node, 2x Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz, 40 cores, 270W TDP, HT On, Turbo On, NUMA 2, Integrated Accelerators Available [used]: DLB 0 [0], DSA 0 [0], IAA 0 [0], QAT 0 [0], Total Memory 1024GB (32x32GB DDR4 3200 MT/s [3200 MT/s]), BIOS ETM02, microcode 0xd0003b9, 8x Habana Labs Ltd., 4x MT28800 Family [ConnectX-5 Ex], 4x 7T INTEL SSDPF2KX076TZ, 2x 894.3G SAMSUNG MZ1L2960HCJR-00A07, Ubuntu 22.04.3 LTS, 5.15.0-92-generic. Software: WORKLOAD+VERSION, COMPILER, LIBRARIES, OTHER_SW. Tested by Intel as of 08/20/24.

Performance Data

| 1 Node E2E Performance (Sec) | Gaudi nodes | Concurrency | Input | Output | Average Latency | P90 Total latency |
| --- | --- | --- | --- | --- | --- | --- |
| OOB w/o Reranking | 1 | 128 | 128 | 128 | 5.597 | 7.59 |
| OOB w/ Reranking | 1 | 128 | 128 | 128 | 6.003 | 8.123 |

| 2 Nodes E2E Performance (Sec) | Gaudi nodes | Concurrency | Input | Output | Average Latency | P90 Total latency |
| --- | --- | --- | --- | --- | --- | --- |
| OOB w/o Reranking | 2 | 256 | 128 | 128 | 7.05 | 9.122 |
| OOB w/ Reranking | 2 | 256 | 128 | 128 | 7.26 | 9.239 |

| 4 Nodes E2E Performance (Sec) | Gaudi nodes | Concurrency | Input | Output | Average Latency | P90 Total latency |
| --- | --- | --- | --- | --- | --- | --- |
| OOB w/o Reranking | 4 | 512 | 128 | 128 | 16.293 | 21.169 |
| OOB w/ Reranking | 4 | 512 | 128 | 128 | 17.22 | 21.942 |

See the Benchmark README for steps to reproduce these results. Tuned performance data will be released soon.
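For readers unfamiliar with the two latency columns, the sketch below shows one common way an average and a P90 value can be derived from per-request end-to-end latencies. It is an illustration only: the sample values are hypothetical rather than measured data, the helper names are invented for this example, and the nearest-rank percentile method is an assumption, since this page does not state how the benchmark computes P90.

```python
def average(latencies):
    """Arithmetic mean of a non-empty list of latencies (seconds)."""
    return sum(latencies) / len(latencies)


def percentile_nearest_rank(latencies, pct):
    """Nearest-rank percentile: the smallest value such that at least
    pct percent of the samples are less than or equal to it.
    `pct` is an integer in 1..100 (assumed method, not confirmed by the source)."""
    ordered = sorted(latencies)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding surprises.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


# Hypothetical per-request end-to-end latencies in seconds (not measured data).
sample = [4.0, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 9.0, 10.0]

avg = average(sample)
p90 = percentile_nearest_rank(sample, 90)
print(f"average={avg:.3f}s p90={p90:.3f}s")
```

Other percentile definitions (for example, linear interpolation between neighboring samples) can give slightly different P90 values for the same data, so the method matters when comparing against the numbers reported here.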

Accuracy Data

| Test Case | Hits@10 | Hits@4 | MAP@10 | MRR@10 |
| --- | --- | --- | --- | --- |
| Retrieval w/o Reranking | 66.16% | 49.80% | 17.62% | 39.75% |
| Retrieval w/ Reranking | 72.28% | 63.24% | 24.97% | 56.79% |

See the Accuracy README for steps to reproduce these results.
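As a reading aid for the Hits@k, MRR@k, and MAP@k columns, here is a minimal sketch of how these retrieval metrics are commonly defined. The toy queries and document IDs are hypothetical, the function names are invented for this example, and the average-precision normalization (dividing by `min(k, number of relevant documents)`) is one of several conventions in use. The evaluation code behind the reported numbers may differ in these details.

```python
def hit_at_k(ranked, relevant, k):
    """1.0 if any relevant document appears in the top k results, else 0.0."""
    return 1.0 if any(doc in relevant for doc in ranked[:k]) else 0.0


def reciprocal_rank_at_k(ranked, relevant, k):
    """1 / position of the first relevant document in the top k, else 0.0."""
    for position, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            return 1.0 / position
    return 0.0


def average_precision_at_k(ranked, relevant, k):
    """Mean of precision values at each relevant hit in the top k.
    Normalized by min(k, len(relevant)) (assumed convention)."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for position, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            hits += 1
            precision_sum += hits / position
    return precision_sum / min(k, len(relevant))


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Hypothetical toy data: (ranked retrieval results, set of relevant doc IDs).
queries = [
    (["d3", "d1", "d7"], {"d1"}),
    (["d2", "d9", "d4"], {"d2", "d4"}),
    (["d5", "d6", "d8"], {"d0"}),
]
k = 3

hits_k = mean(hit_at_k(r, rel, k) for r, rel in queries)
mrr_k = mean(reciprocal_rank_at_k(r, rel, k) for r, rel in queries)
map_k = mean(average_precision_at_k(r, rel, k) for r, rel in queries)
print(f"Hits@{k}={hits_k:.4f} MRR@{k}={mrr_k:.4f} MAP@{k}={map_k:.4f}")
```

Under these definitions, reranking can raise MRR and MAP even when Hits@k is unchanged, because both reward relevant documents appearing earlier in the ranked list.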