-
Notifications
You must be signed in to change notification settings - Fork 198
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
* run both xeon and gaudi when both hardware detect Signed-off-by: chensuyue <[email protected]> * add v0.9 RAG release data Signed-off-by: chensuyue <[email protected]> * update system summary Signed-off-by: chensuyue <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: chensuyue <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
- Loading branch information
1 parent
4b0bc26
commit 947936e
Showing
3 changed files
with
62 additions
and
9 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,49 @@ | ||
# OPEA Release Data | ||
|
||
This page shows the benchmark data of GenAIExamples. More data for different examples will be submitted in the future release. | ||
|
||
## ChatQnA | ||
|
||
| **Docker Images for Test** | | ||
| ----------------------------------------------------- | | ||
| opea/embedding-tei:v0.9 | | ||
| ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 | | ||
| opea/llm-tgi:v0.9 | | ||
| ghcr.io/huggingface/tgi-gaudi:2.0.1 | | ||
| opea/dataprep-redis:v0.9 | | ||
| redis/redis-stack:7.2.0-v9 | | ||
| opea/reranking-tei:v0.9 | | ||
| opea/tei-gaudi:v0.9 | | ||
| opea/retriever-redis:v0.9 | | ||
| opea/chatqna:v0.9 | | ||
|
||
System Summary: | ||
1-node, 2x Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz, 40 cores, 270W TDP, HT On, Turbo On, NUMA 2, Integrated Accelerators Available [used]: DLB 0 [0], DSA 0 [0], IAA 0 [0], QAT 0 [0], Total Memory 1024GB (32x32GB DDR4 3200 MT/s [3200 MT/s]), BIOS ETM02, microcode 0xd0003b9, 8x Habana Labs Ltd., 4x MT28800 Family [ConnectX-5 Ex], 4x 7T INTEL SSDPF2KX076TZ, 2x 894.3G SAMSUNG MZ1L2960HCJR-00A07, Ubuntu 22.04.3 LTS, 5.15.0-92-generic. Software: WORKLOAD+VERSION, COMPILER, LIBRARIES, OTHER_SW. Test by Intel as of 08/20/24. | ||
|
||
### Performance Data | ||
|
||
| 1Node E2E Performance (Sec) | Gaudi nodes | Concurrency | Input | Output | Average Latency | P90 Total latency | | ||
| :-------------------------: | :---------: | :---------: | :---: | :----: | :-------------: | :---------------: | | ||
| OOB w/o Reranking | 1 | 128 | 128 | 128 | 5.597 | 7.59 | | ||
| OOB w/ Reranking | 1 | 128 | 128 | 128 | 6.003 | 8.123 | | ||
|
||
| 2Nodes E2E Performance (Sec) | Gaudi nodes | Concurrency | Input | Output | Average Latency | P90 Total latency | | ||
| :--------------------------: | :---------: | :---------: | :---: | :----: | :-------------: | :---------------: | | ||
| OOB w/o Reranking | 2 | 256 | 128 | 128 | 7.05 | 9.122 | | ||
| OOB w/ Reranking | 2 | 256 | 128 | 128 | 7.26 | 9.239 | | ||
|
||
| 4Nodes E2E Performance (Sec) | Gaudi nodes | Concurrency | Input | Output | Average Latency | P90 Total latency | | ||
| :--------------------------: | :---------: | :---------: | :---: | :----: | :-------------: | :---------------: | | ||
| OOB w/o Reranking | 4 | 512 | 128 | 128 | 16.293 | 21.169 | | ||
| OOB w/ Reranking | 4 | 512 | 128 | 128 | 17.22 | 21.942 | | ||
|
||
Go to Benchmark [README](./ChatQnA/benchmark/README.md) for reproduce steps, tuned performance data will be released soon. | ||
|
||
### Accuracy Data | ||
|
||
| Test Case | Hits@10 | Hits@4 | MAP@10 | MRR@10 | | ||
| :---------------------: | :-----: | :----: | :----: | :----: | | ||
| Retrieval w/o Reranking | 66.16% | 49.80% | 17.62% | 39.75% | | ||
| Retrieval w/ Reranking | 72.28% | 63.24% | 24.97% | 56.79% | | ||
|
||
Go to Accuracy [README](https://github.com/opea-project/GenAIEval/tree/main/evals/evaluation/rag_eval#multihop-english-dataset) for reproduce steps. |