This repository has been archived by the owner on Oct 25, 2024. It is now read-only.

h2o for kv cache compression #5077

Triggered via pull request June 25, 2024 01:49
@n1ck-guo
synchronize #1468
hengguo/h2o
Status Cancelled
Total duration 1m 16s
Artifacts

chatbot-test.yml

on: pull_request
call-inference-llama-2-7b-chat-hf / inference test: 1m 3s
call-inference-mpt-7b-chat / inference test: 0s

Annotations

3 errors
call-inference-mpt-7b-chat / inference test
Canceling since a higher priority waiting request for 'Chat Bot Test-1468' exists
call-inference-llama-2-7b-chat-hf / inference test
Canceling since a higher priority waiting request for 'Chat Bot Test-1468' exists
call-inference-llama-2-7b-chat-hf / inference test
The operation was canceled.