
Align parameters for "max_tokens, repetition_penalty, presence_penalty, frequency_penalty" #726

Merged (21 commits) on Sep 19, 2024.

Changes from 1 commit
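Every hunk in this PR makes the same substitution: the request-body key `max_tokens` becomes `max_new_tokens`. A minimal sketch of that rename as a helper function (hypothetical, not part of the PR — shown only to summarize the change):

```python
def align_generation_params(payload: dict) -> dict:
    """Return a copy of `payload` with the legacy "max_tokens" key
    renamed to "max_new_tokens", as this PR does across the examples."""
    aligned = dict(payload)
    if "max_tokens" in aligned:
        aligned["max_new_tokens"] = aligned.pop("max_tokens")
    return aligned
```

Other sampling parameters named in the PR title (`repetition_penalty`, `presence_penalty`, `frequency_penalty`) pass through unchanged.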
2 changes: 1 addition & 1 deletion AudioQnA/docker/gaudi/README.md
@@ -130,6 +130,6 @@ curl http://${host_ip}:3002/v1/audio/speech \

````diff
 ```bash
 curl http://${host_ip}:3008/v1/audioqna \
   -X POST \
-  -d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_tokens":64}' \
+  -d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_new_tokens":64}' \
   -H 'Content-Type: application/json'
 ```
````
2 changes: 1 addition & 1 deletion AudioQnA/docker/ui/svelte/src/lib/modules/chat/network.ts
@@ -26,7 +26,7 @@ export async function fetchAudioText(file) {

````diff
   const url = `${CHAT_URL}`;
   const requestBody = {
     audio: file,
-    max_tokens: 64,
+    max_new_tokens: 64,
   };

   const init: RequestInit = {
````
2 changes: 1 addition & 1 deletion AudioQnA/docker/xeon/README.md
@@ -130,6 +130,6 @@ curl http://${host_ip}:3002/v1/audio/speech \

````diff
 ```bash
 curl http://${host_ip}:3008/v1/audioqna \
   -X POST \
-  -d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_tokens":64}' \
+  -d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_new_tokens":64}' \
   -H 'Content-Type: application/json'
 ```
````
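The AudioQnA request body used in the curl examples above can be assembled programmatically; a sketch (the base64 string is the short WAV sample from the READMEs, and `audioqna_body` is a hypothetical helper):

```python
import json

def audioqna_body(audio_b64: str, max_new_tokens: int = 64) -> str:
    """Serialize the JSON body sent to the /v1/audioqna endpoint."""
    return json.dumps({"audio": audio_b64, "max_new_tokens": max_new_tokens})

body = audioqna_body("UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA")
```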
2 changes: 1 addition & 1 deletion AudioQnA/kubernetes/manifests/README.md
@@ -28,5 +28,5 @@ Make sure all the pods are running, and restart the audioqna-xxxx pod if necessary.

````diff
 ```bash
 kubectl get pods

-curl http://${host_ip}:3008/v1/audioqna -X POST -d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_tokens":64}' -H 'Content-Type: application/json'
+curl http://${host_ip}:3008/v1/audioqna -X POST -d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_new_tokens":64}' -H 'Content-Type: application/json'
 ```
````
2 changes: 1 addition & 1 deletion AudioQnA/tests/test_audioqna_on_gaudi.sh
@@ -72,7 +72,7 @@ function start_services() {

````diff


 function validate_megaservice() {
-    result=$(http_proxy="" curl http://${ip_address}:3008/v1/audioqna -XPOST -d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_tokens":64}' -H 'Content-Type: application/json')
+    result=$(http_proxy="" curl http://${ip_address}:3008/v1/audioqna -XPOST -d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_new_tokens":64}' -H 'Content-Type: application/json')
     echo "result is === $result"
     if [[ $result == *"AAA"* ]]; then
         echo "Result correct."
````
2 changes: 1 addition & 1 deletion AudioQnA/tests/test_audioqna_on_xeon.sh
@@ -61,7 +61,7 @@ function start_services() {

````diff


 function validate_megaservice() {
-    result=$(http_proxy="" curl http://${ip_address}:3008/v1/audioqna -XPOST -d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_tokens":64}' -H 'Content-Type: application/json')
+    result=$(http_proxy="" curl http://${ip_address}:3008/v1/audioqna -XPOST -d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_new_tokens":64}' -H 'Content-Type: application/json')
     echo $result
     if [[ $result == *"AAA"* ]]; then
         echo "Result correct."
````
2 changes: 1 addition & 1 deletion ChatQnA/docker/gaudi/README.md
@@ -337,7 +337,7 @@ curl http://${host_ip}:8007/v1/completions \

````diff
   -d '{
     "model": "${LLM_MODEL_ID}",
     "prompt": "What is Deep Learning?",
-    "max_tokens": 32,
+    "max_new_tokens": 32,
     "temperature": 0
   }'
 ```
````
2 changes: 1 addition & 1 deletion ChatQnA/docker/xeon/README.md
@@ -331,7 +331,7 @@ curl http://${host_ip}:9009/generate \

````diff
 # vLLM Service
 curl http://${host_ip}:9009/v1/completions \
   -H "Content-Type: application/json" \
-  -d '{"model": "Intel/neural-chat-7b-v3-3", "prompt": "What is Deep Learning?", "max_tokens": 32, "temperature": 0}'
+  -d '{"model": "Intel/neural-chat-7b-v3-3", "prompt": "What is Deep Learning?", "max_new_tokens": 32, "temperature": 0}'
 ```

 7. LLM Microservice
````
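The vLLM completion payload from the ChatQnA examples above, assembled as a sketch (model name and values taken verbatim from the diff; serialization only, no request is sent):

```python
import json

# Completion request body matching the ChatQnA vLLM examples above,
# using the aligned "max_new_tokens" parameter name.
payload = {
    "model": "Intel/neural-chat-7b-v3-3",
    "prompt": "What is Deep Learning?",
    "max_new_tokens": 32,
    "temperature": 0,
}
body = json.dumps(payload)
```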
2 changes: 1 addition & 1 deletion ChatQnA/tests/test_chatqna_vllm_on_gaudi.sh
@@ -143,7 +143,7 @@ function validate_microservices() {

````diff
         "text" \
         "vllm-llm" \
         "vllm-gaudi-server" \
-        '{"model": "Intel/neural-chat-7b-v3-3","prompt": "What is Deep Learning?","max_tokens": 32,"temperature": 0}'
+        '{"model": "Intel/neural-chat-7b-v3-3","prompt": "What is Deep Learning?","max_new_tokens": 32,"temperature": 0}'

     # llm microservice
     validate_services \
````
2 changes: 1 addition & 1 deletion ChatQnA/tests/test_chatqna_vllm_on_xeon.sh
@@ -145,7 +145,7 @@ function validate_microservices() {

````diff
         "text" \
         "vllm-llm" \
         "vllm-service" \
-        '{"model": "Intel/neural-chat-7b-v3-3", "prompt": "What is Deep Learning?", "max_tokens": 32, "temperature": 0}'
+        '{"model": "Intel/neural-chat-7b-v3-3", "prompt": "What is Deep Learning?", "max_new_tokens": 32, "temperature": 0}'

     # llm microservice
     validate_services \
````
2 changes: 1 addition & 1 deletion VisualQnA/docker/gaudi/README.md
@@ -121,7 +121,7 @@ curl http://${host_ip}:8888/v1/visualqna -H "Content-Type: application/json" -d

````diff
       ]
     }
   ],
-  "max_tokens": 300
+  "max_new_tokens": 300
 }'
 ```
````
2 changes: 1 addition & 1 deletion VisualQnA/docker/xeon/README.md
@@ -158,7 +158,7 @@ curl http://${host_ip}:8888/v1/visualqna -H "Content-Type: application/json" -d

````diff
       ]
     }
   ],
-  "max_tokens": 300
+  "max_new_tokens": 300
 }'
 ```
````
2 changes: 1 addition & 1 deletion VisualQnA/kubernetes/README.md
@@ -52,5 +52,5 @@ In the below example we illustrate on Xeon.

````diff
       ]
     }
   ],
-  "max_tokens": 128}' -H 'Content-Type: application/json' > $LOG_PATH/gmc_visualqna.log
+  "max_new_tokens": 128}' -H 'Content-Type: application/json' > $LOG_PATH/gmc_visualqna.log
 ```
````
2 changes: 1 addition & 1 deletion VisualQnA/kubernetes/manifests/README.md
@@ -47,5 +47,5 @@ curl http://localhost:8888/v1/visualqna \

````diff
       ]
     }
   ],
-  "max_tokens": 128}'
+  "max_new_tokens": 128}'
 ```
````
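The VisualQnA hunks above show only the tail of an OpenAI-style `messages` payload; a sketch of a full body (the message content is a hypothetical stand-in, since the diff elides it — only the `max_new_tokens` key is taken from the hunks):

```python
import json

# VisualQnA-style request body; the "messages" list is illustrative.
payload = {
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/sample.png"}},
            ],
        }
    ],
    "max_new_tokens": 128,
}
body = json.dumps(payload)
```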