Update Conceptual_Guide/Part_8-semantic_caching/README.md
Co-authored-by: Kris Hung <[email protected]>
oandreeva-nv and krishung5 authored Oct 23, 2024
1 parent 210a400 commit 841464c
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion Conceptual_Guide/Part_8-semantic_caching/README.md
@@ -286,7 +286,7 @@ sys 0m0.015s
Now, let's try a different response, but keep the semantics:

```bash
-time curl -X POST localhost:8000/v2/models/vllm_model/generate -d '{"text_input": "How do I set up model repository for Triton Inference Server?", "parameters": {"stream": false, "temperature": 0, "max_tokens":100}, "exclude_input_in_output":true}
+time curl -X POST localhost:8000/v2/models/vllm_model/generate -d '{"text_input": "How do I set up model repository for Triton Inference Server?", "parameters": {"stream": false, "temperature": 0, "max_tokens":100}, "exclude_input_in_output":true}'
```

Upon success, you should see a response from the server like this one: