
feat: [support return multiple choices] #264

Closed
nguyenhoangthuan99 opened this issue Oct 24, 2024 · 3 comments · Fixed by #274

nguyenhoangthuan99 commented Oct 24, 2024

Problem

  • Support param: `n` (integer or null)
  • Optional
  • Defaults to 1
    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

-> Need to check whether llama.cpp supports this option.

reference: https://platform.openai.com/docs/api-reference/chat/create#chat-create-n

related issue: https://github.com/janhq/internal/issues/160
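The requested behavior can be sketched on the client side. Below is a minimal, hypothetical helper (not part of cortex.cpp) illustrating the `n` parameter semantics described above: optional, defaults to 1 when omitted, and must be a positive integer. The function name `build_chat_payload` is an assumption for illustration only.

```python
import json

def build_chat_payload(model, messages, n=None):
    """Build an OpenAI-style chat completion payload.

    `n` requests that many completion choices per input; it is
    optional and the server defaults to 1 when it is omitted.
    """
    payload = {"model": model, "messages": messages}
    if n is not None:
        if not isinstance(n, int) or n < 1:
            raise ValueError("n must be a positive integer or omitted")
        payload["n"] = n
    return payload

payload = build_chat_payload(
    "meta-llama3.1-8b-instruct",
    [{"role": "user", "content": "Who won the world series in 2020?"}],
    n=3,
)
print(json.dumps(payload))
```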


nguyenhoangthuan99 commented Oct 28, 2024

According to this comment, llama.cpp doesn't support it yet.


This issue needs to be transferred and handled at the cortex.cpp layer.

@nguyenhoangthuan99 nguyenhoangthuan99 self-assigned this Oct 28, 2024
@nguyenhoangthuan99 nguyenhoangthuan99 moved this from Investigating to In Progress in Menlo Oct 29, 2024
@nguyenhoangthuan99 nguyenhoangthuan99 moved this from In Progress to Investigating in Menlo Oct 29, 2024
@nguyenhoangthuan99 nguyenhoangthuan99 moved this from Investigating to In Progress in Menlo Oct 31, 2024
@nguyenhoangthuan99 nguyenhoangthuan99 moved this from In Progress to In Review in Menlo Oct 31, 2024

nguyenhoangthuan99 commented Oct 31, 2024

Now we can get multiple choices from one request by adding the `n` param to the input:

curl http://localhost:3928/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "engine": "cortex.llamacpp",
    "model": "meta-llama3.1-8b-instruct",
    "n_probs": 1,
    "stream": false,
    "top_k": 20,
    "n": 3,
    "messages": [
      {
        "role": "user",
        "content": "Who won the world series in 2020?"
      }
    ]
  }'

Response:

{
        "choices" : 
        [
                {
                        "finish_reason" : null,
                        "index" : 0,
                        "message" : 
                        {
                                "content" : "The Los Angeles Dodgers won the World Series in 2020. They defeated the Tampa Bay Rays in the series, winning four games to two. The final game was played on October 27, 2020.<|eot_id|>",
                                "role" : "assistant"
                        }
                },
                {
                        "finish_reason" : null,
                        "index" : 1,
                        "message" : 
                        {
                                "content" : "The Los Angeles Dodgers won the World Series in 2020. They defeated the Tampa Bay Rays in 6 games, winning the final game on October 27, 2020. This was their first championship since 1988.<|eot_id|>",
                                "role" : "assistant"
                        }
                },
                {
                        "finish_reason" : null,
                        "index" : 2,
                        "message" : 
                        {
                                "content" : "The Los Angeles Dodgers won the World Series in 2020.<|eot_id|>",
                                "role" : "assistant"
                        }
                }
        ],
        "created" : 1730345128,
        "id" : "kPlhopLJhYAQ0hQtCRVD",
        "model" : "_",
        "object" : "chat.completion",
        "system_fingerprint" : "_",
        "usage" : 
        {
                "completion_tokens" : 43,
                "prompt_tokens" : 21,
                "total_tokens" : 64
        }
}
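The response above can be checked programmatically. This is a minimal sketch (not part of cortex.cpp) that validates the shape such a response should have: with `n: 3`, exactly 3 choices are returned, indexed 0 through n-1. The sample JSON below is an abbreviated stand-in for the real response.

```python
import json

# Abbreviated stand-in for the response shown above.
response_text = """
{
  "object": "chat.completion",
  "choices": [
    {"index": 0, "message": {"role": "assistant", "content": "A"}},
    {"index": 1, "message": {"role": "assistant", "content": "B"}},
    {"index": 2, "message": {"role": "assistant", "content": "C"}}
  ]
}
"""

response = json.loads(response_text)
choices = response["choices"]

# With n=3 we expect 3 choices, indexed sequentially from 0.
assert len(choices) == 3
assert [c["index"] for c in choices] == list(range(len(choices)))

contents = [c["message"]["content"] for c in choices]
print(contents)
```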

@github-project-automation github-project-automation bot moved this from In Review to Review + QA in Menlo Oct 31, 2024

gabrielle-ong commented Nov 5, 2024

✅ QA API - thank you @nguyenhoangthuan99!

  • Requires upgrading the cortex.llamacpp engine to 0.1.37-01.11.24: `cortex-nightly engines install llama-cpp -v v0.1.37-01.11.24`
  • With n = 3, expect 3 choices returned
