
feat: [support return multiple choices] #264

Closed
nguyenhoangthuan99 opened this issue Oct 24, 2024 · 3 comments · Fixed by #274

nguyenhoangthuan99 commented Oct 24, 2024

Problem

  • Support param: `n` (integer or null)
  • Optional
  • Defaults to 1
    How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

-> Need to check whether llama.cpp supports this option.

reference: https://platform.openai.com/docs/api-reference/chat/create#chat-create-n

related issue: https://github.com/janhq/internal/issues/160
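The requested behavior can be sketched on the client side. Below is a minimal, hypothetical helper (not part of cortex.cpp) illustrating the `n` parameter semantics described above: optional, defaults to 1 when omitted, and must be a positive integer. The function name `build_chat_payload` is an assumption for illustration only.

```python
import json

def build_chat_payload(model, messages, n=None):
    """Build an OpenAI-style chat completion payload.

    `n` requests that many completion choices per input; it is
    optional and the server defaults to 1 when it is omitted.
    """
    payload = {"model": model, "messages": messages}
    if n is not None:
        if not isinstance(n, int) or n < 1:
            raise ValueError("n must be a positive integer or omitted")
        payload["n"] = n
    return payload

payload = build_chat_payload(
    "meta-llama3.1-8b-instruct",
    [{"role": "user", "content": "Who won the world series in 2020?"}],
    n=3,
)
print(json.dumps(payload))
```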


nguyenhoangthuan99 commented Oct 28, 2024

According to this comment, llama.cpp doesn't support it yet.


This issue needs to be transferred and handled at the cortex.cpp layer.

@nguyenhoangthuan99 nguyenhoangthuan99 self-assigned this Oct 28, 2024
@nguyenhoangthuan99 nguyenhoangthuan99 moved this from Investigating to In Progress in Menlo Oct 29, 2024
@nguyenhoangthuan99 nguyenhoangthuan99 moved this from In Progress to Investigating in Menlo Oct 29, 2024
@nguyenhoangthuan99 nguyenhoangthuan99 moved this from Investigating to In Progress in Menlo Oct 31, 2024
@nguyenhoangthuan99 nguyenhoangthuan99 moved this from In Progress to In Review in Menlo Oct 31, 2024

nguyenhoangthuan99 commented Oct 31, 2024

Now we can get multiple choices from one request by adding the `n` param to the input:

curl http://localhost:3928/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "engine": "cortex.llamacpp",
    "model": "meta-llama3.1-8b-instruct",
    "n_probs": 1,
    "stream": false,
    "top_k": 20,
    "n": 3,
    "messages": [
      {
        "role": "user",
        "content": "Who won the world series in 2020?"
      }
    ]
  }'

Response:

{
        "choices" : 
        [
                {
                        "finish_reason" : null,
                        "index" : 0,
                        "message" : 
                        {
                                "content" : "The Los Angeles Dodgers won the World Series in 2020. They defeated the Tampa Bay Rays in the series, winning four games to two. The final game was played on October 27, 2020.<|eot_id|>",
                                "role" : "assistant"
                        }
                },
                {
                        "finish_reason" : null,
                        "index" : 1,
                        "message" : 
                        {
                                "content" : "The Los Angeles Dodgers won the World Series in 2020. They defeated the Tampa Bay Rays in 6 games, winning the final game on October 27, 2020. This was their first championship since 1988.<|eot_id|>",
                                "role" : "assistant"
                        }
                },
                {
                        "finish_reason" : null,
                        "index" : 2,
                        "message" : 
                        {
                                "content" : "The Los Angeles Dodgers won the World Series in 2020.<|eot_id|>",
                                "role" : "assistant"
                        }
                }
        ],
        "created" : 1730345128,
        "id" : "kPlhopLJhYAQ0hQtCRVD",
        "model" : "_",
        "object" : "chat.completion",
        "system_fingerprint" : "_",
        "usage" : 
        {
                "completion_tokens" : 43,
                "prompt_tokens" : 21,
                "total_tokens" : 64
        }
}
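The response above can be checked programmatically. This is a minimal sketch (not part of cortex.cpp) that validates the shape such a response should have: with `n: 3`, exactly 3 choices are returned, indexed 0 through n-1. The sample JSON below is an abbreviated stand-in for the real response.

```python
import json

# Abbreviated stand-in for the response shown above.
response_text = """
{
  "object": "chat.completion",
  "choices": [
    {"index": 0, "message": {"role": "assistant", "content": "A"}},
    {"index": 1, "message": {"role": "assistant", "content": "B"}},
    {"index": 2, "message": {"role": "assistant", "content": "C"}}
  ]
}
"""

response = json.loads(response_text)
choices = response["choices"]

# With n=3 we expect 3 choices, indexed sequentially from 0.
assert len(choices) == 3
assert [c["index"] for c in choices] == list(range(len(choices)))

contents = [c["message"]["content"] for c in choices]
print(contents)
```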

@github-project-automation github-project-automation bot moved this from In Review to Review + QA in Menlo Oct 31, 2024

gabrielle-ong commented Nov 5, 2024

✅ QA API - thank you @nguyenhoangthuan99!

  • Requires upgrading the cortex.llamacpp engine to 0.1.37-01.11.24: `cortex-nightly engines install llama-cpp -v v0.1.37-01.11.24`
  • With n = 3, expect 3 choices returned
