feat: [support log prob like OpenAI API] #262
Comments
Hi @nguyenhoangthuan99, I tested the output of logprobs for OpenAI vs Cortex.

Output:

Code:

```python
# from https://cookbook.openai.com/examples/using_logprobs
from openai import OpenAI
import numpy as np

## OpenAI
# MODEL = "gpt-4o"
# client = OpenAI(
#     api_key="xxx"
# )

## Cortex
ENDPOINT = "http://localhost:39281/v1"
MODEL = "llama3.2:3b-gguf-q2-k"
client = OpenAI(
    base_url=ENDPOINT,
    api_key="not-needed"
)


def get_completion(
    messages: list[dict[str, str]],
    model: str = MODEL,
    max_tokens=500,
    temperature=0,
    stop=None,
    seed=123,
    tools=None,
    logprobs=None,  # whether to return log probabilities of the output tokens; if true, the log probability of each output token is returned in the content of message
    top_logprobs=None,
) -> str:
    params = {
        "messages": messages,
        "model": model,  # use the parameter 'model' instead of the global 'MODEL'
        "max_tokens": max_tokens,
        "temperature": temperature,
        "stop": stop,
        "seed": seed,
        "logprobs": logprobs,
        "top_logprobs": top_logprobs,
    }
    if tools:
        params["tools"] = tools
    completion = client.chat.completions.create(**params)
    return completion


CLASSIFICATION_PROMPT = """You will be given a headline of a news article.
Classify the article into one of the following categories: Technology, Politics, Sports, and Art.
Return only the name of the category, and nothing else.
MAKE SURE your output is one of the four categories stated.
Article headline: {headline}"""

headlines = [
    "Tech Giant Unveils Latest Smartphone Model with Advanced Photo-Editing Features.",
    "Local Mayor Launches Initiative to Enhance Urban Public Transport.",
    "Tennis Champion Showcases Hidden Talents in Symphony Orchestra Debut",
    "Dog eats Cat in a fight",
]

for headline in headlines:
    print(f"\nHeadline: {headline}")
    API_RESPONSE = get_completion(
        [{"role": "user", "content": CLASSIFICATION_PROMPT.format(headline=headline)}],
        model=MODEL,
        logprobs=True,
        top_logprobs=3,
    )
    top_logprobs_list = API_RESPONSE.choices[0].logprobs.content[0].top_logprobs
    for i, logprob in enumerate(top_logprobs_list, start=1):
        print(
            f"Output token {i}: {logprob.token}, "
            f"logprobs: {logprob.logprob}, "
            f"linear probability: {np.round(np.exp(logprob.logprob)*100,2)}%"
        )
```
Also tested with the code snippet in your PR #276: OpenAI returns negative logprobs, while Cortex returns positive values (0.2, 0.3, etc., which round to 0.0).

Cortex response:

OpenAI response:
The root cause is that llama.cpp only returns a confidence score (probability) for each token, not the log probs.
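For reference, a token probability p in (0, 1) always maps to a negative log probability, which is why OpenAI's values are negative. A minimal plain-Python sketch of the conversion (the token names and probabilities below are made-up examples, not actual Cortex output):

```python
import math

# Hypothetical per-token confidence scores, as llama.cpp-style probabilities.
token_probs = {"Sports": 0.55, "Art": 0.42, "Politics": 0.03}

for token, p in token_probs.items():
    logprob = math.log(p)                        # always negative for p < 1
    linear = round(math.exp(logprob) * 100, 2)   # recovers the original probability
    print(f"{token}: p={p}, logprob={logprob:.4f}, linear probability={linear}%")
```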
Hi @nguyenhoangthuan99, in particular I'm testing this headline: "Tennis Champion plays the drums in novel way" - I'm expecting something close to 50/50 between Sports and Art.
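For context (my own arithmetic, not Cortex or OpenAI output): an even split between two categories would show up as top_logprobs of roughly log(0.5) ≈ -0.69 for each candidate token.

```python
import math
# A true 50/50 split between "Sports" and "Art" implies logprobs near log(0.5)
# for both candidate tokens.
print(round(math.log(0.5), 4))  # -0.6931
```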
With cortex.llamacpp, we are using the probs output from the llama.cpp API:

```cpp
const auto* cur_p = common_sampler_get_candidates(slot.smpl);
result.tok = id;
for (size_t i = 0; i < (size_t)slot.sparams.n_probs; ++i) {
  result.probs.push_back({
      cur_p->data[i].id,
      i >= cur_p->size ? 0.0f : cur_p->data[i].p,
  });
}
```

So the probs are calculated directly by llama.cpp, and we just take the result and take the log of it. We can also adjust the sampling params to make the log probs look better, for example I add the

If I run this, the log probs look like this:

It won't return 0 and 100% all the time.
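For intuition on why the sampling params matter, here is a plain-Python sketch (not the llama.cpp implementation): the candidate probabilities come from a softmax over the logits, and settings such as temperature reshape that distribution, so a near 0/100% split at very low temperature becomes a softer spread at higher temperature. The logit values below are made up for illustration.

```python
import numpy as np

def candidate_probs(logits: np.ndarray, temperature: float) -> np.ndarray:
    """Temperature-scaled softmax, illustrating how sampling params reshape
    the per-token probabilities that end up in the logprobs output."""
    scaled = logits / temperature
    scaled -= scaled.max()          # subtract max for numerical stability
    exps = np.exp(scaled)
    return exps / exps.sum()

logits = np.array([5.0, 4.2, 1.0])  # hypothetical logits for 3 candidate tokens
for t in (0.1, 0.8, 1.5):
    probs = candidate_probs(logits, t)
    print(f"temperature={t}: probs={np.round(probs, 3)}, logprobs={np.round(np.log(probs), 3)}")
```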
Thanks @nguyenhoangthuan99 - noted that the log probs are coming from llama.cpp and depend on the sampling params. Marking as complete.
Problem
The API needs to support logprobs and top_logprobs: https://platform.openai.com/docs/api-reference/chat/create#chat-create-logprobs

The response object for non-stream mode: https://platform.openai.com/docs/api-reference/chat/object#chat/object-choices
The response object for stream mode: https://platform.openai.com/docs/api-reference/chat/streaming#chat/streaming-choices
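For reference, based on OpenAI's documented chat completion object, the non-stream response carries the log probability information under choices[0].logprobs.content. A hand-written sketch of the expected shape as a Python dict (field values are placeholders, not actual Cortex output):

```python
# Sketch of one element of `choices` with logprobs enabled,
# following the shape documented by OpenAI; values are placeholders.
expected_choice = {
    "index": 0,
    "message": {"role": "assistant", "content": "Sports"},
    "logprobs": {
        "content": [
            {
                "token": "Sports",
                "logprob": -0.31,
                "bytes": [83, 112, 111, 114, 116, 115],
                "top_logprobs": [
                    {"token": "Sports", "logprob": -0.31, "bytes": [83, 112, 111, 114, 116, 115]},
                    {"token": "Art", "logprob": -1.32, "bytes": [65, 114, 116]},
                ],
            }
        ]
    },
    "finish_reason": "stop",
}
```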
Related issue: https://github.com/janhq/internal/issues/160