
Allow bypassing the chat template via api #1862

Open
K0IN opened this issue Jan 15, 2025 · 1 comment

K0IN commented Jan 15, 2025

Problem Statement

I am building an app to visualize logprobs. One feature is to restart generation at a chosen token: if the model responds with text, you can pick a token and restart generation from there with a different suggestion (basically forcing the sampler down a different logprob path).

For this I need a way to complete a partial LLM response (which may resume in the middle of a message).

Feature Idea

I need a way to disable prompt formatting (so I can take care of prompt formatting and preparation myself), or a way to "restart" response generation from a partial message. A rough sketch of what that could look like is below.
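For illustration only, a minimal sketch of what a hypothetical `raw` flag on the completions endpoint could look like. The port, model name, and the `raw` parameter itself are assumptions about the proposed feature, not an existing API:

```python
import requests

# Hypothetical request: "raw" is the proposed flag, not an existing parameter.
# The prompt already contains the chat-template markup plus the partial
# assistant response we want the model to continue.
resp = requests.post(
    "http://localhost:1337/v1/completions",  # assumed local server address
    json={
        "model": "llama3.1-8b-instruct",
        "prompt": "<formatted chat prompt>...<partial assistant text>",
        "raw": True,        # proposed: skip server-side chat templating
        "max_tokens": 64,
        "logprobs": 5,      # needed to visualize alternative token paths
    },
)
print(resp.json()["choices"][0]["text"])
```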

Do other engines do this?
Yes, ollama has a "raw" flag (though not in OpenAI-compatible mode), as in the sketch below.
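A minimal sketch of ollama's `raw` mode against its native `/api/generate` endpoint (model name and prompt are placeholders):

```python
import requests

# With "raw": true, ollama sends the prompt to the model verbatim,
# without applying the model's chat template.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.1",
        "prompt": "<formatted prompt including the partial response to continue>",
        "raw": True,
        "stream": False,
    },
)
print(resp.json()["response"])
```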

vLLM's completion path accepts plain text, so you can use the Python API to pass in your specially crafted prompt.
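For example, a sketch using vLLM's offline Python API (model name is a placeholder); `generate` takes the prompt verbatim, with no chat template applied:

```python
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")

# The prompt is passed through as-is; we control the template and can
# end it mid-response to force a continuation from a chosen token.
params = SamplingParams(max_tokens=64, logprobs=5)
outputs = llm.generate(["<formatted prompt ending mid-response>"], params)
print(outputs[0].outputs[0].text)
```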

Why I want this: I came across a paper stating that most LLMs will produce chain-of-thought reasoning on their own if you sample alternative decoding paths from the first token's logprobs (or at least it's likely):

https://arxiv.org/pdf/2402.10200

@github-project-automation github-project-automation bot moved this to Investigating in Menlo Jan 15, 2025

K0IN commented Jan 15, 2025

A nice quality-of-life feature would also be a formatting endpoint that just returns the formatted prompt: an OpenAI-compatible request in, the templated string out. A sketch of what that endpoint would compute is below.
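Such an endpoint would essentially expose the model's chat template. A minimal sketch of the equivalent computation with Hugging Face tokenizers (model name is a placeholder; this is not Jan's implementation):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

# OpenAI-style messages in ...
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]

# ... templated string out: the text the server would feed the model.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```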
