chore(chat): Refactor prompt hierarchy #834
Conversation
```ts
interface DocsRequestMetadata {
  intent: 'docs';
  chatId: string;
  docsChatbotMessageId?: string;
}
```
This is a drive-by - I wanted to more strongly confine the metadata types to ensure we don't have `docsChatbotMessageId` unless the intent is `'docs'`.
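For readers skimming the thread: the pattern at work here is a TypeScript discriminated union on `intent`. A minimal sketch, where the non-docs variant and its intent names are hypothetical, not the PR's actual types:

```ts
// Minimal sketch of the discriminated-union approach; OtherRequestMetadata
// and its intent names are hypothetical, not the PR's actual types.
interface DocsRequestMetadata {
  intent: 'docs';
  chatId: string;
  docsChatbotMessageId?: string; // only expressible on the 'docs' variant
}

interface OtherRequestMetadata {
  intent: 'query' | 'schema'; // hypothetical intents
  chatId: string;
}

type RequestMetadata = DocsRequestMetadata | OtherRequestMetadata;

// The compiler now rejects the field outside the 'docs' intent:
// const bad: RequestMetadata = { intent: 'query', chatId: '1', docsChatbotMessageId: 'x' }; // error
```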
src/participant/prompts/namespace.ts (Outdated)
```diff
   }

-  static buildMessages({
+  async buildMessages({
```
This one is the only one I couldn't map super cleanly, so I had to override the base class implementation. I have some ideas for how to restructure the abstraction to support these more complex history/prompt rewrites, but those would be more complicated than the current implementation, and it didn't feel justified to pursue them until we have more than one use case.
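For illustration, a hedged sketch of the override pattern being described here; the class and field names are placeholders, not the PR's actual prompt hierarchy:

```ts
// Hedged sketch of a subclass overriding the base class's buildMessages;
// names and shapes are placeholders, not the actual classes from this PR.
type Message = { role: 'user' | 'assistant'; content: string };

abstract class PromptBase {
  async buildMessages(args: { history: Message[] }): Promise<Message[]> {
    // Default behavior: pass the history through unchanged.
    return args.history;
  }
}

class NamespacePrompt extends PromptBase {
  // The namespace prompt needs a more involved history rewrite, so it
  // overrides the base implementation instead of reusing it directly.
  override async buildMessages(args: {
    history: Message[];
  }): Promise<Message[]> {
    const rewritten = args.history.filter((m) => m.content.trim().length > 0);
    return super.buildMessages({ history: rewritten });
  }
}
```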
We'll revisit this function and also how the other prompt message building functions handle namespace and connect messages/prompts in:
https://jira.mongodb.org/browse/VSCODE-611
I think the other prompts would benefit from a similar prompt change when the last message was selecting a namespace.
Agreed - I was following up on this PR with telemetry stuff and noticed history scraping could be applied to all messages, not just namespace.
lgtm, I like the refactor to get the types more aligned. Two small suggestions; I'd like it if we can avoid the argument mutation one, but they're not blockers.
```diff
   The schema is generated from a sample of documents in the user's collection.
-  You must follows these rules.
+  You must follow these rules.
```
❤️
```diff
@@ -59,11 +58,14 @@ export async function getStringifiedSampleDocuments({
   }

   const stringifiedDocuments = toJSString(additionToPrompt);
   promptInputTokens = await model.countTokens(prompt + stringifiedDocuments);

+  // TODO: model.countTokens will sometimes return undefined - at least in tests. We should investigate why.
```
This was something Alena ran into: the model returned by `vscode.lm.selectChatModels` is always undefined in tests, so there's no model to count the tokens.
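As an aside, one hedged way to make the token counting resilient to that; a sketch, not the PR's actual code:

```ts
import * as vscode from 'vscode';

// Hedged sketch: fall back to a zero count when no model is available,
// e.g. in tests where vscode.lm.selectChatModels returns an empty array.
async function countPromptTokens(
  model: vscode.LanguageModelChat | undefined,
  text: string
): Promise<number> {
  return (await model?.countTokens(text)) ?? 0;
}
```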
lgtm! Nice update to share the connect message prompt across the rest of the prompts.
Description
This is a bit controversial, so I'm putting it up as a draft to get some feedback on the direction. It's preparation work for VSCODE-606 - it unifies all prompts under a common base class, which I'm planning to then extend with the capability to count user/participant message lengths, as well as a flag for whether sample docs have been included in the prompt (per the TD). While there are less disruptive ways to achieve this, this one felt the cleanest in terms of separation of concerns and extensibility.
It should also set us up for VSCODE-614 as abstracting the base functionality in a common place would allow us to count the tokens and drop history messages that push us past the model token limit.
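For a sense of where that could go, a hedged sketch of the kind of history trimming VSCODE-614 might entail; the helper name and shape are illustrative, not the planned API:

```ts
import * as vscode from 'vscode';

// Hedged sketch: drop the oldest history messages once the token budget is
// exceeded. Walks newest-first so the most recent context survives trimming.
async function trimHistoryToTokenLimit(
  model: vscode.LanguageModelChat,
  history: string[],
  maxTokens: number
): Promise<string[]> {
  const kept: string[] = [];
  let used = 0;
  for (const message of [...history].reverse()) {
    const tokens = await model.countTokens(message);
    if (used + tokens > maxTokens) {
      break;
    }
    kept.unshift(message);
    used += tokens;
  }
  return kept;
}
```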