Added support for . (any character) token in grammar engine. #6467
Conversation
Force-pushed from da3dc77 to 9a3acbb
Matching any character seems like it could be a useful addition, but I agree it would be better to first focus on improving grammar tests (and potentially performance). We can revisit this addition at a later point.
Sounds great! That's where I'll turn my attention next -- I've done some profiling, and have some ideas in the works for how to improve the grammar sampler. The next step will be to add profiling to the grammar engine (I need to investigate the current state of benchmarks that include grammars). Meanwhile, this PR can stay here and we can return to it whenever we feel like it. Thank you!
Awesome! The grammar functionality is a great feature and it would be nice for it to get some extra attention. I sent you a collaborator invite, if you feel like helping out (no pressure if you don't have time / resources, this is mainly a token of appreciation at this point).
Wow, I am honored -- thank you very much!! I will do my best not to abuse the privilege. I agree about the grammar functionality being very powerful. I think that verifiable correctness is going to be one of the big ways that local LLMs can really gain some usefulness. I started doing some experiments a couple of months ago with text-to-SQL generation, and I think that using grammars to ensure syntactically correct SQL queries (even tuned for a person's specific database schema) offers a lot of potential to expand the usefulness of LLMs. That's what started me down this whole road of digging into grammars in llama.cpp, and I'm excited to see what the future holds for it. Thanks for everything!
@HanClinto Feel free to merge this if it is ready
Thank you! I'm much more familiar with the grammar engine now than I was when I first wrote this, so I'd like to look it all over again with fresh eyes. Overall I'm feeling much better about making changes like this to the grammar engine now that we have integration test coverage. @ochafik, I'm not sure what your availability is these days, but I wouldn't mind your critique at some point as well.
TODO before merge:
Noting that there appears to be general agreement to merge this PR; we're just waiting on someone to add the "." symbol to the integration tests. (Monitoring via this filter.)
Force-pushed from e56761d to 774e9f5
Rebased.
Looks good!
Force-pushed from 9e30513 to c1b89b8
Low priority feature. Consider this more of a suggestion than a request. :)
As came up in the discussion of #6441, I wanted a way to create a grammar that ensures a minimum response length. It seemed prudent to add support for a "." token that would match any generated character -- without it, I used the string `[^\x00]`, which matches any non-null character (which is very nearly the same thing). I don't know whether this token will be useful in many other situations, so feel free to leave this one out if you don't think it's worthy.
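As a rough illustration (the rule names `any-old` and `any-new` are mine, not from this PR), the workaround and the new token express nearly the same thing in GBNF:

```
# Pre-existing workaround: any non-null character, repeated
any-old ::= [^\x00]*

# With this PR: "." matches any single character, repeated
any-new ::= .*
```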
I didn't add this token to the grammar tests, because frankly I haven't really been able to wrap my head around them. I would still like to eventually get around to writing some end-to-end / integration tests for the grammar engine that are a bit easier to grok and extend, but unless otherwise requested, I'll leave that exercise for another PR.
Example usage:
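The original example block was not preserved here; as a minimal sketch of the minimum-response-length use case described above (the exact grammar in the PR may differ), a GBNF grammar using the new "." token might look like:

```
# Require at least ten characters of any kind,
# then allow any further characters.
root ::= . . . . . . . . . . .*
```

Such a grammar could be passed to the main example via `--grammar-file`, or inline with `--grammar`.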
Example output: