json: document schema conversion in GBNF readme, align manual grammar examples & converters #7841
@@ -57,7 +57,7 @@ std::unordered_map<std::string, BuiltinRule> PRIMITIVE_RULES = {
     {"object", {"\"{\" space ( string \":\" space value (\",\" space string \":\" space value)* )? \"}\" space", {"string", "value"}}},
     {"array", {"\"[\" space ( value (\",\" space value)* )? \"]\" space", {"value"}}},
     {"uuid", {"\"\\\"\" [0-9a-fA-F]{8} \"-\" [0-9a-fA-F]{4} \"-\" [0-9a-fA-F]{4} \"-\" [0-9a-fA-F]{4} \"-\" [0-9a-fA-F]{12} \"\\\"\" space", {}}},
-    {"char", {"[^\"\\\\] | \"\\\\\" ([\"\\\\/bfnrt] | \"u\" [0-9a-fA-F]{4})", {}}},
+    {"char", {"[^\"\\\\\\x7F\\x00-\\x1F] | [\\\\] ([\"\\\\bfnrt] | \"u\" [0-9a-fA-F]{4})", {}}},
Should these possibly be updated to use the new `.` operator, or is now not the time for that?
Oh I see, we can't do that, because we need to exclude backslashes from this list. Never mind, carry on! :)
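To make the point about exclusions concrete, here is a minimal C++ sketch of what the updated `char` rule accepts. It assumes the rule reads as in the added diff line (no raw quote, backslash, DEL, or C0 control characters, plus a reduced escape set). `match_json_char` is a hypothetical helper written for this illustration, not part of the converter, and it works on bytes rather than code points for simplicity.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of llama.cpp): returns how many bytes of `s`,
// starting at `pos`, the updated `char` rule would consume, or 0 on no match.
// Simplification: input is treated as bytes, not Unicode code points.
std::size_t match_json_char(const std::string & s, std::size_t pos) {
    if (pos >= s.size()) return 0;
    const unsigned char c = static_cast<unsigned char>(s[pos]);

    // First alternative: [^"\\\x7F\x00-\x1F]
    // A wildcard that matches any character would also let `"` and `\`
    // through, which is why the explicit negated class is still needed.
    if (c != '"' && c != '\\' && c != 0x7F && c > 0x1F) return 1;

    // Second alternative: [\\] (["\\bfnrt] | "u" [0-9a-fA-F]{4})
    if (c != '\\' || pos + 1 >= s.size()) return 0;
    const char e = s[pos + 1];
    if (std::string("\"\\bfnrt").find(e) != std::string::npos) return 2;
    if (e == 'u' && pos + 6 <= s.size()) {
        // Require exactly four hex digits after "\u".
        for (std::size_t i = pos + 2; i < pos + 6; ++i) {
            if (!std::isxdigit(static_cast<unsigned char>(s[i]))) return 0;
        }
        return 6;
    }
    return 0;
}
```

Under these assumptions a raw newline or an unescaped quote is rejected, while `\n` and `\u00e9` written as escape sequences are accepted. The `\/` escape is no longer matched, since `/` was dropped from the escape set in the added line.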
grammars/json.gbnf (outdated)
 # Optional space: by convention, applied in this grammar after literal chars when allowed
-ws ::= ([ \t\n] ws)?
+ws ::= [ \t\n]{,20}
This feels like a good change -- I like constraining the output in this way. Could even consider limiting it to something more restrictive like `{,4}` or `{,8}`, but this is a good start.
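As a rough illustration of what the bound changes, the sketch below checks whether a whitespace run fits `[ \t\n]{,N}` for a configurable `N`. The old recursive rule `([ \t\n] ws)?` has no such limit. `is_valid_ws` and its `max_ws` parameter are hypothetical names for this example only, and the default of 20 is taken from the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of llama.cpp): true if `s` is a whitespace run
// that the bounded rule `[ \t\n]{,max_ws}` would accept in full.
bool is_valid_ws(const std::string & s, std::size_t max_ws = 20) {
    // The repetition bound caps the length of the run.
    if (s.size() > max_ws) return false;
    // Only space, tab, and newline belong to the character class.
    for (char c : s) {
        if (c != ' ' && c != '\t' && c != '\n') return false;
    }
    return true;
}
```

With the default bound, a run of 20 spaces passes and a run of 21 does not. Passing `4` or `8` as `max_ws` models the tighter limits suggested in the comment above.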
I've drafted an updated space rule in #7866. No matter what the bound is with this current syntax, models like Llama-3-8B & Phi-3-mini seem keen to misuse it. But given near-unlimited indent space only (and only 1 newline at a time), they're very sensible.
Looks good to me! 👍
`JSON Schemas → GBNF` section to the grammar readme

cc/ @HanClinto @ExtReMLapin