Support structured responses from the ChatGPT API #735

joshuacoles · 2025-02-25T14:54:29Z

This PR implements the response_format property from the OpenAI API Spec (link) to allow specifying. It is implemented using the Low Level Guidance project which allows token masking based on JSON schemas and a variant of BNF grammars.

There is support for JSON decoding, both with and without a supplied schema (although only supports a subset of JSON schemas). This is done by incorporating the masking process into the BufferedOutput object introduced in #734.

As with my other PRs this builds on them incrementally the others (#734, #720, #716) should be reviewed prior to this, I will keep this branch up to date with any changes from the prior PRs.

Still todo

Testing
Performance testing to minimise regressions in inference, especially when doing so without a grammar.

structure

…on type

…ocess_inference_result`

…o more closely align with OpenAI

…tion from the ChatGPT API

…und for textual stop sequence matches

This removes direct references the internals of BufferedOutput to allow for better abstraction

…quences

…anch

joshuacoles and others added 30 commits February 19, 2025 15:47

Strip EOS token from output to mirror the OpenAI behaviour

0560640

Add "[DONE]" message and change streaming response to mirror OpenAI

40e8619

structure

Fix bench.py to support "[DONE]" terminating event in stream

fed063c

Resolve out of range error in debug line

3daa77d

Add generation options protocol buf definition and corresponding pyth…

510d1b2

…on type

Fish generation_options through from the ChatGPT API request to `pr…

afaf6e6

…ocess_inference_result`

Apply the generation options to inference

5855230

Emit the finish reason completion chunk separately from the content t…

7213ccb

…o more closely align with OpenAI

Add stop to the ChatGPT API and GenerationOptions

9f040ad

Refactor process_inference_result

33edde6

Add finish_reason to the SendResultRequest proto type

eeba9e7

Pipe finish reason around the place

638c4bb

Add finish_reason extraction to ChatGPT API

1ae0002

Add missing parameter to update_topology_viz

cc1b9da

Move finish reason determination to the process_inference_result func…

74c07b9

…tion from the ChatGPT API

Move to BufferedOutput object

fc6153f

Add BufferedOutput#token_count

2c14c39

Move next token determination to BufferedOutput

b0cfa1c

Delay emission by a number of tokens to simulate keeping a buffer aro…

a1db802

…und for textual stop sequence matches

Return None from process_inference_result as it is unused.

c4f3ba0

This removes direct references the internals of BufferedOutput to allow for better abstraction

Resolve issue with stop parameters in GRPC communication

f5d49ef

Skip empty completions before finish as caused by buffering

d680df0

Resolve issue with shared array being used for BufferedOutput

a08b6d3

Move more logic into BufferedOutput

9c9355f

Move to keeping the text and tokens so that we can search for stop se…

d74e423

…quences

Initial stop sequence work

1a6b7bf

Search for stop sequences in the entire string

57dd76a

Handle partial tokens and retokenise when tokens are split

908eda9

Buffer by character count rather than token count

e3622f1

Handle no stop sequences being provided

c8c64e9

joshuacoles and others added 17 commits February 21, 2025 09:52

Fix issue with finish_reason not being defined in stable diffusion br…

1e54165

…anch

Implement hardcoded JSON schema guided generation as test case

b6cc58b

Fix None mask handling

9679f33

Handle stopping better

fbe56ff

Adapt test prompt

0770249

Add support for additional response formats

643410b

Read the response format from the ChatGPT API request

114f4ee

Cleanup testing files

1808c0c

Add llguidance dependency to setup.py

ce0e5e3

Move grammar determination into the ResponseFormat types

7f8a431

Add a note about handling stop reasons from llguidance

3f6b359

Stop reading the JSON grammar from a file

5a4d472

Start passing the grammar definition around directly

a88aeb0

Generate types

e437c3c

Rename constant to be more clear

d615fc5

Pass additional GenerationOptions fields through GRPC

92d41a0

Move types back

099507c

joshuacoles force-pushed the structured-output branch from f19add6 to 099507c Compare February 26, 2025 13:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support structured responses from the ChatGPT API #735

Support structured responses from the ChatGPT API #735

joshuacoles commented Feb 25, 2025

Support structured responses from the ChatGPT API #735

Are you sure you want to change the base?

Support structured responses from the ChatGPT API #735

Conversation

joshuacoles commented Feb 25, 2025

Still todo