Support structured Ollama output #29

Open
florianm opened this issue Dec 7, 2024 · 3 comments

Comments

florianm commented Dec 7, 2024

With Ollama now supporting structured output, would it be feasible to support the 'format' parameter in rollama?

Source: https://ollama.com/blog/structured-outputs

edubruell commented Dec 7, 2024

The dev version of tidyllm already supports schemas with Ollama 0.5.0, so implementing it here should be relatively easy. It only took me about 5 minutes to add JSON schema support to my Ollama functions. I could contribute some code if that helps.

elmer has much nicer schema definition functions than tidyllm, and I have already asked whether they want to export their S7 schema object definitions, since I want to make my package compatible with elmer schemata. Perhaps this could be a way to implement really nice schema support. Strangely, since the Ollama function in elmer is only a wrapper around their OpenAI function, I am not sure schemata work with Ollama in elmer, because the parameters for setting schemata are named differently in OpenAI and Ollama.

JBGruber added a commit that referenced this issue Dec 7, 2024
Owner

JBGruber commented Dec 7, 2024

For the basic support of this, we only needed to change the data preparation a little. Now this works (don't forget to update Ollama to v0.5.0):

library(rollama)
res <- query(q = "Tell me about Canada.", model = "llama3.1", format = jsonlite::fromJSON('{
    "type": "object",
    "properties": {
      "name": {
        "type": "string"
      },
      "capital": {
        "type": "string"
      },
      "languages": {
        "type": "array",
        "items": {
          "type": "string"
        }
      }
    },
    "required": [
      "name",
      "capital", 
      "languages"
    ]
  }'))
#> 
#> ── Answer from llama3.1 ────────────────────────────────────────────────────────
#> { "capital": "Ottawa", "languages": ["english"], "name": "canada" }

Or passing a list directly:

schema <- list(
  type = "object",
  properties = list(
    name = list(type = "string"),
    capital = list(type = "string"),
    languages = list(
      type = "array",
      items = list(type = "string")
    )
  ),
  required = c("name", "capital", "languages")
)

res <- query(q = "Tell me about Canada.", model = "llama3.1", format = schema)
#> 
#> ── Answer from llama3.1 ────────────────────────────────────────────────────────
#> { "capital": "Ottawa", "languages": ["English", "French"], "name": "Canada" }

Created on 2024-12-07 with reprex v2.1.1
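
The structured answer is plain JSON text, so it can be turned into an R list for further processing. The following is a minimal sketch that uses only jsonlite; the answer string is copied from the second example above rather than fetched from a model, and the required-field check is an illustrative addition, not something rollama is known to do for you.

```r
library(jsonlite)

# Answer text copied verbatim from the example output above
answer <- '{ "capital": "Ottawa", "languages": ["English", "French"], "name": "Canada" }'

# Parse the JSON text into a named R list
parsed <- fromJSON(answer)

# Check that every field the schema marked as required is present
required_fields <- c("name", "capital", "languages")
missing_fields <- setdiff(required_fields, names(parsed))
if (length(missing_fields) > 0) {
  stop("Missing fields: ", paste(missing_fields, collapse = ", "))
}

parsed$capital
parsed$languages
```

With the schema in place, the field names are predictable, which is what makes accessing `parsed$capital` and `parsed$languages` directly reasonable.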

But I agree with @edubruell that it would be nice to make this a little easier for users. Supporting tidyllm's tidyllm_schema or elmer's type_object would be cool!
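
As a rough idea of what "a little easier" could look like, here is a sketch of a small helper that builds a flat object schema from named type strings. `simple_schema()` is a hypothetical name and is not part of rollama, tidyllm, or elmer; it only covers scalar JSON Schema types (no arrays or nesting). It also assumes the request body is later serialized with `auto_unbox = TRUE`, which is an assumption about the client, so `required` is stored as a list to stay a JSON array even when there is only one field.

```r
library(jsonlite)

# Hypothetical helper (not part of rollama, tidyllm, or elmer):
# builds a flat JSON Schema object from named type strings,
# e.g. simple_schema(name = "string", population = "number").
simple_schema <- function(...) {
  fields <- list(...)
  stopifnot(
    length(fields) > 0,
    !is.null(names(fields)),
    all(names(fields) != "")
  )
  list(
    type = "object",
    # One {"type": ...} entry per named field
    properties = lapply(fields, function(type) list(type = type)),
    # as.list() keeps `required` a JSON array even for a single field
    required = as.list(names(fields))
  )
}

schema <- simple_schema(name = "string", capital = "string")

# Inspect the JSON that would be sent as the `format` value
toJSON(schema, auto_unbox = TRUE, pretty = TRUE)
```

The resulting list has the same shape as the hand-written `schema` list above (minus the array field), so under that assumption it could be passed to `format` in the same way.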

@edubruell

Great!

By the way, I just modified tidyllm_schema() a bit to get rid of the unusual OpenAI-like wrapping of the schema in a list; the OpenAI-specific parts are now handled only in my OpenAI functions. So tidyllm_schema() now works out of the box with rollama:

library(tidyllm)
library(rollama)

address_schema <- tidyllm_schema(
  street = "character",
  houseNumber = "numeric",
  postcode = "character",
  city = "character",
  region = "character",
  country = "factor(Germany,France)"
)

res <- query(q = "Imagine an address in Mannheim.", model = "gemma2", format = address_schema)
                     
#> ── Answer from gemma2 ──────────────────
#> {
#> "city": "Mannheim",
#> "country": "Germany",
#> "houseNumber": 42,
#> "postcode": "68169",
#> "region": "Rhein-Neckar",
#> "street": "Luisenstraße"
#> }

If it is easier for your users, you could just copy the schema function from tidyllm into your codebase under a different name to avoid namespace conflicts. For elmer:::TypeObject and the like, you would need a few S7 methods similar to the ones that are currently commented out in tidyllm_schema.R in the tidyllm repo.
