Replies: 5 comments 6 replies
-
Thanks, it's interesting that you posted this at this time. I've been thinking of moving away from my current implementation for function calling, and making an API-breaking change. First of all, I wrote it when it was still called "function calling", but now everyone calls it "tool use", so my naming is confusing. I also think the way I did it requires too much struct-building. I've recently added JSON mode, which also uses JSON Schema, but with some differences in how it's used compared to tool calling, and it ends up looking a bit more like what you have proposed. An example, taken from my integration tests, is:

```elisp
(llm-chat
 provider
 (llm-make-chat-prompt
  "List the 3 largest cities in France in order of population, giving the results in JSON."
  :response-format
  '(:type object
    :properties
    (:cities (:type array :items (:type string)))
    :required (cities))))
```

This isn't released yet, but I think it's a lighter-weight and easier-to-read way to specify a schema. I could change it to some other format before release, so now's a good time to change it to something that would work well both for it and for tool use. One question is how much of JSON Schema you want to support - for example, can arguments be more than just strings, integers, etc., and be objects that have their own structure?
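Serialized out, that plist would correspond to a JSON Schema along these lines (assuming keywords map to string keys and symbol lists to arrays):

```json
{
  "type": "object",
  "properties": {
    "cities": {
      "type": "array",
      "items": { "type": "string" }
    }
  },
  "required": ["cities"]
}
```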
-
Are you planning to remove the tool-use feature entirely and implement it via
I think the top-level struct for a tool/function-call spec makes sense; that API is stable. It was only the component struct (like args) that I thought might be too constraining.
Not quite, because the examples in the OpenAI/Anthropic API are using both:

```elisp
(:name "unit"
 :type "string"
 :enum ["celsius" "fahrenheit"]
 :description "The unit of temperature, either 'celsius' or 'fahrenheit'"
 :required nil)
```

```elisp
(llm-chat
 provider
 (llm-make-chat-prompt
  "List the 3 largest cities in France in order of population, giving the results in JSON."
  :response-format
  '(:type object
    :properties
    (:cities (:type array :items (:type string)))
    :required (cities))))
```
This method looks good for arbitrary JSON. Tool-use requires a more constrained schema, so a struct actually makes sense there.
How would you communicate composite types to the API?
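To make the composite-type question concrete: in the same plist schema style as the `:response-format` example above, a nested object argument might look something like this (a hypothetical sketch, not an existing format):

```elisp
;; Hypothetical: a "location" argument that is itself an object,
;; written in the same plist schema style as :response-format.
(:name "location"
 :type "object"
 :properties
 (:city (:type string)
  :country (:type string))
 :required (city))
```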
-
Could you let me know when you decide on a schema for specifying tools? I can try to stay close to it so we can share tools between LLM clients in the future.
I agree.
-
@ahyatt, how do you handle async tool-use? I've settled on a rather clumsy API and was wondering if you have a better solution. A synchronous tool-use function is defined as:

```elisp
(make-tool
 :function #'foo
 :description "Return ..."
 :args '((:name "arg1"
          :description "..."
          :type "string")))
```

and an async one as:

```elisp
(make-tool
 :function #'bar
 :description "Return ..."
 :args '((:name "arg1"
          :description "..."
          :type "string"))
 :async t)
```

The synchronous tool is called as

On a related note, how do you handle the difference between tools whose return value should be fed back to the LLM, and tools run for side-effects or not run at all because the LLM's tool-call JSON is all that was needed?
-
Thanks for the explanation; yes - your method is actually better for the cases in which the function that is getting called needs to itself be async. `llm` doesn't have a good way to handle that yet. We just wait until the function is finished, then update the prompt so that the next time the user calls (they have to re-use the same prompt struct) it has the right information.
I decided at the start that gptel will never block Emacs, and have since paid dearly with my time to uphold that principle!
As to how we handle both text & functions for Claude, we append the text to the prompt, after the function call results, so that the information is there for when the client calls Claude again.
Cool. In gptel I'm calling the callback twice, once with the text and again with the tool result.
But I may be forgetting some detail here; the whole dance you have to do with tool use in conversations is complicated, under-documented, and has several non-standard variations across different providers.
I finished implementing tool use for all the major APIs with and without streaming responses, and it was a big ol' mess. A lot of the demo code online, and even the official API documentation in the case of Gemini, is just flat out wrong. All the idiosyncrasies are fresh in my mind at the moment, so let me know if you need help with the details.
I think automatically feeding it back like you will be doing would be reasonable, but I think most of the time it isn't needed. If I rethink the tool use interface this weekend, I'll consider adopting your way.
All right.
-
Hi @ahyatt,
I'm adding tool-use to gptel and wanted to coordinate with you on the tool definition format. I think it would be good to have a community-maintained bank of commonly useful tool calls that can plug easily into all Emacs LLM clients. gptel uses a different internal data structure than llm to manage tools, so what do you think of defining tools as loosely-structured plists that we can both use?
I can explain why. Here's an example tool definition that can be read by both llm and gptel:
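Something like this, say (a sketch of the kind of plist I mean; the exact keys are up for discussion):

```elisp
;; Sketch of a shared, loosely-structured plist tool definition.
;; The exact keys are a proposal, not a settled format.
(:name "get-weather"
 :description "Get the current weather in a given location."
 :args ((:name "location"
         :description "The city and state, e.g. San Francisco, CA."
         :type "string"
         :required t))
 :function get-weather)
```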
The repo would contain this piece of data along with an implementation of `get-weather`. This example is useless, but you can imagine commonly useful tools, like ones that fetch web video or Google Scholar results, or results from Info manuals.

Here's how `llm` could import this:

gptel can do something similar to convert the data into its internal tool structure.
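Concretely, the `llm` import step might look something like this (a hypothetical sketch - the constructor names are guessed from `llm`'s current function-call structs, so treat the details as illustrative):

```elisp
;; Hypothetical conversion from the shared plist format into llm's
;; structs. Assumes a plist `spec' with :name, :description, :args
;; and :function keys; the constructor names are a guess.
(defun my/llm-tool-from-plist (spec)
  (make-llm-function-call
   :name (plist-get spec :name)
   :description (plist-get spec :description)
   :function (symbol-function (plist-get spec :function))
   :args (mapcar (lambda (arg)
                   (make-llm-function-arg
                    :name (plist-get arg :name)
                    :description (plist-get arg :description)
                    :type (intern (plist-get arg :type))
                    :required (plist-get arg :required)))
                 (plist-get spec :args))))
```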
If you are interested in this idea, we can decide on a plist format. I have two points of feedback on the current implementation of tool definitions in `llm`, one minor and one major:

1. The `:required` key can be inverted to `:optional`, with a default value of `nil`. This way defining an argument works like in Emacs Lisp, and `:required` does not need to be specified: the shorter declaration will imply that an argument is required, while `:optional` explicitly specifies that it's optional, like `&optional` in an elisp function. I would expect optional arguments to be rarer across tool definitions than required ones.

2. The `:enum` field: it is currently not allowed by `make-llm-function-arg`. I don't know what fields the full JSON Schema allows here, but I'm guessing that restricting them in `make-llm-function-arg` might cause issues. In gptel I'm currently just using a plist for the function arg spec.