
New logging schema for conversations and continue mode #91

Closed
simonw opened this issue Jul 11, 2023 · 6 comments
Labels
enhancement New feature or request


simonw commented Jul 11, 2023

To implement continue mode I'm going to need to persist these things to the database.

Which means I need a whole new schema, since I'm switching to using ULID IDs as part of this work.

Originally posted by @simonw in #85 (comment)

@simonw simonw added this to the 0.5 milestone Jul 11, 2023
@simonw simonw added the enhancement New feature or request label Jul 11, 2023

simonw commented Jul 11, 2023

Current schema:

llm/docs/logging.md

Lines 78 to 91 in 23eeb0f

CREATE TABLE "logs" (
   [id] INTEGER PRIMARY KEY,
   [model] TEXT,
   [prompt] TEXT,
   [system] TEXT,
   [prompt_json] TEXT,
   [options_json] TEXT,
   [response] TEXT,
   [response_json] TEXT,
   [reply_to_id] INTEGER REFERENCES [logs]([id]),
   [chat_id] INTEGER REFERENCES [logs]([id]),
   [duration_ms] INTEGER,
   [datetime_utc] TEXT
);

But I want to log conversations and responses instead. Here's what those look like at the code level right now:

llm/llm/models.py

Lines 47 to 52 in 23eeb0f

@dataclass
class Conversation:
    model: "Model"
    id: str = field(default_factory=lambda: str(ULID()).lower())
    name: Optional[str] = None
    responses: List["Response"] = field(default_factory=list)

llm/llm/models.py

Lines 74 to 89 in 23eeb0f

class Response(ABC):
    def __init__(
        self,
        prompt: Prompt,
        model: "Model",
        stream: bool,
        conversation: Optional[Conversation] = None,
    ):
        self.prompt = prompt
        self._prompt_json = None
        self.model = model
        self.stream = stream
        self._chunks: List[str] = []
        self._done = False
        self._response_json = None
        self.conversation = conversation

I think I'm going to replace or remove the LogMessage class entirely:

llm/llm/models.py

Lines 32 to 44 in 23eeb0f

@dataclass
class LogMessage:
    model: str  # Actually the model.model_id string
    prompt: str  # Simplified string version of prompt
    system: Optional[str]  # Simplified string of system prompt
    prompt_json: Optional[Dict[str, Any]]  # Detailed JSON of prompt
    options_json: Dict[str, Any]  # Any options e.g. temperature
    response: str  # Simplified string version of response
    response_json: Optional[Dict[str, Any]]  # Detailed JSON of response
    reply_to_id: Optional[int]  # ID of message this is a reply to
    chat_id: Optional[int]  # ID of chat this is a part of (ID of first message in thread)
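As an illustration of the new shape, here is a simplified stand-in sketch (not the library's real classes, and `uuid4` substitutes for ULID here to avoid a dependency) showing how a conversation just accumulates response objects:

```python
from dataclasses import dataclass, field
from typing import List, Optional
import uuid

# Simplified stand-ins for llm's Conversation/Response classes;
# a real ULID would be the id, uuid4().hex is a dependency-free substitute.
@dataclass
class Response:
    prompt: str
    text: str

@dataclass
class Conversation:
    model: str
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    name: Optional[str] = None
    responses: List[Response] = field(default_factory=list)

conversation = Conversation(model="gpt-3.5-turbo")
conversation.responses.append(Response("Three pelican names", "Percy, Pip, Paula"))
conversation.responses.append(Response("Two more", "Penny, Pablo"))
# Continue mode replays conversation.responses as context for the next prompt.
```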


simonw commented Jul 11, 2023

I'm a bit nervous about these ULIDs, which look like this:

>>> str(ulid.ULID()).lower()
'01h51r2j69dbj1qma2874bywvw'
>>> str(ulid.ULID()).lower()
'01h51r2jvwkx2d8t8dweynrkvb'
>>> str(ulid.ULID()).lower()
'01h51r2kay2hqj9ymw5fmckmhw'

(They are case-insensitive; I think lower-case is visually prettier.)
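A ULID is a 48-bit millisecond timestamp followed by 80 random bits, Crockford base32 encoded into 26 characters, which is why they sort chronologically. A minimal stdlib-only sketch of that composition (not the actual python-ulid implementation):

```python
import os
import time

# Crockford base32 alphabet, lower-cased (no i, l, o or u) - same as ULID uses.
ALPHABET = "0123456789abcdefghjkmnpqrstvwxyz"

def make_ulid(timestamp_ms=None):
    """48-bit millisecond timestamp + 80 random bits, as 26 base32 chars."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    value = (timestamp_ms << 80) | int.from_bytes(os.urandom(10), "big")
    chars = []
    for _ in range(26):  # 26 * 5 = 130 bits of capacity; top 2 bits are zero
        chars.append(ALPHABET[value & 0b11111])
        value >>= 5
    return "".join(reversed(chars))

print(make_ulid())  # e.g. '01h51r2j69dbj1qma2874bywvw'
```

Because the timestamp occupies the most significant bits, sorting the strings lexicographically sorts records by creation time.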

The downside of these is that they aren't things people can type, unlike llm --chat 34 "continue prompt". People will have to copy-and-paste them.

But... I expect most CLI usage of the continue mode to use llm --continue instead, which just uses the most recent ID without you needing to specify it.

The benefit of them is that they're globally unique, like UUIDs - which is great news if you want to e.g. run prompts on your local machine and then upload them to a shared space later.

Since shared prompt libraries feel like a useful thing to support, I'm going to use ULIDs.


simonw commented Jul 11, 2023

Calling them "conversations" also means the llm --chat 34 option should perhaps be renamed to something a bit longer:

llm --conversation 34

Since -c is already taken by --continue, maybe I should use a surprising shortcut letter like -x?


simonw commented Jul 11, 2023

I think the option is --cid - where C ID is short for Conversation ID.


simonw commented Jul 11, 2023

OK, I got the new schema working and got --continue mode working too - but I still need to update how llm logs works, get all the existing tests passing, and add new tests and documentation.

I'll move that work to a PR.


simonw commented Jul 11, 2023

Here's the new schema:

CREATE TABLE [conversations] (
   [id] TEXT PRIMARY KEY,
   [name] TEXT,
   [model] TEXT
);
CREATE TABLE [responses] (
   [id] TEXT PRIMARY KEY,
   [model] TEXT,
   [prompt] TEXT,
   [system] TEXT,
   [prompt_json] TEXT,
   [options_json] TEXT,
   [response] TEXT,
   [response_json] TEXT,
   [conversation_id] TEXT REFERENCES [conversations]([id]),
   [duration_ms] INTEGER,
   [datetime_utc] TEXT
);
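The schema above can be exercised directly with Python's built-in sqlite3 module (an illustrative sketch; llm itself manages its database through sqlite-utils), using the example ULIDs from earlier in this thread as keys:

```python
import sqlite3

SCHEMA = """
CREATE TABLE [conversations] (
   [id] TEXT PRIMARY KEY,
   [name] TEXT,
   [model] TEXT
);
CREATE TABLE [responses] (
   [id] TEXT PRIMARY KEY,
   [model] TEXT,
   [prompt] TEXT,
   [system] TEXT,
   [prompt_json] TEXT,
   [options_json] TEXT,
   [response] TEXT,
   [response_json] TEXT,
   [conversation_id] TEXT REFERENCES [conversations]([id]),
   [duration_ms] INTEGER,
   [datetime_utc] TEXT
);
"""

db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)

# One conversation with two responses, keyed by ULIDs:
cid = "01h51r2j69dbj1qma2874bywvw"
db.execute(
    "insert into conversations (id, model) values (?, ?)",
    (cid, "gpt-3.5-turbo"),
)
for rid, prompt, response in [
    ("01h51r2jvwkx2d8t8dweynrkvb", "Three pelican names", "Percy, Pip, Paula"),
    ("01h51r2kay2hqj9ymw5fmckmhw", "Two more", "Penny, Pablo"),
]:
    db.execute(
        "insert into responses (id, model, prompt, response, conversation_id)"
        " values (?, ?, ?, ?, ?)",
        (rid, "gpt-3.5-turbo", prompt, response, cid),
    )

# Continue mode needs every prior response for a conversation, oldest first.
# Because ULIDs sort chronologically, ordering by id gives creation order:
rows = db.execute(
    "select prompt, response from responses"
    " where conversation_id = ? order by id",
    (cid,),
).fetchall()
print(rows)
```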
