
feat(api): Runs endpoints #583

Merged 111 commits on Jun 18, 2024
306cbbe
Squash runs branch
CollectiveUnicorn Jun 13, 2024
cd30d02
Shuffle awaits to fix runtime error
CollectiveUnicorn Jun 13, 2024
800aa42
Removes parallel call duplicates
CollectiveUnicorn Jun 13, 2024
97dde73
Updates list_messages to handle None result
CollectiveUnicorn Jun 13, 2024
e912141
fix the lint
gphorvath Jun 13, 2024
9e2609b
Fixes validation check to remove error
CollectiveUnicorn Jun 13, 2024
e18a8df
wip: starting on runs tests
gphorvath Jun 13, 2024
f096a50
Adds user message to db when calling create thread and run
CollectiveUnicorn Jun 13, 2024
772cef5
Updates stream output to yield messages properly
CollectiveUnicorn Jun 13, 2024
3e56163
Ensure that the uuid exists for newly created messages
CollectiveUnicorn Jun 13, 2024
03b6ba4
Ruff linting
CollectiveUnicorn Jun 13, 2024
484259c
Sets parallel default to false
CollectiveUnicorn Jun 13, 2024
d84d847
added an actual run test
gphorvath Jun 13, 2024
ca9c00c
now with less blah
gphorvath Jun 13, 2024
e96841e
Adds fastapi exception handler for unprocessable entities and fixes t…
CollectiveUnicorn Jun 13, 2024
1adfce7
Adds missing import
CollectiveUnicorn Jun 13, 2024
3f00406
Moves validation logger into main
CollectiveUnicorn Jun 13, 2024
a6c2e8d
Changes how textblockparams are handled
CollectiveUnicorn Jun 13, 2024
5ddf03c
Removes unnecessary type validation
CollectiveUnicorn Jun 13, 2024
d42aa10
Uses SyncCursorPage for openai response
CollectiveUnicorn Jun 14, 2024
70afc23
Adds sync page to threads
CollectiveUnicorn Jun 14, 2024
4c4998c
Fixes SyncPage for threads and adds run_id to messages
CollectiveUnicorn Jun 14, 2024
e653e92
Ruff linting
CollectiveUnicorn Jun 14, 2024
7fefb00
Adds run_id back to message from run
CollectiveUnicorn Jun 14, 2024
a689fd3
Sets metadata default on assistant to none
CollectiveUnicorn Jun 14, 2024
b35c1ed
Assigns run to message
CollectiveUnicorn Jun 14, 2024
6912ed8
Adds assistant information to message
CollectiveUnicorn Jun 14, 2024
6690846
Fixes blank tool_resources issue
CollectiveUnicorn Jun 14, 2024
dd3a462
Adds SyncPage to vector store
CollectiveUnicorn Jun 14, 2024
4768cc2
Adds empty default metadata and fixes vector store file creation
CollectiveUnicorn Jun 14, 2024
b2aaa02
Fixes vector store file delete
CollectiveUnicorn Jun 14, 2024
3db496a
Ruff linting
CollectiveUnicorn Jun 14, 2024
e6d0fb7
fixing list messages test
gphorvath Jun 14, 2024
adb974d
fixing a bug with create_run
gphorvath Jun 14, 2024
d7f93e0
update runs tests
gphorvath Jun 14, 2024
33305b4
remove prints from runs tests
gphorvath Jun 14, 2024
48ea387
working on runs tests
gphorvath Jun 14, 2024
ecdffec
fix vector stores test to cleanup test file
gphorvath Jun 14, 2024
4163311
amping up the verbosity of local integration tests a bit
gphorvath Jun 14, 2024
c89dfda
Re-arranges chat message order, adds attachments, and removes zarf var
CollectiveUnicorn Jun 14, 2024
1d49e6a
Resolves missing function on message creation
CollectiveUnicorn Jun 14, 2024
d8b3dfa
Ruff linting and fix for bad assistant creation reference
CollectiveUnicorn Jun 14, 2024
c754b32
Deletes user_id prior to returning runs
CollectiveUnicorn Jun 14, 2024
03cb6af
Pulls tool resources from assistant or the run request
CollectiveUnicorn Jun 14, 2024
d5c84bf
Adds the file citations
CollectiveUnicorn Jun 14, 2024
54699ce
Adds file id to user input
CollectiveUnicorn Jun 14, 2024
414aeec
Ruff linting
CollectiveUnicorn Jun 14, 2024
c76cdfe
Ruff linting
CollectiveUnicorn Jun 14, 2024
17f60f9
Sets tool_choice to auto by default
CollectiveUnicorn Jun 14, 2024
351880d
Correctly apply annotation text
CollectiveUnicorn Jun 14, 2024
ef9863c
Refactors file citation
CollectiveUnicorn Jun 14, 2024
9266bdf
Refactors rag
CollectiveUnicorn Jun 14, 2024
d8794b3
Correctly adds fileids to llm response
CollectiveUnicorn Jun 14, 2024
f987ee6
Removes itertools
CollectiveUnicorn Jun 14, 2024
22f1612
Replaces list of ids with set
CollectiveUnicorn Jun 14, 2024
a5939a8
Reverse message order
CollectiveUnicorn Jun 14, 2024
e49e90c
Logging
CollectiveUnicorn Jun 14, 2024
523014e
Chat message order
CollectiveUnicorn Jun 14, 2024
2cb1cfd
Ruff linting
CollectiveUnicorn Jun 14, 2024
454959c
Adds empty array
CollectiveUnicorn Jun 14, 2024
8e7ce86
Log insertion and reverse order
CollectiveUnicorn Jun 14, 2024
79fa4ed
More ordering
CollectiveUnicorn Jun 14, 2024
5583cb3
More ordering
CollectiveUnicorn Jun 15, 2024
b18f5e8
Fixes timestamp issues with messages
CollectiveUnicorn Jun 15, 2024
4809573
Updates message created_at time once generation is complete
CollectiveUnicorn Jun 15, 2024
3ac0745
Removes time override
CollectiveUnicorn Jun 15, 2024
5eed82c
Sort after list retrieval
CollectiveUnicorn Jun 15, 2024
c6a375e
Logging
CollectiveUnicorn Jun 15, 2024
7599303
Logging typo
CollectiveUnicorn Jun 15, 2024
a34fdc5
Resolves annotation hallucinations
CollectiveUnicorn Jun 15, 2024
214ab82
Ruff linting
CollectiveUnicorn Jun 15, 2024
3e6e6e3
Fix up invalid reference
CollectiveUnicorn Jun 15, 2024
2ec5ce9
Removes annotations before passing them to the LLM
CollectiveUnicorn Jun 15, 2024
bf33cac
Ruff linting
CollectiveUnicorn Jun 15, 2024
f777c31
Use the updated message
CollectiveUnicorn Jun 15, 2024
498e295
Ruff linting
CollectiveUnicorn Jun 15, 2024
6ea7d51
Returns the receipt to being a user message
CollectiveUnicorn Jun 15, 2024
3e05c13
Format tests
CollectiveUnicorn Jun 15, 2024
8d7efee
updating to cascade delete runs when an assistant is deleted. Added t…
gphorvath Jun 16, 2024
ef81fa2
fixed an issue with modify run failing schema check for Run type and …
gphorvath Jun 16, 2024
51332f8
valid tool types on threads endpoint and test for it
gphorvath Jun 16, 2024
df7d71d
validating tools types on the rest of threads
gphorvath Jun 16, 2024
03f7b54
broke up threads/messages/runs routers and tests
gphorvath Jun 16, 2024
d233021
taking out crud_run_step because run steps aren't implemented yet
gphorvath Jun 16, 2024
e109f96
removing unused run steps schema
gphorvath Jun 16, 2024
0ac7cdb
fix filename typo
gphorvath Jun 16, 2024
b2471c6
remove redundant test config load
gphorvath Jun 16, 2024
0401a6c
change order of supabase/api packages, supabase migrations need to ap…
gphorvath Jun 17, 2024
5990b13
remove RAG deployment - the API supports this natively now
gphorvath Jun 17, 2024
10795b3
cleanup some WIP cruft in runs tests
gphorvath Jun 17, 2024
5941d51
Merge branch 'main' into 419-runs-endpoints-420-run-steps-endpoints
gphorvath Jun 17, 2024
cdc5590
fix an issue with validate tools
gphorvath Jun 17, 2024
f803f01
Merge branch '419-runs-endpoints-420-run-steps-endpoints' of github.c…
gphorvath Jun 17, 2024
ddcc282
runs endpoint needs to come before threads
gphorvath Jun 17, 2024
851c566
fix tools validation bug
gphorvath Jun 17, 2024
3c3ad91
fix: deleting user_id from create_run, adding more tests
gphorvath Jun 17, 2024
ed59cde
because line length
gphorvath Jun 17, 2024
c0ff640
fix issue with max tokens defaulting to 0
gphorvath Jun 17, 2024
6efc306
fixing issues in messages router
gphorvath Jun 17, 2024
ad8c74c
fixing issues in threads router
gphorvath Jun 17, 2024
b8b39ad
split out runs and run-steps into separate routers and working to cle…
gphorvath Jun 17, 2024
9ca1022
refixing a bug
gphorvath Jun 18, 2024
3ddafbb
moving away from typed_dict OpenAI types to Pydantic
gphorvath Jun 18, 2024
48014e7
adding type hints
gphorvath Jun 18, 2024
2020c61
check to make sure things exist before using them
gphorvath Jun 18, 2024
7175ea8
fix: simplifying message_content_part validation
gphorvath Jun 18, 2024
9e298c6
cleaning up typehints and checking metadata
gphorvath Jun 18, 2024
596e907
cleaning up edge cases in run_create_params_request_base.py
gphorvath Jun 18, 2024
8f8aa6e
cleaning up assistants router
gphorvath Jun 18, 2024
c551d4c
cleanup routes
gphorvath Jun 18, 2024
2c4e5ed
formatting
gphorvath Jun 18, 2024
2 changes: 1 addition & 1 deletion packages/api/chart/values.yaml
@@ -10,7 +10,7 @@ image:
fsGroup: 65532

supabase:
url: "https://supabase-kong.###ZARF_VAR_HOSTED_DOMAIN###"
url: "http://supabase-kong.leapfrogai.svc.cluster.local:80"

api:
replicas: 1
52 changes: 52 additions & 0 deletions packages/api/supabase/migrations/20240611111500_runs.sql
@@ -0,0 +1,52 @@
-- Create a table to store the OpenAI Run Objects
create table
run_objects (
id uuid primary key default uuid_generate_v4(),
user_id uuid references auth.users not null,
object text check (object in ('thread.run')),
created_at bigint default extract(epoch FROM NOW()) not null,
thread_id uuid references thread_objects (id) on delete cascade not null,
assistant_id uuid references assistant_objects (id) on delete cascade not null,
status text,
required_action jsonb,
last_error jsonb,
expires_at bigint,
started_at bigint,
cancelled_at bigint,
failed_at bigint,
completed_at bigint,
model text,
instructions text,
tools jsonb,
metadata jsonb,
parallel_tool_calls boolean,
stream boolean,
file_ids uuid[],
incomplete_details jsonb,
usage jsonb,
temperature float,
top_p float,
max_prompt_tokens int,
max_completion_tokens int,
truncation_strategy jsonb,
tool_choice jsonb,
response_format jsonb
);

-- RLS policies
alter table run_objects enable row level security;

-- Policies for run_objects
create policy "Individuals can view their own run_objects." on run_objects for
select using (auth.uid() = user_id);
create policy "Individuals can create run_objects." on run_objects for
insert with check (auth.uid() = user_id);
create policy "Individuals can update their own run_objects." on run_objects for
update using (auth.uid() = user_id);
create policy "Individuals can delete their own run_objects." on run_objects for
delete using (auth.uid() = user_id);

-- Indexes for common filtering and sorting for run_objects
CREATE INDEX run_objects_id ON run_objects (id);
CREATE INDEX run_objects_user_id ON run_objects (user_id);
CREATE INDEX run_objects_created_at ON run_objects (created_at);
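The migration stores `created_at` as a `bigint` of Unix epoch seconds (what `extract(epoch FROM NOW())` yields, truncated to an integer). A minimal sketch of how client code might round-trip that value with Python datetimes (hypothetical helpers, not part of this PR):

```python
from datetime import datetime, timezone


def created_at_to_datetime(created_at: int) -> datetime:
    """Convert a run_objects.created_at epoch-seconds bigint to an aware UTC datetime."""
    return datetime.fromtimestamp(created_at, tz=timezone.utc)


def datetime_to_created_at(dt: datetime) -> int:
    """Convert an aware datetime back to the epoch-seconds bigint the table stores."""
    return int(dt.timestamp())


ts = datetime_to_created_at(datetime(2024, 6, 18, tzinfo=timezone.utc))
print(ts)  # 1718668800
```

Storing epoch seconds rather than `timestamptz` keeps the column byte-compatible with the integer timestamps the OpenAI Run object uses.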
2 changes: 1 addition & 1 deletion src/leapfrogai_api/Makefile
@@ -36,4 +36,4 @@ env:
$(call get_jwt_token,"${SUPABASE_URL}/auth/v1/token?grant_type=password")

test-integration:
cd ../../ && python -m pytest tests/integration/api
cd ../../ && python -m pytest tests/integration/api/ -vv -s
107 changes: 107 additions & 0 deletions src/leapfrogai_api/backend/converters.py
@@ -0,0 +1,107 @@
"""Converters for the LeapfrogAI API"""

from typing import Iterable
from openai.types.beta import AssistantStreamEvent
from openai.types.beta.assistant_stream_event import ThreadMessageDelta
from openai.types.beta.threads.file_citation_annotation import FileCitation
from openai.types.beta.threads import (
MessageContentPartParam,
MessageContent,
TextContentBlock,
Text,
Message,
MessageDeltaEvent,
MessageDelta,
TextDeltaBlock,
TextDelta,
FileCitationAnnotation,
)


def from_assistant_stream_event_to_str(stream_event: AssistantStreamEvent):
return f"event: {stream_event.event}\ndata: {stream_event.data.model_dump_json()}"


def from_content_param_to_content(
thread_message_content: str | Iterable[MessageContentPartParam],
) -> MessageContent:
"""Converts messages from MessageContentPartParam to MessageContent"""
if isinstance(thread_message_content, str):
return TextContentBlock(
text=Text(annotations=[], value=thread_message_content),
type="text",
)
else:
result: str = ""

for message_content_part in thread_message_content:
if isinstance(text := message_content_part.get("text"), str):
result += text

return TextContentBlock(
text=Text(annotations=[], value=result),
type="text",
)


def from_text_to_message(text: str, file_ids: list[str]) -> Message:
all_file_ids: str = ""

for file_id in file_ids:
all_file_ids += f" [{file_id}]"

message_content: TextContentBlock = TextContentBlock(
text=Text(
annotations=[
FileCitationAnnotation(
text=f"[{file_id}]",
file_citation=FileCitation(file_id=file_id, quote=""),
start_index=0,
end_index=0,
type="file_citation",
)
for file_id in file_ids
],
value=text + all_file_ids,
),
type="text",
)

new_message = Message(
id="",
created_at=0,
object="thread.message",
status="in_progress",
thread_id="",
content=[message_content],
role="assistant",
metadata=None,
)

return new_message


async def from_chat_completion_choice_to_thread_message_delta(
index, random_uuid, streaming_response
) -> ThreadMessageDelta:
thread_message_event: ThreadMessageDelta = ThreadMessageDelta(
data=MessageDeltaEvent(
id=str(random_uuid),
delta=MessageDelta(
content=[
TextDeltaBlock(
index=index,
type="text",
text=TextDelta(
annotations=[],
value=streaming_response.choices[0].chat_item.content,
),
)
],
role="assistant",
),
object="thread.message.delta",
),
event="thread.message.delta",
)
return thread_message_event
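The citation behavior in `from_text_to_message` — append each file ID to the message text as a bracketed marker and build a matching annotation per ID — can be sketched standalone (hypothetical helper using plain dicts in place of the OpenAI annotation types):

```python
# Sketch of the citation-appending pattern in converters.py:
# every file ID becomes a " [file_id]" suffix on the text plus
# one file_citation annotation record.
def append_file_citations(text: str, file_ids: list[str]) -> tuple[str, list[dict]]:
    suffix = "".join(f" [{fid}]" for fid in file_ids)
    annotations = [
        {"text": f"[{fid}]", "file_id": fid, "type": "file_citation"}
        for fid in file_ids
    ]
    return text + suffix, annotations


value, notes = append_file_citations("Answer.", ["file-1", "file-2"])
print(value)  # Answer. [file-1] [file-2]
```

In the real converter the annotation `start_index`/`end_index` are left at 0, so clients must match citations by the bracketed text rather than by offset.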
21 changes: 20 additions & 1 deletion src/leapfrogai_api/backend/grpc_client.py
@@ -1,6 +1,6 @@
"""gRPC client for OpenAI models."""

from typing import Iterator
from typing import Iterator, AsyncGenerator, Any
import grpc
from fastapi.responses import StreamingResponse
import leapfrogai_sdk as lfai
@@ -16,6 +16,9 @@
EmbeddingResponseData,
Usage,
)
from leapfrogai_sdk.chat.chat_pb2 import (
ChatCompletionResponse as ProtobufChatCompletionResponse,
)
from leapfrogai_api.utils.config import Model


@@ -66,6 +69,22 @@ async def stream_chat_completion(model: Model, request: lfai.ChatCompletionRequest
return StreamingResponse(recv_chat(stream), media_type="text/event-stream")


async def stream_chat_completion_raw(
model: Model, request: lfai.ChatCompletionRequest
) -> AsyncGenerator[ProtobufChatCompletionResponse, Any]:
"""Stream chat completion using the specified model."""
async with grpc.aio.insecure_channel(model.backend) as channel:
stub = lfai.ChatCompletionStreamServiceStub(channel)
stream: grpc.aio.UnaryStreamCall[
lfai.ChatCompletionRequest, lfai.ChatCompletionResponse
] = stub.ChatCompleteStream(request)

await stream.wait_for_connection()

async for response in stream:
yield response


# TODO: Clean up completion() and stream_completion() to reduce code duplication
async def chat_completion(model: Model, request: lfai.ChatCompletionRequest):
"""Complete chat using the specified model."""
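Unlike `stream_chat_completion`, which wraps the stream in a `StreamingResponse`, `stream_chat_completion_raw` is an async generator, so callers iterate it with `async for`. A consumption sketch with the gRPC stream stubbed by a plain async generator (the real function yields `ProtobufChatCompletionResponse` messages, not strings):

```python
import asyncio
from typing import AsyncGenerator


async def fake_stream() -> AsyncGenerator[str, None]:
    # Stand-in for stream_chat_completion_raw(model, request).
    for chunk in ("Hello", ", ", "world"):
        yield chunk


async def collect() -> str:
    # The caller drains the generator chunk by chunk, exactly as the
    # runs endpoint does when assembling a streamed assistant message.
    parts = []
    async for response in fake_stream():
        parts.append(response)
    return "".join(parts)


print(asyncio.run(collect()))  # Hello, world
```

Yielding raw responses lets the runs endpoint re-shape each chunk into a `thread.message.delta` event instead of the chat-completions SSE format.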
4 changes: 2 additions & 2 deletions src/leapfrogai_api/backend/helpers.py
@@ -1,6 +1,6 @@
"""Helper functions for the OpenAI backend."""

from typing import BinaryIO, Iterator
from typing import BinaryIO, Iterator, AsyncGenerator, Any
import grpc
import leapfrogai_sdk as lfai
from leapfrogai_api.backend.types import (
@@ -48,7 +48,7 @@ async def recv_chat(
stream: grpc.aio.UnaryStreamCall[
lfai.ChatCompletionRequest, lfai.ChatCompletionResponse
],
):
) -> AsyncGenerator[str, Any]:
"""Generator that yields chat completion responses as Server-Sent Events."""
async for c in stream:
yield (
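`recv_chat` now advertises `AsyncGenerator[str, Any]`: each chunk is emitted as a Server-Sent Events frame, which is what `media_type="text/event-stream"` clients parse. A sketch of the assumed framing (a `data:` line terminated by a blank line):

```python
# Assumed SSE framing for one event: "data: <payload>\n\n".
# The blank line is the event delimiter in the text/event-stream protocol.
def to_sse(payload: str) -> str:
    return f"data: {payload}\n\n"


events = [to_sse(p) for p in ('{"delta": "Hi"}', "[DONE]")]
print("".join(events))
```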
5 changes: 4 additions & 1 deletion src/leapfrogai_api/backend/rag/query.py
@@ -2,6 +2,7 @@

from supabase_py_async import AsyncClient
from leapfrogai_api.backend.rag.index import IndexingService
from postgrest.base_request_builder import SingleAPIResponse


class QueryService:
@@ -11,7 +12,9 @@ def __init__(self, db: AsyncClient) -> None:
"""Initializes the QueryService."""
self.db = db

async def query_rag(self, query: str, vector_store_id: str, k: int = 5):
async def query_rag(
self, query: str, vector_store_id: str, k: int = 5
) -> SingleAPIResponse:
"""
Query the Vector Store.

95 changes: 60 additions & 35 deletions src/leapfrogai_api/backend/types.py
@@ -5,15 +5,24 @@
import datetime
from enum import Enum
from typing import Literal
from pydantic import BaseModel, Field

from fastapi import UploadFile, Form, File
from openai.types.beta.vector_store import ExpiresAfter
from openai.types import FileObject
from openai.types.beta import VectorStore
from openai.types.beta import Assistant, AssistantTool
from openai.types.beta.threads import Message, MessageContent, TextContentBlock, Text
from openai.types.beta.threads.message import Attachment
from openai.types.beta.assistant import ToolResources
from openai.types.beta import VectorStore
from openai.types.beta.assistant import (
ToolResources as BetaAssistantToolResources,
ToolResourcesFileSearch,
)
from openai.types.beta.assistant_tool import FileSearchTool
from openai.types.beta.thread import ToolResources as BetaThreadToolResources
from openai.types.beta.thread_create_params import (
ToolResourcesFileSearchVectorStoreChunkingStrategy,
ToolResourcesFileSearchVectorStoreChunkingStrategyAuto,
)
from openai.types.beta.threads.text_content_block_param import TextContentBlockParam
from openai.types.beta.vector_store import ExpiresAfter
from pydantic import BaseModel, Field


##########
@@ -101,8 +110,8 @@ class ChatFunction(BaseModel):
class ChatMessage(BaseModel):
"""Message object for chat completion."""

role: str
content: str
role: Literal["user", "assistant", "system", "function"]
content: str | list[TextContentBlockParam]


class ChatDelta(BaseModel):
@@ -259,16 +268,29 @@ class ListFilesResponse(BaseModel):
class CreateAssistantRequest(BaseModel):
"""Request object for creating an assistant."""

model: str = "mistral"
name: str | None = "Froggy Assistant"
description: str | None = "A helpful assistant."
instructions: str | None = "You are a helpful assistant."
tools: list[AssistantTool] | None = [] # This is all we support right now
tool_resources: ToolResources | None = ToolResources()
metadata: dict | None = Field(default=None, examples=[{}])
temperature: float | None = 1.0
top_p: float | None = 1.0
response_format: Literal["auto"] | None = "auto" # This is all we support right now
model: str = Field(default="llama-cpp-python", examples=["llama-cpp-python"])
name: str | None = Field(default=None, examples=["Froggy Assistant"])
description: str | None = Field(default=None, examples=["A helpful assistant."])
instructions: str | None = Field(
default=None, examples=["You are a helpful assistant."]
)
tools: list[AssistantTool] | None = Field(
default=None, examples=[[FileSearchTool(type="file_search")]]
)
tool_resources: BetaAssistantToolResources | None = Field(
default=None,
examples=[
BetaAssistantToolResources(
file_search=ToolResourcesFileSearch(vector_store_ids=[])
)
],
)
metadata: dict | None = Field(default={}, examples=[{}])
temperature: float | None = Field(default=None, examples=[1.0])
top_p: float | None = Field(default=None, examples=[1.0])
response_format: Literal["auto"] | None = Field(
default=None, examples=["auto"]
) # This is all we support right now


class ModifyAssistantRequest(CreateAssistantRequest):
@@ -304,6 +326,21 @@ class VectorStoreStatus(Enum):
COMPLETED = "completed"


class CreateVectorStoreFileRequest(BaseModel):
"""Request object for creating a vector store file."""

chunking_strategy: ToolResourcesFileSearchVectorStoreChunkingStrategy | None = (
Field(
default=None,
examples=[
ToolResourcesFileSearchVectorStoreChunkingStrategyAuto(type="auto")
],
)
)

file_id: str = Field(default="", examples=[""])


class CreateVectorStoreRequest(BaseModel):
"""Request object for creating a vector store."""

@@ -371,30 +408,18 @@ class ListVectorStoresResponse(BaseModel):


class CreateThreadRequest(BaseModel):
"""Request object for creating a thread."""
class ModifyRunRequest(BaseModel):
"""Request object for modifying a run."""

messages: list[Message] | None = Field(default=None, examples=[None])
tool_resources: ToolResources | None = Field(default=None, examples=[None])
metadata: dict | None = Field(default=None, examples=[{}])
metadata: dict[str, str] | None = Field(default=None, examples=[{}])


class ModifyThreadRequest(BaseModel):
"""Request object for modifying a thread."""

tool_resources: ToolResources | None = Field(default=None, examples=[None])
metadata: dict | None = Field(default=None, examples=[{}])


class CreateMessageRequest(BaseModel):
"""Request object for creating a message."""

role: Literal["user", "assistant"] = Field(default="user")
content: list[MessageContent] = Field(
default=[TextContentBlock(text=Text(value="", annotations=[]), type="text")],
examples=[[TextContentBlock(text=Text(value="", annotations=[]), type="text")]],
tool_resources: BetaThreadToolResources | None = Field(
default=None, examples=[None]
)
attachments: list[Attachment] | None = Field(default=None, examples=[None])
metadata: dict | None = Field(default=None, examples=[{}])


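The diff tightens `ChatMessage.role` from a bare `str` to a `Literal` of four values, so Pydantic rejects any other role at validation time. A dependency-free stand-in sketch of that behavior (hypothetical `validate_role` helper, not part of the PR):

```python
from typing import Literal, get_args

# Mirrors the new ChatMessage.role annotation:
# role: Literal["user", "assistant", "system", "function"]
Role = Literal["user", "assistant", "system", "function"]


def validate_role(role: str) -> str:
    """Reject any role outside the four allowed values, as Pydantic would."""
    if role not in get_args(Role):
        raise ValueError(f"invalid role: {role!r}")
    return role


print(validate_role("assistant"))  # assistant
```

Narrowing the type this way surfaces malformed requests as 422 validation errors instead of letting arbitrary role strings reach the backend.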