Support SQLAlchemy for custom data layer #836

hayescode · 2024-03-21T04:18:01Z

Overview

Adds a custom, direct-database data layer using SQLAlchemy, with support for a wide range of SQL dialects
Improved loading speeds of list_threads and get_threads dramatically versus previous recursion.
- The code is kinda wacky/long, but the end performance is worth it.
Optionally configures ADLS as the blob storage provider (designed to allow other providers in the future)
- I only have access to Azure so I can't test/configure other providers
Duplicated PageInfo and PaginatedResponse from literal SDK into backend/chainlit/types.py and updated typing.
Note: ~~I had to add locks because the backend uses asyncio.gather and sometimes create_step is attempted before update_thread causing a foreign key violation and race condition.~~
- ~~I prefer this approach in order to maintain referential integrity of the database.~~
- April 2, 2024: Removed most foreign key constraints because Chainlit backend executes tasks concurrently via .gather() causing create_step to be called before update_thread resulting in foreign key violations. I assume this was done for performance reasons. The backend would have to be re-worked to handle this and a custom solution adds complexity that should not exist.
Added tests to new cypress/e2e/custom_data_layer.py.
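
The ordering problem described in the April 2 note can be reproduced without a database. The sketch below is an illustration only: `update_thread` and `create_step` are simplified stand-ins (not Chainlit's real methods), an in-memory set plays the role of the threads table, and the delays are chosen so the step write always finishes first.

```python
import asyncio

threads = set()      # stands in for the "threads" table
violations = []      # step ids that arrived before their thread existed


async def update_thread(thread_id: str, delay: float) -> None:
    # Simulates a slow thread upsert.
    await asyncio.sleep(delay)
    threads.add(thread_id)


async def create_step(step_id: str, thread_id: str, delay: float) -> None:
    # Simulates a step insert with a foreign-key style check on threadId.
    await asyncio.sleep(delay)
    if thread_id not in threads:
        violations.append(step_id)


async def main() -> list:
    # gather() runs both coroutines concurrently; nothing forces the
    # thread write to complete before the step write.
    await asyncio.gather(
        update_thread("t1", delay=0.02),
        create_step("s1", "t1", delay=0.0),
    )
    return list(violations)


result = asyncio.run(main())
print(result)
```

With a real foreign key on steps."threadId", the step insert in this ordering would be rejected, which is why the constraint was dropped rather than serializing the writes.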

Shout-Outs

Thank you @nixent for suggesting to use SQLAlchemy for a broader solution to support a larger # of developers!
Inspired by @sandangel, who showed that this can be done and shared their work
For testing I straight up ganked your cypress tests @tjroamer . I had never done these before.

Test System

OS: Windows 11
Python: 3.11.4
Postgres: v11

How to configure

Install necessary dependencies to use this custom data layer.

pip install chainlit[custom-data] --upgrade

Run this SQL DDL in your sql database

DDL

CREATE TABLE IF NOT EXISTS users (
    "id" UUID PRIMARY KEY,
    "identifier" TEXT NOT NULL UNIQUE,
    "metadata" JSONB NOT NULL,
    "createdAt" TEXT
);

CREATE TABLE IF NOT EXISTS threads (
    "id" UUID PRIMARY KEY,
    "createdAt" TEXT,
    "name" TEXT,
    "userId" UUID,
    "userIdentifier" TEXT,
    "tags" TEXT[], 
    "metadata" JSONB,
    FOREIGN KEY ("userId") REFERENCES users("id") ON DELETE CASCADE
);

CREATE TABLE IF NOT EXISTS steps (
    "id" UUID PRIMARY KEY,
    "name" TEXT NOT NULL,
    "type" TEXT NOT NULL,
    "threadId" UUID NOT NULL,
    "parentId" UUID,
    "disableFeedback" BOOLEAN NOT NULL,
    "streaming" BOOLEAN NOT NULL,
    "waitForAnswer" BOOLEAN,
    "isError" BOOLEAN,
    "metadata" JSONB,
    "tags" TEXT[], 
    "input" TEXT,
    "output" TEXT,
    "createdAt" TEXT,
    "start" TEXT,
    "end" TEXT,
    "generation" JSONB,
    "showInput" TEXT,
    "language" TEXT,
    "indent" INT
);

CREATE TABLE IF NOT EXISTS elements (
    "id" UUID PRIMARY KEY,
    "threadId" UUID,
    "type" TEXT,
    "url" TEXT,
    "chainlitKey" TEXT,
    "name" TEXT NOT NULL,
    "display" TEXT,
    "objectKey" TEXT,
    "size" TEXT,
    "page" INT,
    "language" TEXT,
    "forId" UUID,
    "mime" TEXT
);

CREATE TABLE IF NOT EXISTS feedbacks (
    "id" UUID PRIMARY KEY,
    "forId" UUID NOT NULL,
    "value" INT NOT NULL,
    "comment" TEXT
);

Add SQLALCHEMY_CONNINFO environment variable in your .env file

SQLALCHEMY_CONNINFO = <dialect>+<driver>://<user>:<password>@<host>:<port>/<database>
ADLS_SAS_TOKEN = <your_SAS_token> # Optional
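
As a rough illustration of that URL shape, the sketch below assembles a connection string from its parts and percent-encodes the user and password so that characters such as `@` or `:` do not break parsing. `build_conninfo` is a hypothetical helper, not part of this PR, and `postgresql+asyncpg` with the sample credentials is only an example value.

```python
from urllib.parse import quote_plus


def build_conninfo(dialect: str, driver: str, user: str, password: str,
                   host: str, port: int, database: str) -> str:
    """Assemble <dialect>+<driver>://<user>:<password>@<host>:<port>/<database>."""
    # Percent-encode credentials so reserved characters stay inside their field.
    return (
        f"{dialect}+{driver}://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


conninfo = build_conninfo(
    "postgresql", "asyncpg", "chainlit", "p@ss:word", "localhost", 5432, "chainlit_db"
)
print(conninfo)
```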

Add this to your app.py

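import os

import chainlit.data as cl_data
from chainlit.data.sql_alchemy import SQLAlchemyDataLayer
from chainlit.data.storage_clients import AzureStorageClient

# Values from the .env file above.
SQLALCHEMY_CONNINFO = os.environ["SQLALCHEMY_CONNINFO"]
ADLS_SAS_TOKEN = os.environ.get("ADLS_SAS_TOKEN")  # optional

# Assumption: you supply the credential yourself (access key or managed identity);
# ADLS_CREDENTIAL is a placeholder variable name, not something this PR defines.
credential = os.environ["ADLS_CREDENTIAL"]

storage_client = AzureStorageClient(
    account_url="https://<your_account>.dfs.core.windows.net",
    container="<your_container>",
    credential=credential,
    sas_token=ADLS_SAS_TOKEN,
)
cl_data._data_layer = SQLAlchemyDataLayer(
    conninfo=SQLALCHEMY_CONNINFO,
    ssl_require=True,
    storage_provider=storage_client,
    user_thread_limit=100,
)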

nixent

Thank you for turning to SQLAlchemy that fast! I have a couple of comments, primarily about DB init instead of running DDL scripts.

backend/chainlit/sql_alchemy.py

List thread update and imports

DALLE for example can return a url but it's only valid for a limited time, so we persist those now

backend/chainlit/data/sql_alchemy.py

sandangel · 2024-03-26T22:20:34Z

can we add an interface/adapter for blob storage client?
For example: the client needs to implement upload_file, delete_file, download_file. then user can just have their own client and implement that interface instead of having to make a PR to chainlit to support their blob storage choice?

sandangel · 2024-03-26T22:22:11Z

I also suggest using memcached or session storage, but this is a good first starting point already.

hayescode · 2024-03-26T23:32:38Z

can we add an interface/adapter for blob storage client? For example: the client needs to implement upload_file, delete_file, download_file. then user can just have their own client and implement that interface instead of having to make a PR to chainlit to support their blob storage choice?

@sandangel I was thinking about making another file called chainlit/backend/data/blob.py or something where we could have classes for each blob storage provider. I only have access to Azure though, so cannot implement the others.

I added the functions based on the requirements from the chainlit documentation, except for delete_user_session because I cannot see a purpose. Where are you seeing delete_file and download_file?

sandangel · 2024-03-27T02:47:21Z

@hayescode

Where are you seeing delete_file and download_file?

I think they are hidden in literalai API client. That is why I chose to implement the literalai API instead of BaseDataLayer in my MongoDB PR.

sandangel · 2024-03-27T02:49:44Z

I was thinking about making another file called chainlit/backend/data/blob.py or something where we could have classes for each blob storage provider. I only have access to Azure though, so cannot implement the others.

I think this does not scale well for the reason I mentioned earlier:

having to make a PR to chainlit to support their blob storage choice

As the number of providers grows, it will add more maintenance burden for us. That is why I suggest having a common interface on the chainlit side, and on the users' side they can use their own blob storage client. Similar to how chainlit is creating BaseDataLayer.

For SQLAlchemyDataLayer in this PR, I think it will only serve as an example implementation maintained by the community, not really something battle-tested and ready for production at a scale beyond 100K users. For example, we would also need to add a queue system for the write path, and a key-value store for session and cache for faster querying and displaying of data.

hayescode · 2024-03-27T03:19:57Z

I think they are hidden in literalai API client.

I am not looking in literal because it's irrelevant for a custom data layer. The devs' advice in the project ticket is to inherit from BaseDataLayer, so that's what this implements.

That is why I suggest having a common interface on the chainlit side, and on the users side they can use their own blob storage client.

@sandangel I don't understand what you mean I guess. If you have code for this please share.

In any case, this isn't a fully feature complete PR and we should expect (and welcome!) more enhancements. I can only test some SQL dialects and storage providers. Tbh I am in no rush as I'm already live with this. Data persistence has blocked me and my team for months so I made this out of frustration mostly haha.

sandangel · 2024-03-28T08:36:48Z

@hayescode sorry for the confusion. Here is some pseudocode:

# chainlit code

from typing import Protocol


class BlobStorageClient(Protocol):
    def upload_file(self, key: str, data: bytes) -> str:
        pass


class SQLAlchemyDataLayer(BaseDataLayer):

    # Plain (non-async) method: it only stores a reference and is called without await in the user code.
    def add_blob_storage_client(self, blob_storage_client: BlobStorageClient) -> None:
        self.blob_storage_client = blob_storage_client
        logger.info("Blob Storage client initialized")


    @queue_until_user_message()
    async def create_element(self, element: 'Element'):
        # ...

        element.url = self.blob_storage_client.upload_file(key=element.key, data=element.data)


# user code:

class S3StorageClient:
    def upload_file(self, key: str, data: bytes) -> str:
        import boto3

        s3_client = boto3.client("s3")
        s3_client.put_object(Bucket="my-bucket", Key=key, Body=data)

        return f"s3://my-bucket/{key}"

# cl_data is chainlit.data (import chainlit.data as cl_data)
cl_data._data_layer = SQLAlchemyDataLayer()
cl_data._data_layer.add_blob_storage_client(S3StorageClient())
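
A self-contained variant of the same idea, runnable without Chainlit or boto3, might look like the sketch below. Everything except the `upload_file` signature is an assumption made for illustration: `InMemoryStorageClient` and `ElementStore` are hypothetical stand-ins for a user's client and for the data layer.

```python
from typing import Dict, Optional, Protocol, runtime_checkable


@runtime_checkable
class BlobStorageClient(Protocol):
    """Structural interface: any object with this method qualifies."""

    def upload_file(self, key: str, data: bytes) -> str:
        ...


class InMemoryStorageClient:
    """Hypothetical user-side client; note that it does not inherit from the Protocol."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}

    def upload_file(self, key: str, data: bytes) -> str:
        self.blobs[key] = data
        return f"memory://{key}"


class ElementStore:
    """Hypothetical stand-in for the data layer's element handling."""

    def __init__(self, storage_client: Optional[BlobStorageClient] = None) -> None:
        self.storage_client = storage_client

    def create_element(self, key: str, data: bytes) -> Optional[str]:
        # Without a storage client there is nowhere to put the bytes.
        if self.storage_client is None:
            return None
        return self.storage_client.upload_file(key=key, data=data)


client = InMemoryStorageClient()
store = ElementStore(storage_client=client)
url = store.create_element("report.png", b"\x89PNG")
print(url)
```

Because the interface is structural, users can plug in a client for any provider without a change on the chainlit side, which is the maintenance argument made above.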

backend/chainlit/data/sql_alchemy.py

sandangel · 2024-04-10T21:55:32Z

backend/chainlit/data/sql_alchemy.py

+    async def delete_user_session(self, id: str) -> bool:
+        return False # Not sure why documentation wants this
+
+    async def get_all_user_threads(self, user_id: Optional[str] = None, thread_id: Optional[str] = None) -> Optional[List[ThreadDict]]:


I'm not sure how chainlit is using this function, but if they are using this for displaying the list of threads only, then I think we should not query all the information inside each thread. That should happen when the user clicks to resume the chat.

This function expects a List[ThreadDict] which is basically everything to your point. I was able to speed it up a lot by querying it all at once per user instead of recursively calling each thread, steps in that thread, etc to build it.

This is also used for the search and feedback filters.

I added a user_thread_limit to cap threads since it'll only grow. I have several users filling up my limit (100 threads) and performance is fine. Are you seeing performance problems or just asking the Chainlit devs why it's like this?

This function isn't part of BaseDataLayer so it can be defined and used from within this custom data layer as needed.

@sandangel I agree that there might be some performance improvements to be done, I think it's fine for the first release of this custom data layer 👍
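
To make the "query it all at once" approach concrete, here is a minimal sketch of grouping flat joined rows into nested thread dicts in one pass, with a thread cap. It is not the PR's code: the row keys (`thread_id`, `thread_name`, `step_id`, `step_output`) and the helper name are assumptions, and rows are assumed to arrive ordered by thread.

```python
from typing import Any, Dict, List


def group_rows_into_threads(rows: List[Dict[str, Any]], thread_limit: int) -> List[Dict[str, Any]]:
    """Build nested thread dicts from the flat rows of a single joined query."""
    threads: Dict[str, Dict[str, Any]] = {}  # dicts preserve insertion order
    for row in rows:
        thread_id = row["thread_id"]
        if thread_id not in threads:
            if len(threads) >= thread_limit:
                continue  # cap reached: ignore rows belonging to further threads
            threads[thread_id] = {"id": thread_id, "name": row["thread_name"], "steps": []}
        # A thread without steps yields one row whose step columns are NULL.
        if row.get("step_id") is not None:
            threads[thread_id]["steps"].append(
                {"id": row["step_id"], "output": row["step_output"]}
            )
    return list(threads.values())


rows = [
    {"thread_id": "t1", "thread_name": "First", "step_id": "s1", "step_output": "hi"},
    {"thread_id": "t1", "thread_name": "First", "step_id": "s2", "step_output": "hello"},
    {"thread_id": "t2", "thread_name": "Second", "step_id": None, "step_output": None},
    {"thread_id": "t3", "thread_name": "Third", "step_id": "s3", "step_output": "x"},
]
threads = group_rows_into_threads(rows, thread_limit=2)
print([t["id"] for t in threads])
```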

tpatel · 2024-04-11T12:39:35Z

backend/chainlit/data/sql_alchemy.py

+        if not getattr(context.session.user, 'id', None):
+            raise ValueError("No authenticated user in context")


You need to remove this check. The upsert_feedback is only called from a route, so it's outside of the user session (so attempting to use context will always throw).

I don't have a good solution to prevent adding feedback to the DB for non-authenticated chainlit apps for now.

Suggested change

        if not getattr(context.session.user, 'id', None):
            raise ValueError("No authenticated user in context")

Maybe we're overthinking this. I don't see a scenario where authenticated and non-authenticated users coexist in the same app/db. With authentication enabled, users can't use my app without authenticating. If someone doesn't have authentication set up, I don't know why they'd set up a custom data layer.

I just don't see how this scenario we're trying to guard against would be possible, but maybe I'm missing something?

Yes, this is an edge case, I agree. Sadly, I can think of at least one scenario where it could happen: an app owner disables authentication for a few weeks.
To be fair, this is the only thing that I've found in today's testing session, and getting this fixed would enable me to do a last code review / testing session.

An option that I've just thought about is to use the require_login method instead of checking the context (

chainlit/backend/chainlit/auth.py

Lines 31 to 37 in 5939877

def require_login():
    return (
        bool(os.environ.get("CHAINLIT_CUSTOM_AUTH"))
        or config.code.password_auth_callback is not None
        or config.code.header_auth_callback is not None
        or is_oauth_enabled()
    )

).

I like this idea! Then we can just block initialization of the custom data layer itself if there is no authentication. Great solution!

Where would we call require_login()? I've tried the app.py before the decorators and in sqlalchemy.py but it is returning None for me even when I do have authentication enabled.

from chainlit.auth import require_login

is_authentication_enabled = require_login()

class SQLAlchemyDataLayer(BaseDataLayer):
    def __init__(self, conninfo: str, ssl_require: bool = False, storage_provider: Optional[BaseStorageClient] = None, user_thread_limit: Optional[int] = 1000):
        if not is_authentication_enabled:
            print(f'is_authentication_enabled: {is_authentication_enabled}')
            raise PermissionError("Authentication is required to use SQLAlchemyDataLayer")
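
One possible explanation (an assumption, not confirmed in this thread) is timing: the snippet evaluates require_login() once, when the module is imported, which may be before app.py has registered its auth callbacks. The stand-alone sketch below uses a hypothetical `FakeConfig` and a local `require_login` stand-in, not Chainlit's real objects, to show the difference between capturing the value early and checking it when the data layer is constructed.

```python
from typing import Callable, Optional


class FakeConfig:
    """Hypothetical stand-in for the config object that holds auth callbacks."""

    password_auth_callback: Optional[Callable] = None


config = FakeConfig()


def require_login() -> bool:
    # Simplified stand-in: only looks at the password callback.
    return config.password_auth_callback is not None


# Evaluated once, before any callback is registered, so it stays False.
is_authentication_enabled_at_import = require_login()


class DataLayer:
    def __init__(self) -> None:
        # Evaluated when the object is created, so it sees callbacks registered by then.
        if not require_login():
            raise PermissionError("Authentication is required to use this data layer")


# Simulates app.py registering a password auth callback later on.
config.password_auth_callback = lambda username, password: None

layer = DataLayer()  # does not raise: the check ran after registration
print(is_authentication_enabled_at_import, require_login())
```

If this is what is happening, the data layer would also need to be instantiated after the auth decorators in app.py for a constructor-time check to pass.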

intc-hharshtk · 2024-04-15T11:08:07Z

Is it possible to continue from previous thread?

tpatel

✨✨✨ AWESOME ✨✨✨

Thanks for your work @hayescode ! I believe this feature is ready now, we can still disable the persistence and display a proper warning for chainlit apps without authentication in a future PR.

hayescode · 2024-04-15T12:52:54Z

@tpatel thank you very much for your assistance and advice throughout this process! I'm excited to see how the Chainlit community evolves this functionality!

- adds custom, direct database, data layer using SQLAlchemy with support for a wide range of SQL dialects
- configures ADLS or S3 as the blob storage provider
- duplicated `PageInfo` and `PaginatedResponse` from literal SDK into backend/chainlit/types.py and updated typing

JeanRessouche · 2024-04-16T08:18:30Z

Hi, it seems that this implementation does not support Azure Sql Databases: i get no error at boot but a Bad Request: 'User not persisted' 400 error, while the user is correctly stored in the user table. Also, on thread creation, an error appears to store thread.tags because this datatype (TEXT[]) does not exist in Azure Sql.

import os
import urllib.parse

import chainlit.data as cl_data
from chainlit.data.sql_alchemy import SQLAlchemyDataLayer
from sqlalchemy import create_engine
from sqlalchemy.engine import URL

driver = 'ODBC Driver 17 for SQL Server'

params = urllib.parse.quote_plus(
    'Driver=%s;' % driver +
    'Server=tcp:%s,1433;' % os.environ["AZURE_DB_HOST"] +
    'Database=%s;' % os.environ["AZURE_DB_DATABASE"] +
    'Uid=%s;' % os.environ["AZURE_DB_USERNAME"] +
    'Pwd={%s};' % os.environ["AZURE_DB_PASSWORD"] +
    'Persist Security Info=False;' +
    'MultipleActiveResultSets=False;' +
    'Encrypt=yes;' +
    'TrustServerCertificate=no;' +
    'Connection Timeout=30;')

conn_str = 'mssql+pyodbc:///?odbc_connect={}'.format(params)
engine_azure = create_engine(conn_str)
engine_azure.connect()
print('Connection to the Azure Db is ok')

connection_url = URL.create(
    drivername='mssql+aioodbc',
    username=os.environ["AZURE_DB_USERNAME"],
    password=os.environ["AZURE_DB_PASSWORD"],
    host=os.environ["AZURE_DB_HOST"],
    database=os.environ["AZURE_DB_DATABASE"],
    query={
        'driver': driver,
        'Encrypt': 'yes',
        'TrustServerCertificate': 'no',
        'Connection Timeout': '30'
    }
)

cl_data._data_layer = SQLAlchemyDataLayer(
    conninfo=connection_url,
    ssl_require=True,
    storage_provider=engine_azure,
    user_thread_limit=100)

Did I make a mistake somewhere?

tpatel · 2024-04-16T08:47:24Z

@JeanRessouche I've added some docs yesterday to clarify the scope of this release. For now this has only been tested with PostgreSQL.

I'll be happy to guide you if you choose to add support for Azure SQL!

JeanRessouche · 2024-04-16T11:01:14Z

Thanks a lot @tpatel, the doc is helping but it's still a little bit cloudy for me.

Adding support for Azure Sql is definitely something that i'm willing to do, but not in the short term, so I'm trying the Azure Datalake way.

Based on the doc, i'm facing a blocker with the conninfo variable.
For me an Azure Datalake Gen 2 is basically a storage account (that can provide with the account name & sas token), so i have no idea how to get the conninfo content that look like sql server details here. Does it require an Azure Synapse on top of the Data lake gen 2 ?

[EDIT] Yeah, i certainly need Synapse, configured one, now i'm lost with the incompatible ddl as the datalake link seems to be SQL server compatible.

hayescode · 2024-04-16T12:45:45Z

For the tags I'm not sure what that azure SQL equivalent is but it's a list of strings in that column.

For the data lake I am using ADLS gen2. User not persisted sounds like this isn't getting configured properly. Pass the account url and credential (managed identity or access key). Try a test script to write some test data.
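
On the tags question: for a dialect without an array type, one option (a sketch only, not something this PR implements) is to store the list of strings as JSON text in an ordinary text column and convert at the boundary. The helper names below are hypothetical.

```python
import json
from typing import List, Optional


def tags_to_column(tags: Optional[List[str]]) -> Optional[str]:
    """Serialize a list of tags for storage in a plain text column."""
    return None if tags is None else json.dumps(tags)


def tags_from_column(value: Optional[str]) -> Optional[List[str]]:
    """Restore the list of tags from the stored JSON text."""
    return None if value is None else json.loads(value)


stored = tags_to_column(["billing", "urgent"])
restored = tags_from_column(stored)
print(stored, restored)
```

The trade-off is that filtering by tag in SQL becomes dialect-specific or has to happen in application code.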

JeanRessouche · 2024-04-16T13:19:06Z

I'm curious about how you created the DDL in ADLS gen 2 on your side, i was only able to connect to it with sql server driver, thus the postgre ddl won't work.

So far I was only able to make it work properly with a Postgres server (which is great already!); it failed with ADLS gen2 and Azure SQL database.

hayescode · 2024-04-18T06:27:13Z

I'm curious about how you created the DDL in ADLS gen 2 on your side

I'm not sure what you mean, there is no DDL for ADLS. All you need is the account_url, credential, and the container name. Optionally add sas_token to append to the url to give users access. Once this is provided the code should handle it all and log any errors.

JeanRessouche · 2024-04-18T08:16:52Z

Hum, ok, i have to retry then, did that but it wasn't working, certainly because i did not figure out what to put in conninfo when we use ADLS.

hayescode · 2024-04-18T12:37:34Z

You don't put it in the conninfo; you instantiate the ADLS client class and pass it as storage_provider.

from chainlit.data.storage_clients import AzureStorageClient

storage_client = AzureStorageClient(
    account_url="https://<your_account>.dfs.core.windows.net",
    container="<your_container>",
    credential=credential,
    sas_token=ADLS_SAS_TOKEN
    )
cl_data._data_layer = SQLAlchemyDataLayer(
    conninfo=SQLALCHEMY_CONNINFO,
    ssl_require=True,
    storage_provider=storage_client,
    user_thread_limit=100)

JeanRessouche · 2024-04-19T20:14:05Z

Hum, thanks, but there is still something i'm missing here, you ask me not to set conninfo but you do set it in your answer, and in the description above.

On the other hand, if I simply don't provide it or set it to empty, I get an error as it's a required parameter.

For ADLS i have no clue what is expected in the SQLALCHEMY_CONNINFO variable.

<dialect>+<driver>://<user>:<password>@<host>:<port>/<database>

hayescode · 2024-04-19T22:18:41Z

There are 2 clients. Pass the ADLS client to the SQLAlchemy data layer via storage_provider. I think you're conflating these.

Reymond190 · 2024-04-27T10:31:49Z

In next commits, pls fix this ->
I tested the SQLAlchemy data layer using Azure AD OAuth. There is a validation error when the whole dict is used in "PersistedUser(**user_data)"; it works after metadata is removed from the dict.

ERROR:
  File "D:\git_projects\mvenv_504\lib\site-packages\chainlit\data\sql_alchemy.py", line 114, in get_user
    return PersistedUser(**user_data)
  File "D:\git_projects\mvenv_504\lib\site-packages\pydantic\_internal\_dataclasses.py", line 140, in __init__
    s.__pydantic_validator__.validate_python(ArgsKwargs(args, kwargs), self_instance=s)
pydantic_core._pydantic_core.ValidationError: 1 validation error for PersistedUser
metadata
  Input should be a valid dictionary [type=dict_type, input_value='{"image": "data:image/jp..."provider": "azure-ad"}', input_type=str]
    For further information visit https://errors.pydantic.dev/2.7/v/dict_type
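
The traceback shows metadata arriving as a JSON string (input_type=str) where a dict is expected. A minimal sketch of one way to handle that before building the user object is shown below; `normalize_user_row` is a hypothetical helper, not the fix that was merged, and the sample row is made up.

```python
import json
from typing import Any, Dict


def normalize_user_row(user_data: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the row with a JSON-string metadata field decoded to a dict."""
    normalized = dict(user_data)  # leave the caller's row untouched
    metadata = normalized.get("metadata")
    if isinstance(metadata, str):
        normalized["metadata"] = json.loads(metadata)
    return normalized


row = {"identifier": "alice", "metadata": '{"provider": "azure-ad"}'}
normalized = normalize_user_row(row)
print(normalized["metadata"])
```

The normalized dict could then be passed on as PersistedUser(**normalized) instead of dropping metadata.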

Tirupathiraopatnala · 2024-05-04T01:15:05Z

There are 2 clients. Pass the ADLS client to the SQLAlchemy data layer via storage_provider. I think you're conflating these.

This is a bit confusing tbh. Can you please add detailed steps to set up data persistence with Azure?

Tirupathiraopatnala · 2024-05-04T01:27:34Z

There are 2 clients. Pass the ADLS client to the SQLAlchemy data layer via storage_provider. I think you're conflating these.

from chainlit.data.storage_clients import AzureStorageClient

storage_client = AzureStorageClient(
    account_url="https://<your_account>.dfs.core.windows.net",
    container="<your_container>",
    credential=credential,
    sas_token=ADLS_SAS_TOKEN
    )

If only this part of the code is responsible for data persistence in Azure, will it create the required tables by default? Or do we need to spin up a db there?

hayescode · 2024-05-04T01:57:36Z

Data lake only stores elements (non-text). It's optional. SQL database handles everything else and the DDL is in the pr description.

* fix: double on_chat_end invocation (Chainlit#971) * feat: remove bytes objects from steps (Chainlit#969) * feat: remove bytes objects from steps * feat: process list and tuples * Feat/discord (Chainlit#986) * add slack platform * add discord platform * Update sql_alchemy.py (Chainlit#981) * Release/1.1.0 (Chainlit#989) * bump version * fix dependabot security alert * feat: wrap on_message with typing (Chainlit#991) Bot will show as typing while the run is running. * fix user menu overflow * make discord and slack bot resilient to the data layer * prepare 1.1.101 release * fix scroll issues (Chainlit#1000) * Added threadId to Feedback (Chainlit#999) * feat: add custom meta image config (Chainlit#1007) * Wd/1.1.200 (Chainlit#1008) * loader rework * update icons * fix auto scroll * fix github button * update readme * fix details button + loader * fix ci * fix audio capture * fix: linter * feat: add video player using react-player to support YouTube, Vimeo and other sources (Chainlit#980) * Update CHANGELOG.md * feat: add font colors to config.toml (Chainlit#976) - you can now configure the primary and secondary text color from the `config.toml` file. * fix: limit discord thread name (Chainlit#1013) * fix: limit discord thread name * fix: handle start of thread with blank message * update readme * Update README.md * Added threadId to Element (Chainlit#1005) * Added threadId to Feedback * :Revert "Added threadId to Feedback" This reverts commit dd5e4b5. 
* Add threadId to delete_element * fix: custom build dir do not need frontend and dist (Chainlit#1020) * fix: custom build dir do not need frontend and dist Signed-off-by: San Nguyen <[email protected]> * fix e2e test Signed-off-by: San Nguyen <[email protected]> * test previous values Signed-off-by: San Nguyen <[email protected]> * fix tests Signed-off-by: San Nguyen <[email protected]> * no format index.html Signed-off-by: San Nguyen <[email protected]> * remove envrc Signed-off-by: San Nguyen <[email protected]> --------- Signed-off-by: San Nguyen <[email protected]> * Release/1.1.300 (Chainlit#1028) * Add starters * Debug mode * Rework CoT * Rework Avatars * Remove PP * Update README.md * update readme image * fix: scroll flickering * 1.1.300rc1 * attempt to fix build (Chainlit#1033) * attempt to fix build * update numpy * condition numpy version based on python version * fix condition on numpy * remove duplicate new chat button * update changelog * feat: added Gitlab OAuth provider (Chainlit#1012) * feat: add DynamoDB Datalayer (Chainlit#1030) - enables using AWS DynamoDB as a database for the chat history * Release/1.1.300 (Chainlit#1040) * enhance conversation spacing and make copilot expandable * fix non ascii characters for chat profiles * Release/1.1.300 (Chainlit#1041) * enhance conversation spacing and make copilot expandable * fix non ascii characters for chat profiles * add input streaming support * Release/1.1.300 (Chainlit#1043) * enhance conversation spacing and make copilot expandable * fix non ascii characters for chat profiles * add input streaming support * fix message margin top * feat: add ssl support using custom key/cert files (Chainlit#1047) --------- Co-authored-by: BCM <[email protected]> * Add Teams integration (Chainlit#1003) * add teams integration * enhance feedback and add typing activity * feat: add OAuth Azure AD hybrid flow (Chainlit#1046) * Added Hybrid Flow for Authorization grant to reterive user id_token. 
Changed Redirect from Get to Post as Hybrid Flow needs form_post instead of query and returns the response as Form body. Added Hybrid Flow for Authorization grant to reterive user id_token. Changed Redirect from Get to Post as Hybrid Flow needs form_post instead of query and returns the response as Form body. * Added suggested changes Added Get for Callback Added Id_token to Callback signature * fixes - add missing random_secret - fix scopes - split the oauth redirection into two routes - fix imports * split azure hybrid flow into a separate oauth provider * fix typing and incorrect merge conflict resolution --------- Co-authored-by: Shabir Jan <[email protected]> * make select input use theme colors * add tooltip to avatar * Update en-US.json (Chainlit#1061) fix small typo * Update sql_alchemy.py - bugfix for delete_step (Chainlit#1027) Fixed bug in sql_alchemy.py:delete_step: Wrong/non-existing primary key name for steps_query (was "forId" instead of id) * prepare release (Chainlit#1064) * Added the ability to specify a root path (sub directory * @cl.on_system_message * add system message * make sure metadata is a dict * add useAudio * fix index error * rc5 --------- Co-authored-by: JT <[email protected]> * Parametrize uvicorn ws protocol * prepare release (Chainlit#1071) * prepare release * fix: slack bot if user email is not available * fix socketio double cross origin header * add IS_SUBMOUNT * fix: add back get_user_info to AzureADOAuthProvider (Chainlit#1075) * fix: add back get_user_info to AzureADOAuthProvider * prepare patch * prepare release (Chainlit#1081) * prepare release * fix: copilot theme * fix: oauth redirection url (Chainlit#1088) * fix: oauth redirection with root path * feat: add mistral instrumentation (Chainlit#1100) * feat: add mistral instrumentation * fix: remove requirements on mistralai * fix: add changelog * add chat context (Chainlit#1108) * add chat context * Wd/chat context (Chainlit#1109) * add edit_message * fix: clear steps when 
editing a message * fix: test * prepare release * fix tool nesting (Chainlit#1113) * fix tool nesting * fix: flaky test * Fix Azure authentication (Chainlit#1117) The callback path for authentication in Azure AD, and possibly others, was unintentionally broken in 8ecd415 by modifying the user-facing path. This commit reverts one line of that change to restore it to its previous state. * Willy/cot (Chainlit#1128) * give more options for cot (hidden, tool_call, full) * handle scorable runs at the framework level * lack/Teams/Discord DM threads are now split by day * prepare 1.1.400rc0 * fix: make only last message of a run scorable * fix: data_layer test * enhance run scoring * bump version --------- Signed-off-by: San Nguyen <[email protected]> Co-authored-by: Willy Douhard <[email protected]> Co-authored-by: Florian Valeye <[email protected]> Co-authored-by: DanConstantini <[email protected]> Co-authored-by: Robin Opdam <[email protected]> Co-authored-by: datapay-ai <[email protected]> Co-authored-by: Kevin Longe <[email protected]> Co-authored-by: Tyler Titsworth <[email protected]> Co-authored-by: SuperTurk <[email protected]> Co-authored-by: Josh Hayes <[email protected]> Co-authored-by: Davi Reis Vieira <[email protected]> Co-authored-by: Pawel <[email protected]> Co-authored-by: = <[email protected]> Co-authored-by: SuperTurk <[email protected]> Co-authored-by: Brian Antonelli <[email protected]> Co-authored-by: Clément Sirieix <[email protected]> Co-authored-by: Clément Sirieix <[email protected]> Co-authored-by: Thibaut Patel <[email protected]> Co-authored-by: Brian Antonelli <[email protected]> Co-authored-by: Kevin <[email protected]> Co-authored-by: Anurag Dandamudi <[email protected]> Co-authored-by: Felipe Aros <[email protected]> Co-authored-by: 131 <[email protected]> Co-authored-by: San Nguyen <[email protected]> Co-authored-by: hans-sarpei <[email protected]> Co-authored-by: jpolvto <[email protected]> Co-authored-by: steflommen <[email protected]> 
Co-authored-by: shishax <[email protected]> Co-authored-by: giulioottantotto <[email protected]> Co-authored-by: Sinan Saral <[email protected]> Co-authored-by: BCM <[email protected]> Co-authored-by: rickythefox <[email protected]> Co-authored-by: Koichi Ishida <[email protected]> Co-authored-by: mohamedalani <[email protected]> Co-authored-by: Jan Beitner <[email protected]> Co-authored-by: Kevin Merritt <[email protected]> Co-authored-by: Kevin Merritt <[email protected]> Co-authored-by: Maciej Wieczorek <[email protected]> Co-authored-by: Mayaank Vadlamani <[email protected]> Co-authored-by: Quy Tang <[email protected]> Co-authored-by: Hugues de Saxcé <[email protected]> Co-authored-by: DanConstantini <[email protected]> Co-authored-by: Mathieu CHANIAT <[email protected]> Co-authored-by: Shabir Jan <[email protected]> Co-authored-by: Davi S. Zucon <[email protected]> Co-authored-by: ralphkink <[email protected]> Co-authored-by: JT <[email protected]> Co-authored-by: Maciej Wieczorek <[email protected]>

hayescode added 3 commits March 20, 2024 22:58

Create sql_alchemy.py

1a4c361

Update pyproject.toml

bec6e5d

Create sql_alchemy.py

354ddc2

hayescode mentioned this pull request Mar 21, 2024

Create an open source data layer #793

Closed

nixent reviewed Mar 21, 2024


hayescode added 10 commits March 25, 2024 09:46

Merge branch 'Chainlit:main' into list_thread-update-and-imports

775ed88

Merge branch 'Chainlit:main' into main

21b2d31

Create sql_alchemy.py

610a0f4

Delete backend/chainlit/sql_alchemy.py

e11d00d

Merge pull request #2 from hayescode/list_thread-update-and-imports

3e576c7

List thread update and imports

Delete backend/chainlit/sql_alchemy.py

b99d05d

Update types.py

59b779d

Update BaseDataLayer to use types from chainlit

7d87846

Update __init__.py

b93c6ec

Update sql_alchemy.py

f244196

hayescode requested review from nixent and willydouhard March 25, 2024 16:23

persist elements with url, sql select typo fixes

055a906

DALLE for example can return a url but it's only valid for a limited time, so we persist those now

sandangel reviewed Mar 26, 2024

backend/chainlit/data/sql_alchemy.py (outdated, resolved)

hayescode mentioned this pull request Mar 27, 2024

Support Postgres for custom data layer #825

Closed

hayescode added 3 commits March 28, 2024 14:36

Merge branch 'Chainlit:main' into main

2e386b5

Update pyproject.toml

03f748b

Update __init__.py

0bd0aa6

tpatel reviewed Apr 10, 2024

backend/chainlit/data/sql_alchemy.py (outdated, resolved)

Update sql_alchemy.py

5939877

sandangel reviewed Apr 10, 2024


tpatel reviewed Apr 11, 2024


Update sql_alchemy.py

5d0cb18

tpatel self-requested a review April 11, 2024 17:02

tpatel approved these changes Apr 15, 2024


tpatel merged commit 2cd87ec into Chainlit:main Apr 15, 2024
4 checks passed

azlkiniue mentioned this pull request Apr 30, 2024

feat: add MinIO as storage provider #954

Closed

hayescode mentioned this pull request May 29, 2024

Fix SQLAlchemy bug and change add logging option #981

Merged

dokterbob added the "data layer" label Aug 14, 2024

niklasdiehm mentioned this pull request Oct 22, 2024

SQLAlchemy Data Layer: Switch to more generic ORM layer #1460

Closed


hayescode commented Mar 21, 2024 (edited)

nixent left a comment

sandangel commented Mar 26, 2024

sandangel commented Mar 26, 2024

hayescode commented Mar 26, 2024

sandangel commented Mar 27, 2024 (edited)

sandangel commented Mar 27, 2024 (edited)

hayescode commented Mar 27, 2024

sandangel commented Mar 28, 2024 (edited)

sandangel Apr 10, 2024

hayescode Apr 10, 2024 (edited)

tpatel Apr 11, 2024

tpatel Apr 11, 2024

hayescode Apr 11, 2024

tpatel Apr 11, 2024

hayescode Apr 11, 2024

intc-hharshtk commented Apr 15, 2024

tpatel left a comment

hayescode commented Apr 15, 2024

JeanRessouche commented Apr 16, 2024

tpatel commented Apr 16, 2024

JeanRessouche commented Apr 16, 2024 (edited)

hayescode commented Apr 16, 2024

JeanRessouche commented Apr 16, 2024

hayescode commented Apr 18, 2024

JeanRessouche commented Apr 18, 2024

hayescode commented Apr 18, 2024

JeanRessouche commented Apr 19, 2024

hayescode commented Apr 19, 2024

Reymond190 commented Apr 27, 2024

Tirupathiraopatnala commented May 4, 2024

Tirupathiraopatnala commented May 4, 2024

hayescode commented May 4, 2024

if not getattr(context.session.user, 'id', None):
    raise ValueError("No authenticated user in context")

def require_login():
    return (
        bool(os.environ.get("CHAINLIT_CUSTOM_AUTH"))
        or config.code.password_auth_callback is not None
        or config.code.header_auth_callback is not None
        or is_oauth_enabled()
    )
