Support SQLAlchemy for custom data layer #836

Merged
merged 42 commits into from
Apr 15, 2024

Conversation

@hayescode (Contributor) commented Mar 21, 2024

Overview

  • Adds a custom data layer that talks directly to the database through SQLAlchemy, with support for a wide range of SQL dialects.
  • Dramatically improves loading speeds of list_threads and get_threads compared with the previous recursive approach.
    • The code is kinda wacky/long, but the end performance is worth it.
  • Optionally configures ADLS as the blob storage provider (designed to allow other providers in the future).
    • I only have access to Azure, so I can't test/configure other providers.
  • Duplicated PageInfo and PaginatedResponse from the Literal SDK into backend/chainlit/types.py and updated typing.
  • Note: I had to add locks because the backend uses asyncio.gather, and sometimes create_step is attempted before update_thread, causing a race condition and a foreign key violation.
    • I prefer this approach in order to maintain referential integrity of the database.
    • April 2, 2024: Removed most foreign key constraints because the Chainlit backend executes tasks concurrently via .gather(), so create_step can be called before update_thread, resulting in foreign key violations. I assume this was done for performance reasons. The backend would have to be reworked to handle this, and a custom solution adds complexity that should not exist.
  • Added tests in the new cypress/e2e/custom_data_layer.py.
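
For readers unfamiliar with the race described in the note above, here is a minimal sketch of the lock idea. It is not the code from this PR: InMemoryStore and its methods are hypothetical stand-ins for the real tables, and the only point is that both writers share one asyncio.Lock and that create_step makes sure the parent thread row exists before inserting.

```python
import asyncio


class InMemoryStore:
    """Hypothetical stand-in for the threads/steps tables."""

    def __init__(self) -> None:
        self.threads = set()
        self.steps = {}
        self._lock = asyncio.Lock()

    async def update_thread(self, thread_id: str) -> None:
        async with self._lock:
            await asyncio.sleep(0.01)  # simulate a slow write
            self.threads.add(thread_id)

    async def create_step(self, step_id: str, thread_id: str) -> None:
        async with self._lock:
            # Ensure the parent row exists so the "foreign key" always holds,
            # even if this task runs before update_thread.
            if thread_id not in self.threads:
                self.threads.add(thread_id)
            self.steps[step_id] = thread_id


async def main() -> InMemoryStore:
    store = InMemoryStore()
    # Mimics the backend scheduling both writes concurrently.
    await asyncio.gather(
        store.create_step("s1", "t1"),
        store.update_thread("t1"),
    )
    return store
```

Running asyncio.run(main()) leaves one step pointing at an existing thread regardless of which task is scheduled first.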

Shout-Outs

  • Thank you @nixent for suggesting SQLAlchemy as a broader solution that supports a larger number of developers!
  • Inspired by @sandangel, who showed that this can be done and shared their work.
  • For testing I straight up ganked your Cypress tests, @tjroamer. I had never written these before.

Test System

  • OS: Windows 11
  • Python: 3.11.4
  • Postgres: v11

How to configure

Install necessary dependencies to use this custom data layer.

pip install chainlit[custom-data] --upgrade

Run this SQL DDL in your SQL database

DDL
CREATE TABLE IF NOT EXISTS users (
    "id" UUID PRIMARY KEY,
    "identifier" TEXT NOT NULL UNIQUE,
    "metadata" JSONB NOT NULL,
    "createdAt" TEXT
);

CREATE TABLE IF NOT EXISTS threads (
    "id" UUID PRIMARY KEY,
    "createdAt" TEXT,
    "name" TEXT,
    "userId" UUID,
    "userIdentifier" TEXT,
    "tags" TEXT[], 
    "metadata" JSONB,
    FOREIGN KEY ("userId") REFERENCES users("id") ON DELETE CASCADE
);

CREATE TABLE IF NOT EXISTS steps (
    "id" UUID PRIMARY KEY,
    "name" TEXT NOT NULL,
    "type" TEXT NOT NULL,
    "threadId" UUID NOT NULL,
    "parentId" UUID,
    "disableFeedback" BOOLEAN NOT NULL,
    "streaming" BOOLEAN NOT NULL,
    "waitForAnswer" BOOLEAN,
    "isError" BOOLEAN,
    "metadata" JSONB,
    "tags" TEXT[], 
    "input" TEXT,
    "output" TEXT,
    "createdAt" TEXT,
    "start" TEXT,
    "end" TEXT,
    "generation" JSONB,
    "showInput" TEXT,
    "language" TEXT,
    "indent" INT
);

CREATE TABLE IF NOT EXISTS elements (
    "id" UUID PRIMARY KEY,
    "threadId" UUID,
    "type" TEXT,
    "url" TEXT,
    "chainlitKey" TEXT,
    "name" TEXT NOT NULL,
    "display" TEXT,
    "objectKey" TEXT,
    "size" TEXT,
    "page" INT,
    "language" TEXT,
    "forId" UUID,
    "mime" TEXT
);

CREATE TABLE IF NOT EXISTS feedbacks (
    "id" UUID PRIMARY KEY,
    "forId" UUID NOT NULL,
    "value" INT NOT NULL,
    "comment" TEXT
);

Add the SQLALCHEMY_CONNINFO environment variable to your .env file

SQLALCHEMY_CONNINFO = <dialect>+<driver>://<user>:<password>@<host>:<port>/<database>
ADLS_SAS_TOKEN = <your_SAS_token> # Optional
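
If your password contains characters such as @ or /, the URL needs them percent-encoded. A minimal sketch of assembling the value, assuming a PostgreSQL database with the asyncpg driver (both are my assumptions for illustration, as are the build_conninfo helper and the sample credentials):

```python
from urllib.parse import quote_plus


def build_conninfo(dialect, driver, user, password, host, port, database):
    """Assemble an SQLAlchemy-style URL, percent-encoding the credentials."""
    return (
        f"{dialect}+{driver}://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Hypothetical values, for illustration only.
conninfo = build_conninfo(
    "postgresql", "asyncpg", "chainlit", "p@ss/word", "localhost", 5432, "chainlit_db"
)
```

The resulting string is what would go into SQLALCHEMY_CONNINFO.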

Add this to your app.py

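import os

import chainlit.data as cl_data
from chainlit.data.sql_alchemy import SQLAlchemyDataLayer
from chainlit.data.storage_clients import AzureStorageClient

# Read the settings defined in .env
SQLALCHEMY_CONNINFO = os.environ["SQLALCHEMY_CONNINFO"]
ADLS_SAS_TOKEN = os.environ.get("ADLS_SAS_TOKEN")  # Optional

# `credential` must be defined before this point (access key or managed identity)
storage_client = AzureStorageClient(
    account_url="https://<your_account>.dfs.core.windows.net",
    container="<your_container>",
    credential=credential,
    sas_token=ADLS_SAS_TOKEN,
)
cl_data._data_layer = SQLAlchemyDataLayer(
    conninfo=SQLALCHEMY_CONNINFO,
    ssl_require=True,
    storage_provider=storage_client,
    user_thread_limit=100,
)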

@nixent nixent left a comment

Thank you for turning to SQLAlchemy that fast! I have a couple of comments, primarily about DB init instead of running DDL scripts.
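
To illustrate what "DB init" could look like, here is a rough sketch that creates the tables at startup instead of asking users to run a script by hand. It uses the standard-library sqlite3 module purely as a stand-in so that it runs anywhere; the column types are simplified (SQLite has no UUID or JSONB), only two of the tables are shown, and init_db and table_names are hypothetical helpers, not part of this PR.

```python
import sqlite3

# Simplified subset of the DDL from the PR description.
SCHEMA = """
CREATE TABLE IF NOT EXISTS users (
    "id" TEXT PRIMARY KEY,
    "identifier" TEXT NOT NULL UNIQUE,
    "metadata" TEXT NOT NULL,
    "createdAt" TEXT
);
CREATE TABLE IF NOT EXISTS feedbacks (
    "id" TEXT PRIMARY KEY,
    "forId" TEXT NOT NULL,
    "value" INT NOT NULL,
    "comment" TEXT
);
"""


def init_db(conn: sqlite3.Connection) -> None:
    """Create any missing tables; safe to call on every startup."""
    conn.executescript(SCHEMA)


def table_names(conn: sqlite3.Connection) -> list:
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
init_db(conn)
init_db(conn)  # second call is a no-op thanks to IF NOT EXISTS
```

Because every statement uses IF NOT EXISTS, the init step is idempotent and could run each time the data layer is constructed.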

DALLE for example can return a url but it's only valid for a limited time, so we persist those now
@sandangel
Contributor

can we add an interface/adapter for blob storage client?
For example: the client needs to implement upload_file, delete_file, download_file. then user can just have their own client and implement that interface instead of having to make a PR to chainlit to support their blob storage choice?

@sandangel
Contributor

I also suggest using memcached or session storage, but this is a good first starting point already.

@hayescode
Contributor Author

can we add an interface/adapter for blob storage client? For example: the client needs to implement upload_file, delete_file, download_file. then user can just have their own client and implement that interface instead of having to make a PR to chainlit to support their blob storage choice?

@sandangel I was thinking about making another file called chainlit/backend/data/blob.py or something where we could have classes for each blob storage provider. I only have access to Azure though, so I cannot implement the others.

I added the functions based on the requirements from the chainlit documentation, except for delete_user_session because I cannot see a purpose. Where are you seeing delete_file and download_file?

@sandangel
Contributor

sandangel commented Mar 27, 2024

@hayescode

Where are you seeing delete_file and download_file?

I think they are hidden in literalai API client. That is why I chose to implement the literalai API instead of BaseDataLayer in my MongoDB PR.

@sandangel
Contributor

sandangel commented Mar 27, 2024

I was thinking about making another file called chainlit/backend/data/blob.py or something where we could have classes for each blob storage provider. I only have access to Azure though, so I cannot implement the others.

I think this does not scale well for the reason I mentioned earlier:

having to make a PR to chainlit to support their blob storage choice

As the number of providers grows, it will add more maintenance burden for us. That is why I suggest having a common interface on the Chainlit side, while on the user side they can use their own blob storage client. This is similar to how Chainlit provides BaseDataLayer.

As for SQLAlchemyDataLayer in this PR, I think it will only serve as an example implementation maintained by the community, not really something battle-tested and production-ready that scales beyond 100K users. For example, we will also need to add a queue system for the write path, and a key-value store for sessions and caching for faster querying and displaying of data.

@hayescode
Contributor Author

I think they are hidden in literalai API client.

I am not looking in Literal because it's irrelevant for a custom data layer. The devs' advice in the project ticket is to inherit from BaseDataLayer, so that's what this implements.

That is why I suggest having a common interface on the chainlit side, and on the users side they can use their own blob storage client.

@sandangel I don't understand what you mean I guess. If you have code for this please share.

In any case, this isn't a fully feature complete PR and we should expect (and welcome!) more enhancements. I can only test some SQL dialects and storage providers. Tbh I am in no rush as I'm already live with this. Data persistence has blocked me and my team for months so I made this out of frustration mostly haha.

@sandangel
Contributor

sandangel commented Mar 28, 2024

@hayescode sorry for the confusion. Here is some pseudocode:

# chainlit code

from typing import Protocol

from chainlit.data import BaseDataLayer, queue_until_user_message
from chainlit.logger import logger


class BlobStorageClient(Protocol):
    # Any object with a matching upload_file method satisfies this interface
    def upload_file(self, key: str, data: bytes) -> str:
        ...


class SQLAlchemyDataLayer(BaseDataLayer):

    # Plain (non-async) setter so the user code below can call it without await
    def add_blob_storage_client(self, blob_storage_client: BlobStorageClient) -> None:
        self.blob_storage_client = blob_storage_client
        logger.info("Blob Storage client initialized")


    @queue_until_user_message()
    async def create_element(self, element: 'Element'):
        # ...

        element.url = self.blob_storage_client.upload_file(key=element.key, data=element.data)


# user code:

import boto3
import chainlit.data as cl_data


class S3StorageClient:
    def upload_file(self, key: str, data: bytes) -> str:
        s3_client = boto3.client("s3")
        s3_client.put_object(Bucket="my-bucket", Key=key, Body=data)

        return f"s3://my-bucket/{key}"

cl_data._data_layer = SQLAlchemyDataLayer()
cl_data._data_layer.add_blob_storage_client(S3StorageClient())
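
A self-contained variant of the same idea that runs without Chainlit or boto3 may help for testing. InMemoryStorageClient and persist_element are hypothetical names I made up for this sketch; the point is only that any object with a matching upload_file method satisfies the protocol, with no inheritance required.

```python
from typing import Dict, Protocol, runtime_checkable


@runtime_checkable
class BlobStorageClient(Protocol):
    def upload_file(self, key: str, data: bytes) -> str:
        ...


class InMemoryStorageClient:
    """Test double: keeps blobs in a dict instead of a real bucket."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}

    def upload_file(self, key: str, data: bytes) -> str:
        self.blobs[key] = data
        return f"memory://{key}"


def persist_element(client: BlobStorageClient, key: str, data: bytes) -> str:
    """Stand-in for the data layer's upload step; returns the stored URL."""
    return client.upload_file(key, data)
```

With runtime_checkable, isinstance(InMemoryStorageClient(), BlobStorageClient) is true even though the class never subclasses the protocol.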

    async def delete_user_session(self, id: str) -> bool:
        return False # Not sure why documentation wants this

async def get_all_user_threads(self, user_id: Optional[str] = None, thread_id: Optional[str] = None) -> Optional[List[ThreadDict]]:
Contributor

I'm not sure how Chainlit is using this function, but if it is used only for displaying the list of threads, then I think we should not query all the information inside each thread. That should happen when the user clicks to resume the chat.

Contributor Author

@hayescode hayescode Apr 10, 2024

This function expects a List[ThreadDict] which is basically everything to your point. I was able to speed it up a lot by querying it all at once per user instead of recursively calling each thread, steps in that thread, etc to build it.

This is also used for the search and feedback filters.

I added a user_thread_limit to cap threads since it'll only grow. I have several users filling up my limit (100 threads) and performance is fine. Are you seeing performance problems or just asking the Chainlit devs why it's like this?
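
To make the "query it all at once" idea concrete, here is a rough sketch of grouping flat rows (as a single joined query might return them) into per-thread dicts, with a cap on the number of threads. It is not the PR's implementation: the row keys and the build_threads helper are hypothetical, rows are assumed to arrive ordered by thread, and in a real query the limit would more likely be applied in SQL.

```python
from typing import Any, Dict, List, Optional


def build_threads(
    rows: List[Dict[str, Optional[str]]], user_thread_limit: int
) -> List[Dict[str, Any]]:
    """Group joined thread/step rows into one dict per thread."""
    threads: Dict[str, Dict[str, Any]] = {}
    for row in rows:
        thread_id = row["thread_id"]
        if thread_id not in threads:
            if len(threads) >= user_thread_limit:
                continue  # cap reached: ignore rows of further threads
            threads[thread_id] = {
                "id": thread_id,
                "name": row["thread_name"],
                "steps": [],
            }
        if row["step_id"] is not None:  # a thread may have no steps yet
            threads[thread_id]["steps"].append(
                {"id": row["step_id"], "output": row["step_output"]}
            )
    return list(threads.values())


# Hypothetical sample rows, for illustration only.
sample_rows = [
    {"thread_id": "t1", "thread_name": "First", "step_id": "s1", "step_output": "hi"},
    {"thread_id": "t1", "thread_name": "First", "step_id": "s2", "step_output": "there"},
    {"thread_id": "t2", "thread_name": "Second", "step_id": None, "step_output": None},
    {"thread_id": "t3", "thread_name": "Third", "step_id": "s3", "step_output": "late"},
]
```

One pass over the rows replaces the per-thread, per-step round trips of the recursive approach.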

Contributor

This function isn't part of BaseDataLayer so it can be defined and used from within this custom data layer as needed.

@sandangel I agree that there might be some performance improvements to be done, I think it's fine for the first release of this custom data layer 👍

Comment on lines 255 to 256
    if not getattr(context.session.user, 'id', None):
        raise ValueError("No authenticated user in context")
Contributor

You need to remove this check. upsert_feedback is only called from a route, so it runs outside of the user session (attempting to use context will always throw).

I don't have a good solution to prevent adding feedback to the DB for non-authenticated chainlit apps for now.

Suggested change (delete these lines):
    if not getattr(context.session.user, 'id', None):
        raise ValueError("No authenticated user in context")

Contributor Author

Maybe we're overthinking this. I don't see a scenario where authenticated and non-authenticated users coexist in the same app/db. With authentication enabled, users can't use my app without authenticating. If someone doesn't have authentication set up, I don't know why they'd set up a custom data layer.

I just don't see how this scenario we're trying to guard against would be possible, but maybe I'm missing something?

Contributor

Yes, this is an edge case of an edge case, I agree. Sadly, I can think of at least one scenario where it could happen: an app owner disables authentication for a few weeks.
To be fair, this is the only thing that I found in today's testing session, and getting this fixed would enable me to do a last code review / testing session.

An option that I've just thought about is to use the require_login method instead of checking the context:

def require_login():
    return (
        bool(os.environ.get("CHAINLIT_CUSTOM_AUTH"))
        or config.code.password_auth_callback is not None
        or config.code.header_auth_callback is not None
        or is_oauth_enabled()
    )

Contributor Author

I like this idea! Then we can just block initialization of the custom data layer itself if there is no authentication. Great solution!

Where would we call require_login()? I've tried it in app.py before the decorators and in sqlalchemy.py, but it is returning None for me even when I do have authentication enabled.

from chainlit.auth import require_login
is_authentication_enabled = require_login()

class SQLAlchemyDataLayer(BaseDataLayer):
    def __init__(self, conninfo: str, ssl_require: bool = False, storage_provider: Optional[BaseStorageClient] = None, user_thread_limit: Optional[int] = 1000):
        if not is_authentication_enabled:
            print(f'is_authentication_enabled: {is_authentication_enabled}')
            raise PermissionError("Authentication is required to use SQLAlchemyDataLayer")
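
One possible explanation, which is not confirmed anywhere in this thread, is timing: if the check runs at import time, the auth callbacks may not have been registered yet. The sketch below only illustrates that general pattern with a fake config; FakeCodeConfig, register_password_auth, and GuardedDataLayer are hypothetical and are not Chainlit APIs.

```python
from typing import Callable, Optional


class FakeCodeConfig:
    """Hypothetical stand-in for a config object holding auth callbacks."""

    password_auth_callback: Optional[Callable] = None


code_config = FakeCodeConfig()


def require_login() -> bool:
    return code_config.password_auth_callback is not None


# Evaluated once at import time, before any callback has been registered.
checked_at_import = require_login()


def register_password_auth(func: Callable) -> Callable:
    """Mimics a decorator that registers the callback later."""
    code_config.password_auth_callback = func
    return func


class GuardedDataLayer:
    def __init__(self) -> None:
        # Evaluated at construction time, so later registrations are seen.
        if not require_login():
            raise PermissionError("Authentication is required to use this data layer")
```

In this sketch, checked_at_import stays falsy forever, while constructing GuardedDataLayer after the callback is registered succeeds.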

@tpatel tpatel self-requested a review April 11, 2024 17:02
@intc-hharshtk

Is it possible to continue from previous thread?

Contributor

@tpatel tpatel left a comment

✨✨✨ AWESOME ✨✨✨

Thanks for your work @hayescode ! I believe this feature is ready now, we can still disable the persistence and display a proper warning for chainlit apps without authentication in a future PR.

@tpatel tpatel merged commit 2cd87ec into Chainlit:main Apr 15, 2024
4 checks passed
@hayescode
Contributor Author

@tpatel thank you very much for your assistance and advice throughout this process! I'm excited to see how the Chainlit community evolves this functionality!

tpatel pushed a commit to tpatel/chainlit that referenced this pull request Apr 15, 2024
- adds custom, direct database, data layer using SQLAlchemy with support for a wide-range of SQL dialects
- configures ADLS or S3 as the blob storage provider
- duplicated `PageInfo` and `PaginatedResponse` from literal SDK into backend/chainlit/types.py and updated typing
@JeanRessouche

Hi, it seems that this implementation does not support Azure SQL databases: I get no error at boot, but a Bad Request: 'User not persisted' 400 error, even though the user is correctly stored in the users table. Also, on thread creation, storing thread.tags fails because this datatype (TEXT[]) does not exist in Azure SQL.

import os
import urllib.parse

import chainlit.data as cl_data
from chainlit.data.sql_alchemy import SQLAlchemyDataLayer
from sqlalchemy import create_engine
from sqlalchemy.engine import URL

driver = 'ODBC Driver 17 for SQL Server'

params = urllib.parse.quote_plus(
    'Driver=%s;' % driver +
    'Server=tcp:%s,1433;' % os.environ["AZURE_DB_HOST"] +
    'Database=%s;' % os.environ["AZURE_DB_DATABASE"] +
    'Uid=%s;' % os.environ["AZURE_DB_USERNAME"] +
    'Pwd={%s};' % os.environ["AZURE_DB_PASSWORD"] +
    'Persist Security Info=False;' +
    'MultipleActiveResultSets=False;' +
    'Encrypt=yes;' +
    'TrustServerCertificate=no;' +
    'Connection Timeout=30;')

conn_str = 'mssql+pyodbc:///?odbc_connect={}'.format(params)
engine_azure = create_engine(conn_str)
engine_azure.connect()
print('Connection to the Azure Db is ok')

connection_url = URL.create(
    drivername='mssql+aioodbc',
    username=os.environ["AZURE_DB_USERNAME"],
    password=os.environ["AZURE_DB_PASSWORD"],
    host=os.environ["AZURE_DB_HOST"],
    database=os.environ["AZURE_DB_DATABASE"],
    query={
        'driver': driver,
        'Encrypt': 'yes',
        'TrustServerCertificate': 'no',
        'Connection Timeout': '30'
    }
)

cl_data._data_layer = SQLAlchemyDataLayer(
    conninfo=connection_url,
    ssl_require=True,
    storage_provider=engine_azure,
    user_thread_limit=100)

Did I make a mistake somewhere?

@tpatel
Contributor

tpatel commented Apr 16, 2024

@JeanRessouche I added some docs yesterday to clarify the scope of this release. For now this has only been tested with PostgreSQL.

I'll be happy to guide you if you choose to add support for Azure SQL!

@JeanRessouche

JeanRessouche commented Apr 16, 2024

Thanks a lot @tpatel, the doc helps, but it's still a little bit cloudy for me.

Adding support for Azure SQL is definitely something that I'm willing to do, but not in the short term, so I'm trying the Azure Data Lake way.

Based on the doc, I'm facing a blocker with the conninfo variable.
For me an Azure Data Lake Gen 2 is basically a storage account (which can provide the account name & SAS token), so I have no idea how to get conninfo content that looks like the SQL Server details here. Does it require Azure Synapse on top of the Data Lake Gen 2?

[EDIT] Yeah, I certainly need Synapse. I configured one, and now I'm lost with the incompatible DDL, as the data lake link seems to be SQL Server compatible.

@hayescode
Contributor Author

For the tags, I'm not sure what the Azure SQL equivalent is, but it's a list of strings in that column.
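
One common workaround on dialects without an array type is to store the list as JSON text. The sketch below is only an illustration of that idea, not something this PR implements; encode_tags and decode_tags are hypothetical helpers.

```python
import json
from typing import List, Optional


def encode_tags(tags: Optional[List[str]]) -> Optional[str]:
    """Serialize a list of tags for a plain text column."""
    return None if tags is None else json.dumps(tags)


def decode_tags(raw: Optional[str]) -> Optional[List[str]]:
    """Restore the list of tags read back from the text column."""
    return None if raw is None else json.loads(raw)
```

The column would then be declared as a text type on that database, with encoding and decoding done at the data layer boundary.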

For the data lake I am using ADLS gen2. User not persisted sounds like this isn't getting configured properly. Pass the account url and credential (managed identity or access key). Try a test script to write some test data.

@JeanRessouche

I'm curious about how you created the DDL in ADLS Gen2 on your side. I was only able to connect to it with the SQL Server driver, so the Postgres DDL won't work.

So far I was only able to make it work properly with a Postgres server (which is great already!); it failed with ADLS Gen2 & Azure SQL database.

@hayescode
Contributor Author

I'm curious about how you created the DDL in ADLS gen 2 on your side

I'm not sure what you mean, there is no DDL for ADLS. All you need is the account_url, credential, and the container name. Optionally add sas_token to append to the url to give users access. Once this is provided the code should handle it all and log any errors.

@JeanRessouche

Hmm, OK, I'll have to retry then. I did that but it wasn't working, certainly because I did not figure out what to put in conninfo when using ADLS.

@hayescode
Contributor Author

You don't put it in the conninfo; you instantiate the ADLS client class and pass it as storage_provider.

from chainlit.data.storage_clients import AzureStorageClient

storage_client = AzureStorageClient(
    account_url="https://<your_account>.dfs.core.windows.net",
    container="<your_container>",
    credential=credential,
    sas_token=ADLS_SAS_TOKEN
    )
cl_data._data_layer = SQLAlchemyDataLayer(
    conninfo=SQLALCHEMY_CONNINFO,
    ssl_require=True,
    storage_provider=storage_client,
    user_thread_limit=100)

@JeanRessouche

Hmm, thanks, but there is still something I'm missing here: you ask me not to set conninfo, but you do set it in your answer and in the description above.

image

On the other hand, if I simply don't provide it or set it to empty, I get an error, as it's a required parameter.

image

For ADLS I have no clue what is expected in the SQLALCHEMY_CONNINFO variable.

<dialect>+<driver>://<user>:<password>@<host>:<port>/<database>

@hayescode
Contributor Author

There are 2 clients. Pass the ADLS client to the SQLAlchemyDataLayer via storage_provider. I think you're conflating these.

@Reymond190

In the next commits, please fix this:
I tested the SQLAlchemy data layer using Azure AD OAuth. There is a validation error when the whole dict is used in "PersistedUser(**user_data)"; it works after metadata is removed from the dict.

ERROR:
  File "D:\git_projects\mvenv_504\lib\site-packages\chainlit\data\sql_alchemy.py", line 114, in get_user
    return PersistedUser(**user_data)
  File "D:\git_projects\mvenv_504\lib\site-packages\pydantic\_internal\_dataclasses.py", line 140, in __init__
    s.__pydantic_validator__.validate_python(ArgsKwargs(args, kwargs), self_instance=s)
pydantic_core._pydantic_core.ValidationError: 1 validation error for PersistedUser
metadata
  Input should be a valid dictionary [type=dict_type, input_value='{"image": "data:image/jp..."provider": "azure-ad"}', input_type=str]
    For further information visit https://errors.pydantic.dev/2.7/v/dict_type
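
The error says metadata arrived as a JSON string rather than a dict. A minimal sketch of one way to handle that, assuming the row comes back as a mapping with a JSON-encoded metadata column; PersistedUserSketch and normalize_user_row are hypothetical stand-ins for illustration, not the actual fix in Chainlit.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, Mapping


@dataclass
class PersistedUserSketch:
    """Hypothetical stand-in for PersistedUser."""

    id: str
    identifier: str
    metadata: Dict[str, Any]
    createdAt: str


def normalize_user_row(row: Mapping[str, Any]) -> Dict[str, Any]:
    """Return a copy of the row with metadata parsed into a dict if needed."""
    user_data = dict(row)
    if isinstance(user_data.get("metadata"), str):
        user_data["metadata"] = json.loads(user_data["metadata"])
    return user_data


# Hypothetical sample row, for illustration only.
sample_row = {
    "id": "u1",
    "identifier": "someone@example.com",
    "metadata": '{"provider": "azure-ad"}',
    "createdAt": "2024-04-16T00:00:00Z",
}
user = PersistedUserSketch(**normalize_user_row(sample_row))
```

This keeps the metadata instead of dropping it from the dict, which was the workaround described above.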

@Tirupathiraopatnala

There are 2 clients. Pass the ADLS client to the SQLAlchemyDataLayer via storage_provider. I think you're conflating these.

This is a bit confusing tbh. Can you please add detailed steps to set up data persistence with Azure?

@Tirupathiraopatnala

There are 2 clients. Pass the ADLS client to the SQLAlchemyDataLayer via storage_provider. I think you're conflating these.

from chainlit.data.storage_clients import AzureStorageClient

storage_client = AzureStorageClient(
    account_url="https://<your_account>.dfs.core.windows.net",
    container="<your_container>",
    credential=credential,
    sas_token=ADLS_SAS_TOKEN
)

If only this part of the code is responsible for data persistence in Azure, will it create the required tables by default? Or do we need to spin up a DB there?

@hayescode
Contributor Author

The data lake only stores elements (non-text). It's optional. The SQL database handles everything else, and the DDL is in the PR description.

aryadhruv added a commit to aryadhruv/chainlit_sysos that referenced this pull request Jul 15, 2024
* remove unused recoil dependency (Chainlit#551)

* make sure context is always carried (Chainlit#552)

* restore context

* use a copy of metadata when restoring the user session to avoid conflicts and json serialization issues (Chainlit#557)

* update langchain cache (Chainlit#556)

* prepare 0.7.604

* fix lc placeholder with lcel (Chainlit#568)

* call on_chat_end when session is manually cleared (Chainlit#567)

* avoid readme flickering when resuming a chat (Chainlit#566)

* avoid readme flickering when resuming a chat

* highlight resumed conversation in the sidebar

* redirect to main chat after deletion

* avoid error if conversation does not exist

* check module is loaded before deleting it (Chainlit#563)

* check module is loaded before deleting it

* fix key error

* allow html behind feature flag (Chainlit#565)

* allow html behind feature flag

* make latex a feature

* prepare 0.7.700

* fix chat_profiles + clear asks

* feat(playground): add gpt-4-turbo in the playground (Chainlit#578)

* Update README.md

* Release/1.0.0 (Chainlit#587)

* init

* base data layer

* add step to data layer

* add queue until user message

* remove data_persistence from config

* upload askfilemessage response as file element

* step context

* step context

* llama index integration + step elements

* haystack integration + step error

* langchain integration + error handling

* feedback

* feedback

* refactor AppUser to User

* migrate react-client

* migrate react-components

* migrate main ui

* fix mypy

* fix type import

* fix step issues + langchain issues

* token count

* remove IMessage and MessageDict

* wip fix tests

* fix existing tests

* add data layer test

* action toast

* remove seconds from message time

* add support for action interruption

* rename appuser to user

* toast style

* fix update thread

* use http for file uploads

* remove useless create_task

* wip data layer

* rename client step type

* fix chainlit hello

* wip data layer

* fix test

* wip data layer

* add root param to step

* fix llama index callback handler

* add step show input

* fix final answer streaming

* update readme

* step type lower case

* chainlit_client

* debug ci

* debug ci

* bump sdk version

* bump version

* bump versions (Chainlit#590)

* various bug fixes (Chainlit#600)

* various bug fixes

* file upload should flush thread queues

* relax python dependency

* fix deps

* don't send steps if hide_cot is true

* make oauth cookie samesite policy configurable

* add copy button and replace message buttons

* 1.0.0rc2

* changelog

* bump sdk version

* add page to pdf element (Chainlit#606)

* do not ask confirmation for new chat if no interaction happened (Chainlit#605)

* do not ask confirmation for new chat if no interaction happened

* fix test

* Wd/fix element (Chainlit#608)

* fix element url when authenticated

* fix file url

* fix chat resume first interaction

* hide copy button is disable_feeback is true

* add changelog

* fix flaky tasklist test

* feat(playground): Add Gemini Pro in the LLM playground (Chainlit#610)

* langchain callback should consider current step

* llama index callback should consider current step

* add loader while image is loading (Chainlit#612)

* enhance custom auth (Chainlit#613)

* return error 500 instead of 4xx when feedback fails

* remove legacy prod url (Chainlit#615)

* Update callbacks.py (Chainlit#614)

* fix react client lodash issue (Chainlit#619)

* added AWS Cognito OAuth provider (Chainlit#540) (Chainlit#617)

* track user sessions (Chainlit#620)

* track user sessions

* interactive -> is_interactive

* update sdk

* Adding streaming functionality to output for haystack callback manager (Chainlit#621)

* Adding streaming functionality to haystack callback manager

* Improved code and added type to self.last_tokens

* Fixed another latent bug. Seems like the code would fail when agent_step.is_last() wasn't true as the stack had already been popped

* Changed list to List for mypy test (guessing this is due to python version)

* fixed mypy issue

---------

Co-authored-by: Kevin Longe <[email protected]>

* do not throw in ws connect handler

* Add Chat Settings to Generic Langchain Provider (Chainlit#622)

* Add Chat Settings to Generic Langchain Provider

* Update langchain.py

* Set default inputs to empty list

Suggestion from @willydouhard

* Do not require settings for stream event

* use chainlit 1.0.0 attrs

* remove unecessary whitespace

* Update langchain.py

* Update backend/chainlit/playground/providers/langchain.py

Co-authored-by: Willy Douhard <[email protected]>

---------

Co-authored-by: Willy Douhard <[email protected]>

* refactor: Add new way to open thread history (Chainlit#629)

* avoid creating existing session (Chainlit#637)

* fix element creation (Chainlit#636)

* fix element creation

* remove print

* feat: Add button to auto scroll down (Chainlit#630)

* disable jwt auth for elements (Chainlit#633)

* feat: New way to stop the loading task (Chainlit#631)

* refactor: Design of messages (Chainlit#635)

* refactor: Design of messages

* ui fixes

* fix tests

* clear warnings

* fix width issue

* fix sidebar trigger button

* fix vertical centering

* enhance user session count

* add tooltip to sidebar trigger

---------

Co-authored-by: Willy Douhard <[email protected]>

* do not display scroll to bottom button on chat history

* fix hide cot hide pp (Chainlit#638)

* fix scroll bottom button flickering (Chainlit#639)

* literalai (Chainlit#642)

* do not display hide of cot toggle is hide cot is true on the server (Chainlit#656)

* add onlogout hook (Chainlit#654)

* add onlogout hook

* pass the fastapi response as well

* update the user if it already exists (Chainlit#653)

* do not apply a bgcolor on an avatar that has an image (Chainlit#652)

* always use the latest uploadFile function (Chainlit#651)

* Update server.py (Chainlit#648)

* prevent the app to crash if the data layer is not reachable (Chainlit#644)

* prevent the app to crash if the data layer is not reachable

* make the app still usable if auth is enabled and data layer is down

* prepare release (Chainlit#657)

* prepare release

* bump literalai version

* fix parent id issue (Chainlit#659)

* prevent running button to display twice when cot is false and streaming (Chainlit#666)

* bump version

* Added Internationalization with react-i18next (Chainlit#668)

* feature: i18n

* do not display running button if cot is true and message is being streamed (Chainlit#669)

* chore: bump uvicorn to 0.25.0 (Chainlit#664)

fixes Chainlit#663

Co-authored-by: = <[email protected]>

* fix pp doc link

* enhance default translation log

* attempt to fix file watcher (Chainlit#674)

* enhance langchain tracing

* fix: Tasklist flick (Chainlit#676)

* Wd/embed (Chainlit#679)

* relax fast api

* allow for custom fonts

* make cors configurable

* rename to copilot and serve copilot index.js


---------

Co-authored-by: SuperTurk <[email protected]>

* fix translation key

* Release/1.0.200 (Chainlit#681)

* fix translation issue

* fix copilot auth

* add continuing chat info

* bump version

* prepare release

* Update README.md

* Remove `+` from secrets (Chainlit#688)

Some oauth providers will replace the `+` character with a space which fails the auth check.

* fix/fix overlay (Chainlit#717)

Co-authored-by: Clément Sirieix <[email protected]>

* steps should be able to be called in parallel (Chainlit#694)

* steps should be able to be called in parallel

* clean local steps

* allow generation on message

* fix tests

* remove fast api version constraint (Chainlit#732)

* remove fast api version constraint

* bump langchain

* update literalai dependency (Chainlit#735)

* thread name, wrap step input/output, remove user session

* update literal-ai dependency

* set the default prompt playground state to `Formatted`

* keep the input/output step properties as dict in Chainlit

- also fixes the llama index callback and the playground, following the renaming

* datalayer: only add input/output if needed, remove empty input/output objects

* remove debug log

---------

Co-authored-by: Willy Douhard <[email protected]>

* bump literal version (Chainlit#749)

* bump literal version

* fix pp

* Adds Cognito label and icon (Chainlit#704)

* adds custom_js (Chainlit#708)

* prepare release (Chainlit#751)

* fix thread name

* fix thread name

* fix test

* fix mypy

* fix resume chat

* Add OpenAI integration (Chainlit#778)

* Add OpenAI integration

- This simplifies instrumenting OpenAI calls
- Reuses the literalai OpenAI instrumentation
- Creates a new step for each OpenAI call, with the call details in the generation property

* add a 1ms delay to the `on_message` callback

- This makes sure any children step starts after the parent message step
- replaced `generation.settings.model` by `generation.model`

* move the openai version check inside the instrumentation call

- avoids erroring if a user isn't using `instrument_openai` and hasn't installed openai

* Add issue templates (Chainlit#786)

- Helps gather more information when users create issues

* prepare release (Chainlit#794)

* prepare release

* fix mypy

* fix pdf mime type

* Update pyproject.toml to fix python-multipart vulnerability (Chainlit#777)

fixes Chainlit#776

* Fix: LlamaIndex 0.10.x migration (Chainlit#797)

* fix unbound reference issue (Chainlit#807)

* fix unbound reference issue

* bump version

* Add custom frontend build (Chainlit#783)

- Use the `custom_build` to set the relative path for the custom build.
- The `custom_build` path should contain a `./frontend/dist/index.html` file that will be loaded when users open the Chainlit app URL.
- This `index.html` can contain any custom frontend; you're responsible for providing the full UI.

* fix: fix typing for data layer (Chainlit#802)

* added missing translation for time grouped categories (Chainlit#773)

* remove forced lineheight (Chainlit#814)

* Multi-modal support for file size, mime types and number of files (Chainlit#787)

- adds `accept` (mime type), `max_files` and `max_size_mb` configurations (under `features.multi_modal`)
- file uploads are rejected if they don't respect the configuration
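
A minimal sketch of how the three `features.multi_modal` settings could gate an upload, assuming a simple list of mime patterns. `MultiModalConfig`, `validate_uploads` and the default values are hypothetical names and numbers chosen for illustration; they are not taken from the Chainlit codebase.

```python
import fnmatch
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class MultiModalConfig:
    # Hypothetical defaults, for illustration only.
    accept: List[str] = field(default_factory=lambda: ["*/*"])
    max_files: int = 20
    max_size_mb: int = 500


def validate_uploads(files: List[Tuple[str, int]], cfg: MultiModalConfig) -> List[str]:
    """Check (mime_type, size_in_bytes) pairs against the config.

    Returns a list of rejection reasons; an empty list means the upload is accepted.
    """
    errors: List[str] = []
    if len(files) > cfg.max_files:
        errors.append(f"too many files: {len(files)} > {cfg.max_files}")
    max_bytes = cfg.max_size_mb * 1024 * 1024
    for mime, size in files:
        # Patterns such as "image/*" are matched with shell-style wildcards.
        if not any(fnmatch.fnmatch(mime, pattern) for pattern in cfg.accept):
            errors.append(f"mime type not accepted: {mime}")
        if size > max_bytes:
            errors.append(f"file too large: {size} bytes > {cfg.max_size_mb} MB")
    return errors
```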

* move react-components codebase back into frontend (Chainlit#829)

- simplifies the codebase by removing the need for a separate package

* Loads markdown file based on language (Chainlit#692)

* Fix 'completion' referenced before assignment (Chainlit#800)

- Fixes `UnboundLocalError("local variable 'completion' referenced before assignment")` error in LangchainTracer.on_llm_end callback

* fix typo in provider name (Chainlit#822)

- `Coginto` -> `Cognito`

* add missing translation keys (Chainlit#818)

- add the "settings" modal title and the "show history" text

* bump react-client version

* feat: add HEAD route for '/' to support status checks (Chainlit#835)

* Do not reload installed packages on file change (Chainlit#842)

- fixes the filewatcher issues
- the fix checks where the modules are installed (venv or regular installation) and prevents their reload.
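
One way such a check could look, as a sketch only: a module is skipped on reload when its file lives under an installation directory. `should_skip_reload` and `default_install_dirs` are hypothetical helpers, and using `sysconfig.get_paths()` to locate the installation directories is an assumption, not necessarily what the actual fix does.

```python
import os
import sysconfig
from typing import Iterable, List, Optional


def default_install_dirs() -> List[str]:
    """Directories where installed (non-project) modules usually live."""
    paths = sysconfig.get_paths()
    return sorted({paths["purelib"], paths["platlib"], paths["stdlib"]})


def should_skip_reload(module_file: Optional[str], install_dirs: Iterable[str]) -> bool:
    """Return True when the module should not be reloaded on file change."""
    if not module_file:
        # Built-in or namespace modules have no file, so there is nothing to reload.
        return True
    path = os.path.realpath(module_file)
    for directory in install_dirs:
        root = os.path.realpath(directory)
        try:
            if os.path.commonpath([path, root]) == root:
                return True
        except ValueError:
            # Raised on Windows when the two paths are on different drives.
            continue
    return False
```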

* fix BaseDataLayer create_element type (Chainlit#850)

* migrate to literal score (Chainlit#851)

* migrate to literal score

* fix tests

* enhance langchain llm step display

* correctly display new lines

* move new line cleaning to backend

* fix thread dict

* bump sdk version

* fix data layer test

* fix casing

* allow for custom socketio pathname (Chainlit#853)

* Improve translations (Chainlit#852)

* add translation linter

- compares the local translation json structure with the en-US chainlit reference
- helps maintain accurate translations across languages

* add translation fallback

- when a translation is defined for a bare language code (ex: `de`), it should be used if the user's browser requests a region-specific locale (ex: `de-DE`)
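
The fallback rule can be sketched roughly as follows. `pick_translation` is a hypothetical helper, and the final fallback to `en-US` is an assumption based on `en-US` being the reference translation; the real lookup may differ.

```python
from typing import Dict, Optional


def pick_translation(requested: str, available: Dict[str, dict]) -> Optional[dict]:
    """Return the best translation bundle for a browser language tag."""
    if requested in available:  # exact match, e.g. "de-DE"
        return available[requested]
    base = requested.split("-")[0]  # "de-DE" -> "de"
    if base in available:
        return available[base]
    # Assumed last resort: the en-US reference bundle, if present.
    return available.get("en-US")
```

For example, with bundles defined only for `de` and `en-US`, a request for `de-DE` resolves to the `de` bundle and a request for `fr-FR` resolves to `en-US`.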

* move translations from settings to its own route

- this enables having translations on the login screen (we didn't load the config before a user is logged-in)

* add fastapi gzip middleware to speed up json replies

* add missing translations

- enable a fallback (skeleton component) in the existing Translator component
- add a new hook to return a string (with a `...` fallback)
- switch to using the Translator as much as possible (to benefit from the skeleton fallback)
- if the Translator can't be used, use the new `useTranslation` hook instead of the one from `react-i18next` (to benefit from the fallback which avoids displaying the translation keys)

* fix tests

* update the copilot translation fetching logic

- use the new /project/translations route instead to retrieve the translations

* fix the tests

* update changelog

* Update CHANGELOG.md

* bump version

* unpin starlette dependency (Chainlit#868)

* remove translations that aren't up to date (Chainlit#866)

- we won't host all translations in Chainlit as this would lower the speed of iteration (we would need to update each translation for each frontend change that introduces a new translation)
- we will need to find a good structure to allow the community to share their translations

* Fix dates in changelog (Chainlit#872)

* fix: clean sidebar when starting a new conversation (Chainlit#878)

- moved the 'SideView' state into react-client
- clean up the 'SideView' state in the `clear` method from `useChatInteract`

* add optional `tags` and `metadata` to steps and messages (Chainlit#877)

* fix tool calls in prompt playground (Chainlit#865)

- also get rid of non-chat openai

* rename the Literal env variable to LITERAL_API_URL (Chainlit#870)

- backward compatible as `LITERAL_SERVER` is used as fallback
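
A minimal sketch of the backward-compatible lookup, assuming the new variable simply takes precedence over the old one. `resolve_literal_api_url` is a hypothetical name used for illustration.

```python
import os
from typing import Mapping, Optional


def resolve_literal_api_url(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Prefer LITERAL_API_URL and fall back to the legacy LITERAL_SERVER."""
    return env.get("LITERAL_API_URL") or env.get("LITERAL_SERVER")
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.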

* add model and chunks to llama index callback handler (Chainlit#876)

* Wd/update literalsdk (Chainlit#885)

* update to async literal sdk

* changelog

* make actions trigger thread persistence

* add chat profile as thread tag

* expose sessionId in useChatSession

* Support SQLAlchemy for custom data layer (Chainlit#836)

- adds custom, direct database, data layer using SQLAlchemy with support for a wide-range of SQL dialects
- configures ADLS or S3 as the blob storage provider
- duplicated `PageInfo` and `PaginatedResponse` from literal SDK into backend/chainlit/types.py and updated typing

* bump literalai version and chainlit

* add quotes to create-secret output (Chainlit#909)

- fixes an issue with special characters from the secret

* update changelog (Chainlit#910)

* bump version

* Wd/resilience (Chainlit#913)

* make sure the chainlit app works even if the data layer is down

* changelog

* check that user exists before listing threads

* fix fallback

* bump literalai version

* put back user creation

* fix typo

* remove return

* Add the user's browser languages in the user session (Chainlit#889)

* chore: label new issues as "needs-triage" (Chainlit#914)

- prevents all untriaged issues from going on the community board

* make auto tag thread opt in (Chainlit#927)

* fix: correctly serialize generation and allow `None` `storage_provider` in SQLAlchemyDataLayer (Chainlit#921)

- set `self.storage_provider = None` in `__init__()` so that `create_element` does not break when no storage provider is configured
- pass the `generation` parameter in `create_step`, as it is also a JSON type

---------

Co-authored-by: Thibaut Patel <[email protected]>

* fix flaky parent for openai instrumentation (Chainlit#931)

* Allow html in text elements (Chainlit#893)

* allow for setting a ChatProfile default (Chainlit#930)

* bump literalai sdk version

* Update thread list on first interaction (Chainlit#923)

- re-fetch the thread list on first interaction
- navigate to `/thread/:id` url when creating a new conversation
- update the `/thread/:id` page to allow for displaying the current chat
- add `threadId` to `useChatMessages` to get the current conversation thread id
- update "back to conversation" links
- clear current conversation when deleting the current thread or changing chat profile

* bump version (Chainlit#932)

* fix: pasting from microsoft products generates text instead of an image (Chainlit#934)

fixes Chainlit#743

- only consider paste input if it can't be converted as text

* add support for `multiline` option in `TextInput` chat settings field (Chainlit#945)

closes Chainlit#507

* fix: do not prevent thread revalidation (Chainlit#944)

fixes Chainlit#941

- This fixes the issue where only the first messages of a thread would be shown in the thread history in some cases.

* fix: display the label instead of the value for menu item (Chainlit#943)

* fix: disable gzip middleware to prevent a compression issue on safari (Chainlit#952)

fixes Chainlit#895

* release 1.0.506 (Chainlit#953)

- update literalai dependency to version 0.0.509
- bump version
- update changelog

* Wd/audio (Chainlit#962)

* rework navigation

* wip

* fix buffering

* finalizing audio feature

* fix lint

* update changelog

* bump literalai version

* Run on_chat_resume() before resume_thread() (Chainlit#968)

Execute the on_chat_resume decorator before the resume_thread emitter.
This allows the `thread` in on_chat_resume to be updated on load.

* fix: double on_chat_end invocation (Chainlit#971)

* feat: remove bytes objects from steps (Chainlit#969)

* feat: remove bytes objects from steps

* feat: process list and tuples
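
A rough sketch of what stripping bytes from nested step payloads could look like, including the list and tuple handling. `strip_bytes` is a hypothetical helper, and replacing each bytes object with a placeholder string (rather than dropping it) is an assumption made for illustration.

```python
from typing import Any

PLACEHOLDER = "<bytes omitted>"  # hypothetical placeholder, not from the source


def strip_bytes(value: Any) -> Any:
    """Recursively replace bytes objects so the result is JSON-friendly."""
    if isinstance(value, (bytes, bytearray)):
        return PLACEHOLDER
    if isinstance(value, dict):
        return {key: strip_bytes(item) for key, item in value.items()}
    if isinstance(value, list):
        return [strip_bytes(item) for item in value]
    if isinstance(value, tuple):
        return tuple(strip_bytes(item) for item in value)
    return value
```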

* Feat/discord (Chainlit#986)

* add slack platform
* add discord platform

* Update sql_alchemy.py (Chainlit#981)

* Release/1.1.0 (Chainlit#989)

* bump version

* fix dependabot security alert

* feat: wrap on_message with typing (Chainlit#991)

Bot will show as typing while the run is running.

* fix user menu overflow

* make discord and slack bot resilient to the data layer

* prepare 1.1.101 release

* fix scroll issues (Chainlit#1000)

* Added threadId to Feedback (Chainlit#999)

* feat: add custom meta image config (Chainlit#1007)

* Wd/1.1.200 (Chainlit#1008)

* loader rework
* update icons
* fix auto scroll
* fix github button

* update readme

* fix details button + loader

* fix ci

* fix audio capture

* fix: linter

* feat: add video player using react-player to support YouTube, Vimeo and other sources (Chainlit#980)

* Update CHANGELOG.md

* feat: add font colors to config.toml (Chainlit#976)

- you can now configure the primary and secondary text color from the `config.toml` file.

* fix: limit discord thread name (Chainlit#1013)

* fix: limit discord thread name

* fix: handle start of thread with blank message

* update readme

* Update README.md

* Added threadId to Element (Chainlit#1005)

* Added threadId to Feedback

* Revert "Added threadId to Feedback"

This reverts commit dd5e4b5.

* Add threadId to delete_element

* fix: custom build dir do not need frontend and dist (Chainlit#1020)

* fix: custom build dir do not need frontend and dist

Signed-off-by: San Nguyen <[email protected]>

* fix e2e test

Signed-off-by: San Nguyen <[email protected]>

* test previous values

Signed-off-by: San Nguyen <[email protected]>

* fix tests

Signed-off-by: San Nguyen <[email protected]>

* no format index.html

Signed-off-by: San Nguyen <[email protected]>

* remove envrc

Signed-off-by: San Nguyen <[email protected]>

---------

Signed-off-by: San Nguyen <[email protected]>

* Release/1.1.300 (Chainlit#1028)


* Add starters

* Debug mode

* Rework CoT

* Rework Avatars

* Remove PP

* Update README.md

* update readme image

* fix: scroll flickering

* 1.1.300rc1

* attempt to fix build (Chainlit#1033)

* attempt to fix build

* update numpy

* condition numpy version based on python version

* fix condition on numpy

* remove duplicate new chat button

* update changelog

* feat: added Gitlab OAuth provider (Chainlit#1012)

* feat: add DynamoDB Datalayer (Chainlit#1030)

- enables using AWS DynamoDB as a database for the chat history

* Release/1.1.300 (Chainlit#1040)

* enhance conversation spacing and make copilot expandable

* fix non ascii characters for chat profiles

* Release/1.1.300 (Chainlit#1041)

* enhance conversation spacing and make copilot expandable

* fix non ascii characters for chat profiles

* add input streaming support

* Release/1.1.300 (Chainlit#1043)

* enhance conversation spacing and make copilot expandable

* fix non ascii characters for chat profiles

* add input streaming support

* fix message margin top

* feat: add ssl support using custom key/cert files (Chainlit#1047)


---------

Co-authored-by: BCM <[email protected]>

* Add Teams integration (Chainlit#1003)

* add teams integration

* enhance feedback and add typing activity

* feat: add OAuth Azure AD hybrid flow (Chainlit#1046)

* Added Hybrid Flow for Authorization grant to retrieve the user id_token. Changed Redirect from GET to POST, as Hybrid Flow needs form_post instead of query and returns the response as a form body.

* Added suggested changes

Added Get for Callback
Added Id_token to Callback signature

* fixes

- add missing random_secret
- fix scopes
- split the oauth redirection into two routes
- fix imports

* split azure hybrid flow into a separate oauth provider

* fix typing and incorrect merge conflict resolution

---------

Co-authored-by: Shabir Jan <[email protected]>

* make select input use theme colors

* add tooltip to avatar

* Update en-US.json (Chainlit#1061)

fix small typo

* Update sql_alchemy.py - bugfix for delete_step (Chainlit#1027)

Fixed bug in sql_alchemy.py:delete_step: Wrong/non-existing primary key name for steps_query (was "forId" instead of id)

* prepare release (Chainlit#1064)

* Added the ability to specify a root path (sub directory)

* @cl.on_system_message

* add system message

* make sure metadata is a dict

* add useAudio

* fix index error

* rc5

---------

Co-authored-by: JT <[email protected]>

* Parametrize uvicorn ws protocol

* prepare release (Chainlit#1071)

* prepare release

* fix: slack bot if user email is not available

* fix socketio double cross origin header

* add IS_SUBMOUNT

* fix: add back get_user_info to AzureADOAuthProvider (Chainlit#1075)

* fix: add back get_user_info to AzureADOAuthProvider

* prepare patch

* prepare release (Chainlit#1081)

* prepare release

* fix: copilot theme

* fix: oauth redirection url (Chainlit#1088)

* fix: oauth redirection with root path

* feat: add mistral instrumentation (Chainlit#1100)

* feat: add mistral instrumentation

* fix: remove requirements on mistralai

* fix: add changelog

* add chat context (Chainlit#1108)

* add chat context

* Wd/chat context (Chainlit#1109)

* add edit_message

* fix: clear steps when editing a message

* fix: test

* prepare release

* fix tool nesting (Chainlit#1113)

* fix tool nesting

* fix: flaky test

* Fix Azure authentication (Chainlit#1117)

The callback path for authentication in Azure AD, and possibly others, was
unintentionally broken in 8ecd415 by modifying the user-facing path.

This commit reverts one line of that change to restore it to its previous
state.

* Willy/cot (Chainlit#1128)

* give more options for cot (hidden, tool_call, full)
* handle scorable runs at the framework level
* Slack/Teams/Discord DM threads are now split by day

* prepare 1.1.400rc0

* fix: make only last message of a run scorable

* fix: data_layer test

* enhance run scoring

* bump version

---------

Signed-off-by: San Nguyen <[email protected]>
Co-authored-by: Willy Douhard <[email protected]>
Co-authored-by: Florian Valeye <[email protected]>
Co-authored-by: DanConstantini <[email protected]>
Co-authored-by: Robin Opdam <[email protected]>
Co-authored-by: datapay-ai <[email protected]>
Co-authored-by: Kevin Longe <[email protected]>
Co-authored-by: Tyler Titsworth <[email protected]>
Co-authored-by: SuperTurk <[email protected]>
Co-authored-by: Josh Hayes <[email protected]>
Co-authored-by: Davi Reis Vieira <[email protected]>
Co-authored-by: Pawel <[email protected]>
Co-authored-by: = <[email protected]>
Co-authored-by: SuperTurk <[email protected]>
Co-authored-by: Brian Antonelli <[email protected]>
Co-authored-by: Clément Sirieix <[email protected]>
Co-authored-by: Clément Sirieix <[email protected]>
Co-authored-by: Thibaut Patel <[email protected]>
Co-authored-by: Brian Antonelli <[email protected]>
Co-authored-by: Kevin <[email protected]>
Co-authored-by: Anurag Dandamudi <[email protected]>
Co-authored-by: Felipe Aros <[email protected]>
Co-authored-by: 131 <[email protected]>
Co-authored-by: San Nguyen <[email protected]>
Co-authored-by: hans-sarpei <[email protected]>
Co-authored-by: jpolvto <[email protected]>
Co-authored-by: steflommen <[email protected]>
Co-authored-by: shishax <[email protected]>
Co-authored-by: giulioottantotto <[email protected]>
Co-authored-by: Sinan Saral <[email protected]>
Co-authored-by: BCM <[email protected]>
Co-authored-by: rickythefox <[email protected]>
Co-authored-by: Koichi Ishida <[email protected]>
Co-authored-by: mohamedalani <[email protected]>
Co-authored-by: Jan Beitner <[email protected]>
Co-authored-by: Kevin Merritt <[email protected]>
Co-authored-by: Kevin Merritt <[email protected]>
Co-authored-by: Maciej Wieczorek <[email protected]>
Co-authored-by: Mayaank Vadlamani <[email protected]>
Co-authored-by: Quy Tang <[email protected]>
Co-authored-by: Hugues de Saxcé <[email protected]>
Co-authored-by: DanConstantini <[email protected]>
Co-authored-by: Mathieu CHANIAT <[email protected]>
Co-authored-by: Shabir Jan <[email protected]>
Co-authored-by: Davi S. Zucon <[email protected]>
Co-authored-by: ralphkink <[email protected]>
Co-authored-by: JT <[email protected]>
Co-authored-by: Maciej Wieczorek <[email protected]>
@dokterbob dokterbob added the data layer Pertains to data layers. label Aug 14, 2024