Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Draft] Proposal for SQLAlchemy Data Layer with ORM #1365

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

qtangs
Copy link
Contributor

@qtangs qtangs commented Sep 22, 2024

This is a proposal for a different way to implement SQLAlchemy Data Layer using ORM instead of raw SQL statements, taking inspiration from #832
Hopefully it will simplify support for other SQL dialects besides Postgres and SQLite.

This is a rather major change:

  • Define tables and indices using SQLAlchemy objects
  • Use SQLAlchemy functions to create statements instead of raw SQL queries

Other changes:

  • in get_thread and list_threads. only get specific thread(s) for better performance and reduced bandwidth compared to get_all_user_threads (note: reduced bandwidth is important for providers such as Supabase)
  • in list_threads, filter would use thread name as well as steps input/output texts, this needs careful review.
  • in create_element, storage_provider is made optional so that elements that don't need content to be stored will still be added to database, example of these elements are external links that can be read and displayed at runtime.

The diff between current SQLAlchemy and this one can be found here: https://gist.github.com/qtangs/71370e6ce5feec78d9586c808804e81c/revisions?diff=split&w=

Some tests have been added based on https://github.com/Chainlit/chainlit/blob/main/backend/tests/data/test_sql_alchemy.py but more might be needed.

@dosubot dosubot bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Sep 22, 2024
@dokterbob
Copy link
Collaborator

@qtangs Woop woop! :D :D

One important aspect is that we don't break existing applications.

Do you think we can reasonably assure that? Or should we deprecate older installations?

I expect to review this in the 2nd half of this week. Want to take my time to do it well, as it's such a big one.

@dokterbob
Copy link
Collaborator

Might want to take this into account #1368

@qtangs
Copy link
Contributor Author

qtangs commented Sep 25, 2024

Do you think we can reasonably assure that? Or should we deprecate older installations?

@dokterbob I think more testing is needed if we want to replace the old version in-place. I've added a new class to avoid this kind of major impact, anyone needing the new fix can switch to new class in their environment and test accordingly, if something breaks it's easy to switch back to old class.

@dokterbob
Copy link
Collaborator

Do you think we can reasonably assure that? Or should we deprecate older installations?

@dokterbob I think more testing is needed if we want to replace the old version in-place. I've added a new class to avoid this kind of major impact, anyone needing the new fix can switch to new class in their environment and test accordingly, if something breaks it's easy to switch back to old class.

We're not gonna support both though, and it seems there's a reasonable user base for the current implementation. I imagine we could deprecate the old version. Or we could use unit tests to somewhat assure consistency.

Curious how other community members/users of SQLite/SQLAlchemy reflect on this.

@barrel-roll-42
Copy link

barrel-roll-42 commented Oct 3, 2024

@dokterbob This would totally be great for my Databricks implementation of this.

@daviddwlee84
Copy link

@dokterbob
Came from #832
I'm looking for this feature to preserve data locally.
SQLite would be one of the easiest ways to do this.

@dokterbob
Copy link
Collaborator

#1463 is the first step towards the cleanup of data layers. Once that's merged, @qtangs, could you please resolve conflicts on this one?

I want to either have this code merged to main or (preferably) move it directly to our community repo!

@dokterbob dokterbob added the blocked Awaiting update or feedback from user after initial review/comments. label Nov 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
blocked Awaiting update or feedback from user after initial review/comments. size:XXL This PR changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants