-
Notifications
You must be signed in to change notification settings - Fork 211
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: postgresql store #157
base: main
Are you sure you want to change the base?
Conversation
Thanks for your contribution, nearly done with my review but I'll go ahead and approve and run the workflow. |
Hey great work, any update on this? |
I'm still in my review, I got wrapped up in some other responsibilities, will get it out ASAP! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for the delay on this review! Mostly looking good, a couple of changes I've requested here.
One note: I noticed that 'static
was used a good bit here which was likely inherited from the sqlite
implementation. I think this is due to how the connection requires an closure with an async block which makes borrowing and using such things really odd.
Recently, proper async closures which will get into Rust stable sometime in 2025 which should help us alleviate our usage of 'static
here (i think!)
Thanks for the review. I'm still in holiday mode until New Year's, will address the comments on Jan 2 or 3. |
Hey @0xMochan ready for another round. I've addressed the comments you brought up, optimistically resolved the ones I think are done, but feel free to reopen if you think otherwise. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, I've tested locally and it works fine for me. Thanks for this PR, will be included in the next release!
CI / CD will be fixed soon, I'll rerun and then we'll merge to main afterwards.
|
||
## PostgreSQL usage | ||
|
||
The crate utilizes the [pgvector](https://github.com/pgvector/pgvector) extension, which is available for PostgreSQL version 13 and later. Use any of the [official](https://www.postgresql.org/download/) or alternative methods to install psql. The `pgvector` extension will be automatically installed by the crate if it's not present yet. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my testing, I had to go an add the pgvector
extension myself. Not sure if I did something wrong, I had installed postgres
via brew
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh, curious. You mean by brew installing it separately or something like that?
In any case, I guess the confusing term here in the README is "installed" - what I meant is that the crate will run CREATE EXTENSION
which is the installation of the extension into the database, but of course there's also the installation of the extension library itself into the computer.
Rebased with upstream main, hopefully the CI will go green now 🤞 |
Hi, I would like to be able to use Rig in the future when the integration with Postgres is implemented but there are several points to comment:
In short, most of this implementation should be delegated to the user. Let the user choose which existing column the embedding should be saved on, nothing more. I understand that to make it 'easy' for new users it is done this way, but in real scenarios I would not like to have so little control and assumptions over this library implementation. |
@caenguidanos All fair points. As mentioned at the top, I've modeled this according to the sqlite crate. I'm not claiming it's the right abstraction (and you have good counterpoints why it's not), but it does help to get one started, as you indicated. For someone like you who clearly knows what they're doing, this won't be the right abstraction to use. In fact, I don't think any will and it's better to plug in postgres to Rig in a way you see fit. |
ca2b94f
to
a011763
Compare
Hey @milancermak @caenguidanos thank you both for your contributions and comments! This PR has been sitting here for a while and Postgres integration is something we want sooner than later, yet I agree with a lot of what @caenguidanos said. How about this: we keep using The thing with integration crates is that we not only want to target the most DBs possible, but also the most clients possible. Having both a That being said, @milancermak if you could address @caenguidanos's comments above (apologies if that has already been done, I have not looked at your most recent changes). Once that's done we can do a final review and merge! Cheers! |
@cvauclair There's no easy way to address the comment from @caenguidanos in this PR, it would take a whole different approach and impl. That will have to be done by someone else. I initially built this for myself, but in a somewhat generic way similar to sqlite (as mentioned above). I've since reevaluated my approach and I'm doing stuff differently now and don't have the time to contribute. Feel free to reject this PR and do the integration as you see fit. |
Resolves #5
I've modeled the crate on the SQLite one.
On the DB side, the crate is relying on the pgvector extension. It creates an HSNW index and uses the cosine distane operator to compare embeddings.
Please let me know it this looks ok or if there are any parts of the code that should be changed.