-
-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switch from LocalStorage to IndexedDB for Chat History Storage #158
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
@enricoros Did you get a chance to review it? :) Hope you are enjoying your vacations! |
Hi, this looks really good! I was on purpose ignoring requests when coming back from vacation, to prioritize product brainstorming to code but this looks like a great quality commit. Do you think the chats can be preserved / migrated when switching? (As a one time operation). I'm asking because many people have valuable data in the chats and would be disappointed at the product if it wiped the chats at a certain point. |
Hey, thanks for the comment, I would have used the What we can do here is, instead use a custom one-off migration to migrate local storage to indexed DB, it will check if there's existing data in local storage, if so, it will migrate it then clear it out from local storage so that the migration need not happen again. I've implemented the above-mentioned migration in this PR. Do review and let me know how you feel about this approach! Note: Tested by manually setting local storage key |
@enricoros any updates on reviewing this? :P It'd be really nice to have 20+ chats, I constantly find myself having to delete chats 😅 |
I'm being very careful particularly on this, as I'm super scared to kill people's chats. Imagine the backlash! The code is definitely of high quality, and I want to merge it. Just being cautious for this, as it's potentially destructive. On the Chat LimitI am thinking of keeping the limit mechanism in place and raise it to 50. I have no idea what will happen for someone that goes to thousands of chats, and then the DB corrupts or the browser crashes or whatnot. What if we keep the safety net? The product has no analytics, so we don't know how many people are at limit, but I'm pretty much at limit in all my installs :). However, raising 20 -> 50 could be a good safety net to keep in place. There's a visual indicator on the chats list that can be activated when nearing the limit (turn it on with Preferences > Experiments > ON), and can help to visually see which chats are "mostly empty" and good to delete. You may also consider the "MAX" limit as a limit that forces creativity and tidyness. My ChatGPT (I haven't used it since big-AGI) looked like an untify mess :) It's like having 160 chars for a tweet. So the question is: how do you feel to raise the limit to 50, as a "defensive/safety" measure, while switching to IndexDB? Side Effects / Other QuestionsSupport: I checked browser compatibility for IndexedDB and it seems good, but I haven't checked on a few browsers (Android/Chrome, iOS/Mobile Safari, MacOS/Safari). Do you think there's the chance some configuration will be unsupported? Speed: haven't checked the speed difference of IndexedDB vs localStorage - is there any? Versioning: If this works well, we can migrate more things to IndexedDB, and use it properly as a structured DB - for instance AI personas definition, which really needs something like a DB. Do you think we need to "version" the name of the DBs/Tables? |
Also, thinking of importing IndexedDB management functions (from the 'idb-keyval' lib, https://github.com/jakearchibald/idb-keyval) internally - so we can have error handling and more control. e.g. common/utils/idbUtils.ts. |
I totally understand, merging something which has the potential to destroy existing data should be thoroughly considered and examined. I have tested the code on 3 of my existing deployments and there were no issues migrating the chats and no data was lost. If someone uses the app for the first time, they use indexedDB in the first place, no issues for them too. I encourage you and others stumbling across this PR to copy their localstorage key to this deploy preview to see for themselves their data being migrated and not losing anything.
I would have to disagree with keeping the limit mechanism as the entire point of having chat history is that they serve as historical records to refer to in future, I take reference from every other provider, all of them instead of imposing limits on total conversations, instead aim towards improving chat management to make it easier for users to find any of their old chats. If someone hits the 50-chat limit, now every time they have to delete a previous conversation and spend extra time and effort to manually erase chats not needed anymore which is something I believe users should not be troubled with. Talking about DB corruption, it's as likely to happen as with local storage, both are ultimately managed by the browser itself and corruption to browser files will result in data loss in both cases, no exceptions. In terms of reliability, the way atomic transactions work in IndexedDB, in case a crash happens, the browser will preserve the state of the database just the second before crash. WhatsApp for example stores 100s of MBs of data in the IndexedDB and billions use it on a daily basis.
I agree with you here, my ChatGPT history is probably cluttered with old conversations I no longer need but I still do not feel like deleting them. I feel it should be left on the user to decide, if they want to keep it organized or not. We should not force them into our own ways. if I want a quick chat, I do not sit and spend time finding a useless chat to delete.
I am confident that there would be no issues combability wise, It's a fairly matured feature and used by major projects across the globe. It's supported in all browsers, on all platforms (Opera Mini (phone) does not support either localstorage or indexedDB). I am certain not a single user would be left out incompatible in scope of this project.
IndexedDB is slower by a few milliseconds, not a difference to be ever noticed by any user.
This is a good idea; AI personas would benefit from moving it out to IndexedDB. We already have versioning for the database, it will start at 3 once this PR is merged. We can also use a separate IndexedDB for AI personas and keep it separate and clean from the chat records. P.S: Above are just my opinions, feel free to disagree and I will make the required changes as per your suggestions. |
No issues doing that but i ask myself if we really do need to reinvent the weel and copy their implementation when their library works best for our use case already. If any errors, although rare, do happen, the middleware will catch it and preserve the state accordingly is my understanding. I feel like we'd be unnecessarily adding to the codebase for something that isn't worth major improvement. Basically, keeping it simple and let the library do state management / error handling for us. Also, currently the middleware allows us to detect and run migrations, in case we decide to add/remove/alter any field in the database. But yes, changing the DB name itself isn't simple. But again, how often are we going to change the entire name of the database where chats are stored? |
@Ashesh3 great comments across the board. Gives me good confidence on the code. In the end I'm the only QA to ensure the production-ready quality of the App, but I feel that you get that and I'm very comfortable with the patch. Apologies for not jumping on it already - I stayed away from code on purpose the last couple of weeks. I'll be hosting a friend for the next few days, but this will be the first commit you'll see. I'll probably add back the 50 or 100 limit, but again, it's just a product choice, one of those "constraints that push creativity" kind of things. To be lifted quickly in time. But no worries about the code: I was actually looking at the merge last night and I'll add back the code where needed. Thanks again, and look forward to this landing next week. If you're interested, I'm also working on the larger and long term planning for this project. |
Found an issue: if the user opens the app (state is migrated) and then closes the window and opens again, chats are lost. The reason is that 'setItem' on idb is not called after the _migrateLocalStorageToIndexedDB function restores the localStorage data. Looking at the 'persist.ts' middleware implementation in Zustand, it seems that there isn't any 'setItem()' after the onRehydrateStorage(). Also see the comment a few lines above:: We seem to have the issue that migrate() is the place where to perform the load from localStorage, but migrate is not called (as you point out) because we swap the storage backend. The next place, @Ashesh3 Ideas? |
How about this approach? We only clear localStorage after verifying that the IndexedDB contains the migrated data and that it matches the data stored in localStorage. So, before deleting localStorage, we confirm that the chat count in the state (after hydration) is the same as in localStorage. In this case, we can be confident that the migration has been successfully completed and persisted, and it's safe to remove the data from localStorage. If a situation arises where the user does not interact with the app (and subsequently closes it), the state update may not be written to IndexedDB as expected. However, the next time the user uses the app and interacts with it, their chats will still be available in localStorage, and migration will be attempted again. After the migration, we'll check if the chat object matches the one in IndexedDB. We could do this in the onFinishHydration function. Only then will we clear the localStorage key. This shouldn't cause any performance issues, since the migration will be a one-time thing if the user decides to interact with the app, once localStorage is migrated and cleared, the entire migration flow will be skipped. Otherwise, we could manually call the What do you recommend? |
Thanks for the two ideas. I tried looking into the |
@Ashesh3 Thanks for your contribution here. This has been merged and tested thoroughly. The localStorage key is backed up to Very good help here, the app is better now. |
This is awesome! Great commit and I love that you're both so careful ❤️ I still think it could benefit from categories/folders/tabs or similar, because now it's just one list short of chaos. So with the 50/100-object limit proposed, maybe if we apply that to the categories, then anyone can just create a new category when it's been filled up. And you're right about indexedDB not being limited, I have a database with millions of items and handling that without issues. It's only an issue if you apply string-based search or filtering on it, fetching keys is no issue at all no matter the size. |
I love the idea about adding folders for better organization of conversations! It could look something like: @tmikaeld Can you open a new issue about this? |
Sweet mockup! It doesn't avoid the huge-DOM issue though, but that's exactly how I'd imagine setting it up. Maybe it's better if you create it, since I don't want to use your mockup without permission. |
Feel free to create the issue, I myself scavenged the mockup it from a template. Would storing groups of chat in separate key (where key is the folder or the date) under the same database, resolve the huge-DOM object issue? In that way, we'd be loading small chunks of conversations, instead of all at once. |
The problem occurs when you see too much on the screen at the same time. Something like this solves it: Since only the category you're in is rendered. An alternative to handle unlimited amounts is virtual scroll. |
Partially Resolves: #137
This PR addresses the limitation of storing only 20 conversations in the chat history due to the max storage limit of LocalStorage. The solution proposed is to switch to IndexedDB for storing chat conversations, which has a higher storage limit and thus will allow for an unlimited number of conversations to be stored.
Key changes include:
Addition of
idb-keyval
package to interact with IndexedDB.Removal of
MAX_CONVERSATIONS
constant and related logic that limited the number of stored conversations.Modification of the
useChatStore
state to use IndexedDB for storage instead of LocalStorage.Update of the
ChatDrawerItems.tsx
andChatMenuItems.tsx
components to remove the restrictions on maximum conversations.By implementing these changes, users will no longer face the restriction of a limited number of stored conversations, improving the user experience and functionality of the chat feature.
Note: This PR includes a version bump to 3 in
store-chats.ts
to reflect the switch to IndexedDB.Storage limit on IndexedDB: