Improve File Uploads, Vision Always On #1210
Merged
Josh-XT changed the title from "Add PowerPoint Upload Support" to "Add PowerPoint Upload Support, Conversational Vision Persistence" on Jun 15, 2024
Josh-XT changed the title from "Add PowerPoint Upload Support, Conversational Vision Persistence" to "Improve File Uploads, Vision Always On" on Jun 15, 2024
Improve File Uploads to Memory
While testing, some file types were identified as needing improvement when chunked into memory. As a result, this update includes several improvements.
Add PowerPoint (PPT/PPTX) upload support
When a PowerPoint file is uploaded, it is converted to PDF and then handled the same way as any other PDF.
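As a rough illustration of the PowerPoint-to-PDF step, here is a minimal sketch that assumes LibreOffice's `soffice` command-line tool is installed. The PR does not say which converter is used, so the tool choice and the helper names `build_convert_command`, `expected_pdf_path`, and `convert_powerpoint_to_pdf` are assumptions for illustration only.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(pptx_path: str, out_dir: str) -> list[str]:
    """Build a LibreOffice headless command that converts a deck to PDF.

    Assumption: LibreOffice is the converter; the PR does not name one.
    """
    return [
        "soffice",
        "--headless",
        "--convert-to", "pdf",
        "--outdir", out_dir,
        pptx_path,
    ]


def expected_pdf_path(pptx_path: str, out_dir: str) -> Path:
    """LibreOffice writes <input stem>.pdf into the output directory."""
    return Path(out_dir) / (Path(pptx_path).stem + ".pdf")


def convert_powerpoint_to_pdf(pptx_path: str, out_dir: str) -> Path:
    """Convert a PPT/PPTX file to PDF and return the path of the PDF."""
    if shutil.which("soffice") is None:
        raise RuntimeError("LibreOffice (soffice) was not found on PATH")
    subprocess.run(build_convert_command(pptx_path, out_dir), check=True)
    return expected_pdf_path(pptx_path, out_dir)
```

The resulting PDF path could then be passed to whatever routine already handles PDF uploads.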
Improve PDF uploads
When a PDF file is uploaded, we typically extract its text using pdfplumber and chunk it into memory, which gives great results. In addition to that strategy, if a vision_provider is selected for the agent, the PDF is also split into one image per page for the vision model to answer questions about, and any answers about those images are retained in conversational memory.
Improve XLS/XLSX uploads
Previously, uploading an XLS/XLSX file would only add the first sheet to memory. It now iterates over each sheet, converts it to CSV, and then handles each sheet the same way CSVs are handled.
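The per-sheet conversion can be sketched as follows. This is only an illustration using the standard library: it assumes the workbook has already been read into a mapping of sheet name to rows by some spreadsheet reader, and the function name `sheets_to_csv` is hypothetical rather than taken from the PR.

```python
import csv
import io


def sheets_to_csv(sheets: dict[str, list[list[object]]]) -> dict[str, str]:
    """Convert every sheet (not just the first) to its own CSV string.

    Assumption: `sheets` maps sheet names to rows already loaded from the
    workbook; reading the XLS/XLSX file itself is outside this sketch.
    """
    csv_by_sheet = {}
    for name, rows in sheets.items():
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerows(rows)
        csv_by_sheet[name] = buffer.getvalue()
    return csv_by_sheet


workbook = {
    "Sales": [["region", "total"], ["EU", 10]],
    "Costs": [["item", "cost"], ["rent", 5]],
}
for sheet_name, csv_text in sheets_to_csv(workbook).items():
    # Each CSV string would then go through the regular CSV upload path.
    print(sheet_name, repr(csv_text))
```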
Improve CSV uploads
When uploading a CSV or XLS/XLSX file, each item is now converted to JSON and added to memory as its own memory, with a reference to where it came from and when. This greatly improves data analysis, which has also been improved in this update. If a spreadsheet is uploaded at the chat completions endpoint, the agent will autonomously perform data analysis based on the user's input and output the results of the executed code, such as graphs generated from the data.
Vision Always On
With PDFs now also being split into images, it makes sense to keep vision available whenever the context calls for it rather than only when an image is first uploaded. If you upload an image in a conversation and have a vision_provider defined for your agent, your input is sent to the vision model along with the image, and the resulting description is added to the conversation's memories so it can be injected as context based on the user's input. If the image is relevant enough to the conversational memories, the vision model is used on each interaction with the image in context.
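The relevance check described above might look something like the sketch below. The PR does not describe how relevance is scored, so the word-overlap score, the `threshold` value, and the names `ImageMemory`, `relevance`, and `images_for_turn` are all hypothetical stand-ins; a real implementation would more likely rely on the agent's existing memory search.

```python
from dataclasses import dataclass


@dataclass
class ImageMemory:
    """A stored vision description tied to the image it came from."""
    image_path: str
    description: str


def _words(text: str) -> set[str]:
    """Lowercase words with basic punctuation stripped."""
    cleaned = {w.strip(".,?!").lower() for w in text.split()}
    cleaned.discard("")
    return cleaned


def relevance(user_input: str, description: str) -> float:
    """Hypothetical score: share of input words found in the description."""
    words = _words(user_input)
    if not words:
        return 0.0
    return len(words & _words(description)) / len(words)


def images_for_turn(user_input: str, memories: list[ImageMemory],
                    threshold: float = 0.3) -> list[str]:
    """Return images to send to the vision model for this interaction.

    Assumption: the threshold is arbitrary and only for illustration.
    """
    return [
        m.image_path
        for m in memories
        if relevance(user_input, m.description) >= threshold
    ]


memories = [
    ImageMemory("chart.png", "A bar chart of quarterly revenue by region"),
    ImageMemory("cat.png", "A photo of a cat on a sofa"),
]
print(images_for_turn("What does the revenue chart show?", memories))
```

An empty result would mean the turn is answered without the vision model, while a non-empty one keeps the image in context.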