Improve File Uploads, Vision Always On #1210

Merged
merged 16 commits into from
Jun 15, 2024

Josh-XT
Owner

@Josh-XT Josh-XT commented Jun 15, 2024

Improve File Uploads to Memory

During testing, some file types were found to chunk poorly into memory. As a result, this PR includes several improvements.

Add PowerPoint (PPT/PPTX) upload support

When a PowerPoint file is uploaded, it is converted to PDF and then handled the same way as any other PDF.
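The description does not say which tool performs the conversion. As a minimal sketch, assuming LibreOffice is installed and its `soffice` binary is on the PATH, the step could look like this. The function names `build_convert_command` and `convert_to_pdf` are hypothetical and are not taken from the AGiXT code.

```python
import subprocess
from pathlib import Path


def build_convert_command(pptx_path: str, out_dir: str) -> list[str]:
    """Build a headless LibreOffice command that converts a presentation to PDF."""
    return [
        "soffice",  # assumption: LibreOffice is installed and on PATH
        "--headless",
        "--convert-to", "pdf",
        "--outdir", out_dir,
        pptx_path,
    ]


def convert_to_pdf(pptx_path: str, out_dir: str) -> Path:
    """Run the conversion and return the expected path of the resulting PDF."""
    subprocess.run(build_convert_command(pptx_path, out_dir), check=True)
    # LibreOffice keeps the base file name and swaps the extension to .pdf.
    return Path(out_dir) / (Path(pptx_path).stem + ".pdf")
```

The returned PDF path can then go through the same PDF upload path as any other PDF.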

Improve PDF uploads

When a PDF file is uploaded, we extract its text with pdfplumber and chunk it into memory, which has worked well in testing. In addition, if a vision_provider is selected for the agent, the PDF is also split into one image per page so the vision model can answer questions about the pages. Any answers about those images are retained in conversational memory.
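A rough sketch of the two-track PDF handling described above, using pdfplumber's `extract_text()` and `to_image()`. The fixed-size `chunk_text` helper, the `process_pdf` name, the 150 DPI resolution, and the `page_N.png` naming are all assumptions for illustration. The actual chunking strategy in AGiXT is not described here.

```python
from pathlib import Path


def chunk_text(text: str, chunk_size: int = 1000) -> list[str]:
    """Split text into fixed-size chunks (a stand-in for the real chunking strategy)."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def process_pdf(pdf_path: str, image_dir: str, vision_enabled: bool) -> tuple[list[str], list[Path]]:
    """Return text chunks for memory and, if vision is enabled, one PNG path per page."""
    import pdfplumber  # third-party; imported lazily so chunk_text has no dependencies

    chunks: list[str] = []
    images: list[Path] = []
    if vision_enabled:
        Path(image_dir).mkdir(parents=True, exist_ok=True)
    with pdfplumber.open(pdf_path) as pdf:
        for number, page in enumerate(pdf.pages, start=1):
            # extract_text() can come back empty for image-only pages.
            chunks.extend(chunk_text(page.extract_text() or ""))
            if vision_enabled:
                image_path = Path(image_dir) / f"page_{number}.png"
                page.to_image(resolution=150).save(str(image_path))
                images.append(image_path)
    return chunks, images
```

Rendering pages to images also covers scanned PDFs, where text extraction alone would return little or nothing.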

Improve XLS/XLSX uploads

Previously, uploading an XLS/XLSX file stored only the first sheet in memory. It now iterates over each sheet, converts it to CSV, and handles each sheet the same way CSVs are handled.
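The per-sheet conversion could be sketched as follows. This assumes pandas for reading the workbook (`read_excel` with `sheet_name=None` returns every sheet); the helper names `load_sheets` and `sheets_to_csv` are hypothetical and the real implementation may differ.

```python
import csv
import io


def load_sheets(xlsx_path: str) -> dict[str, list[list]]:
    """Load every sheet of a workbook as a list of rows, header row first."""
    import pandas as pd  # third-party; reading .xlsx files also needs openpyxl

    frames = pd.read_excel(xlsx_path, sheet_name=None)  # None means "all sheets"
    return {name: [df.columns.tolist()] + df.values.tolist() for name, df in frames.items()}


def sheets_to_csv(sheets: dict[str, list[list]]) -> dict[str, str]:
    """Convert each sheet to its own CSV string, keyed by sheet name."""
    result = {}
    for name, rows in sheets.items():
        buffer = io.StringIO()
        csv.writer(buffer, lineterminator="\n").writerows(rows)
        result[name] = buffer.getvalue()
    return result
```

Each CSV string produced this way can then be passed to the existing CSV handling, one sheet at a time.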

Improve CSV uploads

When a CSV or XLS/XLSX file is uploaded, each item is now converted to JSON and stored as its own memory, with a reference to where it came from and when. This greatly improves data analysis, which has also been improved in this update. If a spreadsheet is uploaded at the chat completions endpoint, the agent autonomously performs data analysis based on the user's input and returns the results of the executed code, such as graphs generated from the data.
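To illustrate the one-memory-per-item idea, here is a small sketch using only the standard library. It assumes that an "item" is a CSV row; the memory shape (`text`, `source`, `timestamp` keys) and the name `csv_to_memories` are hypothetical and do not reflect the actual AGiXT memory schema.

```python
import csv
import io
import json
from datetime import datetime, timezone
from typing import Optional


def csv_to_memories(csv_text: str, source: str, timestamp: Optional[str] = None) -> list[dict]:
    """Turn each CSV row into a JSON memory tagged with its source and a timestamp."""
    timestamp = timestamp or datetime.now(timezone.utc).isoformat()
    reader = csv.DictReader(io.StringIO(csv_text))  # first row is used as the header
    return [
        {"text": json.dumps(row), "source": source, "timestamp": timestamp}
        for row in reader
    ]
```

Storing each row as a separate JSON memory keeps the column names attached to every value, so a retrieved memory is still meaningful without the rest of the file.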

Vision Always On

Since PDFs are now also split into images, it makes sense to keep vision on whenever it is useful for context, rather than only when an image is first uploaded. If you upload an image in a conversation and have a vision_provider defined for your agent, your input and the image are sent to the vision model, and the resulting description is added to the conversation's memories so it can be injected as context based on the user's input. If the image is relevant enough to the conversational memories, the vision model is now used on each interaction, with the image in context.
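The relevance gate could be pictured roughly like this. The word-overlap score below is only a toy stand-in, since how relevance is actually measured is not stated here; the `relevance` and `images_for_vision` names, the memory shape, and the 0.3 threshold are all assumptions.

```python
def relevance(user_input: str, description: str) -> float:
    """Toy relevance score: fraction of input words that appear in the description."""
    words = set(user_input.lower().split())
    if not words:
        return 0.0
    return len(words & set(description.lower().split())) / len(words)


def images_for_vision(user_input: str, image_memories: list[dict], threshold: float = 0.3) -> list[str]:
    """Return paths of earlier images relevant enough to resend to the vision model."""
    return [
        memory["image_path"]
        for memory in image_memories
        if relevance(user_input, memory["description"]) >= threshold
    ]


# Example: only the chart is relevant to a question about sales.
memories = [
    {"image_path": "chart.png", "description": "a bar chart of monthly sales"},
    {"image_path": "cat.png", "description": "a cat sleeping on a sofa"},
]
selected = images_for_vision("show monthly sales", memories)
```

With a gate like this, the vision model is only called when a stored image description matches the current input, so unrelated turns do not pay for a vision call.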

@Josh-XT Josh-XT changed the title from "Add PowerPoint Upload Support" to "Add PowerPoint Upload Support, Conversational Vision Persistence" Jun 15, 2024
agixt/Prompts.py Fixed Show fixed Hide fixed
@Josh-XT Josh-XT changed the title from "Add PowerPoint Upload Support, Conversational Vision Persistence" to "Improve File Uploads, Vision Always On" Jun 15, 2024
@Josh-XT Josh-XT marked this pull request as ready for review June 15, 2024 21:31
@Josh-XT Josh-XT merged commit d139bae into main Jun 15, 2024
7 checks passed
@Josh-XT Josh-XT deleted the add-powerpoint-support branch June 15, 2024 21:37