Handle attachment files larger than 20MB #19

Open
simonw opened this issue Oct 29, 2024 · 5 comments
Labels
enhancement New feature or request

Comments


simonw commented Oct 29, 2024

The Gemini API requires that files larger than a certain size (I think 20MB) be uploaded to their files API rather than passed as inline base64.

This may be a bit tricky to implement due to the need to remember the file ID for a specific upload - it might call for an extra database table, or maybe even a change to LLM core to support optional extra metadata for persisted attachment records.

@simonw simonw added the enhancement New feature or request label Oct 29, 2024

simonw commented Nov 8, 2024

This is tricky - it's actually required when ALL of the attachments plus the rest of the request add up to more than 20MB:

> Always use the File API when the total request size (including the files, text prompt, system instructions, etc.) is larger than 20 MB.

I can still do that in the plugin, but I'll need to resolve attachment sizes in order to make that decision - ideally without loading them into memory first.
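A sketch of resolving a size without reading the content into memory, assuming an attachment is either a local path or a URL (the URL branch trusts a `Content-Length` header from a HEAD request, which not every server provides):

```python
import os
import urllib.request


def attachment_size(path_or_url: str) -> int:
    """Return the attachment size in bytes without loading its content."""
    if path_or_url.startswith(("http://", "https://")):
        # HEAD request: fetch headers only, never the body
        req = urllib.request.Request(path_or_url, method="HEAD")
        with urllib.request.urlopen(req) as resp:
            length = resp.headers.get("Content-Length")
            if length is not None:
                return int(length)
        raise ValueError("Server did not report Content-Length")
    # Local file: os.stat() reads filesystem metadata only
    return os.stat(path_or_url).st_size
```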

At least there are no extra costs to worry about:

> The File API lets you store up to 20 GB of files per project, with a per-file maximum size of 2 GB. Files are stored for 48 hours. They can be accessed in that period with your API key, but cannot be downloaded from the API. The File API is available at no cost in all regions where the Gemini API is available.


simonw commented Nov 8, 2024

I think the trick here will be calculating the size of all attachments plus the prompt and system prompt, then sorting the attachments by size and uploading the largest in a loop until the remaining content has dropped below the 20MB threshold.
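That greedy step could look something like this (a sketch only - `plan_uploads` and its shapes are invented for illustration, not the plugin's actual API):

```python
THRESHOLD = 20 * 1024 * 1024  # 20 MB total request limit


def plan_uploads(prompt_bytes: int, attachments: list[tuple[str, int]]):
    """Decide which attachments to push through the Files API.

    attachments is a list of (path, size_in_bytes) pairs.
    Returns (to_upload, inline) lists of paths.
    """
    total = prompt_bytes + sum(size for _, size in attachments)
    # Largest first, so each upload removes as much as possible
    remaining = sorted(attachments, key=lambda a: a[1], reverse=True)
    to_upload = []
    while total > THRESHOLD and remaining:
        path, size = remaining.pop(0)
        to_upload.append(path)
        total -= size
    inline = [path for path, _ in remaining]
    return to_upload, inline
```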

There's another consideration here: presumably there's a performance advantage to uploading even a small file just once if it's going to be used in a lot of different prompts. But how to decide when to do that?

One possibility: for small files that weren't previously treated as uploads, automatically upload them the second time they are referenced in a prompt within X hours - as a very rough heuristic for detecting that they might be used again in the future.

Could also provide a sub-command:

```
llm gemini upload file.png
```

This will hash the file content and upload the file, stashing a record in the attachments table which can then be used to detect that the file has been previously uploaded and reuse its Gemini file ID later on.
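A sketch of that hash-then-lookup flow, using stdlib `sqlite3` (the `gemini_files` table name and the `upload_fn` callable are illustrative placeholders, not the real implementation):

```python
import hashlib
import sqlite3


def file_sha256(path: str) -> str:
    """Hash the file in chunks, so large files never sit in memory whole."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def get_or_upload(db: sqlite3.Connection, path: str, upload_fn) -> str:
    """Return the stored Gemini file ID, uploading only if the content is new."""
    db.execute(
        "create table if not exists gemini_files (sha256 text primary key, file_id text)"
    )
    digest = file_sha256(path)
    row = db.execute(
        "select file_id from gemini_files where sha256 = ?", (digest,)
    ).fetchone()
    if row:
        return row[0]
    file_id = upload_fn(path)  # hypothetical call out to the Files API
    db.execute("insert into gemini_files values (?, ?)", (digest, file_id))
    return file_id
```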

This is a strong indicator that adding a mechanism for plugins to track extra data against attachments is going to be necessary - either with a JSON column or some kind of foreign key custom table.

Maybe this:

| attachment_id | key | value |
|---|---|---|
| 43 | gemini-file | 8f47c8e9-12d4-4b86-b6a3-65c8f32598bc |


simonw commented Nov 8, 2024

Or have the llm-gemini plugin create and migrate its own tables for this - which would set a good precedent for how other plugins could do this.

Need to consider the Python library case though, where a SQLite logs database isn't guaranteed to exist.

That case will be tricky, because the prompt execution method in this plugin needs access to persistent storage in order to check whether an attachment has previously been uploaded.


simonw commented Nov 8, 2024

Might need some kind of abstraction in LLM core for persistent storage, which will soon also need to be both sync and async capable.

It probably shouldn't be 100% reliant on SQLite either, since I want LLM as a library to be useful in other contexts, e.g. for people who are integrating it with PostgreSQL or even a system with NoSQL storage of some kind.


gerred commented Nov 9, 2024

Note too that files only last in this free Files API store for a certain amount of time. It's free, and limited to 20GB in size for an entire project - so it also has some call to be treated as an ephemeral store, one that needs a lookup every time.
