[BUG] Can't pass file to Unstructured file loader for upsertion #3731

Closed
kennyakers opened this issue Dec 18, 2024 · 4 comments · Fixed by #3836
Labels
bug Something isn't working

Comments

@kennyakers

Describe the bug
When upserting a document through the Unstructured File Loader using the Flowise API, the file attached to the POST call is ignored, and the file specified in the node in the UI is upserted instead. This was previously fixed for the PDF loader (thank you!), but the problem is still present for the Unstructured File Loader.

In this call, a PDF of a short story (several pages long) is attached. As you can see in the response, the file that was upserted was "blank.pdf", a single page with the word "ERROR" on it that was uploaded directly into the Unstructured File Loader node:
[Screenshot 2024-12-17 at 11 09 53 PM]

Expected Behavior
The attached file (above: Girl Jamaica Kincaid.pdf) should replace the file in the node (above: blank.pdf) and be upserted.

Flow & Config
Below is an example flow and override configuration to recreate this behavior:
[Screenshot 2024-12-17 at 11 26 09 PM]
[Screenshot 2024-12-17 at 11 13 46 PM]
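
For readers without access to the screenshots, the kind of call described here can be sketched roughly as follows. This is a minimal illustration, not the exact request from this report: the endpoint path `/api/v1/vector/upsert/<chatflow id>` and the `files` form field name are assumptions based on how the Flowise upsert API is commonly called, and `build_upsert_request` and `extra_fields` are hypothetical names introduced only for this example. Check them against your Flowise version before relying on them.

```python
import uuid
import urllib.request


def build_upsert_request(base_url, chatflow_id, filename, file_bytes,
                         content_type="application/pdf", extra_fields=None):
    """Build (but do not send) a multipart/form-data POST that attaches a file.

    ASSUMPTIONS: the endpoint path and the "files" field name are not taken
    from this issue; verify them against your Flowise instance.
    Filenames containing double quotes are not escaped in this sketch.
    """
    boundary = uuid.uuid4().hex
    parts = []

    # Plain text form fields (for example, override settings), one part each.
    for name, value in (extra_fields or {}).items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )

    # The attached file that is expected to replace the one stored in the node.
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="files"; filename="{filename}"\r\n'
        f'Content-Type: {content_type}\r\n\r\n'.encode("utf-8")
        + file_bytes + b"\r\n"
    )

    # Closing boundary marks the end of the multipart body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    body = b"".join(parts)

    url = f"{base_url.rstrip('/')}/api/v1/vector/upsert/{chatflow_id}"
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

The returned request can be sent with `urllib.request.urlopen(req)`. Comparing the file named in the response against the `filename` passed here is one way to check whether the attached file or the node's stored file was upserted.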

Setup

  • Installation: RepoCloud instance
  • Flowise v2.2.2

@HenryHengZJ added the bug label on Dec 19, 2024

@HenryHengZJ
Contributor

I was able to override the files, but you need to enable the override configuration:

1.) I have a chatflow with doc1.txt uploaded:
[image]

2.) Then, from the API, I override the files with another file:
[image]

@kennyakers
Author

@HenryHengZJ Thanks for taking a look at this. I was also able to replicate your results with .txt files, and I believe I've narrowed down the issue: PDFs specifically cause this problem. For example:

  1. Using the same flow that worked with the .txt files in your example:
     [image: flow]
  2. An ePub file upserts successfully:
     [image: successful epub]
  3. A PDF file still upserts the "doc1.txt" file uploaded in the flow:
     [image: unsuccessful pdf]

@HenryHengZJ reopened this on Jan 9, 2025

@HenryHengZJ
Contributor

Thanks for testing it!

Found the fix; it will be added to this PR (b2ce4fa).

@HenryHengZJ linked a pull request on Jan 9, 2025 that will close this issue

@saatchi-david
Contributor

@HenryHengZJ, I'm testing this now and finding that the same error occurs when uploading a file via the UI/embed. The file set in the doc loader is the only file that gets passed, regardless of the file type. Would you please have a look? Let me know if it would be helpful to open a new issue.
