Exception: pdftotext is not installed. It is part of xpdf or poppler-utils software suite. #2456

Closed
AI-Ahmed opened this issue Apr 25, 2022 · 12 comments · Fixed by #2488
Labels: topic:dependencies, type:bug (Something isn't working)

Comments

@AI-Ahmed
Contributor

There is an issue with installing xpdfreader!

@AI-Ahmed
Contributor Author

To fix the issue, you need to install the new version of the xpdf tools:

!wget --no-check-certificate https://dl.xpdfreader.com/xpdf-tools-linux-4.04.tar.gz
!tar -xvf xpdf-tools-linux-4.04.tar.gz && sudo cp xpdf-tools-linux-4.04/bin64/pdftotext /usr/local/bin
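
After copying the binary, it can help to confirm that `pdftotext` is actually discoverable before retrying the converter. The following is a minimal sketch using only the Python standard library; the helper names `find_pdftotext` and `pdftotext_banner` are hypothetical and not part of Haystack.

```python
import shutil
import subprocess


def find_pdftotext(binary="pdftotext"):
    """Return the absolute path of the binary found on PATH, or None if it is missing."""
    return shutil.which(binary)


def pdftotext_banner(binary="pdftotext"):
    """Return whatever `pdftotext -v` prints, or None if the binary is missing."""
    path = find_pdftotext(binary)
    if path is None:
        return None
    # Assumption: `-v` prints version and copyright info. The text may go to
    # stdout or stderr depending on the build, so both streams are collected.
    result = subprocess.run([path, "-v"], capture_output=True, text=True)
    return (result.stdout + result.stderr).strip()
```

In a notebook, `print(find_pdftotext())` should show a path such as `/usr/local/bin/pdftotext` once the copy step has succeeded, and `None` otherwise.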

@julian-risch
Member

Hi @AI-Ahmed, thanks for pointing out this problem. We also came across it here: #2443
We forgot to update the text in the Exception, though. I will take care of that now and then close this issue. Thanks again! 👍

@AI-Ahmed
Contributor Author

AI-Ahmed commented May 3, 2022

> Hi @AI-Ahmed thanks for pointing out this problem. We also came across the problem here: #2443 We forgot to update the text in the Exception though. I will take care of that now and then close this issue. Thanks again! 👍

No problem, my friend! I'm happy to help at any time. I really like your framework, and it is a really huge thing for me to be able to help you in any way!

@AI-Ahmed closed this as completed May 3, 2022

@julian-risch
Member

I will keep this issue open until the fix is merged.

@julian-risch reopened this May 3, 2022

@AI-Ahmed
Contributor Author

AI-Ahmed commented May 3, 2022

That's awesome! I will do that, too!

@julian-risch
Member

@AI-Ahmed no need to work on this one here. I've already implemented the fix in this pull request: #2488
Great to see your enthusiasm though! 👍

@julian-risch self-assigned this May 3, 2022

@AI-Ahmed
Contributor Author

AI-Ahmed commented May 3, 2022

I have done that already 😅. No problem @julian-risch, it is good to know!

@sohanasarah

I am having the same issue. How do I install the packages on Windows?

@HGamalElDin

I have the same issue on Windows with Haystack version 1.18.1. I couldn't install the packages. @julian-risch

@julian-risch
Member

Hello @HGamalElDin, here is our documentation page about PDFToTextConverter and how to install it: https://docs.haystack.deepset.ai/docs/file_converters#pdftotextconverter
In short, you have two options: install PyMuPDF, for example via pip install farm-haystack[pdf], or install xpdf as described here: https://www.xpdfreader.com/download.html. If you still have the same problem, please open a new issue and explain what you have tried so far. This issue is closed, and new comments here are easily overlooked.
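
The two options above can be checked from Python before running the converter. This is a minimal sketch, assuming PyMuPDF is importable under the module name `fitz` (its long-standing import name) and that xpdf or poppler-utils provides a `pdftotext` binary on PATH. `available_pdf_backends` is a hypothetical helper, not a Haystack API.

```python
import importlib.util
import shutil


def available_pdf_backends(module_name="fitz", binary="pdftotext"):
    """List which of the two PDF extraction options appear to be installed."""
    backends = []
    # Option 1: PyMuPDF, detected by looking up its module without importing it.
    if importlib.util.find_spec(module_name) is not None:
        backends.append("pymupdf")
    # Option 2: the pdftotext command-line tool, detected by searching PATH.
    if shutil.which(binary) is not None:
        backends.append("pdftotext")
    return backends
```

An empty list suggests that neither option is installed in the current environment, which would be consistent with the exception reported in this issue.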

@michaelfeil
Contributor

michaelfeil commented Sep 13, 2023

> Hello @HGamalElDin here is our documentation page about PDFToTextConverter and how to install it: https://docs.haystack.deepset.ai/docs/file_converters#pdftotextconverter In short, you have two options: install PyMuPDF, for example via pip install farm-haystack[pdf] or install xpdf as described here: https://www.xpdfreader.com/download.html If you still have the same problem, please open a new issue and explain what you have tried so far. This issue here is closed and new comments here get overlooked easily.

Haystack's error message recommends installing poppler-utils. Is there a drawback to using sudo apt-get install poppler-utils instead of following the instructions at https://www.xpdfreader.com/download.html?

"""pdftotext is not installed. It is part of xpdf or poppler-utils software suite.

@demongolem-biz2

> Hello @HGamalElDin here is our documentation page about PDFToTextConverter and how to install it: https://docs.haystack.deepset.ai/docs/file_converters#pdftotextconverter In short, you have two options: install PyMuPDF, for example via pip install farm-haystack[pdf] or install xpdf as described here: https://www.xpdfreader.com/download.html If you still have the same problem, please open a new issue and explain what you have tried so far. This issue here is closed and new comments here get overlooked easily.

Because it is always good to have an option not involving sudo. Thanks!


7 participants