FSCrawler won't crawl docx files in an Alpine linux container #942

Open
yassinethr opened this issue Apr 23, 2020 · 1 comment

@yassinethr

Hello,

I'm trying to containerize my Flask app with Elasticsearch and FSCrawler.
My Dockerfile starts from an Alpine Linux image and copies both the FSCrawler source folder I downloaded and the docx files I want to crawl into the container.

But when I run the container, FSCrawler (correctly connected to ES) doesn't seem to crawl anything from the data folder.

As an experiment, I converted one docx file to txt format, and FSCrawler was then able to read it.

Do you have any clue?
Thanks!
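
Since the txt copy was read while the docx files were not, one thing worth ruling out is whether the docx files survived the copy into the image intact. A docx file is a ZIP archive whose main body lives in `word/document.xml`, so a minimal sketch like the one below can check that from inside the container. The function name `check_docx_files` and the folder you pass to it are hypothetical, not part of FSCrawler, and this only tests file integrity, not how FSCrawler parses the files.

```python
import zipfile
from pathlib import Path


def check_docx_files(data_dir):
    """Return {file name: True/False} for every .docx file under data_dir.

    True means the file opens as a ZIP archive and contains
    word/document.xml, the part that holds a .docx file's main body.
    This helper is illustrative only and is not part of FSCrawler.
    """
    results = {}
    for path in sorted(Path(data_dir).rglob("*.docx")):
        try:
            with zipfile.ZipFile(path) as archive:
                results[path.name] = "word/document.xml" in archive.namelist()
        except (zipfile.BadZipFile, OSError):
            # Not a ZIP archive at all, or the file could not be read.
            results[path.name] = False
    return results
```

Calling `check_docx_files` on the folder FSCrawler is configured to crawl returns a dictionary keyed by file name. Files that map to `False` were likely damaged or mis-copied. If every file maps to `True`, the files themselves are probably fine and the problem is more likely in the parsing environment inside the container.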

@dadoonet
Owner

There is work in progress on this: #820. Some documentation there might be helpful: #820 (comment) and https://github.com/dadoonet/fscrawler/pull/820/files#diff-b0155fc74849c11e4e8df2bd2872b1db

HTH
