Error on detecting pdf #14

Closed
agofilo opened this issue Jan 12, 2016 · 7 comments

@agofilo

agofilo commented Jan 12, 2016

After upgrading Pantomime from 2.3.0 to 2.8.0, I found the following exception being thrown:
java.lang.NoClassDefFoundError: org/apache/commons/compress/PasswordRequiredException
at org.apache.tika.parser.pkg.ZipContainerDetector.detect(ZipContainerDetector.java:88)
at org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:77)
at org.apache.tika.Tika.detect(Tika.java:156)
at org.apache.tika.Tika.detect(Tika.java:287)
at pantomime.mime$eval20617$fn__20618.invoke(mime.clj:38)
at pantomime.mime$eval20596$fn__20597$G__20587__20602.invoke(mime.clj:24)

I struggled with this for a while; the problem occurred while I was running the following code:

(#{"application/pdf" "image/jpg" "image/jpeg" "image/tiff"}
 (mime-type-of (io/file file)))

Reverting Pantomime from 2.8 to 2.3 fixes the problem.

@agofilo
Author

agofilo commented Jan 13, 2016

Update: With 2.7.0 it works.

@michaelklishin
Owner

This genuinely looks like a Tika API change. I don't know what Pantomime can do about it.

@kennethkalmer

Right, so since this just happened to me and this issue pops up very high on Google for the exception, I thought to trace this and publish an answer here for others that might experience the same thing.

Digging around a bit more, I found issue TIKA-1717 and from there it became clear that somehow my project was using an older version of commons-compress. I then used lein deps :tree to hunt down the other responsible parties and got all the necessary :exclusions in place for this to work.

In my case the exclusions were:

  :dependencies [;; ...
                 [org.webjars/webjars-locator-jboss-vfs "0.1.0"
                  :exclusions [org.apache.commons/commons-compress]]
                 [me.raynes/fs "1.4.6"
                  :exclusions [org.clojure/clojure
                               org.apache.commons/commons-compress]]
                 ;; ...
                 ]

The webjars-locator is part of the default Luminus template, so this might very easily happen to someone in the future.

Pantomime will bring in at least commons-compress 1.12, and the Tika bug I mentioned was resolved in commons-compress 1.11. So those are the versions to aim for when working around this.
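
An alternative to hunting down every exclusion is to declare commons-compress directly in your own project, since a top-level dependency should normally take precedence over older transitive copies. A minimal sketch of a project.clj, assuming a hypothetical project name and Clojure version (only the Pantomime and commons-compress versions come from this thread):

```clojure
;; project.clj (sketch). `example-app` and the Clojure version are
;; placeholders, not taken from this issue.
(defproject example-app "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.8.0"]
                 [com.novemberain/pantomime "2.8.0"]
                 ;; Declared directly so that dependency resolution prefers
                 ;; it over older versions pulled in by other libraries.
                 [org.apache.commons/commons-compress "1.12"]])
```

Running lein deps :tree afterwards should show which version actually ends up on the classpath.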

@michaelklishin
Owner

michaelklishin commented Jul 13, 2017

@kennethkalmer thank you! Is there anything Pantomime can do to avoid this? Or is it really an outdated [transitive] dependency on org.apache.commons/commons-compress in Luminus (or possibly other libraries)?

@kennethkalmer

Outdated transitive dependency 😢

There are probably a few things:

  • Some documentation in the README (or link to wiki page), with lein deps :tree & :exclusions instructions
  • Catch that specific exception and print out a warning message to stderr, then rethrow it
  • Do version detection on the loaded commons-compress version, but that doesn't seem feasible after a 5 second scroll through the Javadocs looking for a version attribute of sorts... Alternatively, try to load/use org.apache.commons.compress.PasswordRequiredException, which was only added in commons-compress 1.10, as "poor man's version detection" (until they remove/rename it)

The sad truth is that many versions of commons-* will compete in a project, just because, well, they're common... And I don't know how to guard against that.

Maybe the first and second points should both be done. The first one shares knowledge with users that is more broadly applicable to other edge cases too, not just with Pantomime. The second point just adds some clarity to that moment of insane confusion... Especially when seeing a Zip error while trying to parse a PDF file.

Wdyt?
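
For illustration, the second and third points could look roughly like the sketch below. This is not Pantomime's code: class-available?, marker-class, and with-classpath-hint are hypothetical names, and the only library fact assumed is the one stated above, that PasswordRequiredException exists from commons-compress 1.10 onwards.

```clojure
(defn class-available?
  "Returns true when `class-name` can be loaded from the current classpath."
  [class-name]
  (try
    (Class/forName class-name)
    true
    (catch ClassNotFoundException _
      false)))

;; "Poor man's version detection": this class is absent from
;; commons-compress releases older than 1.10.
(def marker-class
  "org.apache.commons.compress.PasswordRequiredException")

(defn with-classpath-hint
  "Calls the zero-argument function `f`. If it throws NoClassDefFoundError,
  prints a hint to stderr and rethrows the original error."
  [f]
  (try
    (f)
    (catch NoClassDefFoundError e
      (binding [*out* *err*]
        (println "Hint:" (.getMessage e) "could not be loaded."
                 "An outdated commons-compress may be on the classpath;"
                 "check `lein deps :tree` and add :exclusions."))
      (throw e))))

;; Example use (requires Pantomime on the classpath):
;; (with-classpath-hint #(mime-type-of (io/file "some.pdf")))
```

Calling (class-available? marker-class) at startup gives an early warning, while the wrapper only adds context at the moment the error actually surfaces.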

@kennethkalmer

Oh snap, just realized the same was discussed in #15

@michaelklishin
Owner

I will add a modern commons-compress dependency instead and perhaps a README note.

michaelklishin added a commit that referenced this issue Jul 16, 2017
michaelklishin added a commit that referenced this issue Jul 16, 2017