Handle error "Failed to load PyTorch native library" #123

Closed
koppor opened this issue Aug 5, 2024 · 13 comments

Collaborator

koppor commented Aug 5, 2024

While trying to reproduce #105, I closed JabRef while the download was still running. After restarting JabRef, I got the following error:

Failed to load PyTorch native library

Clicking on "Try to rebuild again" leads to the same error.

koppor added this to the Week 1 milestone Aug 5, 2024

Collaborator Author

koppor commented Aug 5, 2024

ERROR: An error occurred while building the embedding model: ai.djl.engine.EngineException: Failed to load PyTorch native library
    at [email protected]/ai.djl.pytorch.engine.PtEngine.newInstance(PtEngine.java:90)
    at [email protected]/ai.djl.pytorch.engine.PtEngineProvider.getEngine(PtEngineProvider.java:41)
    at [email protected]/ai.djl.engine.Engine.getEngine(Engine.java:190)
    at [email protected]/ai.djl.Model.newInstance(Model.java:99)
    at [email protected]/ai.djl.repository.zoo.BaseModelLoader.createModel(BaseModelLoader.java:196)
    at [email protected]/ai.djl.repository.zoo.BaseModelLoader.loadModel(BaseModelLoader.java:159)
    at [email protected]/ai.djl.repository.zoo.Criteria.loadModel(Criteria.java:174)
    at [email protected]/org.jabref.logic.ai.models.DeepJavaEmbeddingModel.<init>(DeepJavaEmbeddingModel.java:23)
    at [email protected]/org.jabref.logic.ai.models.EmbeddingModel.rebuild(EmbeddingModel.java:128)
    at [email protected]/org.jabref.gui.util.BackgroundTask$2.call(BackgroundTask.java:91)
    at [email protected]/org.jabref.gui.util.BackgroundTask$2.call(BackgroundTask.java:88)
    at [email protected]/org.jabref.gui.util.UiTaskExecutor$1.call(UiTaskExecutor.java:170)
    at [email protected]/javafx.concurrent.Task$TaskCallable.call(Task.java:1399)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
2024-08-05 16:42:54 [JavaFX Application Thread] org.jabref.gui.JabRefDialogService.notify()
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
INFO: An error occurred while building the embedding model
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.lang.UnsatisfiedLinkError: C:\Users\WDAGUtilityAccount\.djl.ai\pytorch\2.3.1-cpu-win-x86_64\fbgemm.dll: Can't find dependent libraries
    at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2418)
    at java.base/java.lang.Runtime.load0(Runtime.java:852)
    at java.base/java.lang.System.load(System.java:2025)
    at [email protected]/ai.djl.pytorch.jni.LibUtils.loadNativeLibrary(LibUtils.java:379)
    at [email protected]/ai.djl.pytorch.jni.LibUtils.loadLibTorch(LibUtils.java:195)
    at [email protected]/ai.djl.pytorch.jni.LibUtils.loadLibrary(LibUtils.java:82)
    at [email protected]/ai.djl.pytorch.engine.PtEngine.newInstance(PtEngine.java:53)

Collaborator Author

koppor commented Aug 5, 2024

Caused by: java.lang.UnsatisfiedLinkError: C:\Users\WDAGUtilityAccount\.djl.ai\pytorch\2.3.1-cpu-win-x86_64\fbgemm.dll: Can't find dependent libraries

Owner

InAnYan commented Aug 6, 2024

I also had this issue, and I don't know what to do with it...

I tried to restart JabRef and it worked okay

Collaborator Author

koppor commented Aug 6, 2024

Does not work on my side:

Caused by: java.lang.UnsatisfiedLinkError: C:\Users\vagrant\.djl.ai\pytorch\2.3.1-cpu-win-x86_64\fbgemm.dll: Can't find dependent libraries
    at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2418)
    at java.base/java.lang.Runtime.load0(Runtime.java:852)
    at java.base/java.lang.System.load(System.java:2025)
    at [email protected]/ai.djl.pytorch.jni.LibUtils.loadNativeLibrary(LibUtils.java:379)
    at [email protected]/ai.djl.pytorch.jni.LibUtils.loadLibTorch(LibUtils.java:195)
    at [email protected]/ai.djl.pytorch.jni.LibUtils.loadLibrary(LibUtils.java:82)
    at [email protected]/ai.djl.pytorch.engine.PtEngine.newInstance(PtEngine.java:53)
    ... 18 more

This refs #105.

Maybe, if the download can be triggered explicitly, this can be "solved" with a workaround: the user clicks download again. Or JabRef could retry the download automatically? (refs #87)
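
An automatic retry of this kind could be sketched roughly as follows. This is only a minimal illustration: `RetryingDownloader` and `retry` are hypothetical names that exist in neither JabRef nor DJL, and the actual download is stood in for by a plain `Callable`.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration; not part of JabRef or DJL. */
final class RetryingDownloader {

    /**
     * Runs the task up to maxAttempts times and returns the first successful result.
     * If every attempt fails, the exception of the last attempt is rethrown.
     */
    static <T> T retry(Callable<T> task, int maxAttempts, long delayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Fixed pause between attempts; a real implementation might back off.
                    Thread.sleep(delayMillis);
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Stand-in for the real download: fails twice, then succeeds.
        String result = retry(() -> {
            if (++calls[0] < 3) {
                throw new IOException("simulated interrupted download");
            }
            return "downloaded";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

A retry only helps if a failed attempt does not leave half-written files behind that the next attempt then treats as complete.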

Owner

InAnYan commented Aug 7, 2024

But there is already a "Rebuild" button? (It was named "Try to rebuild" a few commits ago.)

Owner

InAnYan commented Aug 7, 2024

Hmm, if only we had a way to reproduce this bug. I tried to test it, but it doesn't appear again.

Collaborator Author

koppor commented Aug 7, 2024

On Linux Mint, the download is very fast. Thus, it is a Windows issue.

Reproduce:

  1. Install VirtualBox (Windows howto at https://github.com/JabRef/jabref/tree/main/scripts/vms - you can adapt it for Linux if you run Linux)
  2. Install Vagrant
  3. cd scripts/vm/windows
  4. vagrant up
  5. Wait until the Windows VM is up (approx. 10 minutes)
  6. Log in using vagrant as the password
  7. Open cmd
  8. git clone https://github.com/JabRef/jabref.git
  9. cd jabref
  10. git checkout ai-pr-1
  11. gradlew run
  12. Go to Settings -> AI
  13. Enable AI
  14. Click on Save
  15. Quit JabRef
  16. Kill the running download jobs
  17. Press Ctrl+C in the Gradle output to make sure everything is terminated

Result:

2024-08-07 10:53:47 [pool-2-thread-3] ai.djl.pytorch.jni.LibUtils.downloadPyTorch()
INFO: Downloading https://publish.djl.ai/pytorch/2.3.1/cpu/win-x86_64/native/lib/asmjit.dll.gz ...
2024-08-07 10:53:47 [main] org.jabref.Launcher.main()
ERROR: Unexpected exception: java.lang.RuntimeException: Exception in Application stop method
    at [email protected]/com.sun.javafx.application.LauncherImpl.launchApplication1(LauncherImpl.java:898)
    at [email protected]/com.sun.javafx.application.LauncherImpl.lambda$launchApplication$2(LauncherImpl.java:196)
    at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.util.NoSuchElementException: java.lang.IndexOutOfBoundsException
    at java.base/java.util.AbstractList$Itr.next(AbstractList.java:379)
    at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
    at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1939)
    at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:762)
    at [email protected]/org.jabref.gui.util.UiTaskExecutor.shutdown(UiTaskExecutor.java:131)
    at [email protected]/org.jabref.gui.JabRefGUI.shutdownThreadPools(JabRefGUI.java:375)
    at [email protected]/org.jabref.gui.JabRefGUI.stop(JabRefGUI.java:365)
    at [email protected]/com.sun.javafx.application.LauncherImpl.lambda$launchApplication1$10(LauncherImpl.java:858)
    at [email protected]/com.sun.javafx.application.PlatformImpl.lambda$runAndWait$12(PlatformImpl.java:483)
    at [email protected]/com.sun.javafx.application.PlatformImpl.lambda$runLater$10(PlatformImpl.java:456)
    at java.base/java.security.AccessController.doPrivileged(AccessController.java:400)
    at [email protected]/com.sun.javafx.application.PlatformImpl.lambda$runLater$11(PlatformImpl.java:455)
    at [email protected]/com.sun.glass.ui.InvokeLaterDispatcher$Future.run(InvokeLaterDispatcher.java:95)
    at [email protected]/com.sun.glass.ui.win.WinApplication._runLoop(Native Method)
    at [email protected]/com.sun.glass.ui.win.WinApplication.lambda$runLoop$3(WinApplication.java:184)
    ... 1 more
Caused by: java.lang.IndexOutOfBoundsException
    at [email protected]/javafx.collections.transformation.FilteredList.get(FilteredList.java:169)
    at [email protected]/com.tobiasdiez.easybind.MappedList.get(MappedList.java:31)
    at java.base/java.util.AbstractList$Itr.next(AbstractList.java:373)
    ... 15 more
2024-08-07 10:53:47 [pool-2-thread-3] ai.djl.pytorch.jni.LibUtils.downloadPyTorch()
INFO: Downloading https://publish.djl.ai/pytorch/2.3.1/cpu/win-x86_64/native/lib/libiompstubs5md.dll.gz ...
<===========--> 90% EXECUTING [47s]
> :run
^CTerminate batch job (Y/N)? ^C

Then re-run JabRef (using gradlew run)

Result:

024-08-07 11:01:33 [JavaFX Application Thread] sun.util.logging.internal.LoggingProviderImpl$JULWrapper.log()
WARN: Resource "" not found.
Loading:     100% |========================================|
2024-08-07 11:01:34 [JavaFX Application Thread] org.jabref.logic.ai.models.EmbeddingModel.lambda$startRebuildingTask$1()
ERROR: An error occurred while building the embedding model: ai.djl.engine.EngineException: Failed to load PyTorch native library
    at [email protected]/ai.djl.pytorch.engine.PtEngine.newInstance(PtEngine.java:90)
    at [email protected]/ai.djl.pytorch.engine.PtEngineProvider.getEngine(PtEngineProvider.java:41)
    at [email protected]/ai.djl.engine.Engine.getEngine(Engine.java:190)

This refs #132 - a broken embedded model should be fixed with a download!

Owner

InAnYan commented Aug 7, 2024

OMG so that's a Windows issue?

Owner

InAnYan commented Aug 7, 2024

Okay, a little bit of research showed that the user needs to install the VC++ redistributable:

https://learn.microsoft.com/en-us/cpp/windows/latest-supported-vc-redist?view=msvc-170
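
Since the interesting part of the stack traces in this thread is the nested `UnsatisfiedLinkError`, a handler could walk the cause chain and turn it into a more helpful message. The following is a minimal sketch under that assumption; `NativeLibraryDiagnostics` and its methods are hypothetical names and not part of JabRef or DJL, and the hint text is only an example.

```java
import java.util.Optional;

/** Hypothetical helper for illustration; not part of JabRef or DJL. */
final class NativeLibraryDiagnostics {

    /** Walks the cause chain and returns the first UnsatisfiedLinkError, if there is one. */
    static Optional<UnsatisfiedLinkError> findLinkError(Throwable error) {
        for (Throwable t = error; t != null; t = t.getCause()) {
            if (t instanceof UnsatisfiedLinkError) {
                return Optional.of((UnsatisfiedLinkError) t);
            }
        }
        return Optional.empty();
    }

    /** Returns a hint for the failure reported in this issue, or an empty Optional otherwise. */
    static Optional<String> hintFor(Throwable error) {
        return findLinkError(error)
                .map(Throwable::getMessage)
                // The message text is taken from the stack traces in this thread.
                .filter(message -> message.contains("Can't find dependent libraries"))
                .map(message -> "A native library could not be loaded. On Windows, installing "
                        + "the Visual C++ Redistributable may fix this.");
    }

    public static void main(String[] args) {
        Throwable failure = new RuntimeException(
                "Failed to load PyTorch native library",
                new UnsatisfiedLinkError("fbgemm.dll: Can't find dependent libraries"));
        System.out.println(hintFor(failure).orElse("No hint available"));
    }
}
```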

Owner

InAnYan commented Aug 7, 2024

Wait until Windows VM is up (approx 10 Minutes)

😔

Owner

InAnYan commented Aug 7, 2024

From dev call:

  1. Check whether VC++ really needs to be installed (if so, update the docs; not the blog, which should only link to the docs)
  2. Add documentation for the case where the embedding model download was interrupted midway and AI can't be used at all (fix: delete the .djl.ai directory)
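
The manual fix from item 2 (deleting the `.djl.ai` directory) could also be done programmatically. Below is a minimal sketch, assuming the cache lives in `.djl.ai` under the user's home directory, as the paths in the stack traces suggest. `DjlCacheCleaner` is a hypothetical name, not an existing JabRef or DJL class.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

/** Hypothetical helper for illustration; not part of JabRef or DJL. */
final class DjlCacheCleaner {

    /** Assumed cache location, derived from the paths shown in the stack traces. */
    static Path defaultCacheDirectory() {
        return Path.of(System.getProperty("user.home"), ".djl.ai");
    }

    /** Deletes the directory and everything below it; does nothing if it does not exist. */
    static void deleteRecursively(Path directory) throws IOException {
        if (Files.notExists(directory)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(directory)) {
            // Reverse order so that children are deleted before their parent directories.
            paths.sorted(Comparator.reverseOrder()).forEach(path -> {
                try {
                    Files.delete(path);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (UncheckedIOException e) {
            throw e.getCause();
        }
    }

    public static void main(String[] args) throws IOException {
        // Demonstrates on a throwaway directory so that running the sketch never touches a real cache.
        Path demo = Files.createTempDirectory("djl-cache-demo");
        Files.createDirectories(demo.resolve("pytorch"));
        Files.writeString(demo.resolve("pytorch").resolve("partial.dll"), "incomplete");
        deleteRecursively(demo);
        System.out.println("Demo directory removed: " + Files.notExists(demo));
        System.out.println("Assumed real cache location: " + defaultCacheDirectory());
    }
}
```

Calling `deleteRecursively(defaultCacheDirectory())` while JabRef is not running would force a fresh download on the next start.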

Collaborator Author

koppor commented Aug 7, 2024

Windows: choco install vcredist140

Link for self-guided installation: https://aka.ms/vs/16/release/vc_redist.x64.exe
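
Before asking users to install anything, an application could try to detect whether the runtime is already present. The sketch below is a rough heuristic only: it assumes the redistributable places `vcruntime140.dll`, `vcruntime140_1.dll` and `msvcp140.dll` in the Windows `System32` directory, and `VcRuntimeCheck` is a hypothetical name. Even if all three files are present, loading `fbgemm.dll` can still fail.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper for illustration; not part of JabRef or DJL. */
final class VcRuntimeCheck {

    // Assumption: these DLLs are installed by the Visual C++ x64 redistributable.
    static final List<String> RUNTIME_DLLS =
            List.of("vcruntime140.dll", "vcruntime140_1.dll", "msvcp140.dll");

    /** Returns the names of the expected runtime DLLs that are not present in the given directory. */
    static List<String> missingDlls(Path systemDirectory) {
        return RUNTIME_DLLS.stream()
                .filter(name -> !Files.isRegularFile(systemDirectory.resolve(name)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String systemRoot = System.getenv("SystemRoot");
        if (systemRoot == null) {
            System.out.println("SystemRoot is not set (probably not Windows); nothing to check.");
            return;
        }
        System.out.println("Missing runtime DLLs: " + missingDlls(Path.of(systemRoot, "System32")));
    }
}
```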

Owner

InAnYan commented Aug 7, 2024

Yes, VC++ is really needed, so I updated the documentation accordingly.

I also added a guide for the case where the user closes JabRef in the middle of downloading the embedding model.

InAnYan closed this as completed Aug 7, 2024
koppor pushed a commit to JabRef/user-documentation that referenced this issue Aug 12, 2024
* Add AI documentation

* Fix some types and grammar

* Rework AI documentation

* Little fixes

* Update the documentation for various AI providers and summarization

* Add notice for InAnYan/jabref#123

* Follow-up notice for InAnYan/jabref#123

* Restructure AI documenation

* Fix from code review

* Fix from linter

* Fix from linter

---------

Co-authored-by: ThiloteE <[email protected]>