Easiest way to serve ONNX model trained in PyTorch #47
Absolutely. I am glad that this repo has caught more attention from the community! I just added something to facilitate your request, but there are still a few complications; let me clear those up first.

First of all, Lantern is not usable from within Scala. It may sound strange, but we use Scala to generate low-level code (C++ or CUDA). The generated low-level code can be used directly (compiled and linked), but we do not yet provide a way to call the generated code from Scala. For now, you have to use it as C++/CUDA code.

Secondly, our support for ONNX ops is limited. So far we have only supported SqueezeNet and ResNet-50 for our paper; more engineering effort is needed for other ops. However, we are interested in adding more support, so if you feel like it, you can share the model with us and we will work to get that model running in Lantern.

For a running example, please refer to https://github.com/feiwang3311/Lantern/blob/master/src/main/scala/lantern/GenerateLibraryAPP/GenerateONNX.scala#L7. The way to use it is:
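A minimal sketch of what that invocation might look like, assuming a hypothetical driver object modeled on GenerateONNX.scala (the object and method names below are assumptions, not Lantern's actual API):

```scala
// Hypothetical sketch only: OnnxCodegenDriver and codegen() are assumed names;
// see GenerateONNX.scala in the Lantern repo for the real entry point.
object GenerateMyModel {
  def main(args: Array[String]): Unit = {
    val modelFile              = "path/to/model.onnx" // ONNX model exported from PyTorch
    val dirToGeneratedLib      = "generated_lib"      // output directory for the emitted code
    val filenameOfGeneratedLib = "mymodel"            // base name of the emitted .cpp/.h pair

    // Lantern reads the ONNX graph in Scala and emits low-level C++
    // implementing inference for that graph.
    val driver = new OnnxCodegenDriver(modelFile, dirToGeneratedLib, filenameOfGeneratedLib)
    driver.codegen()
  }
}
```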
It should create a directory (`dirToGeneratedLib`) containing two files, `filenameOfGeneratedLib.cpp` and `filenameOfGeneratedLib.h`. You can then write other C++ code that compiles and links against the generated function. If you need to generate CUDA code that runs on a GPU, I can add that support too (it is not yet in the repo).

Let me know what you think about this response, and whether you have problems running the code. We are also interested in setting priorities for our limited engineering effort, so we would like to hear more feedback, and potentially (if you think you can still use Lantern after seeing the complications) work together on your specific case.
Thanks for your follow-up. Unfortunately, I don't think that this will work. My current problem is that I need to serve ONNX models originally exported from PyTorch inside a Flink pipeline (a large distributed stream-processing engine written in Java/Scala). I'm currently using Java Embedded Python (JEP), but that is producing problems with respect to shared libraries. I'm really looking for a way to load the exported ONNX model natively into Java/Scala or another JVM language and run it, without resorting to external REST APIs or embedding Python in Java as JEP does. I've found it hard to find a good solution to this problem, which I find odd given the popularity of big data frameworks in Java/Scala and the number of production applications still written in Java.
Sorry about that. I totally agree with you on that issue. The ML community is overly centered on the Python environment.
Not sure if you have found a solution to your problem, but I just learned that TensorFlow has a Scala frontend (https://github.com/eaplatanios/tensorflow_scala). Could this help you?
Thanks for the suggestion, but I'm primarily looking at serving PyTorch models, not TensorFlow ones. That is why I'm exporting to ONNX.
Hello, I was wondering if you had an example of how to serve an ONNX model trained in Python from Scala. Basically, I'm looking to deploy a number of models trained in PyTorch and exported to ONNX as quickly as possible for use in a Flink streaming application. I wanted to see if you had a simple example of taking a pretrained ONNX model and performing predictions without all the additional training code. Also, is it possible to simply load the model with Lantern for inference without having to redefine the entire architecture in Scala? Thank you.