
Easiest way to serve ONNX model trained in PyTorch #47

isaacmg opened this issue Dec 17, 2018 · 6 comments

isaacmg commented Dec 17, 2018

Hello, I was wondering if you had an example of how to serve, in Scala, an ONNX model trained in Python. Basically I'm looking to deploy a number of models trained in PyTorch and exported to ONNX as quickly as possible, for use in a Flink streaming application. I wanted to see if you had a simple example of taking a pretrained ONNX model and performing predictions, without all the additional training code. Also, is it possible to simply load the model with Lantern for inference, without having to redefine the entire architecture in Scala? Thank you.

feiwang3311 (Owner) commented

Absolutely. I am glad that this repo has caught more attention from the community! I just added something to facilitate your request, but there are still some complications. Let me clear those up first.

First of all, Lantern is not usable from Scala. It may sound strange, but we use Scala to generate low-level code (C++ or CUDA). We can directly use the generated low-level code (compiling and linking it), but we do not yet provide a way to call the generated code from Scala. You have to (at least for now) use it as C++/CUDA code.

Secondly, our support for ONNX ops is limited. So far we have only supported SqueezeNet and ResNet50 for our paper; other ops need more engineering effort. However, we are interested in adding more support, so if you feel like it, you can share the model with us and we will work on getting it running in Lantern.

For a running example, please refer to https://github.com/feiwang3311/Lantern/blob/master/src/main/scala/lantern/GenerateLibraryAPP/GenerateONNX.scala#L7, which shows how to generate library functions from an ONNX model.

The way to use it is:

git clone https://github.com/feiwang3311/Lantern.git
cd Lantern
sbt
run $dirToONNXModel $dirToGeneratedLib $filenameOfGeneratedLib $nameOfGeneratedFunction

sbt will then prompt you to pick the correct main class to run (which should be choice 1).

It should create a directory (dirToGeneratedLib) containing two files, filenameOfGeneratedLib.cpp and filenameOfGeneratedLib.h. You can then write other C++ code that compiles and links against the generated function. If you need to generate CUDA code that runs on GPU, I can add that support too (it is not yet in the repo).
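For illustration, here is a minimal sketch of such a caller. The header name, function name, and signature are assumptions based on the arguments passed to sbt above, not the actual GenerateONNX output; check the generated .h file for the real interface.

```cpp
// caller.cpp -- hypothetical sketch of calling the generated library.
// The header/function names and the (input, output) signature are
// assumptions; consult the generated filenameOfGeneratedLib.h.
#include <cstdio>
#include <vector>

#include "filenameOfGeneratedLib.h"

int main() {
  // Assume a SqueezeNet-style classifier: 3x224x224 input, 1000 scores.
  std::vector<float> input(3 * 224 * 224, 0.0f);
  std::vector<float> output(1000, 0.0f);

  // Hypothetical entry point emitted by GenerateONNX.
  nameOfGeneratedFunction(input.data(), output.data());

  std::printf("score[0] = %f\n", output[0]);
  return 0;
}
```

Compile and link it together with the generated source, e.g. g++ -O2 caller.cpp filenameOfGeneratedLib.cpp -o predict (exact flags depend on what the generated code needs).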

Let me know what you think about the response, and if you have problems running the code. We are interested in determining priorities for our limited engineering effort too, so we would like to hear more feedback, and potentially (if you think you still can use Lantern after you see the complications) work together for your specific case.

isaacmg (Author) commented Dec 17, 2018

Thanks for your follow-up. I unfortunately don't think that this will work. My current problem is that I need to serve ONNX models originally exported from PyTorch inside a Flink pipeline (a large distributed stream processing engine written in Java/Scala). I'm currently using Java Embedded Python (JEP) but that is producing problems with respect to shared libraries. I'm really looking for a way to load the exported ONNX model natively into Java/Scala or another JVM language and run it without resorting to external REST APIs or embedding Python in Java like JEP. I've found it hard to find a good solution to this problem, which I find odd given the popularity of big data frameworks in Java/Scala and the number of production applications still written in Java.

feiwang3311 (Owner) commented

Sorry about that. I totally agree with you on that issue; the ML community is overly centered on the Python environment. Your situation poses an interesting challenge for me, to see if we can actually run Lantern from Scala. I will let you know if I dig up more about it.
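In the meantime, one conceivable bridge (purely a sketch, not something Lantern provides today): compile the generated C++ into a shared library behind an extern "C" wrapper, and load it from the JVM via JNI or JNA. All names below are hypothetical:

```cpp
// shim.cpp -- hypothetical extern "C" wrapper so JVM code (e.g. via JNA)
// can reach the generated function; names and signature are assumptions.
#include "filenameOfGeneratedLib.h"

extern "C" void lantern_predict(float* input, float* output) {
  // Forward to the hypothetical entry point emitted by GenerateONNX.
  nameOfGeneratedFunction(input, output);
}
```

Built as a shared library (e.g. g++ -shared -fPIC shim.cpp filenameOfGeneratedLib.cpp -o liblantern.so), the lantern_predict symbol could then be bound from Scala with JNA's Native.load.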

feiwang3311 (Owner) commented

Not sure if you have found a solution to your problem, but I just learned that TensorFlow has a Scala frontend (https://github.com/eaplatanios/tensorflow_scala). Could this help you?

isaacmg (Author) commented Dec 23, 2018

Thanks for the suggestion, but I'm primarily looking at serving PyTorch models, not TensorFlow ones; that is why I'm exporting to ONNX.

TiarkRompf (Collaborator) commented

goo
