The current setup of the inference sidecar allows inference through PyTorch and TensorFlow. While for TensorFlow/Keras we can load both the model weights and the model architecture from a single file, for PyTorch there are restrictions on what kind of model can be saved in a single `.pt` file. Because of these restrictions, we effectively need to load the source library (the Python classes that define the model) during inference, which is not a good design. Attaching a snippet from the PyTorch documentation below:
“The disadvantage of this approach is that the serialized data is bound to the specific classes and the exact directory structure used when the model is saved. The reason for this is because pickle does not save the model class itself. Rather, it saves a path to the file containing the class, which is used during load time. Because of this, your code can break in various ways when used in other projects or after refactors.”
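For illustration, here is a minimal sketch of the pickle-based workflow the quote is describing (the class and file names are made up):

```python
import torch
import torch.nn as nn

class BidModel(nn.Module):
    """Hypothetical model class."""
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 1)

    def forward(self, x):
        return self.linear(x)

# Saving the whole module pickles a *reference* to BidModel,
# not the class definition itself.
torch.save(BidModel(), "model.pt")

# In a separate process (e.g., the inference sidecar), this load fails
# unless the BidModel class is importable from the same module path.
loaded = torch.load("model.pt")
```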
Another method is to save/load the model in TorchScript format, but that also does not work when the network might not have a fixed, pre-determined compute graph.
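As a concrete illustration of the tracing limitation (again a made-up model):

```python
import torch
import torch.nn as nn

class DynamicModel(nn.Module):
    """Model whose compute graph depends on the input values."""
    def __init__(self):
        super().__init__()
        self.small = nn.Linear(4, 1)
        self.large = nn.Linear(4, 1)

    def forward(self, x):
        # Data-dependent branch: not a fixed compute graph.
        if x.sum() > 0:
            return self.small(x)
        return self.large(x)

model = DynamicModel()
example = torch.ones(1, 4)

# Tracing records only the path taken for `example` (x.sum() > 0),
# so the `else` branch is silently dropped; PyTorch emits a TracerWarning.
traced = torch.jit.trace(model, example)
print(traced(-torch.ones(1, 4)))  # still runs the `small` branch -- wrong result
```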
In such cases, how can we use the inference sidecar without loading the Python classes needed to define the model architecture?
Can supporting ONNX inference be a solution to this?
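For reference, ONNX would produce a self-contained graph-plus-weights file that can be served without any Python class definitions. A sketch, assuming the `onnxruntime` package is available:

```python
import torch
import torch.nn as nn
import onnxruntime as ort

model = nn.Linear(4, 1)  # stand-in for a real model

# Export serializes the graph and the weights into a single .onnx file.
torch.onnx.export(model, torch.ones(1, 4), "model.onnx",
                  input_names=["input"], output_names=["output"])

# Inference elsewhere needs only the file and a runtime -- no Python
# class definition for the model is required.
session = ort.InferenceSession("model.onnx")
result = session.run(None, {"input": torch.ones(1, 4).numpy()})
```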
Hey, thanks for the feedback. I think you're referring to PyTorch's pickle/unpickle capability for the Python runtime, without using the TorchScript format. In that case, you may only be able to use your model in the environment where you created it. However, we don't support this format. We only support TorchScript models, which include both model architecture and model parameters in a single saved file. There are two ways of saving a TorchScript model: tracing and scripting. Traced models are also complete in that they contain both architecture and parameters. You may not be able to trace certain models with conditional behavior, but you can get around this by using scripting, which directly translates your Python code into TorchScript IR.
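A minimal sketch of the scripting route, using the same hypothetical data-dependent model as above:

```python
import torch
import torch.nn as nn

class DynamicModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.small = nn.Linear(4, 1)
        self.large = nn.Linear(4, 1)

    def forward(self, x):
        # Data-dependent branch that tracing cannot capture.
        if x.sum() > 0:
            return self.small(x)
        return self.large(x)

# Scripting compiles the Python source itself, so both branches of the
# `if` are preserved in the TorchScript IR.
scripted = torch.jit.script(DynamicModel())
scripted.save("model_scripted.pt")

# The saved file bundles architecture and parameters; loading it needs
# no Python class definition at all.
loaded = torch.jit.load("model_scripted.pt")
print(loaded(-torch.ones(1, 4)))  # takes the `large` branch as expected
```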
BTW, we have heard requests for ONNX models. This is something we are considering, but I don't have updates on prioritization or timelines for this yet.