The current setup of the inference sidecar allows inference through PyTorch and TensorFlow. While for TensorFlow/Keras we can load both the model weights and the model architecture from a single file, for PyTorch there are restrictions on what kind of model can be saved in a single `.pt` file. Because of these restrictions, we effectively need to load the source library (the Python classes that define the model) during inference, which is not a good design. Attaching a snippet from the PyTorch documentation below:
“The disadvantage of this approach is that the serialized data is bound to the specific classes and the exact directory structure used when the model is saved. The reason for this is because pickle does not save the model class itself. Rather, it saves a path to the file containing the class, which is used during load time. Because of this, your code can break in various ways when used in other projects or after refactors.”
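For illustration, here is a minimal sketch of the pickle-based workflow the quote is describing (the class and file names are made up):

```python
import torch
import torch.nn as nn

class BidModel(nn.Module):
    """Hypothetical model class."""
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 1)

    def forward(self, x):
        return self.linear(x)

# Saving the whole module pickles a *reference* to BidModel,
# not the class definition itself.
torch.save(BidModel(), "model.pt")

# In a separate process (e.g., the inference sidecar), this load fails
# unless the BidModel class is importable from the same module path.
loaded = torch.load("model.pt")
```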
Another method is to save/load the model in TorchScript format, but that also does not work when the network might not have a fixed, pre-determined compute graph.
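As a concrete illustration of the tracing limitation (again a made-up model):

```python
import torch
import torch.nn as nn

class DynamicModel(nn.Module):
    """Model whose compute graph depends on the input values."""
    def __init__(self):
        super().__init__()
        self.small = nn.Linear(4, 1)
        self.large = nn.Linear(4, 1)

    def forward(self, x):
        # Data-dependent branch: not a fixed compute graph.
        if x.sum() > 0:
            return self.small(x)
        return self.large(x)

model = DynamicModel()
example = torch.ones(1, 4)

# Tracing records only the path taken for `example` (x.sum() > 0),
# so the `else` branch is silently dropped; PyTorch emits a TracerWarning.
traced = torch.jit.trace(model, example)
print(traced(-torch.ones(1, 4)))  # still runs the `small` branch -- wrong result
```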
In such cases, how can we use the inference sidecar without loading the Python classes needed to define the model architecture?
Can supporting ONNX inference be a solution to this?
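For reference, ONNX would produce a self-contained graph-plus-weights file that can be served without any Python class definitions. A sketch, assuming the `onnxruntime` package is available:

```python
import torch
import torch.nn as nn
import onnxruntime as ort

model = nn.Linear(4, 1)  # stand-in for a real model

# Export serializes the graph and the weights into a single .onnx file.
torch.onnx.export(model, torch.ones(1, 4), "model.onnx",
                  input_names=["input"], output_names=["output"])

# Inference elsewhere needs only the file and a runtime -- no Python
# class definition for the model is required.
session = ort.InferenceSession("model.onnx")
result = session.run(None, {"input": torch.ones(1, 4).numpy()})
```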
Hey, thanks for the feedback. I think you're referring to PyTorch's pickle/unpickle capability for the Python runtime, without using the TorchScript format. In that case, you may only be able to use your model in the environment where you created it. However, we don't support this format. We only support TorchScript models, which include both model architecture and model parameters in a single saved file. There are two ways of saving a TorchScript model: tracing and scripting. Traced models are also complete in that they contain both architecture and parameters. You may not be able to trace certain models with conditional behavior, but you can get around this by using scripting, which directly translates your Python code into TorchScript IR.
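A minimal sketch of the scripting route, using the same hypothetical data-dependent model as above:

```python
import torch
import torch.nn as nn

class DynamicModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.small = nn.Linear(4, 1)
        self.large = nn.Linear(4, 1)

    def forward(self, x):
        # Data-dependent branch that tracing cannot capture.
        if x.sum() > 0:
            return self.small(x)
        return self.large(x)

# Scripting compiles the Python source itself, so both branches of the
# `if` are preserved in the TorchScript IR.
scripted = torch.jit.script(DynamicModel())
scripted.save("model_scripted.pt")

# The saved file bundles architecture and parameters; loading it needs
# no Python class definition at all.
loaded = torch.jit.load("model_scripted.pt")
print(loaded(-torch.ones(1, 4)))  # takes the `large` branch as expected
```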
BTW, we have heard requests for ONNX models. This is something we are considering, but I don't have updates on prioritization or timelines for this yet.