Is your feature request related to a problem?
This is related to compiling ONNX models for upload to OpenSearch.
There are two problems with the status quo:

- transformers.convert_graph_to_onnx.convert will be deprecated in the next major version of Hugging Face Transformers.
- transformers.convert_graph_to_onnx.convert only exports the base model, so the head that sits on top of the base model's outputs is left off. For embedding models we worked around this by re-implementing the pooling layer in ml-commons for ONNX models (see the sketch after this list), but for other pretrained classification heads (e.g. cross-encoders) that workaround is simply impossible.
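For context, here is a minimal Python sketch of what that pooling workaround amounts to, assuming a head-less base model exported to `base-model.onnx` and the sentence-transformers/all-MiniLM-L6-v2 tokenizer (both names are illustrative; ml-commons implements the equivalent logic in Java):

```python
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

# Illustrative artifacts: a base (head-less) model exported the old way,
# plus its matching tokenizer.
session = ort.InferenceSession("base-model.onnx")
tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

inputs = tokenizer(["example sentence"], padding=True, return_tensors="np")

# Only feed the inputs the exported graph actually declares.
graph_inputs = {i.name for i in session.get_inputs()}
feeds = {k: v for k, v in inputs.items() if k in graph_inputs}

# The old export only yields token-level hidden states: (batch, seq, hidden).
last_hidden_state = session.run(None, feeds)[0]

# Attention-masked mean pooling, re-implemented outside the model. This is
# the extra step the serving side has to bolt on for embedding models.
mask = inputs["attention_mask"][..., np.newaxis].astype(np.float32)
embeddings = (last_hidden_state * mask).sum(axis=1) / np.clip(mask.sum(axis=1), 1e-9, None)
```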
What solution would you like?
Instead, use torch.onnx.export. An example that implements this for cross-encoders is sketched after these notes:

- Usage is similar to torch.jit.trace, which we use for TorchScript compilation.
- This will simplify the code in ml-commons that drives ONNX models.
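A minimal sketch, assuming the cross-encoder/ms-marco-MiniLM-L-6-v2 checkpoint (the model name, output file name, and opset version are illustrative choices, not fixed by this proposal):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Illustrative checkpoint; any cross-encoder with a sequence-classification head works.
model_id = "cross-encoder/ms-marco-MiniLM-L-6-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# return_dict=False makes forward() return a plain tuple, which traces cleanly.
model = AutoModelForSequenceClassification.from_pretrained(model_id, return_dict=False)
model.eval()

# As with torch.jit.trace, export traces the model on representative inputs.
features = tokenizer("example query", "example passage", return_tensors="pt")

torch.onnx.export(
    model,
    # Positional args follow this model's forward() signature:
    # (input_ids, attention_mask, token_type_ids).
    (features["input_ids"], features["attention_mask"], features["token_type_ids"]),
    "cross-encoder.onnx",
    input_names=["input_ids", "attention_mask", "token_type_ids"],
    output_names=["logits"],
    # Dynamic axes let the exported graph accept any batch size and sequence length.
    dynamic_axes={
        "input_ids": {0: "batch", 1: "sequence"},
        "attention_mask": {0: "batch", 1: "sequence"},
        "token_type_ids": {0: "batch", 1: "sequence"},
        "logits": {0: "batch"},
    },
    opset_version=14,
)
```

Because the classification head is baked into the exported graph, the runtime can read relevance scores straight from the logits output, with no extra pooling or head logic on the ml-commons side.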
What alternatives have you considered?
There are probably other ways to export a complete model to ONNX (and if we want to support TF we might need to look at options for that), but this seems pretty clean.
Do you have any additional context?
Original comment:
We should probably invest in supporting all the new kinds of models that will be coming from [RFC] Support more local model types in opensearch-py-ml.