[RFC] Support more local model types #1164

ylwu-amzn · 2023-07-29T08:54:49Z

Currently, ml-commons only supports uploading text-embedding models. However, we believe there are other models that could be valuable additions to our platform:

Summarization model: This model can summarize a document or lengthy content and return a concise version.
Question answering model: If you have a question related to a context or set of documents, this model can provide you with accurate answers.
Text classification model: This model classifies text, for example for sentiment analysis, where it assigns labels like 'positive', 'negative', or 'neutral'.
Named entity recognition (NER) model: For identifying various entities in a text, such as organizations, persons, locations, and more.
Image embedding model: This model translates an image into a vector representation for easier analysis.
Object detection model: This model detects objects in an image.
Rerank model: This model reranks search results.

Please comment on this issue if you require support for other local models or vote for the model you need the most.

hijakk · 2023-07-29T16:40:14Z

+1. Most of these would be valuable in my use cases, as well as language identification. Possibly langid is a different beast than this approach should support?

Image vectorization is a potentially tough use case: images can be large, and embedding them as base64 directly in documents can dramatically inflate document size on disk, compared with storing a reference pointer to external storage (such as S3).
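
To put a rough number on the base64 concern: standard base64 encodes every 3 bytes as 4 ASCII characters, so an inlined image takes about a third more space than its raw bytes, before any JSON or index overhead. A minimal Python sketch of that arithmetic is below. The helper name `base64_encoded_size`, the 3 MB image size, and the `s3://example-bucket` URI are illustrative assumptions, not anything defined by ml-commons.

```python
import base64
import os


def base64_encoded_size(num_bytes: int) -> int:
    """Return the length of the standard base64 encoding of num_bytes random bytes."""
    raw = os.urandom(num_bytes)  # stand-in for real image bytes
    return len(base64.b64encode(raw))


# Hypothetical 3 MB image stored inline versus a pointer to external storage.
image_size = 3_000_000
inline_size = base64_encoded_size(image_size)
pointer_size = len("s3://example-bucket/images/cat.jpg")  # hypothetical URI

print(f"raw image bytes:     {image_size}")
print(f"base64 inline bytes: {inline_size}")
print(f"reference pointer:   {pointer_size} bytes")
```

The inline copy is always larger than the raw image, while the pointer stays a few dozen bytes regardless of image size, which is the trade-off described above.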

ylwu-amzn · 2023-08-01T19:41:44Z

From #1150 (comment)

@asfoorial suggests

I suggest to keep the door open for LLM hosting as there is a trend to get LLMs smaller with quantization yet achieve reasonable performance. I would say they will be hostable in ml nodes or other dedicated nodes.

nateynateynate · 2023-08-02T23:12:55Z

They all get a thumbs up from me, but I actually would love to see image embedding. I'm fascinated by it.

HungryHowies · 2023-08-03T04:40:18Z

+1 tbh, I would love to have them all.

austintlee · 2023-10-31T21:29:30Z

Are cross encoders covered under "rerank model"?

austintlee · 2023-11-02T23:43:32Z

@dhrubo-os (tagging you since you went over some basics on this with the OCI students)

Would it be easier to produce ML input/output classes for all these different models if we used Smithy to define them and had it generate the classes? Just wondering what we can do to expedite progress on this using some common framework.

TrungBui59 · 2023-11-23T11:54:24Z

@ylwu-amzn @dhrubo-os I am interested in working on support for the question-answering model. Can you guys give me some hints on what I should do? Currently, I am thinking of following the approach that we used to support the text-embedding model.

ylwu-amzn added the enhancement and untriaged labels Jul 29, 2023

ylwu-amzn added the RFC label and removed the enhancement and untriaged labels Aug 14, 2023

navneet1v mentioned this issue Aug 23, 2023

Extending Neural Search pipeline to Named entity recognition and other metadata extracting models opensearch-project/neural-search#134

Open

ylwu-amzn added this to ml-commons projects Aug 24, 2023

ylwu-amzn moved this to Backlog in ml-commons projects Aug 24, 2023

This was referenced Sep 27, 2023

[FEATURE] Trace Summarization models to TorchScript and Onnx format opensearch-project/opensearch-py-ml#303

Open

[FEATURE] Trace Question Answering models to TorchScript and Onnx format opensearch-project/opensearch-py-ml#304

Open

This was referenced Nov 3, 2023

[RFC] Improving Search relevancy through Generic Reranker interfaces opensearch-project/neural-search#485

Closed

[FEATURE] Support local cross-encoder model #1589

Open

Add cross encoder support #1615

Merged

HenryL27 mentioned this issue Nov 28, 2023

[ENHANCEMENT] use torch.onnx.export instead of convert_graph_to_onnx opensearch-project/opensearch-py-ml#347

Open

mingshl mentioned this issue Jul 2, 2024

[FEATURE] Add image search pretrained models #2598

Open
