Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] Singular and _search api for Conversational Memory #1268

Closed
HenryL27 opened this issue Aug 30, 2023 · 18 comments
Closed

[FEATURE] Singular and _search api for Conversational Memory #1268

HenryL27 opened this issue Aug 30, 2023 · 18 comments
Assignees
Labels
enhancement New feature or request

Comments

@HenryL27
Copy link
Collaborator

HenryL27 commented Aug 30, 2023

Background
We created a Conversational Memory CRUD API per #1150 to deal with long term storage of conversations for conversational applications. In this API, we identify two resources: ConversationMeta, and Interaction. A ConversationMeta contains some information about the Conversation as a whole, whereas an Interaction is the (call-and-response) individual unit of a Conversation - a Conversation is made up of Interactions. We store ConversationMetas and Interactions in two separate indices.

Is your feature request related to a problem?
We need singular APIs to get just one Interaction or one ConversationMeta
We also need a _search endpoint for ConversationMetas and Interactions to execute more complicated queries over conversational data

What solution would you like?
Four new actions/rest endpoints

  1. GetConversation (gets a single ConversationMeta by id, does not retrieve interactions)
  2. GetInteraction (gets a single interaction by id)
  3. SearchConversations (executes an arbitrary search query over the .conversation-meta index, retains access control guarantees)
  4. SearchInteractions (executes an arbitrary search query over a set of interactions by conversation_id, retains access control guarantees)

This may require a little rethinking on the rest endpoint naming front

What alternatives have you considered?
SearchInteractions over all interactions the user has access to

Do you have any additional context?
PR 1196

@HenryL27
Copy link
Collaborator Author

Should we also add an UpdateInteraction API of some kind? In order to add to the "additional info" field or something

@ylwu-amzn
Copy link
Collaborator

Looks similar to this issue #1284

@HenryL27
Copy link
Collaborator Author

@jngz-es thoughts? Is there anything specifically that you would want out of these that isn't encompassed in standard _search api?

@HenryL27
Copy link
Collaborator Author

HenryL27 commented Oct 6, 2023

I'm likely going to need to change the API endpoints a little. Thoughts?
Current:

CreateConversation  -> /_plugins/_ml/memory/conversation
CreateInteraction   -> /_plugins/_ml/memory/conversation/{conversation_id}
GetConversations    -> /_plugins/_ml/memory/conversation
GetInteractions     -> /_plugins/_ml/memory/conversation/{conversation_id}
DeleteConversation  -> /_plugins/_ml/memory/conversation/{conversation_id}

Proposed change:

CreateConverasation  -> /_plugins/_ml/memory/conversation/_create
CreateInteractions   -> /_plugins/_ml/memory/conversation/{conversation_id}/_create
GetConversations(pl) -> /_plugins/_ml/memory/conversation/_list
GetInteractions(pl)  -> /_plugins/_ml/memory/conversation/{conversation_id}/_list
DeleteConversation   -> /_plugins/_ml/memory/conversation/{conversation_id}/_delete
GetConversation(sg)  -> /_plugins/_ml/memory/conversation/{conversation_id}
GetInteraction(sg)   -> /_plugins/_ml/memory/conversation/{conversation_id}/{interaction_id}
SearchConversations  -> /_plugins/_ml/memory/conversation/_search
SearchInteractions   -> /_plugins/_ml/memory/conversation/{conversation_id}/_search

@HenryL27
Copy link
Collaborator Author

New API Spec

GetInteraction

Gets an interaction by id

Parameter Type Required
conversation_id string yes
interaction_id string yes

Returns the interaction as json (if you have access)

GetConversation

Gets a conversation by id

Parameter Type Required
conversation_id string yes

Returns the conversation meta object as json (if you have access)

SearchConversations

Execute a search query over the conversations index

Parameter Type Required
query OS Query yes

Returns the search results (conversation meta documents), opensearch style. essentially this is a wrapper around standard opensearch _search API that tacks on access controls if applicable

SearchInteractions

Execute a search query over the interactions of a conversation

Parameter Type Required
conversation_id string yes
query OS Query yes

Returns the search results (interaction documents), opensearch style. Essentially this is a wrapper around the standard opensearch _search API that adds a filter for the correct conversation id and checks access controls for the conversation.

@saratvemulapalli
Copy link
Member

I'm likely going to need to change the API endpoints a little. Thoughts? Current:

CreateConversation  -> /_plugins/_ml/memory/conversation
CreateInteraction   -> /_plugins/_ml/memory/conversation/{conversation_id}
GetConversations    -> /_plugins/_ml/memory/conversation
GetInteractions     -> /_plugins/_ml/memory/conversation/{conversation_id}
DeleteConversation  -> /_plugins/_ml/memory/conversation/{conversation_id}

Proposed change:

CreateConverasation  -> /_plugins/_ml/memory/conversation/_create
CreateInteractions   -> /_plugins/_ml/memory/conversation/{conversation_id}/_create
GetConversations(pl) -> /_plugins/_ml/memory/conversation/_list
GetInteractions(pl)  -> /_plugins/_ml/memory/conversation/{conversation_id}/_list
DeleteConversation   -> /_plugins/_ml/memory/conversation/{conversation_id}/_delete
GetConversation(sg)  -> /_plugins/_ml/memory/conversation/{conversation_id}
GetInteraction(sg)   -> /_plugins/_ml/memory/conversation/{conversation_id}/{interaction_id}
SearchConversations  -> /_plugins/_ml/memory/conversation/_search
SearchInteractions   -> /_plugins/_ml/memory/conversation/{conversation_id}/_search

I like the new conventions, do we need both get and search APIs? Eg GetConversations(pl) we should be able to get it with SearchConversations with a match all query.

{
  "query": {
    "match_all": {}
  },
  "size": 1000
}

I haven't looked at the PR #1504 , if we are using the same code underneath its good.

@HenryL27
Copy link
Collaborator Author

Get (pl) returns the last couple conversations (or interactions) sorted by creation time. basically just a shortcut to

{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "create_time": {
        "order": "desc"
      }
    }
  ]
}

and, of course, tacking on the access controls. The idea was that for most use-cases, get (pl) is all an application needs

@dhrubo-os
Copy link
Collaborator

dhrubo-os commented Oct 12, 2023

What do you think about:

CreateConverasation  -> /_plugins/_ml/memory/conversation/_create
CreateInteractions   -> /_plugins/_ml/memory/interaction/{conversation_id}/_create
GetConversations(pl) -> /_plugins/_ml/memory/conversation/
GetInteractions(pl)  -> /_plugins/_ml/memory/interaction/{conversation_id}/_list
DeleteConversation   -> /_plugins/_ml/memory/conversation/{conversation_id}/_delete
GetConversation(sg)  -> /_plugins/_ml/memory/conversation/{conversation_id}
GetInteraction(sg)   -> /_plugins/_ml/memory/interaction/{interaction_id}
SearchConversations  -> /_plugins/_ml/memory/conversation/_search
SearchInteractions   -> /_plugins/_ml/memory/interaction/{conversation_id}/_search

Also will there be any use case to delete any specific interaction from the conversation?

@saratvemulapalli
Copy link
Member

@HenryL27 makes sense. Looking at PR #1504 I understand the user experience is great with these APIS, I see we created bunch of boiler plate just to serve these requests and transport actions, I would love to just re-use the search transport actions for Get REST APIs with some optional params WDYT?

@HenryL27
Copy link
Collaborator Author

@dhrubo-os I haven't seen too much need for single interaction _delete, although I an see how it could be useful (some kind of "undo last interaction" interaction model).
I don't love putting "interaction" in the Interaction-based endpoints; the purpose of the 'conversation' token is to say that this is a special kind of memory for conversations - I think Jing has several other kinds of memory coming down the line? grouping them all together at this level makes sense to me. The model I went with was "if there's a conversation id in the endpoint, it's the same kind of action as it would be if there wasn't, just do it over the interactions of the conversation specified rather than all of the conversation meta-objects"
One idea (from @austintlee) was to allow _search over multiple conversations' interactions similarly to how multi-index search works, e.g. /_plugins/_ml/memory/conversation/cid1,cid2,cid3/_search. I feel like that could be extended to search over all interactions somehow (/_plugins/_ml/memory/conversation/all/_search) - might be cleaner than throwing interaction into half of the endpoints. Idk. I like keeping conversations and interactions tightly coupled if possible.

@HenryL27
Copy link
Collaborator Author

@saratvemulapalli well, yes, there is a lot of boilerplate. I'm not convinced that it can all be stripped away. For example, the TransportActions all need to exist because this is where we check the feature flag. For the plural Get (and another motivation for it existing outside of _search) there was some thought of supporting databases other than an OpenSearch index under the storage interface, and all databases do not necessarily support OpenSearch search queries - so all storage implementations might not support the search action - so Get (pl) and Search should be decoupled at the transport layer.
There are probably a couple of simplifications I can do, but they would be in places that are already pretty simple, so I'm not sure it's worth it (for example maybe GetConversation can take a straight GetRequest instead of a GetConversationRequest. But that's already only like 12 lines of real code, and I'd need to add in more translation logic to the restHandler...)

@dhrubo-os
Copy link
Collaborator

I don't love putting "interaction" in the Interaction-based endpoints; the purpose of the 'conversation' token is to say that this is a special kind of memory for conversations - I think Jing has several other kinds of memory coming down the line? grouping them all together at this level makes sense to me.

From this context, I agree that to put all the endpoints under /_plugins/_ml/memory/conversation makes more sense.

I'm just worried that, these apis aren't quite self explanatory? Like looking at the api endpoints only, I can't tell what I can expect from the api specially /_plugins/_ml/memory/conversation/{conversation_id}/_list vs /_plugins/_ml/memory/conversation/{conversation_id}

If we can increase the readability of these api endpoints, I'm completely fine with keeping all under conversation.

@HenryL27
Copy link
Collaborator Author

I find they read a little better if you fill in the conversation id. but maybe I'm too close to the code, idk.

GET /_plugins/_ml/memory/conversation/oh_ert28b3y4vhowi/_list
GET /_plugins/_ml/memory/conversation/oh_ert28b3y4vhowi
GET /_plugins/_ml/memory/conversation/_list
GET /_plugins/_ml/memory/conversation/oh_ert28b3y4vhowi/o32nv494t49hwe

@austintlee
Copy link
Collaborator

On the verb topic, I came across a similar discussion happening on the AI Workflow side. They are sticking with GET, DELETE, POST and PUT as opposed to having the verbs appear at the end of the URLs.

opensearch-project/flow-framework#49 (comment)

@HenryL27
Copy link
Collaborator Author

hmm...
well, the rest of ml-commons puts the verbs in the URI, so I prioritize matching that convention over other plugins. Also, since I have many kinds of GET (sg, pl, search), there's only so much multiplexing that can be done, so I think the URI path (haha) is cleaner.

@austintlee
Copy link
Collaborator

If we know what APIs we will introduce for the new general memory layer, let's make sure the convention is consistent at least within all things memory. If Henry doesn't mind making another change later, maybe we can proceed with a plan to refactor, if necessary, at a later time once we have the rest of the APIs finalized. I am not saying we have to do what AI Workflows does. It looks like there isn't one standard across OpenSearch so maybe that ship has already sailed...

@HenryL27
Copy link
Collaborator Author

^^ it would be very good to know what the other memory APIs will look like / be

@austintlee
Copy link
Collaborator

@ylwu-amzn is there any design or proposal for this memory layer? Will it also support role-based access control?

@github-project-automation github-project-automation bot moved this from In Progress to Done in ml-commons projects Dec 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Development

No branches or pull requests

5 participants