Support CDI MemoryIdProvider #523
Comments
I think this is a very interesting idea. I do have a question however: do you expect this to behave any differently than how it does for Websockets or REST? |
As far as I understand, with REST, if this provider is introduced and the client is, say, an HTML-based client which does not use Web Sockets and authenticates to Quarkus with Google, then every time the client accesses Quarkus, the memory id will be calculated from SecurityIdentity. It will also work exactly the same for Web Sockets Next once quarkusio/quarkus#40312 is sorted out: if this provider is available, the security identity will get associated with the WS channel, and instead of the connection id the memory id will be a security identity id, or maybe a pair of the connection id and the security identity id. Did I understand your question correctly? |
Actually with REST you have the same idea as with Websockets - i.e. we use the active request itself |
Sure, but what is the advantage of doing that? |
I see. If it works in a similar way for both REST and WS, then the provider is checked if it is available, and if not, we fall back to the way Quarkus LangChain4j does it out of the box for REST and WebSockets. I can experiment with a chat bot demo which does not use WebSockets, as a POC.
AFAIK, with Web Sockets, a connection id does not guarantee that it came from the user who started the WS upgrade, this is why Quarkus users want for example But maybe it will be much more relevant with just REST. You mentioned the memory id was also calculated similarly for REST, can you clarify how it is done to correlate things between multiple calls? FYI, with the provider, the way it will work across calls is that, for example, an OIDC or Form authentication session cookie will help to create the security identity, etc. I guess this provider, if introduced, may offer options for users to make more sophisticated correlations, even without security in place |
I think there is a misunderstanding here (which is totally reasonable). The fact that a specific ID is used is not strictly tied to how long memory is available. |
I see what you mean. But I think in a secure Quarkus application, the lifespan of the "request" (I'm assuming here that it involves everything, for example, in a given WS session) will match the security identity lifespan. Let me consider this example. I authenticate to Quarkus via Keycloak and keep accessing Quarkus. For as long as the OIDC session is active, Quarkus will see the same user id, and this very same user id will keep the user-specific interactions keyed correctly for LangChain4j. This should be the case with or without WS being used. If the OIDC user logs out, a security event is sent and it can help to clean the corresponding memory (though I haven't thought yet about the clean-up specifics in the logout case). It would be useful for tracing as well: which user asked what. Right now, even if I'm biased :-), I think it will be beneficial to associate the security related data with the memory id. |
Can you please clarify your earlier comment,
How does it work across multiple requests ? |
By the way, another reason I came up with this proposal, is that, after looking at the langchain4j code, I can see the memory id is passed with |
This would only be the case if we knew the lifecycle of the user id. |
It does not, that's part of what I am failing to explain 😄. |
Hi @geoand I think I understand that the way I see how each user's chat bot session avoids clashing with another user's session in this case. And it is exactly the same case, for example, for a typical secure OIDC session: we don't depend there on the CDI session scope and manage it in a different way, but the idea is the same. So, getting back to your question: what is the advantage of trying to use a user name or some other security related identifier, which can work in the latter case, with the memory id provider? I think it will let us introduce the chat bot experience beyond the first entry, where the user has authenticated and been authorized, and where it is a user-specific, personalized experience. Take a DB SQL query example mentioned at the Insights call yesterday - if a principal name is With IMHO, it is worth giving it a try. Someone will do it sooner or later in some form, the only question IMHO is where it will be done. Like I said, if agreed in principle, I can start planning a draft PR to show how it may work, and learn a few things along the way.
Something along these lines. The user logs out, the provider can catch a security event and clear the Chat memory somehow. Or maybe the CDI session scope can be used along the way, since I expect the lifespan of this session scope and the OIDC session to be more or less equal. This will have to be figured out. I think all I'll need to get started is to know where I can put this provider check in a pure REST case, to start with, just before the automatic memory management decides on allocating the memory id itself. I guess I may be still missing a few details :-), but some kind of association between the ChatBot session and the current security identity will be useful IMHO. Please think about it with the rest of the Quarkus langchain4j team :-), I can try to help if this enhancement request can be of interest... |
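The clean-up idea above (memory keyed by the user id, cleared when the user logs out) could be sketched roughly as follows. This is plain Java only, with no Quarkus or LangChain4j types: `UserChatMemory` and `onLogout` are hypothetical names invented for illustration, and how the logout would actually be observed (a security event, the end of a CDI session scope, or something else) is exactly the part that still has to be figured out.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical store: chat messages are keyed by the authenticated user id,
// so they survive across requests and connections of the same user.
final class UserChatMemory {
    private final Map<String, List<String>> messagesByUser = new ConcurrentHashMap<>();

    // Append a message to the memory that belongs to this user.
    void add(String userId, String message) {
        messagesByUser.computeIfAbsent(userId, k -> new CopyOnWriteArrayList<>()).add(message);
    }

    // Return a read-only snapshot of the user's messages (empty if none).
    List<String> messages(String userId) {
        return List.copyOf(messagesByUser.getOrDefault(userId, List.of()));
    }

    // Assumption: some logout hook calls this with the id of the user who logged out.
    void onLogout(String userId) {
        messagesByUser.remove(userId);
    }
}
```

With this shape, two users never share a memory entry, and removing one user's entry on logout leaves every other user's memory untouched.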
Sounds good to me! |
Thanks @geoand, I'll give it a try. It may take me a bit of time to get something real produced with a few other issues in the mix, but I do commit to it. Cheers |
👌 |
At the moment I'm looking at the fraud detection demo, which does not require a custom memory id, but once we resolve the WS-Next security integration issue, it would work nicely with the secure variation of a demo like csv-chat-bot |
Hi @geoand I've found But in any case, I'm not sure now |
Let me work on updating #539 to show how one might want to use it |
I am not sure exactly which classes you mean since we only have |
I meant that I was thinking of adding a new interface, |
Yeah, I think I should close it for now, sorry I did not do my homework earlier :-) |
NP! |
Right now, the memory id can be supported with @MemoryId or it is set to the connection id with the WebSockets Next integration.
I'd like to be able to use the current SecurityIdentity instance. Or the current JWT token's unique subject claim value.
I'm not 100% sure what is the best way to do it.
My proposal is to update Quarkus LangChain4j code which checks what memory id is if no @MemoryId is set, but before falling back to the WS connection id.
Here it will check with Arc if the correct scope MemoryIdProvider is registered, if yes, get it and retrieve an instance of Quarkus LangChain4j MemoryIdentifier interface from it and use as a chat memory id, or to keep it simple, just return String.
For example, one demo can have a request scoped MemoryIdProvider registered which will have JsonWebToken injected and return its token subject claim value, ensuring a verified user identity subject value is used as a memory id.
How does it sound ? I'm happy to start looking at this enhancement request once it is agreed upon
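To make the proposed lookup order concrete, here is a minimal sketch in plain Java. Nothing in it is existing Quarkus LangChain4j API: `MemoryIdProvider` is the interface proposed in this issue (simplified to return a String), while `MemoryIdResolver` and `SubjectMemoryIdProvider` are hypothetical names used only for illustration. In a real integration the provider would presumably be looked up through Arc and would read the subject from an injected JsonWebToken; here it is passed in as an `Optional` and the subject is a constructor argument.

```java
import java.util.Optional;

// Proposed SPI (simplified): supplies the memory id for the current request.
interface MemoryIdProvider {
    String getMemoryId();
}

// Hypothetical helper that applies the lookup order described in this issue.
final class MemoryIdResolver {
    private MemoryIdResolver() {}

    static Object resolve(Object explicitMemoryId,
                          Optional<MemoryIdProvider> provider,
                          String wsConnectionId) {
        // 1. A value passed through a @MemoryId parameter always wins.
        if (explicitMemoryId != null) {
            return explicitMemoryId;
        }
        // 2. Otherwise ask the provider, if one is registered and returns an id.
        if (provider.isPresent()) {
            String id = provider.get().getMemoryId();
            if (id != null) {
                return id;
            }
        }
        // 3. Fall back to the WebSocket connection id.
        return wsConnectionId;
    }
}

// Stand-in for a request scoped provider backed by the JWT subject claim.
// Assumption: the subject has already been verified by the security layer.
final class SubjectMemoryIdProvider implements MemoryIdProvider {
    private final String subject;

    SubjectMemoryIdProvider(String subject) {
        this.subject = subject;
    }

    @Override
    public String getMemoryId() {
        return subject;
    }
}
```

The point of the ordering is that existing applications keep their current behaviour: the provider is only consulted when no explicit memory id is given, and when no provider is registered the fallback is unchanged.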