Race condition in issue-credential v1.0 leads to credential exchange switching to abandoned state #2000
Comments
FYI - @ianco -- this sounds familiar.
@swcurran ... Yes, similar to the mediator issue we uncovered ... we need to look at where we're sending messages vs. committing updates to the exchange records, and also at what error checking we're doing on the state when we receive a message (or invoke one of the admin API endpoints) ...
OK, not exactly like the mediator issue, but it also looks like a general problem across basically all protocols. In the base_record (https://github.com/hyperledger/aries-cloudagent-python/blob/main/aries_cloudagent/messaging/models/base_record.py#L389) - the Ideally any events that are triggered within a protocol shouldn't actually be "emitted" and acted upon until the protocol step completes. Also the specific error of "exchange is in wrong state" shouldn't trigger the exchange to be pushed into an
Labelling this as High Priority because I think this needs to be addressed for a 1.0.0 release.
Need to relook at this issue.
@ianco -- could you look at this again, please? As issue-credential-1.0 is deprecated, an issue local to that particular concern is not that big a deal. However, if it is a broader issue, we need to know that, and to characterize both the problem and what actions we should take. Not looking for it to be fixed (yet) -- just a definition of the problem, its potential impact and suggestions. Note that I didn't read through this full issue, so the information might already be all here. If so, summarizing it in one comment, or in an ACA-Pug presentation might be sufficient for this request. Let me know.
I'll take a look. Reading through the comments I think the solution is pretty straightforward. I'll fix the 1.0 protocol and double check the other protocols ...
I believe this is a bigger problem. The My suggestion ... the problem is in the following code, in https://github.com/hyperledger/aries-cloudagent-python/blob/main/aries_cloudagent/messaging/models/base_record.py:
(There are other scenarios as well in this code.) Note that in the My suggestion is to move the This would ensure that all notifications happen after the transaction is committed (and the database updates are applied); however, it may have other side effects. I can give this a try but wanted to get some feedback ... @dbluhm @swcurran @shaangill025 any thoughts?
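The idea of emitting notifications only after the transaction commits can be sketched roughly as follows. This is an illustration only, not ACA-Py's code: `DeferredEventSession`, its methods, and the `notify` callback are hypothetical names, and the "store" is a plain dict standing in for the database.

```python
from typing import Callable, Dict, List, Tuple


class DeferredEventSession:
    """Hypothetical stand-in for a storage transaction; not ACA-Py's API."""

    def __init__(self, store: Dict[str, dict], notify: Callable[[str, dict], None]):
        self._store = store
        self._notify = notify
        self._pending_writes: Dict[str, dict] = {}
        self._pending_events: List[Tuple[str, dict]] = []

    def save(self, record_id: str, record: dict, topic: str) -> None:
        # Stage the write and queue the event; nothing is visible or emitted yet.
        self._pending_writes[record_id] = dict(record)
        self._pending_events.append((topic, dict(record)))

    def commit(self) -> None:
        # Apply the writes first, then emit, so listeners only see committed state.
        self._store.update(self._pending_writes)
        events, self._pending_events = self._pending_events, []
        self._pending_writes = {}
        for topic, payload in events:
            self._notify(topic, payload)

    def rollback(self) -> None:
        # Discard staged writes together with their queued events.
        self._pending_writes = {}
        self._pending_events = []
```

With this shape, a listener that reacts to the event by reading the record back cannot observe a state older than the one announced in the event, at the cost of delaying every notification until commit time.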
@ianco - could you prepare a short session on this for the ACA-Pug meeting tomorrow? I’d like this to have a higher profile, since you haven’t gotten feedback. This is out of my realm.
Yep will do. I have a fix so I'll open a PR. I'm having trouble duplicating the issue though so can't (yet) verify the fix. PR #2760
Fixed by #2760
We have a test system running that connects two ACA-Py instances and triggers issuance of credentials using the issue-credential v1.0 protocol. During load tests, around 1% of all credential exchanges fail.
Some more details:
/issue_credential webhook and wait for the status REQUEST_RECEIVED of the credential exchange
/issue-credential/records/{credentialExchangeId}/issue to trigger the issuance
We found that this problem can be mitigated by waiting a few seconds before making the issue HTTP call, and therefore assumed that this is a race condition. Further analysis seems to confirm this:
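One way to express the mitigation above without a fixed sleep is to poll the stored record until it reports the expected state, and only then make the issue call. The sketch below is illustrative only: `wait_for_state` and `fetch_state` are hypothetical names, and it assumes that reading the record back returns the committed state rather than the state announced by the webhook.

```python
import time
from typing import Callable


def wait_for_state(
    fetch_state: Callable[[], str],
    expected: str,
    timeout: float = 5.0,
    interval: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll fetch_state() until it returns `expected` or `timeout` seconds pass.

    fetch_state is a caller-supplied (hypothetical) function that reads the
    credential exchange record and returns its current state string.
    """
    deadline = clock() + timeout
    while True:
        if fetch_state() == expected:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A caller would make the issue request only when `wait_for_state` returns `True`, and treat `False` as a failure to investigate instead of issuing against a record that may still be in its previous state.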