Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Combination of terse web hooks without --preserve-exchange-records makes completing a present proof 2.0 impossible #3065

Closed
Gavinok opened this issue Jun 25, 2024 · 26 comments
Assignees

Comments

@Gavinok
Copy link
Contributor

Gavinok commented Jun 25, 2024

As the verifier it is impossible to complete a present proof 2.0 transaction without the --preserve-exchange-records flag.

When the webhook confirming a proof request has been completed acapy will already delete the record before the controller can retrieve it.

@swcurran
Copy link
Contributor

To confirm — the issue is not that ACA-Py has already deleted the record before the controller can retrieve it — that is expected/intended behaviour. The controller shouldn’t be trying to retrieve the record after the end of the protocol. Rather the issue is that the “terse" webhook that was added at some point into the code base (for good reason — it carries too much information) does not include the required information for the verifier — the presentation — that is needed by the controller.

Do I have that right?

@jamshale jamshale self-assigned this Jun 25, 2024
@jamshale
Copy link
Contributor

Can we replicate this with the demo? If so what are the steps?

I removed the preserve-exchange-records flag from the faber and alice demo and could still complete the proof despite the presentation record getting deleted by both agents.

@swcurran
Copy link
Contributor

You just need to check the webhook to see what data comes back in it. If it is insufficient, it is a problem. The timing issue is not the problem. The controller/verifier should NOT be trying to retrieve the record via the API when that flag is not set.

@Gavinok
Copy link
Contributor Author

Gavinok commented Jun 25, 2024

@swcurran that is correct.
@jamshale this didn't pop up in the demo. It is using connections vs our current work with vc-authn is using OOB. That said I don't think that is the source of the problem.

I see that the response to the verify-presentation endpoint does contain the presentation. We could use this instead of the recommended process of retrieving the presentation after the the state changes to done.

@jamshale jamshale removed their assignment Jun 25, 2024
@Gavinok
Copy link
Contributor Author

Gavinok commented Jun 25, 2024

The issue is that in our case we don't even seem to manage to verify the presentation before it is deleted. Her are some of our logs

controller-1     | 2024-06-25T19:34:52.470174Z [info     ] GOT A PRESENTATION, TIME TO VERIFY [api.routers.acapy_handler]
controller-1     | 2024-06-25T19:34:52.470306Z [debug    ] >>> verify_presentation        [api.core.acapy.client]
controller-1     | Starting new HTTP connection (1): aca-py:8077
aca-py-1         | 2024-06-25 19:34:52,475 aries_cloudagent.admin.server ERROR Handler error with exception: Record not found: pres_ex_v20/714760ae-3c25-40b3-8ae9-a6413fc9c431.
aca-py-1         | 
aca-py-1         | =================
aca-py-1         | Traceback (most recent call last):
aca-py-1         |   File "/home/aries/.local/lib/python3.9/site-packages/aries_cloudagent/protocols/present_proof/v2_0/routes.py", line 1270, in present_proof_verify_presentation
aca-py-1         |     pres_ex_record = await V20PresExRecord.retrieve_by_id(session, pres_ex_id)
aca-py-1         |   File "/home/aries/.local/lib/python3.9/site-packages/aries_cloudagent/messaging/models/base_record.py", line 230, in retrieve_by_id
aca-py-1         |     result = await storage.get_record(
aca-py-1         |   File "/home/aries/.local/lib/python3.9/site-packages/aries_cloudagent/storage/askar.py", line 92, in get_record
aca-py-1         |     raise StorageNotFoundError(f"Record not found: {record_type}/{record_id}")
aca-py-1         | aries_cloudagent.storage.error.StorageNotFoundError: Record not found: pres_ex_v20/714760ae-3c25-40b3-8ae9-a6413fc9c431
aca-py-1         | 
aca-py-1         | The above exception was the direct cause of the following exception:
aca-py-1         | 
aca-py-1         | Traceback (most recent call last):
aca-py-1         |   File "/home/aries/.local/lib/python3.9/site-packages/aries_cloudagent/admin/server.py", line 181, in ready_middleware
aca-py-1         |     return await handler(request)
aca-py-1         |   File "/home/aries/.local/lib/python3.9/site-packages/aries_cloudagent/admin/server.py", line 218, in debug_middleware
aca-py-1         |     return await handler(request)
aca-py-1         |   File "/home/aries/.local/lib/python3.9/site-packages/aries_cloudagent/admin/server.py", line 451, in setup_context
aca-py-1         |     return await task
aca-py-1         |   File "/usr/local/lib/python3.9/asyncio/futures.py", line 284, in __await__
aca-py-1         |     yield self  # This tells Task to wait for completion.
aca-py-1         |   File "/usr/local/lib/python3.9/asyncio/tasks.py", line 328, in __wakeup
aca-py-1         |     future.result()
aca-py-1         |   File "/usr/local/lib/python3.9/asyncio/futures.py", line 201, in result
aca-py-1         |     raise self._exception
aca-py-1         |   File "/usr/local/lib/python3.9/asyncio/tasks.py", line 256, in __step
aca-py-1         |     result = coro.send(None)
aca-py-1         |   File "/home/aries/.local/lib/python3.9/site-packages/aiohttp_apispec/middlewares.py", line 45, in validation_middleware
aca-py-1         |     return await handler(request)
aca-py-1         |   File "/home/aries/.local/lib/python3.9/site-packages/aries_cloudagent/protocols/present_proof/v2_0/routes.py", line 1272, in present_proof_verify_presentation
aca-py-1         |     raise web.HTTPNotFound(reason=err.roll_up) from err
aca-py-1         | aiohttp.web_exceptions.HTTPNotFound: Record not found: pres_ex_v20/714760ae-3c25-40b3-8ae9-a6413fc9c431.
aca-py-1         | 2024-06-25 19:34:52,476 aiohttp.access INFO 172.24.0.6 [25/Jun/2024:19:34:52 +0000] "POST /present-proof-2.0/records/714760ae-3c25-40b3-8ae9-a6413fc9c431/verify-presentation HTTP/1.1" 404 288 "-" "python-requests/2.32.3"

It is surprising that this doesn't happen with the demo

@swcurran
Copy link
Contributor

Are you saying that ACA-Py doesn’t verify the presentation, or that the controller doesn’t? We need to be pretty clear on that. AFAIK:

  • the presentation should arrive as a message.
  • ACA-Py should invoke the presentation verification code.
  • ACA-Py should post the webhook to the controller with all information needed by the controller.
  • The controller should use the webhook data for all post-verification processing, saving, etc.
  • ACA-Py should (at its leisure) delete the protocol state record, since the controller should never retrieve it.
  • The controller shouldn’t ever attempt to retrieve the protocol state record.

That flow MUST be used when the deployment does not set the —preserve-exchange-records flag, and SHOULD be the flow when it does set that flag.

@Gavinok
Copy link
Contributor Author

Gavinok commented Jun 25, 2024

Sorry I should clarify.

As of now the demo controller simply uses the verify-presentation end point and waits for the web hook to indicate that verification has completed. It does not investigate what the presentation actually contains.

The response to verify-presentation did seem to contain a presentation. This however seems to regularly fail due to the record being deleted before verification has even completed. There seems to be a race condition here since the controller is sometimes able to verify the presentation.

The web hook never contains the presentation it's self.

  • ACA-Py should invoke the presentation verification code.

This is likely the case and the reason that when the controller manually verify the record is already deleted leading to the above stack trace.

@jamshale
Copy link
Contributor

Hmm. As far as I could tell the webhook response always contained the required presentation result in the demo at least. There's some code here https://github.com/hyperledger/aries-cloudagent-python/blob/main/aries_cloudagent/protocols/present_proof/v2_0/models/pres_exchange.py#L205 related to the generated webhook payload. I was actually looking at it, because the debug.webhooks config being false actually seems to not remove the extra data which I found strange.

The above logs look like the controller is using the rest api..?

@Gavinok
Copy link
Contributor Author

Gavinok commented Jun 25, 2024

Hmm strange, I will run the demo to compare. Maybe there is some code in the demo I am missing. As of now I do see that class being returned but no presentation is actually included. These are the different web hook results coming from vc-authn

web hook: {'role': 'verifier', 'initiator': 'self', 'auto_present': False, 'auto_verify': True, 'thread_id': '584ced18-8719-4523-a516-221faaea000c', 'state': 'request-sent', 'trace': False, 'created_at': '2024-06-25T20:53:59.148532Z', 'updated_at': '2024-06-25T20:53:59.148532Z', 'pres_ex_id': 'b3fb345b-afa4-474f-8427-baf638c0e578'} [api.routers.acapy_handler]
web hook: {'role': 'verifier', 'initiator': 'self', 'auto_present': False, 'auto_verify': True, 'thread_id': '584ced18-8719-4523-a516-221faaea000c', 'state': 'presentation-received', 'trace': False, 'created_at': '2024-06-25T20:53:59.148532Z', 'updated_at': '2024-06-25T20:54:08.539665Z', 'pres_ex_id': 'b3fb345b-afa4-474f-8427-baf638c0e578'} [api.routers.acapy_handler]
web hook: {'role': 'verifier', 'initiator': 'self', 'auto_present': False, 'auto_verify': True, 'thread_id': '584ced18-8719-4523-a516-221faaea000c', 'state': 'deleted', 'trace': False, 'verified': 'true', 'verified_msgs': [], 'created_at': '2024-06-25T20:53:59.148532Z', 'updated_at': '2024-06-25T20:54:13.342049Z', 'pres_ex_id': 'b3fb345b-afa4-474f-8427-baf638c0e578'} [api.routers.acapy_handler]
web hook: {'role': 'verifier', 'initiator': 'self', 'auto_present': False, 'auto_verify': True, 'thread_id': '584ced18-8719-4523-a516-221faaea000c', 'state': 'done', 'trace': False, 'verified': 'true', 'verified_msgs': [], 'created_at': '2024-06-25T20:53:59.148532Z', 'updated_at': '2024-06-25T20:54:13.342049Z', 'pres_ex_id': 'b3fb345b-afa4-474f-8427-baf638c0e578'} [api.routers.acapy_handler]

@Gavinok
Copy link
Contributor Author

Gavinok commented Jun 25, 2024

Let me know if I am missing something

@Gavinok
Copy link
Contributor Author

Gavinok commented Jun 25, 2024

The above logs look like the controller is using the rest api..?

Ya we can ignore that error since as mentioned we want to make all the info in the webhook it's self so this stack trace was irrelevant.

@jamshale
Copy link
Contributor

jamshale commented Jun 25, 2024

Maybe the vc-authn controller is parsing out the presentation somehow?

I'd check out the demo with debug logs and different configs and see if it's what you are expecting. Unless, I was doing something wrong, the webhook payload I was seeing contained a lot more information.

@Gavinok
Copy link
Contributor Author

Gavinok commented Jun 25, 2024

Ok, I will give that a try

@swcurran
Copy link
Contributor

I don’t like that there is an endpoint to verify the presentation. That should happen automagically on receipt. Is the presentation not verified until that is called? If that is the case, the protocol state object should not be removed until that endpoint is called.

@Gavinok
Copy link
Contributor Author

Gavinok commented Jun 25, 2024

By default it is done automatically (at least with 2.0) then it is deleted. Manual verification doesn't seem to be necessary if you have auto verification set or leave it unassigned.

@jamshale
Copy link
Contributor

So, what I was doing with the webhook payload was my fault. The dockerfile for the demo had a random env variable setting debug-webhooks to true that overrides the command line argument. Gavin figured that out for me.

So, the payload without debug-webhooks basically causes automatic flows with webhooks to fail. Looks like we need to decide what exactly should be in the payload without the debug setting and what shouldn't.

@swcurran
Copy link
Contributor

Good stuff. Agree — we need to fix the webhook. @Gavinok — can you take a pass at what you think needs to be in the payload for a random verifier?

@Gavinok
Copy link
Contributor Author

Gavinok commented Jun 26, 2024

Certainly, I'm just digging through the debug web hook now. This may take some time to determine what fields are necessary with the 2.0 format

@ianco ianco self-assigned this Jul 2, 2024
@ianco
Copy link
Contributor

ianco commented Jul 2, 2024

I'm just looking into this now (I was away last week) but I don't think the presentation exchange record should be deleted before the presentation is verified, so maybe that's the solution, rather than tinkering with the webhook payload contents?

@ianco
Copy link
Contributor

ianco commented Jul 2, 2024

The presentation exchange record is deleted upon sending or receiving an ACK:

https://github.com/hyperledger/aries-cloudagent-python/blob/main/aries_cloudagent/protocols/present_proof/v2_0/manager.py#L435

@ianco
Copy link
Contributor

ianco commented Jul 2, 2024

I don’t like that there is an endpoint to verify the presentation. That should happen automagically on receipt. Is the presentation not verified until that is called? If that is the case, the protocol state object should not be removed until that endpoint is called.

There is a aca-py setting whether to auto-verify or not.

The record deletion looks like it happens when the ACK is sent (or received) and this is triggered by the verification.

So, if auto-verify is set, then the verification will happen automatically, the ACK will be sent and the record deleted.

If auto-verify is not set, then the verify-endpoint must be called, at which point the ACK will be sent and the record deleted.

So in any case, the record isn't deleted until after verification.

@ianco
Copy link
Contributor

ianco commented Jul 2, 2024

The webhook is emitted automatically, when the pres_exch record is saved, here:

https://github.com/hyperledger/aries-cloudagent-python/blob/main/aries_cloudagent/protocols/present_proof/v2_0/models/pres_exchange.py#L200

The contents of the "abbreviated" webhook are defined here:

https://github.com/hyperledger/aries-cloudagent-python/blob/main/aries_cloudagent/protocols/present_proof/v2_0/messages/pres_webhook.py

... so it looks relatively straightforward to add the presentation to the webhook, if this is what we want.

@swcurran
Copy link
Contributor

swcurran commented Jul 2, 2024

I don’t understand why there would need to be controller involvement in the verification. If a presentation is received in response to a presentation request, I would think that ACA-Py would verify it and put into the web-hook the presentation and result. That should certainly be the case for --auto-verify, but I don’t know why there would be any other setting.

Given how you have described it, if --auto-verify is off, then definitely, the protocol state object should not be deleted until the controller calls the verification endpoint. But that would seem like a waste of time for everyone — especially the controller/user.

@Gavinok
Copy link
Contributor Author

Gavinok commented Jul 2, 2024

The field that was needed in our case was the "by_format" field from the webhook when the state transition switches to "done" https://github.com/bcgov/vc-authn-oidc/blob/88d16bc669186d80c4986e2583a495575b989994/oidc-controller/api/routers/acapy_handler.py#L51

@ianco
Copy link
Contributor

ianco commented Jul 2, 2024

So, what I was doing with the webhook payload was my fault. The dockerfile for the demo had a random env variable setting debug-webhooks to true that overrides the command line argument. Gavin figured that out for me.

So, the payload without debug-webhooks basically causes automatic flows with webhooks to fail. Looks like we need to decide what exactly should be in the payload without the debug setting and what shouldn't.

@jamshale how do you get the demo to use abbreviated webhooks? For me even when I comment out that setting I get the full webhooks in the events :-S

@jamshale
Copy link
Contributor

jamshale commented Jul 2, 2024

@ianco This env variable was overriding the command line argument. https://github.com/hyperledger/aries-cloudagent-python/blob/main/docker/Dockerfile.demo#L7

And then I also commented out the command line argument https://github.com/hyperledger/aries-cloudagent-python/blob/main/demo/runners/support/agent.py#L588

@jamshale jamshale closed this as completed Jul 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants