Use Query Protocol in storage deal negotiation #297
Conversation
Force-pushed from 0d61358 to 6d30007
Codecov Report
@@            Coverage Diff             @@
##           master     #297      +/-   ##
==========================================
+ Coverage   61.22%   63.23%   +2.01%
==========================================
  Files          41       41
  Lines        2519     2537      +18
==========================================
+ Hits         1542     1604      +62
+ Misses        854      811      -43
+ Partials      123      122       -1
Continue to review full report at Codecov.
So, I wrote a bunch of comments, but there are two blocking issues:
- API signature for GetProviderDealState
- We need to make sure connections are being closed properly and that the ConnectionClosed state is set properly. One alternative is to refactor more aggressively and get rid of holding connections open entirely:
Refactor: Don't keep any protocol connections open #296
 if err != nil {
-	return ctx.Trigger(storagemarket.ClientEventReadResponseFailed, err)
+	log.Warnf("error when querying provider deal state: %w", err) // TODO: at what point do we fail the deal?
I'm thinking the proper thing here would be to have a maximum fail count.
here's a suggestion:
- make waitAgain take an error value and pass it on to ClientEventWaitForDealState
- in the FSM code, when handling ClientEventWaitForDealState, if err != nil, also increment another value -- say "PollErrorCount"
- in this code, add a check for deal.PollErrorCount > some constant (maybe 10? -- a miner offline for 5 minutes seems bad), and if so, instead of waiting again, trigger ClientEventReadResponseFailed like before, and add CheckForAcceptance back in as an acceptable transition state.
I wonder about this; the Miner can be totally unresponsive to queries but still fulfill their side of the storage deal. In that case, if we fail the deal on the Client but the Miner finishes sealing and publishing...
Mostly LGTM -- but we have to figure out how rejections get sent -- that's not functionality we can break.
Move GetMinerWorkerAddress into StorageCommon
…hem, recreating them on startup, cleaning them up on failing deals, etc.
- Remove some tagging support from the deal environment
- Remove StartDataTransfer since we use client push now
- sends the rejection response to the client
Force-pushed from eded9c7 to ca3da7e
Force-pushed from f1fe4f6 to ff37da8
Problem
Holding stream connections open is not sustainable, and it doesn't help with resuming interrupted deals.
Solution
Use the query protocol in the Client/Provider FSMs, close the deal stream once we get a response.
Resolves #82
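The solution above (query, then close) could be sketched as follows. Every type and function name here is invented for illustration; the real implementation lives in the Client/Provider FSMs and uses the repo's actual deal-status query protocol.

```go
package main

import "fmt"

// dealStatusStream is a hypothetical short-lived stream to the
// provider, standing in for a real libp2p deal-status stream.
type dealStatusStream struct {
	open bool
}

// ReadDealStatus reads a single status response from the provider.
// A fixed value stands in for a real protocol response here.
func (s *dealStatusStream) ReadDealStatus() (string, error) {
	return "StorageDealSealing", nil
}

// Close releases the stream; nothing is held open between polls.
func (s *dealStatusStream) Close() error {
	s.open = false
	return nil
}

// getProviderDealState queries the provider once and closes the
// stream as soon as the response is read, rather than keeping a
// connection open for the lifetime of the deal. If the deal is later
// interrupted, the client can simply open a fresh stream and query
// again, which is what makes resuming possible.
func getProviderDealState() (string, error) {
	stream := &dealStatusStream{open: true}
	defer stream.Close() // close the deal stream once we get a response
	return stream.ReadDealStatus()
}

func main() {
	state, err := getProviderDealState()
	fmt.Println(state, err)
}
```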