chore: Reduce number of solutions that fail after calling /settle
#2007
Comments
Since we have to have way higher throughput and should overall have the lowest latency possible on prod, I consider this issue as blocking.
Considering that this is blocking prod, I would be OK with (2), but I would immediately make a follow-up plan for what to do in a cleanup: probably communicating with external solvers about this, implementing (1), and potentially reducing the solving time to further reduce the chance of the solution being invalid.
# Description

Fixes #2007. The preferred solution described in that issue was not straightforward to implement. It expected the `autopilot` to call `/reveal` on solutions from the highest scoring to the lowest and simulate them, in order to avoid cases where a solver wins the auction with a solution that would revert by now. The problem is that this requires the `autopilot` to know the address of every solver, which would be a significant change.

Instead, this PR implements a strategy on the `driver` side that a rational actor would be expected to follow anyway. All solvers know the latest time by which they have to return a solution. Because a solution can be more accurate the closer it is computed to the time it is supposed to be executed, it makes sense for all solvers to delay their response as much as possible (something might change in the meantime that enables an even better solution). This deadline gets propagated to the `solver` engine, so ideally it takes as much time as possible computing the optimal solution. If the solver returns earlier (as some currently do), it still makes sense to assume that other solvers will submit their solutions later. In that case the only reasonable action for the `driver` is to wait until the deadline approaches and continuously re-simulate the computed solution whenever a new block gets detected. If the solution would start to revert in the meantime, the `driver` withholds it so it doesn't accidentally win the auction and get slashed for not submitting it on-chain.

# Changes

The `driver` waits until the deadline before returning the solution on `/solve` and checks whether it is still viable on every new block. It also updates the score in case the solution still simulates but becomes better or worse (gas usage might change). Also slightly adjusted `Ethereum::new()` to panic on any init error, since we can't handle those errors anyway: the type is essential to the program.
## How to test

This can be tested with a new e2e test, which I would like to implement in a follow-up PR once the existing e2e tests no longer break.
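The driver-side behavior described above (wait until the deadline, re-check the solution on every new block, withhold it if it starts reverting, otherwise keep the score fresh) can be sketched roughly as follows. This is a minimal illustration, not the actual driver code: `Outcome`, `settle_or_withhold`, and the idea of modeling block notifications as an iterator are all hypothetical simplifications.

```rust
// Hypothetical, simplified stand-ins for the driver's real types.
#[derive(Clone, Debug, PartialEq)]
enum Outcome {
    /// Solution still simulates; score may have changed (e.g. gas usage).
    Valid { score: u64 },
    /// Solution would revert on the current block.
    Reverts,
}

/// Re-check the solution whenever a new block arrives, until the deadline.
/// `blocks` stands in for the stream of per-block simulation results seen
/// between computing the solution and the `/solve` deadline.
fn settle_or_withhold<I>(initial_score: u64, blocks: I) -> Option<u64>
where
    I: IntoIterator<Item = Outcome>,
{
    let mut score = initial_score;
    for outcome in blocks {
        match outcome {
            // Keep the reported score up to date with the latest simulation.
            Outcome::Valid { score: new_score } => score = new_score,
            // Withhold the solution rather than risk winning with a
            // solution that can no longer be settled (a slashable offense).
            Outcome::Reverts => return None,
        }
    }
    // Deadline reached and the solution still simulates: return it.
    Some(score)
}
```

In the real driver this loop would be asynchronous, driven by block notifications, but the decision logic is the same: `None` means the driver answers `/solve` with no solution.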
Background
Similar to #1999, this issue aims to reduce the time we waste due to incomplete data in the `autopilot`. For example, it can happen that some solver is very quick to respond with a solution for a given auction. But if other solvers take all the time they are allowed before reporting a solution, they will return solutions based on more up-to-date data, whereas the quick-to-respond solver would be unaware that a new block appeared that made its solution revert.
Details
I see 2 ways to handle this:

1. The `driver` keeps requesting new solutions from the matching engine until the time runs out. This is actually the most rational behavior and would produce the most recent data the solver can provide, at the cost of more compute resources on the external solver's end.
2. The `autopilot` doesn't just pick the highest score and declare it the winner. Instead, it calls `/reveal` on the individual drivers from the highest to the lowest score and simulates their solutions, and only calls `/settle` for the highest-ranking solution that still simulates. This would be totally fine for as long as we are running all the drivers, but it would undermine a pretty important aspect of the redesigned system (improved market maker support that only expects you to produce call data if you actually won).

We could of course go for both solutions (or some other solution I didn't consider), but since just doing 2 would result in a system that is effectively identical to our legacy system (together with the mentioned PR) and is easy to implement, I would go for that until we figure out a better way to avoid wasting precious run loops.

Acceptance criteria
Some change that results in more winning solutions actually ending up on-chain.
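For illustration, the ranking step from option (2) boils down to: sort revealed solutions by score, then settle the best one that still simulates. A minimal sketch, with entirely hypothetical names (`Ranked`, `pick_winner`) standing in for whatever the `autopilot` would actually use:

```rust
// Hypothetical representation of a solver's revealed, simulated solution.
#[derive(Clone, Debug)]
struct Ranked {
    solver: &'static str,
    score: u64,
    /// Whether the call data obtained via /reveal still simulates.
    simulates: bool,
}

/// Returns the winning solver, if any solution still simulates.
fn pick_winner(mut solutions: Vec<Ranked>) -> Option<&'static str> {
    // Highest score first.
    solutions.sort_by(|a, b| b.score.cmp(&a.score));
    // First (i.e. highest-scoring) solution that still simulates wins;
    // only that driver would then receive the /settle call.
    solutions.into_iter().find(|s| s.simulates).map(|s| s.solver)
}
```

Note how a higher-scoring but now-reverting solution is simply skipped instead of winning and later failing `/settle`, which is exactly the failure mode this issue is about.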