chore: Reduce number of solutions that fail after calling /settle #2007

Closed
MartinquaXD opened this issue Oct 22, 2023 · 2 comments · Fixed by #2008
Labels
E:3.1 Driver Colocation See https://github.com/cowprotocol/pm/issues/14 for details

Comments

@MartinquaXD
Contributor

Background

Similar to #1999 this issue aims to reduce the time we waste due to incomplete data in the autopilot.
For example, a solver may respond very quickly with a solution for a given auction. Solvers that use all the time available before reporting a solution will return solutions based on more up-to-date data, whereas the quick solver would be unaware that a new block appeared that made its solution revert.

Details

I see 2 ways to handle this:

  1. In the driver, keep requesting new solutions from the matching engine until the time runs out. This is actually the most rational behavior and would produce the most recent data the solver can provide, at the cost of more compute resources on the external solver's end.
  2. In the run loop of the autopilot, don't just pick the highest score and declare that solver the winner. Instead, call /reveal on the individual drivers from the highest to the lowest score and simulate their solutions. Only call /settle for the highest-ranking solution that still simulates. This would be totally fine for as long as we are running all the drivers, but it would undermine a pretty important aspect of the redesigned system (improved market maker support that only expects you to produce call data if you actually won).
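
Option 2 can be illustrated with a minimal sketch. It is not the autopilot's actual code: the `Bid` type, `pick_winner`, and the `still_simulates` callback (standing in for calling /reveal and simulating the returned call data against the latest block) are all hypothetical names made up for this example.

```rust
/// A scored bid from one driver. Hypothetical type for illustration only.
#[derive(Debug, Clone, PartialEq)]
struct Bid {
    driver: &'static str,
    score: u64,
}

/// Returns the highest-scoring bid whose solution still simulates.
/// `still_simulates` is an assumed stand-in for "call /reveal on that
/// driver and simulate the revealed call data on the latest block".
fn pick_winner(mut bids: Vec<Bid>, still_simulates: impl Fn(&Bid) -> bool) -> Option<Bid> {
    // Rank from the highest to the lowest score.
    bids.sort_by(|a, b| b.score.cmp(&a.score));
    // Stop at the first bid that still simulates; only that driver
    // would then receive the /settle call.
    bids.into_iter().find(|bid| still_simulates(bid))
}

fn main() {
    let bids = vec![
        Bid { driver: "slow", score: 100 },
        Bid { driver: "fast", score: 120 },
        Bid { driver: "other", score: 80 },
    ];
    // Pretend a new block made the "fast" driver's solution revert.
    let winner = pick_winner(bids, |bid| bid.driver != "fast");
    println!("{winner:?}");
}
```

Because the search stops at the first solution that still simulates, lower-ranked drivers are not asked to reveal anything unless every higher-ranked solution has started to revert.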

We could of course go for both solutions (or some other solution I didn't consider), but since just doing 2 would result in a system that is effectively identical to our legacy system (together with the mentioned PR) and is easy to implement, I would go for that until we figure out a better way to avoid wasting precious run loops.

Acceptance criteria

Some change that results in more winning solutions actually ending up on-chain.

@MartinquaXD MartinquaXD added the E:3.1 Driver Colocation See https://github.com/cowprotocol/pm/issues/14 for details label Oct 22, 2023
@MartinquaXD
Contributor Author

Since we need much higher throughput and the lowest possible latency on prod, I consider this issue Blocking Prod.

@sunce86
Contributor

sunce86 commented Oct 23, 2023

Considering that this is blocking prod, I would be OK with (2), but I would make an immediate follow-up plan for the cleanup: probably communicate with external solvers about this, implement (1), and potentially reduce the solving time to further reduce the chance of a solution being invalid.

MartinquaXD added a commit that referenced this issue Oct 24, 2023
# Description
Fixes #2007. The preferred solution described in that issue was not
straightforward to implement.
It expected the `autopilot` to call `/reveal` from the highest scoring
solution to the lowest scoring one to simulate it in order to avoid
cases where a solver wins the auction with a solution that would revert
by now.
The issue was that this requires the `autopilot` to know the address of
every solver which would be a significant change.

Instead this PR implements a strategy on the `driver` side that a
rational actor would be expected to follow as well.
All solvers know when they have to return a solution at the latest.
Because a solution can be more accurate the closer it was computed to
the time it is supposed to be executed it makes sense for all solvers to
delay their response as much as possible (something might change in the
meantime which might enable an even better solution).
This deadline gets propagated to the `solver` engine so ideally it would
take as much time computing the optimal solution as possible.
If the solver returns earlier (like some currently do) it still makes
sense to assume that other solvers will submit their solution at a later
point in time.
In that case the only reasonable action for the `driver` is to wait
until the deadline approaches and continuously re-simulate the computed
solution whenever a new block gets detected. If in the meantime the
solution would start to revert the `driver` would then withhold the
computed solution to not accidentally win the auction and get slashed
for not submitting it on-chain.
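
The waiting strategy described above can be sketched roughly as follows. This is
an illustration under stated assumptions, not the code of this PR: `Solution`,
`Simulation`, and `hold_until_deadline` are hypothetical names, `blocks` stands
in for the stream of new blocks that ends when the deadline approaches, and the
sketch assumes a solution is withheld as soon as it reverts once.

```rust
/// Outcome of simulating a solution on one block. Hypothetical type.
enum Simulation {
    /// Still executes; carries the score recomputed for that block.
    Ok { score: u64 },
    /// Would revert if submitted on-chain.
    Reverts,
}

/// A computed solution; only the score matters for this sketch.
#[derive(Debug, Clone, PartialEq)]
struct Solution {
    score: u64,
}

/// Re-simulates `solution` on every block observed before the deadline.
/// `blocks` is an assumed stand-in for a stream of new block numbers that
/// ends once the deadline is reached. Returns `None` to withhold the
/// solution if it starts to revert.
fn hold_until_deadline(
    mut solution: Solution,
    blocks: impl IntoIterator<Item = u64>,
    simulate: impl Fn(&Solution, u64) -> Simulation,
) -> Option<Solution> {
    for block in blocks {
        match simulate(&solution, block) {
            // Still viable: keep it, but report the updated score.
            Simulation::Ok { score } => solution.score = score,
            // Withhold, so the driver cannot win with a reverting solution.
            Simulation::Reverts => return None,
        }
    }
    Some(solution)
}

fn main() {
    let solution = Solution { score: 100 };
    // The solution keeps simulating; its score changes on block 11.
    let kept = hold_until_deadline(solution.clone(), [10, 11], |_, block| {
        Simulation::Ok { score: if block == 11 { 95 } else { 100 } }
    });
    println!("{kept:?}");
    // Block 12 makes the solution revert, so it is withheld.
    let withheld = hold_until_deadline(solution, [10, 11, 12], |_, block| {
        if block < 12 { Simulation::Ok { score: 100 } } else { Simulation::Reverts }
    });
    println!("{withheld:?}");
}
```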

# Changes
`driver` waits until deadline before returning the solution on `/solve`
and checks whether it's still viable on every new block. It also updates
the score in case the solution still simulates but becomes better or
worse (gas usage might change).
Also slightly adjusted `Ethereum::new()` to panic on any init error
since we can't handle those errors anyway because the type is essential
to the program.
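
As a rough illustration of the "panic on init error" pattern (not the real
`Ethereum` type: the `chain_id` field and the `fetch_chain_id` parameter are
hypothetical and exist only for this sketch):

```rust
/// Hypothetical stand-in for a type the program cannot run without.
#[derive(Debug)]
struct Ethereum {
    chain_id: u64,
}

impl Ethereum {
    /// Panics on any initialization error instead of returning a `Result`,
    /// since callers could not meaningfully recover anyway.
    /// `fetch_chain_id` is an assumed stand-in for a fallible init step.
    fn new(fetch_chain_id: impl FnOnce() -> Result<u64, String>) -> Self {
        let chain_id = fetch_chain_id().expect("failed to initialize Ethereum");
        Self { chain_id }
    }
}

fn main() {
    let eth = Ethereum::new(|| Ok(1));
    println!("chain id: {}", eth.chain_id);
}
```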

## How to test
This can be tested by a new e2e test which I would like to implement in
a follow up PR when existing e2e tests don't break.