exp/orderbook: Improve performance of path finding implementation #3818
I have implemented 3 changes which improve the performance of the path finding implementation:
Stop using the `visited map[string]bool` and instead scan through the `visited []xdr.Asset` slice to determine whether we have visited a node at some point in the current path of the DFS.

The first optimization takes advantage of the fact that a linear scan over a small list is faster than a map lookup. Since we limit the depth of our DFS to at most 5, the visited list always stays small enough for this to hold.
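A minimal sketch of the linear-scan membership check described above. The `Asset` type and the `contains` helper are hypothetical names chosen so the example compiles on its own; the real code works with `xdr.Asset` values, which would likely need their own equality comparison rather than `==`.

```go
package main

// Asset is a hypothetical stand-in for xdr.Asset so this sketch is self-contained.
type Asset string

// contains reports whether target appears in visited using a linear scan.
// With only a handful of entries (the DFS depth is capped), this avoids
// hashing a key and maintaining a map alongside the path.
func contains(visited []Asset, target Asset) bool {
	for _, a := range visited {
		if a == target {
			return true
		}
	}
	return false
}
```

A single slice also serves two purposes here: it records the current path and answers the "already visited" question, so no second data structure has to be kept in sync.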
The second optimization is effective when we are processing path finding requests with just 1 or 2 destination assets. Most of the path finding traffic fits this case. When we have a few destination assets there is no point in continuing the DFS after we have visited the destination assets.
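One way to read the second optimization is as a per-path counter of destination assets not yet reached: once it hits zero, extending the current path cannot produce another result. The sketch below assumes that reading. `Asset`, `graph`, and `findPaths` are hypothetical names on a toy adjacency list, not the actual order book types.

```go
package main

// Asset is a hypothetical stand-in for xdr.Asset.
type Asset string

// graph is a hypothetical toy adjacency list: asset -> assets one edge away.
type graph map[Asset][]Asset

// findPaths collects every simple path from current to an asset in targets.
// remaining counts the targets not yet seen on the current path. It is passed
// by value, so each branch of the search keeps its own count.
func findPaths(g graph, targets map[Asset]bool, visited []Asset, remaining int, current Asset, paths *[][]Asset) {
	// Skip assets already on the current path (linear scan over a short list).
	for _, a := range visited {
		if a == current {
			return
		}
	}
	// Copy the path so sibling branches do not share a backing array.
	path := append(append([]Asset{}, visited...), current)

	if targets[current] {
		*paths = append(*paths, path)
		remaining--
	}
	// Every target is already on this path, so going deeper cannot
	// produce a new result.
	if remaining == 0 {
		return
	}
	for _, next := range g[current] {
		findPaths(g, targets, path, remaining, next, paths)
	}
}
```

With a single destination asset, the search stops extending a path as soon as it reaches that asset, which matches the common case described above.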
The third optimization is the most impactful one. Assume we have a max path length of 3. This means our path should have no more than three edges: `src` -> `asset1` -> `asset2` -> `dest`. The `dfs` function was enforcing the max path length by checking that the length of the visited list was not greater than `maxPathLength`. This check was done at the very beginning of the function.

I realized that we should move this check to a later point in the code. Specifically, we should first check if we are at a terminal node, and only then check the `maxPathLength` condition. By doing that we eliminate wasted work in the case where we are at the end of a `maxPathLength` path: we do not need to traverse the node's outgoing edges because any extended path is going to exceed `maxPathLength`.

Here are the benchmarks for these changes:
As a basis of comparison, here are the benchmarks for the code without the changes:
There is room for improvement but the other ideas I had would affect the accuracy of the results. The nice thing about these improvements is that the output of the path finding implementation remains identical. Once the orderbook becomes sufficiently large that the path finding implementation becomes too slow even with these improvements, we can start looking at changes which sacrifice accuracy for efficiency.
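To illustrate the reordering behind the third optimization, here is a rough sketch on a toy graph. It records a terminal node first and only then applies the `maxPathLength` check, so the outgoing edges of a node at the end of a full-length path are never examined. The `search` type, its fields, and the `edgesExamined` counter are hypothetical and exist only to make the saving visible. The real `dfs` function operates on order book data.

```go
package main

// Asset is a hypothetical stand-in for xdr.Asset.
type Asset string

// graph is a hypothetical toy adjacency list: asset -> assets one edge away.
type graph map[Asset][]Asset

// search holds the inputs and results of one toy path search.
type search struct {
	g             graph
	dest          Asset
	maxPathLength int // maximum number of edges allowed in a path
	paths         [][]Asset
	edgesExamined int // outgoing edges looked at during the search
}

func (s *search) dfs(visited []Asset, current Asset) {
	// Skip assets already on the current path.
	for _, a := range visited {
		if a == current {
			return
		}
	}
	path := append(append([]Asset{}, visited...), current)

	// 1. Terminal check first: record the path if we reached the destination.
	if current == s.dest {
		s.paths = append(s.paths, path)
	}
	// 2. Length check second: path already uses maxPathLength edges
	// (len(path) nodes means len(path)-1 edges), so any extension would
	// exceed the limit and the outgoing edges need not be traversed.
	if len(path) > s.maxPathLength {
		return
	}
	for _, next := range s.g[current] {
		s.edgesExamined++
		s.dfs(path, next)
	}
}
```

In this sketch, a destination with many outgoing edges costs nothing extra when it sits at the end of a full-length path, whereas a length check at the top of the function would visit each of those edges only to return immediately.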