Table request time complexity #6817

sanjayts-dv · 2023-10-20T10:33:55Z

sanjayts-dv
Oct 20, 2023

Hi folks, I was curious: given S sources and D destinations, what is the time complexity of a table request? Is it S * D?

In our use case, we would like to understand how to tackle our ever-increasing request runtime as the number of source/destination points grows. For example, if a 5000*5000 request takes around 5 seconds, would we get some speed-up by splitting it into five 1000*5000 requests across multiple instances? Thanks.
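The split described above could be sketched as follows. This is only an illustration, assuming the standard OSRM HTTP table service with its `sources` and `destinations` index parameters; the function name `chunked_table_urls`, the base URL, and the profile name are hypothetical. Every chunk re-sends all destinations, so any per-coordinate work the server does for them is repeated in each request.

```python
from typing import List, Tuple

Coordinate = Tuple[float, float]  # (longitude, latitude)


def chunked_table_urls(
    base_url: str,
    profile: str,
    sources: List[Coordinate],
    destinations: List[Coordinate],
    chunk_size: int,
) -> List[str]:
    """Build one table request URL per chunk of sources (hypothetical helper).

    Each request carries only its own chunk of sources plus all destinations.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")

    urls = []
    for start in range(0, len(sources), chunk_size):
        chunk = sources[start:start + chunk_size]
        coords = chunk + destinations
        coord_str = ";".join(f"{lon},{lat}" for lon, lat in coords)
        # Indices refer to positions in this request's coordinate list:
        # the chunk's sources come first, the destinations after them.
        src_idx = ";".join(str(i) for i in range(len(chunk)))
        dst_idx = ";".join(str(i) for i in range(len(chunk), len(coords)))
        urls.append(
            f"{base_url}/table/v1/{profile}/{coord_str}"
            f"?sources={src_idx}&destinations={dst_idx}"
        )
    return urls
```

The chunks are independent, so they can be sent to different instances and the resulting row blocks concatenated in source order. For very large coordinate lists the URL itself becomes long, which is one more reason to measure rather than assume a speed-up.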

danpat · 2023-10-20T14:08:43Z

danpat
Oct 20, 2023
Collaborator

The original implementation is based on this paper: http://algo2.iti.kit.edu/schultes/hwy/distTable.pdf

> |S| · Dijkstra(F + |G′_K|) + |T| · Dijkstra(F) + O(|S| · |T| · f).
> If both sets S and T are large, the dominating term is |S| · |T| · f.

As you can see, there are a lot of pieces in the complexity puzzle. Your best bet is to try. One observation to make though - you will pay the coordinate snapping price multiple times if you make multiple requests.

To address that, @oxidase made a PR a few years ago that made the manyToMany plugin capable of doing the routing part multi-threaded: #4454 - but it was removed at some point, I think when we added support for both CH and MLD algorithms. It might be worth attempting to restore if you really want to squeeze the lowest latency out of a single request.


jcoupey · 2023-10-20T14:24:29Z

jcoupey
Oct 20, 2023
Collaborator

> it was removed at some point, I think when we added support for both CH and MLD

Interesting. Does it mean that the parallelization approach used back then was only viable for CH and would not have been usable with MLD?


danpat · 2023-10-20T16:00:13Z

danpat
Oct 20, 2023
Collaborator

@jcoupey I honestly don't remember, but I'm sure we can dig up the commits where it happened and see if the commit messages give more context.

It's a bit of a tradeoff - if you're only handling your own requests, then it makes sense to split a single request across many processors to make it fast.

If you're handling public traffic (like we do at Mapbox), then it can make sense to limit a request to a single CPU, despite the slower response time. If there are enough other concurrent requests occurring that you can utilize all your compute, it is arguably better to provide consistent latency based on request shape, rather than variable performance depending on load (which is what would happen if large requests were allowed to use other CPU cores and fight with other concurrent smaller requests).


DennisOSRM · 2023-10-20T17:05:54Z

DennisOSRM
Oct 20, 2023
Maintainer

> > it was removed at some point, I think when we added support for both CH and MLD
>
> Interesting. Does it mean that the parallelization approach used back then was only viable for CH and would not have been usable with MLD?

That's a fair assessment.


DennisOSRM · 2023-10-20T17:07:14Z

DennisOSRM
Oct 20, 2023
Maintainer

For practical purposes, the CH-based many-to-many implementation is in O(S*T).
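Taking that O(S*T) behaviour at face value, a back-of-the-envelope estimate for the split from the original question might look like the sketch below. It assumes runtime scales linearly with the number of source/destination pairs and uses the 5000*5000 ≈ 5 seconds figure from the question as the reference point; the function names are hypothetical, and snapping, network overhead, and the per-source and per-target search terms are ignored.

```python
import math


def estimate_table_seconds(
    num_sources: int,
    num_destinations: int,
    ref_sources: int = 5000,
    ref_destinations: int = 5000,
    ref_seconds: float = 5.0,
) -> float:
    """Scale a measured reference runtime linearly with the number of pairs."""
    pairs = num_sources * num_destinations
    ref_pairs = ref_sources * ref_destinations
    return ref_seconds * pairs / ref_pairs


def estimate_split(num_sources: int, num_destinations: int, num_chunks: int):
    """Return (seconds per chunk, total CPU seconds) when sources are split evenly."""
    sources_per_chunk = math.ceil(num_sources / num_chunks)
    per_chunk = estimate_table_seconds(sources_per_chunk, num_destinations)
    return per_chunk, per_chunk * num_chunks


per_chunk, total = estimate_split(5000, 5000, 5)
print(f"per chunk: {per_chunk:.2f} s, total CPU: {total:.2f} s")
```

Under this model the total work stays the same, so any gain comes only from running the chunks in parallel on separate cores or instances; real measurements will differ by whatever fixed per-request cost the model leaves out.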


jcoupey · 2023-10-23T08:12:41Z

jcoupey
Oct 23, 2023
Collaborator

> It's a bit of a tradeoff

Sure, totally get that eating up all cores is not always desired. But this means it would probably be possible to pass the number of cores allowed as a runtime option, a bit like what osrm-extract does. Then users would be able to make their own tradeoff. Then the single-threaded scenario would just be a particular case. Or am I missing something here and this would imply some overhead for the current default?


sanjayts-dv · 2023-11-03T13:10:10Z

sanjayts-dv
Nov 3, 2023
Author

Thank you for the clarifications.

This might be a silly question, but suppose we pre-process our map data using osrm-extract followed by osrm-contract and, instead of the built-in server, access the map data via FFI (an OSRM object initialized with a .osrm file). Would it be fair to assume we end up using the CH algorithm internally, as opposed to MLD?


danpat · 2023-11-03T13:59:38Z

danpat
Nov 3, 2023
Collaborator

Yes - the access method (osrm-routed or FFI) has nothing to do with the algorithms used and the structure of the data. If you're using osrm-contract and not osrm-customize, then you have CH data.
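To make the distinction concrete, here is a small sketch that lists the preprocessing commands for each kind of data. The tool names are the standard OSRM command-line tools; the helper `preprocessing_commands`, the default profile path, and the file names are assumptions for illustration, and the function only builds the argument lists without running anything.

```python
from typing import List


def preprocessing_commands(
    algorithm: str,
    osm_file: str,
    profile: str = "profiles/car.lua",  # assumed profile location
) -> List[List[str]]:
    """Return the argv lists for preparing CH or MLD data (hypothetical helper)."""
    # osrm-extract writes its output next to the input with an .osrm base name.
    base = osm_file
    for suffix in (".osm.pbf", ".osm"):
        if osm_file.endswith(suffix):
            base = osm_file[: -len(suffix)]
            break
    osrm_file = base + ".osrm"

    commands = [["osrm-extract", "-p", profile, osm_file]]
    if algorithm == "ch":
        # Contraction Hierarchies: extract, then contract.
        commands.append(["osrm-contract", osrm_file])
    elif algorithm == "mld":
        # Multi-Level Dijkstra: extract, then partition and customize.
        commands.append(["osrm-partition", osrm_file])
        commands.append(["osrm-customize", osrm_file])
    else:
        raise ValueError(f"unknown algorithm: {algorithm!r}")
    return commands


for argv in preprocessing_commands("ch", "map.osm.pbf"):
    print(" ".join(argv))
```

Either way, the resulting files are what determine the algorithm, regardless of whether they are later loaded by osrm-routed or through the library.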

