-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
N-Opa and optimizations #1271
base: master
Are you sure you want to change the base?
N-Opa and optimizations #1271
Conversation
…till need skip stops
Hi @doublestranded! Great to hear that you've been continuing to think about and work on these hard route geometry problems. I'll have a look through your PR -- and will ping @irees to do so as well -- as time allow. In terms of practical effects, what would this PR change externally: Would releasing it mean that current RSPs should be recalculated? Would the distances in SSPs need to be recalculated? Finally, re your option 3: we've actually been starting to move some of the heaviest GTFS import steps to Go -- it's not yet ready for public GitHub, but glad to chat more about that with you some time if you're interested. |
@doublestranded this is great - I am reading through the approach now. The new data importer currently uses simple linear interpolation to add distances and missing values, but when it is ready we can definitely implement this approach there as well. |
Thank you for taking a look @drewda and @irees! That's a great question about re-calculating both the RSP and SSPs. I say there's no rush to do that, even once this is complete and merged. Distances errors are still very theoretical right now (one slightly broken spec doesn't mean much in my opinion), but I do suspect they are some somewhere with such a large repository of data. Assuming continual imports of feeds when changed, manual imports feels a little unnecessary. I'm doing some local imports and investigation (when time permits for me as well) to test; if I start finding more legitimate errors with the existing algorithm, I'd say reconsider. Another thing to do might be to alert the Valhalla team of an impending merge. I'm glad you mentioned that @irees - that's exactly what I had in mind. This is a problem in general with the shapes and stops without distances given, and the code should appropriately reflect the layers of abstraction involved. @drewda As far as imports, thanks for the heads ups on the switch to Go. I probably don't need to be in the loop for now, but I'd be curious to see it when completed. I'll let you know of any progress with improving performance and finding inaccuracies. |
@drewda @irees - I’m starting to wrap up this PR. I’ve been testing it on some feeds, and fixing small bugs here and there. So far so good. For performance, I noticed that there was a bug in the way I was making (unnecessary) stack calls in the N-Opa code, and fixing that greatly improved performance (!). It’s possible there are other improvements to be made, but imports seem to run in reasonable time. I held off on implementing an “exact” solution. That solution, in my thinking at least, would have involved “pulverizing” each and every segment at all stops’ closest points to the segment, instead of breaking the segments based on a uniform small size. This would have involved an extra sorting step that would have impacted performance. Instead, I opted for a computation that ensures that each stop gets an assignment, and that segment sizes are small enough to account for stops being close together. I also simplified some of the logic regarding first stops and last stops. I felt that treating these two cases differently than the other stops was a too much of a judgement call, and one that conflicted with the new algorithm’s ability to compute the surrounding stops accurately. This necessitated updating a few of the tests to meet then new expectations. Finally, I ran into some issues with Ruby versions and gem dependencies - you may already be aware? I initially suspected Ruby 2.3 was causing problems with the builds, so I tried merging in the PR for Ruby 2.5 (it's probably best to go ahead and upgrade). Then some of the convex hull computations broke. From what I could tell, there are some issues transforming coordinates when proj4 is activated in RGeo (locally the computations match the test expectations). Ultimately, I copied out the buffer method from RGeo so we can bypass the transformations (I’ll provide a link in the comments). I think the best solution might turn out to be upgrading RGeo to the latest, along with its companion gems. Because one of those gems is ‘activerecord-postgis-adapter’, an upgrade to Rails 5 looks to be in order. |
Hello Transitland - I’m submitting this PR after some time reflecting on the distance calculation problem for RSPs and shapes/stops without supplied distances in general. In particular, I was curious to see whether the Alevras paper in the comments could really be applied here. I’m not sure how this PR will fit with Transitland’s future plans, but right now it’s still more of a conceptual PR that shouldn’t be merged just yet. I hope it’s useful!
Some of the goals of this PR are:
e.g, loading and memoizing an RSP’s stops from SQL?). I’m hoping more progress can be made on the application of N-OPA and performance. There’s a few options, such as keeping a pre-selection heuristic to ensure only “complex” route lines need N-OPA, writing N-OPA in a faster language like C or Go, or adding a heuristic within N-OPA taking advantage of some mathematical properties, or all of the above. It probably seems strange that an heuristic algorithm would be replaced with another, but it’s still better to base the heuristic on an algorithm that’s more exact; or at least can be configured to be exact. I wouldn’t merge this just yet, as performance is still not great.