ALT stopping criterion incomplete?

Hi,

While working with ALT, we came across minor differences that sometimes occur in the routes it returns. It seems ALT does not guarantee finding the optimal route: the stopping criterion used in bidirectional A*, on which ALT is based, appears to be incomplete. According to slides from KIT (slide 34), additional summands for the start and target node potentials are required. In the current GH implementation, the stopping criterion is the same as in a bidirectional Dijkstra, which does not seem to be sufficient.

To clarify, currently the stopping criterion is:
currFrom.weight + currTo.weight >= bestPath.getWeight()

But we think it should be
currFrom.weight + currTo.weight >= bestPath.getWeight() + weightApprox.approximate(start) - weightApprox.approximate(goal)
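Expressed as code, this is the finish check we have in mind (just a sketch; weightApprox, from and to are assumed to be available as in AStarBidirection, this is not actual GH code):

// sketch only: finish condition with the additional potential summands
// for the start node ('from') and the target node ('to')
boolean finished() {
    return currFrom.weight + currTo.weight
            >= bestPath.getWeight() + weightApprox.approximate(from) - weightApprox.approximate(to);
}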

Any idea why the current implementation does not respect this? We found that ALT produces different routes for the same start and goal nodes if we use different numbers of landmarks. We think this might be the cause of that problem.

Thanks,
Hendrik

One problem could be this: Landmark routing issue if QueryGraph splits longer edge · Issue #1532 · graphhopper/graphhopper · GitHub

If you think it is something different, it would be nice to have a simple unit or integration test that reproduces the issue.

the stopping criterion is the same as in a bidirectional Dijkstra, which does not seem to be sufficient.

It should be the one from AStarBidirection. Where did you find that it is using the one from bidirectional Dijkstra?

Thank you for your feedback! We encountered the discrepancies when running benchmarks which differed only in the number of active landmarks (6 vs 8). The routes were simple point-to-point queries without any intermediate points, so it’s rather unlikely that this is related to virtual nodes.

It should be the one from AStarBidirection. Where did you find that it is using the one from bidirectional Dijkstra?

Both AStarBidirection and DijkstraBidirectionRef extend AbstractBidirAlgo, which defines the stop condition as currFrom.weight + currTo.weight >= bestPath.getWeight();.
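For reference, the inherited check looks roughly like this (a simplified sketch, not the verbatim GH source):

// simplified sketch of the stop check defined in AbstractBidirAlgo,
// shared by both algorithms (not verbatim GH source)
protected boolean finished() {
    if (finishedFrom || finishedTo)
        return true;
    return currFrom.weight + currTo.weight >= bestPath.getWeight();
}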

Interesting, as we have never stumbled upon incorrect routes in practice. From a quick scan of the formula in the slides, it looks like you are right. Do you have a reference in a paper?

And if you apply this improved finish condition: were the problems fixed? And how is speed affected?

Did you investigate this? I can investigate this only if I have a reproducible bug. It might be that our special potential in A* (ConsistentWeightApproximator) does not require adapting the finish condition.
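For context, such a consistent potential is usually built by averaging the forward and backward heuristics; a minimal sketch of that construction (the names are illustrative, not the exact API of our class):

// minimal sketch of a consistent bidirectional A* potential (Ikeda et al. style);
// 'toGoalApprox' estimates weight(node -> goal), 'fromStartApprox' estimates
// weight(node -> start) on the reversed graph; names are illustrative
double approximate(int node, boolean reverse) {
    double potential = 0.5 * (toGoalApprox.approximate(node) - fromStartApprox.approximate(node));
    // averaging and negating for the reverse search gives pi_f(v) = -pi_r(v),
    // i.e. consistent reduced weights in both directions, which is what allows
    // the simple Dijkstra-like stop condition in the first place
    return reverse ? -potential : potential;
}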

Let me apologize for not getting back to you earlier. It took us a while to investigate this as we have been looking into different possibilities.

Regarding the stop condition: in general it seems to be correct. I’m sorry for the confusion; the inequality on the KIT slides is just wrong, and we have no idea where it comes from. We compared your implementation with the derivations from the slides referenced in the AStarBidirection class and didn’t find any issues.

However, there is still a problem, which originates from the precision loss of the stored landmark distances. In short, the double-to-short truncation and the scaling by factor introduce a bias in the forward and backward potentials returned by ConsistentWeightApproximator, such that the true value lies somewhere in the range of the computed potential +/- factor. This matters because these potentials are part of currFrom.weight and currTo.weight, so in the worst case both weights may end up overestimated by up to factor, leading the search to finish prematurely. We believe that in order to guarantee that the optimal path is found, it is necessary to account for the precision loss by modifying the stop condition to read

currFrom.weight + currTo.weight - 2 * factor >= bestPath.getWeight()
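To illustrate the precision loss with toy numbers (the factor value is made up; the mechanism is the double-to-short truncation described above):

// toy illustration of the precision loss when landmark weights are stored
// as shorts scaled by 'factor' (factor = 8.0 is made up for this example)
double factor = 8.0;
double exactWeight = 1234.9;                   // true landmark distance
short stored = (short) (exactWeight / factor); // truncated to 154
double restored = stored * factor;             // 1232.0, i.e. off by ~2.9
// the error per potential can be anything up to 'factor'; since both search
// directions add such a potential, currFrom.weight + currTo.weight can be
// overestimated by up to 2 * factor, hence subtracting 2 * factor above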

The differences are rather subtle and therefore not easy to reproduce, as they depend on the exact choice of landmarks and their weights. For the fastest weighting, the suboptimal routes are typically just a few seconds worse than the optimal ones, and this is for routes which take several hours! For example, out of 1000 test routes only one yields different results depending on the number of active landmarks. Interestingly, in this case the worse route is returned when more active landmarks are used (8 vs. 6). The suboptimal route marked in red (28944.5s) differs from the optimal blue one (28942.7s) by the following fragment

Thanks a lot for the investigation and clarification. Please also see this discussion: https://github.com/graphhopper/graphhopper/issues/1557

And so yes, the precision that we use is very tight and could lead to such problems, but given that it is not easy to increase the precision without doubling the memory requirement, I think those small suboptimalities (<0.01%) are acceptable. There might even be bugs; every hint and improvement would be appreciated :)

Are you using 0.11, where an improvement in this matter is already included? https://github.com/graphhopper/graphhopper/pull/1376

Thank you for pointing me to the corresponding GH issue. I believe it is still possible to address the problem without increasing the accuracy of stored landmarks by just modifying the stopping condition. Please find my detailed explanation on GitHub.

I don’t think this actually addresses the original underlying problem. As far as I can tell, the modified storage layout just improves the distance approximation by increasing the range of allowed values; however, it does not deal with the precision loss at all, as the double values are still scaled by factor and their fractional part gets truncated.
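A toy example of why a larger range alone does not help (again with a made-up factor):

// widening the representable range does not shrink the truncation step:
// with the same 'factor', weights still collapse onto multiples of 'factor'
double factor = 8.0;
double w1 = 1232.0, w2 = 1239.9;      // almost 'factor' apart
short s1 = (short) (w1 / factor);     // 154
short s2 = (short) (w2 / factor);     // also 154
// both store as the same value, so the per-direction error of up to 'factor'
// remains, regardless of how large the representable range is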
