Landmarks algorithm and performance with large weights


I’ve historically used the ‘flexible’ mode in my application, which I understand will default to bi-directional A*. I’ve recently been experimenting with moving to ‘hybrid’ mode (based on LM). I’ve made the necessary changes to my custom weighting, such that weights are never decreased.

Whilst LM works just fine for me, I note that it’s no faster than flexible mode if I have my custom weighting enabled (the number of visited nodes for a 10km route is ~30000). If I disable my custom weighting then it provides a huge speed increase (visited nodes drops to ~1700).

My custom weighting can use some very large weights (it has a multiplying effect on the existing edge weight, between 1x - 64x). The vast majority of edges will be assigned higher weights in my use case.

Is this expected? Is there anything I can reasonably do?


Do you mean you create some LM preparation of your custom weighting and this is slow? This should not be.

But if you mean you create some LM preparation of some weighting X and modify this per-request then the response time can degrade to the speed of bidirectional A* (in theory it shouldn’t be worse). The reason for that is that the changes might force the route too much away from the optimum for which the preparation was built.

Ah ha, that makes sense. Yes, my weights can change per request. Practically speaking though, there are only 6 discrete values (users have a slider from 0-100% in steps of 20). I am doing the LM preparation with the slider set to zero, which effectively disables my custom weighting. However, the default in the UI is 80%, and most users don’t change this.

So, I think what I will try is this: default my weighting to the 80% value and do the LM preparation with this. Then if anyone changes this slider, just force LM to be disabled for the request. This should give an optimal result with the majority of requests.

Does this sound sane?

Then if anyone changes this slider, just force LM to be disabled for the request. This should give an optimal result with the majority of requests.

You could also create two or more LM preparations like one at 0% and one at 80%. And if you not have enough RAM - not all of them have to via in memory as you can use MMAP now for only specific ones (see this feature)

1 Like

Thanks! I’ve gone with just enabling LM preparation for the 80% case at the moment. It looks like it’s going to yield a big speed boost. Thanks again!

1 Like

I implemented the changes and tested on a small area (England) successfully. I then re-imported the whole planet.

However, I find that the performance with LM degrades significantly with the planet-wide import, even for the same routes I was testing with the England import.

For example, with the England import, a foot profile route from London to Sheffield (370km) would visit 81k nodes with LM. But with the planet import, the same route would visit 917k nodes. No change in codebase or config, only the OSM file.

Is this expected? If so, will increasing the number of landmarks offset this?

If I disable LM and use flexible mode A*, then the performance doesn’t seem to vary by size of imported area (4.2m nodes visited for the same route above).

The tricky thing for world wide import is that you have less landmarks per area. This is also the reason we cut Eurasia into two parts for landmarks. I.e. certain routes via landmark between Asia and Europe will fail. You could do the same for e.g. UK.

Furthermore you could hand-pick the landmarks, because the algorithm does not pick the best and only greedily tries to pick some good enough landmarks (then also the import time might be a bit faster).
Config is

  prepare.lm.split_area_location: some.geo.json

But handpicking is a bit special and without experience it will lead to worse speed. So, try:
1: to locate the landmarks at the border of the whole routeable area!
2: they should be located as far away as possible from each other (so that many routes benefit)
3. if you want to improve speed from between north and south England then you can add a landmark in either the south or the north (or both).

(for our own hosting we do not do this effort as the improvements over our simple landmark searcher algorithm did not justify the effort the last time I looked into this)

Is this expected? If so, will increasing the number of landmarks offset this?

Yes, this will definitely help too.

If you have not enough RAM to do this you could try to switch to MMAP and increase the landmark heavily. Then the speed improvements from this increase might beat the slowness due to MMAP (have not tried it yet).

Thanks for all the detail. The vast majority of routes generated by my app are quite short (10-20km), but need to support the whole planet. So maybe there isn’t a big performance benefit with LM for my use case (aside from for when creating very long routes).

I’m going to try bumping up prepare.lm.landmarks to 48, I think I have enough RAM. (The default of 16 created a 20GB file for the planet).

For the fastest Weighting with the default landmark count we definitely see a good speed (and node count) improvement and any weighting should get a speed up. And yes, you can try increase the landmark count if you do not want touch the other settings

(the RAM requirement should increase proportionally i.e. 60GB instead of your current 20GB)

I’ll also play a bit with our landmark setup in the next weeks.

After increasing the number of landmarks to 48, I saw around a 10% decrease in the number of visited nodes from when using 16 landmarks. Given the increased RAM and processing overhead, I think I’ll just stick with 16. I’ll be enabling LM globally shortly.

Thanks again!

Can you share the autogenerated landmarks? While import a geojson is printed out which you can display with tools like

Sure thing:

48 landmarks:

16 landmarks:

Thanks! They do look fine from the first sight. Did you separate UK from the mainland via an additional splitter JSON?

No, I didn’t. Everything’s default for LM, except the prepare.lm.landmarks value. If the UK is split off then I suspect it’s probably because I’m blocking ferries, so I guess it creates the UK as a subnetwork. (Correct me if I’m wrong!)

Aha, ok. Strange that there are soo many landmarks bundled in England. This should give inner-England routes a good boost.

One big problem for a heavy customization is likely the bigger interval we would have to support and we don’t due to the memory limitations (every weight needs to fit into 2 bytes).

And then the maximum weight is easily reached (see LandmarkStorage.estimateMaxWeight) which means that the precalculated landmark weights do no longer give a hint regarding their direction (as all of them are the same: infinite). Instead we could try to change the precision which is currently static, but this could lead to other or similar problems I fear :slight_smile: … or we store the weight in 2.5 or 3 bytes.

Ah, no. It works differently and we currently do it properly already (at least in theory). So the only option to improve this would be to compress the value somehow or increase the space per value.

Thanks for the details of the inner workings.

As background: I had recently changed my custom weighting (per the earlier conversation) such that it returned values 1…64, and this value is used as a multiplier for the superweight. However, when I enabled LM for the planet and tried this, I hit an exception with the landmark weights exceeding Integer.MAX_VALUE. So instead I reverted my weights to be in the range (1/32)…2, in steps of 1/32, (so 64 possible values). This builds fine. (Note that I’m aware using weights <1 is an issue for LM, but I’m not doing this dynamically. Any weights <1 are part of the preprocessing. Everything dynamic is >=1).

To be clear, my custom weighting is taking the superweight and multiplying it by a number in the range of (1/32)…2

I haven’t looked at LandmarkStorage.estimateMaxWeight yet, I will do.

I will also try removing my custom weighting and re-running it for the planet, and seeing how this fares. This will at least give us the best-case performance, and I can decide whether to pursue it after that.

1 Like
Powered by Discourse