Optimization for Route API


#1

Hello guys. I’m working with GraphHopper Open Source and my main usage is for Route API.
Do you have any suggestions for optimizing the dataset for this API?

For example:
I’m using a cropped osm.pbf file. Can I keep only the road-network tags to reduce the dataset size? If so, which tags does the Route API use?

I don’t use the Isochrone API, Map Matching API, or Geocoding API. Actually, I don’t even use the front-end. Can I remove these features to save computational resources?

Thanks in advance :smile:


#3

@karussell can you please mute him. It becomes annoying.

@Anyaoha please create your own thread if you have questions, as @karussell just told you in the last post. And stop hijacking other threads all the time.


#4

I am also interested in this question. Am curious to know what you have tried / found out. I am also a user like you, not a developer.

I have looked at the map of a city and removed “tracks” and “footpaths” etc and reduced the size considerably. I only left primary, secondary, tertiary and a few other highways in. I found that it is indeed faster, by about 10%, and affects fastest more than shortest. It also depends on whether you use CH or not.


#5

What are your requirements? Do you need to reduce RAM usage at query time or at import time? Or do you mean something else?


#6

For now, I only trimmed my database to improve the import time.
Some time ago I did something similar for a research project: I filtered some tags to improve an algorithm’s analysis time.

The import time is not so important for me because I can handle it with my load balancer health check. Still, it would be great to reduce it too.

The RAM usage is the most important in my opinion.
Today I’m using high-memory AWS EC2 instances.
I can see the freeable memory constantly decreasing, even when the service is idle.

Another point: my usage is the Route API exclusively. How can I remove the ‘front-end’?


#7

If the query speed is sufficient, you can try the MMAP option for graph.dataaccess, but without an SSD it will be much slower. It is always a trade-off between query speed, the resources used during the import, and RAM usage.

> Another point: my usage is the Route API exclusively. How can I remove the ‘front-end’?

I suggest not forking GH. Instead, create a custom MyApplication class in a project that depends on GH and do not add the assets: https://github.com/graphhopper/graphhopper/blob/master/web/src/main/java/com/graphhopper/http/GraphHopperApplication.java#L39


#8

@karussell, about the tags… Do you recommend removing unused tags from the dataset to improve the import time or the routing algorithm’s run time?


#9

They shouldn’t matter. What matters is a fast disk like an SSD and lots of RAM.


#10

Thanks for all the support karussell.

One more question related to the dataset:

Do you see performance improvements when using smaller datasets?

Consider this scenario: I always run the Route API to calculate ETAs on paths within a fixed polygon (Sao Paulo). However, my dataset is the whole planet-osm.
Could the routing algorithm have a better response time on a trimmed dataset (Brazil-only instead of planet-osm)?

It makes sense to me that a trimmed dataset requires less memory. But I would like to know about the performance (can I use the measurement action to calculate it?)

Thanks again :+1:


#11

Yes, unfortunately. We are still investigating why, because this should not be the case. But the differences are probably not that big if you sort the graph (graph.sort=true).

> can I use the measurement action to calculate it?

Yes. Measurement should give you insights into this


#12

Hello karussell.
Just to let you know, I ran load tests with smaller datasets.
I got some performance improvements from it.

However…
I ran the performance tests on graphhopper 0.9 (my current production version) and graphhopper 0.12 (the version we will update to).
In my load tests, using the same configuration and the same data, version 0.9 was faster than 0.12.

graphhopper 0.9 avg throughput = 121.4 req/sec
graphhopper 0.12 avg throughput = 96.32 req/sec

Have you seen similar results? Any idea why there is this difference?
Which version has the best performance in your tests?

Thanks


#13

Usually we improve performance. But performance tests are really tedious to get right, and it is complex to improve all scenarios. So in order to know whether this is really a problem, you need to send us a reproducible measurement.

Preferably, run

./graphhopper.sh measurement area.pbf

and then check the resulting properties files to see what is going on and whether the difference between both versions is reproducible.


#14

I ran the measurement and the results are basically the same for graphhopper 0.9 and 0.12.

Could you take a look?

So, in theory, the response time should be the same?
I will double check my infrastructure to try to find some difference between the servers.


#15

Yes, the interesting variables are routingCH.mean (default speed mode), routingLM8.mean (hybrid mode) and routing.mean (flex mode).

But there could be a regression related to the volume of parallel requests, as we currently do not test for this.

Can you try switching the servers? If the difference is still due to the versions then it would be interesting how you do the tests and how we can reproduce this.


#16

Hey karussell. New results:

I ran another measurement test,
this time with the same config (new profile and other settings) that I used in the load testing.

graphhopper 0.9:

ARGS="config=$CONFIG graph.location=$GRAPH datareader.file=$OSM_FILE \
graph.flag_encoders=newcar prepare.ch.weightings=no prepare.lm.weightings=fastest prepare.min_network_size=200 prepare.min_one_way_network_size=200 routing.lm.disabling_allowed=true routing.non_ch.max_waypoint_distance=1000000 graph.dataaccess=RAM_STORE datareader.preferred_language=en"

graphhopper 0.12:

ARGS="$GH_WEB_OPTS graph.location=$GRAPH datareader.file=$OSM_FILE \
graph.flag_encoders=newcar prepare.ch.edge_based=off prepare.ch.weightings=no prepare.lm.weightings=fastest prepare.min_network_size=200 prepare.min_one_way_network_size=200 routing.lm.disabling_allowed=true routing.non_ch.max_waypoint_distance=1000000 graph.dataaccess=RAM_STORE datareader.preferred_language=en"

You can see the results on the same Google Sheet, on the tab encoder newcar.

|                 | GraphHopper 0.9 | GraphHopper 0.12   |
|-----------------|-----------------|--------------------|
| routingCH.mean  | -               | -                  |
| routingLM8.mean | 2.3982183808    | 2.0212344464000003 |
| routing.mean    | 8.167217608     | 6.772436764        |

Do greater values mean better performance?


#17

No, the values shown are average routing times in ms, so lower is better.
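
Because these are mean times, the relative change between versions is easier to read than the raw numbers. The following is only a minimal sketch: it assumes the measurement output is plain `key=value` properties text, inlines the values posted above instead of reading the real files, and uses a hypothetical helper class `MeasurementCompare` that is not part of GraphHopper.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of GraphHopper.
final class MeasurementCompare {
    // Keys named earlier in the thread; each holds a mean routing time in ms.
    static final String[] KEYS = {"routingCH.mean", "routingLM8.mean", "routing.mean"};

    /** Parses properties-style text (key=value lines). */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns (newer - older) / older for a key, or NaN if a value is missing or not numeric. */
    static double relativeChange(Properties older, Properties newer, String key) {
        try {
            double a = Double.parseDouble(older.getProperty(key, "").trim());
            double b = Double.parseDouble(newer.getProperty(key, "").trim());
            return (b - a) / a;
        } catch (NumberFormatException e) {
            return Double.NaN;
        }
    }

    public static void main(String[] args) {
        // Stand-ins for the two properties files, using the values posted above.
        // routingCH.mean is absent because CH preparation was disabled in that run.
        Properties v09 = parse("routingLM8.mean=2.3982183808\nrouting.mean=8.167217608\n");
        Properties v012 = parse("routingLM8.mean=2.0212344464000003\nrouting.mean=6.772436764\n");
        for (String key : KEYS) {
            double change = relativeChange(v09, v012, key);
            // A negative change means the newer version is faster.
            String shown = Double.isNaN(change) ? "n/a" : String.format("%+.1f%%", change * 100);
            System.out.println(key + ": " + shown);
        }
    }
}
```

With these inputs, the LM and flexible-mode means come out roughly 16% and 17% lower on 0.12, so in the Measurement run 0.12 is the faster version.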


#18

Ok, that makes sense :+1:
But it is still weird… because graphhopper 0.12 was slower than 0.9 in my load testing.

Additional info about the load testing:
JMeter --> 5 thread groups with the following configuration were started to get average values for response time:

Thread count: 50
Startup Time: 30 seconds
Hold Load: 10 minutes
Shutdown Time: 30 seconds

Same input data in every test.
Same networking and infrastructure (hardware) for both versions.
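
The setup above amounts to a fixed number of workers sending requests back to back for a hold period. As a rough illustration of that pattern (not a replacement for JMeter), here is a minimal sketch in which the actual HTTP call is abstracted as a `Runnable`. The class `MiniLoadTest` and its methods are hypothetical, and the ramp-up and shutdown phases are omitted.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical closed-loop load generator; not JMeter and not part of GraphHopper.
final class MiniLoadTest {
    /** Outcome of one run: completed requests and elapsed wall-clock seconds. */
    static final class Result {
        final long completed;
        final double seconds;

        Result(long completed, double seconds) {
            this.completed = completed;
            this.seconds = seconds;
        }

        double throughput() {
            return completed / seconds; // requests per second
        }
    }

    /** Runs `request` back to back on `threads` workers for about `holdMillis` ms. */
    static Result run(int threads, long holdMillis, Runnable request) {
        AtomicLong completed = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        long holdNanos = TimeUnit.MILLISECONDS.toNanos(holdMillis);
        for (int i = 0; i < threads; i++) {
            pool.submit(() -> {
                // Each worker sends the next request as soon as the previous one returns.
                while (System.nanoTime() - start < holdNanos) {
                    request.run();
                    completed.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            // Wait for in-flight requests; the extra minute is an arbitrary safety margin.
            pool.awaitTermination(holdMillis + 60_000, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new Result(completed.get(), (System.nanoTime() - start) / 1e9);
    }

    public static void main(String[] args) {
        // Stand-in for one routing request: sleep 5 ms instead of calling a server.
        Runnable fakeRequest = () -> {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        Result r = run(50, 2_000, fakeRequest);
        System.out.printf("completed=%d throughput=%.1f req/sec%n", r.completed, r.throughput());
    }
}
```

Passing the request in as a `Runnable` means the same loop can drive either an HTTP call against the server or a direct Java API call, which is one way to check whether a difference comes from the routing code or from the server layer.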

graphhopper 0.12 results:

| Throughput (req/sec) | Response time (min) | Response time (avg) | Response time (max) |
|----------------------|---------------------|---------------------|---------------------|
| 95.2                 | 270                 | 501                 | 64501               |
| 97.6                 | 266                 | 496                 | 4352                |
| 96.1                 | 266                 | 496                 | 6155                |
| 96.2                 | 270                 | 496                 | 4376                |
| 96.5                 | 268                 | 494                 | 64060               |
| Mean: 96.32          | Mean: 268           | Mean: 496.6         | Mean: 28688.8       |

graphhopper 0.9 results:

| Throughput (req/sec) | Response time (min) | Response time (avg) | Response time (max) |
|----------------------|---------------------|---------------------|---------------------|
| 116.6                | 230                 | 409                 | 15227               |
| 121.8                | 266                 | 391                 | 3242                |
| 123.1                | 266                 | 387                 | 8615                |
| 122.5                | 267                 | 389                 | 4384                |
| 123                  | 267                 | 388                 | 4027                |
| Mean: 121.4          | Mean: 259.2         | Mean: 392.8         | Mean: 7099          |
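
One way to sanity-check these tables is Little's law: in a closed loop, throughput is roughly the number of concurrent threads divided by the mean response time. The sketch below applies it to the figures above. It assumes the response times are in milliseconds (the tables do not state a unit) and that each run had 50 concurrent threads. The class `LoadTestSummary` is hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper for sanity-checking the load test tables.
final class LoadTestSummary {
    static double mean(double... xs) {
        return Arrays.stream(xs).average().orElse(Double.NaN);
    }

    static double median(double... xs) {
        double[] sorted = xs.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        if (n == 0) return Double.NaN;
        return n % 2 == 1 ? sorted[n / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    /** Little's law for a closed loop: throughput is about threads / mean response time. */
    static double expectedThroughput(int threads, double avgResponseMs) {
        return threads / (avgResponseMs / 1000.0); // assumes response times are in ms
    }

    public static void main(String[] args) {
        int threads = 50; // thread count from the JMeter setup (assumed per run)
        double[] max012 = {64501, 4352, 6155, 4376, 64060};
        double[] max09 = {15227, 3242, 8615, 4384, 4027};

        System.out.printf("0.12: expected %.1f req/sec, measured 96.32%n", expectedThroughput(threads, 496.6));
        System.out.printf("0.9:  expected %.1f req/sec, measured 121.4%n", expectedThroughput(threads, 392.8));
        System.out.printf("throughput change 0.9 -> 0.12: %+.1f%%%n", (96.32 - 121.4) / 121.4 * 100);
        System.out.printf("max response 0.12: mean %.1f, median %.1f%n", mean(max012), median(max012));
        System.out.printf("max response 0.9:  mean %.1f, median %.1f%n", mean(max09), median(max09));
    }
}
```

Under those assumptions the estimate lands within about 5% of the measured throughput for both versions, so the lower throughput of 0.12 is consistent with its higher average response time rather than being a separate effect. The median of the max column (6155 for 0.12, 4384 for 0.9) also shows that the 0.12 mean of 28688.8 is dominated by the two runs with maxima above 64000.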

#19

OK, I really have no idea; better to ask @karussell. Note, though, that the Measurement class only tests the Java API: no server setup is involved, which could explain the difference.


#20

Hello @karussell. Can you give us a :raised_hand: and let me know what could be happening? I would really like to use the new version, but this performance issue could be a problem for me.