Hello guys. I’m working with GraphHopper Open Source, and I mainly use the Route API.
Do you have any suggestions for improving the dataset for this API?
For example:
I’m using a cropped osm.pbf file. Can I keep only the tags for the road network to reduce the dataset size? If so, which tags does the Route API use?
I don’t use the Isochrone API, Map Matching API, or Geocoding API. Actually, I don’t even use the front-end. Can I remove these features to save some computational resources?
I am also interested in this question, and I’m curious to know what you have tried and found out. I am also a user like you, not a developer.
I looked at the map of a city and removed “tracks”, “footpaths”, etc., which reduced the size considerably. I left in only primary, secondary, tertiary and a few other highways. I found that routing is indeed faster, by about 10%, and that the effect is larger for fastest than for shortest. It also depends on whether you use CH or not.
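The kind of pre-filtering described above can be sketched roughly like this. It is only an illustration, assuming the ways have already been parsed into plain tag dictionaries; `KEEP_HIGHWAYS`, `keep_way` and `filter_ways` are hypothetical names, and the list of highway values to keep is a judgment call for your own area, not something GraphHopper prescribes.

```python
# Hypothetical whitelist of highway classes to keep in the trimmed extract.
# Adjust it to your area; this set is an assumption for illustration only.
KEEP_HIGHWAYS = {"motorway", "trunk", "primary", "secondary", "tertiary", "residential"}


def keep_way(tags):
    """Return True if a way (given as a dict of OSM tags) should stay in the extract."""
    return tags.get("highway") in KEEP_HIGHWAYS


def filter_ways(ways):
    """Keep only the ways whose highway value is in the whitelist."""
    return [way for way in ways if keep_way(way["tags"])]


if __name__ == "__main__":
    # Made-up sample ways, just to show the call.
    sample = [
        {"id": 1, "tags": {"highway": "primary"}},
        {"id": 2, "tags": {"highway": "track"}},
        {"id": 3, "tags": {"highway": "footway"}},
        {"id": 4, "tags": {"building": "yes"}},
    ]
    for way in filter_ways(sample):
        print(way["id"], way["tags"]["highway"])
```

Keep in mind that dropping road classes changes which routes can be found, so it is worth comparing a few known routes before and after trimming.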
For now, I only trimmed my database to improve the import time.
Some time ago I did something related for a research project: I filtered some tags to improve an algorithm’s analysis time.
The import time is not so important for me because I can handle it with my load balancer’s health check. Anyway, it would be great to reduce it too.
RAM usage is the most important thing in my opinion.
Today I’m using AWS EC2 instances sized for high memory usage.
I can see the freeable memory constantly decreasing, even when the service is idle.
Another point: my usage is the Route API exclusively. How can I remove the ‘front-end’?
If query speed is sufficient, you can try the MMAP option for dataaccess, but without an SSD it will be much slower. It is always a balance between query speed, resources used during the import, and RAM usage.
Do you see performance improvements when it uses smaller datasets?
Consider this scenario: I always run the Route API to calculate ETAs for paths within a specific polygon (Sao Paulo). However, my dataset is the whole planet-osm.
Could the routing algorithm have a better response time on a trimmed dataset (Brazil-only instead of planet-osm)?
It makes sense to me that a trimmed dataset requires less memory, but I would like to know about the performance (can I use the measurement action to calculate it?)
Yes, unfortunately. We are still investigating why, because this should not be the case, but the difference is probably not so big if you sort the graph (graph.sort=true).
can I use the measurement action to calculate it?
Yes. Measurement should give you insights into this.
Hello karussell.
Just to let you know, I ran load tests with smaller datasets.
I got some performance improvements from them.
However…
I ran the performance tests on GraphHopper 0.9 (my current production version) and GraphHopper 0.12 (the version we will update to).
In my load tests, using the same configuration and the same data, version 0.9 was faster than 0.12.
Usually we improve on performance. But performance tests are really tedious to get right, and it is also complex to improve all scenarios. So in order to know whether this is really a problem, you need to send us a reproducible measurement.
Preferably you run
./graphhopper.sh measurement area.pbf
and then check in the resulting properties files what is going on and whether the difference between the two versions is reproducible.
Yes, the interesting variables are routingCH.mean (default speed mode), routingLM8.mean (hybrid mode) and routing.mean (flex mode).
But it could be that there is a regression with regard to the volume of parallel requests, as we currently do not test against this.
Can you try switching the servers? If the difference is still due to the versions then it would be interesting how you do the tests and how we can reproduce this.
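To compare the measurement output of two versions, something like the following could help. It is a minimal sketch that assumes the resulting properties files contain plain `key=value` lines and that the three mean values named above are present in both; `read_properties` and `compare_means` are hypothetical helper names, and the numbers in the demo are made-up placeholders, not real measurements.

```python
# Keys mentioned in the thread: speed mode, hybrid mode and flex mode means.
KEYS = ("routingCH.mean", "routingLM8.mean", "routing.mean")


def read_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def compare_means(old_text, new_text, keys=KEYS):
    """Return {key: (old, new, change in percent)} for keys present in both inputs."""
    old, new = read_properties(old_text), read_properties(new_text)
    result = {}
    for key in keys:
        if key in old and key in new:
            a, b = float(old[key]), float(new[key])
            change = (b - a) / a * 100.0 if a else float("nan")
            result[key] = (a, b, change)
    return result


if __name__ == "__main__":
    # Placeholder contents; in practice read the two generated properties files.
    old_run = "routingCH.mean=2.0\nrouting.mean=40.0\n"
    new_run = "routingCH.mean=3.0\nrouting.mean=38.0\n"
    for key, (a, b, change) in compare_means(old_run, new_run).items():
        print(f"{key}: {a} -> {b} ({change:+.1f}%)")
```

A positive change means the newer run reported a larger mean for that key. Running the measurement more than once per version helps to tell a real difference from noise.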
OK, that makes sense.
But it is still weird, because GraphHopper 0.12 was slower than 0.9 in my load tests.
Additional info about the load testing:
JMeter --> 5 thread groups with the following configuration were started to get average values for response time:
Ok no idea really and better ask @karussell, but the Measurement class only tests the Java API (no server setup is involved, which could explain the difference).
Hello @karussell. Can you let me know what could be happening? I would really like to use the new version, but this performance issue could be a problem for me.
It could be anything. Between those versions we changed not only the algorithms but also which roads are accepted, and this is very likely the reason (e.g. you can try excluding tracks and see that routing becomes much faster, but tracks are allowed in some countries, so we have to include them until country rules are ready).