
Minimize downtime when regenerating graph-cache


#1

Sooner or later everyone needs to refresh their environment with new PBF data. To minimize downtime, a second GraphHopper instance can be run with “-a import” to build a new graph-cache “offline”. Then you only need to restart the “-a web” GraphHopper instance; however, this still causes an outage (in my case just a few minutes, which is acceptable).

I was wondering whether there is a better way to reduce the downtime. Any hints?

I found this discussion, which mentions "At least two independent processes are recommended for production (no down time)." This suggests having two listeners up and using just one at a time, correct?

Thanks!


#2

I couldn’t convince the guys at GH to facilitate this using HTTP calls so I created a fork and implemented it. I wanted this to be merged to the main version but my pull request was declined.
The code can be found here:


There’s a rebuild HTTP call that receives a PBF file, creates a cache, and replaces the in-memory graph object so that the downtime is minimal.
Here’s the code that uses it:

This is not ideal, as this fork drifts away from the main version.
The way I think it should properly be solved is with Docker: create two containers with GH, rebuild one of them, and switch between them. Docker is the ideal solution as it does exactly that.
I have yet to migrate my current solution to docker though…
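
To illustrate the "two containers, switch between them" idea, here is a minimal sketch of the switching logic only. It assumes some proxy or gateway in front of the containers asks this class which upstream to forward to. The class name, the container URLs, and the health check are all hypothetical and not part of GraphHopper or Docker.

```java
import java.net.URI;
import java.util.function.Predicate;

/** Tracks which of two upstream instances currently receives traffic (hypothetical helper). */
final class BlueGreenSwitch {
    private final URI blue;
    private final URI green;
    private volatile URI active; // read by request threads, written only in promoteStandby()

    BlueGreenSwitch(URI blue, URI green) {
        this.blue = blue;
        this.green = green;
        this.active = blue;
    }

    /** The upstream that requests should be forwarded to right now. */
    URI active() {
        return active;
    }

    /** The upstream that is idle and can be rebuilt with fresh data. */
    URI standby() {
        return active.equals(blue) ? green : blue;
    }

    /**
     * Promotes the standby to active, but only if the caller-supplied
     * health check accepts it. Returns true if the switch happened.
     */
    synchronized boolean promoteStandby(Predicate<URI> isHealthy) {
        URI candidate = standby();
        if (!isHealthy.test(candidate)) {
            return false; // keep serving from the old instance
        }
        active = candidate;
        return true;
    }
}
```

Usage would be roughly: rebuild the graph-cache in whichever container `standby()` points to, then call `promoteStandby(...)` with a check that sends a test request to it. If the check fails, traffic simply stays on the old container.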


#3

GraphHopper is nothing special here: you can place it behind a load balancer (as one does for other services), or you can create two GraphHopper instances and swap them on demand (there is no need to fork GraphHopper to do this, just use our Maven dependencies in a Java project). The problem with the latter solution is that you need twice the amount of RAM, and if you need to replace the hardware or restart the server/node you still need the load balancer solution, so we do not see a reason to include the second solution.
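
For the in-process variant (two instances swapped on demand), a rough sketch could look like the following. It is deliberately generic: `T` stands for whatever object answers routing requests in your own Java project, and the loader and disposer are supplied by you. None of these names come from the GraphHopper API.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Holds the live engine and swaps in a freshly loaded one atomically (hypothetical helper). */
final class HotSwap<T> {
    private final AtomicReference<T> active;
    private final Consumer<T> disposer; // releases the resources of a retired engine

    HotSwap(T initial, Consumer<T> disposer) {
        this.active = new AtomicReference<>(initial);
        this.disposer = disposer;
    }

    /** Request threads call this to get the engine currently serving traffic. */
    T current() {
        return active.get();
    }

    /**
     * Loads the replacement while the old engine keeps serving, then swaps.
     * Both engines are in memory at the same time until the old one is disposed,
     * which is where the doubled RAM requirement comes from.
     */
    T reload(Supplier<T> loader) {
        T fresh = loader.get();          // slow part; no downtime during this step
        T old = active.getAndSet(fresh); // atomic switch for all new requests
        // Simplification: a real service should wait for in-flight requests
        // that still hold a reference to `old` before disposing it.
        disposer.accept(old);
        return fresh;
    }
}
```

The loader here is expected to load an already imported graph-cache; the import itself would still run elsewhere.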

Please note that it is not recommended to run the import on the same machine that does the routing, as the import process can consume a lot of resources (CPU, disk, RAM).