Multi-Server import and setup

Hey,

in preparation for a larger scale of operation, I would like to deploy GH on multiple servers to ensure both high availability and better performance of the service.

Assuming, e.g., that I need the entire Europe map and the “foot” profile, for routing only, is there a way to share the results of the import across multiple servers, so that individual imports can be avoided?

Basically, have a “stronger” machine do the map import and several “less strong” machines serve the Routing API.

M.

Yes, sure. Just copy the graph-cache folder (created after import) and ensure that you use the same config.yml for all servers.
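
A minimal sketch of that copy step in Python, assuming the server's target directory is reachable as a local path (for example a mounted volume); in practice rsync or scp would do the same job. The names `publish_graph_cache` and `file_sha256` and the `.incoming` suffix are made up for illustration and are not part of GraphHopper.

```python
import hashlib
import shutil
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def publish_graph_cache(src_cache: Path, src_config: Path, dest_dir: Path) -> Path:
    """Copy the graph-cache folder and config.yml into dest_dir.

    Hypothetical helper: the copy goes to a temporary name first so that a
    half-finished copy is never mistaken for a complete graph-cache.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest_cache = dest_dir / src_cache.name
    tmp_cache = dest_dir / (src_cache.name + ".incoming")

    if tmp_cache.exists():
        shutil.rmtree(tmp_cache)
    shutil.copytree(src_cache, tmp_cache)

    # Swap the finished copy into place.
    if dest_cache.exists():
        shutil.rmtree(dest_cache)
    tmp_cache.rename(dest_cache)

    # The server must run with the same config.yml as the importer.
    dest_config = dest_dir / src_config.name
    shutil.copy2(src_config, dest_config)
    if file_sha256(src_config) != file_sha256(dest_config):
        raise RuntimeError("config.yml differs between importer and server")
    return dest_cache
```

The checksum comparison is only a cheap guard against a truncated or stale config copy; it says nothing about whether the graph-cache itself matches that config.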

Thanks!

And can that folder be shared among multiple instances (i.e. from a shared storage)… is it read-only?

Likely it won’t work as a lock file needs to be created to prevent certain problems. But there are options to make it work. However this is not always reliable and we don’t recommend it. Have a look into this PR comment.

Too bad..

This means, the cache folder must be copied to the new instance after the “importer” has created it… :confused:

When you use a shared (network?) storage this is done automatically and yes, for GH you currently need to do it explicitly.

However this is also an advantage: if you need to restart the GH server then this will be much quicker as the graph-cache is already there.

Not sure I understand: having a shared storage doesn’t automatically copy anything.
And you are also saying that each instance has its own lock file.

This is not related to the location of the cache, but to the fact that there is a cache, right?

I do not know the internals of how exactly a network storage works, but if a process uses a network storage, isn’t the data copied over the network to the process? How else would it arrive at that process?

If you have a graph-cache of e.g. 180GB and a network throughput of 60MB/sec, then the restart will take about 51 minutes just because of the copying process.
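
As a rough check of that figure, here is a small Python sketch. It assumes binary units (1 GB = 1024 MB) and a constant transfer rate; with decimal units the result would be 50 minutes. The function name `copy_minutes` is just for illustration.

```python
def copy_minutes(size_gib: float, rate_mib_per_s: float) -> float:
    """Minutes needed to transfer size_gib GiB at rate_mib_per_s MiB/s."""
    seconds = size_gib * 1024 / rate_mib_per_s
    return seconds / 60


# 180 * 1024 MiB / 60 MiB/s = 3072 s
print(round(copy_minutes(180, 60), 1))  # 51.2
```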

This is not related to the location of the cache, but to the fact that there is a cache, right?

A graph-cache on a local disk should allow faster restarts than if the graph-cache is on a network storage (usually network read time is much slower than reading from disk).

So, going back to the basics, each instance loading the cache from the “shared” folder will:

  1. load it to memory completely or just copy it locally?
  2. add its own lock file in it
  3. when crashed, do the two steps above all over again?

This makes a difference in the expected behavior of that shared “copy”…

If all files already exist, no import is necessary, and you use the default dataaccess type, then it will load the graph-cache into memory completely, yes.

add its own lock file in it

Of course every GraphHopper instance uses the same lock file. If it already exists, the instance won’t start. There is an option in GraphHopper that avoids this: set allowWrites=false. However, this only partially holds, as explained in #3144, but it should work for your case.
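
To illustrate the general idea (this is not GraphHopper's actual code), here is a minimal Python sketch of an exclusive lock file in a shared folder. The file name `example.lock`, the `AlreadyLocked` exception, and the `allow_writes` parameter are hypothetical and only mirror the behavior described above: a second instance fails while the lock file exists, and a read-only instance skips the lock entirely.

```python
import os
from pathlib import Path
from typing import Optional


class AlreadyLocked(RuntimeError):
    """Raised when another instance already holds the lock file."""


def acquire_lock(cache_dir: Path, name: str = "example.lock") -> Path:
    """Create the lock file atomically; fail if it already exists."""
    lock_path = cache_dir / name
    try:
        # O_CREAT | O_EXCL makes creation fail if the file is already there.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise AlreadyLocked(f"{lock_path} exists, refusing to start") from None
    os.close(fd)
    return lock_path


def open_cache(cache_dir: Path, allow_writes: bool = True) -> Optional[Path]:
    """Return the lock path, or None when opened read-only (no lock taken)."""
    if not allow_writes:
        return None
    return acquire_lock(cache_dir)
```

With this scheme, a lock file left behind by a crashed process has to be removed before a writing instance can start again, which is one reason a shared folder with writing instances is fragile.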

when crashed, do the two steps above all over again?

what do you mean here?