Serving GTFS of Germany on 32 GB RAM?

Hi all, I’m trying to get a Germany-wide GH instance with GTFS running on my 32 GB RAM server, but it seems like it inevitably crashes either during import or after a few hours of stable use no matter which settings I use. I feel like this shouldn’t be the case, since the entire graph-cache folder is < 20 GB.

Here’s my setup:

  • GH 11.0 running in a docker container (eclipse-temurin:21-jre) with no RAM restrictions
  • Ubuntu 24.04, 32 GB RAM, 8 GB of swap space
  • Separate build and serve runs, with the following Java Options:
JAVA_OPTS_SERVE="-Xms24g -Xmx24g -XX:+UseZGC -XX:MaxMetaspaceSize=512m -XX:+ExitOnOutOfMemoryError" # RAM heap around graph cache size on serve, ZGC based on GH docs
JAVA_OPTS_BUILD="-Xms24g -Xmx24g -XX:+UseParallelGC -XX:+UseStringDeduplication -XX:MaxMetaspaceSize=512m -XX:+ExitOnOutOfMemoryError" # parallel GC based on GH docs
  • MMAP already enabled in config
  • GTFS feed from https://gtfs.de/de/feeds/
  • Custom profiles only for reduced speed limits inside of cities
  • The rest of the server is relatively light on RAM, some docker containers and the system together totalling maybe 2-4 GB

Here’s what is happening: The first import often crashes with a java.lang.OutOfMemoryError during the GTFS feed import, but with some tuning of the Java heap I made it work at least once. However, during serving, I keep getting SIGSEGV errors, either during startup of the container, or after a few hours of normal stable use:

# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0000742e77916b17, pid=1, tid=11

On a 64 GB test server, everything ran fine for weeks with around 50 GB of Java heap allocated. Still, is this expected behaviour? It seems like it should be possible to serve a single country on 32 GB.

Any help is appreciated, thanks!

With GTFS enabled it needs a lot of heap due to the required timetable data structure, but probably also because GraphHopper is not yet optimized for Germany-wide imports. (Again, this applies with GTFS enabled! Without GTFS, GraphHopper could probably scale easily to Europe-wide coverage with 32 GB RAM.)

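If you enabled MMAP, you likely need to reduce the Xmx (and Xms) setting. The memory-mapped graph data lives outside the Java heap, so the heap only has to hold the other objects created during import that do not use MMAP, and the RAM freed by a smaller heap stays available for the mapped files.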
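To make the trade-off concrete, here is a rough budget using the numbers from the first post. It is only a sketch: the constants are the poster's own estimates (upper bounds where a range was given), the function name is made up for illustration, and it ignores other JVM overhead such as GC structures and thread stacks. It does not show that the crashes are caused by this, only how little RAM a 24 GB heap leaves outside the heap on a 32 GB machine.

```python
# All figures in GB, taken from the first post (upper bounds where a range was given).
TOTAL_RAM = 32
OTHER_PROCESSES = 4   # "maybe 2-4 GB" for other containers and the system
METASPACE_CAP = 0.5   # -XX:MaxMetaspaceSize=512m
GRAPH_CACHE = 20      # graph-cache folder is "< 20 GB", used here as an upper bound


def ram_left_for_mmap(heap_gb: float) -> float:
    """RAM not claimed by the heap, the metaspace cap, or other processes."""
    return TOTAL_RAM - heap_gb - METASPACE_CAP - OTHER_PROCESSES


# Hypothetical heap sizes to compare; only 24 GB comes from the post.
for heap in (24, 16, 8):
    left = ram_left_for_mmap(heap)
    print(f"heap {heap:>2} GB -> {left:4.1f} GB left for mapped graph files "
          f"(graph cache up to {GRAPH_CACHE} GB)")
```

With these inputs, a 24 GB heap leaves about 3.5 GB for the OS to keep memory-mapped files resident, against a graph cache of up to 20 GB. The 16 GB and 8 GB rows are just illustrative alternatives, not recommended values.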

It would be best if you posted your config. As a start, disable CH and LM, disable elevation, and try just a single profile, all to get the import through once and see how much RAM it needs and how long it takes.

Thanks karussell! Unfortunately, reducing the heap has often led to crashes as well.

My config is quite close to the default, with only city speed calculations added, no elevation. The normal imports all run easily, only GTFS creates issues.

graphhopper:

  datareader.file: ""
  graph.location: /graphhopper/graph-cache
  gtfs.file: /gtfs-data/fv.zip,/gtfs-data/rv.zip,/gtfs-data/nv.zip


  ##### Routing Profiles ####
  profiles:
   - name: car
     custom_model: {
       "priority": [
         { "if": "!car_access", "multiply_by": "0" }
       ],
       "speed": [
         { "if": "true", "limit_to": "car_average_speed" },
         { "if": "urban_density == CITY", "multiply_by": "0.7" }
       ],
       "distance_influence": 90
     }

   - name: foot
     custom_model_files: [foot.json]

   - name: bike
     custom_model: {
      "priority": [
        { "if": "true",  "multiply_by": "bike_priority" },
        { "if": "bike_network == INTERNATIONAL || bike_network == NATIONAL",  "multiply_by": "1.8" },
        { "else_if": "bike_network == REGIONAL || bike_network == LOCAL",  "multiply_by": "1.5" },
        { "if": "mtb_rating > 2",  "multiply_by": "0" },
        { "if": "hike_rating > 1",  "multiply_by": "0" },
        { "if": "country == DEU && road_class == BRIDLEWAY && bike_road_access != YES", "multiply_by": "0" },
        { "if": "!bike_access && (!backward_bike_access || roundabout)",  "multiply_by": "0" },
        { "else_if": "!bike_access && backward_bike_access",  "multiply_by": "0.2" }
      ],
      "speed": [
        { "if": "true", "limit_to": "bike_average_speed" },
        { "if": "!bike_access && backward_bike_access", "limit_to": "6" },
        { "if": "urban_density == CITY", "multiply_by": "0.8" }
      ]
     }


  profiles_ch:
    - profile: car
    - profile: foot
    - profile: bike

  profiles_lm: []


  #### Encoded Values ####
  graph.encoded_values: car_access, car_average_speed, foot_access, foot_priority, foot_network, foot_average_speed, average_slope, hike_rating, mtb_rating, bike_priority, bike_access, bike_network, roundabout, bike_average_speed, country, road_class, foot_road_access, bike_road_access, urban_density


  #### Elevation ####
  # none


  #### Country-dependent defaults for max speeds ####
  # none


  #### Urban density (built-up areas) ####
  graph.urban_density.threads: 8
  graph.urban_density.residential_radius: 400
  graph.urban_density.residential_sensitivity: 6000
  graph.urban_density.city_radius: 1500
  graph.urban_density.city_sensitivity: 1000


  #### Subnetworks ####
  prepare.min_network_size: 200
  prepare.subnetworks.threads: 1

  #### Routing ####
  routing.snap_preventions_default: tunnel, bridge, ferry
  routing.non_ch.max_waypoint_distance: 1000000


  #### Storage ####
  import.osm.ignored_highways:  # None
  graph.dataaccess.default_type: MMAP

# Dropwizard server configuration
[...]
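For a test import along the lines suggested earlier (CH and LM disabled, a single profile), a reduced variant of this config might look like the fragment below. This is only a sketch: it reuses keys that already appear in the config above, and keeping foot as the single profile is an assumption, on the guess that public transit routing needs a walking profile for access and egress. All other settings are assumed to stay unchanged, and the existing graph-cache would likely have to be rebuilt after changing the profiles.

```yaml
graphhopper:
  # unchanged: datareader.file, graph.location, gtfs.file, graph.encoded_values, ...

  # single profile only (assumption: foot is kept for walking to and from stops)
  profiles:
    - name: foot
      custom_model_files: [foot.json]

  # no CH or LM preparation for the test import
  profiles_ch: []
  profiles_lm: []
```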

I found an old RAM stick and “upgraded” to 48 GB of slower RAM for now, with 32 GB of heap memory, and everything runs smoothly. I also agree that without GTFS, everything fits very easily on a smaller server.

Maybe it’s a developer question then, because it seems strange that the GTFS feeds alone eat up far more memory than the entire rest of the application. Strangely enough, the GTFS import runs comfortably in a few GB of RAM for almost the entire import process (tens of minutes), but then spikes to tens of GB within a second, after the records have already been processed (during getInterpolatedStopTimesForTrip). I have no idea whether there is an optimization hidden in there, though.

Question regarding CH (since I don’t use LM): does CH also include the public transport feed? Because the non-GTFS import is always quite easy.