[solved] Saint Petersburg GTFS (smaller than Berlin's) needs 40 GB of heap memory. Why?

The Berlin & Brandenburg GTFS feed from the documentation example builds fine even on a laptop with the default parameters (-Xmx8g, i.e. 8 GB of heap space).
Now I have tried Saint Petersburg, a city with about the same population (5M+).
Here’s the feed: https://transitfeeds.com/p/saint-petersburg/826
I cut the OSM .pbf extract down to Saint Petersburg only, so it is just 32 MB.

Config yml:

graphhopper:
  datareader.file: /data/spb.osm.pbf  # 32Mb, russia from Geofabrik sliced by the city boundary with keep ways & keep relations.
  gtfs.file: /data/spb-lenobl.zip  # https://transitfeeds.com/p/saint-petersburg/826, latest (1 july)
  graph.location: /graphs/spb-transit
  graph.flag_encoders: foot

server:
  application_connectors:
    - type: http
      port: 8989
  admin_connectors:
    - type: http
      port: 8990

I tried running with 12, 24, and 32 GB of heap space, and every time it failed at the "Looking for inter-feed transfers" stage.

What makes this particular GTFS feed so heavy for GraphHopper?

Comparing the feed files:

Berlin http://transitfeeds.com/p/verkehrsverbund-berlin-brandenburg
61 MB compressed, 448 MB uncompressed, stop_times.txt 289 MB.

Saint Petersburg http://transitfeeds.com/p/saint-petersburg/826
35 MB compressed, 282 MB uncompressed, stop_times.txt 234 MB.

I thought stop_times.txt must be the cause, but it is actually smaller than Berlin's, so that’s not it.
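
File sizes alone can hide differences between feeds, so it may help to compare row counts for every table. This is only a minimal sketch: the function name `gtfs_table_stats` is made up for illustration, and it assumes the feed is a local zip of UTF-8 CSV files.

```python
import csv
import io
import zipfile


def gtfs_table_stats(zip_path):
    """Return {file name: (uncompressed bytes, data row count)} for a GTFS zip.

    Hypothetical helper for comparing feeds; not part of GraphHopper.
    """
    stats = {}
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            if not info.filename.endswith(".txt"):
                continue  # GTFS tables are .txt files; ignore anything else
            with zf.open(info) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                reader = csv.reader(text)
                next(reader, None)  # skip the header row
                rows = sum(1 for row in reader if row)  # count non-empty rows
            stats[info.filename] = (info.file_size, rows)
    return stats
```

Running it on spb-lenobl.zip and on the Berlin archive and printing both results side by side should show quickly which tables are populated in one feed but empty in the other.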

Last time it failed at this stage:

INFO  [2020-07-07 11:51:25,368] com.graphhopper.gtfs.GraphHopperGtfs: Looking for inter-feed transfers
web_1  | java.lang.OutOfMemoryError: Java heap space
web_1  | 	at java.base/java.util.Arrays.copyOf(Unknown Source)
web_1  | 	at com.carrotsearch.hppc.IntArrayList.ensureBufferSpace(IntArrayList.java:351)
web_1  | 	at com.carrotsearch.hppc.IntArrayList.insert(IntArrayList.java:173)
web_1  | 	at com.graphhopper.storage.index.LocationIndexTree$SortedIntSet.addOnce(LocationIndexTree.java:799)
web_1  | 	at com.graphhopper.storage.index.LocationIndexTree$InMemLeafEntry.addNode(LocationIndexTree.java:766)
web_1  | 	at com.graphhopper.storage.index.LocationIndexTree$InMemConstructionIndex.addNode(LocationIndexTree.java:901)
web_1  | 	at com.graphhopper.storage.index.LocationIndexTree$InMemConstructionIndex.addNode(LocationIndexTree.java:917)
web_1  | 	at com.graphhopper.storage.index.LocationIndexTree$InMemConstructionIndex.addNode(LocationIndexTree.java:917)
web_1  | 	at com.graphhopper.storage.index.LocationIndexTree$InMemConstructionIndex.addNode(LocationIndexTree.java:917)
web_1  | 	at com.graphhopper.storage.index.LocationIndexTree$InMemConstructionIndex.addNode(LocationIndexTree.java:917)
web_1  | 	at com.graphhopper.storage.index.LocationIndexTree$InMemConstructionIndex.addNode(LocationIndexTree.java:917)
web_1  | 	at com.graphhopper.storage.index.LocationIndexTree$InMemConstructionIndex.addNode(LocationIndexTree.java:917)
web_1  | 	at com.graphhopper.storage.index.LocationIndexTree$InMemConstructionIndex$1.set(LocationIndexTree.java:887)
web_1  | 	at com.graphhopper.storage.index.BresenhamLine$1.set(BresenhamLine.java:79)
web_1  | 	at com.graphhopper.storage.index.BresenhamLine.bresenham(BresenhamLine.java:45)
web_1  | 	at com.graphhopper.storage.index.BresenhamLine.calcPoints(BresenhamLine.java:75)
web_1  | 	at com.graphhopper.storage.index.LocationIndexTree$InMemConstructionIndex.addNode(LocationIndexTree.java:892)
web_1  | 	at com.graphhopper.storage.index.LocationIndexTree$InMemConstructionIndex.prepare(LocationIndexTree.java:870)
web_1  | 	at com.graphhopper.storage.index.LocationIndexTree.getPrepareInMemIndex(LocationIndexTree.java:224)
web_1  | 	at com.graphhopper.storage.index.LocationIndexTree.prepareIndex(LocationIndexTree.java:290)
web_1  | 	at com.graphhopper.gtfs.GraphHopperGtfs.importPublicTransit(GraphHopperGtfs.java:193)
web_1  | 	at com.graphhopper.GraphHopper.postProcessing(GraphHopper.java:934)
web_1  | 	at com.graphhopper.GraphHopper.process(GraphHopper.java:662)
web_1  | 	at com.graphhopper.GraphHopper.importOrLoad(GraphHopper.java:625)
web_1  | 	at com.graphhopper.http.GraphHopperManaged.start(GraphHopperManaged.java:125)
web_1  | 	at io.dropwizard.lifecycle.JettyManaged.doStart(JettyManaged.java:27)
web_1  | 	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
web_1  | 	at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
web_1  | 	at org.eclipse.jetty.server.Server.start(Server.java:407)
web_1  | 	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:117)
web_1  | 	at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97)
web_1  | 	at org.eclipse.jetty.server.Server.doStart(Server.java:371)
web_1 exited with code 1

[addition] Managed to compile on Xmx40g.

INFO  [2020-07-07 13:28:12,590] com.graphhopper.gtfs.GraphHopperGtfs: flushed graph totalMB:40960, usedMB:31562)

~40GB of RAM used when running it.

Solved the issue by fixing the date intervals (some start dates were in 2020, while the end dates were 2019-12-31)
and emptying the frequencies.txt file.
After that, 19 GB of RAM was used and the build took 3 minutes.
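
For anyone wanting to repeat these two fixes, here is a rough sketch. It assumes the date intervals live in calendar.txt (the post does not say which file was edited) and that "emptying" means keeping only the header row of frequencies.txt. Both function names are made up for illustration.

```python
import csv
import io
import zipfile


def inverted_calendar_ranges(zip_path):
    """List (service_id, start_date, end_date) rows where start is after end."""
    bad = []
    with zipfile.ZipFile(zip_path) as zf:
        if "calendar.txt" not in zf.namelist():
            return bad
        with zf.open("calendar.txt") as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            for row in csv.DictReader(text):
                start = (row.get("start_date") or "").strip()
                end = (row.get("end_date") or "").strip()
                # GTFS dates are YYYYMMDD, so comparing the strings orders them
                if start and end and start > end:
                    bad.append((row.get("service_id"), start, end))
    return bad


def write_feed_without_frequencies(src_zip, dst_zip):
    """Copy a GTFS zip, keeping only the header line of frequencies.txt."""
    with zipfile.ZipFile(src_zip) as src, \
            zipfile.ZipFile(dst_zip, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == "frequencies.txt":
                # assumption: an "empty" table is just its header row
                data = data.split(b"\n", 1)[0].rstrip(b"\r") + b"\n"
            dst.writestr(info.filename, data)
```

The first function only reports suspicious rows; deciding what the correct end date should be still has to be done by hand.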

After an experiment, I can say for sure that frequencies.txt was causing the problem. In Berlin’s GTFS it’s empty.
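
To get a feel for how much a populated frequencies.txt adds, one can estimate how many departures its rows stand for. In GTFS, each row describes a trip repeated every headway_secs between start_time and end_time. The sketch below just counts those repetitions. How GraphHopper handles them internally is not established in this thread, so treat the number as a rough indicator only; the function names are made up for illustration.

```python
import csv
import io
import math
import zipfile


def hms_to_seconds(value):
    """Convert a GTFS HH:MM:SS time (hours may exceed 24) to seconds."""
    h, m, s = value.strip().split(":")
    return int(h) * 3600 + int(m) * 60 + int(s)


def implied_departures(zip_path):
    """Return (usable frequencies.txt rows, departures those rows imply)."""
    rows = 0
    total = 0
    with zipfile.ZipFile(zip_path) as zf:
        if "frequencies.txt" not in zf.namelist():
            return rows, total
        with zf.open("frequencies.txt") as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            for row in csv.DictReader(text):
                headway = int(row["headway_secs"])
                span = hms_to_seconds(row["end_time"]) - hms_to_seconds(row["start_time"])
                if headway <= 0 or span <= 0:
                    continue  # skip rows that cannot describe any departure
                rows += 1
                # departures at start, start + headway, ... strictly before end
                total += math.ceil(span / headway)
    return rows, total
```

A feed with an empty frequencies.txt, like Berlin's, returns `(0, 0)` here.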
