Hi everyone! We’re having trouble processing the complete planet.osm.pbf file in our Kubernetes environment with turn_costs enabled in our configuration, and would appreciate some guidance.
Current Setup:
- GraphHopper version: 9.1
- Environment: Kubernetes (GKE)
- Node resources: 30 CPU cores, ~236GB RAM
- Input: Complete planet.osm.pbf file
- Java settings: -Xmx170g -Xms60g -XX:+UseG1GC
Specific Configurations:
# Java Settings
env:
  - name: JAVA_OPTS
    value: -Xmx170g -Xms60g -XX:+UseG1GC -XX:MaxGCPauseMillis=100 -XX:+HeapDumpOnOutOfMemoryError
  - name: GRAPHHOPPER_MEMORY_MAP_DATAACCESS
    value: "true"
  - name: GRAPHHOPPER_GRAPH_DATAACCESS
    value: "RAM_STORE"
# Profile Configurations with Turn Costs
profiles:
  - name: car
    turn_costs:
      vehicle_types: [ motorcar, motor_vehicle ]
      u_turn_costs: 120
  - name: truck
    weighting: custom
    turn_costs:
      vehicle_types: [ hgv ]
      u_turn_costs: 120
The Problem: We’re trying to process the complete planet.osm.pbf file. Our first attempts were terminated with OOMKilled errors. After adjusting memory settings, the process starts but still takes an extremely long time and occasionally fails. We’ve also noticed that using turn_costs considerably increases processing time and resource consumption.
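As a rough sanity check on the numbers above, here is a minimal sketch of how much node memory remains outside the Java heap. It uses the 236 GB node size and the -Xmx170g setting from our setup. The 10% non-heap overhead (metaspace, GC structures, thread stacks, direct buffers) is purely an illustrative assumption, not a measured value, and the function name `memory_headroom_gb` is hypothetical.

```python
def memory_headroom_gb(node_ram_gb, xmx_gb, non_heap_percent=10):
    """Estimate the memory left on the node after the JVM heap plus an
    assumed non-heap overhead. All values are in GB."""
    if node_ram_gb <= 0 or xmx_gb <= 0:
        raise ValueError("memory sizes must be positive")
    # ASSUMPTION: non-heap JVM memory is modelled as a flat percentage of -Xmx.
    non_heap_gb = xmx_gb * non_heap_percent / 100
    jvm_total_gb = xmx_gb + non_heap_gb
    return {
        "non_heap_gb": non_heap_gb,
        "jvm_total_gb": jvm_total_gb,
        # What is left for the OS, page cache, and other pods on the node.
        "headroom_gb": node_ram_gb - jvm_total_gb,
    }


# Values taken from the setup described in this post.
budget = memory_headroom_gb(node_ram_gb=236, xmx_gb=170)
print(budget)
```

The point of the sketch is that an OOMKilled event comes from the container exceeding its memory limit, which counts the whole JVM process and not only the heap, so the headroom figure matters as much as -Xmx itself.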
Questions:
- Are our Java memory settings optimized for processing the complete planet file considering the use of turn_costs?
- What’s the expected impact of turn_costs on memory consumption and processing time?
- Are there specific recommended configurations to optimize processing when using turn_costs?
- Would you recommend splitting the processing into regions? If so, what would be a good approach while maintaining turn_costs functionality?
- What are the recommended minimum hardware requirements for processing the complete planet file with turn_costs enabled?
We’re particularly interested in understanding how to balance the use of turn_costs (which is essential for our application) with efficient planet file processing. Any experience or tips on how to optimize this specific scenario would be highly valuable.
Thank you in advance for your help!