Running the routing engine on GraalVM, OpenJDK8 and OpenJDK10, +Shenandoah GC, +ZGC

Today I heard about the GraalVM 1.0 release, and as we already use OpenJDK8 with and without the Shenandoah GC in production, I thought it was time for a simple speed comparison.

I ran some performance tests with 1GB RAM against the latest release, 0.10.1:

./graphhopper.sh measurement berlin-latest.osm.pbf

Here are the results for the recent explosion of JVM options for production use:

| | OpenJDK8 (161-b12) | OpenJDK10 | OpenJDK8+Shenandoah 04-17-2018 | OpenJDK10+Shenandoah 04-17-2018 | OpenJDK11+ZGC 03-15-2018 | Graal RC1 |
|---|---|---|---|---|---|---|
| route (ms) | 12-13 | 11-14 | 12.7-13.8 | 13.4-13.7 | 10.7-12 | 12.6-13 |
| route CH (ms) | 1.30-1.33 | 1.14-1.16 | 1.53-1.60 | 1.35-1.37 | 1.12-1.28 | 1.30-1.38 |
| route LM4 (ms) | 1.48-1.55 | 1.33 | 1.70-1.80 | 1.53-1.61 | 1.30-1.42 | 1.49-1.64 |
| prepare CH (s) | 7-10 | 8 | 9 | 10 | 9 | 7-8 |

Every run was executed 3 times and the average was taken, using the default settings of the measurement action. Everything ran on a plugged-in T460 Linux laptop.

The import times varied too much between runs, probably because of the disk cache or the short running time, so I have not listed them here.

So JDK10 is roughly 10% faster than JDK8, and the JVMs with Shenandoah are roughly 10% slower. Nothing really surprising, and all differences are small.
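To make the rough percentages easier to check, here is a minimal sketch of the arithmetic for the "route CH" row. It assumes the midpoint of each reported range as the representative value, which is my simplification and not necessarily how the averages were formed. The class and method names are made up for illustration.

```java
public class RelativeDiff {
    /** Midpoint of a "low-high" range from the table (assumed representative value). */
    public static double midpoint(double low, double high) {
        return (low + high) / 2.0;
    }

    /** Percentage change of value relative to baseline; negative means faster. */
    public static double percentChange(double baseline, double value) {
        return (value - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // "route CH (ms)" values taken from the table
        double jdk8 = midpoint(1.30, 1.33);
        double jdk10 = midpoint(1.14, 1.16);
        double jdk8Shenandoah = midpoint(1.53, 1.60);

        System.out.printf("OpenJDK10 vs OpenJDK8: %+.1f%%%n",
                percentChange(jdk8, jdk10));
        System.out.printf("OpenJDK8+Shenandoah vs OpenJDK8: %+.1f%%%n",
                percentChange(jdk8, jdk8Shenandoah));
    }
}
```

The other rows can be checked the same way by swapping in their ranges. The per-row results differ, so the 10% figures are best read as a rough summary across rows.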

Of course, under load and with production settings and much larger heaps this will be more interesting and look completely different. To see the value of Shenandoah, ZGC or Graal one should also measure something like the maximum observed pause time. Graal is already recommended for production load, but I couldn’t find a binary release of 1.0. VMs with Shenandoah and ZGC are currently not marked as production ready, although several users (including us) have reported using Shenandoah under production load. Here are some real values of Shenandoah compared to G1 from 2017.

Here is a FOSDEM 2018 talk about ZGC (or video), here is a technical intro video for Graal, and here is one for Shenandoah.
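One simple way to approximate a maximum observed pause time is a probe that sleeps for a short interval in a loop and records how much longer than requested each sleep took. The following is only a sketch of that idea and is not part of GraphHopper's measurement action. The class name and parameters are hypothetical, and the overshoot also includes OS scheduler jitter, so it is an upper bound on stalls rather than a pure GC pause figure.

```java
import java.util.concurrent.locks.LockSupport;

public class PauseProbe {
    /**
     * Parks the current thread repeatedly for intervalNanos and returns the
     * largest observed overshoot (actual minus requested time) in nanoseconds.
     */
    public static long maxOvershootNanos(long intervalNanos, int iterations) {
        long max = 0;
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            LockSupport.parkNanos(intervalNanos);
            long overshoot = System.nanoTime() - start - intervalNanos;
            // parkNanos may return early, so negative overshoots are ignored
            if (overshoot > max) {
                max = overshoot;
            }
        }
        return max;
    }

    public static void main(String[] args) {
        // 1 ms interval, 1000 iterations: roughly one second of probing
        long maxNanos = maxOvershootNanos(1_000_000L, 1000);
        System.out.printf("max observed stall: %.3f ms%n", maxNanos / 1e6);
    }
}
```

For a real comparison the probe would run in a background thread while the routing load is running, once per JVM and GC configuration.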

Update: Graal EE resulted in the following numbers:

route (ms)     : 19
route CH (ms)  : 1.39
route LM4 (ms) : 1.7
prepare CH (s) : 8-9

Normal routing is much slower than with Graal CE or OpenJDK8, which is a known limitation of the current RC.


Some new results with Oracle jdk-8u201, Graal RC12 (EE+CE), OpenJDK v11.0.2 (with the default G1, ParallelGC and ZGC), a recent custom build of JDK11 with Shenandoah, and OpenJDK 12 RC1 (same GCs). These were taken with our improved measurement environment on an Intel i7-3770 with -Xmx6g, using the Germany data set.

Some notes:

  • prepare and import are where throughput is important; ZGC or Shenandoah wouldn’t make sense there and can be ignored
  • routing has heavy RAM usage (many smaller new objects)
  • everything is single-threaded (we need some good concurrent load tests in the future for all routing* scenarios)
  • for import_time the unit is seconds not ms
  • all tests are executed 4 times (every single routing* test is run a thousand times, see the Measurement class for details)
  • there are two “x_compare” measurements ensuring that the measurement is reproducible
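The idea of running each test many times and checking reproducibility with two identical measurements can be sketched as follows. This is not the code of the Measurement class. The class, method names and the dummy workload are hypothetical, and JIT warm-up is not handled.

```java
public class RepeatedTiming {
    /** Runs the task the given number of times and returns the mean time per run in ms. */
    public static double meanMillis(Runnable task, int repetitions) {
        long totalNanos = 0;
        for (int i = 0; i < repetitions; i++) {
            long start = System.nanoTime();
            task.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1e6 / repetitions;
    }

    /** Relative difference of two measurements of the same thing, between 0 and 1. */
    public static double relativeDeviation(double a, double b) {
        double larger = Math.max(a, b);
        if (larger == 0) {
            return 0;
        }
        return Math.abs(a - b) / larger;
    }

    public static void main(String[] args) {
        // Dummy workload standing in for a single routing query
        Runnable task = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 1000; i++) {
                sb.append(i);
            }
            if (sb.length() == 0) {
                throw new IllegalStateException("unexpected empty result");
            }
        };
        double first = meanMillis(task, 1000);
        double second = meanMillis(task, 1000);
        System.out.printf("first=%.4f ms, second=%.4f ms, deviation=%.1f%%%n",
                first, second, relativeDeviation(first, second) * 100);
    }
}
```

If the two identical measurements deviate by about as much as the difference between two JVMs, that difference should not be trusted.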

Some thoughts:

  • Graal has huge problems with routing
  • JDK11 and 12 (with the default G1) have problems with routingLM8
  • Graal EE is not always faster than CE
  • The biggest difference is in prepare.ch.time, where Graal EE is the best with 525s while e.g. JDK11 takes over 663s