Support for other coordinate systems?


#1

I’ve been hacking around with a custom map-matching solution, and noticed that there’s a lot of pain associated with polar coordinates. For example, calculating the distance between two points becomes non-trivial, and not only makes code harder to follow (especially when optimizations are used), but I imagine it slows things down too. (How much, I’ve no idea, but I’d be very interested in the results.)

So I was wondering if different coordinate systems have been considered - specifically, cartesian/planar ones? From what I’ve seen, most of the individual machinery is there (e.g. different metrics, etc.) but I’m not sure whether it’d all tie together nicely.

It’s a pretty unlikely use case (as you’d have to convert the OSM data first), but it may be worth it if the performance gains are worthwhile.


#2

For example, calculating the distance between two points becomes non-trivial
and not only makes code harder to follow (especially when optimizations are used)

You mean the normalized distance comparisons? What is ‘a lot of pain’ there?

but I imagine it slows things down too. (How much, I’ve no idea, but I’d be very interested in the results.)

Well, one has to measure everything, JVM is usually very fast with computation, and even the normalization makes everything only a bit faster.

So I was wondering if different coordinate systems have been considered

Not yet

specifically, cartesian/planar ones?

For some calculations a faster ‘planer’ approximation can be used and is used.


#3

OK, that may have been a slight exaggeration, given the complexity of the entire project. It’s more that any geometric calculation becomes more difficult (distance between two points, intersection of line and circle, bearing, etc.) - which it harder to follow/debug/develop (for me anyway).

It sounds like you don’t think there’ll be much performance gain, so there might not be much point. If it’s easy enough to figure out, I might have a crack.


#4

It would be a nice but very time consuming test to find this out which source has to change and where one can boost speed. Probably one could try for LocationIndexTree first to see if and how much it is faster. And everything which makes the project simpler and/or faster by at least 5 or 10% is very welcome :slight_smile:

which it harder to follow/debug/develop (for me anyway).

Where possible I tried to make this explicit via the DistanceCalc tool.


#5

I’ve noticed that, and I’m hoping it means it’ll be easier to implement a new method without mucking up too much of the existing code. Nice forethought = )

OK. I’ve already been hacking around in LocationIndexTree, but I might just try everything (as e.g. I probably want to save routes in planar coordinates too). I imagine the speed gains (if any) will be more relevant in the map-matching, where it’s dealing with a lot more points. I’ll have a look at implementing today, as planar coordinates might significantly simplify my “find points at a radius” solution … and speed might be a nice side benefit.


#6

You are very correct! I’ve hacked it to run with planar (NZTM) coordinates, and it wasn’t pretty, or fun. That said, half of my time was actually spent in converting the OSM data, which was also very hacky - I could convert OSM -> planar easily using ogr2ogr (in e.g. sqllite format), but then converting that to osm/pbf for GH was very painful.

Anyway, to brass tacks. I created a test of about 5000 A->B random routes (from Auckland NZ). I did three tests:

  • A: using defaults (i.e. CH).
  • B: no CH, plus turn costs, plus ALT_ROUTE algorithm + alternative_route.max_paths=10, but only for the first 100 routes.
  • C: no CH, plus ASTAR algorithm, but only for the first 100 routes.

Timings

With planar:

A: 3.14s (sd=0.07s) with 99 errors - all 'Connection between locations not found'. [samples: 3.11, 3.10, 3.07, 3.34, 3.17, 3.13, 3.17, 3.09, 3.07, 3.09, 3.15, 3.17. 3.15]
B: 13.8s (sd=1.0s) with 0 errors. [samples: 13.37, 13.35, 13.41, 15.71, 13.42]
C: 2.29s (sd=0.07s) with 0 errors. [samples: 2.51, 2.27, 2.31, 2.25, 2.26, 2.27, 2.22, 2.26, 2.28, 2.28, 2.30, 2.26, 2.28]

Without:

A: 3.78s (sd=0.03s) with 0 errors. [samples: 3.77, 3.73, 3.80, 3.82, 3.74, 3.80, 3.85, 3.79, 3.80, 3.77, 3.82, 3.75, 3.78]
B: 12.8s (sd=0.7) with 0 errors. [samples: 12.09, 13.40, 13.26, 12.00, 13.32]
C: 2.07s (sd=0.03s) with 0 errors. [samples: 2.07, 2.12, 2.11, 2.07, 2.09, 2.08, 2.08, 2.06, 2.04, 2.11, 2.01, 2.07, 2.06]

Apples and oranges

One the one hand, it’s same computer, and I’ve tried to ensure variables are the “same” (e.g. minResolutionInMeter is same for both). On the other, I’ve done a lot of stuff which I don’t fully understand. For example, in intToDegrees I just guessed and changed DEGREE_FACTOR to 10f, or in bresenham.calcPoints I just left the hardcoded value at 0.1 (if it was too big, stuff seemed to be missing from the tree … though I thought it would’ve needed to increase for these coordinates). I also did lots of manual hacks while testing and trying to get it to work … some of which may still be lying round = ) Anyway, they might all contribute to the 99 errors above, and the slowness of the non-CH approaches.

Oh, I also removed all of the normedDistance stuff (as it was confusing) … but I’ve now realised I should have left it in (and just returned the square of the planar distance). Not sure how much that’s slowed it down, but I doubt it’s too much.

Conclusions

Firstly, it needs more investigation by someone who can make a fair comparison! That said, if those errors aren’t falsely speeding it up*, and if I haven’t done something silly to give misleading results, then it looks pretty good for CH. Especially as it’s essentially a net gain (both in terms of speed / simplicity) for “free” (i.e. no algorithmic changes required, just using a simpler coordinate system). [In my case, it allows me to do some stuff which I’m not sure I could in WGS84.]

As for the other algorithms - I’m not sure what’s going on there, as I can’t see how simpler calculations would have slowed it down. So I’m guessing it’s to do with how I utilised some of the internal data structures (which maybe didn’t affect the CH graph), or one of the issues above.

Where did the speedups occur? Mostly just replacing distance etc. calculations with simple Pythagorus, plus a few simplifications e.g. not needing to check for longitude boundary crossings.

Could it be useful? Maybe. If the data used is covered by an (accurate) planar coordinate system, then it’s possible. However, as OSM only supports WGS84, we’d either need to implement coordinate conversion on the import (which shouldn’t be pretty trivial with e.g. Proj4J), or find an alternative tool for changing the coordinates in the OSM data.

How hard to implement (well)? I don’t really know, but I don’t think it’d be that bad. All I really added was a new DistanceCalc for planar coordinates. The rest would just be tidying up the code to be e.g. have a configurable DistanceCalc etc., and maybe changing lat, lon, ele to x, y, z, or similar.

Finally, I’m assuming most of the speedup occurs in the location lookups. With that in mind, it’s possible it’ll have a greater impact in map-matching.

@karussell, what do you think?

  • @karussell, do you know whether they’d speed it up or slow it down? At most the speedup would be like ignoring 99 / 5000 tests, which doesn’t account for all of the speedup.

#7

Nice that you get it working :slight_smile: !

Measuring on the JVM is not that straightforward, you need to warmup the JVM first and take a few hundred routes, for CH I use even 5K. E.g. have a look into Measurement.java which can be started easily via ./graphhopper.sh measurement your.pbf

Regarding the connection not found -> see this discussion as it could be related: 0.5.0: Connection between locations not found

but I’ve now realised I should have left it in (and just returned the square of the planar distance).
Not sure how much that’s slowed it down, but I doubt it’s too much.

One always has to test this, those sometimes confirms my guesses but sometimes I was already proven wrong :slight_smile: … the JVM does sometimes tricky optimizations which one cannot guess.

and if I haven’t done something silly to give misleading results, then it looks pretty good for CH.
Especially as it’s essentially a net gain (both in terms of speed / simplicity) for “free”

Yes, indeed nice.

All I really added was a new DistanceCalc for planar coordinates.

Ok, hopefully I have not worked around it somewhere …

Finally, I’m assuming most of the speedup occurs in the location lookups.

The location lookup is also measured in ‘measurement’ separately


#8

Right. It’s entirely possible that the method I used to convert the coordinates did not preserve all the ways, etc. I’d really want to make sure that’s perfect first (now I think I really should’ve just used Proj4J from the start), as if it isn’t there’s not much point testing.

With that in mind, @karussell, do you think it’s worth pursuing? As above, I’m unsure of the benefits for most use cases (e.g. I don’t think you’d get planar coordinates covering e.g. the US or Germany). I’ve spent a bit longer than I should have on this, so I’ll only pursue further if you think it could genuinely add value.


#9

You mean you can apply your method only for smaller regions? Then the afford is probably not worth enough but it is your decision


#10

Correct. For those in small countries which have planar coordinates (e.g. NZ like me), then it’d be useful. I’ll leave it for now - if others express interest, we can look at implementing it then. Anyway, it’s been fun, and a useful way to learn a bit more about GH.


#11

What I now do not understand: you are using just NZ and you have response times with CH in the range of seconds? Even without CH it should be <0.5s! Or are you using memory mapped data access?


#12

I don’t know, sorry - I’m just following the standard importOrLoad method (though I’m pretty sure RAMIntDataAccess.java is the one being used).

I think there’s a miscommunication - It’s 3.78s for all 4950 routes i.e.less than a millisecond per route.


#13

FYI while playing round with measurement of map-matching, I realized I could use it for the above. Results:

####planar
measurement.time=23311

polar

measurement.time=29315

So about a 20% faster.


#14

@kodonnell There are indeed cartesian coordinate systems for large regions, e.g. the european EPSG:25831, 25832…. In my case, the whole project is set up to use cartesian coords. As you’ve already mentioned, it simplifies a lot of calculations, debuging and stuff. Currently, I am forced to convert the network and all the routing requests to WGS84. The path calculated by graphhopper is then again transformed back into the cartesian… So, yes, I would be definitely be interested.


#15

You’re right @neuma - I’d forgotten, but I think we usually work in planar coordinate and have to convert to/from WGS84 to use GH (which isn’t that painful, but did mean investing in a nice way of doing that.) I guess there are two important questions for the community:

  • how many use GH for a “small” region that can be covered by a single planar coordinate system? E.g. NZ, or London, etc.
  • how many use non-WGS84, by default?

Personally, I don’t think it’d be too hard (apart from time) to:

  • support reading OSM in different coordinate systems
  • use distance calcs appropriate to coordinate system

Thoughts, @karussell?


#16

This will make things more complex (but I do not know how much maybe as @kodonnell suggested it is not that much?) and I do not see much gains and there are currently only two interested in this :wink: so I would postpone it for now.


#17

Not too much, as most of the framework already supports it (e.g. different types of distance calcs, etc.). There’d probably have to be some slight refactoring of some of the API (where WGS84 is baked in), but it’ll be superficial - the coordinate system doesn’t affect the ultimate network and routing stuff. I suspect the main challenge would be in reading OSM (which really only supports WSG84) into different coordinates. Though even that’s probably not too hard.

I tend to agree, though if I have some time, it’d be a fun addition. For me, the original reason was performance, and there are bigger fish to fry in that area.