Tighter integration of Viterbi algorithm into map-matching?

In short, I’ve found considerable performance gains in more tightly integrating the Viterbi algorithm into map-matching - roughly a 5-10x performance increase. Much of this came from using map-matching-specific knowledge to replace generic data structures. For example, I assign an index to each candidate as it is created, which means maps aren’t needed (we can just use arrays, with the lookup being candidate.idx). And when it comes to transitions, instead of having e.g. HashMap<Transition, Double>, we can use a TLongDoubleHashMap, where the long key is e.g. a pairing function on the from/to candidate.idx. There are a raft of others (which border on optimising too far).
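
To illustrate the transition-key idea, here’s a minimal sketch (the class and method names are mine, not from the actual code), assuming Trove’s TLongDoubleHashMap and 32-bit candidate indices:

```java
import gnu.trove.map.hash.TLongDoubleHashMap;

// Sketch only: replace HashMap<Transition, Double> with a primitive map,
// keyed by a pairing function over the from/to candidate indices.
public class TransitionProbabilities {
    private final TLongDoubleHashMap logProbs = new TLongDoubleHashMap();

    // Pairing function: high 32 bits hold from.idx, low 32 bits hold to.idx.
    private static long key(int fromIdx, int toIdx) {
        return ((long) fromIdx << 32) | (toIdx & 0xFFFFFFFFL);
    }

    public void put(int fromIdx, int toIdx, double logProb) {
        logProbs.put(key(fromIdx, toIdx), logProb);
    }

    public double get(int fromIdx, int toIdx) {
        // Trove returns its no-entry value (0.0 by default) for missing keys.
        return logProbs.get(key(fromIdx, toIdx));
    }
}
```

The point of the primitive map is avoiding boxing and the Transition object’s hashCode/equals on every lookup, which is presumably where much of the win comes from.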

In addition, it was easier to use some of the Viterbi internals for further optimisation. For example, I no longer test a ‘from’ candidate (i.e. calculate all of its routes to ‘to’ candidates) if it cannot have been reached thus far. To be fair, this is probably possible with the current hmm-lib too. A rough sketch of the idea follows.
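
Assuming a forward pass over log-probabilities where Double.NEGATIVE_INFINITY marks a candidate that was never reached (all names here are hypothetical):

```java
// Sketch only: skip routing out of 'from' candidates the Viterbi forward
// pass could not have reached; their extensions can never win anyway.
class Candidate {
    final int idx;
    Candidate(int idx) { this.idx = idx; }
}

class ViterbiStep {
    // forwardLogProbs is indexed by candidate.idx (see the array trick above).
    void expand(Candidate[] previous, Candidate[] current, double[] forwardLogProbs) {
        for (Candidate from : previous) {
            if (forwardLogProbs[from.idx] == Double.NEGATIVE_INFINITY)
                continue; // unreachable so far: don't calculate any routes from it
            for (Candidate to : current) {
                // ... calculate the route and transition probability from 'from' to 'to'
            }
        }
    }
}
```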

Anyway, the possible benefits:

  • performance as above.
  • ease of development … given the generic typing in hmm-lib, I found it hard to develop from a map-matching point of view, whereas with everything typed concretely as e.g. a GPXEvent, it made more sense.

The possible downsides:

  • ease of development … it means we have to maintain the Viterbi algorithm in addition to the rest of the map-matching code, and it also makes the code base more complicated for new developers to understand.

Overall, I’d probably recommend sticking with the current approach of using the separate hmm-lib, as I doubt others have my performance requirements (so there’s little to be gained but much to be lost). However, I thought it worth noting here in case it’s of use to anyone, or others regard it worthy of further discussion.

Cc @karussell, @stefanholder.


Sounds interesting. Maybe we can learn from this and somehow make the hmm-lib (or the coupling to it) more efficient?

To be fair, this is probably possible with the current hmm-lib too.

Yeah, we should try to keep them separated but apply improvements if possible.

Overall, I’d probably recommend sticking with the current approach of using the separate hmm-lib, as I doubt others have my performance requirement

Well, performance is very important for map matching, because one can have millions of tracks to do this. But tightly coupling the lib to GH also means losing the development progress of hmm-lib. I can imagine we can still get some perf gains without this coupling. Maybe you can analyze which small part of your changes was the most significant and we can try to solve this with the separate hmm-lib?

Well, performance is very important for map matching, because one can have millions of tracks to do this.

That’s our case - and each track may have tens of millions of candidates as well. However, I just realised that my comparison was misleading, sorry - I’ve got a lot of other optimisations, most importantly using weight matrices, which make the routing cost negligible (and hence the total speed a few orders of magnitude faster). So while I get a 5-10x performance increase, it’s likely to be much less (relatively speaking) for the current code base. How much? I don’t know = )

we try to solve this with the separate hmm-lib?

Good idea - it should be easy enough to fork and play, especially with the measurement suite. Let’s see what @stefanholder thinks. That said, at this stage this is all low priority for me - still working on sequences.

You mean precalculated matrices of all nodes, or avoiding path.extract?

The former - see https://github.com/graphhopper/graphhopper/issues/904. For context, a full weight matrix for my country (which takes four hours to build, single-threaded) would be on the order of 100G (stored as chars). After some fooling around, I can compress it to about 8G if I set all travel times > 1 hour to zero, as I’m not interested in map-matching between candidates more than an hour apart. This is configurable - an hour is quite long (but appropriate for us), and if the cut-off were e.g. 1 minute - which is reasonable for many tracks - I’m pretty sure the matrix would be under 1G. (I could easily add more compression, but that means more CPU time.)
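
As a rough sketch of the cut-off idea (the storage format and names here are my assumptions, not the actual implementation): a char holds up to 65535 seconds (~18 hours), so travel times fit, and zeroing everything beyond the cut-off leaves a mostly-zero matrix that compresses well:

```java
// Sketch only: store travel times as chars, zeroing anything beyond the
// cut-off so the matrix is mostly zeros and compresses well.
public class CutoffWeightMatrix {
    private final char[][] seconds;   // seconds[from][to]; 0 means "beyond cut-off"
    private final int cutoffSeconds;

    public CutoffWeightMatrix(int numNodes, int cutoffSeconds) {
        this.seconds = new char[numNodes][numNodes];
        this.cutoffSeconds = cutoffSeconds;  // e.g. 3600 for a 1 hour cut-off
    }

    public void put(int from, int to, double travelTimeSeconds) {
        // clamp genuine zero-second times to 1, so 0 stays reserved for "beyond cut-off"
        if (travelTimeSeconds <= cutoffSeconds)
            seconds[from][to] = (char) Math.max(1, Math.round(travelTimeSeconds));
    }

    /** @return travel time in seconds, or -1 if the pair is beyond the cut-off. */
    public int get(int from, int to) {
        char t = seconds[from][to];
        return t == 0 ? -1 : t;
    }
}
```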

Anyway, assuming a 1 hour cut-off (i.e. 8G), with a bit of trickery a buffered instance for a particular map-matching job is usually less than 1G - and remember that’s for my much-larger-than-normal number of candidate nodes. (If we’re dealing with e.g. 1 minute cut-offs and a ‘normal’ number of candidates, then memory would be pretty insignificant, I think - though normal routing is also a proportionally smaller cost over such short distances/numbers of candidates.) The upshot is that the total “routing” time (decompressing and building the buffered view) is now less than the Viterbi processing time.

It’s a possibly generalisable idea, and while useful for ‘big data’ situations, it adds complexity and hassle (and is less beneficial) in ‘normal’ situations. Best to start a new thread if we want to discuss it more.