Reinforcement Learning to Address the VRP/TSP

I have a more abstract/futuristic question.
Recent advances in Reinforcement Learning (RL) seem to address the TSP and VRP in a way that challenges the traditional combinatorial optimization approach.

Here are some resources:

  1. A tutorial for TSP with RL.
  2. A recent paper that shows a method to solve VRP with RL.
  3. A more recent paper that shows a technique to shorten the training period.

What is your take?
Do you believe that RL can adapt to the size and complexity of a real-world navigation problem?

Very interesting. Is there already something available to play with?

Yep, the authors of the paper in item 2 published a GitHub repo.

I would be happy to hear your thoughts.