Job/Activity refactor (general discussion)

Hi Stefan and others!

I created this separated topic to separate my original problem with the upraised need of making jsprit more flexible regarding Jobs and Activities.

My original post on this topic was:

I saw, that the code often uses instanceof at several places to make Job type dependent tasks. Using instanceof intensively (in code where decision is made) is a sure sign that the encapsulation is somehow violated. It leads to extension-anomaly: introducing a new class in the inheritance tree would be a nightmare, because lots of class implementations would have to modify. (Additionally, when these instanceof-driven if-branches are not terminated by an “else throw exception” branch, even finding out all the places to contribute into would be hard.) Why is this dangerous? Because what these codes do is really a type implementation-specific function, so their right place would be in the type implementation (or in a class factored from the implementation).

I offered my help for the refactor to Stefan (on private), and started to do some behavoiur-equivalent refactoring already. But as Stefan suggested, if we take the time and step over the limitations, we should think it over first.

Here are some of my thoughts (as a initiation of a discussion):

There are two types we are talking about: Jobs and Activities.

As far as I see, there are 4 types of activities:

  • Pickup: where there is a cargo load, so capacity is reserved
  • Delivery: where there is a cargo unload, so capacity is freed up
  • Service: where there is no cargo manipulation (althogh there could be cargo requirement), so there is no change of capacity
  • Exchange (beter name required): where there is both a cargo unload and load, so the capacity could either be reserved or freed.

As far as I see, all job could be broken down to this four activities, and these are differ mainly on the type of cargo manipulition.

Jobs, on the other hand, is something which describes a business use case. Service and Shipment is two examples, but Shipment with backhaul is a third one, and I’m sure there are still others. This is the interface the user may extend.
I tried to collect the (abstract) properties of the jobs:

  • They all break down to a well defined set of activities (with order defined)
  • They should be handled by the same vehicle on the same route

The main aspect of this interpretation of the meaning of Job and Activity, that only the Jobs should be offered for extension.

This means that to achive flexibility and extensibility all we need to do is allow the user the following logical steps are needed:

  1. Define the four activities
  2. Give the responsibility of breaking down jobs to activities to the Job implementations (I’ve already prototyped it in my fork of the code and it cleaned up a lot of code.)
  3. Clean up the tight coupling of Job by eliminating instanceof references
  4. Make the internal constraints/insterter/ruiner implementations to keep together and in the right order of the activities of any Job. (This is abstract, so any new job will fit in well. Even beter, that this requirement is simple – I don’t know if the implementation is the same)

As it seems, the internal constraint/cost implementations should mainly concentrate on intra-job relationships. All inter-job relationships best to left to the user to define in the future, too.

However, there are some questions this solution raise up, especially about the boundaries of the Jobs:

Is there a need to support not completely sequencial activities within a Job or can they be represented with several jobs and constraints between then?
An example: do we need to support a Job, where there are two pickup activities and a delivery/service. The pickups should preceed the delivery, but the order of two pickups are indifferent. Or is it three Jobs with constraints?
The first is much more flexible, but the internal implementation of the internal part would be much more complex (as far as I feel).

Do we need to support a stronger ordering among the activities of a Job, where the immediate neighbourhoodness is required?
I feel that the answer is no, because this could always be united into one ability, and would be hard to handle them (keeping them together).

So this is my first though on the subject.

I am looking forward on your ideas and your critics. :wink:

UPDATE1: Of course, although I forgot, there is a fifth Activity type: Internal, which is the common ancestor of the specialized activities used internal, such as Break.

1 Like

About the activity hierarchy

I made some deeper analyzis about the activities. As far as I see, now the different activities (for example, DeliveryShipment, DeliveryService, etc) of the same type (DeliveryActivity) are only separated, because it takes some information from the associated Job directly, and it does it differently.

These data, delegated to the job:

  • location
  • operation time
  • size (capacity change)
  • the job itself

The Jobs are immutable classes, so copying these data at creation time onto the activity would not cause any synchronization anomaly, but it would allow to eliminate mostly parallel implementations or the need to extend Ability hierarchy, when new Job types are introduced.

The JobActivity structure would be as simple as:

Thanks a lot for opening this discussion. I really appreciate it since I feel that it is really worth to discuss this.
A few thoughts to activities:
I think we even do not need Service anymore. It might jus be a Pickup or Delivery where nothing is actually loaded/unloaded. I`d prefer to model it as Delivery in the sense of deliver a service. Why do you think we cannot replace Exchange with Delivery AND Pickup? When it only comes to load/unload cargo, we could even boil Pickup and Delivery down to a single Activity where we allow negative capacityDemand also. However, we still need to make sure that it remains simple to model a VRP. Another interesting activity is for example PickupCapacity and DeliveryCapacity if a trailer can be picked up.
I will come up with further thoughts soon.

1 Like

You are right. (You have a much deeper insight of the mechanism.)
If one follows your logic, there is not even the need to make separated activities for pick up and delivery, because the positive/negative capacity could provide the same effect. :wink:
Distinguishing Service, Pickup and Delivery (and the others) has only one reason to do: makes it easier the model to make and the solution to read. If the Job instance creates only a single type of Ability and only the size would tell whether it is a pickup, a service or delivery, it would be easy to do it wrong, harder to trace or debug.
They have the common ancestor, so keeping the distinct types doesn’t give any additional complexity to the code using them: use the ancestor when you don’t need to make any selective decision, use the specific class when you need. (The class schema in my previous post suggests the same: all functions are in the ancestor level.)

For your other question: why to meld a pickup and delivery, which has to be done immediately after each other, into a single activity? To ease work of the optimizer: don’t need to manually keep the two activities together. As far as I was able to find out, it would required the ruin and insert methods to handle them together, which is an avoidable complexity in them.)

A quick question: what is the required compliancy level of the jsprit code?
As far as I see from the code, it is Java 6.
Don’t you thought about moving Java 8?
Now, when even Java 7 is revoked for more than a year, and we are only a year from Java 9, Java 8 is the common version.
Java 8 is a major release (as important as the Java 5 was with the introduction of generics) and it could offer many useful new features: default interface implementation, lambdas, stream framework.

1 Like

No compliance. We can switch to Java 8.

Thus, we need a general job where we can add a number of existing activities, e.g. a job is then just a container for related activities. Your case can then be modelled by a Job that has a PickupActivity and a DeliveryActivity. However, the DeliveryActivity would not change capacity. Therefore, for each activity we should be able to specify capacity demand, etc… This means that time windows, capacity, service times need to be moved to activities here.

Thus, we need a general job where we can add a number of existing activities, e.g. a job is then just a container for related activities. Your case can then be modelled by a Job that has a PickupActivity and a DeliveryActivity. However, the DeliveryActivity would not change capacity. Therefore, for each activity we should be able to specify capacity demand, etc… This means that time windows, capacity, service times need to be moved to activities here.

This is right the thing I proposed in my first entry: Jobs are for grouping activities that should be done on the same route, keeping order.
The better: for Friday I made these changes on my code fork. I had no time to document it, that’s why I didn’t presented my solution for you, yet. But tomorrow the first thing would be to document and introduce my proposal. As a teaser, here is what I’ve acomplished so far:

  • Refactored the Jobs and Abilities according this topic.
    • The jobs now easily extendable.
    • The jobs now responsible for providing the activites they are built up from.
    • There are technically a fixed number of activities.
    • The activities now contain all information, so no delegation to jobs needed (although one can get the related Job), so the need of having different Activities for each jobs isn’t needed any more.
  • Temporally still kept the old activities (but postfixed their name with DEPRECATED), but now they a simple extension of the new ones. (This allowed me to run unit test without the need to change all the inserters, constraints and all, so I was able to test, that the refactor is identical.) At Friday evening I was able to run all the unit tests correctly (except one, which gives me a mockito error I can’t figure out).

Let me consolidate the code, make some docs and I’ll present the code tomorrow!

1 Like

So here is the first version of the refactor.

I focused mainly on the structural refactor and didn’t bring the changes over the whole code, especially not in deeper algorithm code (such as strategies, calculators, etc.) The main goal was to create the new structure and let it discuss and validate.

Let’s see, what changed:


Jobs went through a main refactor. With the new interpretation, the Job is a user-defined, bussiness-level expression, defining a task to do. It contains one or more activities. These activities should be executed on the same route and on the defined order.

The job is meant to be extended as new user requirements arise.

There are several predefined jobs:

And what’s in the root implementation?

public abstract class AbstractJob implements Job {

    private int index;
    protected List<Location> allLocations = new ArrayList<>();
    private JobActivityList activityList;

public AbstractJob() {
    activityList = new SequentialJobActivityList(this);

public int getIndex() {
    return index;

public void setIndex(int index) {
    this.index = index;

protected void addLocation(Location location) {
    if (location != null) {

public List<Location> getAllLocations() {
    return allLocations;

protected abstract void createActivities();

public JobActivityList getActivityList() {
    return activityList;

The job is extended with two new responsibilities:

  • it enlists all the locations it interacts with
  • it creates and provides all the activities it consists of


It is an interface to group job activities and to serve requests on them. Now there is only one implemenration SequentialJobActivityList, which contains the activities as a strictly ordered chain. (The internal implementation is hidden, so any future, more sophisticated implementations are possible.)


I kept the JobActivityFactory interface. There were several dozen occurencies of implementation, but all of them fall into one of the following ones:

Creating new activities for Jobs
This is now mainly done by the Job itself, so the implementation became as simple as:

public class SimpleJobActivityFactory implements JobActivityFactory {

    public List<JobActivity> createActivities(Job job) {
        return job.getActivityList().getAll();

Cloning the activities of a Job
This is now removed from the responsibility of the VehicleRoutingProblem into the JobActivityList.

public class CopyJobActivityFactory implements JobActivityFactory {

    public List<JobActivity> createActivities(Job job) {
        return job.getActivityList().getAllDuplicated();

TODO: Also, they are stateless, so creating new instance is needless, maybe we should change them to singletons.

In our new interpretation, Activities are the predefined atomic building bricks. They are not meant to be extended by users, only use them in their new Job implemetations.
The new Ability structure, but two comments on the current hierarchy:

  • For the sake of incremental refactor, there are still the old activities, but they are simple no-new-code extensions of the corresponding new abilities. Their names are postfixed by ‘DEPRECATED’
  • There is still a NEW postfix on the new core ability implementation, until the refactor successfully done.

The common ancestor is the AbstractActivityNEW. It has

  • index
  • name
  • location
  • capacity
  • Runtime data:
    • arrTime, endTime
    • thorithical earliest and latest start

The InternalActivity and InternalJobActivity is only a grouper adding nothing new, but both of them are implementing the InternalActivityMarker interface. (This is used for validating that internal abilities can only be created by internal jobs (such as Break, which implements the package private InternalJobMarker interface.)

The JobActivity adds a AbstractJob field, so the instances of these activities are always associated to a job.
The four new activities are the ones defined earlier in this topic: Service (no capacity change), Pickup (capacity allocation = positive), Delivery (capacity release = negative) and Exchange (capacity change = positive, negative, zero).

What it offers (things to do)

This is only the support part of the refactor. Now comes the part I am not at home: to change the algorithm.

I would like to add some additional support function to the JobActivityList to help constraints. (I’ll do it tomorrow).

The insert/ruin strategies should be changed to keep the activities of the same job together. If you give me some guide, I may be able to do this, but I guess it is much faster if you do it.

Also I continue to eliminate the instanceof’s and the old references where I could with my knowledge and add some unit tests to the new classes and methods.

My suggestion, that it would be wise to make a branch for the changes (so I could create pull requests to that branch).

Thanks a lot. I am going to review it. It would be great you create a pull request here: Then we can discuss code there.

Hi, @Balage1551,

I would suggest that we remove the restrict that the activities in a job must go with the defined order. Otherwise I have feeling that it might be too restricted that it is not so practically useful. I have not really calculated but it seems to me that we have more posts about jobs in the same route than those about jobs in a specific sequence.

Meanwhile we can let users define the sequence relation between activities in a job, and it could be either inside the job definition (e.g., a collection of lists of activity ids), or outside (e.g., hard activity constraints).

What do you think?

Best regards,

Hi He,

We need to keep in mind that this will increase the complexity of inserting a job significantly. When we make no assumption about the order of the activity and we have like 3 activities, we need to calculate insertion costs for 6 combination like (act1,act2,act3), (act1,act3,act2), (act2,act1,act3), (act2,act3,act1), etc. If you want to make sure that these activities are “just” in the same route, you are probably much better off specifying them as individual jobs and adding a hard route constraint.


Hi Stefan,

It will be like in the ShipmentInsertionCalculator, the deliverShipmentLoop does not start from j = i, but from j = 0.

But yeah, I agree. For a job with N activities, it seems that the complexity will be multiplied by N!.

The old way has some issues:

  1. when dealing with multiple related jobs (e.g., 10 jobs must go to the same route), it is difficult that in an iteration all of them are ruined, which seems one of the motivations @Balage1551 triggered this discussion (see here);

  2. sometimes we might want those related jobs are either all assigned to a route, or all unassigned, which seems difficult to realize now (see here).

Best regards,

Ok. I aggree with the issues you have.

I do not aggree with:

I think it is not that easy since the “deliveryAct” for j<i will change arrTime etc. for “pickupAct”. Thus you really need entire new loops for N! combinations.

Hi Stefan,

I am not sure but how is this different from that the deliveryAct changes arrTime etc. for other activities? If it is because the pickupAct is not included in the state manager, will my proposed state array make it easier, or not really?

Best regards,

Thinking of the complexity impact is the depth of code understanding I have still not. :wink:

However, when I read your comments I’ve already implemented another version of JobActivityList, the GraphJobActivityList to fulfill the request of @jie31best. It uses a DAG (directed acyclic graph) for storing ordering dependencies. What I’ll do after my lunch is a getAllPossibleOrdering() implementation (I’ve already found an algorithm to build up all possible topological ordering of a DAG, but in C++, so I have to translate to Java).

As for the complexity and usefulness, it will be there, none of the current job implementations will use it, and we could decide later whether to adapt to it.

Here is the code:

 * DAG (Directed Acyclic Graph) based activity list implementation.
 * <p>
 * The inserted activities and their relations create the dependency graph.
 * </p>
 * @author balage
public class GraphJobActivityList extends SequentialJobActivityList {

    // Directly added dependencies
    protected Map<JobActivity, Set<JobActivity>> dependencies = new HashMap<>();

    // Cached transitive dependencies
    protected Map<JobActivity, Set<JobActivity>> transitivePrecedingDependencyCache = new HashMap<>();
    protected Map<JobActivity, Set<JobActivity>> transitiveSubsequentDependencyCache = new HashMap<>();

    public GraphJobActivityList(AbstractJob job) {

    public void addActivity(JobActivity activity) {
        if (_activities.contains(activity)) {
        dependencies.put(activity, new HashSet<JobActivity>());
        transitivePrecedingDependencyCache.put(activity, new HashSet<JobActivity>());
        transitiveSubsequentDependencyCache.put(activity, new HashSet<JobActivity>());

     * Adds a dependency between two activities. If the activities not in the list, they are also added.
     * @param priorActivity
     *            The prior activity.
     * @param subsequentActivity
     *            The subsequent activity.
     * @throws IllegalArgumentException
     *             If the activities can't be added (see {@linkplain #addActivity(JobActivity)}) or if the new
     *             dependency would create a cycle in the dependency graph.
    public void addDependency(JobActivity priorActivity, JobActivity subsequentActivity) {
        // Add activities if not added yet
        if (!_activities.contains(priorActivity)) {
        if (!_activities.contains(subsequentActivity)) {
        // Check if dependency already in there
        if (dependencies.get(priorActivity).contains(subsequentActivity)) {

        // Check if the new dependency would create a cycle
        if (transitiveSubsequentDependencyCache.get(subsequentActivity).contains(priorActivity)) {
            throw new IllegalArgumentException("Dependency between '"+priorActivity+"' and '"+subsequentActivity+"' would create a cycle.");

        // Add new dependency

        // Update cache
        // === Subsequent =======
        // The new subsequent abilities are the subsequent and its subsesequent abilities
        Set<JobActivity> subsequentActivitiesToAdd = new HashSet<>(transitiveSubsequentDependencyCache.get(subsequentActivity));
        // The abilities to add the new ones to: the prior and its prior abilities
        Set<JobActivity> subsequentActivitiesToUpdate = new HashSet<>(transitivePrecedingDependencyCache.get(priorActivity));

        // === Preceding =======
        // The new prior abilities are the prior and its trasitive prior abilities
        Set<JobActivity> priorActivitiesToAdd = new HashSet<>(transitivePrecedingDependencyCache.get(priorActivity));
        // The abilities to add the new ones to: the subsequent and its transitively subsequent abilities
        Set<JobActivity> priorActivitiesToUpdate = new HashSet<>(transitiveSubsequentDependencyCache.get(subsequentActivity));

        // Do the updates
        subsequentActivitiesToUpdate.forEach(a -> transitiveSubsequentDependencyCache.get(a).addAll(subsequentActivitiesToAdd));
        priorActivitiesToUpdate.forEach(a -> transitivePrecedingDependencyCache.get(a).addAll(priorActivitiesToAdd));

    public Set<JobActivity> getPreceding(JobActivity activity) {
        if (!_activities.contains(activity)) {
            throw new IllegalArgumentException("Activity '" + activity + "' is not in the list.");

        return Collections.unmodifiableSet(transitivePrecedingDependencyCache.get(activity));

    public Set<JobActivity> getSubsequent(JobActivity activity) {
        if (!_activities.contains(activity)) {
            throw new IllegalArgumentException("Activity '" + activity + "' is not in the list.");

        return Collections.unmodifiableSet(transitiveSubsequentDependencyCache.get(activity));

    public List<List<JobActivity>> getPossibleOrderings() {
        // TODO
        return super.getPossibleOrderings();


PS: It is now inherits from SequentialJobActivityList, but as soon as the support for the graph one is implemented, we should switch them, because the sequential is a simple case of the Graph one.

I’ve implemented the function, so now you can use the getPossibleOrderings() function to get all the orderings fulfill the dependency rules. (For sequential ordering, naturally, it will return with only one.)

I’ve made a pull request into the graphhopper:job-activity-refactoring branch.

1 Like

@stefan: I’m continuing to eliminate the instanceof’s and I did it in the DefaultScorer. This becomes to be a gray area to me, so I would like a confirmation I thought it well. (The conversation is identical by comparing the results of the old function and new one.)

The old code:

 public double score(InsertionData best, Job job) {
    double score;
    if (job instanceof Service) {
        score = scoreService(best, job);
    } else if (job instanceof Shipment) {
        score = scoreShipment(best, job);
    } else {
        throw new IllegalStateException("not supported");
    System.out.format("OLD SCORE: %6.2f   NEW SCORE: %6.2f    PASS: %s\n", score, calculateScore(best, job),
            (score == calculateScore(best, job) ? "true" : "false"));
    return score;
 private double scoreShipment(InsertionData best, Job job) {
    Shipment shipment = (Shipment) job;
    double maxDepotDistance_1 = Math.max(
            getDistance(best.getSelectedVehicle().getStartLocation(), shipment.getPickupLocation()),
            getDistance(best.getSelectedVehicle().getStartLocation(), shipment.getDeliveryLocation())
    double maxDepotDistance_2 = Math.max(
            getDistance(best.getSelectedVehicle().getEndLocation(), shipment.getPickupLocation()),
            getDistance(best.getSelectedVehicle().getEndLocation(), shipment.getDeliveryLocation())
    double maxDepotDistance = Math.max(maxDepotDistance_1, maxDepotDistance_2);
    double minTimeToOperate = Math.min(shipment.getPickupTimeWindow().getEnd() -      
             shipment.getDeliveryTimeWindow().getEnd() - shipment.getDeliveryTimeWindow().getStart());
    return Math.max(timeWindowParam * minTimeToOperate, minTimeWindowScore) + depotDistanceParam * maxDepotDistance;
 private double scoreService(InsertionData best, Job job) {
    Location location = ((Service) job).getLocation();
    double maxDepotDistance = 0;
    if (location != null) {
        maxDepotDistance = Math.max(
                getDistance(best.getSelectedVehicle().getStartLocation(), location),
                getDistance(best.getSelectedVehicle().getEndLocation(), location)
    return Math.max(timeWindowParam * (((Service) job).getTimeWindow().getEnd() - ((Service) job).getTimeWindow().getStart()), minTimeWindowScore) +
             depotDistanceParam * maxDepotDistance;

What I see is the only thing we need to unify this to get all the “operational time windows”. (The locations are already available by the getAllLocations() function of the Job. So by adding a getOperationalTimeWindows() function (containing only a TW for Service and two TWs for Shipment) we could be as simple as the following code:

public double score(InsertionData best, Job job) {
    Vehicle selectedVehicle = best.getSelectedVehicle();
    double maxFromStart = job.getAllLocations().stream()
            .mapToDouble(l -> getDistance(selectedVehicle.getStartLocation(), l))
    double maxToEnd = job.getAllLocations().stream()
            .mapToDouble(l -> getDistance(selectedVehicle.getEndLocation(), l))
    double maxDepotDistance = Math.max(maxFromStart, maxToEnd);
    double minTimeToOperate = job.getOperationTimeWindows().stream()
            .mapToDouble(tw -> tw.getEnd() - tw.getStart())
    return Math.max(timeWindowParam * minTimeToOperate, minTimeWindowScore) + depotDistanceParam * maxDepotDistance;

One question left

This function seems not to take account if there are more time windows which is a new feature i jsprit. As I see from the code of Service (for example), the getTimeWindow simply gives the first TW which looks like some backward compatibility hack. Shouldn’t this function work with all time windows? (Easy to fix it now, only the new getOperationalTimeWindows() has to be changed to return all.)

Well, I managed to unify the Inserter as well.

private class UnifiedInsertionHandler extends JobInsertionHandler {

    public UnifiedInsertionHandler() {

    public void handleJobInsertion(Job job, InsertionData iData, VehicleRoute route) {
        route.setVehicleAndDepartureTime(iData.getSelectedVehicle(), iData.getVehicleDepartureTime());
        if (!iData.getSelectedVehicle().isReturnToDepot()) {
            if (iData.getDeliveryInsertionIndex() >= route.getActivities().size()) {

        List<JobActivity> acts = job.getActivityList().getAllDuplicated();
        acts.forEach(act -> route.getTourActivities().addActivity(iData.getDeliveryInsertionIndex(), act));

        // Handles all // delegator.handleJobInsertion(job, iData, route);

Elegant, simple and I had to add only a getStartLocation() and a getEndLocation() function to the Job.

PS: I’ve changed the JobInsertationHandler from interface to abstract class to be able to move there the delegation. I can’t watch code duplication.

Next questions:

Question 1:
Am I right, that AvgServiceDistance (and the corresponding EuclideanServiceDistance) is now deprecated and the AvgServiceAndShipmentDistance is used instead? (The later seems to be a superset of the previous.)

As I see, only some unit tests use it and in them the reference can be substituted by the AvgServiceAndShipmentDistance implementation.

Now the AvgServiceAndShipmentDistance could serve any two Jobs, so the name is not appropriate any more.

My suggestion:

  • Remove or mark @Deprecated the above two classes.
  • Rename the AvgServiceAndShipmentDistance to DeafaultJobDistance. (After the deprecation, this is the only remaining implementation of JobDistance.)

Question 2:
Now I see, that there are several DistanceCalculators and several *Costs implementation for them.
This cost calculators contains 99% the same code, so maybe we could simplify this code by

  • Making the distance calculators from static methods to classes (singetlon).
  • Creating only one *Costs implementation and injecting the distance calculator into them.
  • If for backward compatibility the old Costs implementations should be kept, make them a simple no-code extension of the new one and mark them @Deprecated.

As for the distance unit of GreatCircle (Maybe the Geographical name would be more meaningful) could be moved into Costs level.

May I do this refactory?

Question 3:
This is a big, strategic question, that could be answered only by someone who knows how many users the library has.
Do all this wide range of refactors should be strictly backward compatible? Or could it be called v2.0, and let some ballast to drop, in spite it would require some (mainly straightforward, never structural) code modification to switch to this version?

@Balage1551 First of all, thanks again for your effort. I really appreciate it. Actually, the scorer should consider multiple time windows as well.