Hi all,
I am looking for a strategy to find solutions for a problem where customers need to be served periodically in a given planning horizon (of P = 20 (work)days (4 weeks à 5 days each)). Customers have (multiple) time windows, service durations and a frequency value k, meaning a customer needs to be served k times during the planning horizon with a time gap between services (or periodicity) p of exactly P / k days. For example: a customer with k = 2 needs to be served 2 times exactly every p = 10 days.
My current approach is as follows.
Step 1) (weekdayassignment): First I solve the problem with 20 vehicles, one for each day, to assign customers to single days in the schedule. Since customers with high k will be served more often and hence restrict the solution more strongly than customers with low k, I have chosen to model this by assigning a virtual demand equal to k to each customer and assign a limited load capacity to my vehicles. Ergo, if a customer with, e.g. k = 4 is assigned to the first Monday, the corresponding vehicle will spend a relatively large portion of its load capacity on this customer and will have to return to the depot after serving a relatively smaller amount of customers on that day. At the end of step one I have an interim solution for all of the 20 days in P, with every customer appearing exactly once.
Step 2) (weekclustering): Then, for each weekday, I decide which customers to serve in the same week. For a customer with k = 4, the week assignment is trivial, since he/she will be served every week (and the day of the week has already been determined in step 1)). Since a customer with, e.g., k = 2 can be served in (week 1 and week 3) or (week 2 and week 4), I have to assign the customer to one of the two possibilities. The same goes for customers with k = 1, only now there are 4 possibilities to assign the customer to a week: (week 1) or (week 2) or (week 3) or (week 4). So I cluster the customers geometrically into k clusters for each occurring k elem{1,2,4} (I basically do a kmeans clustering, but with approximately equal cluster size, if anyone is interested). These clusters then represent weeks in the schedule.
Step 3) (weekassignment): Then I assign the clusters from step 2) to actual weeks in P. I start with the clusters with k = 4 (trivial) and then assign the two clusters with k=2 in such a way, that the overall driving distance is minimized; then repeat for clusters with k = 1.
Step 4) (dayoptimization): I then solve the TSP for each day in P.
The approach returns plausible results, which is nice. It has, however, its weaknesses.

If the customers are not homogeneous, i.e. if a customer is either located very far from the others or has a much greater service duration than the others, the trick with the load capacity from step 1 can fail; such a customer would need to displace more customers from that day to other days than given by its k. In extreme cases – even if I make sure that the “strange” customer is the only one who gets served on a certain day in step 1) – there might still be too many customers assigned to the same weekday in different weeks to find a feasible solution for some of the days in step 4).

Another effect, very much associated with problem 1): As the planning horizon gets longer – more precisely: as the number of weeks (and hence the number of possible values for k grows) – the load capacity crutch also gives less and less plausible approximations for the number of customers that can be served on a given weekday.
My questions are: How can I make the approach more robust to include outliers / “strange” customers as described in step 1)? How can I make the approach more generic for longer planning horizons? What do you think about the general idea?
TL;DR: I am using jsprit to find solutions (robustly and as generically as possible) to assignment problems with periodicity constraint. My current approach has wekanesses. Pls help.