Vehicle Routing Challenge Update: Introducing Time Windows

Challenge Formulation

The Vehicle Routing Problem with Time Windows (VRPTW) involves determining a set of cost-effective routes for a fleet of identical vehicles operating from a single depot to serve a geographically dispersed set of customers. Each vehicle has a fixed capacity and each customer has a known demand for goods and a defined time window during which service must begin. If a vehicle arrives before this time window, it must wait; if it arrives after, service is considered infeasible. The primary objective is to minimise the total distance the fleet must travel to deliver goods to all customers and return to the depot, such that:

  • Each customer is visited by exactly one vehicle.

  • The total demand serviced by each vehicle does not exceed its capacity.

  • Each vehicle starts and ends its route at the depot.

  • Service at each customer commences within the customer’s defined time window.

  • The number of vehicles utilised does not exceed a set fleet size.

  • Vehicles wait if they arrive early, and service durations are accounted for within the schedule.

Note: In the typical problem formulation of the VRPTW, the primary objectives are to minimise both the number of vehicles utilised and the total distance travelled, while strictly adhering to the stated constraints. The DIMACS [1] implementation challenge adopts the convention (already used in several recent works[2][3]) of only minimising the total distance while enforcing a max fleet size.


In TIG, it is essential for a challenge to be asymmetric: challenging to solve yet straightforward to verify. Many optimisation challenges are not inherently asymmetric, but they can be transformed into an asymmetric decision problem by specifying a baseline value and requiring solvers to exceed it.

Baseline Algorithm

To ensure TIG’s resistance to Sybil attacks, it is essential to use a baseline solver that is both simple and computationally efficient. While a standard greedy algorithm is typically employed as a baseline, we are currently considering Solomon’s I1 insertion heuristic [4], contributed by a member of our community.

Note: Testing to verify the runtime of this algorithm is ongoing.

Several versions of baseline solvers, including heuristic and metaheuristic approaches, were developed and tested by a community member before they decided on Solomon’s I1 insertion heuristic [4:1]. This suggestion was based on multiple factors, including computational efficiency, simplicity, and resistance to Sybil attacks.

Testing on Solomon and Homberger-Gehring benchmark instances [5] (200- and 400-customer cases) revealed that Solomon’s I1 produces solutions approximately 10% worse than best-known values on average. Attempts to enhance performance through techniques like 2-opt, r-opt, or swapping yielded marginal improvements that did not justify the added computational cost.

Solomon’s I1 heuristic is a widely recognised constructive approach for solving the VRPTW. It incrementally constructs routes by inserting customers into positions that minimise an insertion cost, while satisfying vehicle capacity and time window constraints.

Algorithm Steps

  1. Initialisation: Start with an empty route and select an initial customer to serve.

  2. Insertion Process: For each unrouted customer:

    • Evaluate all feasible positions in the current route.

    • Compute the insertion cost at each position, considering both distance and time adjustments.

    • Select the position that minimises the cost while maintaining feasibility (capacity and time window constraints).

  3. Route Completion: Insert the chosen customer into the route. Repeat until no more customers can be added.

  4. Open a New Route: If unrouted customers remain, open a new route and repeat the process until all customers are routed.
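The steps above can be sketched in Python. This is a simplified illustration rather than the attached implementation: route seeds are chosen as the farthest unrouted customer, only the distance part of C_1 is used (the time-shift term a_2(b_{j,u} - b_j) is omitted for brevity), feasibility is checked by re-simulating the route, and all names are illustrative.

```python
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def route_feasible(route, coords, ready, due, service, demand, capacity):
    """Simulate a route (node 0 is the depot) and check capacity and time windows."""
    if sum(demand[c] for c in route) > capacity:
        return False
    t, prev = 0.0, 0
    for c in route:
        t += dist(coords[prev], coords[c])
        t = max(t, ready[c])          # wait if the vehicle arrives early
        if t > due[c]:                # arriving after the due time is infeasible
            return False
        t += service[c]
        prev = c
    return t + dist(coords[prev], coords[0]) <= due[0]

def solomon_i1(coords, ready, due, service, demand, capacity, a1=1.0, mu=1.0):
    """Simplified I1: seed each route with the farthest unrouted customer,
    then repeatedly insert the customer/position minimising the detour cost."""
    unrouted = set(range(1, len(coords)))
    routes = []
    while unrouted:
        seed = max(unrouted, key=lambda c: dist(coords[0], coords[c]))
        route = [seed]
        unrouted.remove(seed)
        while True:
            best = None  # (cost, customer, position)
            for u in unrouted:
                for p in range(len(route) + 1):
                    cand = route[:p] + [u] + route[p:]
                    if not route_feasible(cand, coords, ready, due,
                                          service, demand, capacity):
                        continue
                    i = route[p - 1] if p > 0 else 0
                    j = route[p] if p < len(route) else 0
                    c1 = a1 * (dist(coords[i], coords[u])
                               + dist(coords[u], coords[j])
                               - mu * dist(coords[i], coords[j]))
                    if best is None or c1 < best[0]:
                        best = (c1, u, p)
            if best is None:
                break
            _, u, p = best
            route.insert(p, u)
            unrouted.remove(u)
        routes.append(route)
    return routes
```

Re-simulating the full route at every candidate insertion is cubic in practice; Solomon’s original formulation avoids this by tracking the “push forward” of service start times, which is worth doing in any production implementation.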

Insertion Cost Criteria

The insertion cost is computed using two levels of cost functions:

First-Level Cost ( C_1 )

The first-level cost prioritizes minimising added distance and time adjustments. For a candidate customer u inserted between two consecutive nodes i_{p-1} and i_p , the cost function is:

C_1(i_{p-1}, u, i_p) = a_1 \bigl[d(i_{p-1}, u) + d(u, i_p) - \mu d(i_{p-1}, i_p)\bigr] + a_2 \bigl(b_{j,u} - b_j\bigr),

where:

  • d(x, y) is the distance between nodes x and y ,

  • b_j is the service start time at i_p before the insertion, and b_{j,u} is the service start time at i_p after inserting u (their difference is the “push forward” caused by the insertion),

  • a_1, a_2, \mu are parameters that balance distance and time impacts.

Second-Level Cost ( C_2 )

The second-level cost adjusts for the proximity of the candidate customer to the depot and is defined as:

C_2(i_{p-1}, u, i_p) = \lambda d(0, u) - C_1(i_{p-1}, u, i_p),

where \lambda weights the depot’s distance influence. At each iteration, the unrouted customer maximising C_2 (evaluated at its best insertion position) is selected for insertion.
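A worked example may make the two cost levels concrete. The numbers below are purely hypothetical, and a_1 = a_2 = \mu = \lambda = 1 is assumed.

```python
def c1(d, b_after, b_before, a1=1.0, a2=1.0, mu=1.0):
    """First-level cost: added detour plus the shift in service start at i_p."""
    i_u, u_j, i_j = d          # d(i_{p-1}, u), d(u, i_p), d(i_{p-1}, i_p)
    return a1 * (i_u + u_j - mu * i_j) + a2 * (b_after - b_before)

def c2(d0u, c1_val, lam=1.0):
    """Second-level cost: depot distance weighted against C1. In I1, the
    customer maximising C2 is the one chosen for insertion."""
    return lam * d0u - c1_val

# hypothetical numbers: detour of 30 + 40 - 50 = 20,
# service start at i_p pushed from 100 to 125
cost1 = c1((30, 40, 50), 125, 100)      # 20 + 25 = 45
cost2 = c2(60, cost1)                   # 60 - 45 = 15
```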

Why Solomon’s I1?

Solomon’s I1 heuristic has the following strengths:

  • It produces competitive solutions within approximately 10% of best-known values.

  • It is easy to implement without requiring complex data structures or preprocessing steps.

  • It is computationally efficient, even as problem instances scale.

  • It explicitly handles time feasibility, yielding well-structured routes.

  • It has adjustable cost parameters allowing for different insertion strategies.

Compared to alternative baselines, such as simple greedy heuristics or the Clarke & Wright savings heuristic (adapted for time windows), Solomon’s I1 achieves a strong balance of solution quality, feasibility, and implementation simplicity. Its broad acceptance in VRPTW research makes it a reliable benchmark for evaluating more sophisticated algorithms. However, if runtime proves to be excessively long, we may need to consider reverting to a simpler baseline algorithm to maintain computational efficiency.

Challenge Generation

Importance

The instance generation process is designed to produce scenarios that closely mirror established academic benchmark instances. It is crucial that these generated instances accurately reflect real-world conditions, as the game incentivises algorithms specifically optimised for these scenarios.

VRPTW

The following instance generation process was proposed by a member of the community to replicate the benchmark instances of Homberger and Gehring[6]. It incorporates both random and clustered customer distributions, coupled with tight time window constraints and low vehicle capacity, characteristics defined as RC1 instances in Solomon’s seminal paper on the VRPTW[4:2] and later utilised by Homberger and Gehring[6:1]. These instances serve as critical benchmarks for evaluating algorithms addressing the VRPTW. While benchmark instances typically vary in customer distribution (random, clustered, or mixed), time window tightness (loose or tight), and vehicle capacity (small or large), the current instance generation is designed to produce the RC1 type. This was suggested by a member of the community as it reflects more realistic and operationally challenging conditions, where tighter time constraints on vehicle arrivals require algorithms to achieve a delicate balance between route optimisation and precise scheduling. Such instances emphasise the need for greater accuracy in sequencing decisions, pushing algorithms to deliver both efficient routing and careful time management.

Dataset Generation

Grid and Depot Initialisation

  • A grid of size \frac{N}{2} \times \frac{N}{2} is initialised, where N is the total number of customers.

  • The depot is located at the centre of the grid: (\frac{N}{4}, \frac{N}{4}) .

Difference: In contrast to the original CVRP challenge, which fixed the grid size at 500 \times 500 with the depot centred at (250, 250), this approach adopts a variable grid size. This mimics the majority of the Homberger instances (for N > 200)[6:2][5:1].

Note: The paper by Uchoa[7] on new benchmark instances for the CVRP mentions they generate instances on a fixed 1000 \times 1000 grid for customer instances ranging from 100–1000. This method could be implemented if beneficial.

Customer (Node) Position Assignment

Customers are divided evenly into two groups: one composed of clustered customers and the other of random customers.

To generate clustered locations, we begin by selecting k seed points at random within the grid, which define the centres of our clusters. We then determine the positions of the remaining \frac{N}{2} - k customers based on their proximity to these seed points. Candidate positions are generated at random and accepted with a probability governed by an exponential decay function, ensuring that locations nearer to a seed point are more likely to be chosen. This acceptance probability is given by:

P(\text{candidate}) = \sum_{i=1}^{k} \exp\Bigl(-\frac{\text{distance}(\text{candidate}, \text{seed}_i)}{\lambda}\Bigr)

where k is the total number of seed points, \text{distance}(\text{candidate}, \text{seed}_i) is the Euclidean distance from the candidate location to the i-th seed, and \lambda is a scaling factor that controls the decay rate.

The remaining \frac{N}{2} customers are placed randomly within a \tfrac{N}{2} \times \tfrac{N}{2} coordinate space, ensuring that their locations are independently distributed and not influenced by seed points.
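A minimal sketch of this placement scheme, with k = 8 seeds and \lambda = 6 assumed as defaults (these parameter choices are still under discussion); the function and parameter names are illustrative:

```python
import math
import random

def generate_positions(n, k=8, lam=6.0, seed=42):
    """Place n customers on an (n/2) x (n/2) grid: half clustered around k
    random seed points via exponential-decay acceptance, half uniform."""
    rng = random.Random(seed)
    size = n // 2
    seeds = [(rng.uniform(0, size), rng.uniform(0, size)) for _ in range(k)]
    clustered = list(seeds)
    while len(clustered) < n // 2:
        cand = (rng.uniform(0, size), rng.uniform(0, size))
        # acceptance weight: sum of exponential decays over all seed points
        p = sum(math.exp(-math.dist(cand, s) / lam) for s in seeds)
        if rng.random() < p:
            clustered.append(cand)
    randoms = [(rng.uniform(0, size), rng.uniform(0, size))
               for _ in range(n - n // 2)]
    return clustered + randoms
```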

Note: The paper by Uchoa[7:1], which generates instances of up to 1000 customers, samples the number of seed customers from a uniform distribution over [3, 8] and uses \lambda = 40 based on experimental results: smaller values create overly dense and isolated clusters, while larger values produce sparse, overlapping clusters. For the Homberger instances, it was observed that these were most easily replicated with 8 “seed” customers and \lambda = 6. (The exact choice of these parameters remains to be decided.)

Demand Generation

Customer demands are randomly assigned integer values between 1 and 50. The depot’s demand is fixed at 0.

Note: Random sampling within the range [1,50] was employed in an effort to replicate the Homberger benchmark instances[6:3], although the specific sampling methodology was not explicitly detailed in their paper.

Distance Matrix Construction

A symmetric distance matrix is built, where each entry represents the Euclidean distance between two nodes. Distances are calculated using:

\text{distance} = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}

Distances are rounded to the nearest integer.
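A straightforward sketch of the matrix construction (pure Python, illustrative names):

```python
import math

def distance_matrix(coords, rounded=True):
    """Symmetric Euclidean distance matrix with entries rounded
    to the nearest integer (the convention adopted here)."""
    n = len(coords)
    d = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            e = math.dist(coords[i], coords[j])
            d[i][j] = d[j][i] = round(e) if rounded else e
    return d
```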

Note: Uchoa’s work[7:2] identifies two common conventions for handling distances in the VRP literature: exact methods typically round values to the nearest integer, whereas heuristic methods generally avoid rounding. To mitigate any drawbacks associated with rounding, Uchoa’s instances were designed on a [0,1000]\times[0,1000] grid rather than the more traditional [0,100]\times[0,100] scale. This choice results in optimal solutions whose magnitudes are typically between 10^4 and 10^5, an advantageous range for exact methods with a precision of 10^{-6}. As a consequence, the “plateau effect” and the artificial performance boost sometimes observed in exact methods are substantially reduced.

Some contemporary solvers, such as OR-Tools and VROOM, require integer distance inputs. When greater precision is required, this motivates a preprocessing step that scales distances (e.g., by a factor of 10 or 100); after obtaining a solution, the results are rescaled to their original units. Other solvers, like jsprit, accept floating-point inputs without any specific precision handling.

The rounding convention originally proposed by Solomon (1987)[4:3] is defined as:

d_{ij} = \frac{\left\lfloor 10e_{ij}\right\rfloor}{10}

where d_{ij} is the final distance output, e_{ij} is the Euclidean distance between customers i and j, and \left\lfloor10e_{ij}\right\rfloor is the distance input into the solver.

Depot Due Time

The approach for establishing the depot due time wasn’t described by Solomon[4:4]. From observations of the instances, a member of our community has proposed the following formula:

l_0 = d_{0i_F} + (s_{\text{av}} + d_{\text{av}})\times n_{\text{av}},

where

n_{\text{av}} = \frac{\text{capacity}}{\text{average customer demand}}.

Here l_0 is the depot due time (the latest time of arrival at the depot), d_{0i_F} is the direct distance between the depot and the furthest customer i_F , s_{\text{av}} is the average service time, d_{\text{av}} = \frac{N}{4} \times 0.5214 is the average distance between customers, N is the number of customers and n_{\text{av}} is the average number of customers per route.

The formula for d_{\text{av}} comes from the average distance between two points inside a square with side length \frac{N}{4} . Since the depot is always centred, the average distance inside a quarter of the grid was computed analytically using the formula for the average distance between two uniformly random points inside a square[8][9].

Note: This formula does not account for the clustering of certain customers, which could influence the average distance between them. However, this is likely acceptable, as clustering generally results in shorter distances between customers along a given route. The average number of customers served per route is estimated based on the ratio of average demand to vehicle capacity, which, under the current parameters, is approximately 8 customers per route. Additionally, the distance to the furthest customer is used as an offset to ensure that depot time constraints are sufficiently flexible. This prevents overly restrictive limits on the solution space and range of feasible routes.
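The proposed formula can be sketched as follows; the names are illustrative, and service_av and avg_demand are assumed to be given as inputs:

```python
import math

def depot_due_time(coords, service_av, capacity, avg_demand):
    """Community-proposed depot due time: farthest-customer offset plus the
    expected time to serve one average route."""
    n = len(coords) - 1                  # coords[0] is the depot
    d_far = max(math.dist(coords[0], c) for c in coords[1:])
    d_av = (n / 4) * 0.5214              # mean distance in a square of side N/4
    n_av = capacity / avg_demand         # average customers per route
    return d_far + (service_av + d_av) * n_av
```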

Time Window Assignment

Time-window due times are assigned differently depending on whether customer locations are randomly distributed or clustered.

Randomly Distributed Customers:
Time window assignments are determined by their distance from the depot. Specifically, each randomly distributed customer i is assigned a due time drawn uniformly from the interval:

[d_{0i}, l_0 - d_{0i} - s_i]

where d_{0i} is the distance from the depot to customer i , l_0 is the depot’s due time, and s_i is the service time for customer i . This method ensures that each time window is feasible, allowing a vehicle to leave the depot, reach the customer, perform the service, and return on time.

Clustered Customers:
Time-window due times are first allocated to a set of seed customers using the same distance-based approach as above. Once these seed due times are established, due times for the remaining clustered customers are assigned by adding a random offset, drawn uniformly from a predetermined range (currently [20, 180]), to the seed customer’s due time.

Note: Assigning time windows based on seed customers enables quick assignment for clustered customers without requiring an initial optimisation step, making generation more efficient. This seed-based method reflects realistic delivery schedules by clustering time windows geographically and temporally. The offset range of [20, 180] is an initial suggestion and could be altered.
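The two assignment rules can be sketched together. This assumes a hypothetical seed_of mapping from each customer index to its cluster-seed index (None for randomly distributed customers); all names are illustrative.

```python
import math
import random

def assign_due_times(coords, l0, service, seed_of, rng=None):
    """Due times: distance-based uniform draw for random customers and seeds,
    then seed due time + U[20, 180] offset for clustered customers."""
    rng = rng or random.Random(0)
    due = {0: l0}
    # first pass: randomly distributed customers and cluster seeds
    for i in range(1, len(coords)):
        if seed_of.get(i) is None:
            d0i = math.dist(coords[0], coords[i])
            due[i] = rng.uniform(d0i, l0 - d0i - service[i])
    # second pass: clustered customers inherit a shifted seed due time
    for i in range(1, len(coords)):
        if seed_of.get(i) is not None:
            due[i] = due[seed_of[i]] + rng.uniform(20, 180)
    return due
```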

Density

A subset of customers is assigned non-zero ready times to introduce variability. These non-zero ready times are set 30 units prior to the customer’s due time. The proportion of such customers is determined by the density parameter, currently set to 50%.

Note: It was noted by a member of our community that lower density values increase the number of feasible solutions, thereby adding to the problem’s complexity. However, excessively low densities may deviate from real-world conditions, where such scenarios are less common. The chosen 50% density strikes a balance between computational complexity and practical realism, though higher values, such as 75%, could also be considered for certain applications. It is worth noting that most instances with unknown optimal solutions tend to exhibit densities less than 50%.

TIG Challenge Summary

For this challenge, the two difficulty parameters are N, the number of customers, and better_than_baseline, which defines the performance threshold for solution acceptance.

The baseline value is determined using Solomon’s I1 insertion heuristic, which iteratively inserts customers into routes based on a cost function that balances distance and time constraints. Routes are built one by one until all customers are served. The goal is to produce a solution better than the baseline’s total distance by at least the specified factor (better_than_baseline), while ensuring all VRPTW constraints are satisfied.


Code Implementation

While the Rust implementation for this challenge update is not yet complete, it will mirror the logic used in this Python implementation. We thank a member of our community for contributing the original code, which is attached below with slight modifications.
solver_vrptw.py (21.9 KB)
generator_vrptw.py (6.0 KB)


  1. DIMACS. “Vehicle Routing Problem with Time Windows (VRPTW) Challenge.” Accessed: 2025-01-15. http://dimacs.rutgers.edu/programs/challenge/vrp/vrptw/ ↩︎

  2. Roberto Baldacci, Aristide Mingozzi, Roberto Roberti, Recent exact algorithms for solving the vehicle routing problem under capacity and time window constraints, European Journal of Operational Research, Volume 218, Issue 1, 2012, Pages 1-6, ISSN 0377-2217. ↩︎

  3. Diego Pecin, Claudio Contardo, Guy Desaulniers, and Eduardo Uchoa. 2017. New Enhancements for the Exact Solution of the Vehicle Routing Problem with Time Windows. INFORMS J. on Computing 29, 3 (Summer 2017), 489–502. ↩︎

  4. Marius M. Solomon. Algorithms for the Vehicle Routing and Scheduling Problems with Time Window Constraints. Operations Research, 35(2):254–265, 1987. ↩︎ ↩︎ ↩︎ ↩︎ ↩︎

  5. SINTEF. Vehicle Routing Problem with Time Windows (VRPTW). Accessed: 2024-12-05. ↩︎ ↩︎

  6. J. Homberger and H. Gehring. Two Evolutionary Metaheuristics for the Vehicle Routing Problem with Time Windows. Infor, 37:297–318, 1999. ↩︎ ↩︎ ↩︎ ↩︎

  7. Eduardo Uchoa, Diego Pecin, Artur Pessoa, Marcus Poggi, Thibaut Vidal, Anand Subramanian. New benchmark instances for the Capacitated Vehicle Routing Problem. European Journal of Operational Research, 257(3):845–858, 2017. ↩︎ ↩︎ ↩︎

  8. Eric W. Weisstein. Square Line Picking. MathWorld–A Wolfram Web Resource. https://mathworld.wolfram.com/SquareLinePicking.html, Accessed: 2024-12-05. ↩︎

  9. Mean line segment length. Wikipedia. Accessed: 2024-12-05. ↩︎


Vehicle Routing Update: Finalised Version

At TIG, we are dedicated to ensuring that our challenges align with real-world benchmarks with the aim of incentivising state-of-the-art innovation. As such, in collaboration with a world-leading expert in this field, @vidalt, we have refined our challenge instance generation process, and are pleased to share our final design.

Instance Generation

To align with almost all existing VRPTW benchmark instances, distances are two-dimensional Euclidean. Both the depot and customers are positioned at integer coordinates within a [0, 1000] × [0, 1000] grid. As described in the paper “New Benchmark Instances for CVRP” by Uchoa et al. [1], instances can be characterised by several attributes: the number of customers, depot positioning, customer positioning, demand distribution, and average route size. Below, we detail the choices made for these attributes for our instances.

Instance Attributes

Depot Positioning

Central (C) – the depot is positioned in the center of the grid, point (500,500).

Customer Positioning

Random-Clustered (RC) – half of the customers are positioned in clusters while the other half are randomly placed across the grid. Every customer, as well as the depot, is located at a unique point on the grid.

This positioning process differs slightly from that described in [1:1] to ensure a faster and consistent instance generation. First, a number S—representing the cluster seeds—is selected from a uniform discrete distribution UD[3,8]. These S seeds are then randomly positioned on the grid. Instead of using an exponential decay mechanism to attract the remaining N/2−S customers to clusters as in [1:2], each customer now has a 50% chance of being assigned to a cluster. If selected, the customer randomly picks one of the cluster seeds and its position is generated using a truncated normal distribution centered on the seed’s coordinates, with a standard deviation of 60. This ensures that the customer is placed close to the seed while still adhering to the grid boundaries. The chosen standard deviation was determined experimentally to best replicate the clustering behavior achieved by the exponential decay attraction in [1:3].
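One simple way to realise the truncated normal described above is rejection sampling: resample until the candidate lands inside the grid. The standard deviation of 60 and the [0, 1000] grid come from the text; the function name is illustrative.

```python
import random

def cluster_position(seed_xy, sd=60, grid=1000, rng=None):
    """Place a clustered customer around its seed with a truncated normal,
    realised by resampling until the point falls inside the grid, then
    rounding to integer coordinates."""
    rng = rng or random.Random(0)
    while True:
        x = rng.gauss(seed_xy[0], sd)
        y = rng.gauss(seed_xy[1], sd)
        if 0 <= x <= grid and 0 <= y <= grid:
            return (round(x), round(y))
```

Note that rejection sampling truncates the joint distribution rather than each axis independently; for a standard deviation of 60 on a 1000-wide grid the difference is negligible except for seeds very close to the boundary.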

Demand Distribution

Small Values, Large Variance – demands from UD[1,35].

Average Route Size

The average route size, r_{\text{av}} , is chosen to be in the range ~11-12 customers per route, in line with the average route size chosen across instances in [1:4].

r_{\text{av}} = \frac{\small\text{capacity}}{\small\text{avg_demand}} = \small\frac{200}{17.5} = 11.43

Time Window Generation

Depot Due Time

Based on experimentation and a detailed comparison with depot due times in the Solomon and Homberger instances [2], the following formula was selected for the depot due time:

l_0 = d_{0i_F} + (s_{i} + d_{\text{av}})\times r_{\text{av}},

where l_0 is the depot due time, d_{0i_F} is the direct distance between the depot and the furthest customer i_F , s_{i} is the service time, with s_i =10 \forall i , and d_{\text{av}} is the average distance between customers.

The average customer distance, d_{\text{av}}, is derived from the mean distance between two points in a square of side length grid_size/2. As the depot is centered, we compute the average in a quarter of the grid using the formula for uniformly random points in a square [3][4].

d_{\text{av}} = \frac{\small\text{grid_size}}{2} \times 0.5214 = 260.7.

Therefore, the depot due time is given by

l_0 = d_{0i_F} + 3094.1.

Customer’s Due Times

The due times of the time windows are determined differently based on whether customer locations are randomly distributed or clustered.

Initially, every customer is assigned a due time drawn uniformly from the interval,

[d_{0i}, l_0 - d_{0i} - s_i],

where d_{0i} is the distance from the depot to customer i , l_0 is the depot’s due time, and s_i is the service time for customer i .

This method ensures that each time window is feasible, allowing a vehicle to leave the depot, reach the customer, perform the service, and return on time.

Clustered Customers:
For clustered customers, the due times are adjusted to be closer to their respective seed customer’s due time. This is achieved by recalculating each clustered customer’s due time as the average of its original due time and that of its seed, while still respecting the original bounds.

Ready Times

The depot’s ready time is set to zero. The ready times for a subset of customers are defined as their due time minus a randomly selected time-window width drawn from a uniform distribution UD[10,60]. The density parameter, currently set at 50%, determines the proportion of customers with non-zero ready times. This approach creates variability in the time-window widths, yielding both tight and loose windows.
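The ready-time rule can be sketched as follows (illustrative names; due is assumed to be a dict keyed by node index, with 0 as the depot):

```python
import random

def assign_ready_times(due, density=0.5, rng=None):
    """Ready times: a `density` fraction of customers get due - U[10, 60];
    the rest, and the depot (index 0), stay at zero."""
    rng = rng or random.Random(0)
    ready = {0: 0.0}
    for i, d in due.items():
        if i == 0:
            continue
        if rng.random() < density:
            ready[i] = max(0.0, d - rng.uniform(10, 60))
        else:
            ready[i] = 0.0
    return ready
```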

Decisions on Conventions

  1. The Euclidean distances will be rounded to the nearest integer, as was chosen in [1:5].
  2. The number of routes in a valid solution will not be fixed for each instance; instead, an upper bound will be set based on the baseline solution.

  1. Eduardo Uchoa, Diego Pecin, Artur Pessoa, Marcus Poggi, Thibaut Vidal, Anand Subramanian. New benchmark instances for the Capacitated Vehicle Routing Problem. European Journal of Operational Research, 257(3):845–858, 2017. ↩︎ ↩︎ ↩︎ ↩︎ ↩︎ ↩︎

  2. SINTEF. Vehicle Routing Problem with Time Windows (VRPTW). Accessed: 2024-12-05. ↩︎

  3. Eric W. Weisstein. Square Line Picking. MathWorld–A Wolfram Web Resource. https://mathworld.wolfram.com/SquareLinePicking.html, Accessed: 2024-12-05. ↩︎

  4. Mean line segment length. Wikipedia. Accessed: 2024-12-05. ↩︎


Good job, @Aoibheann and @vidalt. The first thing that comes to mind: aren’t you afraid that fixing so many parameters (grid size, due time, etc.) will encourage innovators to hardcode optimisations, leading to very narrow, limited solvers, or even to trouble comparing algorithms on the Solomon or Homberger and Gehring benchmarks?

Given the proposed instance generation method presented here (fixed grid size, integer-rounded distances, specific time window distribution parameters, etc.), how do you plan to ensure that submitted solvers don’t become overly specialized or optimized towards these exact parameters or scenarios? This is something syebastian has raised and I think it’s an important question to address. Have you considered additional testing methodologies, such as hidden validation instances with slightly varying parameters, periodic updates to the instance-generation parameters, or explicit checks on solver robustness across different benchmarks, to encourage and reward the development of genuinely innovative solvers?

Thanks, @Jake_Logos and @syebastian, for your comments. It’s great to get feedback on challenge design.

Regarding the fixed parameters—such as grid size and depot due time—we were advised that using a fixed or variable grid size is a relatively minor design decision. In the Set X instances, a fixed grid was chosen primarily for numerical stability; it helps avoid excessively large numbers that can cause precision issues in exact solvers. Additionally, since some algorithms are sensitive to input scale (for example, distances in meters versus kilometers), keeping instance objective values within a similar order of magnitude minimizes calibration issues. This influenced our decision to use a fixed grid rather than a variable one, meaning that the depot due time now depends solely on the furthest customer from the depot. The difference should only be in the scale of the distances and depot due time, without altering instance difficulty.

We also acknowledge the concern that innovators might hard-code optimizations, leading to overly narrow algorithm designs. We will be monitoring the evolution of algorithms within TIG and consult with an expert on design and parameter specifications if innovation tends in an unfavourable direction.

Our instances are designed to align closely with established benchmarks (Solomon, Homberger–Gehring, and the new CVRP Set X benchmark design), and our analysis indicates that algorithm performance on our randomly generated instances is comparable to these standard benchmarks. Our long-term aim is to periodically compare TIG algorithms against SOTA algorithms on standard benchmarks to ensure we continue to incentivise and reward the development of genuinely innovative solvers.

I hope this addresses your questions/concerns.