Traveling Salesman Problem (TSP) is a well-known NP-hard combinatorial optimization problem. Recently, many researchers have used deep reinforcement learning to solve it. However, traffic factors are rarely considered in their works, in which the traveling time between customer locations is assumed to be constant over the planning horizon. For many practical scenarios, the traffic conditions between customer locations may change over time due to the impact of traffic patterns. Thus, this paper considers a Time-Dependent Traveling Salesman Problem with Time Windows (TDTSPTW), where the time dependency is obtained by fitting the collected traffic data into real-time traffic function with the interpolation method. We propose a deep reinforcement learning framework to solve TDTSPTW. Extensive experiments on TDTSPTW instances indicate that the proposed method can capture the real-time traffic changes and yield high-quality solutions within a very short time, compared with other typical baseline algorithms.
https://doi.org/10.1109/SMC52423.2021.9658956Cite as:
@inproceedings{Wu_2021, doi = {10.1109/smc52423.2021.9658956}, url = {https://doi.org/10.1109%2Fsmc52423.2021.9658956}, year = 2021, month = {oct}, publisher = {{IEEE}}, author = {Guojin Wu and Zizhen Zhang and Hong Liu and Jiahai Wang}, title = {Solving Time-Dependent Traveling Salesman Problem with Time Windows with Deep Reinforcement Learning}, booktitle = {2021 {IEEE} International Conference on Systems, Man, and Cybernetics ({SMC})} }