Luo, J., Huang, P., Li, Z., Pang, Z., D'Ariano, A. (2026). Train timetable rescheduling considering potential train operation conflicts: an enhanced deep reinforcement learning approach. International Journal of Rail Transportation, 1-32 [10.1080/23248378.2025.2611805].
Train timetable rescheduling considering potential train operation conflicts: an enhanced deep reinforcement learning approach
D'Ariano, Andrea
2026-01-01
Abstract
Train timetable rescheduling (TTR) is the core of real-time train dispatching. It is a challenging sequential decision-making problem due to complex constraints. Leveraging deep reinforcement learning’s efficacy in sequential decision-making, we formulate TTR as a Markov Decision Process (MDP) and propose a Proximal Policy Optimization (PPO) method. To enhance exploration efficiency in large action spaces, we introduce a conflict-aware action selection rule (ASR) and a multi-dimensional discrete action space. Experiments on the Wuhan-Guangzhou high-speed railway under various disturbances demonstrate that: (1) The proposed method yields lower total train arrival delays than heuristic rules and other DRL algorithms, with an average optimality gap below 4.0%; (2) The computational time is approximately 4 s; (3) The ASR contributes to an average delay reduction of 27.2%; and (4) Sensitivity analysis indicates optimal performance with 5 trains per action dimension. The results validate the effectiveness and applicability of the proposed method in real-time TTR.


