This paper studies a reinforcement learning (RL) based control design problem for a class of discrete-time non-strict feedback multi-agent systems (MAS) and applies the result to multi-marine vehicles (MMV). First, a novel system transformation is proposed for this class of discrete-time MAS; it both resolves the noncausal problem that arises in the backstepping method and reduces the computational complexity. Second, the algebraic-loop problem inherent in conventional controller design is solved by compensating the dynamics and exploiting a property of neural networks (NN). Third, a multi-gradient recursive (MGR) RL scheme is developed to design the optimal controller. Finally, a stability analysis is presented, which ensures that all signals are semi-globally uniformly ultimately bounded (SGUUB) in the Lyapunov sense. To broaden the applicability of the designed controller, the scheme is applied to MMV, which can be described in non-strict feedback form, and the MMV simulation demonstrates the validity of the scheme.
Bai, W., Chen, D., Zhao, B., D'Ariano, A. (2025). Reinforcement Learning Control for a Class of Discrete-Time Non-Strict Feedback Multi-Agent Systems and Application to Multi-Marine Vehicles. IEEE Transactions on Intelligent Vehicles, 10(5), 3613-3625 [10.1109/tiv.2024.3458894].
Reinforcement Learning Control for a Class of Discrete-Time Non-Strict Feedback Multi-Agent Systems and Application to Multi-Marine Vehicles
D'Ariano, Andrea
2025-01-01


