Abstract: To address the difficulty of training and modeling multiple UAVs in complex obstacle environments, an end-to-end motion planning method for multi-UAV formations flying through a forest is proposed, based on chain training and the proximal policy optimization (PPO) algorithm with heuristic information. Taking into account the dynamic characteristics of the UAVs and the complexity of the three-dimensional continuous environment, an effective reinforcement learning training method for the motion planning policy is designed. Simulation results on the multi-UAV formation forest-traversal task show that the method reaches the target point while avoiding obstacles and maintaining a degree of formation stability, and that it outperforms the traditional artificial potential field method in both formation stability and passing rate. This study provides a new perspective and solution for autonomous navigation and path planning of UAV formations in complex environments.