Abstract: To address task allocation, route planning, and command and control of unmanned underwater vehicles (UUVs), a new control implementation method is proposed: a command agent based on deep reinforcement learning that replaces human-in-the-loop or automatic command and control. A UUV command agent training platform is built, the scenarios required for learning and training are designed, state design, data adaptation, decision analysis, and rule base establishment are carried out, and the proximal policy optimization (PPO) reinforcement learning algorithm is selected for training. The command agent produced by training is then verified in application. The results show that the command agent can effectively carry out task allocation, route planning, and command and control of UUVs, and can make bold guesses. With continued optimization of the algorithm, its win rate against the traditional rule-based control method can be improved.