Abstract: To enable Unmanned Aerial Vehicles (UAVs) to obtain an accident-free, shorter, and safer flight path between the take-off point and the end point in complex 3D urban environments, this paper designs an advanced population optimization algorithm based on Proximal Policy Optimization (PPO) for UAV-assisted distribution systems in dynamic environments. The design consists of an improved PPO algorithm, an advanced population optimization algorithm, and a Python experimental platform. Building on the characteristics of the standard PPO algorithm and the Particle Swarm Optimization (PSO) algorithm, this paper proposes a new improvement to PPO, called PPO-PSO. PPO-PSO uses particle swarm optimization to modify the iterative update of the agents, which addresses the neural network's poor local search ability. The convergence of the algorithm is demonstrated, and simulations are carried out in a Python environment to verify its effectiveness. The simulation results show that PPO-PSO converges faster, solves faster, and is more robust.
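
For orientation, the particle update that PSO contributes can be sketched in a few lines of Python. This is only the textbook PSO velocity and position rule applied to a flat parameter vector, not the PPO-PSO coupling itself, which the abstract does not specify. The function names (`pso_minimize`, `toy_loss`), the coefficient values, and the quadratic objective standing in for a policy loss are all illustrative assumptions.

```python
import random


def pso_minimize(objective, dim, n_particles=20, n_iters=100,
                 inertia=0.7, c_cognitive=1.5, c_social=1.5, seed=0):
    """Textbook PSO (assumed settings); returns (best_position, best_value)."""
    rng = random.Random(seed)
    # Each particle is a candidate parameter vector with its own velocity.
    pos = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Personal bests and the swarm-wide (global) best seen so far.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia term + pull toward personal best
                #            + pull toward global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_cognitive * r1 * (pbest[i][d] - pos[i][d])
                             + c_social * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_loss(theta):
    # Hypothetical stand-in for a policy loss; minimum at theta = (0.5, ...).
    return sum((x - 0.5) ** 2 for x in theta)


best_theta, best_loss = pso_minimize(toy_loss, dim=3)
print(best_theta, best_loss)
```

In a hybrid scheme, the objective would presumably be replaced by an evaluation of the agent's policy, so that the swarm refines parameters locally where gradient steps stall; that substitution is an assumption here, not a detail taken from the abstract.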