Abstract: To address the problems of targets that occupy few pixels, numerous small targets, and complex backgrounds in open-pit mine production scenes, a multiscale and super-resolution network (MS_Net) based on YOLOv5s is proposed. In the feature fusion module, the three-scale detection of PANet is extended to four-scale detection to improve the multi-scale learning ability of the network, and sub-pixel convolution is used as the up-sampling method. A multi-layer fusion (MLF) module is proposed to fuse the features of three output layers of PANet, yielding a feature map rich in both semantic and spatial information. In the prediction layer, SIoU is used as the localization loss function to optimize the model parameters. Experimental results show that MS_Net achieves an mAP of 79.4% at 59 FPS on the PASCAL VOC data set and an mAP of 80.2% at 64.5 FPS on the mine data set, indicating that the model can detect targets in open-pit mines quickly, accurately, and efficiently.
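The sub-pixel convolution mentioned above can be illustrated by its rearrangement step: a convolution first produces C·r² channels at low resolution, and those channels are then interleaved into a C-channel map that is r times larger in each spatial dimension. The sketch below shows only that rearrangement, in plain Python on nested lists. The function name `pixel_shuffle`, the channel ordering (the common convention, also used by PyTorch's `PixelShuffle`), and the toy input are assumptions for illustration, not details taken from MS_Net.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into a (C, H*r, W*r) nested list.

    Assumed channel ordering: input channel c*r*r + i*r + j supplies the
    output pixel at row offset i and column offset j inside each r x r cell.
    """
    channels_in = len(x)
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = channels_in // (r * r)
    h, w = len(x[0]), len(x[0][0])

    # Allocate the upsampled output, filled with zeros.
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]

    for c in range(c_out):
        for i in range(r):          # row offset inside the r x r cell
            for j in range(r):      # column offset inside the r x r cell
                src = x[c * r * r + i * r + j]
                for row in range(h):
                    for col in range(w):
                        out[c][row * r + i][col * r + j] = src[row][col]
    return out


# Toy input (hypothetical): four 1x1 channels become one 2x2 channel.
feat = [[[k]] for k in range(4)]
up = pixel_shuffle(feat, 2)
print(up)  # [[[0, 1], [2, 3]]]
```

Unlike nearest-neighbour or bilinear interpolation, the values placed at each upsampled position come from separate channels produced by a learned convolution, so the upsampling itself is learned.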