Abstract: To address data drift during information clustering in heterogeneous networks, and the inability of current incremental update methods to identify pseudo-relevant data, an incremental update method for heterogeneous network information based on a high-frequency data co-occurrence clustering algorithm is proposed. A drift point deviation mining algorithm is used to locate the drift points in the dataset; datasets containing a large number of drift points are identified as abnormal and removed. The data attributes in each dataset are then analyzed statistically, and attributes that occur frequently within the same dataset are marked. An association feature selection algorithm is employed to extract representative morpheme attributes, and the genuine information that matches user queries is extracted for incremental updating. Experimental results demonstrate that the method successfully converges drifted data points and achieves a precision of 98.2% on pseudo-relevant data, indicating strong feasibility.
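As a rough illustration of the frequency-marking step only, the following minimal sketch counts how often pairs of attributes appear together across records and keeps the pairs that reach a threshold. It is not the paper's algorithm: the function name `frequent_cooccurrences`, the `min_support` threshold, the pairwise counting scheme, and the toy records are all assumptions introduced here for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_cooccurrences(records, min_support=2):
    """Return attribute pairs that appear together in at least `min_support` records.

    Hypothetical helper: the pairwise counting and the threshold are
    illustrative assumptions, not the method described in the abstract.
    """
    pair_counts = Counter()
    for attributes in records:
        # Deduplicate and sort so each unordered pair is counted once per record.
        for pair in combinations(sorted(set(attributes)), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


# Toy records, invented purely for illustration.
records = [
    {"router", "latency", "packet"},
    {"router", "latency"},
    {"packet", "checksum"},
]
marked = frequent_cooccurrences(records, min_support=2)
```

In this toy input, only the pair that appears in two records passes the threshold; a real pipeline would presumably apply such a step after abnormal datasets have been removed.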