This paper presents a deep learning algorithm for anomaly detection in high-frequency trading data. The algorithm uses a staged sliding-window Transformer architecture to capture temporal features at multiple scales. Experimental results show that the proposed method outperforms both traditional and deep learning baselines in accuracy, F1-score, and AUC-ROC. The model can support market supervision, although it still produces false positives.
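
To illustrate what multi-scale sliding-window input might look like before it reaches a Transformer, the following is a minimal sketch only. It is not the paper's implementation: the function names, the window sizes (4 and 8 ticks), the half-window stride, and the sample prices are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Sequence


def sliding_windows(series: Sequence[float], window: int, stride: int = 1) -> List[List[float]]:
    """Return every length-`window` slice of `series`, advancing by `stride`."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    # If the series is shorter than the window, the range is empty and no slices are returned.
    return [
        list(series[start:start + window])
        for start in range(0, len(series) - window + 1, stride)
    ]


def multi_scale_windows(
    series: Sequence[float],
    window_sizes: Sequence[int] = (4, 8),  # assumed scales, not taken from the paper
) -> Dict[int, List[List[float]]]:
    """Slice the same series at several window sizes, one list of windows per scale."""
    # Assumed design choice: each scale advances by half its window length.
    return {w: sliding_windows(series, w, stride=max(1, w // 2)) for w in window_sizes}


# Hypothetical tick prices with one abrupt spike at index 5.
prices = [100.0, 100.1, 100.0, 100.2, 100.1, 103.5, 100.2, 100.1, 100.0, 100.1]
scales = multi_scale_windows(prices)
```

In a setup like this, short windows expose abrupt local moves such as the spike, while longer windows provide slower-moving context; each group of windows would then be embedded and passed to the corresponding model stage.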