LLM-STDAN: SPATIO-TEMPORAL DATA AUGMENTATION NETWORK OF A LARGE LANGUAGE MODEL FOR UNSUPERVISED RAILWAY SAFETY RISK DIVISION

Yifan Wang, Lingbin Bu, Liang Zhang, Qiming Ma, Zhiyuan Li, Qi Wang, Fanliang Bu

Keywords

Data augmentation, railway safety, unsupervised clustering, risk division

Abstract

Railway safety risk assessment is a key task for ensuring the reliability of the transport system, but traditional methods rely on manually labelled data and struggle to capture complex spatio-temporal correlation features. To address this, this paper proposes a large language model spatio-temporal data augmentation network (LLM-STDAN), which uses unsupervised learning to classify railway safety risk states and levels. The model combines large language models, multiple attention mechanisms, and contrastive learning to build three modules, for text data augmentation, spatio-temporal information fusion, and unsupervised clustering respectively, which together improve the effectiveness and stability of unsupervised clustering. Empirical tests on the railway accident dataset and operational environment data from the Federal Railroad Administration show that LLM-STDAN outperforms the benchmark models on the unsupervised accident division task, achieving a clustering silhouette coefficient of 98.7%.
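
The clustering quality figure reported above appears to be the silhouette coefficient (an assumption based on the usual name of this metric; the abstract does not give its formula). As a minimal sketch of how that metric is commonly computed, and not the authors' evaluation code, the following uses Euclidean distance on hypothetical toy points; the function name `silhouette_score` and the data are illustrative only.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points, using Euclidean distance.

    For each point: a = mean distance to the other members of its own cluster,
    b = smallest mean distance to the members of any other cluster,
    s = (b - a) / max(a, b). The score is the average of s over all points.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    def mean_dist(i, members):
        # Exclude the point itself when it belongs to the cluster.
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    scores = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        a = mean_dist(i, own)
        b = min(mean_dist(i, m) for other, m in clusters.items() if other != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: two tight, well-separated groups.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
toy_labels = [0, 0, 1, 1]
print(f"silhouette = {silhouette_score(toy_points, toy_labels):.3f}")
```

The score lies between -1 and 1, so a value quoted as a percentage, such as 98.7%, would correspond to roughly 0.987 on this scale, meaning points sit far closer to their own cluster than to any other.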
