LLM-STDAN: SPATIO-TEMPORAL DATA AUGMENTATION NETWORK OF A LARGE LANGUAGE MODEL FOR UNSUPERVISED RAILWAY SAFETY RISK DIVISION

Yifan Wang; Lingbin Bu; Liang Zhang; Qiming Ma; Zhiyuan Li; Qi Wang; Fanliang Bu

doi:10.2316/J.2026.201-0582

Keywords

Data augmentation, railway safety, unsupervised clustering, risk division

Abstract

Railway safety risk assessment is a key task for ensuring the reliability of the transport system, but traditional methods rely on manually labelled data and struggle to capture complex spatio-temporal correlation features. To address this, this paper proposes a large language model spatio-temporal data augmentation network (LLM-STDAN), which uses unsupervised learning to classify railway safety risk states and levels. The model uses large language models, multiple attention mechanisms, and contrastive learning to construct text data augmentation, spatio-temporal information fusion, and unsupervised clustering modules, respectively, improving the effectiveness and stability of unsupervised clustering. Empirical tests on the railway accident dataset and operational environment data from the Federal Railroad Administration show that LLM-STDAN outperforms the benchmark models in the unsupervised accident division task, with a clustering profile coefficient index of 98.7%.
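
The abstract does not define the clustering coefficient it reports. Assuming it is the standard silhouette coefficient (with 98.7% read as a score of 0.987 on the scale from -1 to 1), the following minimal sketch shows how such a score is computed for a labelled clustering. The function name `silhouette_score`, the Euclidean distance, and the toy points are illustrative assumptions and are not taken from the paper.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance).

    Illustrative sketch only: assumes the reported coefficient is the
    standard silhouette coefficient, which the abstract does not confirm.
    """
    # Group point indices by cluster label.
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette coefficient needs at least two clusters")

    def mean_dist(i, members):
        # Mean distance from point i to every other point in `members`.
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    scores = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # common convention: singleton clusters score 0
            continue
        a = mean_dist(i, own)  # cohesion: mean distance within the own cluster
        b = min(  # separation: mean distance to the nearest other cluster
            mean_dist(i, members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append(0.0 if denom == 0 else (b - a) / denom)
    return sum(scores) / len(scores)


# Toy data (hypothetical): two tight groups that are far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(points, [0, 0, 1, 1])  # labels match the groups
bad = silhouette_score(points, [0, 1, 0, 1])   # labels cut across the groups
print(f"good split: {good:.3f}, bad split: {bad:.3f}")
```

Values near 1 indicate compact, well-separated clusters, while values near 0 or below indicate overlapping or misassigned points. Under this reading, a score of 0.987 would correspond to almost perfectly separated risk groups.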

From Journal (201) Mechatronic Systems and Control - 2026