Chen Zhang and Chongyang Shi
Entity extraction, Geographical name extraction, Geographical name classification
A geographical name is a conventional linguistic sign that denotes not only geographic entities but is also widely used in naming other kinds of entities, which significantly reduces the accuracy of geographical name extraction from news. In this paper, the characteristics of geographical names in news are analyzed, and the names are divided into three classes: geographic entities, organizational entities, and other entities. To address the imbalance among these three classes in news, a SMOTE-based random walk algorithm is proposed to classify and identify geographical names.
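As a point of reference for the oversampling idea, the following is a minimal sketch of the standard SMOTE interpolation step only. It is not the SMOTE-based random walk algorithm proposed in this paper, whose details are not given here. The function name `smote_oversample`, the parameter `k`, and the toy feature vectors are all illustrative assumptions.

```python
import random


def smote_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic samples for a minority class (plain SMOTE).

    Each synthetic sample lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical 2-D feature vectors for an under-represented class.
rare = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_oversample(rare, n_new=6)
```

Because every synthetic point is a convex combination of two existing minority samples, the new points stay inside the region already occupied by that class, which is the property that makes this kind of oversampling a plausible remedy for class imbalance.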