L. Ru, Z. Tong, Y. Liu, and S. Ma (PRC)
Chinese Name Recognition, Statistical Analysis, Unknown Word Recognition.
In this paper, we propose a unified solution for Chinese name recognition based on statistical analysis of large-scale Chinese Web corpora. In our approach, a Chinese name is identified according to its component, context, and structure features. The likelihood that a three-character string is a Chinese name is calculated from statistics gathered over a Web corpus containing over 100 million Web pages and 24 million Chinese names. Experimental results on a widely adopted annotated Chinese corpus show that the method is effective, achieving 93% precision and 89% recall.
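The abstract does not give the scoring formula, so the following is only a minimal sketch of how component statistics might be turned into a score for a three-character string. It assumes the first character is the surname and the other two form the given name, and it ignores the context and structure features mentioned above. The functions `train` and `name_score`, the ratio-based score, and the toy data are all hypothetical illustrations, not the authors' method.

```python
from collections import Counter

def train(names, non_name_text):
    """Collect per-character counts from known names and from non-name text."""
    surname = Counter(n[0] for n in names)               # first character of each name
    given = Counter(ch for n in names for ch in n[1:])   # remaining characters
    other = Counter(non_name_text)                       # characters seen outside names
    return surname, given, other

def name_score(s, surname, given, other):
    """Hypothetical score: product of per-position 'used in a name' ratios."""
    if len(s) != 3:
        raise ValueError("expected a three-character string")

    def ratio(counter, ch):
        total = counter[ch] + other[ch]
        return counter[ch] / total if total else 0.0

    return ratio(surname, s[0]) * ratio(given, s[1]) * ratio(given, s[2])

# Toy data, invented purely for illustration.
surname, given, other = train(
    ["王小明", "李小红", "王大明"],
    "今天天气很好大家都来",
)
score_name = name_score("王小明", surname, given, other)
score_mixed = name_score("王大明", surname, given, other)
score_text = name_score("今天好", surname, given, other)
```

In this toy setup a string whose characters appear only inside names scores highest, a string containing a character that also occurs in ordinary text scores lower, and a string made only of non-name characters scores zero. A real system would estimate these counts from a corpus of the scale described above and would add context and structure evidence.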