A Turkish Automatic Text Summarization System

Z. Altan (Turkey)


Natural Language Processing, Text Summarization, Corpus, Frequency


The system developed in this study takes a Turkish text as input and, after applying a sequence of procedures, produces a summary that meets the target number of sentences. The study has been specialized to obtain more meaningful results for articles on economic matters. We converted the content of all articles into HTML documents to give them a formal structure. The training system uses a corpus of 50 different articles. Moreover, a summarized document can be added to the corpus on demand. The summarization percentage is chosen by the user. We used the Internet both as the development environment of the system and as its running platform; the program interface runs on the Web through browsers. The study relies principally on a statistical analysis of the paragraphs, sentences and words in the document according to predefined weighting factors. Although these weights determine the skeleton of the summary, the sentences for the summary are chosen primarily to preserve semantic integrity.
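
The frequency-based selection described above can be illustrated with a minimal sketch. It is not the system's actual implementation: the scoring rule (average word frequency within the document, with a bonus for the first sentence), the naive sentence splitter, and all names such as `summarize` and `first_sentence_bonus` are assumptions made only for illustration. The user-chosen summarization percentage is converted into a target number of sentences, and the selected sentences are returned in their original order.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lowercased word tokens. \\w is Unicode-aware, so Turkish letters are kept.
    Note: str.lower() is not Turkish-aware for the dotted/dotless I pair."""
    return re.findall(r"\w+", sentence.lower())


def summarize(text, percentage, first_sentence_bonus=1.5):
    """Return the highest-scoring sentences, in document order.

    percentage: share of sentences to keep (0-100), chosen by the user.
    first_sentence_bonus: hypothetical positional weighting factor.
    """
    sentences = split_sentences(text)
    if not sentences:
        return []

    # Word frequencies over the whole document.
    freq = Counter(w for s in sentences for w in tokenize(s))

    scores = []
    for i, s in enumerate(sentences):
        words = tokenize(s)
        # Average frequency, so long sentences are not favored by length alone.
        score = sum(freq[w] for w in words) / len(words) if words else 0.0
        if i == 0:
            score *= first_sentence_bonus
        scores.append(score)

    # Convert the percentage into a target sentence count (at least one).
    target = max(1, math.ceil(len(sentences) * percentage / 100))

    # Rank by score (ties broken by position), then restore document order.
    ranked = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    chosen = sorted(ranked[:target])
    return [sentences[i] for i in chosen]
```

Calling `summarize(text, 50)` on a four-sentence text keeps two sentences. A real system along the lines described here would replace the single positional bonus with its predefined paragraph, sentence and word weighting factors, and would add a step that checks the selected sentences for semantic integrity.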

