A Machine Learning Approach for Indonesian Question Answering System

A. Purwarianti, M. Tsuchiya, and S. Nakagawa (Japan)


natural language processing, machine learning, question answering, question classifier, answer finder, Indonesian


We investigate a machine learning approach to building an Indonesian question answering (QA) system. Based on our experimental results on the question classification task, we chose SVM as the machine learning algorithm. As in ordinary QA systems, we divide our system into three components: a question classifier, a passage retriever, and an answer finder. The SVM algorithm is employed in the question classifier and answer finder modules. To compensate for the scarcity of language resources for Indonesian, we introduce a bi-gram frequency attribute extracted from a downloaded newspaper corpus. Our question classifier experiment compares several attribute combinations. A t-test shows that combining the question shallow parser result attribute with the bi-gram frequency attribute gives a significant improvement over the baseline (bag of words). Our question classifier achieves 96% accuracy. We also compare several attribute combinations in the answer finder module. We find that joining the expected answer type (EAT) attribute with the attributes of the question classifier gives a higher mean reciprocal rank (MRR) score than using only the EAT attribute or only the attributes of the question classifier. Our QA system achieves an MRR of 0.52 for exact answers.
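
For readers unfamiliar with the metric, the MRR (Mean Reciprocal Rank) reported above is conventionally computed as the average, over all questions, of 1/rank of the first correct answer, with a score of 0 when no correct answer is returned. The following is a minimal sketch of that standard definition, not the authors' evaluation code. The function names and the toy data are hypothetical, and exact string equality stands in for whatever answer-matching rule the actual evaluation used.

```python
from typing import List, Sequence, Tuple


def reciprocal_rank(ranked_answers: Sequence[str], gold: str) -> float:
    """Return 1/rank of the first answer equal to `gold`, or 0.0 if absent.

    Assumption: exact string match; a real evaluation may normalise answers.
    """
    for rank, answer in enumerate(ranked_answers, start=1):
        if answer == gold:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[Tuple[Sequence[str], str]]) -> float:
    """Average the reciprocal rank over (ranked_answers, gold) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ranked, gold) for ranked, gold in runs) / len(runs)


# Hypothetical toy data: three questions with ranked candidate answers.
example_runs = [
    (["Jakarta", "Bandung"], "Jakarta"),   # correct at rank 1 -> 1.0
    (["1945", "1949"], "1949"),            # correct at rank 2 -> 0.5
    (["Soekarno"], "Hatta"),               # not found         -> 0.0
]
example_mrr = mean_reciprocal_rank(example_runs)
```

Under this definition, a score of 0.52 is consistent with the first correct exact answer appearing near rank 2 on average, although the same score can also result from a mix of rank-1 hits and complete misses.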
