M. Zhong, M. Georgiopoulos, and G.C. Anagnostopoulos (USA)
decision tree, pruning, Law of Succession
The decision tree classifier is a well-known methodology for classification. It is widely accepted that a fully grown tree is usually over-fit to the training data and thus should be pruned back. In this paper, we analyze the overtraining issue theoretically using a k-norm risk estimation approach with Lidstone's Estimate. Our analysis allows a deeper understanding of decision tree classifiers, especially of how to estimate their misclassification rates using our equations. We propose a simple pruning algorithm based on our analysis and prove its superior properties, including its independence from validation and its efficiency.
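As a minimal illustration of the smoothing step only, the sketch below computes Lidstone's Estimate of the class probabilities at a single leaf, (n_j + λ) / (n + Jλ) for J classes, and the resulting estimated misclassification rate of predicting the majority class. It is not the paper's k-norm risk estimator or its pruning algorithm, whose equations are not given in this abstract; the function names and the default λ = 1 (which reduces to Laplace's Law of Succession) are assumptions made for the example.

```python
from collections import Counter


def lidstone_class_probs(labels, classes, lam=1.0):
    """Lidstone-smoothed class probabilities for the samples in one leaf.

    labels  : class labels of the training samples that reached the leaf
    classes : all class labels in the problem (J = len(classes))
    lam     : smoothing parameter lambda (assumed value; lam=1 gives
              Laplace's Law of Succession)
    """
    counts = Counter(labels)
    n = len(labels)
    denom = n + lam * len(classes)
    return {c: (counts.get(c, 0) + lam) / denom for c in classes}


def lidstone_leaf_error(labels, classes, lam=1.0):
    """Estimated misclassification rate when the leaf predicts its
    most probable class under the smoothed estimate."""
    probs = lidstone_class_probs(labels, classes, lam)
    return 1.0 - max(probs.values())


if __name__ == "__main__":
    # A pure leaf with few samples still gets a nonzero error estimate,
    # unlike the raw training error, which would be 0.
    print(lidstone_leaf_error(["a", "a", "a"], ["a", "b"]))
```

Because the smoothed estimate never reaches zero for small leaves, it penalizes splits that isolate only a handful of training samples, which is the intuition behind using such an estimate without a separate validation set.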