Mining Legal Text to Create a Litigation History Database

M. Chaudhary, C. Dozier, G. Atkinson, G. Berosik, X. Guo, and S. Samler (USA)


Text mining, information extraction, record linkage, co-reference resolution, named entity recognition.


This paper describes a text mining application that creates a litigation history database by extracting information from caselaw text and court dockets. The text mining application combines the techniques of named entity extraction, record linkage, relationship extraction among named entities, co-reference resolution, and document classification to extract information from text and populate a relational database that supports sophisticated litigation trend analysis. The novelty of this system lies in the targeted application of several text-mining techniques to extract entities and relationships from legal documents in such a way that they align with the semantics of a relational database designed for the delivery of litigation trend analysis. To populate this database, we extracted and stored over 90 million litigation representation facts that identify attorneys representing client companies in U.S. courts from 1990 to the present.
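As a rough illustration of what a "litigation representation fact" in a relational database could look like, the following is a minimal sketch and not the system described in this paper. The table names, column names, the `normalize` and `resolve` helpers, and the sample names are all hypothetical. Matching on a normalized name string stands in for the record linkage step, which in a real system would be far more involved.

```python
import re
import sqlite3

# Hypothetical schema: one row per extracted representation fact.
SCHEMA = """
CREATE TABLE attorney (id INTEGER PRIMARY KEY, name TEXT, match_key TEXT UNIQUE);
CREATE TABLE company  (id INTEGER PRIMARY KEY, name TEXT, match_key TEXT UNIQUE);
CREATE TABLE representation (
    attorney_id INTEGER NOT NULL REFERENCES attorney(id),
    company_id  INTEGER NOT NULL REFERENCES company(id),
    court       TEXT NOT NULL,
    year        INTEGER NOT NULL
);
"""

def normalize(name):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9 ]", " ", name.lower()).split())

def resolve(conn, table, name):
    """Toy record linkage: reuse the row whose normalized name matches."""
    if table not in ("attorney", "company"):
        raise ValueError("unknown entity table")
    key = normalize(name)
    row = conn.execute(
        f"SELECT id FROM {table} WHERE match_key = ?", (key,)
    ).fetchone()
    if row:
        return row[0]
    cur = conn.execute(
        f"INSERT INTO {table} (name, match_key) VALUES (?, ?)", (name, key)
    )
    return cur.lastrowid

def add_fact(conn, attorney, company, court, year):
    """Store one 'attorney represented company in court in year' fact."""
    conn.execute(
        "INSERT INTO representation VALUES (?, ?, ?, ?)",
        (resolve(conn, "attorney", attorney),
         resolve(conn, "company", company), court, year),
    )

def cases_per_year(conn, company):
    """A simple trend query: representation facts per year for a company."""
    return conn.execute(
        """SELECT r.year, COUNT(*) FROM representation r
           JOIN company c ON c.id = r.company_id
           WHERE c.match_key = ?
           GROUP BY r.year ORDER BY r.year""",
        (normalize(company),),
    ).fetchall()

def new_db():
    """Create an empty in-memory database with the hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn
```

Under these assumptions, two mentions such as "Acme Corp." and "ACME Corp" (invented names) resolve to the same company row, so trend queries count them together.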
