Improved use of machine learning techniques in named entity recognition

Kitoogo, Fredrick Edward

Date

2009-09

Abstract

The current digital era, and particularly the evolution of the World Wide Web (WWW), has generated a multiplicity of knowledge resources stored in electronic formats. Some of these texts carry some form of resource description framework that describes embedded meta-knowledge such as Author, Title, Date and Subject. The existence of such unexploited knowledge has given rise to the need to utilize large volumes of information from these resources, a key area of natural language processing (NLP). One of the primary methods of NLP used in understanding natural language is Named Entity Recognition (NER), a technique for systematically identifying (component) words and classifying them into predefined entity classes (such as Person, Organization or Location names). Although many approaches to NER have been developed, the complexity of the task has made it a great challenge to develop systems with better performance. The recent trend in tackling the NER problem is the use of machine learning techniques. In this work, we begin with an extensive review of the literature related to the research, then present approaches that embrace three widely used machine learning dynamics for natural language processing: classifier combination, feature engineering and meta-knowledge. We introduce the notion of recursive stacking for NER to improve the classifier combination technique. A multi-objective genetic algorithm (MOGA) and a feature exploration technique are applied for feature engineering. Correspondingly, we formalize the domain independence capability in NER by introducing the concept of domain independent features. Consequently, the idea of meta-knowledge is used to provide a basis for the use of specific classification algorithms as well as their corresponding combinations. To demonstrate the feasibility of the approaches, we induce the models on several data sets, consisting mainly of manually annotated judicial data. Comprehensive experimental results demonstrate the benefits of our approaches. The methods applied in this work are empirically grounded, and the results provide a theoretical justification for integrating the three machine learning dynamics and a fundamental step towards a framework for NER.
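
As an illustration of the classifier combination idea named in the abstract, the following is a minimal sketch of plain (single-level) stacking for token-level NER. It is not the recursive stacking introduced in the thesis, whose details are not given here. The three base taggers, the toy word lists, the labels and the lookup-table meta-classifier are all hypothetical choices made for this example only.

```python
from collections import Counter, defaultdict

# Hypothetical toy word lists, invented for this sketch (not from the thesis).
PERSON_NAMES = {"Alice", "Bob"}
ORG_KEYWORDS = {"Ltd", "Inc", "Court"}


def gazetteer_tagger(tokens, i):
    """Base tagger 1: PER if the token is a known person name."""
    return "PER" if tokens[i] in PERSON_NAMES else "O"


def org_tagger(tokens, i):
    """Base tagger 2: ORG if the token or its right neighbour is an org keyword."""
    here = tokens[i] in ORG_KEYWORDS
    nxt = i + 1 < len(tokens) and tokens[i + 1] in ORG_KEYWORDS
    return "ORG" if here or nxt else "O"


def capital_tagger(tokens, i):
    """Base tagger 3: CAP for a capitalized token that is not sentence-initial."""
    return "CAP" if i > 0 and tokens[i][:1].isupper() else "O"


BASE_TAGGERS = [gazetteer_tagger, org_tagger, capital_tagger]


def base_predictions(tokens, i):
    """Level-0 output: one prediction per base tagger for token i."""
    return tuple(tagger(tokens, i) for tagger in BASE_TAGGERS)


def train_meta(sentences):
    """Level-1 learner: map each tuple of base predictions to the gold
    label most often seen with it in the annotated sentences."""
    counts = defaultdict(Counter)
    for tokens, gold in sentences:
        for i, label in enumerate(gold):
            counts[base_predictions(tokens, i)][label] += 1
    return {combo: c.most_common(1)[0][0] for combo, c in counts.items()}


def predict(meta, tokens):
    """Tag a sentence; unseen combinations of base predictions fall back to O."""
    return [meta.get(base_predictions(tokens, i), "O") for i in range(len(tokens))]


# Tiny hand-made training set (illustrative only).
TRAIN = [
    (["Yesterday", "Alice", "sued", "Acme", "Ltd"], ["O", "PER", "O", "ORG", "ORG"]),
    (["The", "High", "Court", "heard", "Bob"], ["O", "ORG", "ORG", "O", "PER"]),
]

meta = train_meta(TRAIN)
print(predict(meta, ["Then", "Bob", "joined", "Zenith", "Inc"]))
```

In this sketch the meta-level learner sees only the outputs of the base taggers, never the tokens themselves, which is the defining property of stacking. A realistic system would replace the rule-based taggers with trained classifiers and train the meta-level on held-out predictions to limit overfitting.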

URI

http://hdl.handle.net/10570/495

Collections

School of Computing and Informatics Technology (CIT) Collection