A Machine Learning Approach to Anaphora Resolution Including Named Entity Recognition, PP Attachment Disambiguation, and Animacy Detection

Nøklestad, Anders

Doctoral thesis

Year

2009

Abstract

The thesis describes an automatic anaphora resolution (AR) system for Norwegian, focussing on the resolution of pronominal anaphora in fiction material. The system relies primarily on machine learning (ML) methods, and is the first Norwegian AR system to use machine learning. A set of linguistically motivated filters removes incompatible antecedent candidates before the remaining ones are classified as either antecedent or non-antecedent. The closest candidate classified as a suitable antecedent (if any) is selected as the antecedent of the pronoun.
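
The filter, classify, and select procedure described above can be illustrated with a minimal sketch. This is not the thesis implementation: the `Mention` fields, the `agrees` filter, and the toy classifiers are hypothetical names and features chosen only to show the control flow, and a trained classifier would take the place of the lambda functions.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Mention:
    """Hypothetical mention representation; the fields are illustrative only."""
    text: str
    position: int  # token offset in the document
    gender: str    # e.g. "masc", "fem", "neut"
    number: str    # "sg" or "pl"


# (pronoun, candidate) -> keep the candidate?
Filter = Callable[[Mention, Mention], bool]
# (pronoun, candidate) -> is the candidate a suitable antecedent?
Classifier = Callable[[Mention, Mention], bool]


def agrees(pronoun: Mention, candidate: Mention) -> bool:
    """Assumed example filter: require gender and number agreement."""
    return pronoun.gender == candidate.gender and pronoun.number == candidate.number


def resolve(
    pronoun: Mention,
    candidates: Sequence[Mention],
    filters: Sequence[Filter],
    classify: Classifier,
) -> Optional[Mention]:
    """Return the closest preceding candidate that passes every filter and is
    classified as an antecedent, or None if no candidate qualifies."""
    preceding = [c for c in candidates if c.position < pronoun.position]
    compatible = [c for c in preceding if all(f(pronoun, c) for f in filters)]
    # Visit candidates from nearest to farthest and stop at the first acceptance.
    for candidate in sorted(compatible, key=lambda c: pronoun.position - c.position):
        if classify(pronoun, candidate):
            return candidate
    return None


# Toy data, invented for this example.
mentions = [
    Mention("Kari", 0, "fem", "sg"),
    Mention("Ola", 3, "masc", "sg"),
    Mention("Per", 6, "masc", "sg"),
]
pronoun = Mention("han", 9, "masc", "sg")

# A classifier that accepts everything selects the nearest compatible candidate.
nearest = resolve(pronoun, mentions, [agrees], lambda p, c: True)
# A classifier that rejects "Per" falls back to the next closest candidate.
fallback = resolve(pronoun, mentions, [agrees], lambda p, c: c.text != "Per")
```

Because candidates are visited from nearest to farthest, the classifier is only consulted until its first positive decision, and a pronoun with no accepted candidate is left unresolved.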

For the classifier, three different machine learning methods are evaluated and compared: memory-based learning (MBL), maximum entropy modelling (MaxEnt), and support vector machines (SVMs). The methods are tested with default as well as automatically optimized parameter settings. Different pronouns are handled by separate classifiers. Two other knowledge-poor approaches, a factor/indicator-based approach and a Centering Theory approach, are compared to the machine learning methods. The best machine learning approaches perform significantly better than the non-ML approaches and significantly better than the only previously existing Norwegian AR system.
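
To make the idea of separate classifiers per pronoun concrete, the following sketch pairs a very small memory-based learner (nearest-neighbour classification over stored examples, using a simple overlap distance on symbolic features) with a dispatch table keyed on the pronoun form. Everything here is an assumption for illustration: the class name, the two-valued feature tuples, and the training examples are invented, and the thesis does not state that its classifiers look like this.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Features = Tuple[str, ...]
Example = Tuple[Features, bool]  # (feature vector, is_antecedent)


class MemoryBasedClassifier:
    """Minimal k-nearest-neighbour learner over symbolic features (a sketch)."""

    def __init__(self, k: int = 1) -> None:
        self.k = k
        self.memory: List[Example] = []

    def fit(self, examples: Sequence[Example]) -> "MemoryBasedClassifier":
        # "Training" only stores the examples; all work happens at prediction time.
        self.memory = list(examples)
        return self

    @staticmethod
    def overlap_distance(a: Features, b: Features) -> int:
        # Number of feature positions on which the two vectors disagree.
        return sum(x != y for x, y in zip(a, b))

    def predict(self, features: Features) -> bool:
        nearest = sorted(
            self.memory, key=lambda ex: self.overlap_distance(features, ex[0])
        )[: self.k]
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]


# Invented toy training data: (animacy of candidate, syntactic function of candidate).
training: Dict[str, List[Example]] = {
    "han": [(("animate", "subject"), True), (("inanimate", "object"), False)],
    "den": [(("inanimate", "subject"), True), (("animate", "subject"), False)],
}

# One classifier per pronoun form.
classifiers = {
    form: MemoryBasedClassifier(k=1).fit(examples)
    for form, examples in training.items()
}


def classify(pronoun_form: str, features: Features) -> bool:
    """Route the decision to the classifier trained for this pronoun form."""
    return classifiers[pronoun_form].predict(features)
```

Keeping one model per pronoun lets each classifier store only examples relevant to that pronoun, at the cost of having less training data per model.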

The thesis also describes the development and evaluation of three support modules providing information to the AR system: a named entity recognizer, a PP attachment disambiguator, and an animacy detector. Various machine learning methods are tested and compared with respect to the first two modules. The PP module introduces a novel kind of semi-supervised learning, while the animacy detector employs two different procedures for using the World Wide Web to obtain animacy information for nouns. The three support modules are evaluated both as standalone NLP tools and as information sources for the AR system.

In almost all experiments described in this thesis, MBL performs as well as or better than MaxEnt, while the performance of the SVMs is significantly worse.