Comparing Improved Language Models for Sentence Retrieval in Question Answering

Publication date

2007-10

Authors

Merkel, Andreas
Klakow, Dietrich

Document Type

Part of book or chapter of book

Abstract

A retrieval system is an essential part of a question answering framework: it reduces the number of documents that must be considered when searching for an answer. For further refinement, the documents are split into smaller chunks to deal with topic variability in larger documents. In our case, we divided the documents into single sentences. A language model based approach was then used to re-rank the sentence collection. For this purpose, we developed a new language model toolkit. It implements all standard language modeling techniques and is more flexible than other tools in terms of backing-off strategies, model combinations, and design of the retrieval vocabulary. With the aid of this toolkit we conducted re-ranking experiments with standard language model based smoothing methods. On top of these algorithms we developed some new, improved models, including dynamic stop word reduction and stemming. We also experimented with query expansion depending on the type of a query. On a TREC corpus, we demonstrate that our proposed approaches outperform the standard methods: Mean Reciprocal Rank (MRR) improves from 0.31 to 0.39.
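
The re-ranking and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the toolkit from the paper: the function names (`tokenize`, `rerank`, `mean_reciprocal_rank`), the whitespace tokenizer, the choice of Dirichlet smoothing as one example of a standard smoothing method, and the value of `mu` are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: plain lowercase whitespace tokenization, no stemming or stop words.
    return text.lower().split()


def rerank(query, sentences, mu=100.0):
    """Return sentence indices sorted by Dirichlet-smoothed query log-likelihood."""
    docs = [Counter(tokenize(s)) for s in sentences]

    # Background (collection) model built from all candidate sentences.
    background = Counter()
    for doc in docs:
        background.update(doc)
    total = sum(background.values())
    vocab_size = len(background)

    def log_prob(word, doc):
        # Add-one estimate for the background so unseen query words
        # never produce log(0); this is an illustrative simplification.
        p_bg = (background[word] + 1) / (total + vocab_size)
        doc_len = sum(doc.values())
        return math.log((doc[word] + mu * p_bg) / (doc_len + mu))

    query_words = tokenize(query)
    scores = [sum(log_prob(w, doc) for w in query_words) for doc in docs]
    return sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)


def mean_reciprocal_rank(rankings, relevant):
    """rankings: one ranked index list per query; relevant: one set of correct indices per query."""
    total = 0.0
    for ranking, rel in zip(rankings, relevant):
        for rank, idx in enumerate(ranking, start=1):
            if idx in rel:
                total += 1.0 / rank  # only the first correct sentence counts
                break
    return total / len(rankings)


if __name__ == "__main__":
    sentences = [
        "the cat sat on the mat",
        "paris is the capital of france",
        "dogs bark loudly",
    ]
    order = rerank("capital of france", sentences)
    print(order, mean_reciprocal_rank([order], [{1}]))
```

MRR averages the reciprocal of the rank at which the first correct sentence appears, so a gain from 0.31 to 0.39 means the first answer-bearing sentence moves noticeably closer to the top of the list on average.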
