Comparing Improved Language Models for Sentence Retrieval in Question Answering
Publication date
2007-10
Authors
Merkel, Andreas
Klakow, Dietrich
Document Type
Part of book or chapter of book
Abstract
A retrieval system is an essential part of a question answering framework. It reduces
the number of documents to be considered for finding an answer. For further refinement, the
documents are split up into smaller chunks to deal with topic variability in larger documents.
In our case, we divided the documents into single sentences. Then a language model based
approach was used to re-rank the sentence collection.
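
As a rough illustration of language-model-based sentence re-ranking, the sketch below scores each sentence by the query likelihood under a unigram model. It assumes linear interpolation (Jelinek-Mercer) smoothing with the collection model, whitespace tokenization, and a hypothetical function name `rerank` with parameter `lam`; none of these are taken from the paper or its toolkit.

```python
import math
from collections import Counter


def rerank(query, sentences, lam=0.5):
    """Sort sentences by smoothed unigram query likelihood (best first).

    Assumption: linear interpolation between the sentence model and the
    collection model, with weight `lam` on the sentence model.
    """
    tokenized = [s.lower().split() for s in sentences]
    # Collection model: term counts over all candidate sentences.
    collection = Counter(tok for toks in tokenized for tok in toks)
    collection_size = sum(collection.values())
    query_terms = query.lower().split()

    def score(toks):
        counts = Counter(toks)
        total = 0.0
        for term in query_terms:
            if term not in collection:
                # A term unseen in the whole collection cannot
                # discriminate between sentences, so it is skipped.
                continue
            p_coll = collection[term] / collection_size
            p_sent = counts[term] / len(toks) if toks else 0.0
            total += math.log(lam * p_sent + (1.0 - lam) * p_coll)
        return total

    scored = zip(sentences, (score(toks) for toks in tokenized))
    return [s for s, _ in sorted(scored, key=lambda pair: pair[1], reverse=True)]
```

Calling `rerank("capital of france", candidates)` returns the candidate sentences ordered from most to least likely to have generated the query. Stemming, stop word handling, and query expansion are deliberately left out of this sketch.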
For this purpose, we developed a new language model toolkit. It implements all standard
language modeling techniques and is more flexible than other tools in terms of backing-off
strategies, model combinations and design of the retrieval vocabulary. With the aid
of this toolkit we conducted re-ranking experiments with standard language model based
smoothing methods. On top of these algorithms we developed some new, improved models
including dynamic stop word reduction and stemming. We also experimented with query
expansion depending on the type of a query. On a TREC corpus, we demonstrate that our
proposed approaches outperform the standard methods. In terms of Mean Reciprocal Rank (MRR), we observe an improvement from 0.31 to 0.39.
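
For readers unfamiliar with the metric, Mean Reciprocal Rank averages, over all questions, the reciprocal of the rank at which the first correct item appears (counting 0 when none is found). The following is a minimal sketch; the function name and the input format (one list of relevance flags per question, in rank order) are assumptions made for illustration and do not reflect the evaluation scripts used in the paper.

```python
def mean_reciprocal_rank(ranked_relevance):
    """Compute MRR from per-question relevance flags given in rank order.

    ranked_relevance: list of lists of booleans; entry [q][i] is True if
    the item at rank i + 1 for question q is correct.
    """
    if not ranked_relevance:
        return 0.0
    total = 0.0
    for flags in ranked_relevance:
        for rank, is_relevant in enumerate(flags, start=1):
            if is_relevant:
                # Only the first correct item contributes.
                total += 1.0 / rank
                break
    return total / len(ranked_relevance)
```

For example, three questions whose first correct sentences appear at rank 1, at rank 2, and not at all contribute 1, 0.5, and 0, giving an MRR of 0.5.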