Putting the t where it belongs: Solving a confusion problem in Dutch
Publication date
2008-11
Authors
Stehouwer, Herman
Bosch, Antal van den
Document Type
Part of book or chapter of book
Abstract
A common Dutch writing error is to confuse a word ending in -d with a neighbor word
ending in -dt. In this paper we describe the development of a machine-learning-based disambiguator
that can determine which word ending is appropriate, on the basis of its local
context. We develop alternative disambiguators, ranging from a single monolithic
classifier to multiple confusable experts that disambiguate between confusable pairs.
Disambiguation accuracy of the best developed disambiguators exceeds 99%; when we apply
these disambiguators to an external test set of collected errors, our detection strategy
correctly identifies up to 79% of the errors.
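
The idea of choosing between -d and -dt from local context, and then flagging a written form that disagrees with the prediction, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two-token context window, the simple overlap-based nearest-neighbour vote, the function names, and the four hand-written training sentences are hypothetical and are not the classifiers, features, or data used in the paper.

```python
from collections import Counter


def context_features(tokens, index, width=2):
    """Return the `width` tokens left and right of position `index`.

    The focus word itself is left out, so the features do not reveal
    which ending was actually written. Missing positions are padded with "_".
    """
    padded = ["_"] * width + list(tokens) + ["_"] * width
    center = index + width
    return tuple(padded[center - width:center] + padded[center + 1:center + 1 + width])


def train(examples, width=2):
    """Store (context, label) pairs; each label is "-d" or "-dt"."""
    return [(context_features(tokens, index, width), label)
            for tokens, index, label in examples]


def classify(memory, tokens, index, width=2):
    """Majority label among the stored contexts with the highest positional overlap."""
    query = context_features(tokens, index, width)
    scored = [(sum(a == b for a, b in zip(query, feats)), label)
              for feats, label in memory]
    best = max(score for score, _ in scored)
    votes = Counter(label for score, label in scored if score == best)
    return votes.most_common(1)[0][0]


def is_suspect(memory, tokens, index):
    """Flag the word at `index` when its written ending differs from the prediction."""
    written = "-dt" if tokens[index].endswith("dt") else "-d"
    return classify(memory, tokens, index) != written


# Hypothetical toy training data, written by hand for this sketch.
examples = [
    ("ik word morgen ziek".split(), 1, "-d"),
    ("hij wordt morgen ziek".split(), 1, "-dt"),
    ("ik vind het mooi".split(), 1, "-d"),
    ("zij vindt het mooi".split(), 1, "-dt"),
]
memory = train(examples)

sentence = "hij word snel boos".split()
print(classify(memory, sentence, 1), is_suspect(memory, sentence, 1))
```

In this toy setup a single classifier handles every -d/-dt word. A per-pair variant would keep one such memory for each confusable pair and route each word to the memory for its own pair.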