Similarity Rules! Exploring Methods for Ad-Hoc Rule Detection

Publication date

2008-11

Authors

Dickinson, Markus
Foster, Jennifer

Document Type

Part of book or chapter of book

Abstract

"One problem facing the extraction of treebank grammars is that of ad hoc rules, rules used for constructions specific to one data set and unlikely to be used on new data (Dickinson, 2008). These rules can be erroneous, cover ungrammatical text, or reveal issues with the treebank’s annotation scheme. These are significant problems since training on erroneous data can be detrimental to parsing performance (e.g., Dickinson and Meurers, 2005; Hogan, 2007), and the use of precision grammars in grammar checking and generation requires distinctions between grammatical and ungrammatical sentences (e.g., Bender et al., 2004). Ad hoc rules are especially problematic when they point to inconsistent aspects of the annotation scheme, as the scheme forms the basis of any analysis using it."
