Constructing a Valence Lexicon for a Treebank of German
Publication date
2008-11
Authors
Hinrichs, Erhard W.
Telljohann, Heike
Document Type
Part of book or chapter of book
Abstract
"Treebanks allow a valence lexicon to be created as a side effect of annotation. The TüBa-D/Z
valence lexicon has been created in lockstep with the development of the TüBa-D/Z
treebank itself. For each verb encountered in the treebank, the annotators
created a lexical entry that records the verb's valence frame in that sentence,
unless the frame was already contained in the valence lexicon as a result of
previous annotation. The TüBa-D/Z valence lexicon currently contains a total of
8013 frames for 4896 distinct verb lemmas. Since treebank annotation is still ongoing,
the lexicon will continue to grow.
Such a lexicon is useful in its own right as a resource for lexicalized parsing
and a variety of NLP applications. At the same time, the lexicon can help
improve the consistency of annotation and support the automatic detection of
annotation errors."
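
The incremental procedure described in the abstract (record a frame for each verb occurrence unless it is already in the lexicon) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' tooling: the `ValenceLexicon` class, its method names, and the frame labels (`SUBJ`, `ACC`, `DAT`) are hypothetical and do not reflect the actual TüBa-D/Z label set or file format.

```python
from collections import defaultdict

# Assumption: a frame is modelled as a tuple of grammatical-function labels.
# The labels used below are invented for illustration only.
Frame = tuple[str, ...]


class ValenceLexicon:
    """Hypothetical lexicon mapping verb lemmas to their observed frames."""

    def __init__(self) -> None:
        self._frames: dict[str, set[Frame]] = defaultdict(set)

    def add(self, lemma: str, frame: Frame) -> bool:
        """Record `frame` for `lemma`; return True only if it was new."""
        if frame in self._frames[lemma]:
            return False  # already known from previous annotation
        self._frames[lemma].add(frame)
        return True

    def has(self, lemma: str, frame: Frame) -> bool:
        """Check whether `frame` is already attested for `lemma`."""
        return frame in self._frames.get(lemma, set())

    def counts(self) -> tuple[int, int]:
        """Return (total number of frames, number of distinct lemmas)."""
        total = sum(len(frames) for frames in self._frames.values())
        return total, len(self._frames)


# Verb occurrences as they might be met during annotation (invented data).
observations = [
    ("geben", ("SUBJ", "ACC", "DAT")),
    ("geben", ("SUBJ", "ACC")),
    ("geben", ("SUBJ", "ACC", "DAT")),  # repeat: no new entry is created
    ("schlafen", ("SUBJ",)),
]

lexicon = ValenceLexicon()
added = [lexicon.add(lemma, frame) for lemma, frame in observations]
frame_count, lemma_count = lexicon.counts()
```

Under the same assumptions, a lookup such as `lexicon.has("schlafen", ("SUBJ", "ACC"))` returning `False` is the kind of signal that could prompt an annotator to double-check an analysis, in the spirit of the consistency checking mentioned in the abstract.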