Towards a Multi-Representational Treebank

Publication date

2008-11

Authors

Xia, Fei
Rambow, Owen
Bhatt, Rajesh
Palmer, Martha
Misra Sharma, Dipti

Document Type

Part of book or chapter of book

Abstract

Computational, descriptive, and theoretical linguistics use both phrase structure (PS) and dependency structure (DS) to represent syntax. We believe that the next-generation treebank should be multi-representational, designed for both representations with automatic conversion between them. In this paper, we highlight the assumptions made by existing PS-to-DS and DS-to-PS conversion algorithms and show the limitations of these algorithms. We then propose a new DS-to-PS conversion algorithm that outperforms existing algorithms and allows more flexibility. Our experiments and error analysis show that high-quality DS-to-PS conversion is possible if the conversion process is performed at the design stage of treebank construction to ensure that all information we wish to represent in PS is provided in DS.