Evaluating deep syntactic parsing : Using TOSCA for the analysis of why-questions

Theijssen, Daphne; Verberne, Suzan; Oostdijk, Nelleke; Boves, Lou

Files

bookpart.pdf (110.03 KB)

Publication date

2007-10

Authors

Theijssen, Daphne

Verberne, Suzan

Oostdijk, Nelleke

Boves, Lou

Document Type

Part of book or chapter of book

Collections

LOTOS

Abstract

Previous research has shown that the high level of detail in the syntactic trees produced by the TOSCA parsing system (Oostdijk 1996) is beneficial to why-question answering (why-QA) (Verberne et al. 2006b). TOSCA is an interactive system, i.e. it requires human verification after automatic tagging and parsing. Since only manually corrected TOSCA output has been offered to the why-QA system until now, an extrinsic evaluation of TOSCA's use in the why-QA system is still needed. In this paper we present a necessary step towards it, namely an intrinsic evaluation of the performance of TOSCA on why-questions, which also enables us to trace elements in the parser that leave room for improvement. The evaluation shows that the modularity of the current TOSCA system has a dramatic effect on its performance: tagging errors and missing syntactic markers radically decrease the coverage and the Parseval scores. Applying the Leaf-Ancestor Assessment metric for parser evaluation, we conclude that the level of detail does not substantially affect parser accuracy. This encourages the automatic use of the parsing component in TOSCA for the purpose of why-QA. A new version of TOSCA is under construction, in which the level of detail in the parses is maintained, while there is no longer a need to separately provide POS tags or insert any syntactic markers.
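
For readers unfamiliar with the Parseval scores mentioned in the abstract, the following is a minimal illustrative sketch of labeled bracket precision, recall and F1. It is not the evaluation code used in the paper. The nested-tuple tree format, the function names `brackets` and `parseval`, and the example sentence are all assumptions made for illustration, and the sketch counts every labeled node without the bracket exclusions that standard Parseval tooling may apply.

```python
from collections import Counter


def brackets(tree, start=0):
    """Collect (label, start, end) spans from a nested-tuple tree.

    A tree is (label, child, child, ...), where each child is either a
    word (str) or another tree. This format is an assumption of the sketch.
    Returns the list of spans and the token position after the tree.
    """
    label, *children = tree
    spans = []
    pos = start
    for child in children:
        if isinstance(child, str):
            pos += 1  # a word covers exactly one token position
        else:
            child_spans, pos = brackets(child, pos)
            spans.extend(child_spans)
    spans.append((label, start, pos))
    return spans, pos


def parseval(gold, predicted):
    """Return labeled bracket (precision, recall, F1) for one sentence."""
    gold_spans = Counter(brackets(gold)[0])
    pred_spans = Counter(brackets(predicted)[0])
    # Multiset intersection: a bracket matches at most as often as it
    # occurs in both trees.
    matched = sum((gold_spans & pred_spans).values())
    n_pred = sum(pred_spans.values())
    n_gold = sum(gold_spans.values())
    precision = matched / n_pred if n_pred else 0.0
    recall = matched / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical gold and parser trees for "why did the parser fail".
gold = ("S", ("ADVP", "why"), "did", ("NP", "the", "parser"), ("VP", "fail"))
pred = ("S", ("ADVP", "why"), "did", ("NP", "the"), ("VP", "parser", "fail"))

print(parseval(gold, pred))
```

In this made-up example the parser output shares the S and ADVP brackets with the gold tree but misplaces the NP and VP boundaries, so two of the four brackets match on each side.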

URI

https://dspace.library.uu.nl/handle/1874/296751
