Order out of Chaos: Construction of Knowledge Models from PDF Textbooks

Alpizar-Chacon, Isaac; Sosnovsky, Sergey

DOI: https://doi.org/10.1145/3395027.3419585

Publication date

2020-09-29

Authors

Alpizar-Chacon, Isaac

Sosnovsky, S.A.

DOI

https://doi.org/10.1145/3395027.3419585

Document Type

Contribution to conference

Collections

Utrecht University Repository

Abstract

Textbooks are educational documents created, structured and formatted by domain experts whose main purpose is to explain the knowledge in a domain to a novice. Authors draw on their understanding of the domain when structuring and formatting the content of a textbook to facilitate this explanation. As a result, the formatting and structural elements of textbooks carry elements of domain knowledge implicitly encoded by their authors. Our paper presents an extensible approach to the automated extraction of this knowledge from textbooks, taking into account their formatting rules and internal structure. We focus on PDF as the most common textbook representation format; however, the overall method is applicable to other formats as well. The evaluation experiments examine the accuracy of the approach, as well as the pragmatic quality of the obtained knowledge models, using one of their possible applications: semantic linking of textbooks in the same domain. The results indicate high accuracy of model construction on the symbolic, syntactic and structural levels across textbooks and domains, and demonstrate the added value of the extracted models on the semantic level.

Keywords

knowledge modeling, model extraction, PDF processing, textbook, Software, Information Systems

Citation

Alpizar-Chacon, I & Sosnovsky, S 2020, 'Order out of Chaos: Construction of Knowledge Models from PDF Textbooks', Paper presented at 20th ACM Symposium on Document Engineering, DocEng 2020, Virtual, Online, United States, 29/09/20 - 1/10/20, pp. 1-10. https://doi.org/10.1145/3395027.3419585

URI

https://dspace.library.uu.nl/handle/1874/415582
