Building a Large-scale Persona Dialog Dataset
Publication date
2018-11-08
Abstract
We propose a preliminary version of a large-scale, multi-turn dialogue dataset in Chinese that contains over 25 million dialogue sessions crawled from Weibo. Diverse personality traits are collected for each dialogue participant to facilitate modelling persona in dialogues. Our dataset fills a gap in the resources available for building personalised dialogue systems in open-domain conversations, and it can also serve as an important resource for a wide range of studies.
Citation
Zheng, Y, Chen, G & Huang, M 2018, 'Building a Large-scale Persona Dialog Dataset', The workshop on natural language generation for human robot interaction, Tilburg, Netherlands, 8/11/18. <https://hbuschme.github.io/nlg-hri-workshop-2018/assets/papers/Zheng-NLG-HRI-Workshop-2018.pdf>