Predicting author profiles from online abuse directed at public figures
Publication date
2022-03
Document Type
Article
License
taverne
Abstract
The problem of online threats and abuse directed at public figures could potentially be mitigated with a computational approach, in which the sources of abusive language are better understood or identified through author profiling. However, abusive language constitutes a specific domain of language, and it has not been tested whether differences in this domain emerge based on the personality, age, or gender of text authors. The present study introduces a unique data set of 789 abusive messages directed at politicians. It examines statistical relationships between author demographics and (abusive) language, then uses a machine learning approach to predict personality, age, and gender from the language in the texts. Results showed that (a) personality traits could be determined within 10% of their actual value, (b) age was determined with an error margin of 10 years, and (c) gender was classified correctly in 70% of the cases. Even though we found statistically significant relationships between language use and demographics, prediction performance was poor compared to previous research on author profiling. We therefore suggest that further research is needed before author profiling systems can be of significant value in the context of abusive language and threat assessment.
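To make the reported figures easier to read, the sketch below shows one common way such numbers are computed: a mean absolute error for a continuous target such as age, and accuracy for a categorical target such as gender. The abstract does not say which error metrics the study used, so the choice of metrics here is an assumption, the function names are hypothetical, and the toy values are invented for illustration only and are not data from the study.

```python
from statistics import mean


def mean_absolute_error(actual, predicted):
    """Average absolute difference between true and predicted values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def accuracy(actual, predicted):
    """Share of labels that were predicted correctly."""
    return mean(1 if a == p else 0 for a, p in zip(actual, predicted))


# Toy values, invented purely for illustration (NOT from the study).
true_ages = [25, 40, 33, 58]
pred_ages = [35, 30, 43, 48]
true_gender = ["f", "m", "m", "f", "m"]
pred_gender = ["f", "m", "f", "f", "f"]

age_mae = mean_absolute_error(true_ages, pred_ages)  # error in years
gender_acc = accuracy(true_gender, pred_gender)      # fraction correct

print(f"Age MAE: {age_mae} years, gender accuracy: {gender_acc:.0%}")
```

Read this way, an "error margin of 10 years" would mean that predicted ages miss the true age by about a decade on average, which helps explain why the authors describe performance as poor for threat assessment purposes.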
Keywords
Taverne
Citation
Vegt, I V D, Kleinberg, B & Gill, P 2022, 'Predicting author profiles from online abuse directed at public figures', Journal of Threat Assessment and Management, vol. 9, no. 1, pp. 17–32. https://doi.org/10.1037/tam0000172