Text mining for social science – The state and the future of computational text analysis in sociology

Publication date

2022-11

Authors

Macanovic, A (ISNI 0000000512552387)

Document Type

Article

License

CC BY

Abstract

The emergence of big data and computational tools has introduced new possibilities for using large-scale textual sources in sociological research. Recent work in the sociology of culture, the sociology of science, and economic sociology has shown how computational text analysis can be used in theory building and testing. This review starts with an introduction to the history of computer-assisted text analysis in sociology and then discusses five families of computational methods used in contemporary research. Using exemplary studies, it shows how dictionary methods, semantic and network analysis tools, language models, and unsupervised and supervised machine learning can assist sociologists with different analytical tasks. After presenting recent methodological developments, this review summarizes several important implications of using large datasets and computational methods to infer complex meaning in texts. Finally, it calls on researchers from different methodological traditions to adopt text mining tools while remaining mindful of lessons learned from working with conventional data and methods.

Keywords

Big data, Content analysis, Machine learning, Natural language processing, Text analysis, Text mining, Education, Sociology and Political Science

Citation

Macanovic, A 2022, 'Text mining for social science – The state and the future of computational text analysis in sociology', Social Science Research, vol. 108, 102784, pp. 1-17. https://doi.org/10.1016/j.ssresearch.2022.102784