tBERT: Topic Models and BERT Joining Forces for Semantic Similarity Detection

Publication date

2020-07

Authors

Peinelt, Nicole
Nguyen, Dong (ISNI 0000000419527451)
Liakata, Maria

Editors

Jurafsky, Dan
Chai, Joyce
Schluter, Natalie
Tetreault, Joel

Document Type

Part of book

Abstract

Semantic similarity detection is a fundamental task in natural language understanding. Adding topic information has been useful for previous feature-engineered semantic similarity models as well as neural models for other tasks. There is currently no standard way of combining topics with pretrained contextual representations such as BERT. We propose a novel topic-informed BERT-based architecture for pairwise semantic similarity detection and show that our model improves performance over strong neural baselines across a variety of English language datasets. We find that the addition of topics to BERT helps particularly with resolving domain-specific cases.

Citation

Peinelt, N, Nguyen, D & Liakata, M 2020, tBERT: Topic Models and BERT Joining Forces for Semantic Similarity Detection. in D Jurafsky, J Chai, N Schluter & J Tetreault (eds), Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, pp. 7047-7055. https://doi.org/10.18653/v1/2020.acl-main.630