PLM-eXplain: Divide and Conquer the Protein Embedding Space

van Eck, Jan; Gogishvili, Dea; Silva, Wilson; Abeln, Sanne

doi:https://doi.org/10.1093/bioinformatics/btaf631

Files

btaf631.pdf (2.18 MB)

Publication date

2026-01

Authors

van Eck, Jan

Gogishvili, Dea

Silva, Wilson

Abeln, Sanne

DOI

https://doi.org/10.1093/bioinformatics/btaf631

Document Type

Article

Collections

Utrecht University Repository

License

CC BY

Abstract

MOTIVATION: Protein language models (PLMs) have revolutionized computational biology through their ability to generate powerful sequence representations for diverse prediction tasks. However, their black-box nature limits biological interpretation and translation to actionable insights. Bridging this gap requires approaches that maintain predictive performance while providing interpretable explanations of model behaviour.

RESULTS: We present PLM-eXplain (PLM-X), an explainable adapter layer that bridges this gap by factoring PLM embeddings into two complementary components: an interpretable subspace based on established biochemical features, and a residual subspace that retains predictive, non-interpretable information. Using embeddings from ESM2 and ProtBert, PLM-X incorporates well-established properties, including secondary structure and hydropathy, while maintaining high predictive performance. We demonstrate the effectiveness of our approach across three biologically relevant classification tasks: extracellular vesicle association, transmembrane helix prediction, and aggregation propensity prediction. PLM-X enables biological interpretation of model decisions without sacrificing accuracy, offering a generalizable solution for enhancing PLM interpretability across various downstream applications.

AVAILABILITY AND IMPLEMENTATION: Source code and models are available at https://github.com/AIT4LIFE-UU/PLM-eXplain/.
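
The factoring idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the linear encoder and decoder, the toy dimensions, the loss terms, and every name below (`factor_embedding`, `adapter_loss`, `W_enc`, `W_dec`) are assumptions made for illustration only. It shows one plausible way an adapter could split an embedding into an interpretable part, pushed towards known biochemical features, and a residual part that keeps the remaining information.

```python
import numpy as np

# All sizes are hypothetical toy values; real PLM embeddings are much larger.
EMBED_DIM = 16        # assumed embedding width
N_INTERPRETABLE = 4   # assumed number of biochemical feature dimensions
N_RESIDUAL = 6        # assumed width of the residual subspace

rng = np.random.default_rng(0)
# Hypothetical linear encoder/decoder weights (randomly initialised, untrained).
W_enc = rng.normal(scale=0.1, size=(EMBED_DIM, N_INTERPRETABLE + N_RESIDUAL))
W_dec = rng.normal(scale=0.1, size=(N_INTERPRETABLE + N_RESIDUAL, EMBED_DIM))


def factor_embedding(embeddings):
    """Project embeddings and split them into (interpretable, residual) parts."""
    latent = embeddings @ W_enc
    return latent[:, :N_INTERPRETABLE], latent[:, N_INTERPRETABLE:]


def adapter_loss(embeddings, known_features):
    """Assumed objective: align the interpretable part with known features
    and reconstruct the original embedding from both parts together."""
    interpretable, residual = factor_embedding(embeddings)
    alignment = np.mean((interpretable - known_features) ** 2)
    latent = np.concatenate([interpretable, residual], axis=1)
    reconstruction = np.mean((latent @ W_dec - embeddings) ** 2)
    return alignment + reconstruction


# Synthetic stand-ins for per-residue embeddings and feature annotations.
embeddings = rng.normal(size=(5, EMBED_DIM))
known_features = rng.normal(size=(5, N_INTERPRETABLE))

interpretable, residual = factor_embedding(embeddings)
loss = adapter_loss(embeddings, known_features)
```

In a sketch of this kind, a downstream classifier would take the concatenated interpretable and residual parts as input, so that the contribution of each interpretable dimension to a prediction can be inspected separately from the residual.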

Keywords

Statistics and Probability, Biochemistry, Molecular Biology, Computer Science Applications, Computational Theory and Mathematics, Computational Mathematics

Citation

van Eck, J, Gogishvili, D, Silva, W & Abeln, S 2026, 'PLM-eXplain: Divide and Conquer the Protein Embedding Space', Bioinformatics (Oxford, England), vol. 42, no. 1, btaf631. https://doi.org/10.1093/bioinformatics/btaf631

URI

https://dspace.library.uu.nl/handle/1874/480132
