Validating biomarkers and models for epigenetic inference of alcohol consumption from blood

Publication date

2021-10-26

Authors

Maas, Silvana C E
Vidaki, Athina
Teumer, Alexander
Costeira, Ricardo
Wilson, Rory
van Dongen, Jenny
Beekman, Marian
Völker, Uwe
Grabe, Hans J
Kunze, Sonja

Document Type

Article

License

CC BY

Abstract

BACKGROUND: Information on long-term alcohol consumption is relevant for medical and public health research, disease therapy, and other areas. Recently, DNA methylation-based inference of alcohol consumption from blood was reported with high accuracy, but these results were obtained by using the same dataset for model training and testing, which can lead to accuracy overestimation. Moreover, only subsets of alcohol consumption categories were used, which makes it impossible to extrapolate such models to the general population. Using data from eight population-based European cohorts (N = 4677), we internally and externally validated the previously reported biomarkers and models for epigenetic inference of alcohol consumption from blood and developed new models that use data from all categories.
RESULTS: Using data from six European cohorts (N = 2883), we empirically tested the reproducibility of the previously suggested biomarkers and prediction models via ten-fold internal cross-validation. In contrast to previous findings, all seven models based on 144 CpGs yielded lower mean AUCs than the models with fewer CpGs. For instance, the 144-CpG heavy versus non-drinkers model gave an AUC of 0.78 ± 0.06, while the 5-CpG and 23-CpG models each achieved 0.83 ± 0.05. The transportability of the models was empirically tested via external validation in three independent European cohorts (N = 1794), revealing high AUC variance between datasets within models. For instance, the 144-CpG heavy versus non-drinkers model yielded AUCs ranging from 0.60 to 0.84 between datasets. The newly developed models that considered data from all categories showed low AUCs but also low AUC variation in the external validation. For instance, the 144-CpG heavy and at-risk versus light and non-drinkers model achieved AUCs of 0.67 ± 0.02 in the internal cross-validation and 0.61-0.66 in the external validation datasets.
CONCLUSIONS: The outcomes of our internal and external validation demonstrate that the previously reported prediction models suffer from both overfitting and accuracy overestimation. Our results show that the previously proposed biomarkers are not yet sufficient for accurate and robust inference of alcohol consumption from blood. Overall, our findings imply that DNA methylation prediction biomarkers and models need to be improved considerably before epigenetic inference of alcohol consumption from blood can be considered for practical applications.
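
The abstract reports internal cross-validation results as a mean AUC with a spread (for example 0.78 ± 0.06) and external validation results as a range across datasets (for example 0.60 to 0.84). The following is a minimal sketch of how such summaries can be computed from held-out predictions. It is not the authors' pipeline: the function names are hypothetical, the inputs are assumed to be binary labels with model scores, and treating the "±" value as a sample standard deviation across folds is an assumption, since the abstract does not say which spread measure was used.

```python
from statistics import mean, stdev


def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive sample
    scores higher than a randomly chosen negative one (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def summarise_folds(folds):
    """Internal validation summary (hypothetical helper).

    folds: list of (labels, scores) pairs, one per held-out fold.
    Returns (mean AUC, sample standard deviation across folds).
    """
    aucs = [auc(y, s) for y, s in folds]
    return mean(aucs), stdev(aucs)


def external_range(datasets):
    """External validation summary (hypothetical helper).

    datasets: list of (labels, scores) pairs, one per independent cohort.
    Returns (lowest AUC, highest AUC) across the cohorts.
    """
    aucs = [auc(y, s) for y, s in datasets]
    return min(aucs), max(aucs)
```

In this framing, a wide gap between the two values returned by `external_range` corresponds to the high between-dataset AUC variance that the abstract describes for the previously reported models.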

Keywords

Alcohol inference, Blood, DNA methylation, Epigenetics, Inference, Prediction, Molecular Biology, Genetics, Developmental Biology, Genetics (clinical)

Citation

Maas, S C E, Vidaki, A, Teumer, A, Costeira, R, Wilson, R, van Dongen, J, Beekman, M, Völker, U, Grabe, H J, Kunze, S, Ladwig, K-H, van Meurs, J B J, Uitterlinden, A G, Voortman, T, Boomsma, D I, Slagboom, P E, van Heemst, D, van der Kallen, C J H, van den Berg, L H, Waldenberger, M, Völzke, H, Peters, A, Bell, J T, Ikram, M A, Ghanbari, M & Kayser, M 2021, 'Validating biomarkers and models for epigenetic inference of alcohol consumption from blood', Clinical Epigenetics, vol. 13, no. 1, 198. https://doi.org/10.1186/s13148-021-01186-3