Ultrafast and accurate sequence alignment and clustering of viral genomes

Publication date

2025-06

Authors

Zielezinski, Andrzej
Gudyś, Adam
Barylski, Jakub
Siminski, Krzysztof
Rozwalak, Piotr
Dutilh, Bas E. (ISNI 0000000389464735)
Deorowicz, Sebastian

Document Type

Article
Open Access

License

CC BY

Abstract

Viromics produces millions of viral genomes and fragments annually, overwhelming traditional sequence comparison methods. Here we introduce Vclust, an approach that determines average nucleotide identity by Lempel–Ziv parsing and clusters viral genomes with thresholds endorsed by authoritative viral genomics and taxonomy consortia. Vclust demonstrates superior accuracy and efficiency compared to existing tools, clustering millions of genomes in a few hours on a mid-range workstation.

Keywords

Biotechnology, Biochemistry, Molecular Biology, Cell Biology

Citation

Zielezinski, A, Gudyś, A, Barylski, J, Siminski, K, Rozwalak, P, Dutilh, B E & Deorowicz, S 2025, 'Ultrafast and accurate sequence alignment and clustering of viral genomes', Nature Methods, vol. 22, no. 6, pp. 1191-1194. https://doi.org/10.1038/s41592-025-02701-7