Is UMAP accurate? Addressing some fair and unfair criticism.
February 14, 2025 @ 10:00 – 11:00 CET
Nikolay Oskolkov, NBIS, Lund University
NBIS and SciLifeLab Data Centre arrange an open SciLifeLab AI Seminar Series aimed at knowledge-sharing about Artificial Intelligence and its applications in the Life Science community. The seminar series is open to everyone. The seminar is run over Zoom on the third Friday of the month during academic terms, typically between 10 and 11 am, with an approx. 45 min presentation followed by a 15 min discussion.
Abstract
UMAP is a gold-standard dimensionality reduction method in single cell biology, yet it has a controversial reputation and is sometimes heavily criticized; see, for example, [1 – 5]. In particular, the recent Nature publication of the All of Us program [6] gave rise to an avalanche of discussions in the scientific community regarding its controversial UMAP figure of human populations, with critics suggesting that UMAP is not accurate for this purpose. Remarkably, the main criticism of UMAP originates (to the best of my knowledge) from the population genomics community, while the single cell community seems to be satisfied with the quality of UMAP analysis.
In this talk I will discuss peculiarities of the data in single cell and population genomics analyses, and share some insights into the UMAP algorithm that could potentially help resolve the contradiction between the two communities, which study very different research questions. I will also cover the foundations of the PCA, tSNE and UMAP algorithms and emphasize their pros and cons for different types of data in Life Sciences.
[1] https://simplystatistics.org/posts/2024-12-23-biologists-stop-including-umap-plots-in-your-papers/
[2] https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1011288
[3] https://x.com/jkpritch/status/1759769445759893832?lang=en
[4] https://www.nature.com/articles/d41586-024-00568-w
[5] https://www.science.org/content/article/huge-genome-study-confronted-concerns-over-race-analysis
[6] https://www.nature.com/articles/s41586-023-06957-x
Slides