2605.15001v3 [about] cs.LG cs.CV eess.AS
On the geometry of learned representations
- 1 Department of Biomedical Engineering, University of Basel
- 2 Visiting Researcher, EPFL
(Submitted on 13 Apr 2024 (v1), last revised 15 May 2026 (this version, v3))
I am a final-year PhD student at the University of Basel, supervised by Alexander Navarini and Marc Pouly. I also spent time as a visiting researcher at EPFL with Maria Brbić.
My research focuses on the geometry of learned representations: the structure inside a neural network’s embedding space that every downstream decision (similarity, outlier detection, cross-modal alignment) ultimately reads from.
The Platonic Representation Hypothesis claims that as neural networks scale, their representations converge to a single shared geometry. We revisit this claim [1] and find that the reported convergence largely tracks model size rather than genuine alignment. Once we correct for this confound, what remains is agreement on local neighborhood structure, not on global shape. We call this narrower picture the Aristotelian Representation Hypothesis; it pins down which properties of a representation downstream tools can rely on regardless of the encoder.
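To make the distinction between local and global agreement concrete, here is a minimal sketch. It is not the calibrated procedure from [1]: the two measures (mutual k-nearest-neighbor overlap as a local score, linear CKA as a global score), the function names, and the toy data are all assumptions chosen for illustration. The same points are placed on a line in one space and bent into an arc in the other, so neighborhoods are preserved while the overall shape is not.

```python
import numpy as np

def knn_indices(x, k):
    """Indices of the k nearest neighbors (Euclidean) of every row of x."""
    sq = np.sum(x ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    np.fill_diagonal(dist, np.inf)  # a point is never its own neighbor
    return np.argsort(dist, axis=1)[:, :k]

def mutual_knn_overlap(a, b, k=10):
    """Local agreement: mean fraction of k-nearest neighbors shared by a and b."""
    na, nb = knn_indices(a, k), knn_indices(b, k)
    shared = [len(set(r) & set(s)) / k for r, s in zip(na, nb)]
    return float(np.mean(shared))

def linear_cka(a, b):
    """Global agreement: linear centered kernel alignment, in [0, 1]."""
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    cross = np.linalg.norm(a.T @ b, "fro") ** 2
    return float(cross / (np.linalg.norm(a.T @ a, "fro") * np.linalg.norm(b.T @ b, "fro")))

# Toy data (hypothetical): 200 points on a line, and the same points on a 270-degree arc.
rng = np.random.default_rng(0)
t = rng.uniform(0.0, 1.0, size=200)
space_a = t[:, None]
theta = 1.5 * np.pi * t
space_b = np.stack([np.cos(theta), np.sin(theta)], axis=1)

local_score = mutual_knn_overlap(space_a, space_b, k=10)
global_score = linear_cka(space_a, space_b)
print(f"mutual k-NN overlap: {local_score:.2f}, linear CKA: {global_score:.2f}")
```

In this toy setting the neighbor overlap stays close to 1 while the linear CKA score drops noticeably below it, which is the kind of gap between local and global agreement the paragraph above describes. The sketch does not attempt the correction for model size.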
A learned representation also carries traces of the data it was trained on, beyond what the labels reveal. Within a single dataset, isolated samples are usually off-topic, tight pairs are usually near-duplicates, and label outliers are usually mislabeled, which turns the representation space itself into a label-free auditor for image [3, 4] and audio [5] collections. Across two modalities, preserving each side’s local structure regularizes alignment when paired data is scarce, reducing the number of paired samples required from millions to tens of thousands [2]. Across the collections of an entire scientific field, harmonizing many representations into a single latent space lets us audit not one dataset but a whole discipline. We do this for digital dermatology, where 1.1 million images across 29 public archives yield the first quantitative atlas of the field and expose what those archives do and do not cover [6].
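The three audit signals for a single dataset can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the methods of [3, 4, 5]: the `audit` function, its cosine-distance scores, the choice of `k`, and the synthetic embeddings are all hypothetical. Samples are ranked by how far they sit from their nearest neighbors (isolation), pairs by how close they are (near-duplicates), and samples by how often their neighbors carry a different label (label disagreement).

```python
import numpy as np

def audit(emb, labels, k=5, top=5):
    """Rank candidate data-quality issues from embeddings alone (illustrative)."""
    z = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    dist = 1.0 - z @ z.T                      # cosine distance
    np.fill_diagonal(dist, np.inf)            # ignore self-matches
    order = np.argsort(dist, axis=1)[:, :k]   # k nearest neighbors per sample

    # Isolation: mean distance to the k nearest neighbors (high = candidate off-topic).
    isolation = np.take_along_axis(dist, order, axis=1).mean(axis=1)

    # Near-duplicates: the closest pairs overall, each pair listed once (i < j).
    i, j = np.triu_indices(len(z), k=1)
    pair_rank = np.argsort(dist[i, j])[:top]
    dup_pairs = [(int(i[r]), int(j[r])) for r in pair_rank]

    # Label disagreement: fraction of neighbors whose label differs from the sample's own.
    disagreement = (labels[order] != labels[:, None]).mean(axis=1)
    return isolation, dup_pairs, disagreement

# Synthetic embeddings (hypothetical): two classes plus three planted issues.
rng = np.random.default_rng(0)
axes = np.eye(8)
emb = np.concatenate([
    axes[0] + 0.05 * rng.normal(size=(20, 8)),  # class 0
    axes[1] + 0.05 * rng.normal(size=(20, 8)),  # class 1
    axes[2:3],                                  # index 40: off-topic sample
])
emb[1] = emb[0] + 1e-4 * rng.normal(size=8)     # index 1: near-duplicate of index 0
labels = np.array([0] * 20 + [1] * 20 + [0])
labels[5] = 1                                   # index 5: mislabeled sample

isolation, dup_pairs, disagreement = audit(emb, labels)
print("most isolated sample:", int(np.argmax(isolation)))
print("closest pair:", dup_pairs[0])
print("label disagreement of sample 5:", disagreement[5])
```

Each score produces a ranking rather than a verdict, so in practice the top of each list would still be reviewed by a person before anything is removed or relabeled.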
Outside research, I spend my time with my family and on the bike.
Recent activity
| Date | Update |
|---|---|
| May 01, 2026 | “Revisiting the Platonic Representation Hypothesis: An Aristotelian View” has been accepted at ICML 2026 (Seoul). |
| Apr 01, 2026 | Gave an invited talk on “Revisiting the Platonic Representation Hypothesis: An Aristotelian View” at the ELIZA Guest Lecture Series at University of Freiburg, hosted by Thomas Brox. |
| Feb 20, 2026 | New preprint: “Revisiting the Platonic Representation Hypothesis: An Aristotelian View” is now on arXiv! |
| Jan 20, 2026 | Our paper “Representation-Based Data Quality Audits for Audio” has been accepted at ICASSP 2026 in Barcelona, Spain! |
| Oct 28, 2025 | Our paper “Clinical Uncertainty Impacts Machine Learning Evaluations” has been accepted at ML4H 2025 in San Diego, US! |
References (selected)
- [1] Revisiting the Platonic Representation Hypothesis: An Aristotelian View. International Conference on Machine Learning (ICML), 2026. arXiv:2602.14486
- [2] With Limited Data for Multimodal Alignment, Let the STRUCTURE Guide You. Advances in Neural Information Processing Systems (NeurIPS), 2025. arXiv:2506.16895
- [3] Intrinsic Self-Supervision for Data Quality Audits. Advances in Neural Information Processing Systems (NeurIPS), 2024. arXiv:2305.17048
- [4] CleanPatrick: A Benchmark for Image Data Cleaning. Journal of Data-centric Machine Learning Research (DMLR), 2026. arXiv:2505.11034
- [5] Representation-Based Data Quality Audits for Audio. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2026. arXiv:2509.26291
- [6] A Global Atlas of Digital Dermatology to Map Innovation and Disparities. Preprint, 2025. arXiv:2601.00840
Submission history
From: Fabian Gröger <fabian.groeger@unibas.ch>
- [v3] 15 May 2026 Research statement reframed around representation geometry.
- [v2] 12 May 2026 Added project deep-walkthroughs (ARH, quality-audits, SkinMap).
- [v1] 13 Apr 2024 Initial version.