ABOUT ME

WELCOME!

I am a senior researcher at Microsoft Research Cambridge. I study the computational principles of intelligence – and turn them into smarter ways to train AI.

(New!) Towards Next Gen Training Algorithms for AI

Progress in AI isn’t just about scaling compute – discovering better learning rules matters just as much. Our recent work uncovers a new, unifying view of LLM training through the lens of loss-landscape symmetries induced by architectures. We compile those insights into ARO (Adaptively Rotated Optimization), a new optimization framework that pushes the efficiency frontier for training LLMs at scale.

Check out our technical report: ARO: A New Lens On Matrix Optimization For Large Models. This work will be featured at the Microsoft Research Forum (Mar 3, 2026). Register now!

Past Research

I maintain a broad interest in core machine learning, with past research focused on advanced methodologies in probabilistic modeling, causal machine learning, and decision-making. My research has translated into high-impact deployments, both inside Microsoft and through external partners. Publicly disclosed examples include AI-driven personalized education (see media coverage: AI helps create personalized math lessons for students).

Bio

Before joining Microsoft, I did my PhD (2018 to early 2023) in the Machine Learning Group, CBL, at the University of Cambridge, supervised by Prof. José Miguel Hernández-Lobato and advised by Prof. Richard Turner. My PhD research focused on probabilistic and causal machine learning. Check out my PhD thesis, Advances in Bayesian Machine Learning: From Uncertainty to Decision Making. During my PhD, I also worked as an intern researcher at Microsoft Research Cambridge (MSRC), under the supervision of Dr. Cheng Zhang. Before joining the University of Cambridge, I obtained an MRes degree in Computational Statistics and Machine Learning from the Department of Computer Science, University College London, supervised by Prof. David Barber. During my time at UCL, my research focused on Stein methods for Bayesian inference on doubly intractable models.