Yuda Song

Hi, I am a first-year PhD student in the Machine Learning Department at Carnegie Mellon University. I am co-advised by Aarti Singh and Drew Bagnell, and I also work closely with Wen Sun. I finished my master's degree in MLD, advised by Kris Kitani. I completed my undergraduate studies at UC San Diego with a double major in CS and Math, advised by Sicun Gao.

My email is yudas at andrew dot cmu dot edu.

Google Scholar  /  CV  /  Github  /  Twitter


I am broadly interested in machine learning, especially reinforcement learning. I am currently focusing on provably statistically and computationally efficient settings and algorithms in RL.

Provable Benefits of Representational Transfer in Reinforcement Learning
(alphabetical order) Alekh Agarwal, Yuda Song, Wen Sun, Kaiwen Wang, Mengdi Wang, Xuezhou Zhang
COLT, 2023.

We prove the benefit of representation learning on diverse source environments, which enables efficient learning on the target environment with the learned representation, under the low-rank MDP setting.
The Virtues of Laziness in Model-based RL: A Unified Objective and Algorithms
Anirudh Vemula, Yuda Song, Aarti Singh, J. Andrew Bagnell, Sanjiban Choudhury
ICML, 2023.
Hybrid RL: Using Both Offline and Online Data Can Make RL Efficient
Yuda Song*, Yifei Zhou*, Ayush Sekhari, J. Andrew Bagnell, Akshay Krishnamurthy, Wen Sun
ICLR, 2023.

Combining online and offline data can solve RL with both statistical and computational efficiency. Experiments on Montezuma's Revenge reveal that hybrid RL works much better than pure online RL or pure offline RL.
Representation Learning for General-sum Low-rank Markov Games
Chengzhuo Ni, Yuda Song, Xuezhou Zhang, Chi Jin, Mengdi Wang
ICLR, 2023.
Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning Approach
Xuezhou Zhang, Yuda Song, Masatoshi Uehara, Mengdi Wang, Alekh Agarwal, Wen Sun
ICML, 2022.
[code] [Talk at RL theory seminars]

An efficient rich-observation RL algorithm that learns to decode rich observations into latent states (via adversarial training), while balancing exploration and exploitation.
No-regret Model-Based Meta RL for Personalized Navigation
Yuda Song, Ye Yuan, Wen Sun, Kris Kitani
L4DC, 2022.
Transform2Act: Learning a Transform-and-Control Policy for Efficient Agent Design
Ye Yuan, Yuda Song, Zhengyi Luo, Wen Sun, Kris Kitani
ICLR, 2022. Oral presentation
[Project Page]
PC-MLP: Model-based Reinforcement Learning with Policy Cover Guided Exploration
Yuda Song, Wen Sun
ICML, 2021.

A simple provably efficient model-based algorithm that achieves competitive performance in both dense reward continuous control tasks and sparse reward control tasks that require efficient exploration.
Provably Efficient Model-based Policy Adaptation
Yuda Song, Aditi Mavalankar, Wen Sun, Sicun Gao
ICML, 2020.
[Project Page] [code]

We study Sim-to-Real/policy transfer/policy adaptation under a model-based framework, resulting in an algorithm that enjoys strong theoretical guarantees and excellent empirical performance.
Teaching Assistant

  • UCSD CSE291: Topics in Search and Optimization (Winter 2020)
  • UCSD CSE154: Deep Learning (Fall 2019)
  • UCSD CSE150: Introduction to AI: Search and Reasoning (Winter 2019, Spring 2020)
  • UCSD CSE30: Computer Organization and Systems Programming (Spring 2019, Winter 2018)
  • UCSD CSE11: Introduction to CS & OOP (Fall 2018)

Service

  • Reviewer: ICML (2021-), NeurIPS (2021-), ICLR (2022-), AAAI (2021-2022).
  • Top Reviewer: NeurIPS 2022
