Research
I am interested in the practical theory of interactive decision-making.
My current research focuses on provably efficient settings and algorithms for Reinforcement Learning that leverage existing data and the structure of the problem.
I am also interested in applying principled decision-making algorithms to large-scale real-world problems.
|
Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models
Yuda Song, Hanlin Zhang, Udaya Ghai, Carson Eisenach, Sham M. Kakade, Dean Foster
NeurIPS Workshop on Mathematics of Modern Machine Learning, 2024.
|
The Importance of Online Data: Understanding Preference Fine-Tuning Through the Lens of Coverage
Yuda Song, Gokul Swamy, Aarti Singh, J. Andrew Bagnell, Wen Sun
NeurIPS, 2024.
We prove that offline contrastive-based methods (e.g., DPO) require a stronger coverage property than online RL-based methods (e.g., RLHF).
We propose Hybrid Preference Optimization to combine the benefits of both offline and online methods.
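For context, the standard DPO objective (written in its usual notation, not necessarily the paper's) is the kind of offline contrastive loss meant here:

    L_{DPO}(\pi_\theta; \pi_{ref}) = - E_{(x, y_w, y_l) \sim D} \left[ \log \sigma\left( \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{ref}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{ref}(y_l \mid x)} \right) \right]

It is trained on a fixed preference dataset D, whereas online RLHF keeps sampling fresh responses from the current policy; the coverage comparison above is about exactly this difference.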
|
Hybrid Reinforcement Learning from Offline Observation Alone
Yuda Song, J. Andrew Bagnell, Aarti Singh
ICML, 2024.
|
Rich-Observation Reinforcement Learning with Continuous Latent Dynamics
Yuda Song, Lili Wu, Dylan J. Foster, Akshay Krishnamurthy
ICML, 2024.
We introduce a new theoretical framework, RichCLD (Rich-Observation RL with Continuous Latent Dynamics), in which the agent performs control based on high-dimensional observations, but the environment is governed by low-dimensional latent states and Lipschitz continuous dynamics.
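Roughly, and in notation of my own choosing rather than the paper's, the setting can be sketched as:

    x_h \sim q(\cdot \mid z_h), \qquad z_h \in Z \subset R^d, \qquad z_{h+1} \sim P(\cdot \mid z_h, a_h),

where the latent transition P is Lipschitz in (z_h, a_h) and the agent only ever observes the high-dimensional x_h.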
|
Provable Benefits of Representational Transfer in Reinforcement Learning
(alphabetical order) Alekh Agarwal, Yuda Song, Wen Sun, Kaiwen Wang, Mengdi Wang, Xuezhou Zhang
COLT, 2023.
[code]
We prove the benefit of representation learning on diverse source environments, which enables efficient learning in the
target environment with the learned representation under the low-rank MDP setting.
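The low-rank MDP assumption referenced here is the standard one: transitions factor through a d-dimensional feature map,

    P(s' \mid s, a) = \langle \phi^*(s, a), \mu^*(s') \rangle, \qquad \phi^*(s, a), \mu^*(s') \in R^d,

and the learned representation plays the role of \phi^*.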
|
The Virtues of Laziness in Model-based RL: A Unified Objective and Algorithms
Anirudh Vemula, Yuda Song, Aarti Singh, J. Andrew Bagnell, Sanjiban Choudhury
ICML, 2023.
[code]
|
Hybrid RL: Using Both Offline and Online Data Can Make RL Efficient
Yuda Song*, Yifei Zhou*, Ayush Sekhari, J. Andrew Bagnell, Akshay Krishnamurthy, Wen Sun
ICLR, 2023.
[code]
[Talk at RL theory seminars]
Combining online and offline data can make RL both statistically and computationally efficient.
Experiments on Montezuma's Revenge show that hybrid RL performs much better
than pure online RL and pure offline RL.
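A minimal sketch of the data-mixing idea (illustrative names, not the code released with the paper): each update draws its minibatch from both the fixed offline dataset and the online replay buffer.

    import random

    def sample_hybrid_batch(offline_data, online_buffer, batch_size, offline_frac=0.5):
        # Mix a fixed offline dataset with online rollouts in every minibatch,
        # so value updates see both offline coverage and fresh exploratory data.
        n_offline = min(int(batch_size * offline_frac), len(offline_data))
        n_online = min(batch_size - n_offline, len(online_buffer))
        return random.sample(offline_data, n_offline) + random.sample(online_buffer, n_online)

Roughly, the paper's algorithm applies fitted-Q-iteration-style updates to data combined this way.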
|
Representation Learning for General-sum Low-rank Markov Games
Chengzhuo Ni, Yuda Song, Xuezhou Zhang, Chi Jin, Mengdi Wang
ICLR, 2023.
|
Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning Approach
Xuezhou Zhang, Yuda Song, Masatoshi Uehara, Mengdi Wang, Alekh Agarwal, Wen Sun
ICML, 2022.
[code]
[Talk at RL theory seminars]
An efficient rich-observation RL algorithm that learns to decode rich observations into latent states
(via adversarial training) while balancing exploration and exploitation.
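For reference, the Block MDP assumption (in standard notation) is that each rich observation is generated by exactly one latent state, so a perfect decoder exists:

    x \sim q(\cdot \mid s), \qquad supp\, q(\cdot \mid s) \cap supp\, q(\cdot \mid s') = \emptyset \ \text{for } s \neq s', \qquad \text{hence } \exists\, f^*: X \to S \ \text{with } f^*(x) = s.

The algorithm learns such a decoder without ever observing the latent states.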
|
No-regret Model-Based Meta RL for Personalized Navigation
Yuda Song, Ye Yuan, Wen Sun, Kris Kitani
L4DC, 2022.
|
Transform2Act: Learning a Transform-and-Control Policy for Efficient Agent Design
Ye Yuan, Yuda Song, Zhengyi Luo, Wen Sun, Kris Kitani
ICLR, 2022.
[Project Page]
|
PC-MLP: Model-based Reinforcement Learning with Policy Cover Guided Exploration
Yuda Song, Wen Sun
ICML, 2021.
[code]
A simple, provably efficient model-based algorithm that achieves competitive performance on both dense-reward
continuous control tasks and sparse-reward control tasks that require efficient exploration.
|
Provably Efficient Model-based Policy Adaptation
Yuda Song, Aditi Mavalankar, Wen Sun, Sicun Gao
ICML, 2020.
[Project Page]
[code]
We study Sim-to-Real / policy transfer / policy adaptation under a model-based framework,
resulting in an algorithm that enjoys strong theoretical guarantees
and excellent empirical performance.
|
Teaching Assistant
UCSD CSE291: Topics in Search and Optimization (Winter 2020)
UCSD CSE154: Deep Learning (Fall 2019)
UCSD CSE150: Introduction to AI: Search and Reasoning (Winter 2019, Spring 2020)
UCSD CSE30: Computer Organization and Systems Programming (Spring 2019, Winter 2018)
UCSD CSE11: Introduction to CS & OOP (Fall 2018)
|
Service
Reviewer: ICML (2021-), NeurIPS (2021-), ICLR (2022-), AAAI (2021-2022).
Top Reviewer: NeurIPS 2022
|