Yuda Song

Hi, I am a first-year PhD student in the Machine Learning Department at Carnegie Mellon University. I am co-advised by Aarti Singh and Drew Bagnell, and I also work closely with Wen Sun. I completed my master's degree in MLD, advised by Kris Kitani, and my undergraduate degree at UC San Diego with a double major in CS and Math, advised by Sicun Gao.

Semantic Scholar  /  Google Scholar  /  Email  /  CV  /  Github  /  Twitter

Research

I am broadly interested in machine learning, especially reinforcement learning. I am currently focusing on settings and algorithms in RL that are provably statistically and computationally efficient.

Preprints
Provable Benefits of Representational Transfer in Reinforcement Learning
(alphabetical order) Alekh Agarwal, Yuda Song, Wen Sun, Kaiwen Wang, Mengdi Wang, Xuezhou Zhang

[code]
Publications
Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning Approach
Xuezhou Zhang, Yuda Song, Masatoshi Uehara, Mengdi Wang, Alekh Agarwal, Wen Sun
ICML, 2022.

[code] [Talk at RL theory seminars]
Online No-regret Model-Based Meta RL for Personalized Navigation
Yuda Song, Ye Yuan, Wen Sun, Kris Kitani
L4DC, 2022.
Transform2Act: Learning a Transform-and-Control Policy for Efficient Agent Design
Ye Yuan, Yuda Song, Zhengyi Luo, Wen Sun, Kris Kitani
ICLR, 2022. Oral presentation

[Project Page]
PC-MLP: Model-based Reinforcement Learning with Policy Cover Guided Exploration
Yuda Song, Wen Sun
ICML, 2021

[code]
Provably Efficient Model-based Policy Adaptation
Yuda Song, Aditi Mavalankar, Wen Sun, Sicun Gao
ICML, 2020

[Project Page] [code]
Teaching Assistant

  • UCSD CSE291: Topics in Search and Optimization (Winter 2020)
  • UCSD CSE154: Deep Learning (Fall 2019)
  • UCSD CSE150: Introduction to AI: Search and Reasoning (Winter 2019, Spring 2020)
  • UCSD CSE30: Computer Organization and Systems Programming (Spring 2019, Winter 2018)
  • UCSD CSE11: Introduction to CS & OOP (Fall 2018)

Service

  • Reviewer: AAAI (2021-), ICML (2021-), NeurIPS (2021-), ICLR (2022).

