[GUEST ACCESS MODE: Data is scrambled or limited to provide examples. Make requests using your API key to unlock full data. Check https://lunarcrush.ai/auth for authentication information.]
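
To unlock full data, authenticate requests with an API key (see https://lunarcrush.ai/auth). A minimal request sketch in Python follows; the endpoint path and response shape are assumptions for illustration, not confirmed API details.

```python
# Minimal sketch of an authenticated LunarCrush API request.
# NOTE: the endpoint path below is an assumption for illustration only;
# consult https://lunarcrush.ai/auth for the real authentication and endpoint docs.
import os

import requests

API_KEY = os.environ.get("LUNARCRUSH_API_KEY", "")  # your personal API key


def fetch_creator(handle: str) -> dict:
    """Fetch creator metrics for an X handle (hypothetical endpoint path)."""
    url = f"https://lunarcrush.com/api4/public/creator/twitter/{handle}/v1"
    resp = requests.get(
        url,
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()


if __name__ == "__main__":
    print(fetch_creator("yus167"))
```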

@yus167 Yuda Song

Yuda Song posts on X most often about rl, bound, drew, and san diego. They currently have XXX followers and XX posts still receiving attention, totaling XX engagements in the last XX hours.

Engagements: XX

Mentions: X

Followers: XXX

CreatorRank: undefined

Social Influence


Social category influence: finance

Social topic influence: rl, bound, drew, san diego, stacking, faster

Top Social Posts


Top posts by engagements in the last XX hours

"So the tl;dr: under stochastic dynamics avoid distillation and pay the compute tax for RL. But can distillation be rescued Yesour theory shows it benefits from a smooth expert and indeed experts trained with moderate motor noise transfer better (cf. DART) (9/n)"
X Link @yus167 2025-10-15T03:02Z XXX followers, XXX engagements
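
The post above notes that experts trained with moderate motor noise transfer better (cf. DART). A minimal sketch of that data-collection idea: roll out the expert with small Gaussian action noise while logging the clean expert action as the distillation label. The `env` and `expert` interfaces are placeholders (gymnasium-style), not the paper's code.

```python
# DART-style data collection sketch for distillation: roll out the expert with
# moderate Gaussian "motor noise" injected into its actions, but store the
# expert's clean action as the imitation label. `env` (gymnasium-style) and
# `expert` are placeholder interfaces, not the paper's code.
import numpy as np


def collect_noisy_expert_data(env, expert, episodes=10, noise_std=0.1, seed=0):
    rng = np.random.default_rng(seed)
    dataset = []  # (observation, clean expert action) pairs for distillation
    for _ in range(episodes):
        obs, _ = env.reset()
        done = False
        while not done:
            clean_action = np.asarray(expert(obs))  # label to imitate
            noisy_action = clean_action + rng.normal(0.0, noise_std, clean_action.shape)
            dataset.append((obs, clean_action))
            obs, _, terminated, truncated, _ = env.step(noisy_action)
            done = terminated or truncated
    return dataset
```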

"As an instructive model we propose the perturbed Block MDP: a Block MDP with small emission noise. This models robotics settings where states are largely decodable but not perfectly (e.g. occlusion sensor noise). (5/n)"
X Link @yus167 2025-10-15T03:02Z XXX followers, XXX engagements
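
As an illustration of the definition in the post above, here is a toy emission model for a perturbed Block MDP: the observation is a one-hot code of the latent state corrupted by small Gaussian noise, with a rare occlusion. This is an illustrative construction, not the one used in the paper.

```python
# Toy emission for a perturbed Block MDP: the observation is a one-hot code of
# the latent state (exactly decodable when noise_std = 0), corrupted by small
# Gaussian emission noise and, rarely, a full "occlusion". Illustrative only;
# not the construction used in the paper.
import numpy as np


def emit(state: int, n_states: int, noise_std: float = 0.05,
         occlusion_prob: float = 0.02, rng=None) -> np.ndarray:
    rng = rng if rng is not None else np.random.default_rng()
    obs = np.zeros(n_states)
    obs[state] = 1.0                                   # clean one-hot emission
    obs += rng.normal(0.0, noise_std, size=n_states)   # small emission noise
    if rng.random() < occlusion_prob:                  # rare occlusion destroys the signal
        obs = rng.normal(0.0, 1.0, size=n_states)
    return obs
```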

"But when the dynamics are stochastic we prove a lower bound: expert distillation can be arbitrarily bad. Empirically the gap between distillation and RL grows as environments get more stochastic. (8/n)"
X Link @yus167 2025-10-15T03:02Z XXX followers, XXX engagements

"This is joint work with the amazing Dhruv Rohatgi and my advisors Aarti Singh and Drew Bagnell. Check out our arXiv if you are interested: Our paper is accepted to NeurIPS 2025 with a spotlight. See you in San Diego (10/10)"
X Link @yus167 2025-10-15T03:02Z XXX followers, XXX engagements

"Our main insight: 🔹 RL succeeds if belief contraction error is small. 🔸 Distillation succeeds if decodability error and belief contraction error are small. The remaining question: how do these quantities compare (4/n)"
X Link @yus167 2025-10-15T03:02Z XXX followers, XXX engagements
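
Restated in ad-hoc notation (the symbols below are illustrative labels for the two error quantities, not the paper's notation):

```latex
% Ad-hoc restatement of the success conditions in the post above.
% \varepsilon_{\mathrm{dec}} (decodability error) and \varepsilon_{\mathrm{bc}}
% (belief contraction error) are illustrative symbols, not the paper's notation.
\[
\text{RL succeeds if } \varepsilon_{\mathrm{bc}} \text{ is small};
\qquad
\text{distillation succeeds if both } \varepsilon_{\mathrm{dec}} \text{ and } \varepsilon_{\mathrm{bc}} \text{ are small}.
\]
```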

"We first identify two key quantities: X. Decodability error: how stochastic the belief (the posterior of the underlying state given observations) is. X. Belief contraction error GMR 23: how much old observations affect the belief. (3/n)"
X Link @yus167 2025-10-15T03:02Z XXX followers, XXX engagements
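
For intuition, one plausible way to formalize these two quantities is sketched below; the paper's exact definitions may differ. Here b(· | h) denotes the belief, i.e. the posterior over the latent state given an observation history h.

```latex
% Illustrative formalizations, for intuition only; the paper's exact definitions may differ.
% b(\cdot \mid h_t) is the belief: the posterior over the latent state given the history h_t.
\[
\varepsilon_{\mathrm{dec}}
  = \mathbb{E}\bigl[\, 1 - \max_{s} b(s \mid h_t) \,\bigr]
  \quad \text{(how far the belief is from a point mass, i.e. how stochastic it is)},
\]
\[
\varepsilon_{\mathrm{bc}}(L)
  = \mathbb{E}\bigl[\, \| b(\cdot \mid h_t) - b(\cdot \mid h_{t-L:t}) \|_{1} \,\bigr]
  \quad \text{(how much observations older than the last $L$ steps still move the belief)}.
\]
```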

"🔹 Theory says: in perturbed BMDPs belief contraction error decays exponentially with the frame-stack length. 👉 This explains why RL with enough stacking works in locomotion. (6/n)"
X Link @yus167 2025-10-15T03:02Z XXX followers, XXX engagements
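
Frame stacking here refers to the standard trick of conditioning the policy on the last k observations rather than only the current one. A minimal observation-wrapper sketch (gymnasium-style interface assumed; not the paper's implementation):

```python
# Minimal frame-stacking observation wrapper: the agent conditions on the last
# k observations instead of only the current one. Gymnasium-style env interface
# assumed; this illustrates the standard technique the post refers to, not the
# paper's implementation.
from collections import deque

import numpy as np


class FrameStack:
    def __init__(self, env, k: int = 4):
        self.env = env
        self.k = k
        self.frames = deque(maxlen=k)

    def reset(self):
        obs, info = self.env.reset()
        for _ in range(self.k):
            self.frames.append(obs)          # pad the stack with the first frame
        return np.concatenate(self.frames), info

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        self.frames.append(obs)              # oldest frame drops out automatically
        return np.concatenate(self.frames), reward, terminated, truncated, info
```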

"This choice is fundamental: Distillation can be much faster but has well-documented failure cases. RL with longer history can usually succeed but at huge compute cost. We wanted a framework (theory + experiments) to predict which wins in practice. (2/n)"
X Link @yus167 2025-10-15T03:02Z XXX followers, XXX engagements