Yuda Song posts on X about rl, stacking, faster, and bound the most. They currently have XXX followers and X posts still getting attention, totaling X engagements in the last XX hours.
Social category influence: finance XX%
Social topic influence: rl 37.5%, stacking 12.5%, faster 12.5%, bound 12.5%, drew 12.5%, san diego XXXX%
Top accounts mentioned or mentioned by: @max_simchowitz
Top posts by engagements in the last XX hours
"🔹 Theory says: in perturbed BMDPs belief contraction error decays exponentially with the frame-stack length. 👉 This explains why RL with enough stacking works in locomotion. (6/n)"
X Link @yus167 2025-10-15T03:02Z XXX followers, XXX engagements
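A schematic rendering of the claim in (6/n) (the notation and constants here are illustrative assumptions, not quoted from the paper): writing \epsilon_{\mathrm{bc}}(k) for the belief contraction error with a frame stack of length k, an exponential decay bound has the form

    \epsilon_{\mathrm{bc}}(k) \;\le\; C\,\gamma^{k}, \qquad 0 < \gamma < 1,

so a moderate stack length already drives the error below any fixed tolerance, which is consistent with frame-stacked RL behaving as if the locomotion task were fully observed.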
"This choice is fundamental: Distillation can be much faster but has well-documented failure cases. RL with longer history can usually succeed but at huge compute cost. We wanted a framework (theory + experiments) to predict which wins in practice. (2/n)"
X Link @yus167 2025-10-15T03:02Z XXX followers, XXX engagements
"We first identify two key quantities: X. Decodability error: how stochastic the belief (the posterior of the underlying state given observations) is. X. Belief contraction error GMR 23: how much old observations affect the belief. (3/n)"
X Link @yus167 2025-10-15T03:02Z XXX followers, XXX engagements
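One hedged way to make the two quantities in (3/n) concrete (the symbols and norms are chosen here for illustration, not taken from the paper): with b_t = P(s_t \mid o_{1:t}) the belief over the latent state,

    \text{decodability error} \;\approx\; \mathbb{E}\big[\,1 - \max_{s} b_t(s)\,\big], \qquad
    \text{belief contraction error}(k) \;\approx\; \mathbb{E}\,\big\| P(s_t \mid o_{1:t}) - P(s_t \mid o_{t-k+1:t}) \big\|_{1},

i.e. how far the belief is from a point mass, and how much truncating the history to the last k observations changes it.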
"Our main insight: 🔹 RL succeeds if belief contraction error is small. 🔸 Distillation succeeds if decodability error and belief contraction error are small. The remaining question: how do these quantities compare (4/n)"
X Link @yus167 2025-10-15T03:02Z XXX followers, XXX engagements
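Stated schematically (this inequality shape is an editorial sketch of the thread's own description, not the paper's theorem statement): with \epsilon_{\mathrm{dec}} and \epsilon_{\mathrm{bc}}(k) the decodability and belief contraction errors, the insight in (4/n) corresponds to guarantees of the form

    J^{\star} - J_{\mathrm{RL}} \;\lesssim\; \mathrm{poly}(H)\,\epsilon_{\mathrm{bc}}(k), \qquad
    J^{\star} - J_{\mathrm{distill}} \;\lesssim\; \mathrm{poly}(H)\,\big(\epsilon_{\mathrm{dec}} + \epsilon_{\mathrm{bc}}(k)\big),

so whether distillation is safe hinges on the extra decodability term.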
"As an instructive model we propose the perturbed Block MDP: a Block MDP with small emission noise. This models robotics settings where states are largely decodable but not perfectly (e.g. occlusion sensor noise). (5/n)"
X Link @yus167 2025-10-15T03:02Z XXX followers, XXX engagements
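One concrete way to write such a perturbation (this mixture form is an illustrative assumption, not necessarily the paper's exact definition): take a Block MDP emission kernel O_0(o \mid s) whose observation supports are disjoint across states, and mix in a small amount of noise,

    O_{\varepsilon}(o \mid s) \;=\; (1-\varepsilon)\,O_0(o \mid s) \;+\; \varepsilon\,\nu(o \mid s),

so each state remains decodable from its observation up to an O(\varepsilon) error, matching the robotics picture of mostly clean sensing with occasional occlusion or sensor noise.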
"But when the dynamics are stochastic we prove a lower bound: expert distillation can be arbitrarily bad. Empirically the gap between distillation and RL grows as environments get more stochastic. (8/n)"
X Link @yus167 2025-10-15T03:02Z XXX followers, XXX engagements
"So the tl;dr: under stochastic dynamics avoid distillation and pay the compute tax for RL. But can distillation be rescued Yesour theory shows it benefits from a smooth expert and indeed experts trained with moderate motor noise transfer better (cf. DART) (9/n)"
X Link @yus167 2025-10-15T03:02Z XXX followers, XXX engagements
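For the DART-style noise injection mentioned in (9/n), here is a minimal Python sketch of collecting demonstrations from an expert whose executed actions are perturbed by motor noise, assuming a classic Gym-style environment (reset returns an observation, step returns a 4-tuple); the function and parameter names are hypothetical, not from the paper's code.

    import numpy as np

    def collect_noisy_expert_rollout(env, expert_policy, noise_std=0.1, horizon=1000):
        # Roll out the expert while injecting Gaussian motor noise into the executed
        # action. The clean expert action is stored as the supervision label, so the
        # distilled policy is trained on states visited under a noisier, smoother expert.
        obs = env.reset()
        dataset = []
        for _ in range(horizon):
            clean_action = np.asarray(expert_policy(obs), dtype=np.float64)
            noisy_action = clean_action + np.random.normal(0.0, noise_std, size=clean_action.shape)
            dataset.append((obs, clean_action))
            obs, reward, done, info = env.step(noisy_action)
            if done:
                break
        return dataset

Only the data-collection step changes; the behavior-cloning fit on the returned (observation, clean action) pairs is unchanged.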
"This is joint work with the amazing Dhruv Rohatgi and my advisors Aarti Singh and Drew Bagnell. Check out our arXiv if you are interested: Our paper is accepted to NeurIPS 2025 with a spotlight. See you in San Diego (10/10)"
X Link @yus167 2025-10-15T03:02Z XXX followers, XXX engagements