@yifan_zhang_ Yifan Zhang @ NeurIPS

Yifan Zhang @ NeurIPS posts on X most often about rpg, llm, muon, and 1b. The account currently has XXXXX followers and XX posts still getting attention, totaling XXXXXX engagements in the last XX hours.

Engagements: XXXXXX

Mentions: X

Followers: XXXXX

CreatorRank: XXXXXXX

Social Influence

Social category influence: currencies

Social topic influence: rpg #203, llm, muon, 1b, the official, strong

Top Social Posts

Top posts by engagements in the last XX hours

"RPG (KL-regularized Policy Gradients) is a second-order optimizer that uses Hessian information (Fisher information matrices). TLDR: PG is like SGD; RPG is like Muon/Shampoo 😃"
2025-12-11T01:38Z 3116 followers, 12.4K engagements

"🚀DeepSeek V3.2 officially utilized our corrected KL regularization term in their training objective On the Design of KL-Regularized Policy Gradient Algorithms for LLM Reasoning See also It will be even better if they can properly cite our work😀"
2025-12-01T13:47Z 3106 followers, 143K engagements

"@jasondeanlee True but frontier labs really use Muon and Shampoo"
2025-12-11T03:14Z 3072 followers, XXX engagements

"@jasondeanlee Any ideas worth 1B"
2025-12-11T03:14Z 3081 followers, XXX engagements

"🚀Introducing GRAPE: Group Representational Position Encoding. Embracing General Relative Law of Position Encoding, unifying and improving Multiplicative and Additive Position Encoding such as RoPE and Alibi. Better performance with a clear theoretical formulation. Project Page: Paper: Devoted to the frontier of superintelligence, hope you will enjoy it"
2025-12-08T21:05Z 3116 followers, 28.2K engagements

"Mixture of Parrots: Experts improve memorization more than reasoning"
2025-12-09T16:56Z 3115 followers, 70.6K engagements

"🚀 Newly updated RPG (KL-Regularized Policy Gradient) is available on arXiv: X. Our trained model beats the official checkpoint of Qwen3-4B-Instruct We extend our experiments to an 8K context length and find that RPG-REINFORCE with RPG-Style Clip achieves XX% accuracy on AIME25 surpassing the official Qwen3-4B-Instruct model (47%) and outperforming strong baselines. X. REINFORCE ESTIMATOR IS ALL YOU NEED RPG-REINFORCE consistently outperforms PPO/GRPO-style gradient estimators X. RPG is a second-order optimizer It utilizes second-order information and has a clear connection to the Natural"
2025-12-12T16:49Z 3117 followers, 21.9K engagements