[GUEST ACCESS MODE: Data is scrambled or limited to provide examples. Make requests using your API key to unlock full data. Check https://lunarcrush.ai/auth for authentication information.]

# @jiqizhixin 机器之心 JIQIZHIXIN

机器之心 JIQIZHIXIN posts on X about instead of, voxels, robot, affordable the most. They currently have XXXXXX followers and XXX posts still getting attention that total XXXXX engagements in the last XX hours.

### Engagements: XXXXX [#](/creator/twitter::819861340294524928/interactions)

- X Week XXXXXX -XX%
- X Month XXXXXXX -XX%
- X Months XXXXXXXXX +2,764%
- X Year XXXXXXXXX +728,289%

### Mentions: XX [#](/creator/twitter::819861340294524928/posts_active)

- X Week XX +18%
- X Month XXX -XX%
- X Months XXX +809%
- X Year XXX +44,100%

### Followers: XXXXXX [#](/creator/twitter::819861340294524928/followers)

- X Week XXXXXX +1.70%
- X Month XXXXXX +12%
- X Months XXXXXX +131%

### CreatorRank: XXXXXXXXX [#](/creator/twitter::819861340294524928/influencer_rank)

### Social Influence [#](/creator/twitter::819861340294524928/influence)

---

**Social category influence** [nfts](/list/nfts) #3575, [technology brands](/list/technology-brands) XXXX%, [travel destinations](/list/travel-destinations) XXXX%, [countries](/list/countries) XXXX%, [social networks](/list/social-networks) XXXX%

**Social topic influence** [instead of](/topic/instead-of) 2.83%, [voxels](/topic/voxels) #27, [robot](/topic/robot) 1.89%, [affordable](/topic/affordable) 1.89%, [university of](/topic/university-of) 1.89%, [llm](/topic/llm) 1.89%, [6969](/topic/6969) 0.94%, [flexible](/topic/flexible) 0.94%, [agi](/topic/agi) 0.94%, [$4751t](/topic/$4751t) XXXX%

**Top accounts mentioned or mentioned by** @nlituanie, @googleclouds, @wzihanw, @ruipeterpan, @polynoamial, @openai, @deepseekai, @32showing, @heydariai, @ju4np3dz, @minchonchisf, @rryssf_, @mefaso

**Top assets mentioned** [Voxels (voxels)](/topic/voxels)

### Top Social Posts [#](/creator/twitter::819861340294524928/posts)

---

Top posts by engagements in the last XX hours

"Kinematic-aware generation for next-gen animation & motion tasks. Stability AI presents: Stable Part Diffusion 4D (SP4D). From a single video, SP4D generates paired RGB + kinematic part videos, going beyond appearance-based segmentation to capture true articulation. Key ideas:
- Dual-branch diffusion (RGB + parts)
- Spatial color encoding: flexible part counts, shared VAE
- BiDiFuse + contrastive loss: temporal & spatial consistency
- New KinematicParts20K dataset (20K rigged objects)

Results: ✨ Lift 2D part maps to 3D skeletons & skinning weights 🌍 Generalizes to real-world, novel objects, rare poses"
[X Link](https://x.com/jiqizhixin/status/1970332094049165610) [@jiqizhixin](/creator/x/jiqizhixin) 2025-09-23T03:39Z 10.4K followers, 1267 engagements

"What is AGI? Dan Hendrycks, Yoshua Bengio, Eric Schmidt, Gary Marcus, Max Tegmark, and many others just released A Definition of AGI. Basically, AGI is an AI that can match or exceed the cognitive versatility and proficiency of a well-educated adult. And no surprise, GPT-4 and GPT-5 perform very poorly on the ten core cognitive components of their standard"
[X Link](https://x.com/jiqizhixin/status/1979019210870395155) [@jiqizhixin](/creator/x/jiqizhixin) 2025-10-17T02:58Z 10.4K followers, 15.6K engagements

"How well can multimodal LLMs understand long-distance travel videos? Enter VIR-Bench, a new benchmark with XXX real-world travel videos that challenges models to reconstruct itineraries and reason over extended geospatial-temporal trajectories. 🚗 Why it matters: mastering long-range video reasoning is key for embodied-AI planning and autonomous navigation. Findings: even top MLLMs struggle, revealing major gaps in long-horizon understanding. A prototype travel agent built on VIR-Bench shows clear performance gains, proving the benchmark's real-world value"
[X Link](https://x.com/jiqizhixin/status/1979098765920473265) [@jiqizhixin](/creator/x/jiqizhixin) 2025-10-17T08:14Z 10.4K followers, 1102 engagements

"📬 #PapersAccepted by Jiqizhixin. Our report: VIR-Bench: Evaluating Geospatial and Temporal Understanding of MLLMs via Travel Video Itinerary Reconstruction. Waseda University, CyberAgent, and others. Paper: Code:"
[X Link](https://x.com/jiqizhixin/status/1979098770844602868) [@jiqizhixin](/creator/x/jiqizhixin) 2025-10-17T08:14Z 10.4K followers, XXX engagements

"This is huge. A UCLA team managed to build an optical generative model that runs on light instead of GPUs. In their demo, a shallow encoder maps noise into phase patterns, which a free-space optical decoder then transforms into images (digits, fashion, butterflies, faces, even Van Gogh-style art) without any computation during synthesis. ⚡ The results rival digital diffusion models, pointing to ultra-fast, energy-efficient AI powered by photonics. Optical generative models. Nature. Paper:"
[X Link](https://x.com/jiqizhixin/status/1973616181740179474) [@jiqizhixin](/creator/x/jiqizhixin) 2025-10-02T05:09Z 10.4K followers, 174.6K engagements

"📬 #PapersAccepted by Jiqizhixin. Our report: GeoSVR: Taming Sparse Voxels for Geometrically Accurate Surface Reconstruction. Beihang University, Rawmantic AI, and others. Paper: Project: Code:"
[X Link](https://x.com/jiqizhixin/status/1978284201800802580) [@jiqizhixin](/creator/x/jiqizhixin) 2025-10-15T02:18Z 10.4K followers, XXX engagements

"Say goodbye to GRPO: GVPO is here. GVPO (Group Variance Policy Optimization), proposed by a NeurIPS 2025 paper from HKUST(GZ) and Zuoyebang, is a new algorithm that tackles the instability plaguing advanced post-training methods like GRPO. GVPO introduces an analytical solution to the KL-constrained reward maximization problem and bakes it directly into its gradient weights, aligning every update with the true optimal policy. Why it matters:
- Stable by design: guarantees a unique optimal solution
- Flexible sampling: no on-policy or importance sampling constraints
- Physically intuitive the"

[X Link](https://x.com/jiqizhixin/status/1978670665940271356) [@jiqizhixin](/creator/x/jiqizhixin) 2025-10-16T03:53Z 10.4K followers, 12.6K engagements

"How can we make generalist robot hands both dexterous and affordable? RAPID Hand is a co-designed hardware & software platform with:
- 20-DoF compact robotic hand
- Wrist vision + fingertip tactile + proprioception (sub-7 ms latency)
- High-DoF teleoperation with stable retargeting

Trained diffusion policies show state-of-the-art performance, proving RAPID Hand enables high-quality, low-cost data collection for multi-fingered manipulation"
[X Link](https://x.com/jiqizhixin/status/1979022647725035876) [@jiqizhixin](/creator/x/jiqizhixin) 2025-10-17T03:12Z 10.4K followers, XXX engagements

"📬 #PapersAccepted by Jiqizhixin. Our report: RAPID Hand: A Robust Affordable Perception-Integrated Dexterous Manipulation Platform for Generalist Robot Autonomy. Sun Yat-sen University, University of California Merced, CASIA. Paper: Project: Code:"
[X Link](https://x.com/jiqizhixin/status/1979022652514931187) [@jiqizhixin](/creator/x/jiqizhixin) 2025-10-17T03:12Z 10.4K followers, XXX engagements

"📬 #PapersAccepted by Jiqizhixin. Our report: RiskPO: Risk-based Policy Optimization via Verifiable Reward for LLM Post-Training. Peking University. Paper: Code:"
[X Link](https://x.com/jiqizhixin/status/1979034051014201421) [@jiqizhixin](/creator/x/jiqizhixin) 2025-10-17T03:57Z 10.4K followers, XXX engagements

"How can we make text-to-speech systems speak the world's dialects? Tsinghua and Giant Network build DiaMoE-TTS, a unified IPA-based framework that brings scalable and expressive dialect TTS to life. 🎯 Key innovations:
- Standardizes phonetic representations to resolve orthography & pronunciation ambiguity
- Uses a dialect-aware Mixture-of-Experts to model phonological variation
- Adapts fast to new dialects via LoRA and Conditioning Adapters

Results: natural, expressive speech, even zero-shot synthesis on unseen dialects and niche domains like Peking Opera, with just a few hours of data"
[X Link](https://x.com/jiqizhixin/status/1979102214099734746) [@jiqizhixin](/creator/x/jiqizhixin) 2025-10-17T08:28Z 10.4K followers, 1614 engagements

"📬 #PapersAccepted by Jiqizhixin. Our report: DiaMoE-TTS: A Unified IPA-Based Dialect TTS Framework with Mixture-of-Experts and Parameter-Efficient Zero-Shot Adaptation. Tsinghua and Giant Network. Paper: Code: Checkpoint: Dataset:"
[X Link](https://x.com/jiqizhixin/status/1979102220760289471) [@jiqizhixin](/creator/x/jiqizhixin) 2025-10-17T08:28Z 10.4K followers, XXX engagements

"Can high school geometry teach AI to understand space? 📐 A new study tackles the critical challenge of spatial intelligence in Multimodal Large Language Models (MLLMs). Researchers found that fine-tuning models on Euclid30K, a new dataset of 30000 Euclidean geometry problems, confers broadly transferable spatial skills. After this geometry-centric training, models achieved substantial zero-shot gains across four separate spatial reasoning benchmarks without any task-specific adaptation. For instance, the average accuracy on the VSI-Bench benchmark rose from XXXX% to XXXX%, showing this is a"
[X Link](https://x.com/jiqizhixin/status/1980053599452303761) [@jiqizhixin](/creator/x/jiqizhixin) 2025-10-19T23:29Z 10.4K followers, 1049 engagements

"Can autonomous driving think like it sees, not just reason symbolically? Alibaba and others propose a spatio-temporal Chain-of-Thought (CoT) that lets visual language models (VLMs) reason visually, generating imagined future frames to plan trajectories. By unifying visual generation + understanding, the model acts as a world simulator, predicting how the scene evolves over time, not just describing it. 📈 Results show stronger visual reasoning and planning, moving autonomous driving beyond text-based logic toward true simulation-based intelligence. This paper has been accepted as a NeurIPS 2025"
[X Link](https://x.com/jiqizhixin/status/1975407572569235778) [@jiqizhixin](/creator/x/jiqizhixin) 2025-10-07T03:47Z 10.4K followers, XXX engagements

"How can we make LLMs actually use the context they're given? Meet CARE, a native retrieval-augmented reasoning framework that teaches models to explicitly integrate evidence into their own thought process. Instead of relying on heavy supervised fine-tuning or external web searches, CARE lets the model retrieve and reason internally, weaving relevant in-context tokens directly into its reasoning chain. Across real-world and counterfactual QA benchmarks, CARE delivers higher retrieval accuracy and more reliable answers than traditional RAG or supervised approaches. 🧠 The result: context-faithful"
[X Link](https://x.com/jiqizhixin/status/1976491312405991509) [@jiqizhixin](/creator/x/jiqizhixin) 2025-10-10T03:33Z 10.4K followers, 1485 engagements

"Well, you may not need fine-tuning anymore. Meet ACE (Agentic Context Engineering), a framework that turns LLM contexts into living, adaptive playbooks that grow and refine over time. Unlike traditional context-tuning (which suffers from brevity bias and context collapse), ACE uses structured generation, reflection, curation cycles to preserve rich domain insights and scale with long-context models. Results: ✅ +10.6% on agent benchmarks ✅ +8.6% on finance reasoning ✅ Lower latency & rollout cost. Matches top production agents on AppWorld and beats them on harder tests, all with smaller open-source"
[X Link](https://x.com/jiqizhixin/status/1976903463360774380) [@jiqizhixin](/creator/x/jiqizhixin) 2025-10-11T06:51Z 10.4K followers, 1367 engagements

"Our report: Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models. Stanford, SambaNova, UC Berkeley. Paper:"
[X Link](https://x.com/jiqizhixin/status/1976903467181785184) [@jiqizhixin](/creator/x/jiqizhixin) 2025-10-11T06:51Z 10.4K followers, XXX engagements

"Ever wondered how LLMs evolve from predicting the next token to following your instructions? Post-training 101: A hitchhiker's guide into LLM post-training. This new guide breaks down the basics of LLM post-training, covering the full journey from pre-training to instruction tuning: 🔹 Transitioning from language modeling to instruction following 🔹 Supervised Fine-Tuning (SFT): data curation, objectives, and losses 🔹 Reinforcement Learning methods: RLHF, RLAIF, RLVR, and how reward models work 🔹 Evaluation frameworks for measuring post-training quality. Link:"
[X Link](https://x.com/jiqizhixin/status/1977194258596561211) [@jiqizhixin](/creator/x/jiqizhixin) 2025-10-12T02:07Z 10.4K followers, 34.5K engagements

"Robots can now learn to act better through trial and error. A new study from Tsinghua, Shanghai Qi Zhi Institute, and Zhongguancun Academy puts Reinforcement Learning (RL) to the test for Vision-Language-Action (VLA) models. Unlike standard supervised fine-tuning (SFT), which struggles with compounding errors, RL directly optimizes for task success. The researchers built a comprehensive benchmark to study how RL affects generalization across: 👀 Visual shifts 🧩 Semantic understanding 🦾 Action execution. Key findings:
- RL (especially PPO) boosts semantic and execution robustness
- Maintains"

[X Link](https://x.com/jiqizhixin/status/1978004521419985240) [@jiqizhixin](/creator/x/jiqizhixin) 2025-10-14T07:46Z 10.4K followers, XXX engagements

"Are Gaussian Splatting's limitations holding back the future of 3D surface reconstruction? 🤔 Enter GeoSVR, a novel framework that leverages sparse voxels to create stunningly accurate, detailed, and complete 3D surfaces. By using a Voxel-Uncertainty Depth Constraint and Sparse Voxel Surface Regularization, GeoSVR overcomes common challenges in the field, ensuring geometric consistency and sharp details. Experiments show it outperforms existing methods in accuracy and completeness, especially in difficult scenarios"
[X Link](https://x.com/jiqizhixin/status/1978284197556183072) [@jiqizhixin](/creator/x/jiqizhixin) 2025-10-15T02:18Z 10.4K followers, 1298 engagements

"Agentic Entropy-Balanced Policy Optimization. Renmin University of China, Kuaishou Technology. Paper: Code:"
[X Link](https://x.com/jiqizhixin/status/1979161843890520288) [@jiqizhixin](/creator/x/jiqizhixin) 2025-10-17T12:25Z 10.4K followers, XXX engagements

"RL keeps evolving. Now you can teach LLMs to reason better by rewarding risk-taking. Risk-based Policy Optimization (RiskPO) is a new reinforcement learning framework for post-training LLMs. Instead of averaging rewards like GRPO, RiskPO uses a Mixed Value-at-Risk objective to:
- Emphasize rare but informative reasoning paths
- Prevent entropy collapse and overconfidence
- Encourage deeper exploration

Plus, a smart bundling scheme enriches feedback for more stable training. Results: Big gains in math, multimodal, and code reasoning, beating GRPO on both Pass@1 and Pass@k"
[X Link](https://x.com/jiqizhixin/status/1979034046610002125) [@jiqizhixin](/creator/x/jiqizhixin) 2025-10-17T03:57Z 10.4K followers, 6350 engagements

"Can today's LLMs safely stay on mission? A new study introduces operational safety: an LLM's ability to accept or refuse queries appropriately within its intended use. Researchers benchmarked XX open-weight models and found all remain highly unsafe for real-world deployment:
- Qwen-3 (235B): XXXX%
- Mistral (24B): XX%
- GPTs: 62-73%
- Gemma & Llama-3: collapse to XX% XX%

To fix this, they propose prompt-based steering (Q-ground & P-ground), boosting safety by up to +41%. 📬 #PapersAccepted by Jiqizhixin. Our report: OffTopicEval: When Large Language Models Enter the Wrong Chat Almost Always Nanyang"
[X Link](https://x.com/jiqizhixin/status/1980157765751554555) [@jiqizhixin](/creator/x/jiqizhixin) 2025-10-20T06:22Z 10.4K followers, 1184 engagements

"Today's #1 Paper on Hugging Face: Agentic Entropy-Balanced Policy Optimization (AEPO). With this method, we can train smarter and more capable AI web agents without their learning processes collapsing. It's a reinforcement learning (RL) algorithm that addresses a key instability issue. Existing methods often over-rely on entropy (uncertainty), leading to training failures. AEPO intelligently balances this entropy during both exploration and policy updates. It uses a dynamic rollout that prevents the agent from getting stuck in uncertain loops and a novel optimization technique to learn from tricky"
[X Link](https://x.com/jiqizhixin/status/1979161839121707043) [@jiqizhixin](/creator/x/jiqizhixin) 2025-10-17T12:25Z 10.4K followers, 1332 engagements

"A big step toward stable, scalable LLM agent training. Rutgers University & Adobe just identified a key pitfall in LLM agent training: the exploration-exploitation cascade failure, where agents first prematurely converge to bad strategies, then collapse into chaotic exploration. To fix this, they propose Entropy-regularized Policy Optimization (EPO), which:
- Smooths entropy to prevent instability
- Balances exploration & exploitation adaptively
- Ensures monotonic entropy variance reduction

Results: +152% on ScienceWorld, +19.8% on ALFWorld. 📬 #PapersAccepted by Jiqizhixin. Our report: EPO:"
[X Link](https://x.com/jiqizhixin/status/1980299972684775584) [@jiqizhixin](/creator/x/jiqizhixin) 2025-10-20T15:48Z 10.4K followers, XXX engagements