机器之心 JIQIZHIXIN posts on X most often about the topics "instead of," "university of," "voxels," and "learn to." They currently have XXXXXX followers and XXX posts still getting attention, totaling XXXXX engagements in the last XX hours.
Social category influence: technology brands XXXX%, nfts #1026, stocks XXXX%, travel destinations XXXX%, countries XXXX%, finance XXXX%
Social topic influence: instead of #1354, university of #1070, voxels #7, learn to 1.12%, shanghai #595, singapore #1907, llm #97, demo #710, 6969 0.56%, meta XXXX%
Top accounts mentioned or mentioned by: @justinechoes @casper_hansen_ @teknium1 @openai @rfsharko @ssoni83588 @aiml4health @kourouklides @gut_ai_f @32showing @furongh @bangan @sichengzhuml @xiaoyuliu1231 @google @yuhangzhou2 @meta @casperhansen @elderplinius @hebbarmp
Top assets mentioned: Voxels (voxels), Microsoft Corp. (MSFT), Salesforce Inc (CRM)
Top posts by engagements in the last XX hours:
"Can reasoning LLMs think better if their Chain-of-Thought is continuous instead of discrete 🧠✨ This Meta paper introduces the first scalable way to train continuous CoTs with reinforcement learningno need to distill from discrete references. By using "soft" tokens (mixtures of tokens + noise) for RL exploration the method learns continuous reasoning traces with hundreds of steps at minimal overhead. On math benchmarks with Llama & Qwen (up to 8B) continuous CoTs match discrete ones at pass@1 but surpass them at pass@32showing richer reasoning diversity. The best strategy: train with"
X Link @jiqizhixin 2025-10-04T05:37Z 10.3K followers, 8032 engagements
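To make the "soft token" idea in the post above concrete, here is a minimal NumPy sketch: rather than committing to one sampled token, the model consumes a probability-weighted mixture of token embeddings, with Gaussian noise added so RL can explore the continuous trace. The function name, shapes, and `noise_std` are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def soft_token_step(logits, embedding_table, noise_std=0.1, rng=None):
    """One 'soft' reasoning step: mix token embeddings by their
    probabilities instead of sampling a single discrete token.
    noise_std is a hypothetical exploration-noise parameter."""
    rng = rng or np.random.default_rng()
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    mixed = probs @ embedding_table            # expected embedding, (d_model,)
    # Gaussian noise makes the continuous trace explorable by RL.
    return mixed + rng.normal(0.0, noise_std, size=mixed.shape)

# Toy usage: vocab of 5 tokens, 8-dim embeddings.
rng = np.random.default_rng(0)
E = rng.normal(size=(5, 8))
next_input = soft_token_step(rng.normal(size=5), E, rng=rng)
```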
"Robots can now learn to act better through trial and error A new study from Tsinghua Shanghai Qi Zhi Institute and Zhongguancun Academy puts Reinforcement Learning (RL) to the test for Vision-Language-Action (VLA) models. Unlike standard supervised fine-tuning (SFT) which struggles with compounding errors RL directly optimizes for task success. The researchers built a comprehensive benchmark to study how RL affects generalization across: 👀 Visual shifts 🧩 Semantic understanding 🦾 Action execution Key findings: - RL (especially PPO) boosts semantic and execution robustness - Maintains"
X Link @jiqizhixin 2025-10-14T07:46Z 10.3K followers, XXX engagements
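The PPO objective the study reportedly found most robust is standard and easy to sketch. This is the textbook clipped surrogate, not the benchmark's actual training code; the helper name is hypothetical.

```python
import numpy as np

def ppo_clip_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    """Clipped PPO surrogate over per-action log-probs under the new
    and old policies, plus estimated advantages."""
    ratio = np.exp(logp_new - logp_old)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    # Maximize the surrogate => minimize its negation.
    return -np.mean(np.minimum(unclipped, clipped))
```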
"📬 #PapersAccepted by Jiqizhixin Our report: Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play Duke University National University of Singapore University of Maryl and Adobe Models: Project: Paper:"
X Link @jiqizhixin 2025-10-13T06:34Z 10.3K followers, XXX engagements
"Salesforce just proposed UserRL. It's a unified framework for building user-centric agentic models through standardized gym environments and simulated users. Using Qwen3 models under the GRPO algorithm the study uncovers: X SFT cold start is essential for unlocking early interaction skills. X Trajectory-level rewards boost multi-turn efficiency and quality. X Simulated users (even open-source ones like Qwen3-32B) enable scalable cost-effective training"
X Link @jiqizhixin 2025-10-08T07:31Z 10.3K followers, 1161 engagements
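Since UserRL trains Qwen3 under GRPO with trajectory-level rewards, a quick sketch of GRPO's group-relative advantage (the piece that replaces a learned value network) may help; UserRL's exact reward shaping is not reproduced here.

```python
import numpy as np

def grpo_advantages(trajectory_rewards):
    """Group-relative advantages as in GRPO: each sampled rollout is
    scored against the mean/std of its own group. One scalar reward
    per trajectory matches the multi-turn setting described above."""
    r = np.asarray(trajectory_rewards, dtype=float)
    return (r - r.mean()) / (r.std() + 1e-8)

# Example: 4 simulated-user rollouts of the same task.
print(grpo_advantages([1.0, 0.0, 0.5, 1.0]))
```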
"Kinematic-aware generation for next-gen animation & motion tasks Stability AI presents: Stable Part Diffusion 4D (SP4D) From a single video SP4D generates paired RGB + kinematic part videos going beyond appearance-based segmentation to capture true articulation. Key ideas: - Dual-branch diffusion (RGB + parts) - Spatial color encoding flexible part counts shared VAE - BiDiFuse + contrastive loss temporal & spatial consistency - New KinematicParts20K dataset (20K rigged objects) Results: ✨ Lift 2D part maps 3D skeletons & skinning weights 🌍 Generalizes to real-world novel objects rare poses"
X Link @jiqizhixin 2025-09-23T03:39Z 10.3K followers, 1266 engagements
"Can AI-generated 3D models understand physics Meet PhysX-3D a new paradigm bringing physical grounding to 3D generation. While most models focus on geometry and texture PhysX-3D teaches AI to model how objects behave in the real world. It introduces two key components: - PhysXNet the first physics-annotated 3D dataset covering X dimensions: scale material affordance kinematics and function. - PhysXGen a physics-aware image-to-3D generator that links structure and physical properties through a dual-branch architecture. The result: 3D assets that look real and act real paving the way for"
X Link @jiqizhixin 2025-10-14T06:41Z 10.3K followers, 1467 engagements
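A hypothetical record type can illustrate what "physics-annotated" means for the dimensions the post lists (scale, material, affordance, kinematics, function). Field names and types below are guesses for illustration, not PhysXNet's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class PhysXAnnotation:
    """Illustrative per-object record covering the five annotated
    dimensions named in the post. All fields are assumptions."""
    scale_m: float                                        # absolute size in meters
    material: str                                         # e.g. "wood", "steel"
    affordances: list[str] = field(default_factory=list)  # e.g. ["pull", "store"]
    kinematics: dict = field(default_factory=dict)        # joint -> limits
    function: str = ""                                    # natural-language purpose

drawer = PhysXAnnotation(
    scale_m=0.45, material="wood",
    affordances=["pull", "store"],
    kinematics={"slide_joint": {"axis": "x", "range_m": [0.0, 0.35]}},
    function="stores small household items",
)
```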
"What if 3D models could be generated with precise cross-modal controlbeyond just text or images Tencent presents Hunyuan3D-Omni a unified framework that accepts point clouds voxels bounding boxes and skeletal priors enabling fine-grained controllable 3D asset creation. Built for games film and design. Model available on Hugging Face"
X Link @jiqizhixin 2025-10-04T01:05Z 10.3K followers, 1100 engagements
"Ever wondered how LLMs evolve from predicting the next token to following your instructions Post-training 101: A hitchhiker's guide into LLM post-training This is a new guide breaks down the basics of LLM post-training covering the full journey from pre-training to instruction tuning: 🔹 Transitioning from language modeling to instruction following 🔹 Supervised Fine-Tuning (SFT) data curation objectives and losses 🔹 Reinforcement Learning methods RLHF RLAIF RLVR and how reward models work 🔹 Evaluation frameworks for measuring post-training quality Link:"
X Link @jiqizhixin 2025-10-12T02:07Z 10.3K followers, 34.3K engagements
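For the SFT portion of the guide, the core objective is easy to sketch: next-token cross-entropy computed only on response tokens, with prompt tokens masked out. A minimal PyTorch version, assuming `logits` of shape (batch, seq, vocab) and a boolean `prompt_mask`:

```python
import torch
import torch.nn.functional as F

def sft_loss(logits, labels, prompt_mask):
    """Standard SFT objective: cross-entropy on response tokens only.
    prompt_mask is True where a token belongs to the prompt."""
    labels = labels.clone()
    labels[prompt_mask] = -100                 # ignored by cross_entropy
    # Shift so position t predicts token t+1.
    return F.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)),
        labels[:, 1:].reshape(-1),
        ignore_index=-100,
    )
```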
"Huge LLMs can now think longer without burning quadratic compute Mila Microsoft and others just introduced Markovian Thinking a paradigm that decouples reasoning length from context size turning LLM reasoning into a linear-compute process. Their system Delethink trains models in fixed-size reasoning chunks: at each boundary the model writes a compact textual state resets the context and seamlessly continues reasoning. Results are striking: an R1-Distill 1.5B model thinks up to 24K tokens with only 8K context outperforming LongCoT-RL trained on full 24K sequences at X lower compute cost (7 vs."
X Link @jiqizhixin 2025-10-10T01:56Z 10.3K followers, 45K engagements
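The chunked loop Delethink trains can be sketched at the control-flow level: generate a bounded chunk, distill it to a compact textual state, reset, and continue. Everything below (`model`, `extract_state`, the `[state]` prompt format, the `FINAL:` stop marker) is a hypothetical stand-in, not the paper's interface.

```python
def markovian_reasoning(model, extract_state, question,
                        chunk_tokens=8192, max_chunks=4):
    """Reason in fixed-size chunks, carrying only a compact textual
    state across boundaries, so context stays O(chunk) while total
    thinking length grows linearly."""
    state = ""
    for _ in range(max_chunks):
        prompt = f"{question}\n[state]{state}[/state]\nContinue reasoning:"
        chunk = model(prompt, budget=chunk_tokens)   # one bounded chunk
        if "FINAL:" in chunk:                        # model signals an answer
            return chunk.split("FINAL:", 1)[1].strip()
        state = extract_state(chunk)                 # compact carry-over
    return state

# Toy stand-ins just to exercise the control flow.
toy_model = lambda prompt, budget: "FINAL: 42"
print(markovian_reasoning(toy_model, lambda c: c[-200:], "6*7?"))
```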
"Can diffusion-based LLMs outpace traditional autoregressive models ⚡🧠 Meet dInfer the first efficient modular framework for inference on diffusion-based large language models (dLLMs) a new generation of parallel text generators. dInfer breaks inference into four key modules: - Model core architecture integration - Diffusion iteration manager orchestrates denoising steps - KV-cache manager optimizes memory reuse - Decoding strategy balances speed and quality With both algorithmic and system-level optimizations dInfer hits 1100 tokens/sec on HumanEval and 800+ tokens/sec across benchmarks on"
X Link @jiqizhixin 2025-10-16T03:22Z 10.3K followers, XXX engagements
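The four-module decomposition maps naturally onto a small interface sketch; the class and method names below are invented placeholders showing how the pieces could compose, not dInfer's actual API.

```python
class DiffusionLLMPipeline:
    """Illustrative decomposition matching the four modules named in
    the post; each component here is a placeholder interface."""
    def __init__(self, model, iteration_mgr, kv_cache, decoder):
        self.model = model                  # core architecture
        self.iteration_mgr = iteration_mgr  # orchestrates denoising steps
        self.kv_cache = kv_cache            # memory reuse across steps
        self.decoder = decoder              # speed/quality trade-off

    def generate(self, prompt_tokens):
        seq = self.iteration_mgr.init_sequence(prompt_tokens)
        for step in self.iteration_mgr.schedule():
            cache = self.kv_cache.fetch(step)
            logits = self.model(seq, cache)
            seq = self.decoder.update(seq, logits, step)  # parallel unmasking
            self.kv_cache.store(step, cache)
        return seq
```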
"Are Gaussian Splatting's limitations holding back the future of 3D surface reconstruction 🤔 Enter GeoSVR a novel framework that leverages sparse voxels to create stunningly accurate detailed and complete 3D surfaces. By using a Voxel-Uncertainty Depth Constraint and Sparse Voxel Surface Regularization GeoSVR overcomes common challenges in the field ensuring geometric consistency and sharp details. Experiments show it outperforms existing methods in accuracy and completeness especially in difficult scenarios"
X Link @jiqizhixin 2025-10-15T02:18Z 10.3K followers, 1234 engagements
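One way to read the Voxel-Uncertainty Depth Constraint is as an uncertainty-weighted depth prior: supervise rendered depth with an external depth estimate mainly where the voxel scene is still uncertain. The weighting and L1 form below are assumptions for illustration, not GeoSVR's exact formulation.

```python
import numpy as np

def voxel_uncertainty_depth_loss(rendered_depth, prior_depth, uncertainty):
    """Hedged sketch: a depth prior supervises rendered depth, but its
    weight shrinks where the voxel scene is already confident."""
    w = uncertainty / (uncertainty.max() + 1e-8)   # trust prior only when unsure
    return float(np.mean(w * np.abs(rendered_depth - prior_depth)))
```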
"Can a vision-language model teach itself to reasonwithout any human labels 👀 Meet Vision-Zero a new framework that lets VLMs improve through competitive visual games instead of costly datasets. Heres how it works: - Strategic Self-Play: models play Whos the Spy-style games generating their own training data. - Any Images Any Domain: from synthetic scenes to real-world photos Vision-Zero builds reasoning through play. - Iterative-SPO: a new loop that alternates self-play with RL sustaining long-term gains. The result Label-free state-of-the-art reasoning outperforming even annotation-heavy"
X Link @jiqizhixin 2025-10-13T06:34Z 10.3K followers, 6577 engagements
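Iterative-SPO, as described, alternates self-play data generation with an RL update; the loop can be sketched as below, where `make_game` and `rl_update` are hypothetical callables standing in for the game environment and the policy-optimization step.

```python
def iterative_spo(vlm, make_game, rl_update, rounds=5, games_per_round=64):
    """Alternate (1) strategic self-play that yields reward-labeled
    episodes for free and (2) an RL policy update on those episodes."""
    for _ in range(rounds):
        episodes = []
        for _ in range(games_per_round):
            game = make_game()               # e.g. a "Who's the Spy" round
            episodes.append(game.play(vlm))  # self-play phase
        vlm = rl_update(vlm, episodes)       # policy-improvement phase
    return vlm
```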
"This is huge A UCLA team managed to build an optical generative model that runs on light instead of GPUs. In their demo a shallow encoder maps noise into phase patterns which a free-space optical decoder then transforms into imagesdigits fashion butterflies faces even Van Goghstyle artwithout any computation during synthesis. ⚡ The results rival digital diffusion models pointing to ultra-fast energy-efficient AI powered by photonics. Optical generative models Nature Paper:"
X Link @jiqizhixin 2025-10-02T05:09Z 10.3K followers, 174.2K engagements
"📬 #PapersAccepted by Jiqizhixin Our report: GeoSVR: Taming Sparse Voxels for Geometrically Accurate Surface Reconstruction Beihang University Rawmantic AI and others Paper: Project: Code:"
X Link @jiqizhixin 2025-10-15T02:18Z 10.3K followers, XXX engagements
"Can autonomous driving think like it sees not just reason symbolically Alibaba and other propose a spatio-temporal Chain-of-Thought (CoT) that lets visual language models (VLMs) reason visually generating imagined future frames to plan trajectories. By unifying visual generation + understanding the model acts as a world simulator predicting how the scene evolves over time not just describing it. 📈 Results show stronger visual reasoning and planning moving autonomous driving beyond text-based logic toward true simulation-based intelligence. This paper has been accepted as a NeurIPS 2025"
X Link @jiqizhixin 2025-10-07T03:47Z 10.3K followers, XXX engagements
"Well you may not need fine-tuning anymore. Meet ACE (Agentic Context Engineering) a framework that turns LLM contexts into living adaptive playbooks that grow and refine over time. Unlike traditional context-tuning (which suffers from brevity bias and context collapse) ACE uses structured generation reflection curation cycles to preserve rich domain insights and scale with long-context models. Results: ✅ +10.6% on agent benchmarks ✅ +8.6% on finance reasoning ✅ Lower latency & rollout cost Matches top production agents on AppWorld and beats them on harder tests all with smaller open-source"
X Link @jiqizhixin 2025-10-11T06:51Z 10.3K followers, 1352 engagements
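One ACE-style generation, reflection, and curation cycle can be sketched as three LLM calls that grow the playbook incrementally instead of rewriting it wholesale (the failure mode the post calls context collapse). The prompts and the append-only update below are illustrative assumptions.

```python
def ace_update(llm, playbook, task):
    """One sketched cycle: act with the current playbook, reflect on
    the outcome, then curate the lesson as a structured increment."""
    answer = llm(f"Playbook:\n{playbook}\n\nTask: {task}")
    lesson = llm(f"Task: {task}\nAnswer: {answer}\n"
                 "Reflect: what worked, what failed?")
    bullet = llm(f"Compress this lesson into one reusable playbook bullet:\n{lesson}")
    return playbook + "\n- " + bullet   # grow-and-refine, never overwrite
```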
"Say goodbye to GRPOGVPO is here GVPO (Group Variance Policy Optimization) proposed by a NeurIPS 2025 paper from HKUST(GZ) and Zuoyebang is a new algorithm that tackles the instability plaguing advanced post-training methods like GRPO. GVPO introduces an analytical solu tion to the KL-constrained reward maximization problem and bakes it directly into its gradient weights aligning every update with the true optimal policy. Why it matters: - Stable by design guarantees a unique optimal solution - Flexible sampling no on-policy or importance sampling constraints - Physically intuitive the"
X Link @jiqizhixin 2025-10-16T03:53Z 10.3K followers, 4546 engagements
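A hedged reconstruction of the idea from the post: treat beta times the policy/reference log-ratio as an implicit reward and regress its group-centered value toward the group-centered true reward, so the gradient weights encode the analytical optimum of the KL-constrained problem. This exact loss form is inferred from the description, not the paper's verified code.

```python
import torch

def gvpo_loss(logp_theta, logp_ref, rewards, beta=0.1):
    """Sketched GVPO-style objective over one sampled group: match
    centered implicit rewards to centered true rewards."""
    implicit = beta * (logp_theta - logp_ref)   # implicit reward per sample
    centered_implicit = implicit - implicit.mean()
    centered_reward = rewards - rewards.mean()
    return ((centered_implicit - centered_reward) ** 2).mean()
```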
"📬 #PapersAccepted by Jiqizhixin Our report: Expert-as-a-Service: Towards Efficient Scalable and Robust Large-scale MoE Serving National University of Singapore Shanghai Qiji Zhifeng Co. Ltd. and others Paper:"
X Link @jiqizhixin 2025-10-15T02:28Z 10.3K followers, XXX engagements
"Yes it turns out diffusion models can learn from feedback as effectively as language models do with RL Tsinghua NVIDIA and Stanford introduced Diffusion Negative-aware FineTuning (DiffusionNFT) a new online reinforcement learning paradigm that finally makes RL practical for diffusion models. Instead of struggling with intractable likelihoods or reverse-sampling hacks DiffusionNFT works directly on the forward process via flow matching contrasting positive vs. negative generations to guide improvement. ✨ Key perks: - Works with any black-box solver no likelihood estimation needed. - CFG-free"
X Link @jiqizhixin 2025-10-10T03:58Z 10.3K followers, 26.9K engagements
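The post's key claim, that DiffusionNFT works on the forward process via flow matching by contrasting positive and negative generations, can be sketched roughly as below; the linear interpolation, velocity target, and simple +/- weighting are illustrative assumptions rather than the paper's exact objective.

```python
import torch

def diffusion_nft_loss(model, x_pos, x_neg, t, noise):
    """Sketch of negative-aware fine-tuning on the *forward* process:
    fit the flow-matching target on reward-positive samples while
    pushing away from reward-negative ones."""
    def fm_loss(x0):
        xt = (1 - t) * x0 + t * noise       # linear flow interpolation
        v_target = noise - x0               # flow-matching velocity target
        v_pred = model(xt, t)               # hypothetical velocity predictor
        return ((v_pred - v_target) ** 2).mean()
    # Minimizing this reinforces positives and suppresses negatives.
    return fm_loss(x_pos) - fm_loss(x_neg)
```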
"📬 #PapersAccepted by Jiqizhixin Our report: UserRL: Training Interactive User-Centric Agent via Reinforcement Learning Salesforce AI Research University of Illinois Urbana-Champaign Paper: Code:"
X Link @jiqizhixin 2025-10-08T07:31Z 10.3K followers, XXX engagements
"Nice survey on Reinforcement Learning. This comprehensive survey covers XXX papers and maps how RL empowers LLMs across their full lifecycle from pre-training and alignment fine-tuning to reinforced reasoning where models learn to think better through verifiable feedback. It highlights RL with Verifiable Rewards (RLVR) as a key step toward more reliable interpretable and self-improving AI systems while cataloging datasets benchmarks and open-source frameworks that drive the field. 📚 A must-read for those exploring the frontier of RL-enhanced reasoning and alignment in next-gen LLMs"
X Link @jiqizhixin 2025-10-07T07:32Z 10.3K followers, XXX engagements
"Is building a state-of-the-art Large Multimodal Model (LMM) from scratch prohibitively expensive LLaVA-OneVision-1.5 says no. It's a family of open efficient and reproducible LMMs that deliver top-tier performance on a budget. The team developed a complete end-to-end framework including massive curated datasets (85M for pre-training 22M for instruction tuning) enabling training for under $16000. The results are stunning: - The 8B model outperforms Qwen2.5-VL-7B on XX of XX benchmarks. - The 4B model surpasses Qwen2.5-VL-3B on all XX benchmarks. This work democratizes access to building"
X Link @jiqizhixin 2025-10-15T08:30Z 10.3K followers, 1207 engagements
"An intriguing paper from Apple. MoEs Are Stronger than You Think: Hyper-Parallel Inference Scaling with RoE Paper:"
X Link @jiqizhixin 2025-10-05T06:58Z 10.3K followers, 31.8K engagements
"📬 #PapersAccepted by Jiqizhixin Our report: LLaVA-OneVision-1.5: Fully Open Framework for Democratized Multimodal Training LLaVA-OneVision Community Contributors Code: Paper: Model & data: Demo:"
X Link @jiqizhixin 2025-10-15T08:30Z 10.3K followers, XXX engagements
"Struggling to deploy massive Mixture-of-Experts (MoE) models without system instability EaaS is a novel serving system that makes MoE deployment efficient scalable and robust. It works by disaggregating MoE modules into independent stateless microservices. This clever design enables fine-grained resource scaling and provides inherent fault tolerance. The system is powered by a high-performance CPU-free communication library to ensure minimal overhead. The outcome is a system that saves up to XXXX% of computing resources by adapting to traffic and suffers less than a X% throughput reduction"
X Link @jiqizhixin 2025-10-15T02:28Z 10.3K followers, 1078 engagements
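The "stateless expert microservices" design can be sketched as a router that treats each expert as an independent endpoint and fails over between replicas, which is where the inherent fault tolerance comes from. All class and function names below are invented for illustration; the real system uses a high-performance CPU-free communication layer, not Python stubs, and a learned router rather than random gating.

```python
import random

class ExpertService:
    """Stateless expert endpoint (stand-in for a real RPC client)."""
    def __init__(self, name):
        self.name, self.healthy = name, True
    def __call__(self, tokens):
        if not self.healthy:
            raise ConnectionError(self.name)
        return [f"{self.name}({t})" for t in tokens]

def route(tokens, experts, replicas_per_expert=2):
    """Dispatch each token to an expert service; a failed replica is
    simply skipped and another tried (inherent fault tolerance)."""
    out = []
    for t in tokens:
        choices = random.sample(experts, k=min(replicas_per_expert, len(experts)))
        for svc in choices:
            try:
                out += svc([t])
                break
            except ConnectionError:
                continue                    # fail over to the next replica
    return out

experts = [ExpertService(f"expert{i}") for i in range(4)]
experts[0].healthy = False                  # simulate a crashed replica
print(route(["tok1", "tok2"], experts))
```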