[GUEST ACCESS MODE: Data is scrambled or limited to provide examples. Make requests using your API key to unlock full data. Check https://lunarcrush.ai/auth for authentication information.]
@HuggingPapers
"ByteDance introduces FR3E, a new RL framework that tackles unstable exploration in LLM reasoning tasks. It enables more stable training and boosts accuracy on AIME24, leading to robust and structured outputs" @HuggingPapers on X 2025-07-11 00:20:00 UTC 3693 followers, XXX engagements
"TikTok just open-sourced SWE-Perf, a first-of-its-kind benchmark that pits LLMs against XXX real-world repository-level performance tasks distilled from XXX k GitHub PRs" @HuggingPapers on X 2025-07-17 20:06:12 UTC 3694 followers, XXX engagements
"New paper: A Systematic Analysis of Hybrid Linear Attention. Researchers from @ByteDance_AI & @UCSC_AI extensively analyzed XX hybrid models, uncovering efficient ways to achieve Transformer-level recall. Learn more:" @HuggingPapers on X 2025-07-10 20:07:41 UTC 3693 followers, XXX engagements
"Alibaba Group unveils Ovis-U1: a powerful 3-billion-parameter unified model for multimodal understanding, text-to-image generation, and image editing" @HuggingPapers on X 2025-07-01 08:08:23 UTC 3667 followers, 1191 engagements
"Google DeepMind just dropped Gemini XXX Pro, a thinking model that hits SoTA on frontier coding & reasoning while juggling 3-hour videos" @HuggingPapers on X 2025-07-19 04:12:18 UTC 3698 followers, XXX engagements
"Mistral released Voxtral on Hugging Face. Incorporates state-of-the-art audio input capabilities into LLMs while retaining best-in-class text performance. 3B and 24B variants. It excels at speech transcription, translation, and audio understanding" @HuggingPapers on X 2025-07-16 15:50:44 UTC 3693 followers, XXX engagements
"New paper: LayerCake introduces a novel token-aware layer-localized contrastive decoding method. It boosts LLM factual generation by aligning token types with specific Transformer layers. No training or model modification needed" @HuggingPapers on X 2025-07-16 00:19:13 UTC 3662 followers, XXX engagements
"ByteDance unveils CriticLean: a novel framework for reliable mathematical formalization via critic-guided Reinforcement Learning. It makes Lean X proofs more accurate" @HuggingPapers on X 2025-07-10 00:18:52 UTC 3693 followers, XXX engagements
"A 165-page survey distilling 1300+ papers into Context Engineering, the discipline that moves AI from static prompts to dynamic production-grade context orchestration" @HuggingPapers on X 2025-07-18 04:18:56 UTC 3673 followers, XXX engagements
"ByteDance Seed team unveils PyVision. This new framework enables MLLMs to dynamically generate, execute, and refine Python-based tools for flexible, interactive visual reasoning. A big leap towards more agentic AI" @HuggingPapers on X 2025-07-13 12:08:32 UTC 3695 followers, 5596 engagements
"Introducing T-LoRA. Customize Diffusion Models with just one image and say goodbye to overfitting. Achieve unmatched fidelity & diversity" @HuggingPapers on X 2025-07-11 12:09:34 UTC 3671 followers, 13.2K engagements
"Kwai Keye-VL Technical Report just dropped on Hugging Face. Kuaishou introduces an 8-billion-parameter multimodal foundation model engineered for cutting-edge short-video understanding. Achieves state-of-the-art results on public video benchmarks and maintains robust general-purpose vision-language abilities" @HuggingPapers on X 2025-07-03 04:13:24 UTC 3675 followers, XXX engagements
"Is your LLM truly reasoning or just memorizing? New research uncovers critical data contamination in popular benchmarks for RL-enhanced LLM reasoning. Only accurate reward signals prove to yield true performance gains on clean data" @HuggingPapers on X 2025-07-15 12:10:05 UTC 3664 followers, XXX engagements
"From ByteDance: EmbRACE-3K boasts 3000+ language-guided tasks in photorealistic Unreal Engine environments for embodied AI. Leading VLMs currently score XX% zero-shot. Fine-tuning Qwen2.5-VL-7B yields strong gains. Learn more:" @HuggingPapers on X 2025-07-15 08:10:35 UTC 3661 followers, XXX engagements
"Alibaba Group's HumanOmniV2 is here. It enables omni-modal LLMs to deeply understand human intentions by reasoning with global context. This new framework tackles common shortcut problems and achieves state-of-the-art results" @HuggingPapers on X 2025-07-07 00:21:14 UTC 3672 followers, 15.4K engagements
"Google Research and collaborators unveil AgentsNet, a new benchmark for multi-agent LLM coordination. Tests how LLMs collaborate in networks scaling up to XXX agents" @HuggingPapers on X 2025-07-17 00:20:20 UTC 3698 followers, 8897 engagements
"Alibaba Group and UIUC introduce PAPO, a new RL framework for multimodal reasoning. It teaches models to perceive & reason by tackling the visual perception bottleneck. XXXX% reduction in perception errors. XXX% overall gain. No extra data needed" @HuggingPapers on X 2025-07-10 04:13:55 UTC 3667 followers, 1021 engagements
"Tencent just released RLVER on Hugging Face. It's a pioneering reinforcement learning framework that uses verifiable emotion rewards to create truly empathetic AI agents" @HuggingPapers on X 2025-07-13 00:21:27 UTC 3695 followers, 1237 engagements
"New paper: EmbRACE-3K. A groundbreaking dataset for embodied AI agents, pushing models to reason & act in complex photorealistic Unreal Engine environments. Current VLMs struggle in these interactive settings. Ready for the next generation of intelligent agents?" @HuggingPapers on X 2025-07-15 08:10:32 UTC 3660 followers, XXX engagements
"MMHU: a massive-scale multimodal benchmark for human behavior understanding in autonomous driving. 57k human motion clips & 1.73M frames from Waymo, YouTube & self-collected data with rich annotations for motion trajectory intention & safety-critical labels" @HuggingPapers on X 2025-07-17 18:41:55 UTC 3698 followers, 7981 engagements
"New research from Kuaishou Technology: VMoBA is a game-changing sparse attention mechanism that dramatically accelerates video diffusion model training & inference" @HuggingPapers on X 2025-07-01 04:26:39 UTC 3675 followers, XXX engagements
"Discover the challenges of multi-agent LLM coordination as networks scale. Explore interactive demos and dive into the data. Paper: Dataset: Demo:" @HuggingPapers on X 2025-07-17 00:20:22 UTC 3697 followers, XXX engagements
"Deep dive into LayerCake: a novel approach to boosting LLM factuality. This decoding-time strategy precisely targets attention across layers and token types for improved reliability. Read the paper: Code:" @HuggingPapers on X 2025-07-16 00:19:15 UTC 3675 followers, XXX engagements
"New paper just dropped on Hugging Face: a statistical diagnosis for training LLM web agents. It shows how combining SFT with on-policy RL achieves SOTA performance using only XX% of the compute required by pure SFT" @HuggingPapers on X 2025-07-12 08:07:30 UTC 3673 followers, XXX engagements
"A 165-page survey distilling 1300+ papers into Context Engineering - the discipline that moves AI from static prompts to dynamic production-grade context orchestration" @HuggingPapers on X 2025-07-18 07:13:41 UTC 3696 followers, 1349 engagements
"NVIDIA just released Long-RL. It's a full-stack framework scaling reinforcement learning to long videos, up to 256k tokens on a single A100 node" @HuggingPapers on X 2025-07-11 04:19:46 UTC 3691 followers, 8548 engagements
"One Token to Fool LLM-as-a-Judge. A single ":" or "Let's solve this step-by-step" can dupe generative reward models into giving false-positive scores, highlighting a critical flaw in RL pipelines" @HuggingPapers on X 2025-07-19 08:06:20 UTC 3698 followers, XXX engagements