# ![@SFResearch Avatar](https://lunarcrush.com/gi/w:26/cr:twitter::2827069807.png) @SFResearch Salesforce AI Research

Salesforce AI Research posts on X most often about ai, llm, data, and check. They currently have [------] followers and [---] posts that are still getting attention, totaling [-----] engagements in the last [--] hours.

### Engagements: [-----] [#](/creator/twitter::2827069807/interactions)
![Engagements Line Chart](https://lunarcrush.com/gi/w:600/cr:twitter::2827069807/c:line/m:interactions.svg)

- [--] Week [-----] -72%
- [--] Month [-------] -76%
- [--] Months [---------] -43%
- [--] Year [---------] +558%

### Mentions: [--] [#](/creator/twitter::2827069807/posts_active)
![Mentions Line Chart](https://lunarcrush.com/gi/w:600/cr:twitter::2827069807/c:line/m:posts_active.svg)

- [--] Week [--] +150%
- [--] Month [--] +86%
- [--] Months [---] +8.20%
- [--] Year [---] -11%

### Followers: [------] [#](/creator/twitter::2827069807/followers)
![Followers Line Chart](https://lunarcrush.com/gi/w:600/cr:twitter::2827069807/c:line/m:followers.svg)

- [--] Week [------] +0.01%
- [--] Month [------] +1.50%
- [--] Months [------] +9.90%
- [--] Year [------] +19%

### CreatorRank: [-------] [#](/creator/twitter::2827069807/influencer_rank)
![CreatorRank Line Chart](https://lunarcrush.com/gi/w:600/cr:twitter::2827069807/c:line/m:influencer_rank.svg)

### Social Influence

**Social category influence**
[technology brands](/list/technology-brands)  57.69% [stocks](/list/stocks)  56.15% [finance](/list/finance)  1.54% [countries](/list/countries)  0.77% [travel destinations](/list/travel-destinations)  0.77% [social networks](/list/social-networks)  0.77%

**Social topic influence**
[ai](/topic/ai) 62.31%, [llm](/topic/llm) #864, [data](/topic/data) 11.54%, [check](/topic/check) 10%, [agentic](/topic/agentic) #1375, [systems](/topic/systems) 5.38%, [leaderboard](/topic/leaderboard) 4.62%, [$crm](/topic/$crm) #40, [software](/topic/software) 3.85%, [the first](/topic/the-first) 3.85%

**Top accounts mentioned or mentioned by**
[@caimingxiong](/creator/undefined) [@silviocinguetta](/creator/undefined) [@jasonwu0731](/creator/undefined) [@lijunnan0409](/creator/undefined) [@jotyshafiq](/creator/undefined) [@huanwang](/creator/undefined) [@salesforce](/creator/undefined) [@semihyavuz](/creator/undefined) [@liuzuxin](/creator/undefined) [@yingbozhouai](/creator/undefined) [@iscreamnearby](/creator/undefined) [@philippelaban](/creator/undefined) [@stevenhoi](/creator/undefined) [@huggingface](/creator/undefined) [@streattaylor](/creator/undefined) [@violetnpeng](/creator/undefined) [@ucla](/creator/undefined) [@hllowrld](/creator/undefined) [@virprabh](/creator/undefined) [@yutongdai](/creator/undefined)

**Top assets mentioned**
[Salesforce Inc (CRM)](/topic/$crm) [New Gold Inc. (NGD)](/topic/$ngd)
### Top Social Posts
Top posts by engagements in the last [--] hours

"๐Ÿงต Jason Wu (@jasonwu0731) on our simulation and trustworthy AI work. (1/3) #FutureOfAI #TrustworthyAI"  
[X Link](https://x.com/SFResearch/status/2001727006711648337)  2025-12-18T18:51Z 19.1K followers, [---] engagements


"Demographics aren't enough to simulate human behavior. ๐Ÿง  New research introduces SCOPE: a framework and persona dataset collection that moves beyond demographic templates to build richer AI personas grounded in sociopsychological structure. ๐Ÿ“„ Paper: Key findings across [--] models: Demographics alone explain only 1.5% of variance in human responses Adding traits values & identity narratives improves behavioral alignment while reducing bias SCOPE personas outperform existing approaches on SimBench an external social and behavioural benchmark. The work also shows SCOPE can augment existing"  
[X Link](https://x.com/SFResearch/status/2017029560500400327)  2026-01-30T00:18Z 19.1K followers, [----] engagements


"At Salesforce AI Research we believe that the most transformative breakthroughs happen when we collaborate with the brightest minds in the academic community. Meet our [----] academic grant recipients: #FutureOfAI #EnterpriseAI https://sforce.co/4rmJ8DA https://sforce.co/4rmJ8DA"  
[X Link](https://x.com/SFResearch/status/2017376399804260396)  2026-01-30T23:16Z 19.1K followers, 11.9K engagements


"(2/4) Nanyun (Violet) Peng @VioletNPeng @UCLA is developing MAP-SE a Multi-Agent Persuasion Simulation Engine that studies how influence emerges across adaptive agents with long-term memorymoving beyond simple one-on-one interactions. #AIResearch #AgenticAI"  
[X Link](https://x.com/SFResearch/status/2017376575700828297)  2026-01-30T23:17Z 19.1K followers, [----] engagements


"@VioletNPeng @UCLA (3/4) Victor Zhong @hllo_wrld @UWaterloo is creating a framework for "distributional evaluation" using Language Personas grounded in real user data. This enables more realistic scalable AI testing that identifies bias and failure modes. #AIEvaluation #TrustedAI"  
[X Link](https://x.com/SFResearch/status/2017376659817533695)  2026-01-30T23:17Z 19.1K followers, [----] engagements


"@VioletNPeng @UCLA @hllo_wrld @UWaterloo (4/4) Percy Liang @percyliang @Stanford is building fully open-source models for agentic tasks. Using the Marin framework his team is training 8B and 32B models with complete transparencyfrom data curation to reinforcement learning. #OpenSource #AgenticAI"  
[X Link](https://x.com/SFResearch/status/2017376739236733027)  2026-01-30T23:17Z 19.1K followers, [----] engagements


"Last month @SFResearch team members created onesies for donation to @Hospital_Art supporting their mission to bring art to hospitals worldwide. Proud of our team's commitment to compassion and community impact. #AIforGood"  
[X Link](https://x.com/SFResearch/status/2018003985324535831)  2026-02-01T16:50Z 19.1K followers, [----] engagements


"The next phase of AI isn't about larger models. We're engineering systems that give LLMs what they lack: long-term memory multistep reasoning capabilities and orchestration that enables real-world action. This is how AI moves from chatbots to business transformation. https://sforce.co/46sEUlP https://sforce.co/46sEUlP"  
[X Link](https://x.com/SFResearch/status/2018816271572152767)  2026-02-03T22:38Z 19.1K followers, [---] engagements


"(2/22) GTA1: GUI Test-time Scaling Agent: GTA1 introduces test-time scaling for GUI agents using multiple candidate action proposals and RL-based grounding to achieve state-of-the-art performance on autonomous task completion across platforms. Authors: Yan Yang Dongxu Li Yutong Dai Yuhao Yang Ziyang Luo Zirui Zhao Zhiyuan Hu Junzhe Huang Amrita Saha Zeyuan Chen Ran Xu Liyuan Pan Silvio Savarese Caiming Xiong Junnan Li https://bit.ly/4o04fdX https://twitter.com/i/web/status/2019787013247897845 https://bit.ly/4o04fdX https://bit.ly/4o04fdX https://twitter.com/i/web/status/2019787013247897845"  
[X Link](https://x.com/SFResearch/status/2019787013247897845)  2026-02-06T14:55Z 19.1K followers, [--] engagements
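
The post only names the mechanism, so here is a minimal sketch of the general test-time-scaling pattern it describes: sample several candidate GUI actions, score them with a grounding/judge model, and execute the highest-scoring one. The function names (`propose_actions`, `score_action`) and the scoring heuristic are hypothetical stand-ins, not GTA1's actual interface.

```python
from dataclasses import dataclass

@dataclass
class Action:
    description: str   # e.g. "click the 'Submit' button at (412, 310)"
    confidence: float  # score assigned by the grounding/judge model

def propose_actions(screenshot: bytes, goal: str, n: int) -> list[str]:
    """Hypothetical: query a GUI policy model n times (high temperature)
    and return n candidate action descriptions."""
    return [f"candidate action {i} toward: {goal}" for i in range(n)]

def score_action(screenshot: bytes, goal: str, action: str) -> float:
    """Hypothetical: ask a grounding model how well this action advances
    the goal on the current screen (higher is better)."""
    return 1.0 / (1 + len(action) % 7)  # placeholder heuristic

def select_action(screenshot: bytes, goal: str, n_candidates: int = 8) -> Action:
    # Test-time scaling: spend extra compute on several proposals, keep the best.
    candidates = propose_actions(screenshot, goal, n_candidates)
    scored = [Action(a, score_action(screenshot, goal, a)) for a in candidates]
    return max(scored, key=lambda a: a.confidence)

if __name__ == "__main__":
    print(select_action(b"", "open the settings page", n_candidates=4))
```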


"(3/22) Variation in Verification: Understanding Verification Dynamics in Large Language Models: Generative verifiers can help weak LLM generators nearly match stronger ones in test-time scaling (closing gaps by 75.5%) but verification effectiveness varies with problem difficulty and verifier scaling alone has limits. Authors: Yefan Zhou Austin Xu Yilun Zhou Janvijay Singh Jiang Gui Shafiq Joty https://bit.ly/3McnwuS https://twitter.com/i/web/status/2019787015319859635 https://bit.ly/3McnwuS https://bit.ly/3McnwuS https://twitter.com/i/web/status/2019787015319859635 https://bit.ly/3McnwuS"  
[X Link](https://x.com/SFResearch/status/2019787015319859635)  2026-02-06T14:55Z 19.1K followers, [--] engagements


"(18/22) Webscale-RL: Automated Data Pipeline for Scaling RL Data to Pretraining Levels: Authors: Zhepeng Cen Haolin Chen Shiyu Wang Zuxin Liu Zhiwei Liu Ding Zhao Silvio Savarese Caiming Xiong Huan Wang Weiran Yao https://bit.ly/3IFuMhf https://bit.ly/3IFuMhf"  
[X Link](https://x.com/SFResearch/status/2019787047251153181)  2026-02-06T14:55Z 19.1K followers, [--] engagements


"(19/22) Grounded Test-Time Adaptation for LLM Agents 10char link Authors: Arthur Chen Zuxin Liu Jianguo Zhang Akshara Prabhakar Zhiwei Liu Shelby Heinecke Silvio Savarese Victor Zhong Caiming Xiong"  
[X Link](https://x.com/SFResearch/status/2019787048941465788)  2026-02-06T14:55Z 19.1K followers, [--] engagements


"(8/22) WALT: Web Agents that Learn Tools: WALT reverse-engineers website functionality into reusable tools like search filter and createshifting from fragile step-by-step interactions to reliable tool invocation with higher success and fewer steps on VisualWebArena and WebArena. Authors: Viraj Prabhu @virprabh Yutong Dai @yutong_dai Matthew Fernandez Jing Gu @jinggu4ai Krithika Ramakrishnan Yanqi Luo Silvio Savarese @silviocinguetta Caiming Xiong @CaimingXiong Junnan Li @LiJunnan0409 Zeyuan Chen @ZeyuanChen Ran Xu @stanleyran โœ… Accepted to #ICLR2026 https://bit.ly/4nhJf0K"  
[X Link](https://x.com/SFResearch/status/2019822861918024056)  2026-02-06T17:17Z 19.1K followers, [--] engagements
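
As a rough illustration of the "website functionality as reusable tools" framing in the post, the snippet below wraps a hypothetical site search flow behind a typed tool interface that an agent can invoke instead of replaying raw clicks. The `Tool` class and `site_search` function are illustrative only, not WALT's API.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    name: str
    description: str
    run: Callable[..., dict]

def site_search(query: str, max_results: int = 5) -> dict:
    """Hypothetical reverse-engineered wrapper: instead of the agent issuing
    many fragile click/type steps, the whole search flow is one call."""
    return {"query": query, "results": [f"result {i}" for i in range(max_results)]}

SEARCH_TOOL = Tool(
    name="search",
    description="Search the site catalog and return the top results.",
    run=site_search,
)

if __name__ == "__main__":
    print(SEARCH_TOOL.run("noise-cancelling headphones", max_results=3))
```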


"(9/22) SCUBA: Salesforce Computer Use Benchmark: SCUBA benchmarks computer-use agents on [---] real Salesforce CRM tasks across admin sales and service workflowsopen-source agents achieve 5% success vs. 39% for closed-source models in zero-shot settings improving to 50% with demonstrations while reducing time and costs by 13-16%. Authors: Yutong Dai @yutong_dai Krithika Ramakrishnan Jing Gu @jinggu4ai Matthew Fernandez Yanqi Luo Viraj Prabhu @virprabh Zhenyu Hu Silvio Savarese @silviocinguetta Caiming Xiong @CaimingXiong Zeyuan Chen @ZeyuanChen Ran Xu @stanleyran โœ… Accepted to #ICLR2026"  
[X Link](https://x.com/SFResearch/status/2019822864141353188)  2026-02-06T17:17Z 19.1K followers, [--] engagements


"(12/22) Improving LLM Alignment with References: Reference-guided evaluation improves LLM-based evaluators and enables effective semi-self-improvementachieving 73.1% on AlpacaEval and 58.7% on Arena-Hard with Llama-3-8B-Instruct comparable to finetuned reward models. Authors: Kejian Shi Yixin Liu PeiFeng Wang @PeifengWang3 Alexander Fabbri Shafiq Rayhan Joty @JotyShafiq Arman Cohan โœ… Accepted to #ICLR2026 https://bit.ly/4am3czb https://bit.ly/4am3czb https://bit.ly/4am3czb https://bit.ly/4am3czb"  
[X Link](https://x.com/SFResearch/status/2019822871137194096)  2026-02-06T17:17Z 19.1K followers, [--] engagements


"(13/22) Distill-SynthKG: Distilling Knowledge Graph Synthesis Workflow for Improved Coverage and Efficiency: SynthKG introduces ontology-free KG synthesis that distills into Distill-SynthKG for efficient single-step generationsurpassing models 8x larger in KG quality and outperforming baselines in retrieval and question-answering with a novel graph-based RAG framework. Authors: Prafulla Kumar Choubey Xin Su Man Luo Xiangyu Peng @beckypeng6 Caiming Xiong @CaimingXiong Tiep Le Shachar Rosenman @SRosenamn Vasudev Lal @vasudev_lal Phil Mui Ricky Ho Phillip Howard Chien-Sheng Wu @jasonwu0731 โœ…"  
[X Link](https://x.com/SFResearch/status/2019822873678954916)  2026-02-06T17:17Z 19.1K followers, [--] engagements


"(17/22) Entropy-Based Block Pruning for Efficient Large Language Models: Entropy-based pruning outperforms cosine similarity methods by leveraging entropy patterns across Transformer blocksdecreasing early then increasingas a more effective measure of information richness for reducing model size while preserving accuracy. Authors: Liangwei Yang @Liangwei_Yang Yuhui Xu Juntao Tan Doyen Sahoo @doyensahoo Silvio Savarese @silviocinguetta Caiming Xiong @CaimingXiong Huan Wang @huan__wang Shelby Heinecke @shelbyh_ai โœ… Accepted to #ICLR2026 https://bit.ly/46cEErl https://bit.ly/46cEErl"  
[X Link](https://x.com/SFResearch/status/2019822881769672776)  2026-02-06T17:17Z 19.1K followers, [--] engagements
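
The post gives only the high-level signal (per-block entropy as a proxy for information richness). Below is a minimal, hedged sketch of that idea: run calibration data through a toy Transformer stack, estimate each block's output entropy with a simple histogram estimator, and mark the lowest-entropy blocks as pruning candidates. The toy model and the entropy estimator are illustrative assumptions, not the paper's exact procedure.

```python
import torch
import torch.nn as nn

def activation_entropy(x: torch.Tensor, bins: int = 64) -> float:
    """Crude Shannon-entropy estimate of a block's output activations,
    via a value histogram (illustrative, not the paper's estimator)."""
    hist = torch.histc(x.float().flatten(), bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * p.log()).sum())

torch.manual_seed(0)
d_model, n_blocks = 64, 8
blocks = nn.ModuleList(
    [nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True) for _ in range(n_blocks)]
)

x = torch.randn(4, 16, d_model)  # stand-in for a calibration batch
entropies = []
with torch.no_grad():
    for block in blocks:
        x = block(x)
        entropies.append(activation_entropy(x))

# Blocks with the lowest output entropy are the first pruning candidates.
ranked = sorted(range(n_blocks), key=lambda i: entropies[i])
print("pruning candidates (lowest entropy first):", ranked[:3])
```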


"(18/22) Webscale-RL: Automated Data Pipeline for Scaling RL Data to Pretraining Levels: Webscale-RL introduces a scalable pipeline converting pre-training documents into 1.2M verifiable QA pairs across 9+ domainsRL training on this dataset achieves continual pre-training performance with [---] fewer tokens offering a viable path to scaling RL to pre-training levels. Authors: Zhepeng Cen Haolin Chen Shiyu Wang Zuxin Liu @LiuZuxin Zhiwei Liu Ding Zhao Silvio Savarese @silviocinguetta Caiming Xiong @CaimingXiong Huan Wang @huan__wang Weiran Yao @iscreamnearby โœ… Accepted to #ICLR2026"  
[X Link](https://x.com/SFResearch/status/2019822884248498661)  2026-02-06T17:17Z 19.1K followers, [--] engagements


"(19/22) Grounded Test-Time Adaptation for LLM Agents: Parametric online adaptation aligns LLM agents to environment-specific formats while non-parametric dynamics grounding learns causal state transitions through persona-driven explorationtogether addressing syntactic and semantic mismatches to boost WebArena multi-site success from 2% to 23% Authors: Arthur Chen @arthurchen189 Zuxin Liu @LiuZuxin Jianguo Zhang @JianguoZhang3 Akshara Prabhakar @aksh_555 Zhiwei Liu Shelby Heinecke @shelbyh_ai Silvio Savarese @silviocinguetta Victor Zhong @hllo_wrld Caiming Xiong @CaimingXiong โœ… Accepted to"  
[X Link](https://x.com/SFResearch/status/2019822886307893687)  2026-02-06T17:17Z 19.1K followers, [--] engagements


"(20/22) Scalable Chain of Thoughts via Elastic Reasoning: Elastic Reasoning separates chain-of-thought into thinking and solution phases with independent budgets prioritizing solution completeness under constraintsachieving robust performance with lower training costs and more concise reasoning across math and coding benchmarks. Authors: Yuhui Xu @xyh6666 Hanze Dong @hendrydong Lei Wang @leiwang Doyen Sahoo @doyensahoo Junnan Li @LiJunnan0409 Caiming Xiong @CaimingXiong โœ… Accepted to #ICLR2026 https://bit.ly/4a0luqS https://bit.ly/4a0luqS https://bit.ly/4a0luqS https://bit.ly/4a0luqS"  
[X Link](https://x.com/SFResearch/status/2019822889101353115)  2026-02-06T17:17Z 19.1K followers, [--] engagements
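
The separation the post describes (independent token budgets for a thinking phase and a solution phase) can be illustrated with a small two-stage generation loop. The sketch below uses GPT-2 purely as a stand-in decoder to show the budgeting mechanics; the budgets, prompts, and two-call structure here are assumptions for illustration, not the method from the paper.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# GPT-2 is only a stand-in decoder to demonstrate the two-budget mechanics.
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def generate(prompt: str, budget: int) -> str:
    ids = tok(prompt, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=budget, do_sample=False,
                         pad_token_id=tok.eos_token_id)
    return tok.decode(out[0][ids.shape[1]:], skip_special_tokens=True)

question = "What is 17 * 24?"
THINK_BUDGET, SOLUTION_BUDGET = 48, 16  # independent budgets per phase

# Phase 1: thinking, capped at its own budget no matter how long it "wants" to run.
thinking = generate(f"Question: {question}\nLet's think step by step:", THINK_BUDGET)

# Phase 2: the solution phase always gets its full budget, so an answer is
# still produced even when the thinking phase was truncated.
solution = generate(
    f"Question: {question}\nReasoning (possibly truncated): {thinking}\nFinal answer:",
    SOLUTION_BUDGET,
)
print(solution)
```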


"(21/22) OFTSR: One-Step Flow for Image Super-Resolution with Tunable Fidelity-Realism Trade-offs: OFTSR achieves one-step image super-resolution with tunable fidelity-realism trade-off by aligning student predictions to teacher model sampling trajectoriesreaching state-of-the-art performance on FFHQ DIV2K and ImageNet without multi-step overhead. Authors: Yuanzhi Zhu Ruiqing Wang Shilin Lu Junnan Li @LiJunnan0409 Hanshu Yan @hanshu_yan Kai Zhang โœ… Accepted to #ICLR2026 https://bit.ly/3ZipY67 https://bit.ly/3ZipY67 https://bit.ly/3ZipY67 https://bit.ly/3ZipY67"  
[X Link](https://x.com/SFResearch/status/2019822891320086865)  2026-02-06T17:17Z 19.1K followers, [---] engagements


"๐Ÿšจ New paper and dataset alert: When do multi-agent systems actually help ๐Ÿ“„ Paper: Despite growing interest most multi-agent systems (MAS) today rely on local sequential and hand-designed orchestrationmaking it difficult to reason about their benefits or scale them effectively. Introducing MAS-Orchestra which takes a new perspective: treat MAS orchestration as a holistic function-calling reinforcement learning problem with an explicit notion of the degree of multi-agentness (DoM). We also introduce MASBench a benchmark designed to systematically measure when MAS outperforms single-agent"  
[X Link](https://x.com/SFResearch/status/2018388343151628766)  2026-02-02T18:17Z 19.1K followers, [----] engagements


"@Salesforce AI Research has [--] papers accepted to ICLR [----] advancing work across LLM reasoning evaluation systems knowledge graphs and agent architectures. Our research addresses critical challenges in making AI systems more reliable efficient and effective for enterprise applications. #ICLR2026 #FutureOfAI #EnterpriseAI https://twitter.com/i/web/status/2019822846122365374 https://twitter.com/i/web/status/2019822846122365374"  
[X Link](https://x.com/SFResearch/status/2019822846122365374)  2026-02-06T17:17Z 19.1K followers, [----] engagements


"(14/22) CoAct-1: Computer-using Multi-agent System with Coding Actions: CoAct-1 introduces a multi-agent system combining GUI control with programmatic executionan Orchestrator delegates subtasks to GUI Operator or Programmer agents achieving 60.76% success on OSWorld (new SOTA) while reducing steps from [--] to [-----]. Authors: Linxin Song Yutong Dai @yutong_dai Viraj Prabhu @virprabh Jieyu Zhang Taiwei Shi Li Li Junnan Li @LiJunnan0409 Silvio Savarese @silviocinguetta Zeyuan Chen @ZeyuanChen Jieyu Zhao Ran Xu @stanleyran Caiming Xiong @CaimingXiong โœ… Accepted to #ICLR2026"  
[X Link](https://x.com/SFResearch/status/2019822875675402397)  2026-02-06T17:17Z 19.1K followers, [--] engagements


"(15/22) LLMs Get Lost in Multi-Turn Conversation: LLMs show 39% performance drop in multi-turn vs. single-turn conversations across six tasksanalysis of 200000+ simulated conversations reveals they make premature assumptions and fail to recover when taking wrong turns. Authors: Philippe Laban @PhilippeLaban Hiroaki Hayashi @hiroakiLhayashi Yingbo Zhou Jennifer Neville @ProfJenNeville โœ… Accepted to #ICLR2026 https://bit.ly/3ZoC6T1 https://bit.ly/3ZoC6T1"  
[X Link](https://x.com/SFResearch/status/2019822877780984142)  2026-02-06T17:17Z 19.1K followers, [--] engagements


"(2/7) Software issue localization maps bug reports to code functions needing fixes. As codebases grow across languages manual localization becomes infeasible. SWERANK+ addresses this with multilingual ranking and agentic search"  
[X Link](https://x.com/SFResearch/status/2021051350751248412)  2026-02-10T02:39Z 19.1K followers, [---] engagements


"(3/7) SWERANKMULTI extends code ranking to [--] languages using SWELOCMULTI dataset (155K instances 4K+ repos): Bi-encoder retriever LLM reranker JavaScript Java TypeScript Ruby Rust Go PHP C C++ Python ๐Ÿ“„ https://arxiv.org/abs/2512.20482 https://arxiv.org/abs/2512.20482"  
[X Link](https://x.com/SFResearch/status/2021051352324112810)  2026-02-10T02:39Z 19.1K followers, [--] engagements
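
The two-stage pipeline named in the post (cheap bi-encoder retrieval over candidate functions, then an LLM reranker over the shortlist) can be sketched as follows; the embedding function and reranker call are hypothetical stubs, not the released SWERANK models.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical bi-encoder: in practice this would be the retriever model."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(128)
    return v / np.linalg.norm(v)

def llm_rerank(issue: str, candidates: list[str]) -> list[str]:
    """Hypothetical reranker: an LLM would re-order the shortlist by relevance."""
    return sorted(candidates, key=len)  # placeholder ordering

def localize(issue: str, functions: dict[str, str], k: int = 10) -> list[str]:
    issue_vec = embed(issue)
    # Stage 1: cheap bi-encoder retrieval over every function in the repo.
    scored = sorted(functions, key=lambda name: -float(embed(functions[name]) @ issue_vec))
    shortlist = scored[:k]
    # Stage 2: expensive LLM reranking only over the shortlist.
    return llm_rerank(issue, shortlist)

if __name__ == "__main__":
    repo = {"parse_config": "def parse_config(path): ...",
            "retry_request": "def retry_request(url, n): ..."}
    print(localize("requests are not retried on timeout", repo, k=2))
```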


"(6/7) The agentic approach mirrors developer workflows: broad exploration context building converging to root cause. Combines efficiency of specialized retrievers with depth of agentic reasoning. ๐Ÿ“„ https://arxiv.org/abs/2512.20482 https://arxiv.org/abs/2512.20482"  
[X Link](https://x.com/SFResearch/status/2021051359701893193)  2026-02-10T02:39Z 19.1K followers, [---] engagements


"(7/7) By @gangi_official @YeLiu918 @WentingZhao9 @_jaedoo2 @TarunSures41845 @daniel_js_lee @caimingxiong @yingbozhou @semih__yavuz @JotyShafiq at @Salesforce AI Research UIUC KAIST AI #FutureOfAI #EnterpriseAI"  
[X Link](https://x.com/SFResearch/status/2021051362352726122)  2026-02-10T02:39Z 19.1K followers, [---] engagements


"Deep research agents typically scale depthmore sequential steps. But what about scaling width ๐Ÿค” ๐Ÿ“„ Paper: We introduce Wide & Deep (W&D) research agents: a framework exploring parallel tool calling to boost performance while reducing costs and latency. Key results on BrowseComp HLE and GAIA: ๐Ÿ“Š Parallel tool calling improves accuracy across GPT-5 Gemini and Claude ๐Ÿ’ฐ 36% reduction in API costs 41% reduction in wall-clock time ๐ŸŽฏ W&D with GPT-5-Medium achieves 62.2% on BrowseCompbeating GPT-5-High's 54.9% Why it works: ๐Ÿ” Enhanced source credibility through diverse information gathering โœ…"  
[X Link](https://x.com/SFResearch/status/2021604439434412083)  2026-02-11T15:17Z 19.1K followers, [----] engagements
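
The "width" idea in the post (issue several tool calls for one reasoning step instead of one call per sequential step) is easy to picture with a short asyncio sketch. The `web_search` coroutine and the query templates are hypothetical placeholders, not the W&D framework itself.

```python
import asyncio

async def web_search(query: str) -> dict:
    """Hypothetical search tool; a real agent would hit a search/browse API."""
    await asyncio.sleep(0.1)  # stand-in for network latency
    return {"query": query, "snippets": [f"evidence for '{query}'"]}

async def wide_step(question: str) -> list[dict]:
    # Width: issue several independent tool calls for one reasoning step
    # and await them together, instead of adding more sequential (deep) steps.
    queries = [
        f"{question} primary source",
        f"{question} recent survey",
        f"{question} counter-evidence",
    ]
    return await asyncio.gather(*(web_search(q) for q in queries))

async def main():
    results = await wide_step("effect of parallel tool calling on agent latency")
    merged = [s for r in results for s in r["snippets"]]
    print(merged)

asyncio.run(main())
```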


"eVerse: CRMArena Org Data Generator The enterprise AI bottleneck isn't model performanceit's the last mile between pilot and production. ๐Ÿ”’ The Challenge: How do you rigorously test AI agents without touching real customer data Privacy regulations like GDPR prohibit using production data for development. But sanitized data removes the complexity and volume agents need to learn effectively. ๐Ÿ’ก The Solution: eVerse is our answer to enterprise AI's training problem: privacy-preserving simulation environments where AI agents can fail safely learn from realistic scenarios and improve before they"  
[X Link](https://x.com/SFResearch/status/2021275968296341615)  2026-02-10T17:31Z 19.1K followers, [---] engagements


"Salesforce AI Research is hiring a Research Scientist (including Senior/Lead level) with deep expertise in Self-Evolving Agents and Reinforcement Learning. Apply: We're expanding our Frontier RL stack to support continuous self-evolutionbuilding the foundation for autonomous systems that improve through interaction and experience. This role sits at the intersection of cutting-edge research and real-world enterprise applications where your work will directly impact how millions of CRM customers leverage AI. ๐Ÿง  Key requirements: PhD in ML/RL or equivalent research experience Strong publication"  
[X Link](https://x.com/SFResearch/status/2021976364413342162)  2026-02-12T15:55Z 19.1K followers, [----] engagements


"Introducing COVID-19 Search a new AI-powered search tool that equips scientists and researchers with the most relevant information about COVID-19. Learn more about this tool at https://sfdc.co/covid19search https://sfdc.co/covid19search"  
[X Link](https://x.com/SFResearch/status/1273267349513093120)  2020-06-17T14:52Z 19.1K followers, [---] engagements


"Announcing the Third Annual AI Research Grant For more details and how to apply: Blog: Website: Good luck to our future applicants https://einstein.ai/outreach/grants https://blog.einstein.ai/announcing-the-annual-salesforce-ai-research-grant/ https://einstein.ai/outreach/grants https://blog.einstein.ai/announcing-the-annual-salesforce-ai-research-grant/"  
[X Link](https://x.com/SFResearch/status/1292938586283630592)  2020-08-10T21:39Z 19.1K followers, [---] engagements


"Congrats to our ICLR [----] Accepted Paper Authors @CaimingXiong @jesse_vig @thisismadani @nazneenrajani @semih__yavuz @mrnt0810 @yingbozhou_ai @jasonwu0731 @sachin_logs @LiJunnan0409 @stevenhoi @panzhou9 @StrongDuality and all our amazing collaborators"  
[X Link](https://x.com/SFResearch/status/1351675114857799680)  2021-01-19T23:37Z 19.1K followers, [--] engagements


"Thank you to everyone who submitted a proposal to our third annual Salesforce AI Research Grant. Were proud to announce our [----] round of winners. Congratulations @bluevincent @Diyi_Yang @mutembesa @danqi_chen Read More: https://blog.einstein.ai/celebrating-the-winners-of-the-third-annual-salesforce-ai-research-grant/ https://blog.einstein.ai/celebrating-the-winners-of-the-third-annual-salesforce-ai-research-grant/"  
[X Link](https://x.com/SFResearch/status/1353859038073614336)  2021-01-26T00:15Z 19.1K followers, [---] engagements


"Were thrilled to announce that Silvio Savarese (@silviocinguetta) former associate professor of Computer Science at Stanford University has joined @salesforce as our new EVP and Chief Scientist of Salesforce Research"  
[X Link](https://x.com/SFResearch/status/1382731365712498689)  2021-04-15T16:23Z 19.1K followers, [---] engagements


"Congrats to our ACL [----] Accepted Paper Authors @CaimingXiong @JotyShafiq @baxterkb @jasonwu0731 @owenhaoliu @Wenpeng_Yin @huan__wang and all of our amazing collaborators"  
[X Link](https://x.com/SFResearch/status/1390442672557477888)  2021-05-06T23:05Z 19.1K followers, [--] engagements


"Can #AI language models learn from evolution to design proteins Learn how Salesforce is taking a step towards enabling solutions to cure disease and clean our planet. Blog: Paper: http://biorxiv.org/content/10.1101/2021.07.18.452833v1 http://blog.einstein.ai/learning-from-evolution/ http://biorxiv.org/content/10.1101/2021.07.18.452833v1 http://blog.einstein.ai/learning-from-evolution/"  
[X Link](https://x.com/SFResearch/status/1417152275290546178)  2021-07-19T16:00Z 19.1K followers, [--] engagements


"Meet CodeT5 - the first code-aware encoder-decoder pre-trained model that achieves SoTA on [--] sub-tasks in CodeXGLUE Learn how its disrupting software development. Blog: Paper: GitHub: #codeintelligence https://github.com/salesforce/CodeT5 https://arxiv.org/abs/2109.00859 http://blog.einstein.ai/codet5/ https://github.com/salesforce/CodeT5 https://arxiv.org/abs/2109.00859 http://blog.einstein.ai/codet5/"  
[X Link](https://x.com/SFResearch/status/1433825494600798209)  2021-09-03T16:13Z 19.1K followers, [--] engagements
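
For readers who want to try the released checkpoint, a minimal masked-span infilling example with the Hugging Face checkpoint, along the lines of the Salesforce/codet5-base model card, looks roughly like this:

```python
from transformers import RobertaTokenizer, T5ForConditionalGeneration

tokenizer = RobertaTokenizer.from_pretrained("Salesforce/codet5-base")
model = T5ForConditionalGeneration.from_pretrained("Salesforce/codet5-base")

# Ask the model to fill in the masked span (<extra_id_0>) in a code snippet.
text = "def greet(user): print(f'hello <extra_id_0>!')"
input_ids = tokenizer(text, return_tensors="pt").input_ids
generated = model.generate(input_ids, max_length=10)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```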


"Congrats to our NeurIPS [----] Accepted Paper Authors @CaimingXiong @yubai01 @huan__wang @lrvarshney @a1vinchan @thisismadani @benwkrause @nikhil_ai @LiJunnan0409 @ramprs21 @AkhileshGotmare @JotyShafiq @stevenhoi and all of our amazing collaborators"  
[X Link](https://x.com/SFResearch/status/1443358695899811841)  2021-09-29T23:35Z 19.1K followers, [--] engagements


"Do you want to launch your career in machine learning research Our new AI Residency Program can allow you to do just that. Set yourself up for success in applying to PhD programs w/ real-world experience at one of the industry's top AI research programs. https://sforce.co/AIResTwitter https://sforce.co/AIResTwitter"  
[X Link](https://x.com/SFResearch/status/1468977412457185283)  2021-12-09T16:14Z 19.1K followers, [---] engagements


"Discover CTRLsum a generic summarization framework that enables users to control the content of the generated summaries along multiple dimensions. Blog: Code: #NLP #summarization https://github.com/salesforce/ctrl-sum https://arxiv.org/abs/2012.04281 https://blog.einstein.ai/ctrlsum/ https://github.com/salesforce/ctrl-sum https://arxiv.org/abs/2012.04281 https://blog.einstein.ai/ctrlsum/"  
[X Link](https://x.com/SFResearch/status/1471255744364236805)  2021-12-15T23:07Z 19.1K followers, [--] engagements


"Did you know most #NLP models are not designed to handle code-mixing where each sentence contains multiple languages Learn how @samsontmr @SFResearch is changing that. Blog: Paper: Code: https://github.com/salesforce/adversarial-polyglots https://www.aclweb.org/anthology/2021.naacl-main.282 https://blog.salesforceairesearch.com/code-mixing https://github.com/salesforce/adversarial-polyglots https://www.aclweb.org/anthology/2021.naacl-main.282 https://blog.salesforceairesearch.com/code-mixing"  
[X Link](https://x.com/SFResearch/status/1485741225797771264)  2022-01-24T22:28Z 19.1K followers, [---] engagements


"Meet BLIP: Bootstrapping Language-Image Pre-training for unified Vision-Language understanding/generation. New model architecture + Dataset bootstrapping = SoTA results on a wider range of V+L tasks than other models http://blog.salesforceairesearch.com/blip-bootstrapping-language-image-pretraining http://blog.salesforceairesearch.com/blip-bootstrapping-language-image-pretraining"  
[X Link](https://x.com/SFResearch/status/1496965107863093249)  2022-02-24T21:47Z 19.1K followers, [--] engagements


"Discover CodeGen - an AI model that turns simple natural-language requests into executable code. Learn more about this breakthrough in conversational AI programming. Paper: Blog: Code: https://github.com/salesforce/CodeGen https://blog.salesforceairesearch.com/codegen/ https://arxiv.org/abs/2203.13474 https://github.com/salesforce/CodeGen https://blog.salesforceairesearch.com/codegen/ https://arxiv.org/abs/2203.13474"  
[X Link](https://x.com/SFResearch/status/1508946550780690432)  2022-03-29T23:17Z 19.1K followers, [---] engagements


"Want to build bots better Try Converse: a new Task-Oriented Dialogue System that simplifies chatbot building while handling complex tasks and conversations. #NLP #AI Code: Paper: Blog: https://blog.salesforceairesearch.com/converse-task-oriented-dialogue-system/ https://arxiv.org/abs/2203.12187 https://github.com/salesforce/converse https://blog.salesforceairesearch.com/converse-task-oriented-dialogue-system/ https://arxiv.org/abs/2203.12187 https://github.com/salesforce/converse"  
[X Link](https://x.com/SFResearch/status/1512481062194057220)  2022-04-08T17:22Z 19.1K followers, [--] engagements


"Check out our #NAACL2022 accepted papers Congrats to the authors We hope everyone enjoys the conference @EhsanHAsl @owenhaoliu @CaimingXiong @murakhovska @jasonwu0731 @alexfabbri4 @mrnt0810 @jesse_vig @iam_wkr @semih__yavuz @yingbozhou_ai @LHung1610 @stevenhoi @PhilippeLaban"  
[X Link](https://x.com/SFResearch/status/1515035881257668614)  2022-04-15T18:34Z 19.1K followers, [--] engagements


"Read our blog on #ACL2022. Congrats to all our authors for their accepted papers http://blog.salesforceairesearch.com/salesforce-at-acl-2022 http://blog.salesforceairesearch.com/salesforce-at-acl-2022"  
[X Link](https://x.com/SFResearch/status/1527342212089991182)  2022-05-19T17:35Z 19.1K followers, [--] engagements


"Our CodeGen models are now available at @huggingface (Model size variants: 350M 2B 6B and 16B.) Clone the latest transformers repository and try it out Paper: Models: https://huggingface.co/modelssearch=salesforce+codegen https://arxiv.org/abs/2203.13474 https://huggingface.co/modelssearch=salesforce+codegen https://arxiv.org/abs/2203.13474"  
[X Link](https://x.com/SFResearch/status/1541818596048875520)  2022-06-28T16:19Z 19.1K followers, [---] engagements
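
A minimal completion example with one of the smaller checkpoints from that Hugging Face collection (here Salesforce/codegen-350M-mono; swap in the 2B/6B/16B ids as needed) is roughly:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "Salesforce/codegen-350M-mono"  # other sizes: 2B, 6B, 16B
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

prompt = "# return the n-th Fibonacci number\ndef fib(n):"
inputs = tokenizer(prompt, return_tensors="pt")
completion = model.generate(**inputs, max_new_tokens=64,
                            pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(completion[0], skip_special_tokens=True))
```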


"CodeRL advances program synthesis by integrating pretrained language models + deep reinforcement learning. Using unit test feedback in model training and inference + an improved CodeT5 model it achieves SOTA results on competition-level programming tasks. https://blog.salesforceairesearch.com/coderl https://blog.salesforceairesearch.com/coderl"  
[X Link](https://x.com/SFResearch/status/1549536751785349120)  2022-07-19T23:28Z 19.1K followers, [---] engagements


"ETSformer is a time-series forecasting model that combines the classical intuition of seasonal-trend decomposition and exponential smoothing with the Transformer framework introducing novel exponential smoothing and frequency attention mechanisms. https://blog.salesforceairesearch.com/etsformer-time-series-forecasting https://blog.salesforceairesearch.com/etsformer-time-series-forecasting"  
[X Link](https://x.com/SFResearch/status/1562198187384877057)  2022-08-23T22:00Z 19.1K followers, [--] engagements


"Using both natural and artificial abilities the human relationship with tools has drastically evolved. The best tools are powerful because theyre easy to use. This is where our skill of language and AI meet. Learn more on how conversation can power AI https://blog.salesforceairesearch.com/age-of-conversational-ai/ https://blog.salesforceairesearch.com/age-of-conversational-ai/"  
[X Link](https://x.com/SFResearch/status/1577388085083602945)  2022-10-04T20:00Z 19.1K followers, [--] engagements


"Time-series forecasting methods perform poorly on long sequences when data changes over time. DeepTime overcomes this issue by using forecasting-as-meta-learning on deep time-index models. Result: state-of-the-art performance and a highly efficient model. https://blog.salesforceairesearch.com/deeptime-meta-learning-time-series-forecasting https://blog.salesforceairesearch.com/deeptime-meta-learning-time-series-forecasting"  
[X Link](https://x.com/SFResearch/status/1580649580726652928)  2022-10-13T20:00Z 19.1K followers, [---] engagements


"For time series forecasting deep learning isnt scalable for streaming data and non-stationary data makes it hard. FSNet learns deep forecasting models on the fly and handles non-stationary data + concept drift. Learn more https://blog.salesforceairesearch.com/fsnet-deep-time-series-forecasting/ https://blog.salesforceairesearch.com/fsnet-deep-time-series-forecasting/"  
[X Link](https://x.com/SFResearch/status/1586055193749233667)  2022-10-28T18:00Z 19.1K followers, [--] engagements


"Do you want to make your dog look like a golden retriever Or get a picture of a cat surfing Researchers at Salesforce recently developed a new editing algorithm called EDICT - here's a thread on the results and details ๐Ÿงต"  
[X Link](https://x.com/SFResearch/status/1612886999152857088)  2023-01-10T19:00Z 19.1K followers, 29.2K engagements


"Check out our #ICLR2023 Accepted Papers Congrats to all the authors @silviocinguetta @CaimingXiong @huan__wang @doyensahoo @yingbozhou_ai @hiroakiLhayashi @erik_nijkamp @stevenhoi @YuBai01 @jasonwu0731"  
[X Link](https://x.com/SFResearch/status/1617941161892777984)  2023-01-24T17:43Z 19.1K followers, [----] engagements


"We introduce the Salesforce CausalAI Library an open source library for causal analysis of time series and tabular data. GitHub: GitHub Documentation: Tech Report: Blog: https://blog.salesforceairesearch.com/causalai/ https://arxiv.org/abs/2301.10859 https://opensource.salesforce.com/causalai/latest/index.html https://github.com/salesforce/causalai https://blog.salesforceairesearch.com/causalai/ https://arxiv.org/abs/2301.10859 https://opensource.salesforce.com/causalai/latest/index.html https://github.com/salesforce/causalai"  
[X Link](https://x.com/SFResearch/status/1620497269937012736)  2023-01-31T19:00Z 19.1K followers, 12.7K engagements


"Check out our #CVPR2023 Accepted Papers Congrats to all the authors @silviocinguetta @CaimingXiong @jcniebles @nikhil_ai @LiJunnan0409 @bram_wallace @realNingYu"  
[X Link](https://x.com/SFResearch/status/1630733242373664769)  2023-03-01T00:54Z 19.1K followers, [----] engagements


"Editing an image using AI but want to keep the details Check out our work EDICT (๐ŸŽ†CVPR 2023๐ŸŽ†): Gradio Demo: Code: Arxiv: Authors: @bram_wallace @nikhil_ai https://arxiv.org/abs/2211.12446 https://huggingface.co/spaces/Salesforce/EDICT Do you want to make your dog look like a golden retriever Or get a picture of a cat surfing Researchers at Salesforce recently developed a new editing algorithm called EDICT - here's a thread on the results and details ๐Ÿงต https://t.co/vHDdVonBz0 https://arxiv.org/abs/2211.12446 https://huggingface.co/spaces/Salesforce/EDICT Do you want to make your dog look"  
[X Link](https://x.com/SFResearch/status/1636075152491556864)  2023-03-15T18:41Z 19.1K followers, 18.6K engagements


"In Loving Memory of Dragomir Radev. You will be missed. โ™ฅ @dragomir_radev https://blog.salesforceairesearch.com/in-loving-memory-of-drago-radev/ https://blog.salesforceairesearch.com/in-loving-memory-of-drago-radev/"  
[X Link](https://x.com/SFResearch/status/1643629570799833088)  2023-04-05T15:00Z 19.1K followers, [----] engagements


"Check out our #ACL2023 Accepted Papers Congrats to all the authors @silviocinguetta @CaimingXiong @alexfabbri4 @JiachengNLP @memray0 @JotyShafiq @jasonwu0731 @iam_wkr @PhilippeLaban @jesse_vig @yingbozhou_ai @semih__yavuz"  
[X Link](https://x.com/SFResearch/status/1653807447772102656)  2023-05-03T17:03Z 19.1K followers, 14.9K engagements


"๐Ÿ”ฅIntroducing XGen-7B a new 7B LLM trained on 8K seq. length for 1.5T tokens. Better or comparable results with MPT Falcon LLaMA OpenLLaMA in text & code tasks. Blog: Code: Training cost $150K for 1T token https://github.com/salesforce/xgen http://blog.salesforceairesearch.com/xgen/ https://github.com/salesforce/xgen http://blog.salesforceairesearch.com/xgen/"  
[X Link](https://x.com/anyuser/status/1674131144898650113)  2023-06-28T19:02Z 19.1K followers, 22.5K engagements


"Releasing ๐Ÿš€ CodeGen2.5 ๐Ÿš€ a small but mighty LLM for code. - On par with models twice its size - Trained on 1.5T tokens - Features fast infill sampling Blog: Paper: Code: Model: https://huggingface.co/Salesforce/codegen25-7B-multi https://github.com/salesforce/CodeGen https://arxiv.org/abs/2305.02309 https://blog.salesforceairesearch.com/codegen25 https://huggingface.co/Salesforce/codegen25-7B-multi https://github.com/salesforce/CodeGen https://arxiv.org/abs/2305.02309 https://blog.salesforceairesearch.com/codegen25"  
[X Link](https://x.com/SFResearch/status/1677056474491785216)  2023-07-06T20:46Z 19.1K followers, 230.2K engagements


"Introducing XGen-Image-1 our first foray into training large text-to-image models. Trained for $75K using TPUs on the LAION dataset XGen-Image-1 matches the performance of Stable Diffusion 1.5/2.1. https://blog.salesforceairesearch.com/prototyping-xgen-image-1/ https://blog.salesforceairesearch.com/prototyping-xgen-image-1/"  
[X Link](https://x.com/anyuser/status/1689713203125829634)  2023-08-10T18:59Z 19.1K followers, 18.7K engagements


"Our blog for Diffusion-DPO is now live๐Ÿš€ In this project we brought the benefits of Reinforcement Learning from Human Feedback (RLHF) to text-to-image diffusion models at scale for the first time. https://blog.salesforceairesearch.com/diffusion-dpo/ https://blog.salesforceairesearch.com/diffusion-dpo/"  
[X Link](https://x.com/anyuser/status/1744798676084801666)  2024-01-09T19:09Z 19.1K followers, [----] engagements


"Check out our #ICLR2024 Accepted Papers. Congratulations to all of our authors"  
[X Link](https://x.com/anyuser/status/1748112385376936140)  2024-01-18T22:37Z 19.1K followers, [----] engagements


"We have [--] accepted papers at NAACL this year Congratulations to all of our authors on their work"  
[X Link](https://x.com/anyuser/status/1768393313533788305)  2024-03-14T21:46Z 19.1K followers, [----] engagements


"๐ŸŒŸ Meet #Moirai: Revolutionizing time-series forecasting with universal models Say goodbye to dataset-specific models and hello ๐Ÿ‘‹ to accurate forecasts across domains Code: LOTSA data: Blog post: https://sforce.co/3TCMDqu https://sforce.co/4axHHtQ https://sforce.co/4aADhSM https://sforce.co/3TCMDqu https://sforce.co/4axHHtQ https://sforce.co/4aADhSM"  
[X Link](https://x.com/anyuser/status/1774852504326422588)  2024-04-01T17:33Z 19.1K followers, 13.1K engagements


"Check out Diffusion-DPO๐ŸŒŸ Bridging the gap between StableDiffusion & closed models like Midjourney v5. Our #TextToImage model uses human feedback for state-of-the-art alignment marking a new era in AI creativity Code: Blog: https://sforce.co/3VHYQg3 https://sforce.co/4ab7p7J https://sforce.co/3VHYQg3 https://sforce.co/4ab7p7J"  
[X Link](https://x.com/anyuser/status/1776359178186928432)  2024-04-05T21:20Z 19.1K followers, [----] engagements


"Meet our Tiny Giant. Our 1B parameter model xLAM-1B is now the best micro model for function calling outperforming models 7x its size including GPT-3.5 & Claude. On-device agentic AI is here. #AIResearch #SLM #TinyButMighty Paper: Github: https://apigen-pipeline.github.io/ https://arxiv.org/pdf/2406.18518 https://apigen-pipeline.github.io/ https://arxiv.org/pdf/2406.18518"  
[X Link](https://x.com/anyuser/status/1807811770267971984)  2024-07-01T16:21Z 19.1K followers, 59.1K engagements


"Just in Our Tiny Giant xLAM-1B-fc has officially arrived on @huggingface with a few friends๐ŸŽ‰ Check out for our suite of small agentic models including xLAM-1B-fc and xLAM-7B-fc with mobile-ready quantized versions nowโšก#LAM #AIModels #AI https://bit.ly/4faoYaQ https://bit.ly/4faoYaQ"  
[X Link](https://x.com/anyuser/status/1814012765167886423)  2024-07-18T19:02Z 19.1K followers, 25.4K engagements


"Exciting news ๐ŸŽŠ Our models xLAM-7B-fc and xLAM-1B-fc are ranked #3 and #25 on the Berkeley Function Calling leaderboard. Notably they are the smallest models on the leaderboard๐Ÿš€๐Ÿ“Š #AI #AIModels #AIresearch Check out our suite of small agentic models including mobile-ready quantized versions. ๐Ÿค— @huggingface: http://bit.ly/4faoYaQ http://bit.ly/4faoYaQ"  
[X Link](https://x.com/SFResearch/status/1814352696088113561)  2024-07-19T17:32Z 19.1K followers, [----] engagements


"Trying out a snazzy new LLM tonight Take a break download and try our Tiny Giant xLAM-1B and xLAM-7B now on @huggingface. Your agentic AI workflows will thank you #tinybutmighty https://bit.ly/4faoYaQ https://bit.ly/4faoYaQ"  
[X Link](https://x.com/anyuser/status/1815977733521875112)  2024-07-24T05:10Z 19.1K followers, 10.8K engagements


"Breaking news โžกโžกโžก We just released the MINT-1T ๐Ÿƒdataset One trillion tokens. Multimodal. Interleaved. Open-source. Perfect for training multimodal models and advancing their pre-training. Try it today Blog: Dataset: https://bit.ly/3YikQiN https://bit.ly/3YikQPP https://bit.ly/3YikQiN https://bit.ly/3YikQPP"  
[X Link](https://x.com/anyuser/status/1816180275472392675)  2024-07-24T18:34Z 19.1K followers, 28.5K engagements


"(1/12) Can different LLMs give you unique and novel ideas Very likely NO ๐Ÿค– " :  " reveals: LLMs often on purely imaginary and hallucinated contents Explore ๐Ÿงตor full paper: https://arxiv.org/abs/2407.16604 https://arxiv.org/abs/2407.16604"  
[X Link](https://x.com/anyuser/status/1816225478836978092)  2024-07-24T21:34Z 19.1K followers, 22.9K engagements


"๐Ÿ’ฅ xLAM-7b beats #GPT-4 in function calling according to the The Berkeley Function Calling Leaderboard second only to #Claude 3.5-Sonnet. Our "Tiny Giant" models are ranking [--] and [--]. Check it out: #tinybutmighty #SLM (and congrats team) https://bit.ly/3WIZdY3 https://bit.ly/3WIZdY3"  
[X Link](https://x.com/anyuser/status/1817992118264328669)  2024-07-29T18:34Z 19.1K followers, 11K engagements


"Open science wins again Introducing Salesforce Research DEI our AI software engineering agents org achieving a 34.3% resolve rate on SWE-Bench Lite crushing closed-source systems GitHub: Paper: #OpenScience #AIForAll https://www.arxiv.org/abs/2408.07060 https://salesforce-research-dei-agents.github.io/ https://www.arxiv.org/abs/2408.07060 https://salesforce-research-dei-agents.github.io/"  
[X Link](https://x.com/SFResearch/status/1823760020791517501)  2024-08-14T16:34Z 19.1K followers, [----] engagements


"๐Ÿš€ Supercharge your RAG pipeline ๐Ÿš€ Introducing LlamaRank our SOTA reranker outperforming leading APIs in general document ranking and code search across diverse datasets Blog: Try it out on @togethercompute: Built on Llama3-8B-Instruct and with linear and calibrated scoring for easy interpretation LlamaRank isn't just powerful it's blazingly fast. https://bit.ly/3SZHybZ https://bit.ly/3MmHDTu https://bit.ly/3SZHybZ https://bit.ly/3MmHDTu"  
[X Link](https://x.com/anyuser/status/1828193575441506381)  2024-08-26T22:11Z 19.1K followers, 23.7K engagements


"Introducing the full xLAM family our groundbreaking suite of Large Action Models ๐Ÿš€ From the 'Tiny Giant' to industrial powerhouses xLAM is revolutionizing AI efficiency #AIResearch #AIEfficiency ๐Ÿค— Hugging Face Collection: ๐Ÿคฉ Research Blog ๐Ÿ—ž Press Release: Meet the family: xLAM-1B / TINY: Our 1B parameter marvel ideal for on-device AI. Outperforms larger models despite its compact size xLAM-7B / SMALL: Perfect for swift academic exploration with limited GPU resources. xLAM-8x7B / MEDIUM: Mixture-of-experts model balancing latency resources and performance for industrial applications."  
[X Link](https://x.com/anyuser/status/1832117658533134375)  2024-09-06T18:04Z 19.1K followers, 15.3K engagements


"๐Ÿ‘‡UPDATED DATASET๐Ÿ‘‡Fineweb training dataset just got leaner We've tackled the 70% duplication issue in this valuable 93.4TB dataset. Same great data now more efficient and cost-effective. #AIResearch #DataEfficiency https://bit.ly/3XI3wlB https://bit.ly/3XI3wlB"  
[X Link](https://x.com/anyuser/status/1838954730938081440)  2024-09-25T14:52Z 19.1K followers, 19.2K engagements


"Introducing SFR-Judge our new family of three judge models (8B 12B and 70B parameters) a game-changer for auto-evaluation and reward modeling. Blog: Paper: Github: (code coming soon): ๐Ÿ’ฅ Trained to perform pairwise comparison direct scoring and classification judgments ๐Ÿ’ฅ Outperformed many open-source judges on 10/13 benchmarks ๐Ÿ’ฅ Broken the 90% accuracy barrier on RewardBench - a first for generative models ๐Ÿ’ฅ Showed less bias across [--] key metrics than many other judge models ๐Ÿ’ฅ Matched/outperformed GPT-4o on most pairwise & direct scoring and classification tasks Accelerate your own model"  
[X Link](https://x.com/anyuser/status/1839718503072547040)  2024-09-27T17:27Z 19.1K followers, 35.3K engagements


"๐Ÿ† ๐Ÿ† ๐Ÿ† Our groundbreaking research on prompt leakage in multi-turn LLM interactions is amongst the top-50% industry-track papers accepted to #EMNLP2024 We propose a novel threat model uncover social engineering vulnerabilities measure fine-grained leakage and apply different mitigation techniques. Learn how to build more #SecureAI systems: #LLMSecurity #AISafety #TrustedAI https://arxiv.org/abs/2404.16251 https://arxiv.org/abs/2404.16251"  
[X Link](https://x.com/SFResearch/status/1841566430661144910)  2024-10-02T19:50Z 19.1K followers, [----] engagements


"๐Ÿ“ข๐Ÿ“ข๐Ÿ“ขIntroducing xGen-MM-Vid (BLIP-3-Video) This highly efficient multimodal language model is laser-focused on video understanding. Compared to other models xGen-MM-Vid represents a video with a fraction of the visual tokens (e.g. [--] vs. [----] tokens). Paper: Website: Researchers ๐Ÿงต:๐Ÿ‘‡ https://bit.ly/3Yvyqiy https://arxiv.org/abs/2410.16267 https://bit.ly/3Yvyqiy https://arxiv.org/abs/2410.16267"  
[X Link](https://x.com/anyuser/status/1848793628166205944)  2024-10-22T18:28Z 19.1K followers, 12.7K engagements


"โ“Beyond "right or wrong": Introducing a novel RAG evaluation framework based on sub-question coverage. How do we measure if RAG systems are giving complete answers to complex questions Enter: Do RAG Systems Cover What Matters Evaluating and Optimizing Responses with Sub-Question Coverage #AccurateAI ๐Ÿ“ŽPaper: ๐Ÿงตstarts here ๐Ÿ‘‡ 1) We propose decomposing questions into sub-questions and classifying them into three typescore background and follow-upto reflect their roles and importance. ๐Ÿ’  Core sub-questions are central to addressing the main query. ๐Ÿ’  Background sub-questions provide necessary"  
[X Link](https://x.com/anyuser/status/1849583290685980915)  2024-10-24T22:46Z 19.1K followers, [----] engagements


"๐Ÿง  Breaking Research ๐Ÿง  Solving the LLM "Goldilocks Problem" Introducing Auto-CEI: A breakthrough training method that helps train LLMs find the sweet spot between overconfident (plausible but incorrect) hallucinations and overcautious (I dont know) refusals. ๐Ÿ”— Full paper: ๐Ÿงต Research review: ๐Ÿ‘‡ #LLMResearch #TrustedAI https://ar5iv.org/abs/2410.07627 https://ar5iv.org/abs/2410.07627"  
[X Link](https://x.com/anyuser/status/1851743691733397895)  2024-10-30T21:51Z 19.1K followers, [----] engagements


"๐Ÿš€Introducing Moirai-MoE:๐Ÿš€ the first mixture-of-experts time series foundation model a breakthrough in universal forecasting Moirai-MoE achieves token-level model specialization autonomously delivering an impressive 17% performance boost over its predecessor Moirai at the same model size. Plus it outperforms other foundation models with up to 65x fewer activated parameters ๐Ÿ’ชDive deeper: ๐Ÿ“„ Paper: ๐Ÿ’ป Code: ๐Ÿค— Models: ๐Ÿ”ฌ Blog: ๐Ÿงต Technical details: ๐Ÿ‘‡ (1/6) Compared to our previous model Moirai using multi-heuristic-defined input/output projection layers to model time series with different"  
[X Link](https://x.com/anyuser/status/1854991012789141723)  2024-11-08T20:54Z 19.1K followers, 42.2K engagements


"๐Ÿš€ Introducing GIFT-Eval: ๐ŸŽThe new gold standard in time series forecasting evaluation 144K+ time series. [--] datasets. One benchmark to rule them all. Dive in: Paper Blog: ๐Ÿง : Github Dataset Leaderboard Our comprehensive GIFT-Eval tests models across ALL domains frequencies & prediction lengths - from zero-shot to full-shot scenarios. Help us advance innovation in AI time series research #TimeSeries #Forecasting #DataScience .and dive into our researcher's technical thread here: ๐Ÿงต๐Ÿ‘‡ https://bit.ly/3ACSshZ https://bit.ly/3ANNTBv https://bit.ly/3YICI59 https://sforce.co/3ALOcwE"  
[X Link](https://x.com/anyuser/status/1856405404575785074)  2024-11-12T18:35Z 19.1K followers, 15K engagements


"Another amazing #EMNLP2024 comes to a close but we at @SFReseach #NeverStopLearning. Missed us in Miami Bookmark save and explore the research below. Thanks @emnlpmeeting -- what an incredible week #Salesforce #AIResearch #NLP ----- Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems ๐Ÿ”– Paper: Learning Planning-based Reasoning by Trajectories Collection and Process Reward Synthesizing ๐Ÿ”– Paper: DataNarrative: Automated Data-Driven Storytelling with Visualizations and Texts ๐Ÿ”– Paper: FOLIO: Natural Language Reasoning with First-Order Logic ๐Ÿ”– Paper: Evaluating Psychological"  
[X Link](https://x.com/anyuser/status/1857937393715704074)  2024-11-17T00:02Z 19.1K followers, [----] engagements


"๐Ÿ“Š Meet LaTent Reasoning Optimization (LaTRO):๐Ÿ“Š A principled variational approach to optimize LLM reasoning: ๐Ÿ’ฅ Paper: ๐Ÿ’ฅ Code: By treating reasoning as sampling from a latent distribution LaTRO improves zero-shot math accuracy by 12.5% over base modelsno external rewards needed. Implement self-rewarding reasoning in your models today #AIResearch #DeepLearning https://bit.ly/3YUoQVF https://bit.ly/3YUoP43 https://bit.ly/3YUoQVF https://bit.ly/3YUoP43"  
[X Link](https://x.com/anyuser/status/1858508232802640281)  2024-11-18T13:51Z 19.1K followers, 10.1K engagements


"๐ŸŒณ๐ŸŒณ๐ŸŒณIntroducing "CodeTree"๐ŸŒณ๐ŸŒณ๐ŸŒณ The first unified framework combining tree-based strategy exploration + execution feedback + LLM agent guidance for code generation. ๐Ÿ–‡ Paper: ๐Ÿ“ˆ Setting new standards with GPT-4: 95.1% HumanEval 98.7% MBPP 43.0% CodeContests Why CodeTree works: ๐ŸŒณ Tree structure unifies strategy planning implementation & refinement ๐ŸŒณ Novel Critic Agent guides search & pruning ๐ŸŒณ Combines execution feedback + LLM reasoning ๐ŸŒณ Breakthrough on complex tasks (27.6% on SWEBench) Our framework enables efficient exploration of coding strategies and multi-stage refinement"  
[X Link](https://x.com/anyuser/status/1863710368230424998)  2024-12-02T22:22Z 19.1K followers, 16.3K engagements


"๐ŸŒณ๐ŸŒณ๐ŸŒณ Take a closer look at CodeTree ๐ŸŒณ๐ŸŒณ๐ŸŒณ 1/6 Dive deep into our new framework for code generation with large language models (LLMs) combining multi-agent collaboration with an efficient tree search strategy. Code: Paper: Technical thread :๐Ÿ‘‡ https://bit.ly/3Vo0Au0 https://bit.ly/3Vo0AKw https://bit.ly/3Vo0Au0 https://bit.ly/3Vo0AKw"  
[X Link](https://x.com/anyuser/status/1864455413992698267)  2024-12-04T23:43Z 19.1K followers, [----] engagements


"๐Ÿ”ฌ๐Ÿ”ฌ๐Ÿ”ฌIntroducing ProVision: A new system for transforming images into verified instruction data for multimodal language models (MLMs) at massive scale Scene graphs + programmatic synthesis generate 10M+ diverse automated Q&A pairs. Fully verifiable. Training MLMs Dive in: ๐Ÿ“ฐBlog: ๐Ÿ—žPaper: ๐Ÿ’ปDataset: ๐Ÿ‘‡Researchers ๐Ÿงต๐Ÿ‘‡ (1/6) Why build ProVision Training multimodal LMs demands massive instruction datasets - pairing images with Q&As. Manual creation is costly while using existing models risks hallucinations. ProVision's novel solution Scene graphs + human-written programs. We represent images"  
[X Link](https://x.com/anyuser/status/1877109435795124568)  2025-01-08T21:45Z 19.1K followers, 20.8K engagements


"๐ŸŒฎ Introducing ๐ŸŒฎ TACO - our new family of multimodal action models that combine reasoning with real-world actions to solve complex visual tasks ๐Ÿ“ŠResults: 20% gains on MMVet 3.9% average improvement across [--] benchmarks 1M+ synthetic CoTA traces in training ๐Ÿ”“ ๐Ÿ”“๐Ÿ”“Fully open-sourced ๐Ÿ”“๐Ÿ”“๐Ÿ”“ Get started with: ๐Ÿ“„ Paper: ๐Ÿ’ป Code: ๐Ÿ“ฑ Demo: ๐Ÿค– Models: ๐Ÿ“š Datasets: ๐Ÿงต .and our Technical deep-dive starts here (1/4) How does TACO work ๐Ÿค” โ›“TACO answers complex questions by generating Chains-of-Thought-and-Action (CoTA) executing intermediate actions with external tools such as OCR calculator and depth"  
[X Link](https://x.com/anyuser/status/1877487452178493877)  2025-01-09T22:47Z 19.1K followers, 70.7K engagements


"๐Ÿ”ฌWere so excited about TACO. our new open sourced multimodal model family that excels at complex visual reasoning tasks requiring multiple steps and external tools ๐Ÿ“Š The results speak for themselves: ๐ŸŒฎ30-50% accuracy boost vs. few-shot CoTA prompting ๐ŸŒฎUp to 20% improvement on MMVet benchmark ๐ŸŒฎConsistent outperformance across [--] benchmarks but check out our new blog that brings it to life and a great write-up by @Marktechpost Ready to get started ๐Ÿ“„ Paper: ๐Ÿ’ป Code: ๐Ÿ“ฑ Demo: ๐Ÿค– Models: ๐Ÿ“š Datasets: ๐Ÿงต Research thread https://bit.ly/3Pxtzbv https://bit.ly/4j2ZG0h https://bit.ly/3PwrEE2"  
[X Link](https://x.com/anyuser/status/1880045371734339684)  2025-01-17T00:12Z 19.1K followers, [----] engagements


"๐Ÿšจ๐Ÿšจ๐ŸšจJust released๐Ÿšจ๐Ÿšจ๐Ÿšจ ๐Ÿš€Introducing the Salesforce Code Embedding Model Family (SFR-Embedding-Code) ranked #1 on CoIR Benchmark ๐Ÿš€ Available in [--] sizes: 2B 400M. Key Highlights: [--] 2B Model: Achieves #1 on CoIR. 2400M Model: Best-performing model under 0.5B parameters. [--] Multi-lingual multi-task unified training framework for code retrieval [--] Supports [--] programming languages including Python Java C++ JavaScript C# and more ๐Ÿง‘๐Ÿ’ปโœจEmpower your next AI Coding Agent with the best code embedding models ๐Ÿง‘๐Ÿ’ปโœจ Join us in advancing #AccurateAI: ๐Ÿ“ŽPaper: ๐Ÿค—400M Model: ๐Ÿค—2B Model: #CodeAI"  
[X Link](https://x.com/anyuser/status/1880383207310266470)  2025-01-17T22:34Z 19.1K followers, 22.8K engagements


"๐ŸŽ‰ โœ Our research on advancing AI-generated writing accepted to #CHI2025 โœ ๐ŸŽ‰ Our paper reveals how expert edits fix AI text issuesfrom clichs to purple prose creating better data for Reinforcement Learning from Human Feedback (RLHF) alignment. Thanks @acm_chi we'll see you in Yokohama Check it out #RLHFdata #AIforWriting https://arxiv.org/pdf/2409.14509 https://arxiv.org/pdf/2409.14509"  
[X Link](https://x.com/anyuser/status/1882557221835342078)  2025-01-23T22:33Z 19.1K followers, [----] engagements


"๐Ÿ“ฃ From efficient key caches and multimodal embeddings to self-improving reasoning and faithful context adherence. we're thrilled to present a broad range of powerful new research at #ICLR2025 ๐ŸŽ‰ Bookmark our accepted papers below and we'll see you in Singapore @iclr_conf ๐Ÿ”– REGENESIS: LLMs can grow into reasoning generalists via self improvement ๐Ÿง Becky Xiangyu Peng Congying Xia Xinyi Yang Caiming Xiong Jason Wu Chen Xing ๐Ÿ”–SiReRAG: Indexing Similar and Related Information for Multihop Reasoning ๐Ÿง  Nan Zhang Prafulla Choubey Alexander. Fabbri Gabriel Bernadett-Shapiro Jason Wu ๐Ÿ”–FaithEval:"  
[X Link](https://x.com/anyuser/status/1883739110902222864)  2025-01-27T04:49Z 19.1K followers, [----] engagements


"๐Ÿ”ฌAdvanced agent systems RAG evaluation instruction-following and more. Our team's accepted papers at #NAACL2025 span from professional CRM research to parallel in-context learning. ๐ŸŽ‰A huge congrats to our researchers and thanks to @naacl we're excited to share and discuss with the community this spring ๐Ÿ’ซ ๐Ÿ‘‡๐Ÿ“‘Bookmark and explore the research below ๐Ÿ“‘๐Ÿ‘‡ ๐Ÿ“ŽCRMArena: Understanding the Capacity of LLM Agents to Perform Professional CRM Tasks in Realistic Environments: ๐Ÿ‘Steeve Huang Akshara Prabhakar Sidharth Dhawan Yixin Mao Huan Wang Silvio Savarese Caiming Xiong Philippe Laban Chien-Sheng"  
[X Link](https://x.com/SFResearch/status/1883987493931893035)  2025-01-27T21:16Z 19.1K followers, [----] engagements


"We built SFR-Embedding-Code to bridge a critical gap: While text retrieval has advanced rapidly code retrieval needed specialized attention. Our open-source models achieve SOTA results by learning from diverse code and text tasks and supporting [--] programming languages. See why SFR-Embedding is the Top-1 model on the CoIR Leaderboard ๐Ÿฅ‡ #CodeRetrieval #AIforDevelopers ๐Ÿ“– Read more in our latest blog: For the models and more: ๐Ÿค—400M Model: ๐Ÿค—2B Model: ๐Ÿ†CoIR Leaderboard: ๐Ÿ“„Technical Report: https://bit.ly/4gSZteu https://bit.ly/3CkgRKj https://bit.ly/3PCqxmp https://bit.ly/4jhDRdp"  
[X Link](https://x.com/anyuser/status/1885125009372234218)  2025-01-31T00:36Z 19.1K followers, [----] engagements


"๐Ÿ”„ PerfCodeGen: When LLMs learn from their own code execution. Our training-free framework outperforms human solutions in up to 67% of coding tasks by doing what great developers do - test analyze refine repeat. ๐Ÿ“Š Paper: ๐Ÿง‘๐Ÿ’ป Code: ๐Ÿ“ฐ MarkTechPost: ๐Ÿงต Researcher's walk-through๐Ÿ‘‡ #EfficientAI #CodeGeneration https://bit.ly/4jEVGDp https://bit.ly/4akP20J https://bit.ly/4jmH5wb https://bit.ly/4jEVGDp https://bit.ly/4akP20J https://bit.ly/4jmH5wb"  
[X Link](https://x.com/anyuser/status/1886532042516504933)  2025-02-03T21:47Z 19.1K followers, [----] engagements


"โšก Meet BOLT: A novel approach to develop long chain-of-thought reasoning in LLMs without relying on knowledge distillation or extensive human annotations. ๐Ÿ“„ Three key stages: [--] LongCoT data bootstrapping via in-context learning [--] Supervised fine tuning [--] Online refinement Achieves 40%+ gains on Arena-Hard & strong results across MT-Bench WildBench & MATH500 - all with just [--] examples. *Shout out to @_akhaliq for sharing it http://arXiv.org/abs/2502.03860v1 http://arXiv.org/abs/2502.03860v1"  
[X Link](https://x.com/anyuser/status/1888008582580346914)  2025-02-07T23:34Z 19.1K followers, [----] engagements


"๐Ÿ”‰ New advances in LLM reasoning capabilities accepted for oral presentation at #ICLR2025 ๐Ÿ“Ž Paper: ReGenesis introduces a novel approach where models self-improve their reasoning through abstraction-to-concrete progression - no human supervision needed. Key findings: Self-synthesized reasoning paths Superior generalization to new tasks 6.1% improvement in OOD performance Validated across multiple model architectures Our work opens new possibilities for developing more robust and generalizable AI systems. Stay tuned for the full presentation and see you in Singapore #AIResearch #AIReasoning"  
[X Link](https://x.com/anyuser/status/1889732754637484055)  2025-02-12T17:46Z 19.1K followers, [----] engagements


"๐Ÿš€Just dropped: Reward-Guided Speculative Decoding (RSD) - our breakthrough approach that makes LLM inference up to [---] faster while IMPROVING accuracy. ๐Ÿ“„Paper: ๐Ÿ’ปCode: ๐Ÿ‘‡ Key innovations in RSD: ๐Ÿ‘‡ [--] Biased Acceleration - Unlike traditional speculative decoding methods that enforce unbiasedness RSD incorporates a controlled bias to prioritize high-reward outputs. [--] Dynamic Quality Control - Process Reward Model (PRM) acts as real-time quality gate only engaging costly target model when needed [--] Proven Optimality - Mathematically derived threshold strategy ensures optimal"  
[X Link](https://x.com/anyuser/status/1890200184900169847)  2025-02-14T00:43Z 19.1K followers, [----] engagements
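
The acceptance rule the post outlines can be sketched as a simple loop; this is illustrative only, not the released RSD code, and `draft_step`, `target_step`, and `reward` are assumed placeholder callables:

```python
# Schematic sketch of reward-guided speculative decoding: a cheap draft model
# proposes each step, a process reward model (PRM) scores it, and the costly
# target model is invoked only when the score falls below a threshold.

def rsd_generate(prompt, draft_step, target_step, reward,
                 threshold=0.7, max_steps=64, stop_token="<eos>"):
    context, steps = prompt, []
    for _ in range(max_steps):
        candidate = draft_step(context)      # cheap proposal
        if reward(context, candidate) >= threshold:
            step = candidate                 # biased acceptance of high-reward drafts
        else:
            step = target_step(context)      # fall back to the large target model
        steps.append(step)
        context += step
        if stop_token in step:
            break
    return "".join(steps)
```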


"๐ŸŽ‰Just Announced: "ViUniT: Visual Unit Tests for More Robust Visual Programming" has been accepted at #CVPR2025 Paper Link: Project Page: Researchers walk-through ๐Ÿ‘‡ In collaboration with @UPenn we introduce ViUniT a framework that enhances the reliability of visual programs by automatically generating unit tests by leveraging #LLMs and #DiffusionModels. Our approach: ๐Ÿ“Š Boosts model performance by 11.4% and outperforms gpt-4o-mini by 7.7%. ๐Ÿ”„ Reduces right-for-wrong-reasons errors by 40%. ๐Ÿ’ก Introduces innovative applications like best program selection answer refusal and unsupervised reward"  
[X Link](https://x.com/anyuser/status/1897059634864673205)  2025-03-04T23:00Z 19.1K followers, [----] engagements


"๐Ÿ“ฃ Introducing Text2Data open-sourced for the research community ๐Ÿ–‡ Paper: Code: ๐Ÿงช A major advancement in multimodal AI - a low-resource universal text-to-anything framework capable of bridging text with diverse modalities (molecules motion sequences time series) without costly human annotations ๐ŸŽฌ Text2Data in action: ๐ŸŽฌ Our framework first learns general data patterns from unlabeled data (blue) then fine-tunes with limited labeled examples (red) using constraint optimization to prevent forgetting. At bottom you see molecules generated with increasing polarizability levels from 'very low'"  
[X Link](https://x.com/anyuser/status/1898114547988238586)  2025-03-07T20:52Z 19.1K followers, [----] engagements


"Our paper "Can AI writing be salvaged Mitigating Idiosyncrasies and Improving Human-AI Alignment in the Writing Process through Edits" has been awarded a Best Paper Honorable Mention and is in the Top 5% of submissions for #CHI2025 ๐ŸŽ‰ Check it out here: #AI #Research #AIWriting @jasonwu0731 @TuhinChakr @PhilippeLaban https://arxiv.org/pdf/2409.14509 https://arxiv.org/pdf/2409.14509"  
[X Link](https://x.com/anyuser/status/1905365925286871047)  2025-03-27T21:06Z 19.1K followers, [----] engagements


"(1/4) Foundation models are revolutionizing time series analysisbut their success depends on large diverse high-quality datasets which poses a major challenge. Enter synthetic data reshaping Time Series Foundation Models (TSFMs) & Time Series LLMs (TSLLMs). Our survey explores how it tackles data scarcity improves model training & unlocks new research directions. ๐Ÿงต ๐Ÿ“ Paper: https://arxiv.org/abs/2503.11411 https://arxiv.org/abs/2503.11411"  
[X Link](https://x.com/anyuser/status/1905708191419412677)  2025-03-28T19:46Z 19.1K followers, [----] engagements


"๐Ÿšจ New Survey Alert ๐Ÿšจ ๐Ÿง A Survey of Frontiers in LLM Reasoning: Inference Scaling Learning to Reason and Agentic Systems ๐Ÿ“˜ Paper: ๐Ÿง  Project Page: ๐Ÿงต Researcher's thread: ๐Ÿ‘‡ (1/6) Reasoning is the key to unlocking true AI intelligence.๐Ÿ”‘ Two factors that affect the reasoning capabilities are: [--] Regime: how and at what stage is reasoning achieved [--] Architecture: what components are involved in the reasoning process โšกWe present a comprehensive survey along these two dimensions summarizing recent progress and covering: Regimes from inference scaling (e.g. OpenAI o1) to learning to reason (e.g."  
[X Link](https://x.com/anyuser/status/1907939602293297217)  2025-04-03T23:33Z 19.1K followers, [----] engagements


"๐Ÿ‘ Looking for VLMs that go beyond generators to transform multimodal embeddings Meet "VLM2VEC: Training Vision-Language Models for Massive Multimodal Embedding Tasks" ๐Ÿ“Ž Paper: ๐Ÿ’ป Website: Our #ICLR25-featured paper shows how vision language models transform into powerful embedders for classification VQA retrieval and visual grounding. We unlock strong emergent capabilities by deeply fusing vision and language rather than shallow combinations. Visit us in Singapore to see how we're redefining multimodal representation learning #MultimodalAI #VLMs https://tiger-ai-lab.github.io/VLM2Vec/"  
[X Link](https://x.com/anyuser/status/1908378560764547474)  2025-04-05T04:37Z 19.1K followers, [----] engagements


"Our xLAM (#LargeActionModels) family just got an upgrade [--] Multi-turn natural conversation support [--] Smarter multi-step reasoning [--] Models from 1B to 70B for ultimate flexibility ๐Ÿค— HuggingFace: ๐Ÿ‘‘ BFCL Leaderboard: Our research models xLAM-70B-r ranks #1 and xLAM-32B-r #2 on the BFCL function-calling leaderboardbeating GPT-4o Gemini Qwen & more. xLAM-8B-r lands at #4 ahead of GPT-4o. And our Tiny Giant xLAM-1B-r plus xLAM-3B-r outperform much larger models like Mistral-Large and DeepSeek-V3. This is just the beginningwe're building even stronger xLAM models internally to inspire future"  
[X Link](https://x.com/SFResearch/status/1913285109375209606)  2025-04-18T17:34Z 19.1K followers, 10.8K engagements


"๐Ÿ“ฃ Meet: "From AI-Slop to AI-Polish" tackling the elephant in the room ๐Ÿ˜ AI writing quality is "mid" at best. Despite LLMs crushing coding their creative writing feels pedestrian. We introduce: [--] Writing Quality Benchmark (WQ): First comprehensive testbed for writing quality assessment [--] Writing Quality Reward Models (WQRM): Outperforming GPT-4o & Claude with 74% accuracy on WQ [--] Test-time compute strategies yielding text preferred by experts 66% of the time ๐Ÿ–‡ Paper: Time to raise the bar on AI-generated text beyond "coherent but clichd." #AIResearch #NLP #WritingQuality"  
[X Link](https://x.com/SFResearch/status/1915606959799050744)  2025-04-25T03:21Z 19.1K followers, [----] engagements


"๐Ÿ”ฌ NEW BLOG DROP Our complete technical breakdown on small language models is now available on our research blog: Read here: ๐Ÿ” Discover our research on enterprise-ready AI that delivers powerful performance without the bloat ๐Ÿ‘€ See breakthrough results on long-context understanding at 128K tokens Math prowess revealed: 95% on GSM8K 92.5% on MATH 46.7% on AIME [----] ๐Ÿ’ป Code mastery: 41.1% on LiveCodeBench best-in-class performance ๐Ÿ’ช Our "small but long" approach proves deliberate engineering beats brute-force scalingoffering predictable costs enhanced privacy and reduced environmental impact."  
[X Link](https://x.com/anyuser/status/1918460416805556459)  2025-05-03T00:19Z 19.1K followers, [----] engagements


"Introducing APIGen-MT: Our agentic pipeline for multi-turn synthetic data generation that produces high-quality training data for tuning AI agents Try our open-sourced dataset today ๐Ÿ“Š Paper: ๐Ÿค— Dataset: We used APIGen-MT to train our xLAM-2 model family including xLAM-2-70b-fc-r still #1 on the BFCL leaderboard with 78.2% accuracy outperforming frontier models like GPT-4o and Claude [---] in function-calling tasks especially in challenging multi-turn scenarios. ๐Ÿค We're open-sourcing 5K high-quality trajectories and trained models to advance AI agent research. ๐Ÿง  xLAM Model Family: ๐Ÿ” BFCL:"  
[X Link](https://x.com/anyuser/status/1920616561406009487)  2025-05-08T23:07Z 19.1K followers, 10.7K engagements


"Excited to announce SWERank our code ranking framework for software issue localization. โžกPaper: โžกGitHub Project Page: โžกAI-Generated Podcast: โžกCode Data and Models: Coming soon (1/3) ๐Ÿงต Pinpointing the exact location of a software issue in code is a critical but often time-consuming part of software development. Current agentic approaches to localization can be slow and expensive relying on complex steps and often closed-source models. We introduce SWERank a retrieve-and-rerank framework that comprises SWERankEmbed a bi-encoder code retriever and SWERankLLM a listwise LLM code reranker."  
[X Link](https://x.com/anyuser/status/1922070680830448066)  2025-05-12T23:25Z 19.1K followers, [----] engagements
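
A minimal retrieve-and-rerank sketch in the spirit of the two stages named above (a bi-encoder retriever followed by a listwise LLM reranker); `embed` and `llm_rerank` are hypothetical hooks, not the released SWERank models:

```python
# Stage 1: a bi-encoder shortlists candidate functions for a bug report.
# Stage 2: an LLM reranks the shortlist (listwise).
import numpy as np

def retrieve(issue: str, functions: list[str], embed, top_k: int = 20) -> list[str]:
    q = embed([issue])[0]                      # embed the issue text
    docs = embed(functions)                    # embed candidate functions
    scores = docs @ q / (np.linalg.norm(docs, axis=1) * np.linalg.norm(q) + 1e-9)
    order = np.argsort(-scores)[:top_k]
    return [functions[i] for i in order]

def localize(issue: str, functions: list[str], embed, llm_rerank) -> list[str]:
    shortlist = retrieve(issue, functions, embed)   # bi-encoder retrieval
    return llm_rerank(issue, shortlist)             # listwise LLM reranking
```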


"๐ŸšจMODEL RELEASE We're thrilled to announce our powerful compact xGen-small model family now available for the research community. ๐Ÿค—Download xGen-Small model: Key highlights: xGen-9B: highly competitive on long-context understanding up to 128K tokens Exceptional math reasoning: 95.3% GSM8K 91.6% MATH 50.0% AIME [----] Superior code generation: 50.6% on LiveCodeBench Our "small but long" approach proves strategic engineering beats brute-force scaling. Full breakdown in our blog: Technical report available here: Advance your research today and tell us what you think #SLMs #EnterpriseAI"  
[X Link](https://x.com/anyuser/status/1922329256538849567)  2025-05-13T16:33Z 19.1K followers, 43.6K engagements


"We're thrilled to announce BLIP3-o a breakthrough in unified multimodal models that excels at both image understanding and generation in a single autoregressive architecture ๐Ÿ’ซ ๐Ÿ“Š Paper: ๐Ÿค— Models: ๐Ÿง  Code: ๐Ÿ“ฝ Learn on the go (AI Generated): Our research reveals that using CLIP features with diffusion transformer and flow matching creates superior performance while reducing computational complexity. Most importantly we're making this model family available to the AI Research community: Complete model implementations Model weights 25M+ detailed caption pretrain dataset 60K high-quality"  
[X Link](https://x.com/anyuser/status/1923437520848638402)  2025-05-16T17:56Z 19.1K followers, [----] engagements


"๐ŸšจWere proud to announce our #ACL2025NLP-accepted papers. Preview and bookmark the research below and well look forward to seeing you in Vienna. Thanks @aclmeeting ๐Ÿ‘‰ Turning Conversations into Workflows: A Framework to Extract and Evaluate Dialog Workflows for Service AI Agents ๐Ÿ‘‰ Unanswerability Evaluation for Retrieval Augmented Generation ๐Ÿ‘‰ Why Vision Language Models Struggle with Visual Arithmetic Towards Enhanced Chart and Geometry Understanding ๐Ÿ‘‰ Does Context Matter ContextualJudgeBench for Evaluating LLM-based Judges in Contextual Settings ๐Ÿ‘‰ What Makes a Good Natural Language"  
[X Link](https://x.com/anyuser/status/1923537535201878423)  2025-05-17T00:34Z 19.1K followers, [----] engagements


"๐ŸšจIntroducing "Elastic Reasoning"๐Ÿšจ Our novel framework solves LLM inference budget constraints without sacrificing performance. Open and available to the research community: ๐Ÿ“„ Paper: ๐Ÿ’ป Code: ๐Ÿค— Models: Key insight: Separate "thinking" and "solution" phases with independent token budgets plus budget-constrained rollout training. Research results: ๐Ÿ‘‰ E1-Math-1.5B: 35% accuracy on AIME2024 with 32% fewer tokens ๐Ÿ‘‰ E1-Code-14B: Codeforces rating of [----] (96th percentile) ๐Ÿ‘‰ Models generalize to ANY budget without retraining The framework (shown) combines GRPO training under constraints +"  
[X Link](https://x.com/anyuser/status/1925721956030050457)  2025-05-23T01:14Z 19.1K followers, [----] engagements
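
A toy sketch of the separate-budget idea, assuming a generic token-budgeted `generate(prompt, max_new_tokens, stop)` call; the budget-constrained rollout training (GRPO) described in the post is not reproduced here:

```python
# Illustration only: the thinking phase is cut off at its own token budget,
# then the solution phase always receives its full, independent budget so the
# final answer is never starved by a long chain of thought.

def elastic_generate(problem, generate, think_budget=1024, solution_budget=512):
    thinking = generate(f"{problem}\n<think>",
                        max_new_tokens=think_budget, stop="</think>")
    # Even if the thinking was truncated by its budget, the solution phase
    # starts from whatever reasoning is available.
    return generate(f"{problem}\n<think>{thinking}</think>\n<answer>",
                    max_new_tokens=solution_budget, stop="</answer>")
```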


"๐Ÿ† Introducing MAS-ZERO: Designing Multi-Agent Systems with Zero Supervision ๐Ÿ† ๐Ÿ’ป Project Page: ๐Ÿ“„ Paper: ๐Ÿ”— Code: ๐Ÿ“š Explore 1000+ Discovered MAS designs: ๐Ÿงต Technical walk-through ๐Ÿ‘‡ (1/6) Multi-Agent Systems (MAS) can outperform single-agent approaches however designing MAS manually is difficult especially when LLM preferences differ from human intuition and manually designed MAS are hard to adapt to new tasks. โ“Can we automate MAS designeven better can we make it self-evolving without relying on a validation set Meet MAS-Zero: a meta-level inference-time self-evolving framework for"  
[X Link](https://x.com/anyuser/status/1927493966439625053)  2025-05-27T22:35Z 19.1K followers, 18.6K engagements


"๐Ÿšจ Introducing CRMArena-Pro: The first multi-turn enterprise-grade benchmark for LLM agents โœBlog: ๐Ÿ–‡Paper: ๐Ÿค—Dataset: ๐Ÿ–ฅCode: Most AI benchmarks test isolated single-turn tasks. Enterprise work is messy multi-step and demands both capability AND confidentiality ๐Ÿ”ฌBuilt with our exclusive synthetic dataset: Live Salesforce Org sandboxes with realistic expert-crafted CRM data enterprise complexity without customer exposure. What makes CRM-Pro different: ๐ŸŽฏ Multi-domain: Sales service CPQ workflows ๐Ÿ”„ Multi-turn conversations vs single exchanges ๐Ÿ”’ Confidentiality awareness testing ๐Ÿข Live CRM"  
[X Link](https://x.com/anyuser/status/1928252326772342804)  2025-05-30T00:49Z 19.1K followers, 13.7K engagements


"โšก NEW COMPUTER-USE AI RESEARCH โšก Introducing: [--] Our paper Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis 2OSWORLD-G benchmark covering fine-grained manipulation and layout understanding 3JEDI dataset our GUI grounding dataset series with 4M examples 3B and 7B model variants ๐Ÿ”—Paper: ๐Ÿง‘๐Ÿ’ปCode & Sample Usage: ๐Ÿ’ปWebsite: ๐Ÿค—Dataset: Key contributions: [---] expertly annotated samples across [--] capability dimensions Multi-perspective task decomposition (icons components layouts) SOTA performance: 91.7% on ScreenSpot-v2 54.1% on OSWORLD-G Direct impact: 5% 27% success"  
[X Link](https://x.com/SFResearch/status/1928925067234218422)  2025-05-31T21:22Z 19.1K followers, [----] engagements


"๐Ÿ† #ICML2025 Best Paper Award: AI Safety Should Prioritize the Future of Work ๐Ÿ“„ Paper: ๐ŸŽ‰ Congratulations to Sanchaita Hazra @hsanchaita Bodhisattwa Prasad Majumder @mbodhisattwa and Tuhin Chakrabarty @TuhinChakr for winning the Outstanding Award one of [--] top papers out of [----] accepted submissions Key insights: ๐Ÿ”ธ Comprehensive worker transition support needed ๐Ÿ”ธ AI exacerbates income inequality through labor disruption ๐Ÿ”ธ International copyright reforms & collective licensing required ๐Ÿ”ธ Pro-worker AI governance for shared prosperity @icmlconf #AIethics #FutureOfWork #AIgovernance"  
[X Link](https://x.com/anyuser/status/1947701131682996635)  2025-07-22T16:51Z 19.1K followers, [----] engagements


"๐Ÿ’ก Promptomatix: An Automatic Prompt Optimization Framework for Large Language Models ๐Ÿ’ก ๐Ÿ“„ Paper: ๐Ÿ’ป Code: ๐Ÿ˜ต๐Ÿ’ซ Have a task but experiencing prompt engineering existential dread Few-shot or zero-shot Chain-of-thought or ReAct Where do I get examples Should I label data How do I evaluate What metrics Manual feedback or auto-looping Why does one word change everything Promptomatix eliminates the entire decision tree. Describe task receive optimized prompt question nothing. Sanity restored โœจ #LLMs #LargeLanguageModels #FutureOfAI #EnterpriseAI https://bit.ly/4lLjQgd https://bit.ly/44IAvuO"  
[X Link](https://x.com/anyuser/status/1948069617756262882)  2025-07-23T17:16Z 19.1K followers, [----] engagements


"๐ŸŒŸ Excited to present our work at Empirical Methods in Natural Language Processing @emnlpmeeting - a leading conference in NLP and AI research ๐Ÿ“„ Our accepted papers: Topic-Guided Reinforcement Learning with LLMs for Enhancing Multi-Document Summarization ๐Ÿ‘ฅAuthors: Chuyuan Li @ChuyuanLi Austin Xu @austinsxu Shafiq Joty @JotyShafiq and Giuseppe Carenini @careninigiusepp ๐Ÿ“Paper: Demystifying Domain-adaptive Post-training for Financial LLMs ๐Ÿ‘ฅAuthors: Zixuan Ke @KeZixuan Yifei Ming @ming5_alvin Xuan-Phi Nguyen Caiming Xiong @CaimingXiong Shafiq Joty @JotyShafiq ๐Ÿ“Paper: CEMTM: Contextual"  
[X Link](https://x.com/anyuser/status/1959305373099180217)  2025-08-23T17:22Z 19.1K followers, [----] engagements


"โšก The era of AI agents that just chat is over. @Salesforce just introduced GTA1 - Computer Use Agents that actually CLICK SCROLL and WORK in your enterprise software like a human would. ๐Ÿ‘‰ ๐ŸŽฏ The results are game-changing: โžก 50.1% success on enterprise UIs โžก Outperforms models 10x larger โžก Beats OpenAI's CUA in half the steps โžก Built with enterprise trust & security No more "sorry I can't click that button" - these agents navigate CRMs update records and complete real workflows. The future of work isn't just AI that thinks. It's AI that ACTS. #EnterpriseAI #FutureOfAI"  
[X Link](https://x.com/anyuser/status/1959977726774788438)  2025-08-25T13:54Z 19.1K followers, [----] engagements


"Looking for the cutting-edge of AI research Follow Salesforce AI Research to see how we're transforming enterprise technology through advanced innovations. From world models to agentic systems discover the future of AI before it hits the market"  
[X Link](https://x.com/SFResearch/status/1966145065669271972)  2025-09-11T14:21Z 19.1K followers, 2.4M engagements


"๐Ÿ“ฃ Webscale-RL: Automated Data Pipeline for Scaling RL Data to Pretraining Levels ๐Ÿ“ฃ RL for LLMs faces a critical data bottleneck: existing RL datasets are 10B tokens while pretraining uses 1T tokens. Our Webscale-RL pipeline solves this by automatically converting pretraining documents into 1.2M verifiable QA pairs across 9+ domains. ๐Ÿ“„ Paper: ๐Ÿ’ป Code: ๐Ÿ“Š Dataset: Results: [---] more token-efficient than continual pretraining with significant performance gains on MMLU-pro BigBench and mathematical reasoning benchmarks ๐Ÿ“ˆ Work by Zhepeng Cen (@zhepengcen) Haolin Chen (@HaolinChen11) Shiyu Wang"  
[X Link](https://x.com/anyuser/status/1976762671740412126)  2025-10-10T21:32Z 19.1K followers, 16.6K engagements
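
A hedged sketch of the document-to-verifiable-QA conversion the pipeline automates; `llm_json` is a hypothetical call that returns parsed JSON, and the filtering rule shown is only one plausible way to keep answers machine-checkable:

```python
# Turn a pretraining document into QA pairs whose answers can be verified
# against the source text, so an RL reward can later be computed by exact match.

def doc_to_qa(document: str, llm_json, max_pairs: int = 5) -> list[dict]:
    pairs = llm_json(
        f"Read the document and produce up to {max_pairs} question/answer "
        "pairs whose answers are short spans copied verbatim from the text.\n\n"
        f"Document:\n{document}"
    )
    # Keep only pairs whose answer is literally present in the source document.
    return [p for p in pairs if p.get("answer") and p["answer"] in document]
```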


"Introducing Enterprise Deep Research (EDR): A steerable multi-agent system that transforms complex enterprise research into comprehensive actionable reports ๐Ÿ“Š EDR combines [--] key components: ๐Ÿง  Master Planning Agent for adaptive query decomposition ๐Ÿ” [--] specialized search agents (General Academic GitHub LinkedIn) ๐Ÿ›  Extensible MCP-based tools (NL2SQL file analysis enterprise workflows) ๐Ÿ“ˆ Visualization Agent for data-driven insights ๐Ÿ”„ Reflection mechanism with optional human-in-the-loop guidance Results on open benchmarks: โœ… Outperforms SOTA on DeepResearch Bench (49.86 score) โœ… 71.57% win"  
[X Link](https://x.com/anyuser/status/1981831647277297799)  2025-10-24T21:14Z 19.1K followers, [----] engagements


"Were thrilled to announce that @MetaMindIO has been acquired by @Salesforce https://www.metamind.io/salesforce-acquisition https://www.metamind.io/salesforce-acquisition"  
[X Link](https://x.com/SFResearch/status/717088190796959745)  2016-04-04T20:35Z 19.1K followers, [---] engagements


"๐Ÿงต Jason Wu (@jasonwu0731) on our simulation and trustworthy AI work. (1/3) #FutureOfAI #TrustworthyAI"
X Link 2025-12-18T18:51Z 19.1K followers, [---] engagements

"Demographics aren't enough to simulate human behavior. ๐Ÿง  New research introduces SCOPE: a framework and persona dataset collection that moves beyond demographic templates to build richer AI personas grounded in sociopsychological structure. ๐Ÿ“„ Paper: Key findings across [--] models: Demographics alone explain only 1.5% of variance in human responses Adding traits values & identity narratives improves behavioral alignment while reducing bias SCOPE personas outperform existing approaches on SimBench an external social and behavioural benchmark. The work also shows SCOPE can augment existing"
X Link 2026-01-30T00:18Z 19.1K followers, [----] engagements

"At Salesforce AI Research we believe that the most transformative breakthroughs happen when we collaborate with the brightest minds in the academic community. Meet our [----] academic grant recipients: #FutureOfAI #EnterpriseAI https://sforce.co/4rmJ8DA https://sforce.co/4rmJ8DA"
X Link 2026-01-30T23:16Z 19.1K followers, 11.9K engagements

"(2/4) Nanyun (Violet) Peng @VioletNPeng @UCLA is developing MAP-SE a Multi-Agent Persuasion Simulation Engine that studies how influence emerges across adaptive agents with long-term memorymoving beyond simple one-on-one interactions. #AIResearch #AgenticAI"
X Link 2026-01-30T23:17Z 19.1K followers, [----] engagements

"@VioletNPeng @UCLA (3/4) Victor Zhong @hllo_wrld @UWaterloo is creating a framework for "distributional evaluation" using Language Personas grounded in real user data. This enables more realistic scalable AI testing that identifies bias and failure modes. #AIEvaluation #TrustedAI"
X Link 2026-01-30T23:17Z 19.1K followers, [----] engagements

"@VioletNPeng @UCLA @hllo_wrld @UWaterloo (4/4) Percy Liang @percyliang @Stanford is building fully open-source models for agentic tasks. Using the Marin framework his team is training 8B and 32B models with complete transparencyfrom data curation to reinforcement learning. #OpenSource #AgenticAI"
X Link 2026-01-30T23:17Z 19.1K followers, [----] engagements

"Last month @SFResearch team members created onesies for donation to @Hospital_Art supporting their mission to bring art to hospitals worldwide. Proud of our team's commitment to compassion and community impact. #AIforGood"
X Link 2026-02-01T16:50Z 19.1K followers, [----] engagements

"The next phase of AI isn't about larger models. We're engineering systems that give LLMs what they lack: long-term memory multistep reasoning capabilities and orchestration that enables real-world action. This is how AI moves from chatbots to business transformation. https://sforce.co/46sEUlP https://sforce.co/46sEUlP"
X Link 2026-02-03T22:38Z 19.1K followers, [---] engagements

"(2/22) GTA1: GUI Test-time Scaling Agent: GTA1 introduces test-time scaling for GUI agents using multiple candidate action proposals and RL-based grounding to achieve state-of-the-art performance on autonomous task completion across platforms. Authors: Yan Yang Dongxu Li Yutong Dai Yuhao Yang Ziyang Luo Zirui Zhao Zhiyuan Hu Junzhe Huang Amrita Saha Zeyuan Chen Ran Xu Liyuan Pan Silvio Savarese Caiming Xiong Junnan Li https://bit.ly/4o04fdX https://twitter.com/i/web/status/2019787013247897845 https://bit.ly/4o04fdX https://bit.ly/4o04fdX https://twitter.com/i/web/status/2019787013247897845"
X Link 2026-02-06T14:55Z 19.1K followers, [--] engagements

"(3/22) Variation in Verification: Understanding Verification Dynamics in Large Language Models: Generative verifiers can help weak LLM generators nearly match stronger ones in test-time scaling (closing gaps by 75.5%) but verification effectiveness varies with problem difficulty and verifier scaling alone has limits. Authors: Yefan Zhou Austin Xu Yilun Zhou Janvijay Singh Jiang Gui Shafiq Joty https://bit.ly/3McnwuS https://twitter.com/i/web/status/2019787015319859635 https://bit.ly/3McnwuS https://bit.ly/3McnwuS https://twitter.com/i/web/status/2019787015319859635 https://bit.ly/3McnwuS"
X Link 2026-02-06T14:55Z 19.1K followers, [--] engagements

"(18/22) Webscale-RL: Automated Data Pipeline for Scaling RL Data to Pretraining Levels: Authors: Zhepeng Cen Haolin Chen Shiyu Wang Zuxin Liu Zhiwei Liu Ding Zhao Silvio Savarese Caiming Xiong Huan Wang Weiran Yao https://bit.ly/3IFuMhf https://bit.ly/3IFuMhf"
X Link 2026-02-06T14:55Z 19.1K followers, [--] engagements

"(19/22) Grounded Test-Time Adaptation for LLM Agents 10char link Authors: Arthur Chen Zuxin Liu Jianguo Zhang Akshara Prabhakar Zhiwei Liu Shelby Heinecke Silvio Savarese Victor Zhong Caiming Xiong"
X Link 2026-02-06T14:55Z 19.1K followers, [--] engagements

"(8/22) WALT: Web Agents that Learn Tools: WALT reverse-engineers website functionality into reusable tools like search filter and createshifting from fragile step-by-step interactions to reliable tool invocation with higher success and fewer steps on VisualWebArena and WebArena. Authors: Viraj Prabhu @virprabh Yutong Dai @yutong_dai Matthew Fernandez Jing Gu @jinggu4ai Krithika Ramakrishnan Yanqi Luo Silvio Savarese @silviocinguetta Caiming Xiong @CaimingXiong Junnan Li @LiJunnan0409 Zeyuan Chen @ZeyuanChen Ran Xu @stanleyran โœ… Accepted to #ICLR2026 https://bit.ly/4nhJf0K"
X Link 2026-02-06T17:17Z 19.1K followers, [--] engagements

"(9/22) SCUBA: Salesforce Computer Use Benchmark: SCUBA benchmarks computer-use agents on [---] real Salesforce CRM tasks across admin sales and service workflowsopen-source agents achieve 5% success vs. 39% for closed-source models in zero-shot settings improving to 50% with demonstrations while reducing time and costs by 13-16%. Authors: Yutong Dai @yutong_dai Krithika Ramakrishnan Jing Gu @jinggu4ai Matthew Fernandez Yanqi Luo Viraj Prabhu @virprabh Zhenyu Hu Silvio Savarese @silviocinguetta Caiming Xiong @CaimingXiong Zeyuan Chen @ZeyuanChen Ran Xu @stanleyran โœ… Accepted to #ICLR2026"
X Link 2026-02-06T17:17Z 19.1K followers, [--] engagements

"(12/22) Improving LLM Alignment with References: Reference-guided evaluation improves LLM-based evaluators and enables effective semi-self-improvementachieving 73.1% on AlpacaEval and 58.7% on Arena-Hard with Llama-3-8B-Instruct comparable to finetuned reward models. Authors: Kejian Shi Yixin Liu PeiFeng Wang @PeifengWang3 Alexander Fabbri Shafiq Rayhan Joty @JotyShafiq Arman Cohan โœ… Accepted to #ICLR2026 https://bit.ly/4am3czb https://bit.ly/4am3czb https://bit.ly/4am3czb https://bit.ly/4am3czb"
X Link 2026-02-06T17:17Z 19.1K followers, [--] engagements

"(13/22) Distill-SynthKG: Distilling Knowledge Graph Synthesis Workflow for Improved Coverage and Efficiency: SynthKG introduces ontology-free KG synthesis that distills into Distill-SynthKG for efficient single-step generationsurpassing models 8x larger in KG quality and outperforming baselines in retrieval and question-answering with a novel graph-based RAG framework. Authors: Prafulla Kumar Choubey Xin Su Man Luo Xiangyu Peng @beckypeng6 Caiming Xiong @CaimingXiong Tiep Le Shachar Rosenman @SRosenamn Vasudev Lal @vasudev_lal Phil Mui Ricky Ho Phillip Howard Chien-Sheng Wu @jasonwu0731 โœ…"
X Link 2026-02-06T17:17Z 19.1K followers, [--] engagements

"(17/22) Entropy-Based Block Pruning for Efficient Large Language Models: Entropy-based pruning outperforms cosine similarity methods by leveraging entropy patterns across Transformer blocksdecreasing early then increasingas a more effective measure of information richness for reducing model size while preserving accuracy. Authors: Liangwei Yang @Liangwei_Yang Yuhui Xu Juntao Tan Doyen Sahoo @doyensahoo Silvio Savarese @silviocinguetta Caiming Xiong @CaimingXiong Huan Wang @huan__wang Shelby Heinecke @shelbyh_ai โœ… Accepted to #ICLR2026 https://bit.ly/46cEErl https://bit.ly/46cEErl"
X Link 2026-02-06T17:17Z 19.1K followers, [--] engagements
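
To make the entropy criterion concrete, here is an illustrative scoring-and-selection sketch; the paper's exact entropy definition and pruning rule may differ from this toy version:

```python
# Score each Transformer block by the entropy of its output activations and
# drop the lowest-scoring blocks; entropy stands in for cosine similarity as
# the signal of information richness.
import torch

def activation_entropy(hidden: torch.Tensor, bins: int = 64) -> float:
    """Shannon entropy of a block's output activations (histogram estimate)."""
    probs = torch.histc(hidden.float().flatten(), bins=bins)
    probs = probs / probs.sum()
    probs = probs[probs > 0]
    return float(-(probs * probs.log()).sum())

def blocks_to_prune(per_block_hidden: list[torch.Tensor], n_prune: int) -> list[int]:
    scores = [activation_entropy(h) for h in per_block_hidden]
    # Prune the blocks judged least information-rich under this toy criterion.
    return sorted(range(len(scores)), key=lambda i: scores[i])[:n_prune]
```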

"(18/22) Webscale-RL: Automated Data Pipeline for Scaling RL Data to Pretraining Levels: Webscale-RL introduces a scalable pipeline converting pre-training documents into 1.2M verifiable QA pairs across 9+ domainsRL training on this dataset achieves continual pre-training performance with [---] fewer tokens offering a viable path to scaling RL to pre-training levels. Authors: Zhepeng Cen Haolin Chen Shiyu Wang Zuxin Liu @LiuZuxin Zhiwei Liu Ding Zhao Silvio Savarese @silviocinguetta Caiming Xiong @CaimingXiong Huan Wang @huan__wang Weiran Yao @iscreamnearby โœ… Accepted to #ICLR2026"
X Link 2026-02-06T17:17Z 19.1K followers, [--] engagements

"(19/22) Grounded Test-Time Adaptation for LLM Agents: Parametric online adaptation aligns LLM agents to environment-specific formats while non-parametric dynamics grounding learns causal state transitions through persona-driven explorationtogether addressing syntactic and semantic mismatches to boost WebArena multi-site success from 2% to 23% Authors: Arthur Chen @arthurchen189 Zuxin Liu @LiuZuxin Jianguo Zhang @JianguoZhang3 Akshara Prabhakar @aksh_555 Zhiwei Liu Shelby Heinecke @shelbyh_ai Silvio Savarese @silviocinguetta Victor Zhong @hllo_wrld Caiming Xiong @CaimingXiong โœ… Accepted to"
X Link 2026-02-06T17:17Z 19.1K followers, [--] engagements

"(20/22) Scalable Chain of Thoughts via Elastic Reasoning: Elastic Reasoning separates chain-of-thought into thinking and solution phases with independent budgets prioritizing solution completeness under constraintsachieving robust performance with lower training costs and more concise reasoning across math and coding benchmarks. Authors: Yuhui Xu @xyh6666 Hanze Dong @hendrydong Lei Wang @leiwang Doyen Sahoo @doyensahoo Junnan Li @LiJunnan0409 Caiming Xiong @CaimingXiong โœ… Accepted to #ICLR2026 https://bit.ly/4a0luqS https://bit.ly/4a0luqS https://bit.ly/4a0luqS https://bit.ly/4a0luqS"
X Link 2026-02-06T17:17Z 19.1K followers, [--] engagements

"(21/22) OFTSR: One-Step Flow for Image Super-Resolution with Tunable Fidelity-Realism Trade-offs: OFTSR achieves one-step image super-resolution with tunable fidelity-realism trade-off by aligning student predictions to teacher model sampling trajectoriesreaching state-of-the-art performance on FFHQ DIV2K and ImageNet without multi-step overhead. Authors: Yuanzhi Zhu Ruiqing Wang Shilin Lu Junnan Li @LiJunnan0409 Hanshu Yan @hanshu_yan Kai Zhang โœ… Accepted to #ICLR2026 https://bit.ly/3ZipY67 https://bit.ly/3ZipY67 https://bit.ly/3ZipY67 https://bit.ly/3ZipY67"
X Link 2026-02-06T17:17Z 19.1K followers, [---] engagements

"๐Ÿšจ New paper and dataset alert: When do multi-agent systems actually help ๐Ÿ“„ Paper: Despite growing interest most multi-agent systems (MAS) today rely on local sequential and hand-designed orchestrationmaking it difficult to reason about their benefits or scale them effectively. Introducing MAS-Orchestra which takes a new perspective: treat MAS orchestration as a holistic function-calling reinforcement learning problem with an explicit notion of the degree of multi-agentness (DoM). We also introduce MASBench a benchmark designed to systematically measure when MAS outperforms single-agent"
X Link 2026-02-02T18:17Z 19.1K followers, [----] engagements

"@Salesforce AI Research has [--] papers accepted to ICLR [----] advancing work across LLM reasoning evaluation systems knowledge graphs and agent architectures. Our research addresses critical challenges in making AI systems more reliable efficient and effective for enterprise applications. #ICLR2026 #FutureOfAI #EnterpriseAI https://twitter.com/i/web/status/2019822846122365374 https://twitter.com/i/web/status/2019822846122365374"
X Link 2026-02-06T17:17Z 19.1K followers, [----] engagements

"(14/22) CoAct-1: Computer-using Multi-agent System with Coding Actions: CoAct-1 introduces a multi-agent system combining GUI control with programmatic executionan Orchestrator delegates subtasks to GUI Operator or Programmer agents achieving 60.76% success on OSWorld (new SOTA) while reducing steps from [--] to [-----]. Authors: Linxin Song Yutong Dai @yutong_dai Viraj Prabhu @virprabh Jieyu Zhang Taiwei Shi Li Li Junnan Li @LiJunnan0409 Silvio Savarese @silviocinguetta Zeyuan Chen @ZeyuanChen Jieyu Zhao Ran Xu @stanleyran Caiming Xiong @CaimingXiong โœ… Accepted to #ICLR2026"
X Link 2026-02-06T17:17Z 19.1K followers, [--] engagements

"(15/22) LLMs Get Lost in Multi-Turn Conversation: LLMs show 39% performance drop in multi-turn vs. single-turn conversations across six tasksanalysis of 200000+ simulated conversations reveals they make premature assumptions and fail to recover when taking wrong turns. Authors: Philippe Laban @PhilippeLaban Hiroaki Hayashi @hiroakiLhayashi Yingbo Zhou Jennifer Neville @ProfJenNeville โœ… Accepted to #ICLR2026 https://bit.ly/3ZoC6T1 https://bit.ly/3ZoC6T1"
X Link 2026-02-06T17:17Z 19.1K followers, [--] engagements

"(2/7) Software issue localization maps bug reports to code functions needing fixes. As codebases grow across languages manual localization becomes infeasible. SWERANK+ addresses this with multilingual ranking and agentic search"
X Link 2026-02-10T02:39Z 19.1K followers, [---] engagements

"(3/7) SWERANKMULTI extends code ranking to [--] languages using SWELOCMULTI dataset (155K instances 4K+ repos): Bi-encoder retriever LLM reranker JavaScript Java TypeScript Ruby Rust Go PHP C C++ Python ๐Ÿ“„ https://arxiv.org/abs/2512.20482 https://arxiv.org/abs/2512.20482"
X Link 2026-02-10T02:39Z 19.1K followers, [--] engagements

"(6/7) The agentic approach mirrors developer workflows: broad exploration context building converging to root cause. Combines efficiency of specialized retrievers with depth of agentic reasoning. ๐Ÿ“„ https://arxiv.org/abs/2512.20482 https://arxiv.org/abs/2512.20482"
X Link 2026-02-10T02:39Z 19.1K followers, [---] engagements

"(7/7) By @gangi_official @YeLiu918 @WentingZhao9 @_jaedoo2 @TarunSures41845 @daniel_js_lee @caimingxiong @yingbozhou @semih__yavuz @JotyShafiq at @Salesforce AI Research UIUC KAIST AI #FutureOfAI #EnterpriseAI"
X Link 2026-02-10T02:39Z 19.1K followers, [---] engagements

"Deep research agents typically scale depthmore sequential steps. But what about scaling width ๐Ÿค” ๐Ÿ“„ Paper: We introduce Wide & Deep (W&D) research agents: a framework exploring parallel tool calling to boost performance while reducing costs and latency. Key results on BrowseComp HLE and GAIA: ๐Ÿ“Š Parallel tool calling improves accuracy across GPT-5 Gemini and Claude ๐Ÿ’ฐ 36% reduction in API costs 41% reduction in wall-clock time ๐ŸŽฏ W&D with GPT-5-Medium achieves 62.2% on BrowseCompbeating GPT-5-High's 54.9% Why it works: ๐Ÿ” Enhanced source credibility through diverse information gathering โœ…"
X Link 2026-02-11T15:17Z 19.1K followers, [----] engagements
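
At its simplest, the width idea amounts to issuing independent tool calls concurrently instead of one long sequential chain; a minimal asyncio sketch (the `search` coroutine is a stand-in, not the W&D framework):

```python
# Several independent searches run in parallel, so wall-clock time is roughly
# one call and the agent sees multiple sources at its next reasoning step.
import asyncio

async def search(query: str) -> str:
    await asyncio.sleep(0.1)          # stand-in for a real web/tool call
    return f"results for {query!r}"

async def wide_step(queries: list[str]) -> list[str]:
    return await asyncio.gather(*(search(q) for q in queries))

if __name__ == "__main__":
    print(asyncio.run(wide_step(["topic A", "topic B", "topic C"])))
```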

"eVerse: CRMArena Org Data Generator The enterprise AI bottleneck isn't model performanceit's the last mile between pilot and production. ๐Ÿ”’ The Challenge: How do you rigorously test AI agents without touching real customer data Privacy regulations like GDPR prohibit using production data for development. But sanitized data removes the complexity and volume agents need to learn effectively. ๐Ÿ’ก The Solution: eVerse is our answer to enterprise AI's training problem: privacy-preserving simulation environments where AI agents can fail safely learn from realistic scenarios and improve before they"
X Link 2026-02-10T17:31Z 19.1K followers, [---] engagements

"Salesforce AI Research is hiring a Research Scientist (including Senior/Lead level) with deep expertise in Self-Evolving Agents and Reinforcement Learning. Apply: We're expanding our Frontier RL stack to support continuous self-evolutionbuilding the foundation for autonomous systems that improve through interaction and experience. This role sits at the intersection of cutting-edge research and real-world enterprise applications where your work will directly impact how millions of CRM customers leverage AI. ๐Ÿง  Key requirements: PhD in ML/RL or equivalent research experience Strong publication"
X Link 2026-02-12T15:55Z 19.1K followers, [----] engagements

"Introducing COVID-19 Search a new AI-powered search tool that equips scientists and researchers with the most relevant information about COVID-19. Learn more about this tool at https://sfdc.co/covid19search https://sfdc.co/covid19search"
X Link 2020-06-17T14:52Z 19.1K followers, [---] engagements

"Announcing the Third Annual AI Research Grant For more details and how to apply: Blog: Website: Good luck to our future applicants https://einstein.ai/outreach/grants https://blog.einstein.ai/announcing-the-annual-salesforce-ai-research-grant/ https://einstein.ai/outreach/grants https://blog.einstein.ai/announcing-the-annual-salesforce-ai-research-grant/"
X Link 2020-08-10T21:39Z 19.1K followers, [---] engagements

"Congrats to our ICLR [----] Accepted Paper Authors @CaimingXiong @jesse_vig @thisismadani @nazneenrajani @semih__yavuz @mrnt0810 @yingbozhou_ai @jasonwu0731 @sachin_logs @LiJunnan0409 @stevenhoi @panzhou9 @StrongDuality and all our amazing collaborators"
X Link 2021-01-19T23:37Z 19.1K followers, [--] engagements

"Thank you to everyone who submitted a proposal to our third annual Salesforce AI Research Grant. Were proud to announce our [----] round of winners. Congratulations @bluevincent @Diyi_Yang @mutembesa @danqi_chen Read More: https://blog.einstein.ai/celebrating-the-winners-of-the-third-annual-salesforce-ai-research-grant/ https://blog.einstein.ai/celebrating-the-winners-of-the-third-annual-salesforce-ai-research-grant/"
X Link 2021-01-26T00:15Z 19.1K followers, [---] engagements

"Were thrilled to announce that Silvio Savarese (@silviocinguetta) former associate professor of Computer Science at Stanford University has joined @salesforce as our new EVP and Chief Scientist of Salesforce Research"
X Link 2021-04-15T16:23Z 19.1K followers, [---] engagements

"Congrats to our ACL [----] Accepted Paper Authors @CaimingXiong @JotyShafiq @baxterkb @jasonwu0731 @owenhaoliu @Wenpeng_Yin @huan__wang and all of our amazing collaborators"
X Link 2021-05-06T23:05Z 19.1K followers, [--] engagements

"Can #AI language models learn from evolution to design proteins Learn how Salesforce is taking a step towards enabling solutions to cure disease and clean our planet. Blog: Paper: http://biorxiv.org/content/10.1101/2021.07.18.452833v1 http://blog.einstein.ai/learning-from-evolution/ http://biorxiv.org/content/10.1101/2021.07.18.452833v1 http://blog.einstein.ai/learning-from-evolution/"
X Link 2021-07-19T16:00Z 19.1K followers, [--] engagements

"Meet CodeT5 - the first code-aware encoder-decoder pre-trained model that achieves SoTA on [--] sub-tasks in CodeXGLUE Learn how its disrupting software development. Blog: Paper: GitHub: #codeintelligence https://github.com/salesforce/CodeT5 https://arxiv.org/abs/2109.00859 http://blog.einstein.ai/codet5/ https://github.com/salesforce/CodeT5 https://arxiv.org/abs/2109.00859 http://blog.einstein.ai/codet5/"
X Link 2021-09-03T16:13Z 19.1K followers, [--] engagements

"Congrats to our NeurIPS [----] Accepted Paper Authors @CaimingXiong @yubai01 @huan__wang @lrvarshney @a1vinchan @thisismadani @benwkrause @nikhil_ai @LiJunnan0409 @ramprs21 @AkhileshGotmare @JotyShafiq @stevenhoi and all of our amazing collaborators"
X Link 2021-09-29T23:35Z 19.1K followers, [--] engagements

"Do you want to launch your career in machine learning research Our new AI Residency Program can allow you to do just that. Set yourself up for success in applying to PhD programs w/ real-world experience at one of the industry's top AI research programs. https://sforce.co/AIResTwitter https://sforce.co/AIResTwitter"
X Link 2021-12-09T16:14Z 19.1K followers, [---] engagements

"Discover CTRLsum a generic summarization framework that enables users to control the content of the generated summaries along multiple dimensions. Blog: Code: #NLP #summarization https://github.com/salesforce/ctrl-sum https://arxiv.org/abs/2012.04281 https://blog.einstein.ai/ctrlsum/ https://github.com/salesforce/ctrl-sum https://arxiv.org/abs/2012.04281 https://blog.einstein.ai/ctrlsum/"
X Link 2021-12-15T23:07Z 19.1K followers, [--] engagements

"Did you know most #NLP models are not designed to handle code-mixing where each sentence contains multiple languages Learn how @samsontmr @SFResearch is changing that. Blog: Paper: Code: https://github.com/salesforce/adversarial-polyglots https://www.aclweb.org/anthology/2021.naacl-main.282 https://blog.salesforceairesearch.com/code-mixing https://github.com/salesforce/adversarial-polyglots https://www.aclweb.org/anthology/2021.naacl-main.282 https://blog.salesforceairesearch.com/code-mixing"
X Link 2022-01-24T22:28Z 19.1K followers, [---] engagements

"Meet BLIP: Bootstrapping Language-Image Pre-training for unified Vision-Language understanding/generation. New model architecture + Dataset bootstrapping = SoTA results on a wider range of V+L tasks than other models http://blog.salesforceairesearch.com/blip-bootstrapping-language-image-pretraining http://blog.salesforceairesearch.com/blip-bootstrapping-language-image-pretraining"
X Link 2022-02-24T21:47Z 19.1K followers, [--] engagements

"Discover CodeGen - an AI model that turns simple natural-language requests into executable code. Learn more about this breakthrough in conversational AI programming. Paper: Blog: Code: https://github.com/salesforce/CodeGen https://blog.salesforceairesearch.com/codegen/ https://arxiv.org/abs/2203.13474 https://github.com/salesforce/CodeGen https://blog.salesforceairesearch.com/codegen/ https://arxiv.org/abs/2203.13474"
X Link 2022-03-29T23:17Z 19.1K followers, [---] engagements

"Want to build bots better Try Converse: a new Task-Oriented Dialogue System that simplifies chatbot building while handling complex tasks and conversations. #NLP #AI Code: Paper: Blog: https://blog.salesforceairesearch.com/converse-task-oriented-dialogue-system/ https://arxiv.org/abs/2203.12187 https://github.com/salesforce/converse https://blog.salesforceairesearch.com/converse-task-oriented-dialogue-system/ https://arxiv.org/abs/2203.12187 https://github.com/salesforce/converse"
X Link 2022-04-08T17:22Z 19.1K followers, [--] engagements

"Check out our #NAACL2022 accepted papers Congrats to the authors We hope everyone enjoys the conference @EhsanHAsl @owenhaoliu @CaimingXiong @murakhovska @jasonwu0731 @alexfabbri4 @mrnt0810 @jesse_vig @iam_wkr @semih__yavuz @yingbozhou_ai @LHung1610 @stevenhoi @PhilippeLaban"
X Link 2022-04-15T18:34Z 19.1K followers, [--] engagements

"Read our blog on #ACL2022. Congrats to all our authors for their accepted papers http://blog.salesforceairesearch.com/salesforce-at-acl-2022 http://blog.salesforceairesearch.com/salesforce-at-acl-2022"
X Link 2022-05-19T17:35Z 19.1K followers, [--] engagements

"Our CodeGen models are now available at @huggingface (Model size variants: 350M 2B 6B and 16B.) Clone the latest transformers repository and try it out Paper: Models: https://huggingface.co/modelssearch=salesforce+codegen https://arxiv.org/abs/2203.13474 https://huggingface.co/modelssearch=salesforce+codegen https://arxiv.org/abs/2203.13474"
X Link 2022-06-28T16:19Z 19.1K followers, [---] engagements
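
A quick-start sketch for trying one of the CodeGen checkpoints with the `transformers` library; the 350M mono-language variant and the generation settings below are chosen purely for illustration:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Salesforce/codegen-350M-mono"          # smallest variant, for a quick test
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "def hello_world():"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```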

"CodeRL advances program synthesis by integrating pretrained language models + deep reinforcement learning. Using unit test feedback in model training and inference + an improved CodeT5 model it achieves SOTA results on competition-level programming tasks. https://blog.salesforceairesearch.com/coderl https://blog.salesforceairesearch.com/coderl"
X Link 2022-07-19T23:28Z 19.1K followers, [---] engagements

"ETSformer is a time-series forecasting model that combines the classical intuition of seasonal-trend decomposition and exponential smoothing with the Transformer framework introducing novel exponential smoothing and frequency attention mechanisms. https://blog.salesforceairesearch.com/etsformer-time-series-forecasting https://blog.salesforceairesearch.com/etsformer-time-series-forecasting"
X Link 2022-08-23T22:00Z 19.1K followers, [--] engagements

"Using both natural and artificial abilities the human relationship with tools has drastically evolved. The best tools are powerful because theyre easy to use. This is where our skill of language and AI meet. Learn more on how conversation can power AI https://blog.salesforceairesearch.com/age-of-conversational-ai/ https://blog.salesforceairesearch.com/age-of-conversational-ai/"
X Link 2022-10-04T20:00Z 19.1K followers, [--] engagements

"Time-series forecasting methods perform poorly on long sequences when data changes over time. DeepTime overcomes this issue by using forecasting-as-meta-learning on deep time-index models. Result: state-of-the-art performance and a highly efficient model. https://blog.salesforceairesearch.com/deeptime-meta-learning-time-series-forecasting https://blog.salesforceairesearch.com/deeptime-meta-learning-time-series-forecasting"
X Link 2022-10-13T20:00Z 19.1K followers, [---] engagements

"For time series forecasting deep learning isnt scalable for streaming data and non-stationary data makes it hard. FSNet learns deep forecasting models on the fly and handles non-stationary data + concept drift. Learn more https://blog.salesforceairesearch.com/fsnet-deep-time-series-forecasting/ https://blog.salesforceairesearch.com/fsnet-deep-time-series-forecasting/"
X Link 2022-10-28T18:00Z 19.1K followers, [--] engagements

"Do you want to make your dog look like a golden retriever Or get a picture of a cat surfing Researchers at Salesforce recently developed a new editing algorithm called EDICT - here's a thread on the results and details ๐Ÿงต"
X Link 2023-01-10T19:00Z 19.1K followers, 29.2K engagements

"Check out our #ICLR2023 Accepted Papers Congrats to all the authors @silviocinguetta @CaimingXiong @huan__wang @doyensahoo @yingbozhou_ai @hiroakiLhayashi @erik_nijkamp @stevenhoi @YuBai01 @jasonwu0731"
X Link 2023-01-24T17:43Z 19.1K followers, [----] engagements

"We introduce the Salesforce CausalAI Library an open source library for causal analysis of time series and tabular data. GitHub: GitHub Documentation: Tech Report: Blog: https://blog.salesforceairesearch.com/causalai/ https://arxiv.org/abs/2301.10859 https://opensource.salesforce.com/causalai/latest/index.html https://github.com/salesforce/causalai https://blog.salesforceairesearch.com/causalai/ https://arxiv.org/abs/2301.10859 https://opensource.salesforce.com/causalai/latest/index.html https://github.com/salesforce/causalai"
X Link 2023-01-31T19:00Z 19.1K followers, 12.7K engagements

"Check out our #CVPR2023 Accepted Papers Congrats to all the authors @silviocinguetta @CaimingXiong @jcniebles @nikhil_ai @LiJunnan0409 @bram_wallace @realNingYu"
X Link 2023-03-01T00:54Z 19.1K followers, [----] engagements

"Editing an image using AI but want to keep the details Check out our work EDICT (๐ŸŽ†CVPR 2023๐ŸŽ†): Gradio Demo: Code: Arxiv: Authors: @bram_wallace @nikhil_ai https://arxiv.org/abs/2211.12446 https://huggingface.co/spaces/Salesforce/EDICT Do you want to make your dog look like a golden retriever Or get a picture of a cat surfing Researchers at Salesforce recently developed a new editing algorithm called EDICT - here's a thread on the results and details ๐Ÿงต https://t.co/vHDdVonBz0 https://arxiv.org/abs/2211.12446 https://huggingface.co/spaces/Salesforce/EDICT Do you want to make your dog look"
X Link 2023-03-15T18:41Z 19.1K followers, 18.6K engagements

"In Loving Memory of Dragomir Radev. You will be missed. โ™ฅ @dragomir_radev https://blog.salesforceairesearch.com/in-loving-memory-of-drago-radev/ https://blog.salesforceairesearch.com/in-loving-memory-of-drago-radev/"
X Link 2023-04-05T15:00Z 19.1K followers, [----] engagements

"Check out our #ACL2023 Accepted Papers Congrats to all the authors @silviocinguetta @CaimingXiong @alexfabbri4 @JiachengNLP @memray0 @JotyShafiq @jasonwu0731 @iam_wkr @PhilippeLaban @jesse_vig @yingbozhou_ai @semih__yavuz"
X Link 2023-05-03T17:03Z 19.1K followers, 14.9K engagements

"๐Ÿ”ฅIntroducing XGen-7B a new 7B LLM trained on 8K seq. length for 1.5T tokens. Better or comparable results with MPT Falcon LLaMA OpenLLaMA in text & code tasks. Blog: Code: Training cost $150K for 1T token https://github.com/salesforce/xgen http://blog.salesforceairesearch.com/xgen/ https://github.com/salesforce/xgen http://blog.salesforceairesearch.com/xgen/"
X Link 2023-06-28T19:02Z 19.1K followers, 22.5K engagements

"Releasing ๐Ÿš€ CodeGen2.5 ๐Ÿš€ a small but mighty LLM for code. - On par with models twice its size - Trained on 1.5T tokens - Features fast infill sampling Blog: Paper: Code: Model: https://huggingface.co/Salesforce/codegen25-7B-multi https://github.com/salesforce/CodeGen https://arxiv.org/abs/2305.02309 https://blog.salesforceairesearch.com/codegen25 https://huggingface.co/Salesforce/codegen25-7B-multi https://github.com/salesforce/CodeGen https://arxiv.org/abs/2305.02309 https://blog.salesforceairesearch.com/codegen25"
X Link 2023-07-06T20:46Z 19.1K followers, 230.2K engagements

"Introducing XGen-Image-1 our first foray into training large text-to-image models. Trained for $75K using TPUs on the LAION dataset XGen-Image-1 matches the performance of Stable Diffusion 1.5/2.1. https://blog.salesforceairesearch.com/prototyping-xgen-image-1/ https://blog.salesforceairesearch.com/prototyping-xgen-image-1/"
X Link 2023-08-10T18:59Z 19.1K followers, 18.7K engagements

"Our blog for Diffusion-DPO is now live๐Ÿš€ In this project we brought the benefits of Reinforcement Learning from Human Feedback (RLHF) to text-to-image diffusion models at scale for the first time. https://blog.salesforceairesearch.com/diffusion-dpo/ https://blog.salesforceairesearch.com/diffusion-dpo/"
X Link 2024-01-09T19:09Z 19.1K followers, [----] engagements

"Check out our #ICLR2024 Accepted Papers. Congratulations to all of our authors"
X Link 2024-01-18T22:37Z 19.1K followers, [----] engagements

"We have [--] accepted papers at NAACL this year Congratulations to all of our authors on their work"
X Link 2024-03-14T21:46Z 19.1K followers, [----] engagements

"๐ŸŒŸ Meet #Moirai: Revolutionizing time-series forecasting with universal models Say goodbye to dataset-specific models and hello ๐Ÿ‘‹ to accurate forecasts across domains Code: LOTSA data: Blog post: https://sforce.co/3TCMDqu https://sforce.co/4axHHtQ https://sforce.co/4aADhSM https://sforce.co/3TCMDqu https://sforce.co/4axHHtQ https://sforce.co/4aADhSM"
X Link 2024-04-01T17:33Z 19.1K followers, 13.1K engagements

"Check out Diffusion-DPO๐ŸŒŸ Bridging the gap between StableDiffusion & closed models like Midjourney v5. Our #TextToImage model uses human feedback for state-of-the-art alignment marking a new era in AI creativity Code: Blog: https://sforce.co/3VHYQg3 https://sforce.co/4ab7p7J https://sforce.co/3VHYQg3 https://sforce.co/4ab7p7J"
X Link 2024-04-05T21:20Z 19.1K followers, [----] engagements

"Meet our Tiny Giant. Our 1B parameter model xLAM-1B is now the best micro model for function calling outperforming models 7x its size including GPT-3.5 & Claude. On-device agentic AI is here. #AIResearch #SLM #TinyButMighty Paper: Github: https://apigen-pipeline.github.io/ https://arxiv.org/pdf/2406.18518 https://apigen-pipeline.github.io/ https://arxiv.org/pdf/2406.18518"
X Link 2024-07-01T16:21Z 19.1K followers, 59.1K engagements

"Just in Our Tiny Giant xLAM-1B-fc has officially arrived on @huggingface with a few friends๐ŸŽ‰ Check out for our suite of small agentic models including xLAM-1B-fc and xLAM-7B-fc with mobile-ready quantized versions nowโšก#LAM #AIModels #AI https://bit.ly/4faoYaQ https://bit.ly/4faoYaQ"
X Link 2024-07-18T19:02Z 19.1K followers, 25.4K engagements

"Exciting news ๐ŸŽŠ Our models xLAM-7B-fc and xLAM-1B-fc are ranked #3 and #25 on the Berkeley Function Calling leaderboard. Notably they are the smallest models on the leaderboard๐Ÿš€๐Ÿ“Š #AI #AIModels #AIresearch Check out our suite of small agentic models including mobile-ready quantized versions. ๐Ÿค— @huggingface: http://bit.ly/4faoYaQ http://bit.ly/4faoYaQ"
X Link 2024-07-19T17:32Z 19.1K followers, [----] engagements

"Trying out a snazzy new LLM tonight Take a break download and try our Tiny Giant xLAM-1B and xLAM-7B now on @huggingface. Your agentic AI workflows will thank you #tinybutmighty https://bit.ly/4faoYaQ https://bit.ly/4faoYaQ"
X Link 2024-07-24T05:10Z 19.1K followers, 10.8K engagements

"Breaking news โžกโžกโžก We just released the MINT-1T ๐Ÿƒdataset One trillion tokens. Multimodal. Interleaved. Open-source. Perfect for training multimodal models and advancing their pre-training. Try it today Blog: Dataset: https://bit.ly/3YikQiN https://bit.ly/3YikQPP https://bit.ly/3YikQiN https://bit.ly/3YikQPP"
X Link 2024-07-24T18:34Z 19.1K followers, 28.5K engagements

"(1/12) Can different LLMs give you unique and novel ideas Very likely NO ๐Ÿค– " : " reveals: LLMs often on purely imaginary and hallucinated contents Explore ๐Ÿงตor full paper: https://arxiv.org/abs/2407.16604 https://arxiv.org/abs/2407.16604"
X Link 2024-07-24T21:34Z 19.1K followers, 22.9K engagements

"๐Ÿ’ฅ xLAM-7b beats #GPT-4 in function calling according to the The Berkeley Function Calling Leaderboard second only to #Claude 3.5-Sonnet. Our "Tiny Giant" models are ranking [--] and [--]. Check it out: #tinybutmighty #SLM (and congrats team) https://bit.ly/3WIZdY3 https://bit.ly/3WIZdY3"
X Link 2024-07-29T18:34Z 19.1K followers, 11K engagements

"Open science wins again Introducing Salesforce Research DEI our AI software engineering agents org achieving a 34.3% resolve rate on SWE-Bench Lite crushing closed-source systems GitHub: Paper: #OpenScience #AIForAll https://www.arxiv.org/abs/2408.07060 https://salesforce-research-dei-agents.github.io/ https://www.arxiv.org/abs/2408.07060 https://salesforce-research-dei-agents.github.io/"
X Link 2024-08-14T16:34Z 19.1K followers, [----] engagements

"๐Ÿš€ Supercharge your RAG pipeline ๐Ÿš€ Introducing LlamaRank our SOTA reranker outperforming leading APIs in general document ranking and code search across diverse datasets Blog: Try it out on @togethercompute: Built on Llama3-8B-Instruct and with linear and calibrated scoring for easy interpretation LlamaRank isn't just powerful it's blazingly fast. https://bit.ly/3SZHybZ https://bit.ly/3MmHDTu https://bit.ly/3SZHybZ https://bit.ly/3MmHDTu"
X Link 2024-08-26T22:11Z 19.1K followers, 23.7K engagements

"Introducing the full xLAM family our groundbreaking suite of Large Action Models ๐Ÿš€ From the 'Tiny Giant' to industrial powerhouses xLAM is revolutionizing AI efficiency #AIResearch #AIEfficiency ๐Ÿค— Hugging Face Collection: ๐Ÿคฉ Research Blog ๐Ÿ—ž Press Release: Meet the family: xLAM-1B / TINY: Our 1B parameter marvel ideal for on-device AI. Outperforms larger models despite its compact size xLAM-7B / SMALL: Perfect for swift academic exploration with limited GPU resources. xLAM-8x7B / MEDIUM: Mixture-of-experts model balancing latency resources and performance for industrial applications."
X Link 2024-09-06T18:04Z 19.1K followers, 15.3K engagements

"๐Ÿ‘‡UPDATED DATASET๐Ÿ‘‡Fineweb training dataset just got leaner We've tackled the 70% duplication issue in this valuable 93.4TB dataset. Same great data now more efficient and cost-effective. #AIResearch #DataEfficiency https://bit.ly/3XI3wlB https://bit.ly/3XI3wlB"
X Link 2024-09-25T14:52Z 19.1K followers, 19.2K engagements

"Introducing SFR-Judge our new family of three judge models (8B 12B and 70B parameters) a game-changer for auto-evaluation and reward modeling. Blog: Paper: Github: (code coming soon): ๐Ÿ’ฅ Trained to perform pairwise comparison direct scoring and classification judgments ๐Ÿ’ฅ Outperformed many open-source judges on 10/13 benchmarks ๐Ÿ’ฅ Broken the 90% accuracy barrier on RewardBench - a first for generative models ๐Ÿ’ฅ Showed less bias across [--] key metrics than many other judge models ๐Ÿ’ฅ Matched/outperformed GPT-4o on most pairwise & direct scoring and classification tasks Accelerate your own model"
X Link 2024-09-27T17:27Z 19.1K followers, 35.3K engagements

"๐Ÿ† ๐Ÿ† ๐Ÿ† Our groundbreaking research on prompt leakage in multi-turn LLM interactions is amongst the top-50% industry-track papers accepted to #EMNLP2024 We propose a novel threat model uncover social engineering vulnerabilities measure fine-grained leakage and apply different mitigation techniques. Learn how to build more #SecureAI systems: #LLMSecurity #AISafety #TrustedAI https://arxiv.org/abs/2404.16251 https://arxiv.org/abs/2404.16251"
X Link 2024-10-02T19:50Z 19.1K followers, [----] engagements

"๐Ÿ“ข๐Ÿ“ข๐Ÿ“ขIntroducing xGen-MM-Vid (BLIP-3-Video) This highly efficient multimodal language model is laser-focused on video understanding. Compared to other models xGen-MM-Vid represents a video with a fraction of the visual tokens (e.g. [--] vs. [----] tokens). Paper: Website: Researchers ๐Ÿงต:๐Ÿ‘‡ https://bit.ly/3Yvyqiy https://arxiv.org/abs/2410.16267 https://bit.ly/3Yvyqiy https://arxiv.org/abs/2410.16267"
X Link 2024-10-22T18:28Z 19.1K followers, 12.7K engagements

"โ“Beyond "right or wrong": Introducing a novel RAG evaluation framework based on sub-question coverage. How do we measure if RAG systems are giving complete answers to complex questions Enter: Do RAG Systems Cover What Matters Evaluating and Optimizing Responses with Sub-Question Coverage #AccurateAI ๐Ÿ“ŽPaper: ๐Ÿงตstarts here ๐Ÿ‘‡ 1) We propose decomposing questions into sub-questions and classifying them into three typescore background and follow-upto reflect their roles and importance. ๐Ÿ’  Core sub-questions are central to addressing the main query. ๐Ÿ’  Background sub-questions provide necessary"
X Link 2024-10-24T22:46Z 19.1K followers, [----] engagements
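
As a rough illustration of the coverage idea in the post above (decompose the question into typed sub-questions, then check how many of them the response actually addresses), here is a small sketch; the `is_addressed` heuristic is an invented stand-in for the LLM-based check a real evaluator would use:

```python
# Sketch of sub-question coverage scoring. Sub-question types follow the
# post (core / background / follow-up); `is_addressed` is a naive keyword
# heuristic standing in for an LLM judgment.
from dataclasses import dataclass
from typing import List

@dataclass
class SubQuestion:
    text: str
    kind: str  # "core", "background", or "follow-up"

def is_addressed(sub_q: SubQuestion, response: str) -> bool:
    # Crude proxy: any long-ish keyword from the sub-question appears in the answer.
    keywords = [w for w in sub_q.text.lower().split() if len(w) > 4]
    return any(k in response.lower() for k in keywords)

def coverage(sub_questions: List[SubQuestion], response: str, kind: str) -> float:
    relevant = [q for q in sub_questions if q.kind == kind]
    if not relevant:
        return 1.0
    hit = sum(is_addressed(q, response) for q in relevant)
    return hit / len(relevant)

subs = [SubQuestion("what retrieval method is used", "core"),
        SubQuestion("why is sub-question coverage useful", "core"),
        SubQuestion("what is retrieval-augmented generation", "background")]
answer = "The system uses dense retrieval; coverage of sub-questions flags incomplete answers."
print("core coverage:", coverage(subs, answer, "core"))
```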

"๐Ÿง  Breaking Research ๐Ÿง  Solving the LLM "Goldilocks Problem" Introducing Auto-CEI: A breakthrough training method that helps train LLMs find the sweet spot between overconfident (plausible but incorrect) hallucinations and overcautious (I dont know) refusals. ๐Ÿ”— Full paper: ๐Ÿงต Research review: ๐Ÿ‘‡ #LLMResearch #TrustedAI https://ar5iv.org/abs/2410.07627 https://ar5iv.org/abs/2410.07627"
X Link 2024-10-30T21:51Z 19.1K followers, [----] engagements

"๐Ÿš€Introducing Moirai-MoE:๐Ÿš€ the first mixture-of-experts time series foundation model a breakthrough in universal forecasting Moirai-MoE achieves token-level model specialization autonomously delivering an impressive 17% performance boost over its predecessor Moirai at the same model size. Plus it outperforms other foundation models with up to 65x fewer activated parameters ๐Ÿ’ชDive deeper: ๐Ÿ“„ Paper: ๐Ÿ’ป Code: ๐Ÿค— Models: ๐Ÿ”ฌ Blog: ๐Ÿงต Technical details: ๐Ÿ‘‡ (1/6) Compared to our previous model Moirai using multi-heuristic-defined input/output projection layers to model time series with different"
X Link 2024-11-08T20:54Z 19.1K followers, 42.2K engagements

"๐Ÿš€ Introducing GIFT-Eval: ๐ŸŽThe new gold standard in time series forecasting evaluation 144K+ time series. [--] datasets. One benchmark to rule them all. Dive in: Paper Blog: ๐Ÿง : Github Dataset Leaderboard Our comprehensive GIFT-Eval tests models across ALL domains frequencies & prediction lengths - from zero-shot to full-shot scenarios. Help us advance innovation in AI time series research #TimeSeries #Forecasting #DataScience .and dive into our researcher's technical thread here: ๐Ÿงต๐Ÿ‘‡ https://bit.ly/3ACSshZ https://bit.ly/3ANNTBv https://bit.ly/3YICI59 https://sforce.co/3ALOcwE"
X Link 2024-11-12T18:35Z 19.1K followers, 15K engagements

"Another amazing #EMNLP2024 comes to a close but we at @SFReseach #NeverStopLearning. Missed us in Miami Bookmark save and explore the research below. Thanks @emnlpmeeting -- what an incredible week #Salesforce #AIResearch #NLP ----- Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems ๐Ÿ”– Paper: Learning Planning-based Reasoning by Trajectories Collection and Process Reward Synthesizing ๐Ÿ”– Paper: DataNarrative: Automated Data-Driven Storytelling with Visualizations and Texts ๐Ÿ”– Paper: FOLIO: Natural Language Reasoning with First-Order Logic ๐Ÿ”– Paper: Evaluating Psychological"
X Link 2024-11-17T00:02Z 19.1K followers, [----] engagements

"๐Ÿ“Š Meet LaTent Reasoning Optimization (LaTRO):๐Ÿ“Š A principled variational approach to optimize LLM reasoning: ๐Ÿ’ฅ Paper: ๐Ÿ’ฅ Code: By treating reasoning as sampling from a latent distribution LaTRO improves zero-shot math accuracy by 12.5% over base modelsno external rewards needed. Implement self-rewarding reasoning in your models today #AIResearch #DeepLearning https://bit.ly/3YUoQVF https://bit.ly/3YUoP43 https://bit.ly/3YUoQVF https://bit.ly/3YUoP43"
X Link 2024-11-18T13:51Z 19.1K followers, 10.1K engagements
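
For readers who want the shape of the idea: "treating reasoning as sampling from a latent distribution" usually means optimizing a variational lower bound on the answer likelihood, with the reasoning trace as the latent variable. A generic form of such a bound (our paraphrase as a sketch, not the paper's exact objective or notation):

$$
\log p_\theta(y \mid x) \;\ge\; \mathbb{E}_{z \sim q_\phi(z \mid x, y)}\big[\log p_\theta(y \mid x, z)\big] \;-\; \mathrm{KL}\big(q_\phi(z \mid x, y)\,\|\,p_\theta(z \mid x)\big)
$$

where $x$ is the prompt, $y$ the answer, and $z$ a sampled reasoning trace; the answer log-likelihood term plays the role of the "self-reward" the post alludes to.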

"๐ŸŒณ๐ŸŒณ๐ŸŒณIntroducing "CodeTree"๐ŸŒณ๐ŸŒณ๐ŸŒณ The first unified framework combining tree-based strategy exploration + execution feedback + LLM agent guidance for code generation. ๐Ÿ–‡ Paper: ๐Ÿ“ˆ Setting new standards with GPT-4: 95.1% HumanEval 98.7% MBPP 43.0% CodeContests Why CodeTree works: ๐ŸŒณ Tree structure unifies strategy planning implementation & refinement ๐ŸŒณ Novel Critic Agent guides search & pruning ๐ŸŒณ Combines execution feedback + LLM reasoning ๐ŸŒณ Breakthrough on complex tasks (27.6% on SWEBench) Our framework enables efficient exploration of coding strategies and multi-stage refinement"
X Link 2024-12-02T22:22Z 19.1K followers, 16.3K engagements
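
The post above sketches the loop: explore candidate solutions as a tree, execute candidates against tests, and let a critic decide which branches to expand. A compact, hypothetical version of that control loop, assuming stubbed `generate_children`, `run_tests`, and `critic_score` callables (not the released CodeTree implementation):

```python
# Toy tree-search loop over candidate programs, in the spirit of the
# CodeTree description: generate children, run tests, and let a critic
# score which node to expand next. All callables are placeholders for
# LLM/agent calls.
import heapq
from typing import Callable, List, Optional, Tuple

def tree_search(root_prompt: str,
                generate_children: Callable[[str], List[str]],
                run_tests: Callable[[str], bool],
                critic_score: Callable[[str], float],
                max_nodes: int = 20) -> Optional[str]:
    # Best-first frontier keyed on critic score (negated for heapq's min-heap).
    frontier: List[Tuple[float, str]] = [(-critic_score(root_prompt), root_prompt)]
    expanded = 0
    while frontier and expanded < max_nodes:
        _, node = heapq.heappop(frontier)
        expanded += 1
        for candidate in generate_children(node):
            if run_tests(candidate):   # execution feedback: accept once tests pass
                return candidate
            heapq.heappush(frontier, (-critic_score(candidate), candidate))
    return None
```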

"๐ŸŒณ๐ŸŒณ๐ŸŒณ Take a closer look at CodeTree ๐ŸŒณ๐ŸŒณ๐ŸŒณ 1/6 Dive deep into our new framework for code generation with large language models (LLMs) combining multi-agent collaboration with an efficient tree search strategy. Code: Paper: Technical thread :๐Ÿ‘‡ https://bit.ly/3Vo0Au0 https://bit.ly/3Vo0AKw https://bit.ly/3Vo0Au0 https://bit.ly/3Vo0AKw"
X Link 2024-12-04T23:43Z 19.1K followers, [----] engagements

"๐Ÿ”ฌ๐Ÿ”ฌ๐Ÿ”ฌIntroducing ProVision: A new system for transforming images into verified instruction data for multimodal language models (MLMs) at massive scale Scene graphs + programmatic synthesis generate 10M+ diverse automated Q&A pairs. Fully verifiable. Training MLMs Dive in: ๐Ÿ“ฐBlog: ๐Ÿ—žPaper: ๐Ÿ’ปDataset: ๐Ÿ‘‡Researchers ๐Ÿงต๐Ÿ‘‡ (1/6) Why build ProVision Training multimodal LMs demands massive instruction datasets - pairing images with Q&As. Manual creation is costly while using existing models risks hallucinations. ProVision's novel solution Scene graphs + human-written programs. We represent images"
X Link 2025-01-08T21:45Z 19.1K followers, 20.8K engagements
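
The core recipe described in the thread, generating QA pairs programmatically from a scene graph, can be illustrated with a tiny example; the scene-graph schema and the two question templates below are invented for this sketch and are not ProVision's actual formats:

```python
# Toy programmatic QA synthesis from a scene graph, illustrating the
# "scene graphs + human-written programs" recipe. The schema and templates
# are made up for this sketch.
from collections import Counter

scene_graph = {
    "objects": [
        {"id": 0, "name": "dog", "attributes": ["brown"]},
        {"id": 1, "name": "dog", "attributes": ["white"]},
        {"id": 2, "name": "ball", "attributes": ["red"]},
    ],
    "relations": [{"subject": 0, "predicate": "chasing", "object": 2}],
}

def counting_questions(graph):
    counts = Counter(obj["name"] for obj in graph["objects"])
    for name, count in counts.items():
        yield (f"How many {name}s are in the image?", str(count))

def relation_questions(graph):
    objs = {o["id"]: o["name"] for o in graph["objects"]}
    for rel in graph["relations"]:
        yield (f"What is the {objs[rel['subject']]} {rel['predicate']}?",
               objs[rel["object"]])

qa_pairs = list(counting_questions(scene_graph)) + list(relation_questions(scene_graph))
print(qa_pairs)  # answers are verifiable directly against the scene graph
```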

"๐ŸŒฎ Introducing ๐ŸŒฎ TACO - our new family of multimodal action models that combine reasoning with real-world actions to solve complex visual tasks ๐Ÿ“ŠResults: 20% gains on MMVet 3.9% average improvement across [--] benchmarks 1M+ synthetic CoTA traces in training ๐Ÿ”“ ๐Ÿ”“๐Ÿ”“Fully open-sourced ๐Ÿ”“๐Ÿ”“๐Ÿ”“ Get started with: ๐Ÿ“„ Paper: ๐Ÿ’ป Code: ๐Ÿ“ฑ Demo: ๐Ÿค– Models: ๐Ÿ“š Datasets: ๐Ÿงต .and our Technical deep-dive starts here (1/4) How does TACO work ๐Ÿค” โ›“TACO answers complex questions by generating Chains-of-Thought-and-Action (CoTA) executing intermediate actions with external tools such as OCR calculator and depth"
X Link 2025-01-09T22:47Z 19.1K followers, 70.7K engagements
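
A Chain-of-Thought-and-Action trace, as described above, interleaves reasoning steps with tool calls whose results feed the next step. A bare-bones interpreter for such a trace; the trace format and tool registry are illustrative assumptions, not TACO's actual schema, and a real model would emit the actions itself:

```python
# Minimal Chain-of-Thought-and-Action interpreter sketch.
TOOLS = {
    # Toy calculator; eval with stripped builtins, unsafe for untrusted input.
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),
    # Placeholder OCR stub that pretends to read a number off an image.
    "ocr": lambda image_path: "42",
}

trace = [
    {"thought": "Read the number shown in the receipt image.",
     "action": ("ocr", "receipt.png")},
    {"thought": "Add 8% sales tax to the OCR result.",
     "action": ("calculator", "42 * 1.08")},
]

def run_cota(trace):
    observations = []
    for step in trace:
        tool, arg = step["action"]
        result = TOOLS[tool](arg)          # execute the intermediate action
        observations.append((step["thought"], tool, result))
    return observations

for thought, tool, result in run_cota(trace):
    print(f"{thought} -> {tool} -> {result}")
```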

"๐Ÿ”ฌWere so excited about TACO. our new open sourced multimodal model family that excels at complex visual reasoning tasks requiring multiple steps and external tools ๐Ÿ“Š The results speak for themselves: ๐ŸŒฎ30-50% accuracy boost vs. few-shot CoTA prompting ๐ŸŒฎUp to 20% improvement on MMVet benchmark ๐ŸŒฎConsistent outperformance across [--] benchmarks but check out our new blog that brings it to life and a great write-up by @Marktechpost Ready to get started ๐Ÿ“„ Paper: ๐Ÿ’ป Code: ๐Ÿ“ฑ Demo: ๐Ÿค– Models: ๐Ÿ“š Datasets: ๐Ÿงต Research thread https://bit.ly/3Pxtzbv https://bit.ly/4j2ZG0h https://bit.ly/3PwrEE2"
X Link 2025-01-17T00:12Z 19.1K followers, [----] engagements

"๐Ÿšจ๐Ÿšจ๐ŸšจJust released๐Ÿšจ๐Ÿšจ๐Ÿšจ ๐Ÿš€Introducing the Salesforce Code Embedding Model Family (SFR-Embedding-Code) ranked #1 on CoIR Benchmark ๐Ÿš€ Available in [--] sizes: 2B 400M. Key Highlights: [--] 2B Model: Achieves #1 on CoIR. 2400M Model: Best-performing model under 0.5B parameters. [--] Multi-lingual multi-task unified training framework for code retrieval [--] Supports [--] programming languages including Python Java C++ JavaScript C# and more ๐Ÿง‘๐Ÿ’ปโœจEmpower your next AI Coding Agent with the best code embedding models ๐Ÿง‘๐Ÿ’ปโœจ Join us in advancing #AccurateAI: ๐Ÿ“ŽPaper: ๐Ÿค—400M Model: ๐Ÿค—2B Model: #CodeAI"
X Link 2025-01-17T22:34Z 19.1K followers, 22.8K engagements
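
Code retrieval with an embedding model boils down to embedding the query and the corpus into the same space and ranking by cosine similarity. A numpy sketch of that pattern, where the `embed` function is a hash-based placeholder rather than an actual SFR-Embedding-Code encoder:

```python
# Embedding-based code retrieval sketch. `embed` is a toy stand-in for a
# real encoder such as the SFR-Embedding-Code models on Hugging Face.
import numpy as np

def embed(texts, dim=64):
    # Hypothetical stand-in: hash tokens into a small bag-of-words vector.
    vectors = np.zeros((len(texts), dim))
    for i, text in enumerate(texts):
        for token in text.lower().split():
            vectors[i, hash(token) % dim] += 1.0
    return vectors

def retrieve(query, snippets, top_k=2):
    corpus = embed(snippets)
    q = embed([query])[0]
    # Cosine similarity between the query and every snippet.
    sims = corpus @ q / (np.linalg.norm(corpus, axis=1) * np.linalg.norm(q) + 1e-9)
    order = np.argsort(-sims)[:top_k]
    return [(snippets[i], float(sims[i])) for i in order]

snippets = [
    "# quicksort a list\ndef quicksort(arr): ...",
    "# query all users\nSELECT * FROM users;",
    "# binary search a sorted array\ndef binary_search(arr, target): ...",
]
print(retrieve("quicksort a python list", snippets))
```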

"๐ŸŽ‰ โœ Our research on advancing AI-generated writing accepted to #CHI2025 โœ ๐ŸŽ‰ Our paper reveals how expert edits fix AI text issuesfrom clichs to purple prose creating better data for Reinforcement Learning from Human Feedback (RLHF) alignment. Thanks @acm_chi we'll see you in Yokohama Check it out #RLHFdata #AIforWriting https://arxiv.org/pdf/2409.14509 https://arxiv.org/pdf/2409.14509"
X Link 2025-01-23T22:33Z 19.1K followers, [----] engagements

"๐Ÿ“ฃ From efficient key caches and multimodal embeddings to self-improving reasoning and faithful context adherence. we're thrilled to present a broad range of powerful new research at #ICLR2025 ๐ŸŽ‰ Bookmark our accepted papers below and we'll see you in Singapore @iclr_conf ๐Ÿ”– REGENESIS: LLMs can grow into reasoning generalists via self improvement ๐Ÿง Becky Xiangyu Peng Congying Xia Xinyi Yang Caiming Xiong Jason Wu Chen Xing ๐Ÿ”–SiReRAG: Indexing Similar and Related Information for Multihop Reasoning ๐Ÿง  Nan Zhang Prafulla Choubey Alexander. Fabbri Gabriel Bernadett-Shapiro Jason Wu ๐Ÿ”–FaithEval:"
X Link 2025-01-27T04:49Z 19.1K followers, [----] engagements

"๐Ÿ”ฌAdvanced agent systems RAG evaluation instruction-following and more. Our team's accepted papers at #NAACL2025 span from professional CRM research to parallel in-context learning. ๐ŸŽ‰A huge congrats to our researchers and thanks to @naacl we're excited to share and discuss with the community this spring ๐Ÿ’ซ ๐Ÿ‘‡๐Ÿ“‘Bookmark and explore the research below ๐Ÿ“‘๐Ÿ‘‡ ๐Ÿ“ŽCRMArena: Understanding the Capacity of LLM Agents to Perform Professional CRM Tasks in Realistic Environments: ๐Ÿ‘Steeve Huang Akshara Prabhakar Sidharth Dhawan Yixin Mao Huan Wang Silvio Savarese Caiming Xiong Philippe Laban Chien-Sheng"
X Link 2025-01-27T21:16Z 19.1K followers, [----] engagements

"We built SFR-Embedding-Code to bridge a critical gap: While text retrieval has advanced rapidly code retrieval needed specialized attention. Our open-source models achieve SOTA results by learning from diverse code and text tasks and supporting [--] programming languages. See why SFR-Embedding is the Top-1 model on the CoIR Leaderboard ๐Ÿฅ‡ #CodeRetrieval #AIforDevelopers ๐Ÿ“– Read more in our latest blog: For the models and more: ๐Ÿค—400M Model: ๐Ÿค—2B Model: ๐Ÿ†CoIR Leaderboard: ๐Ÿ“„Technical Report: https://bit.ly/4gSZteu https://bit.ly/3CkgRKj https://bit.ly/3PCqxmp https://bit.ly/4jhDRdp"
X Link 2025-01-31T00:36Z 19.1K followers, [----] engagements

"๐Ÿ”„ PerfCodeGen: When LLMs learn from their own code execution. Our training-free framework outperforms human solutions in up to 67% of coding tasks by doing what great developers do - test analyze refine repeat. ๐Ÿ“Š Paper: ๐Ÿง‘๐Ÿ’ป Code: ๐Ÿ“ฐ MarkTechPost: ๐Ÿงต Researcher's walk-through๐Ÿ‘‡ #EfficientAI #CodeGeneration https://bit.ly/4jEVGDp https://bit.ly/4akP20J https://bit.ly/4jmH5wb https://bit.ly/4jEVGDp https://bit.ly/4akP20J https://bit.ly/4jmH5wb"
X Link 2025-02-03T21:47Z 19.1K followers, [----] engagements
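
"Test, analyze, refine, repeat" maps onto a simple control loop: run the candidate program against unit tests, and if anything fails, hand the failure back to the generator and try again. A generic, hypothetical version of that loop (not the actual PerfCodeGen pipeline; `generate` stands in for an LLM call):

```python
# Generic execution-feedback refinement loop. Each test returns an error
# message on failure or None on success; `generate` takes the task plus
# any previous failure feedback and returns a new candidate program.
from typing import Callable, List, Optional

def refine_loop(task: str,
                tests: List[Callable[[str], Optional[str]]],
                generate: Callable[[str, Optional[str]], str],
                max_rounds: int = 3) -> Optional[str]:
    feedback = None
    for _ in range(max_rounds):
        candidate = generate(task, feedback)
        failures = [msg for test in tests if (msg := test(candidate)) is not None]
        if not failures:
            return candidate                 # all tests green: done
        feedback = "; ".join(failures)       # analyzed failures fed back to the generator
    return None                              # give up after the budget is spent
```

The same skeleton also fits performance-oriented refinement: swap the pass/fail tests for timing checks and keep iterating while the measured runtime improves.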

"โšก Meet BOLT: A novel approach to develop long chain-of-thought reasoning in LLMs without relying on knowledge distillation or extensive human annotations. ๐Ÿ“„ Three key stages: [--] LongCoT data bootstrapping via in-context learning [--] Supervised fine tuning [--] Online refinement Achieves 40%+ gains on Arena-Hard & strong results across MT-Bench WildBench & MATH500 - all with just [--] examples. *Shout out to @_akhaliq for sharing it http://arXiv.org/abs/2502.03860v1 http://arXiv.org/abs/2502.03860v1"
X Link 2025-02-07T23:34Z 19.1K followers, [----] engagements

"๐Ÿ”‰ New advances in LLM reasoning capabilities accepted for oral presentation at #ICLR2025 ๐Ÿ“Ž Paper: ReGenesis introduces a novel approach where models self-improve their reasoning through abstraction-to-concrete progression - no human supervision needed. Key findings: Self-synthesized reasoning paths Superior generalization to new tasks 6.1% improvement in OOD performance Validated across multiple model architectures Our work opens new possibilities for developing more robust and generalizable AI systems. Stay tuned for the full presentation and see you in Singapore #AIResearch #AIReasoning"
X Link 2025-02-12T17:46Z 19.1K followers, [----] engagements

"๐Ÿš€Just dropped: Reward-Guided Speculative Decoding (RSD) - our breakthrough approach that makes LLM inference up to [---] faster while IMPROVING accuracy. ๐Ÿ“„Paper: ๐Ÿ’ปCode: ๐Ÿ‘‡ Key innovations in RSD: ๐Ÿ‘‡ [--] Biased Acceleration - Unlike traditional speculative decoding methods that enforce unbiasedness RSD incorporates a controlled bias to prioritize high-reward outputs. [--] Dynamic Quality Control - Process Reward Model (PRM) acts as real-time quality gate only engaging costly target model when needed [--] Proven Optimality - Mathematically derived threshold strategy ensures optimal"
X Link 2025-02-14T00:43Z 19.1K followers, [----] engagements
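
A rough sketch of the control flow the post describes: a cheap draft model proposes each chunk, a process reward model scores it, and the expensive target model is only invoked when the score falls below a threshold. The function names are placeholders for illustration, not the released RSD code:

```python
# Reward-guided speculative decoding control flow (simplified sketch).
# draft_step / target_step / reward are hypothetical stand-ins for the
# draft model, target model, and process reward model (PRM).
def rsd_generate(prompt, draft_step, target_step, reward,
                 threshold=0.7, max_steps=32):
    output = prompt
    for _ in range(max_steps):
        chunk = draft_step(output)                 # cheap draft proposal
        if reward(output, chunk) >= threshold:
            output += chunk                        # biased acceptance of high-reward drafts
        else:
            output += target_step(output)          # fall back to the costly target model
        if output.endswith("<eos>"):
            break
    return output
```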

"๐ŸŽ‰Just Announced: "ViUniT: Visual Unit Tests for More Robust Visual Programming" has been accepted at #CVPR2025 Paper Link: Project Page: Researchers walk-through ๐Ÿ‘‡ In collaboration with @UPenn we introduce ViUniT a framework that enhances the reliability of visual programs by automatically generating unit tests by leveraging #LLMs and #DiffusionModels. Our approach: ๐Ÿ“Š Boosts model performance by 11.4% and outperforms gpt-4o-mini by 7.7%. ๐Ÿ”„ Reduces right-for-wrong-reasons errors by 40%. ๐Ÿ’ก Introduces innovative applications like best program selection answer refusal and unsupervised reward"
X Link 2025-03-04T23:00Z 19.1K followers, [----] engagements

"๐Ÿ“ฃ Introducing Text2Data open-sourced for the research community ๐Ÿ–‡ Paper: Code: ๐Ÿงช A major advancement in multimodal AI - a low-resource universal text-to-anything framework capable of bridging text with diverse modalities (molecules motion sequences time series) without costly human annotations ๐ŸŽฌ Text2Data in action: ๐ŸŽฌ Our framework first learns general data patterns from unlabeled data (blue) then fine-tunes with limited labeled examples (red) using constraint optimization to prevent forgetting. At bottom you see molecules generated with increasing polarizability levels from 'very low'"
X Link 2025-03-07T20:52Z 19.1K followers, [----] engagements

"Our paper "Can AI writing be salvaged Mitigating Idiosyncrasies and Improving Human-AI Alignment in the Writing Process through Edits" has been awarded a Best Paper Honorable Mention and is in the Top 5% of submissions for #CHI2025 ๐ŸŽ‰ Check it out here: #AI #Research #AIWriting @jasonwu0731 @TuhinChakr @PhilippeLaban https://arxiv.org/pdf/2409.14509 https://arxiv.org/pdf/2409.14509"
X Link 2025-03-27T21:06Z 19.1K followers, [----] engagements

"(1/4) Foundation models are revolutionizing time series analysisbut their success depends on large diverse high-quality datasets which poses a major challenge. Enter synthetic data reshaping Time Series Foundation Models (TSFMs) & Time Series LLMs (TSLLMs). Our survey explores how it tackles data scarcity improves model training & unlocks new research directions. ๐Ÿงต ๐Ÿ“ Paper: https://arxiv.org/abs/2503.11411 https://arxiv.org/abs/2503.11411"
X Link 2025-03-28T19:46Z 19.1K followers, [----] engagements

"๐Ÿšจ New Survey Alert ๐Ÿšจ ๐Ÿง A Survey of Frontiers in LLM Reasoning: Inference Scaling Learning to Reason and Agentic Systems ๐Ÿ“˜ Paper: ๐Ÿง  Project Page: ๐Ÿงต Researcher's thread: ๐Ÿ‘‡ (1/6) Reasoning is the key to unlocking true AI intelligence.๐Ÿ”‘ Two factors that affect the reasoning capabilities are: [--] Regime: how and at what stage is reasoning achieved [--] Architecture: what components are involved in the reasoning process โšกWe present a comprehensive survey along these two dimensions summarizing recent progress and covering: Regimes from inference scaling (e.g. OpenAI o1) to learning to reason (e.g."
X Link 2025-04-03T23:33Z 19.1K followers, [----] engagements

"๐Ÿ‘ Looking for VLMs that go beyond generators to transform multimodal embeddings Meet "VLM2VEC: Training Vision-Language Models for Massive Multimodal Embedding Tasks" ๐Ÿ“Ž Paper: ๐Ÿ’ป Website: Our #ICLR25-featured paper shows how vision language models transform into powerful embedders for classification VQA retrieval and visual grounding. We unlock strong emergent capabilities by deeply fusing vision and language rather than shallow combinations. Visit us in Singapore to see how we're redefining multimodal representation learning #MultimodalAI #VLMs https://tiger-ai-lab.github.io/VLM2Vec/"
X Link 2025-04-05T04:37Z 19.1K followers, [----] engagements

"Our xLAM (#LargeActionModels) family just got an upgrade [--] Multi-turn natural conversation support [--] Smarter multi-step reasoning [--] Models from 1B to 70B for ultimate flexibility ๐Ÿค— HuggingFace: ๐Ÿ‘‘ BFCL Leaderboard: Our research models xLAM-70B-r ranks #1 and xLAM-32B-r #2 on the BFCL function-calling leaderboardbeating GPT-4o Gemini Qwen & more. xLAM-8B-r lands at #4 ahead of GPT-4o. And our Tiny Giant xLAM-1B-r plus xLAM-3B-r outperform much larger models like Mistral-Large and DeepSeek-V3. This is just the beginningwe're building even stronger xLAM models internally to inspire future"
X Link 2025-04-18T17:34Z 19.1K followers, 10.8K engagements

"๐Ÿ“ฃ Meet: "From AI-Slop to AI-Polish" tackling the elephant in the room ๐Ÿ˜ AI writing quality is "mid" at best. Despite LLMs crushing coding their creative writing feels pedestrian. We introduce: [--] Writing Quality Benchmark (WQ): First comprehensive testbed for writing quality assessment [--] Writing Quality Reward Models (WQRM): Outperforming GPT-4o & Claude with 74% accuracy on WQ [--] Test-time compute strategies yielding text preferred by experts 66% of the time ๐Ÿ–‡ Paper: Time to raise the bar on AI-generated text beyond "coherent but clichd." #AIResearch #NLP #WritingQuality"
X Link 2025-04-25T03:21Z 19.1K followers, [----] engagements

"๐Ÿ”ฌ NEW BLOG DROP Our complete technical breakdown on small language models is now available on our research blog: Read here: ๐Ÿ” Discover our research on enterprise-ready AI that delivers powerful performance without the bloat ๐Ÿ‘€ See breakthrough results on long-context understanding at 128K tokens Math prowess revealed: 95% on GSM8K 92.5% on MATH 46.7% on AIME [----] ๐Ÿ’ป Code mastery: 41.1% on LiveCodeBench best-in-class performance ๐Ÿ’ช Our "small but long" approach proves deliberate engineering beats brute-force scalingoffering predictable costs enhanced privacy and reduced environmental impact."
X Link 2025-05-03T00:19Z 19.1K followers, [----] engagements

"Introducing APIGen-MT: Our agentic pipeline for multi-turn synthetic data generation that produces high-quality training data for tuning AI agents Try our open-sourced dataset today ๐Ÿ“Š Paper: ๐Ÿค— Dataset: We used APIGen-MT to train our xLAM-2 model family including xLAM-2-70b-fc-r still #1 on the BFCL leaderboard with 78.2% accuracy outperforming frontier models like GPT-4o and Claude [---] in function-calling tasks especially in challenging multi-turn scenarios. ๐Ÿค We're open-sourcing 5K high-quality trajectories and trained models to advance AI agent research. ๐Ÿง  xLAM Model Family: ๐Ÿ” BFCL:"
X Link 2025-05-08T23:07Z 19.1K followers, 10.7K engagements

"Excited to announce SWERank our code ranking framework for software issue localization. โžกPaper: โžกGitHub Project Page: โžกAI-Generated Podcast: โžกCode Data and Models: Coming soon (1/3) ๐Ÿงต Pinpointing the exact location of a software issue in code is a critical but often time-consuming part of software development. Current agentic approaches to localization can be slow and expensive relying on complex steps and often closed-source models. We introduce SWERank a retrieve-and-rerank framework that comprises SWERankEmbed a bi-encoder code retriever and SWERankLLM a listwise LLM code reranker."
X Link 2025-05-12T23:25Z 19.1K followers, [----] engagements

"๐ŸšจMODEL RELEASE We're thrilled to announce our powerful compact xGen-small model family now available for the research community. ๐Ÿค—Download xGen-Small model: Key highlights: xGen-9B: highly competitive on long-context understanding up to 128K tokens Exceptional math reasoning: 95.3% GSM8K 91.6% MATH 50.0% AIME [----] Superior code generation: 50.6% on LiveCodeBench Our "small but long" approach proves strategic engineering beats brute-force scaling. Full breakdown in our blog: Technical report available here: Advance your research today and tell us what you think #SLMs #EnterpriseAI"
X Link 2025-05-13T16:33Z 19.1K followers, 43.6K engagements

"We're thrilled to announce BLIP3-o a breakthrough in unified multimodal models that excels at both image understanding and generation in a single autoregressive architecture ๐Ÿ’ซ ๐Ÿ“Š Paper: ๐Ÿค— Models: ๐Ÿง  Code: ๐Ÿ“ฝ Learn on the go (AI Generated): Our research reveals that using CLIP features with diffusion transformer and flow matching creates superior performance while reducing computational complexity. Most importantly we're making this model family available to the AI Research community: Complete model implementations Model weights 25M+ detailed caption pretrain dataset 60K high-quality"
X Link 2025-05-16T17:56Z 19.1K followers, [----] engagements

"๐ŸšจWere proud to announce our #ACL2025NLP-accepted papers. Preview and bookmark the research below and well look forward to seeing you in Vienna. Thanks @aclmeeting ๐Ÿ‘‰ Turning Conversations into Workflows: A Framework to Extract and Evaluate Dialog Workflows for Service AI Agents ๐Ÿ‘‰ Unanswerability Evaluation for Retrieval Augmented Generation ๐Ÿ‘‰ Why Vision Language Models Struggle with Visual Arithmetic Towards Enhanced Chart and Geometry Understanding ๐Ÿ‘‰ Does Context Matter ContextualJudgeBench for Evaluating LLM-based Judges in Contextual Settings ๐Ÿ‘‰ What Makes a Good Natural Language"
X Link 2025-05-17T00:34Z 19.1K followers, [----] engagements

"๐ŸšจIntroducing "Elastic Reasoning"๐Ÿšจ Our novel framework solves LLM inference budget constraints without sacrificing performance. Open and available to the research community: ๐Ÿ“„ Paper: ๐Ÿ’ป Code: ๐Ÿค— Models: Key insight: Separate "thinking" and "solution" phases with independent token budgets plus budget-constrained rollout training. Research results: ๐Ÿ‘‰ E1-Math-1.5B: 35% accuracy on AIME2024 with 32% fewer tokens ๐Ÿ‘‰ E1-Code-14B: Codeforces rating of [----] (96th percentile) ๐Ÿ‘‰ Models generalize to ANY budget without retraining The framework (shown) combines GRPO training under constraints +"
X Link 2025-05-23T01:14Z 19.1K followers, [----] engagements
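
The key mechanism named in the post, separate token budgets for a "thinking" phase and a "solution" phase, can be shown as a tiny generation wrapper; the stop markers and the `generate` callable are assumptions for the sketch, not the released training or inference code:

```python
# Budgeted two-phase generation sketch: independent token budgets for the
# thinking phase and the solution phase. `generate` is a placeholder for
# any LLM call that accepts a prompt and a max_tokens cap.
def elastic_generate(problem, generate, think_budget=256, solve_budget=128):
    # Phase 1: reasoning, capped at think_budget tokens.
    thinking = generate(f"{problem}\n<think>", max_tokens=think_budget)
    # Phase 2: final answer, conditioned on the (possibly truncated) reasoning
    # and capped separately at solve_budget tokens.
    solution = generate(f"{problem}\n<think>{thinking}</think>\n<answer>",
                        max_tokens=solve_budget)
    return thinking, solution
```

Because the two budgets are independent, the same trained model can be run under any total budget at inference time simply by changing `think_budget` and `solve_budget`, which is the "generalize to ANY budget without retraining" property the post highlights.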

"๐Ÿ† Introducing MAS-ZERO: Designing Multi-Agent Systems with Zero Supervision ๐Ÿ† ๐Ÿ’ป Project Page: ๐Ÿ“„ Paper: ๐Ÿ”— Code: ๐Ÿ“š Explore 1000+ Discovered MAS designs: ๐Ÿงต Technical walk-through ๐Ÿ‘‡ (1/6) Multi-Agent Systems (MAS) can outperform single-agent approaches however designing MAS manually is difficult especially when LLM preferences differ from human intuition and manually designed MAS are hard to adapt to new tasks. โ“Can we automate MAS designeven better can we make it self-evolving without relying on a validation set Meet MAS-Zero: a meta-level inference-time self-evolving framework for"
X Link 2025-05-27T22:35Z 19.1K followers, 18.6K engagements

"๐Ÿšจ Introducing CRMArena-Pro: The first multi-turn enterprise-grade benchmark for LLM agents โœBlog: ๐Ÿ–‡Paper: ๐Ÿค—Dataset: ๐Ÿ–ฅCode: Most AI benchmarks test isolated single-turn tasks. Enterprise work is messy multi-step and demands both capability AND confidentiality ๐Ÿ”ฌBuilt with our exclusive synthetic dataset: Live Salesforce Org sandboxes with realistic expert-crafted CRM data enterprise complexity without customer exposure. What makes CRM-Pro different: ๐ŸŽฏ Multi-domain: Sales service CPQ workflows ๐Ÿ”„ Multi-turn conversations vs single exchanges ๐Ÿ”’ Confidentiality awareness testing ๐Ÿข Live CRM"
X Link 2025-05-30T00:49Z 19.1K followers, 13.7K engagements

"โšก NEW COMPUTER-USE AI RESEARCH โšก Introducing: [--] Our paper Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis 2OSWORLD-G benchmark covering fine-grained manipulation and layout understanding 3JEDI dataset our GUI grounding dataset series with 4M examples 3B and 7B model variants ๐Ÿ”—Paper: ๐Ÿง‘๐Ÿ’ปCode & Sample Usage: ๐Ÿ’ปWebsite: ๐Ÿค—Dataset: Key contributions: [---] expertly annotated samples across [--] capability dimensions Multi-perspective task decomposition (icons components layouts) SOTA performance: 91.7% on ScreenSpot-v2 54.1% on OSWORLD-G Direct impact: 5% 27% success"
X Link 2025-05-31T21:22Z 19.1K followers, [----] engagements

"๐Ÿ† #ICML2025 Best Paper Award: AI Safety Should Prioritize the Future of Work ๐Ÿ“„ Paper: ๐ŸŽ‰ Congratulations to Sanchaita Hazra @hsanchaita Bodhisattwa Prasad Majumder @mbodhisattwa and Tuhin Chakrabarty @TuhinChakr for winning the Outstanding Award one of [--] top papers out of [----] accepted submissions Key insights: ๐Ÿ”ธ Comprehensive worker transition support needed ๐Ÿ”ธ AI exacerbates income inequality through labor disruption ๐Ÿ”ธ International copyright reforms & collective licensing required ๐Ÿ”ธ Pro-worker AI governance for shared prosperity @icmlconf #AIethics #FutureOfWork #AIgovernance"
X Link 2025-07-22T16:51Z 19.1K followers, [----] engagements

"๐Ÿ’ก Promptomatix: An Automatic Prompt Optimization Framework for Large Language Models ๐Ÿ’ก ๐Ÿ“„ Paper: ๐Ÿ’ป Code: ๐Ÿ˜ต๐Ÿ’ซ Have a task but experiencing prompt engineering existential dread Few-shot or zero-shot Chain-of-thought or ReAct Where do I get examples Should I label data How do I evaluate What metrics Manual feedback or auto-looping Why does one word change everything Promptomatix eliminates the entire decision tree. Describe task receive optimized prompt question nothing. Sanity restored โœจ #LLMs #LargeLanguageModels #FutureOfAI #EnterpriseAI https://bit.ly/4lLjQgd https://bit.ly/44IAvuO"
X Link 2025-07-23T17:16Z 19.1K followers, [----] engagements
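
Automatic prompt optimization, at its simplest, is a search loop: propose candidate prompts, score each on a small development set, keep the best. A stripped-down sketch of that loop under those assumptions (not the Promptomatix implementation; `propose_variants` and `run_model` are placeholders for LLM calls):

```python
# Bare-bones prompt-optimization loop. Scoring here is exact-match
# accuracy on a tiny dev set of (input, expected_output) pairs.
def optimize_prompt(seed_prompt, dev_set, propose_variants, run_model, rounds=3):
    def score(prompt):
        correct = sum(run_model(prompt, x) == y for x, y in dev_set)
        return correct / len(dev_set)

    best_prompt, best_score = seed_prompt, score(seed_prompt)
    for _ in range(rounds):
        for candidate in propose_variants(best_prompt):
            s = score(candidate)
            if s > best_score:
                best_prompt, best_score = candidate, s
    return best_prompt, best_score
```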

"๐ŸŒŸ Excited to present our work at Empirical Methods in Natural Language Processing @emnlpmeeting - a leading conference in NLP and AI research ๐Ÿ“„ Our accepted papers: Topic-Guided Reinforcement Learning with LLMs for Enhancing Multi-Document Summarization ๐Ÿ‘ฅAuthors: Chuyuan Li @ChuyuanLi Austin Xu @austinsxu Shafiq Joty @JotyShafiq and Giuseppe Carenini @careninigiusepp ๐Ÿ“Paper: Demystifying Domain-adaptive Post-training for Financial LLMs ๐Ÿ‘ฅAuthors: Zixuan Ke @KeZixuan Yifei Ming @ming5_alvin Xuan-Phi Nguyen Caiming Xiong @CaimingXiong Shafiq Joty @JotyShafiq ๐Ÿ“Paper: CEMTM: Contextual"
X Link 2025-08-23T17:22Z 19.1K followers, [----] engagements

"โšก The era of AI agents that just chat is over. @Salesforce just introduced GTA1 - Computer Use Agents that actually CLICK SCROLL and WORK in your enterprise software like a human would. ๐Ÿ‘‰ ๐ŸŽฏ The results are game-changing: โžก 50.1% success on enterprise UIs โžก Outperforms models 10x larger โžก Beats OpenAI's CUA in half the steps โžก Built with enterprise trust & security No more "sorry I can't click that button" - these agents navigate CRMs update records and complete real workflows. The future of work isn't just AI that thinks. It's AI that ACTS. #EnterpriseAI #FutureOfAI"
X Link 2025-08-25T13:54Z 19.1K followers, [----] engagements

"Looking for the cutting-edge of AI research Follow Salesforce AI Research to see how we're transforming enterprise technology through advanced innovations. From world models to agentic systems discover the future of AI before it hits the market"
X Link 2025-09-11T14:21Z 19.1K followers, 2.4M engagements

"๐Ÿ“ฃ Webscale-RL: Automated Data Pipeline for Scaling RL Data to Pretraining Levels ๐Ÿ“ฃ RL for LLMs faces a critical data bottleneck: existing RL datasets are 10B tokens while pretraining uses 1T tokens. Our Webscale-RL pipeline solves this by automatically converting pretraining documents into 1.2M verifiable QA pairs across 9+ domains. ๐Ÿ“„ Paper: ๐Ÿ’ป Code: ๐Ÿ“Š Dataset: Results: [---] more token-efficient than continual pretraining with significant performance gains on MMLU-pro BigBench and mathematical reasoning benchmarks ๐Ÿ“ˆ Work by Zhepeng Cen (@zhepengcen) Haolin Chen (@HaolinChen11) Shiyu Wang"
X Link 2025-10-10T21:32Z 19.1K followers, 16.6K engagements

"Introducing Enterprise Deep Research (EDR): A steerable multi-agent system that transforms complex enterprise research into comprehensive actionable reports ๐Ÿ“Š EDR combines [--] key components: ๐Ÿง  Master Planning Agent for adaptive query decomposition ๐Ÿ” [--] specialized search agents (General Academic GitHub LinkedIn) ๐Ÿ›  Extensible MCP-based tools (NL2SQL file analysis enterprise workflows) ๐Ÿ“ˆ Visualization Agent for data-driven insights ๐Ÿ”„ Reflection mechanism with optional human-in-the-loop guidance Results on open benchmarks: โœ… Outperforms SOTA on DeepResearch Bench (49.86 score) โœ… 71.57% win"
X Link 2025-10-24T21:14Z 19.1K followers, [----] engagements
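
The component list above reads as a planner-dispatcher pattern: a master agent decomposes the query, routes sub-tasks to specialized search agents, and an aggregation step assembles the report. A toy routing skeleton in that spirit; every agent here is a stub, and none of the names come from the EDR codebase:

```python
# Toy planner / specialist-agent routing skeleton. A real system would back
# each specialist with its own search tool or LLM and synthesize a report.
SPECIALISTS = {
    "academic": lambda q: f"[academic results for: {q}]",
    "github":   lambda q: f"[github results for: {q}]",
    "general":  lambda q: f"[web results for: {q}]",
}

def plan(query):
    # Stand-in for the master planning agent: naive keyword routing.
    routes = []
    if "paper" in query or "benchmark" in query:
        routes.append(("academic", query))
    if "code" in query or "repo" in query:
        routes.append(("github", query))
    routes.append(("general", query))
    return routes

def deep_research(query):
    findings = [SPECIALISTS[name](sub_q) for name, sub_q in plan(query)]
    return "\n".join(findings)   # aggregation step

print(deep_research("find the paper and code for agentic deep research benchmarks"))
```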

"Were thrilled to announce that @MetaMindIO has been acquired by @Salesforce https://www.metamind.io/salesforce-acquisition https://www.metamind.io/salesforce-acquisition"
X Link 2016-04-04T20:35Z 19.1K followers, [---] engagements
