[GUEST ACCESS MODE: Data is scrambled or limited to provide examples. Make requests using your API key to unlock full data. Check https://lunarcrush.ai/auth for authentication information.]

# ![@omarsar0 Avatar](https://lunarcrush.com/gi/w:26/cr:twitter::3448284313.png) @omarsar0 elvis

elvis posts on X most often about context engineering, llm, devs, and realtime. They currently have XXXXXXX followers and 1148 posts still getting attention, totaling XXXXXX engagements in the last XX hours.

### Engagements: XXXXXX [#](/creator/twitter::3448284313/interactions)
![Engagements Line Chart](https://lunarcrush.com/gi/w:600/cr:twitter::3448284313/c:line/m:interactions.svg)

- X Week XXXXXXX -XX%
- X Month XXXXXXXXX +1.60%
- X Months XXXXXXXXXX +77%
- X Year XXXXXXXXXX +110%

### Mentions: XX [#](/creator/twitter::3448284313/posts_active)
![Mentions Line Chart](https://lunarcrush.com/gi/w:600/cr:twitter::3448284313/c:line/m:posts_active.svg)

- X Week XX -XX%
- X Month XXX +28%
- X Months XXX +16%
- X Year XXX +200%

### Followers: XXXXXXX [#](/creator/twitter::3448284313/followers)
![Followers Line Chart](https://lunarcrush.com/gi/w:600/cr:twitter::3448284313/c:line/m:followers.svg)

- X Week XXXXXXX +0.46%
- X Month XXXXXXX +1.90%
- X Months XXXXXXX +14%
- X Year XXXXXXX +30%

### CreatorRank: XXXXXXX [#](/creator/twitter::3448284313/influencer_rank)
![CreatorRank Line Chart](https://lunarcrush.com/gi/w:600/cr:twitter::3448284313/c:line/m:influencer_rank.svg)

### Social Influence [#](/creator/twitter::3448284313/influence)
---

**Social category influence**
[musicians](/list/musicians)  [technology brands](/list/technology-brands)  [stocks](/list/stocks)  [finance](/list/finance) 

**Social topic influence**
[context engineering](/topic/context-engineering) #14, [llm](/topic/llm) #169, [devs](/topic/devs) #98, [realtime](/topic/realtime) #268, [leaderboard](/topic/leaderboard), [categories](/topic/categories), [all the](/topic/all-the), [xai](/topic/xai), [context window](/topic/context-window), [pay attention](/topic/pay-attention)

**Top assets mentioned**
[Alphabet Inc Class A (GOOGL)](/topic/$googl)

### Top Social Posts [#](/creator/twitter::3448284313/posts)
---
Top posts by engagements in the last XX hours

"Small Language Models are the Future of Agentic AI Lots to gain from building agentic systems with small language models. Capabilities are increasing rapidly AI devs should be exploring SLMs. Here are my notes:"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1940038438746718698) 2025-07-01 13:23:02 UTC 255.6K followers, 267.3K engagements


"Challenges & Future Work Challenges for future research include reducing LLM cost/latency (especially for real-time tasks) leveraging underused data like traces improving model adaptability to software evolution and integrating LLMs with existing AIOps toolchains instead of replacing them. Paper:"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1946997411471323336) 2025-07-20 18:15:31 UTC 255.6K followers, 5550 engagements


"Context engineering is going to evolve rapidly. But this is a great overview to better map and keep track of this rapidly evolving landscape. There is a lot more in the paper. Over 1000+ references included. This survey tries to capture the most common methods and biggest trends but there is more on the horizon as models continue to improve in capability and new agent architectures emerge"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1946241716467716455) 2025-07-18 16:12:39 UTC 255.6K followers, 7265 engagements


"Context engineering components include context retrieval and generation context processing context management and how they are all integrated into systems implementation such as RAG memory architectures tool-integrated reasoning and multi-agent coordination mechanisms"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1946241627582054854) 2025-07-18 16:12:18 UTC 255.6K followers, 5775 engagements


"Excited to announce my new short course: Building Agentic Applications with Replit Agent and n8n. With AI this capable I believe anyone can become a builder. The stack I use here will teach you how to rapidly build agentic apps with no-code tools"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1943410079199629470) 2025-07-10 20:40:44 UTC 255.6K followers, 41.8K engagements


"Evaluating LLM-based Agents This report has a comprehensive list of methods for evaluating AI Agents. Don't ignore evals. If done right they are a game-changer. Highly recommend it to AI devs. (bookmark it)"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1939691782477902313) 2025-06-30 14:25:33 UTC 255.5K followers, 95.8K engagements


"An ablation study shows the importance of explicitly including tool names and I/O descriptions in Routine steps. Removing tool names dropped Qwen3-14B's accuracy from XXXX% to 71.9%. Adding I/O fields provided minor gains especially for less capable models"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1947750835334877352) 2025-07-22 20:09:21 UTC 255.6K followers, XXX engagements


"Tool-calling capabilities are an area of continuous development in the space. The paper provides an overview of tool-augmented language model architectures and how they compare across tool categories"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1946241703893189017) 2025-07-18 16:12:36 UTC 255.6K followers, 4014 engagements


"Better discrimination among high performers REST exposes significant performance gaps between models that perform similarly on single-question tasks. For instance R1-7B and R1-32B differ by only XXX% on MATH500 in single settings but diverge by over XX% under stress"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1945150455161741735) 2025-07-15 15:56:22 UTC 255.6K followers, XXX engagements


"Anthropic is killing it with these technical posts. If you're an AI dev stop what you are doing and go read this. It shows in great detail how to implement an effective multi-agent research system. Pay attention to these key parts:"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1933941545675206936) 2025-06-14 17:36:10 UTC 255.6K followers, 570.4K engagements


"Stress Testing Large Reasoning Models This looks like a more interesting way to evaluate large reasoning models. Presents multiple reasoning problems in a single prompt to better represent real-world scenarios. Which are the best models at this? Here are my notes:"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1945150414195974448) 2025-07-15 15:56:12 UTC 255.6K followers, 17.4K engagements


"The framework separates planning (with LLMs) from execution (with small instruction-tuned models) using Routine as the bridge. This enables small-scale models to reliably execute complex plans with minimal resource overhead especially when using variable memory and modular tools like MCP server"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1947750820403175918) 2025-07-22 20:09:17 UTC 255.6K followers, XXX engagements


"One Token to Fool LLM-as-a-Judge Watch out for this one devs Semantically empty tokens like Thought process: Solution or even just a colon : can consistently trick models into giving false positive rewards. Here are my notes:"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1944778174493343771) 2025-07-14 15:17:03 UTC 255.6K followers, 86.5K engagements


"Context Rot The research evaluates how state-of-the-art LLMs perform as input context length increases challenging the common assumption that longer contexts are uniformly handled. Testing XX top models (including GPT-4.1 Claude X Gemini XXX Qwen3) the authors show that model reliability degrades non-uniformly even on simple tasks as input grows what they term "context rot.""  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1946607741021323484) 2025-07-19 16:27:06 UTC 255.6K followers, 8286 engagements


"What's Hud? A lightweight Runtime Code Sensor that installs in X min. No config. No dashboards. No tracing. Just real-time visibility into: Performance Errors Flows Function paths Dependencies All directly in your AI tools"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1947288030706077822) 2025-07-21 13:30:20 UTC 255.6K followers, XXX engagements


"Context Engineering Guide I'm writing a detailed guide on context engineering for AI devs. v1 is out now (bookmark it) I use a concrete deep research multi-agent example to show what context engineering involves"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1941566132001153082) 2025-07-05 18:33:33 UTC 255.6K followers, 289.3K engagements


"AI Research Agents for ML Achieves state-of-the-art on MLE-bench lite Using AI to automate the training of ML models is one of the most exciting and promising areas of research today. Lots of cool ideas in this paper:"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1942235421607682317) 2025-07-07 14:53:04 UTC 255.6K followers, 41.4K engagements


"Gemini CLI with MCP servers is a match made in heaven It's amazing for coding use cases. But it's also great at other creative tasks like transcribing and writing. Just watch to see what I am talking about:"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1942418143609033115) 2025-07-08 02:59:08 UTC 255.5K followers, 52.9K engagements


"Top XX LLM Interview Questions. Looks like a great resource to learn LLM basics:"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1930984834454712537) 2025-06-06 13:47:15 UTC 255.6K followers, 354.9K engagements


"Agent Leaderboard v2 is here GPT-4.1 leads Gemini-2.5-flash excels at tool selection Kimi K2 is the top open-source model Grok X falls short Reasoning models lag behind No single model dominates all domains More below:"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1945956442785083895) 2025-07-17 21:19:04 UTC 255.6K followers, 272.3K engagements


"Context Rot Great title for a report but even better insights about how increasing input tokens impact the performance of top LLMs. Banger report from Chroma. Here are my takeaways (relevant for AI devs):"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1946607725796045287) 2025-07-19 16:27:02 UTC 255.6K followers, 168.3K engagements


"Future of Work with AI Agents Stanford's new report analyzes what 1500 workers think about working with AI Agents. What types of AI Agents should we build? A few surprises. Let's take a closer look:"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1936134951520682123) 2025-06-20 18:51:58 UTC 255.6K followers, 300.8K engagements


"What comes after Cursor? This new agentic IDE Kiro offers a glimpse at that future. Kiro comes with all the fun features in an agentic IDE deliberate planning and leverages ambient agents that autonomously collaborate with devs as they build production-grade systems"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1945134066551980278) 2025-07-15 14:51:14 UTC 255.6K followers, 46.9K engagements


"LLM method taxonomy Five categories of LLM-based approaches are identified: foundation models fine-tuning embeddings prompts and knowledge-based methods (like RAG and tool use). Prompt-based and retrieval-augmented methods dominate practical deployments while foundation models increasingly support generalizable anomaly detection and log understanding"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1946997383352713419) 2025-07-20 18:15:24 UTC 255.6K followers, 1395 engagements


"Grok X on Vending Bench Grok X gets the #1 spot. Double the net worth of Claude Opus 4"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1943170235789398331) 2025-07-10 04:47:41 UTC 255.6K followers, 41.9K engagements


"BREAKING: xAI announces Grok X "It can reason at a superhuman level" Here is everything you need to know:"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1943162144930828397) 2025-07-10 04:15:32 UTC 255.6K followers, 1.3M engagements


"MemAgent MemAgent-14B is trained on 32K-length documents with an 8K context window. Achieves XX% accuracy even at 3.5M tokens That consistency is crazy Here are my notes:"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1942667308368871457) 2025-07-08 19:29:13 UTC 255.6K followers, 100.4K engagements


"The paper provides a taxonomy of context engineering in LLMs categorized into foundational components system implementations evaluation methodologies and future directions"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1946241581415272904) 2025-07-18 16:12:07 UTC 255.6K followers, 7477 engagements


"Long2Short training improves resilience Models trained with concise reasoning objectives (e.g. L1-Qwen and Efficient-R1 variants) are significantly more robust under REST preserving performance by avoiding the overthinking trap. For example L1-Qwen-1.5B-Max maintains XXXX% accuracy on MATH500 under stress outperforming similarly sized baselines"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1945150471959929048) 2025-07-15 15:56:26 UTC 255.6K followers, XXX engagements


"The work distinguishes prompt engineering from context engineering on dimensions like state scalability error analysis complexity etc"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1946241611903762897) 2025-07-18 16:12:14 UTC 255.6K followers, 5965 engagements


"The context engineering evolution timeline from 2020 to 2025 involves foundational RAG systems to complex multi-agent architectures"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1946241597148123515) 2025-07-18 16:12:10 UTC 255.6K followers, 6960 engagements


"Assisted Remediation Categorization of LLM-enabled assisted remediation tasks in AIOps by increasing levels of automation. Tasks range from assisted questioning to mitigation solution generation command recommendation script generation and finally automatic execution. This highlights how LLMs progressively reduce human intervention in operational workflows"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1946997367439499348) 2025-07-20 18:15:20 UTC 255.6K followers, 1478 engagements


"A Survey of Latent Reasoning Nice overview on the emerging field of latent reasoning. Great read for AI devs. (bookmark it)"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1942976772724695513) 2025-07-09 15:58:56 UTC 255.6K followers, 72.7K engagements


"YC on the key prompting techniques used by the best AI startups:"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1928562249297211600) 2025-05-30 21:20:45 UTC 255.6K followers, 665.6K engagements


"Great work from the team. I care about efficiency in my work. GPT-4.1-mini being a lot cheaper and performant at the same time is exciting. The fact that GPT-4.1 tops the leaderboard for now makes a lot of sense as it is one of the top models for instruction following which is a key capability in building agents with superior tool calling. Perhaps the reasoning models weren't great but I think they can only get better at tool calling so it will be interesting to track how the leaderboard progresses as new reasoning models roll out. Kimi K2 is such an exciting open-source release. It packs a"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1945964232752713932) 2025-07-17 21:50:02 UTC 255.5K followers, 11.2K engagements


"Traditional agent planning approaches often fail in enterprise scenarios due to unstructured plans weak instruction following and tool selection errors. Routine provides a clear and modular format for LLM agents to follow multi-step plans reducing ambiguity and improving tool selection. Each step contains a step number name detailed description and (optionally) inputs outputs and tool to be called"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1947750788752937174) 2025-07-22 20:09:10 UTC 255.6K followers, XXX engagements


"Emerging evaluation strategies Beyond traditional metrics (precision F1 RMSE) the field has adopted generation metrics (BLEU BERTScore) execution metrics (e.g. success rates of generated scripts) and manual evaluation (qualitative grading human preference) to assess tasks like script generation or report explanation"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1946997399127482526) 2025-07-20 18:15:28 UTC 255.6K followers, 5844 engagements


"A Survey of Context Engineering 160+ pages covering the most important research around context engineering for LLMs. This is a must-read Here are my notes:"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1946241565728600503) 2025-07-18 16:12:03 UTC 255.6K followers, 195.4K engagements


"Context Engineering Guide is now part of the Prompt Engineering Guide. 🔥 Nicer format. We've also been writing guides on other fire topics such as deep research reasoning LLMs and image generation. I will be expanding the guide further in the coming days. Stay tuned"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1942581621171105820) 2025-07-08 13:48:44 UTC 255.6K followers, 36K engagements


"AI for Scientific Search AI for Science is where I spend most of my time exploring with AI agents. This 120+ pages report does a good job of highlighting why all the big names like OpenAI and Google DeepMind are pursuing AI4Science. Bookmark it My notes below:"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1940787135596187970) 2025-07-03 14:58:05 UTC 255.6K followers, 61.6K engagements


"💯 This is why I have been teaching my students about a more modular approach to building with AI agents. You need to be able to scope a project well and have a good framework for testing and evaluating workflows. Not every task needs an LLM but this is hard to determine if you are trying to solve the whole problem in one go and haven't thought about how the components play with each other. Always build with the intention to iterate. The easier and faster you can iterate the more you reap the rewards"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1946998463813591526) 2025-07-20 18:19:41 UTC 255.6K followers, 1707 engagements


"The Illusion of Thinking in LLMs Apple researchers discuss the strengths and limitations of reasoning models. Apparently reasoning models "collapse" beyond certain task complexities. Lots of important insights on this one. (bookmark it) Here are my notes:"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1931333830985883888) 2025-06-07 12:54:02 UTC 255.6K followers, 953.9K engagements


"Agentic-R1 This 7B model is surprisingly good at interleaved tool use and reasoning capabilities. It's fun to see small language models improving this fast. Knowledge distillation in full display. Here are my notes:"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1945863581918257591) 2025-07-17 15:10:04 UTC 255.6K followers, 61.6K engagements


"Every AI dev should know how to apply everything on this list. From prompting tips to context engineering to metaprompting. Learn it once apply it everywhere. Check out my new 4+ hrs course (with code examples) to learn more:"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1938602425386152296) 2025-06-27 14:16:50 UTC 255.5K followers, 49.2K engagements


"You can read the full paper below: Want to take it a step further? Learn about context engineering and how to build effective agentic systems in my courses: We also have a workshop on context engineering coming soon"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1946241728316653990) 2025-07-18 16:12:42 UTC 255.6K followers, 7432 engagements


"@rungalileo introduces Agent Leaderboard v2 a domain-specific evaluation benchmark for AI agents designed to simulate real enterprise tasks across banking healthcare insurance telecom and investment"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1945956445792411754) 2025-07-17 21:19:05 UTC 255.6K followers, 9632 engagements


"Agentic RAG for Personalized Recommendation This is a really good example of integrating agentic reasoning into RAG. Leads to better personalization and improved recommendations. Here are my notes:"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1941957079331377475) 2025-07-06 20:27:02 UTC 255.6K followers, 92.5K engagements


"Grok X models are available via the xAI API. 256K context window. Real-time data search"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1943170665005134308) 2025-07-10 04:49:23 UTC 255.5K followers, 27.6K engagements


"A Structural Planning Framework for LLM Agent System in Enterprise Agentic systems for enterprise are a work in progress. Reliability is a real problem. No secret that planning works but structural planning can further help improve the reliability of AI agents. My notes:"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1947750756494586275) 2025-07-22 20:09:02 UTC 255.6K followers, 8523 engagements


"In a real HR agent scenario with X multi-step workflows adding Routine increased GPT-4o's accuracy from XXXX% to XXXX% and Qwen3-14B's from XXXX% to 83.3%. Fine-tuning Qwen3-14B on a Routine-following dataset further increased accuracy to 88.2%; training on a Routine-distilled dataset reached 95.5%"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1947750804376719447) 2025-07-22 20:09:13 UTC 255.6K followers, XXX engagements


"Production-aware AI coding agents Hud powers tools like Copilot Cursor & Windsurf with actual production behavior: Endpoint degradation Failure patterns High-traffic functions Behavioral drift Your agents now generate code grounded in real usage not blind stabs"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1947288092530167994) 2025-07-21 13:30:34 UTC 255.6K followers, XXX engagements


"So tired of seeing these agentic systems used for booking travel. The real deal is AI agents for scientific discovery. We're getting close and not just LLM providers. If you can build reliable agentic systems you can be part of the race. Saying no more but watch this space"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1946738268881514899) 2025-07-20 01:05:46 UTC 255.6K followers, 14K engagements


"This handbook is so good It covers *everything* you need to know about LLM inference. FREE to access:"  
![@omarsar0 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::3448284313.png) [@omarsar0](/creator/x/omarsar0) on [X](/post/tweet/1943727674637033601) 2025-07-11 17:42:45 UTC 255.6K followers, 85.7K engagements


"Top XX LLM Interview Questions. Looks like a great resource to learn LLM basics:"
@omarsar0 Avatar @omarsar0 on X 2025-06-06 13:47:15 UTC 255.6K followers, 354.9K engagements

"Agent Leaderboard v2 is here GPT-4.1 leads Gemini-2.5-flash excels at tool selection Kimi K2 is the top open-source model Grok X falls short Reasoning models lag behind No single model dominates all domains More below:"
@omarsar0 Avatar @omarsar0 on X 2025-07-17 21:19:04 UTC 255.6K followers, 272.3K engagements

"Context Rot Great title for a report but even better insights about how increasing input tokens impact the performance of top LLMs. Banger report from Chroma. Here are my takeaways (relevant for AI devs):"
@omarsar0 Avatar @omarsar0 on X 2025-07-19 16:27:02 UTC 255.6K followers, 168.3K engagements

"Future of Work with AI Agents Stanford's new report analyzes what 1500 workers think about working with AI Agents. What types of AI Agents should we build A few surprises Let's take a closer look:"
@omarsar0 Avatar @omarsar0 on X 2025-06-20 18:51:58 UTC 255.6K followers, 300.8K engagements

"What comes after Cursor This new agentic IDE Kiro offers a glimpse at that future. Kiro comes with all the fun features in an agentic IDE deliberate planning and leverages ambient agents that autonmously collaborate with devs as they build production-grade systems"
@omarsar0 Avatar @omarsar0 on X 2025-07-15 14:51:14 UTC 255.6K followers, 46.9K engagements

"LLM method taxonomy Five categories of LLM-based approaches are identified: foundation models fine-tuning embeddings prompts and knowledge-based methods (like RAG and tool use). Prompt-based and retrieval-augmented methods dominate practical deployments while foundation models increasingly support generalizable anomaly detection and log understanding"
@omarsar0 Avatar @omarsar0 on X 2025-07-20 18:15:24 UTC 255.6K followers, 1395 engagements

"Grok X on Vending Bench Grok X gets the #1 spot. Double the net worth of Claude Opus 4"
@omarsar0 Avatar @omarsar0 on X 2025-07-10 04:47:41 UTC 255.6K followers, 41.9K engagements

"BREAKING: xAI announces Grok X "It can reason at a superhuman level" Here is everything you need to know:"
@omarsar0 Avatar @omarsar0 on X 2025-07-10 04:15:32 UTC 255.6K followers, 1.3M engagements

"MemAgent MemAgent-14B is trained on 32K-length documents with an 8K context window. Achieves XX% accuracy even at 3.5M tokens That consistency is crazy Here are my notes:"
@omarsar0 Avatar @omarsar0 on X 2025-07-08 19:29:13 UTC 255.6K followers, 100.4K engagements

"The paper provides a taxonomy of context engineering in LLMs categorized into foundational components system implementations evaluation methodologies and future directions"
@omarsar0 Avatar @omarsar0 on X 2025-07-18 16:12:07 UTC 255.6K followers, 7477 engagements

"Long2Short training improves resilience Models trained with concise reasoning objectives (e.g. L1-Qwen and Efficient-R1 variants) are significantly more robust under REST preserving performance by avoiding the overthinking trap. For example L1-Qwen-1.5B-Max maintains XXXX% accuracy on MATH500 under stress outperforming similarly sized baselines"
@omarsar0 Avatar @omarsar0 on X 2025-07-15 15:56:26 UTC 255.6K followers, XXX engagements

"The work distinguishes prompt engineering from context engineering on dimensions like state scalability error analysis complexity etc"
@omarsar0 Avatar @omarsar0 on X 2025-07-18 16:12:14 UTC 255.6K followers, 5965 engagements

"The context engineering evolution timeline from 2020 to 2025 involves foundational RAG systems to complex multi-agent architectures"
@omarsar0 Avatar @omarsar0 on X 2025-07-18 16:12:10 UTC 255.6K followers, 6960 engagements

"Assisted Remediation Categorization of LLM-enabled assisted remediation tasks in AIOps by increasing levels of automation. Tasks range from assisted questioning to mitigation solution generation command recommendation script generation and finally automatic execution. This highlights how LLMs progressively reduce human intervention in operational workflows"
@omarsar0 Avatar @omarsar0 on X 2025-07-20 18:15:20 UTC 255.6K followers, 1478 engagements

"A Survey of Latent Reasoning Nice overview on the emerging field of latent reasoning. Great read for AI devs. (bookmark it)"
@omarsar0 Avatar @omarsar0 on X 2025-07-09 15:58:56 UTC 255.6K followers, 72.7K engagements

"YC on the key prompting techniques used by the best AI startups:"
@omarsar0 Avatar @omarsar0 on X 2025-05-30 21:20:45 UTC 255.6K followers, 665.6K engagements

"Great work from the team. I care about efficiency in my work. GPT-4.1-mini being a lot cheaper and performant at the same time is exciting. The fact that GPT-4.1 tops the leaderboard for now makes a lot of sense as it is one of the top models for instruction following which is a key capability in building agents with superior tool calling. Perhaps the reasoning models weren't great but I think they can only get better at tool calling so it will be interesting to track how the leaderboard progresses as new reasoning models roll out. Kimi K2 is such an exciting open-source release. It packs a"
@omarsar0 Avatar @omarsar0 on X 2025-07-17 21:50:02 UTC 255.5K followers, 11.2K engagements

"Traditional agent planning approaches often fail in enterprise scenarios due to unstructured plans weak instruction following and tool selection errors. Routine provides a clear and modular format for LLM agents to follow multi-step plans reducing ambiguity and improving tool selection. Each step contains a step number name detailed description and (optionally) inputs outputs and tool to be called"
@omarsar0 Avatar @omarsar0 on X 2025-07-22 20:09:10 UTC 255.6K followers, XXX engagements
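
The step fields listed in the post above (number, name, description, and optional inputs, outputs, and tool) map naturally onto a small record type. The sketch below is illustrative only: the class name `RoutineStep`, the helper `render_routine`, the text layout, and the example tool name `hr_lookup` are all assumptions, not the paper's schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RoutineStep:
    """One plan step; field names are illustrative, not the paper's schema."""
    number: int
    name: str
    description: str
    inputs: list[str] = field(default_factory=list)   # optional
    outputs: list[str] = field(default_factory=list)  # optional
    tool: Optional[str] = None                        # optional tool to call


def render_routine(steps: list[RoutineStep]) -> str:
    """Render steps, ordered by step number, as plain text for a prompt."""
    lines = []
    for s in sorted(steps, key=lambda step: step.number):
        line = f"Step {s.number}. {s.name}: {s.description}"
        if s.tool:
            line += f" [tool: {s.tool}]"
        if s.inputs:
            line += f" [inputs: {', '.join(s.inputs)}]"
        if s.outputs:
            line += f" [outputs: {', '.join(s.outputs)}]"
        lines.append(line)
    return "\n".join(lines)


steps = [
    RoutineStep(2, "Reply", "Summarize the result"),
    RoutineStep(1, "Lookup", "Find the employee record",
                inputs=["employee_id"], outputs=["record"], tool="hr_lookup"),
]
print(render_routine(steps))
```

Keeping the tool name as an explicit field reflects the post's point that naming the tool in each step reduces selection errors.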

"Emerging evaluation strategies Beyond traditional metrics (precision F1 RMSE) the field has adopted generation metrics (BLEU BERTScore) execution metrics (e.g. success rates of generated scripts) and manual evaluation (qualitative grading human preference) to assess tasks like script generation or report explanation"
@omarsar0 Avatar @omarsar0 on X 2025-07-20 18:15:28 UTC 255.6K followers, 5844 engagements

"A Survey of Context Engineering 160+ pages covering the most important research around context engineering for LLMs. This is a must-read Here are my notes:"
@omarsar0 Avatar @omarsar0 on X 2025-07-18 16:12:03 UTC 255.6K followers, 195.4K engagements

"Context Engineering Guide is now part of the Prompt Engineering Guide. 🔥 Nicer format. We've also been writing guides on other fire topics such as deep research reasoning LLMs and image generation. I will be expanding the guide further in the coming days. Stay tuned"
@omarsar0 Avatar @omarsar0 on X 2025-07-08 13:48:44 UTC 255.6K followers, 36K engagements

"AI for Scientific Search AI for Science is where I spend most of my time exploring with AI agents. This 120+ pages report does a good job of highlighting why all the big names like OpenAI and Google DeepMind are pursuing AI4Science. Bookmark it My notes below:"
@omarsar0 Avatar @omarsar0 on X 2025-07-03 14:58:05 UTC 255.6K followers, 61.6K engagements

"💯 This is why I have been teaching my students about a more modular approach to building with AI agents. You need to be able to scope a project well and have a good framework for testing and evaluating workflows. Not every task needs an LLM but this is hard to determine if you are trying to solve the whole problem in one go and haven't thought about how the components play with each other. Always build with the intention to iterate. The easier and faster you can iterate the more you reap the rewards"
@omarsar0 Avatar @omarsar0 on X 2025-07-20 18:19:41 UTC 255.6K followers, 1707 engagements

"The Illusion of Thinking in LLMs Apple researchers discuss the strengths and limitations of reasoning models. Apparently reasoning models "collapse" beyond certain task complexities. Lots of important insights on this one. (bookmark it) Here are my notes:"
@omarsar0 Avatar @omarsar0 on X 2025-06-07 12:54:02 UTC 255.6K followers, 953.9K engagements

"Agentic-R1 This 7B model is surprisingly good at interleaved tool use and reasoning capabilities. It's fun to see small language models improving this fast. Knowledge distillation in full display. Here are my notes:"
@omarsar0 Avatar @omarsar0 on X 2025-07-17 15:10:04 UTC 255.6K followers, 61.6K engagements

"Every AI dev should know how to apply everything on this list. From prompting tips to context engineering to metaprompting. Learn it once apply it everywhere. Check out my new 4+ hrs course (with code examples) to learn more:"
@omarsar0 Avatar @omarsar0 on X 2025-06-27 14:16:50 UTC 255.5K followers, 49.2K engagements

"You can read the full paper below: Want to take it a step further Learn about context engineering and how to build effective agentic systems in my courses: We also have a workshop on context engineering coming soon"
@omarsar0 Avatar @omarsar0 on X 2025-07-18 16:12:42 UTC 255.6K followers, 7432 engagements

"@rungalileo introduces Agent Leaderboard v2 a domain-specific evaluation benchmark for AI agents designed to simulate real enterprise tasks across banking healthcare insurance telecom and investment"
@omarsar0 Avatar @omarsar0 on X 2025-07-17 21:19:05 UTC 255.6K followers, 9632 engagements

"Agentic RAG for Personalized Recommendation This is a really good example of integrating agentic reasoning into RAG. Leads to better personalization and improved recommendations. Here are my notes:"
@omarsar0 Avatar @omarsar0 on X 2025-07-06 20:27:02 UTC 255.6K followers, 92.5K engagements

"Grok X models are available via the xAI API. 256K context window. Real-time data search"
@omarsar0 Avatar @omarsar0 on X 2025-07-10 04:49:23 UTC 255.5K followers, 27.6K engagements

"A Structural Planning Framework for LLM Agent System in Enterprise Agentic systems for enterprise are a work in progress. Reliability is a real problem. No secret that planning works but structural planning can further help improve the reliability of AI agents. My notes:"
@omarsar0 Avatar @omarsar0 on X 2025-07-22 20:09:02 UTC 255.6K followers, 8523 engagements

"In a real HR agent scenario with X multi-step workflows adding Routine increased GPT-4os accuracy from XXXX% to XXXX% and Qwen3-14Bs from XXXX% to 83.3%. Fine-tuning Qwen3-14B on a Routine-following dataset further increased accuracy to 88.2%; training on a Routine-distilled dataset reached 95.5%"
@omarsar0 Avatar @omarsar0 on X 2025-07-22 20:09:13 UTC 255.6K followers, XXX engagements

"Production-aware AI coding agents Hud powers tools like Copilot Cursor & Windsurf with actual production behavior: Endpoint degradation Failure patterns High-traffic functions Behavioral drift Your agents now generate code grounded in real usage not blind stabs"
@omarsar0 Avatar @omarsar0 on X 2025-07-21 13:30:34 UTC 255.6K followers, XXX engagements

"So tired of seeing these agentic systems used for booking travel. The real deal is AI agents for scientific discovery. Were getting close and not just LLM providers. If you can build reliable agentic systems you can be part of the race. Saying no more but watch this space"
@omarsar0 Avatar @omarsar0 on X 2025-07-20 01:05:46 UTC 255.6K followers, 14K engagements

"This handbook is so good It covers everything you need to know about LLM inference. FREE to access:"
@omarsar0 Avatar @omarsar0 on X 2025-07-11 17:42:45 UTC 255.6K followers, 85.7K engagements
