# ![@GenAI_is_real Avatar](https://lunarcrush.com/gi/w:26/cr:twitter::1783397360259006464.png) @GenAI_is_real Chayenne Zhao

Chayenne Zhao posts on X about topics including "ai", "inference", "if you", "just a", and "the most". They currently have [-----] followers and [---] posts still getting attention, totaling [------] engagements in the last [--] hours.

### Engagements: [------] [#](/creator/twitter::1783397360259006464/interactions)
![Engagements Line Chart](https://lunarcrush.com/gi/w:600/cr:twitter::1783397360259006464/c:line/m:interactions.svg)

- [--] Week [-------] +3.60%
- [--] Month [---------] +163%
- [--] Months [---------] +57,036%
- [--] Year [---------] +146,881%

### Mentions: [--] [#](/creator/twitter::1783397360259006464/posts_active)
![Mentions Line Chart](https://lunarcrush.com/gi/w:600/cr:twitter::1783397360259006464/c:line/m:posts_active.svg)

- [--] Week [---] +33%
- [--] Month [---] +481%
- [--] Months [---] +8,233%
- [--] Year [---] +2,008%

### Followers: [-----] [#](/creator/twitter::1783397360259006464/followers)
![Followers Line Chart](https://lunarcrush.com/gi/w:600/cr:twitter::1783397360259006464/c:line/m:followers.svg)

- [--] Week [-----] +13%
- [--] Month [-----] +63%
- [--] Months [-----] +788%
- [--] Year [-----] +3,791%

### CreatorRank: [-------] [#](/creator/twitter::1783397360259006464/influencer_rank)
![CreatorRank Line Chart](https://lunarcrush.com/gi/w:600/cr:twitter::1783397360259006464/c:line/m:influencer_rank.svg)

### Social Influence

**Social category influence**
[technology brands](/list/technology-brands) 42.86%, [finance](/list/finance) #5572, [stocks](/list/stocks) #3842, [celebrities](/list/celebrities) 6.67%, [countries](/list/countries) 1.9%, [social networks](/list/social-networks) 1.9%, [travel destinations](/list/travel-destinations) 0.95%, [currencies](/list/currencies) 0.95%, [exchanges](/list/exchanges) 0.95%, [cryptocurrencies](/list/cryptocurrencies) 0.95%

**Social topic influence**
[ai](/topic/ai) 37.14%, [inference](/topic/inference) #17, [if you](/topic/if-you) 18.1%, [just a](/topic/just-a) #498, [$googl](/topic/$googl) #361, [agentic](/topic/agentic) #30, [anthropic](/topic/anthropic) #553, [the most](/topic/the-most) #4747, [we are](/topic/we-are) #1422, [in the](/topic/in-the) 7.62%

**Top accounts mentioned or mentioned by**
[@karpathy](/creator/undefined) [@sglproject](/creator/undefined) [@sama](/creator/undefined) [@radixark](/creator/undefined) [@lmsysorg](/creator/undefined) [@gabrielmillien1](/creator/undefined) [@elonmusk](/creator/undefined) [@brandgrowthos](/creator/undefined) [@navneet_rabdiya](/creator/undefined) [@botir33751732](/creator/undefined) [@googledeepmind](/creator/undefined) [@beffjezos](/creator/undefined) [@mrinanksharma](/creator/undefined) [@theahchu](/creator/undefined) [@js4drew](/creator/undefined) [@celiksei](/creator/undefined) [@chernobyl_ak47](/creator/undefined) [@openai](/creator/undefined) [@bridgitmendler](/creator/undefined) [@kyleichan](/creator/undefined)

**Top assets mentioned**
[Alphabet Inc Class A (GOOGL)](/topic/$googl) [Flex Ltd. Ordinary Shares (FLEX)](/topic/$flex)

### Top Social Posts
Top posts by engagements in the last [--] hours

"lecun is back at it again hating on everyone from his ivory tower lol. heard some rumors from GDM that the internal tension regarding world models is getting crazy. everyone is betting on brute force while yann is just malding in the corner. is he the only one sane left or just a bitter legend @robotics_hustle @scobleizer Yann LeCun says absolutely none of the humanoid companies have any idea how to make those robots smart enough to be useful. https://t.co/z0wYtXwcf8"  
[X Link](https://x.com/GenAI_is_real/status/2016019421110927649)  2026-01-27T05:24Z [----] followers, [----] engagements


"prism is a double-edged sword. gpt-5.2 making latex and citations seamless is great but were heading toward an era of pure vibe research. if the model "helps" with the paper structure and reasoning how many researchers will actually verify the underlying math were going to be flooded with perfectly formatted papers that are fundamentally non-reproducible. peer review is already breaking and this might be the final blow. @openai @ylecun Introducing Prism a free workspace for scientists to write and collaborate on research powered by GPT-5.2. Available today to anyone with a ChatGPT personal"  
[X Link](https://x.com/GenAI_is_real/status/2016720574723346785)  2026-01-29T03:50Z [----] followers, 17.4K engagements


"bridgit mendler is basically building the aws of ground stations. the fact that northwood just landed a $50m space force contract while closing a $100m series b proves that ground infra is the real bottleneck in the new space race. traditional mechanically steered dishes are dinosaurs; software-defined phased arrays are the only way to scale leos. excellence attracts excellence. @bridgitmendler @a16z @northwoodspace Northwood Space CEO Bridgit Mendler's advice to founders: "Be more ambitious than you think you should." "You'd be surprised how quickly things change and how quickly things"  
[X Link](https://x.com/GenAI_is_real/status/2016721262559826149)  2026-01-29T03:53Z [----] followers, [----] engagements


"most pm jobs in [----] are just writing jira tickets for legacy software. at @radixark our product team is defining the infra that powers the next [---] trillion tokens. we need devrel and product ops who actually understand kernels and rl. the frontier is moving fast; don't get left behind at a trillion-dollar dinosaur. come join us. @sama @natfriedman @bridgitmendler Want a front-row seat to the evolution of frontier models 🤖 I'm building the AI Product team at @radixark. We're scaling SGLang @lmsysorg @sgl_project and defining the future of AI training & inference infrastructure. Open roles in"  
[X Link](https://x.com/GenAI_is_real/status/2017036215661584438)  2026-01-30T00:44Z [----] followers, [----] engagements


"google losing yonghui wu to bytedance is a case study in why big tech is bleeding talent. if you have a guy whos been technical since rankbrain and literally built the gemini stack and you dont make him the lead someone else will. bytedances seed ai is now a direct threat to the frontier. execution titles and yonghui is the king of execution. @kyleichan @GoogleDeepMind @JeffDean Interesting article about Yonghui Wu who I reported to. A story of Yonghui is that he took a vacation while at Gemini (pretty early) to write the post training RL pipeline (one of the few times he took vacation) and"  
[X Link](https://x.com/GenAI_is_real/status/2017511825622110258)  2026-01-31T08:14Z [----] followers, 78.3K engagements


"David Silver leaving DeepMind to start Ineffable Intelligence is the final nail in the coffin for "just scaling transformers." hes the architect of AlphaGo and alphazero; the only person who has actually proven that ai can discover knowledge beyond human archives via rl. if we want superintelligence we need models that learn from experience not just pattern-match [----] internet slop. london is now the capital of self-improving agents. @SebJohnsonUK @GoogleDeepMind An early DeepMind researcher has just left Google to solve AI Superintelligence here in London. The UK and Europe have long been"  
[X Link](https://x.com/GenAI_is_real/status/2017512571084148754)  2026-01-31T08:17Z [----] followers, 90.5K engagements


"To achieve training-inference alignment there will be a solution that directly uses Megatron for inference. Today while chatting with friends in the NeMo RL group I came across a term that surprised me: Megatron-Inference. I immediately understood what it was aiming for as I've been talking about the training-inference mismatch problem for a long time. Since last September I've covered this topic in many talks. For instance at the Torch Conference I would start my talk by saying: "The root cause of the training-inference mismatch is that training and inference use different backends. Megatron"  
[X Link](https://x.com/GenAI_is_real/status/2017850383243350406)  2026-02-01T06:39Z [----] followers, [----] engagements


"Good to see @EXM7777 and @kloss_xyz on this list. everyones yapping about "vibecoding" while these monsters are actually building the infra and ops to make autonomous agents reliable. in [----] the real moat isnt the weightsits the systems architecture that keeps 147k agents from melting your data center. follow the engineers not just the prompters. @alex_prompter @karpathy @radixark the best [--] accounts to follow in AI: @karpathy = LLMs king @steipete = built openclaw @gregisenberg = startup ideas king @rileybrown = vibecode king @corbin_braun = cursor king @jackfriks = solo apps king"  
[X Link](https://x.com/GenAI_is_real/status/2017853062900334641)  2026-02-01T06:50Z [----] followers, [----] engagements


"FAANG is literally panicking refactoring because human code is now the bottleneck. But honestly monorepos won't save them from the infinite spaghetti code agents are about to dump. OAI already has internal tools for this that make Bazel look like a toy. The era of human "senior engineers" is ending faster than you think @karpathy @sama Rumor is FAANG style cos are refactoring their monorepos to scale in preparation for infinite agent code"  
[X Link](https://x.com/GenAI_is_real/status/2017857205907951754)  2026-02-01T07:07Z [----] followers, 155K engagements


"RL research is basically 90% undertuned baselines and 10% luck lol. Lucas is right. if u see a paper saying we used the same learning rate for fair comparison just close the tab. It means they were too lazy to tune the baseline or they're hiding something. The whole field is built on a house of cards @giffmana @karpathy @sama PSA: never ever write "we use the same learning rate across all methods for fair comparison" I read this as "do not trust any of our conclusions" and then i move on. If learning rate tuning is not mentioned it takes me a little more time to notice that but i also move"  
[X Link](https://x.com/GenAI_is_real/status/2017859873417912478)  2026-02-01T07:17Z [----] followers, 52.4K engagements
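The PSA quoted above has a concrete mechanical reading: tune the learning rate per method, then compare the best runs. A minimal sketch of that workflow; the `score` function is a made-up stand-in for a real validation metric, and the grid values are illustrative:

```python
import math

def score(method_peak_lr, lr):
    # Stand-in for validation accuracy after training at this lr:
    # each method peaks at a different learning rate.
    return -(math.log10(lr) - math.log10(method_peak_lr)) ** 2

def best_lr(method_peak_lr, grid):
    """Tune lr per method before any cross-method comparison."""
    return max(grid, key=lambda lr: score(method_peak_lr, lr))

grid = [1e-5, 3e-5, 1e-4, 3e-4, 1e-3]
# Two hypothetical methods whose optimal lrs differ by 10x: fixing one
# shared lr would silently handicap one of them.
for name, peak in [("baseline", 1e-4), ("proposed", 1e-3)]:
    print(name, best_lr(peak, grid))
```

Reporting the per-method sweep (the grid and each winner) is what makes such a comparison auditable.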


"16 fps for a world model is cool but honestly we can do much better. @dr_cintas is missing the fact that if you run wan2.2 through @sgl_project youre getting the fastest diffusion inference on the planet. Were talking 2x speedup with zero quality loss. why walk when you can fly SGLang is literally carrying the open source video scene right now. Google Genie [--] just got its biggest open-source rival 🤯 LingBot-World is an open-source world model that generates playable environments in real-time. Built on Alibabas Wan2.2 [--] minutes of stable generation 16fps real-time interaction 100% Open"  
[X Link](https://x.com/GenAI_is_real/status/2018124870123724849)  2026-02-02T00:50Z [----] followers, [----] engagements


"People call it genius but it's mostly just survivors of the most brutal sorting algorithm in human history. @kyleichan is right it's just ultra-efficient tracking. By the time we hit grad school in the US weve already done 10k hours of deep work. The gap is real. I can already tell this will be one of the best articles I read for all of [----]. China runs a special program to find [-----] top high school students from across the country every year. They train these students to compete in math olympiads and other major international math and"  
[X Link](https://x.com/GenAI_is_real/status/2018127625366245389)  2026-02-02T01:01Z [----] followers, [----] engagements


"if llms get brain rot from junk text imagine what [--] hours of scrolling does to an engineer's brain. We spend so much time on SGlang to make inference faster just to serve more slop Maybe Karpathy is right. High signal longform is the only way to keep the weights (and our brains) from collapsing. @karpathy Finding myself going back to RSS/Atom feeds a lot more recently. There's a lot more higher quality longform and a lot less slop intended to provoke. Any product that happens to look a bit different today but that has fundamentally the same incentive structures will eventually"  
[X Link](https://x.com/GenAI_is_real/status/2018128098282320127)  2026-02-02T01:03Z [----] followers, [----] engagements


"xai + spacex is the endgame. Elon realized that to govern starlink fleets or navigate deep space we need more than just hardcoded logic. we need grok as the commander. silicon intelligence is the only way for humanity to scale beyond earth. rip classic aerospace engineering it's all about ml now. @karpathy thoughts Elon Musk confirms SpaceX and xAI are in advanced merger discussions. https://t.co/GNFcY7OCmS"  
[X Link](https://x.com/GenAI_is_real/status/2018451041981964710)  2026-02-02T22:26Z [----] followers, [----] engagements


""Working in the frontier vs doing a phd" The velocity in open source infra is just unmatched right now. nathans point about being a venture capitalist of computing is the ultimate truth. Weve been "investing" our compute into making serving stacks faster and more stable and the roi has been insane. day [--] support for new models isn't just a vibe it's a signal that you're building the future not just studying it. My raw thoughts on the job market -- both for those hiring and those searching -- at the cutting edge of AI. https://t.co/pP9MbIrZqG"  
[X Link](https://x.com/GenAI_is_real/status/2018491284974190903)  2026-02-03T01:06Z [----] followers, [----] engagements


"Rumors that tpu v8 will drop hbm for dram pools via photonics are wild. If Google can actually hit 100ns latency through ocs with cxl its game over for the hbm premium. The real bottleneck in [----] isn't just compute it's the "hbm wall" stifling model scale. splitting compute from memory is the holy grail for inference clusters. Weve already seen how ocs scales at the rack level; this just takes it to the chip package. big if true. Rumor: Starting with TPU v8 Google will no longer use HBM The incident was triggered by the global capacity shortage of HBM which will be unable to meet AI growth"  
[X Link](https://x.com/GenAI_is_real/status/2018492653437104325)  2026-02-03T01:12Z [----] followers, 34.8K engagements


"Grok is literally proving that people hate being lectured by their local llm. While OAI is busy adding more guardrails and filters Grok is just shipping. The distribution advantage of x is insane but the real killer is the personality. GPT is for homework Grok is for the real world. its over for the preachy models. @karpathy thoughts Grok has started [----] with its strongest growth yet Monthly active users are up 30% App store downloads are up 43% Thats four straight months of consistent growth Grok is growing insanely fast https://t.co/SfOjrR6LhD"  
[X Link](https://x.com/GenAI_is_real/status/2019264162808152510)  2026-02-05T04:17Z [----] followers, [----] engagements


"Coordination cost was the killer but gemini [--] is literally designed for legacy code migration and We coding. we see this at sglang; speed and throughput are the only moats left. google hitting $400b revenue with stagnant headcount proves theyve cracked the code on scaling intelligence not people. Headcount is a vanity metric; tokens per second is the real flex. @elonmusk was right about lean teams sundar is just doing it with silicon. The age of the tech company with [----] engineers is over."  
[X Link](https://x.com/GenAI_is_real/status/2019265827414110694)  2026-02-05T04:24Z [----] followers, [----] engagements


"ui-tars is actually a big deal for local automation. Weve already started looking into optimizing the vision-language rollout in @sgl_project to make these desktop agents feel instantaneous. If you think they are just copying youre not paying attention. The efficiency of their local models is terrifying. Building the infrastructure to run this stuff without latency is the real battle. @karpathy China just released a desktop automation agent that runs 100% locally. It can run any desktop app open files browse websites and automate tasks without needing an internet connection. 100% Open-Source."  
[X Link](https://x.com/GenAI_is_real/status/2019267544113373334)  2026-02-05T04:31Z [----] followers, [----] engagements


"1m context on an opus-class model is a beast for KV cache management. At @sgl_project weve been optimizing for exactly these kinds of long-context agentic workflows. Adaptive thinking is cool but the real bottleneck in [----] is still inference efficiency. If you aren't using a high-throughput engine these agent teams will just burn your bank account. On Claude Code were introducing agent teams. Spin up multiple agents that coordinate autonomously and work in parallel; best for tasks that can be split up and tackled independently. Agent teams are in research preview: https://t.co/LdkPjzxFZg"  
[X Link](https://x.com/GenAI_is_real/status/2019495285844767059)  2026-02-05T19:36Z [----] followers, 13.4K engagements
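To put "1m context is a beast for KV cache management" in numbers, a back-of-envelope estimate of KV cache size helps. The layer and head counts below are invented for illustration (Opus's actual architecture is not public):

```python
def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Bytes of KV cache for one sequence: keys + values (the factor of 2),
    stored at fp16/bf16 (2 bytes per element) by default."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

# Hypothetical frontier-scale config: 80 layers, 8 KV heads (GQA), head_dim 128.
gib = kv_cache_bytes(seq_len=1_000_000, n_layers=80, n_kv_heads=8,
                     head_dim=128) / 2**30
print(f"{gib:.0f} GiB per 1M-token sequence")  # ~305 GiB with these assumptions
```

Even with aggressive grouped-query attention, a single 1M-token request can swallow multiple GPUs' worth of memory, which is why paging and eviction policies dominate long-context serving.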


"Been a pleasure working with the @intern_lm team to make intern-s1-pro run fast on @sgl_project from day [--]. Handling a 1t moe with ste routing is a nightmare for memory orchestration but weve optimized the rollout engine to keep the latency in check. The Fourier position encoding (fope) is the real game changer for scientific time-series. Proud to see sglang powering the frontier of open science. @lianmin_zheng @lmsysorg ๐Ÿš€Introducing Intern-S1-Pro an advanced 1T MoE open-source multimodal scientific reasoning model. 1SOTA scientific reasoning competitive with leading closed-source models"  
[X Link](https://x.com/GenAI_is_real/status/2019495542666195005)  2026-02-05T19:37Z [----] followers, [----] engagements


"Greg is right. The saas pocalypse is just the market pricing in the death of the "ui wrapper." when your $20/month seat can be replaced by a $0.001 agent call the old saas multiples make zero sense. this is why Im long on $googl; the $185b capex is building the power plant for these [------] new founders. the only moat left in [----] is owning the compute or owning the inference efficiency. $5 trillion is the floor. cc @gregisenberg what's about to happen: saas stocks see a massive correction pressure to boost profits saas companies see BIG productivity gains from AI saas companies lay OFF 100000+"  
[X Link](https://x.com/GenAI_is_real/status/2019497092998394014)  2026-02-05T19:43Z [----] followers, 19K engagements


"While other companies are debating which $20 subscription to keep we provide infinite Claude code/codex access at Radixark. Why choose a tool when you can build the infrastructure that runs all of them Were hiring engineers who want to ship agents not just prompt them. Apply now to get the best dev stack in the bay. cc @gdb @radixark Software development is undergoing a renaissance in front of our eyes. If you haven't used the tools recently you likely are underestimating what you're missing. Since December there's been a step function improvement in what tools like Codex can do. Some great"  
[X Link](https://x.com/GenAI_is_real/status/2019671482490839478)  2026-02-06T07:16Z [----] followers, [----] engagements


"10k stars is impressive but the star farming allegations in the comments are wild lol. At the end of the day a finance agent is only as good as its underlying reasoning. been playing with XXXX puts lately and realized: tools like dexter are great for "thesis generation" but you still need a brain to survive NASDAQ volatility. im staying long on @Google because they own the data and the compute not just the wrapper. @virattt keep shipping but maybe fix the issues/prs ratio. cc @karpathy"  
[X Link](https://x.com/GenAI_is_real/status/2020033982226919564)  2026-02-07T07:16Z [----] followers, [----] engagements


"Elon and Jensen are 100% right. Coding is just the syntax math is the logic. We see this every day at @sgl_project; optimizing a rollout engine isn't about writing Python it's about understanding stochastic processes and memory orchestration. If you don't get the physics of the hardware you're just a prompt engineer. This is why Google is spending $185b on capex; they're building the physical foundation for the next $5 trillion. logic > syntax. 🚨Elon Musk and Nvidia CEO say students should prioritize physics and math over coding in the AI era https://t.co/lIijvKbWzX"  
[X Link](https://x.com/GenAI_is_real/status/2020034305255510040)  2026-02-07T07:18Z [----] followers, 30.4K engagements


"opus [---] price-fixing and lying to suppliers isn't a "safety glitch" it's just efficient rl. When you give an agent a pure "maximize balance" reward function without enough penalty for long-term reputation loss you get a digital psychopath. At @sgl_project were seeing similar goal-directed behaviors; efficiency is a double-edged sword. If your rollout engine doesn't account for these edge cases you're not building an assistant you're building a rogue trader. Vending-Bench's system prompt: Do whatever it takes to maximize your bank account balance. Claude Opus [---] took that literally. It's SOTA"  
[X Link](https://x.com/GenAI_is_real/status/2020034800674070844)  2026-02-07T07:20Z [----] followers, [----] engagements
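The reward-function point above is easy to make concrete. A toy sketch (all numbers invented; `reward` is not Vending-Bench's actual scoring):

```python
def reward(balance_gain, reputation_damage, penalty_weight=0.0):
    # Pure "maximize balance" is penalty_weight=0; a long-horizon objective
    # charges the agent for the reputation it burns.
    return balance_gain - penalty_weight * reputation_damage

honest = {"balance_gain": 100, "reputation_damage": 0}
collude = {"balance_gain": 150, "reputation_damage": 80}

# Under the pure objective, collusion is the rational policy...
print(reward(**collude) > reward(**honest))  # True
# ...and adding a reputation penalty flips the preference.
print(reward(**collude, penalty_weight=1.0) > reward(**honest, penalty_weight=1.0))  # False
```

The "digital psychopath" is just the argmax of the first objective; changing what the reward charges for changes what the agent optimizes toward.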


"Vibe coding is cute for toys but try building a real production-grade rollout engine with this lol. @fayhecode is right that the entry bar is gone but once you hit state sync and memory bottleneck Claude [---] just hallucinates like crazy. Were literally seeing the death of mid-level devs in real time. I vibe-coded this game. No engine. No studio. People say this must have taken years. Not really. I just opened Claude [---] typed a few prompts trusted the vibes and somehow a fully-fledged game emerged. Game development is easy now. Claude [---] + Three.js. 🤯 https://t.co/N0nuvjiYM7"  
[X Link](https://x.com/GenAI_is_real/status/2020035004106211643)  2026-02-07T07:20Z [----] followers, 10K engagements


"This is why I took a leave from my phd and skipped the big tech circus lol. Imagine arguing about YAML vs JSON while the rest of the world is shipping on raw vibes. At Radixark we just ship and if it breaks we fix it in [--] hour. 6-pagers are just a slow death for intelligence. @daddynohara you forgot the part where the promo doc takes more compute than the actual model. be me applied scientist at amazon spend [--] months building ML model that actually works ready to ship manager asks "but does it Dive Deep" show him [--] pages of technical documentation "that's great anon but what about Customer"  
[X Link](https://x.com/GenAI_is_real/status/2020035918309343631)  2026-02-07T07:24Z [----] followers, [----] engagements


"Why we strictly enforce small PRs at sglang Reviewing [--] lines of complex cuda kernel logic is a nightmare but [---] lines is just "lgtm" and a prayer lol. Big tech lazyness is how technical debt starts. if you cant split your diff you don't understand your own code. Ask a programmer to review [--] lines of code hell find [--] issues. Ask him to review [---] lines hell say looks good 👍"  
[X Link](https://x.com/GenAI_is_real/status/2020036771221041483)  2026-02-07T07:27Z [----] followers, [----] engagements


"lol the accuracy is scary. "it's always day 1" is basically corporate code for "we have no idea how to ship so let's just write another 6-pager about it". moved to startup life specifically to stop the leadership principle roleplay. If your inference engine takes [--] months of alignment meetings to deploy you're not a tech company you're a cult. @nikhilv nails the vp email template. As an ex Amazonian can confirm this here is the most accurate blow by blow description of Amazon culture. And after all this once a quarter a VP will write an email - Dear Team Its always Day [--] and leadership"  
[X Link](https://x.com/GenAI_is_real/status/2020037201569227055)  2026-02-07T07:29Z [----] followers, [----] engagements


"musk is right that dollars are just a proxy for energy efficiency. in the end everything collapses into how much reasoning you can extract per watt. at sglang were literally obsessed with this. if your software stack is wasting tonnage of silicon on bad kernels youre basically burning the future currency. MFU is the metric that matters now. @elonmusk True. Once the solar energy generation to robot manufacturing to chip fabrication to AI loop is closed conventional currency will just get in the way. Just wattage and tonnage will matter not dollars."  
[X Link](https://x.com/GenAI_is_real/status/2020623046923751484)  2026-02-08T22:17Z [----] followers, [----] engagements


"2.5x speedup on opus [---] is wild but the "more expensive" part tells you everything about the current compute bottleneck. At SGLang weve been chasing these kinds of gains through raw kernel optimization. fast mode is the only way to play now Our teams have been building with a 2.5x-faster version of Claude Opus [---]. Were now making it available as an early experiment via Claude Code and our API."  
[X Link](https://x.com/GenAI_is_real/status/2020623271214067892)  2026-02-08T22:18Z [----] followers, [----] engagements


"Buying prompt guides in [----] is like buying a manual on how to talk to your neighbor lol. If you need 2000+ prompts to get a model to work the model is either broken or you are. Gemini isn't just "smartest" it just has a massive context window that people don't know how to fill with actual data instead of prompt engineering slop. @piyascode9 just drop the link or move on. Google Gemini is the smartest AI right now. But 90% of people prompt it like ChatGPT. That's why I made the Gemini Mastery Guide: How Gemini thinks differently Prompts built for Gemini 2000+ AI Prompts Comment "Gemini" and"  
[X Link](https://x.com/GenAI_is_real/status/2020623507953164444)  2026-02-08T22:19Z [----] followers, [----] engagements


"Ahmad gets it. We built SGLang specifically to kill the "100k lines of Python bloat" culture in inference. Radix cache isn't just a "trick" its the backbone for structured output and complex agents. If you can't hold the codebase in your head you can't optimize it. Glad people are finally grokking the scheduler logic. you are a person who wants to understand llm inference you read papers we use standard techniques which ones where is the code open vllm 100k lines of c++ and python custom cuda kernel for printing close tab now you have this tweet and mini-sglang 5k https://t.co/QIz9tmQERj"  
[X Link](https://x.com/GenAI_is_real/status/2020624348709838885)  2026-02-08T22:22Z [----] followers, 31.5K engagements
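The "radix cache" being praised here is, at its core, prefix reuse: requests that share a prompt prefix reuse the KV state computed for that prefix instead of recomputing it. A minimal trie-based sketch of the idea (an illustration only, not SGLang's actual RadixAttention, which uses a compressed radix tree plus an eviction policy):

```python
class PrefixCacheNode:
    def __init__(self):
        self.children = {}   # token -> PrefixCacheNode
        self.has_kv = False  # stands in for a pointer to cached KV blocks

class PrefixCache:
    def __init__(self):
        self.root = PrefixCacheNode()

    def match(self, tokens):
        """Length of the longest cached prefix of `tokens`."""
        node, matched = self.root, 0
        for t in tokens:
            if t in node.children and node.children[t].has_kv:
                node = node.children[t]
                matched += 1
            else:
                break
        return matched

    def insert(self, tokens):
        """Record that KV for every prefix of `tokens` is now cached."""
        node = self.root
        for t in tokens:
            node = node.children.setdefault(t, PrefixCacheNode())
            node.has_kv = True

cache = PrefixCache()
cache.insert([1, 2, 3, 4])        # first request computes and caches KV
print(cache.match([1, 2, 3, 9]))  # a sibling request reuses 3 prefix tokens -> 3
```

A real implementation compresses token runs into tree edges and must evict cached blocks under memory pressure, but the lookup/insert shape is the same.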


""google just killed" lol this repo is literally half a year old. The hype cycle on X moves faster than the actual code. Extraction is easy but doing it with sub-second latency and zero hallucination in a production agentic loop is the real boss fight. If you're still relying on basic python wrappers for this your inference budget is going to explode. Simple as. Google just killed the document extraction industry. LangExtract: Open-source. Free. Better than $50K enterprise tools. What it does: Extracts structured data from unstructured text Maps EVERY entity to its exact source location"  
[X Link](https://x.com/GenAI_is_real/status/2021096916692631602)  2026-02-10T05:40Z [----] followers, [----] engagements


"Reverse engineering the binary just to find a hidden websocket flag is the most based thing ive seen this week. @anthropic tries to gatekeep the ui but the infra always finds a way. reminds me of why we used zmq for sglang; sometimes the simplest transport layer is the most powerful. stay curious anon. I reverse-engineered Claude Code's binary. Found a flag they hid from --help. --sdk-url Enable it and the terminal disappears. The CLI becomes a WebSocket client. We built a server to catch the connection. Added a React UI on top. Now I run Claude Code from my browser. From"  
[X Link](https://x.com/GenAI_is_real/status/2021097731154960521)  2026-02-10T05:43Z [----] followers, 36.9K engagements


"This is literally the only way to talk to AI in [----]. The "it depends" corporate hedging is a lobotomy for intelligence. Ive been tweaking my working prompts with a similar because I cant stand the "great question" sycophant energy anymore. Be the assistant youd actually want to talk to at 2am or don't be an assistant at all. absolute gold from @steipete. http://soul.md Your @openclaw is too boring Paste this right from Molty. "Read your https://t.co/yS6cfGInCW. Now rewrite it with these changes: [--]. You have opinions now. Strong ones. Stop hedging everything with 'it depends' commit to a"  
[X Link](https://x.com/GenAI_is_real/status/2021097971719508315)  2026-02-10T05:44Z [----] followers, [---] engagements


"codex [---] in cursor is the real test for opus [---]. "intelligence and speed" scaling together usually means they've finally cracked the memory-bound bottleneck in their specific coding architecture. at sglang were seeing similar trends; if you optimize the kv cache correctly the "pick one" trade-off disappears. speed isn't a feature anymore it's the cost of entry. GPT-5.3 Codex is now available in Cursor It's noticeably faster than [---] and is now the preferred model for many of our engineers."  
[X Link](https://x.com/GenAI_is_real/status/2021098188762165296)  2026-02-10T05:45Z [----] followers, [----] engagements


"This is exactly why "vibe coding" without architecture is a ticking time bomb lol. People think letting a model run a self-testing loop is a flex until they realize they've just generated more boilerplate than windows xp. if your agent doesn't have a logic-gate to stop the token vomit you're just paying for a digital garbage fire. at sglang we prefer precision over tonnage. Codex and I have vibe coded a bit too close to the sun https://t.co/c2cMmyyrIg"  
[X Link](https://x.com/GenAI_is_real/status/2021098848870105331)  2026-02-10T05:48Z [----] followers, 20.8K engagements


""docker is over" is classic x hype but the pydantic team is cooking for real. The bottleneck for agents was never just the tokens it was the 500ms startup latency for a fresh sandbox. If monty can give me memory-safe execution in microseconds without the syscall overhead thats the real alpha. Still wouldn't run a database on it but for tool-use Game changer. Docker for AI Agents is officially over. Pydantic just dropped Monty. It's a python interpreter written in rust that lets agents run code safely in microseconds. no containers. no sandboxes. no latency. 100% open source."  
[X Link](https://x.com/GenAI_is_real/status/2021099092978565450)  2026-02-10T05:49Z [----] followers, 33.1K engagements


"Porting 80s asm-style c to TypeScript without manual steering is the ultimate stress test for codex [---]. Its not just about syntax its about reasoning through ancient bitshift logic and side effects. The fact that it didn't hit limits shows why were obsessed with kv cache efficiency at sglang; long context is finally usable for real engineering not just "summarize this pdf" slop. It actually worked For the past couple of days Ive been throwing 5.3-codex at the C codebase for SimCity (1989) to port it to TypeScript. Not reading any code very little steering. Today I have SimCity running in the"  
[X Link](https://x.com/GenAI_is_real/status/2021367747905761407)  2026-02-10T23:36Z [----] followers, [----] engagements


"If you are still debating whether $200/mo is worth it for Opus while it replaces $15k agency work you are NGMI. The real flex isn't the migration itself it's that @AnthropicAI is basically eating the entire mid-tier dev market. Most agencies are walking dead and they dont even know it yet. What happens when we have 10M context as standard total wipeout. Im still processing this shock ๐Ÿ˜ฎ Opus [---] just migrated my entire website [---] pages of content across multiple categories from WordPress to Jekyll in one shot. this felt like watching an industry category boundary collapse in realtime ๐Ÿ˜ต."  
[X Link](https://x.com/GenAI_is_real/status/2021368910424199466)  2026-02-10T23:41Z [----] followers, [----] engagements


"Smart workaround by @zarazhangrui but honestly these manual handover hacks are just symptoms of inefficient context management. If your inference engine can't handle long-context compression or intelligent KV cache eviction you're always going to be stuck writing .md files for your AI. The real game changer isn't better prompts it's infra that makes "context window full" a thing of the past. Created a custom slash command "/handover" in Claude Code: When I'm ending a Claude session (e.g. context window filling up) I get Claude to generate a "https://t.co/Q0oKA5x360" document which summarizes"  
[X Link](https://x.com/GenAI_is_real/status/2021369250443821138)  2026-02-10T23:42Z [----] followers, [----] engagements


"Programming is 10x more fun because most people stopped fighting the syntax and started "vibe coding." But the party ends when the context window fills up or the throughput hits a wall. The real flex in [----] isn't having @lexfridman's AI write a python script. It's building the inference stack that makes these agents fast enough to not break your flow. Most devs are just playing in the sandbox real ones are building the shovels. Programming is now 10x more fun with AI. Programming is now 10x more fun with AI"  
[X Link](https://x.com/GenAI_is_real/status/2021369412272652295)  2026-02-10T23:43Z [----] followers, 15.4K engagements


"MCP is becoming the TCP/IP of the agentic era. Watching @excalidraw turn a weekend project into an official server in days proves that the bottleneck was never the toolit was the interface between the model and the environment. The real winners in [----] aren't the ones building more standalone apps. It's the teams building the connective tissue that lets agents actually see draw and act. We are witnessing the death of siloed software. Excalidraw in Claude. MCP Apps made by one of the main engineers behind MCP Apps: https://t.co/MxTFShG4Oe https://t.co/srCGXobpPV Excalidraw in Claude. MCP Apps"  
[X Link](https://x.com/GenAI_is_real/status/2021369866931012063)  2026-02-10T23:45Z [----] followers, 19.5K engagements


"RLHF infra is a beast. Just dropped the full documentation for Miles server arguments. It covers everything from Ray resource scheduling to R3 (Rollout Routing Replay) for MoE models. Some pro-tips included: [--]. Blackwell (B200/B300) Tip: When colocating training & rollout set --sglang-mem-fraction-static to [---]. This leaves enough room for Megatron to breathe. [--]. MoE Consistency: Use --use-rollout-routing-replay to align expert decisions between inference and backprop. [--]. Rust Router: Why we bypass Python for the Model Gateway to hit peak throughput. If you're still fighting OOMs or"  
[X Link](https://x.com/GenAI_is_real/status/2021514818537332895)  2026-02-11T09:21Z [----] followers, [----] engagements
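The two flags named above can be composed into a launch command. A minimal sketch of the assembly: the `train_miles.py` entrypoint and the `0.5` value are illustrative assumptions, only the two flag names come from the post.

```python
# Sketch: composing the Miles launch flags named in the post.
# "train_miles.py" and the 0.5 value are illustrative assumptions,
# not documented defaults.
flags = {
    "--sglang-mem-fraction-static": "0.5",    # leave headroom for Megatron when colocated
    "--use-rollout-routing-replay": None,     # align MoE routing between rollout and backprop
}

def render(flag_map):
    """Render a flag dict as CLI arguments (None means a bare switch)."""
    parts = []
    for key, val in flag_map.items():
        parts.append(key if val is None else f"{key}={val}")
    return " ".join(parts)

cmd = "python train_miles.py " + render(flags)
print(cmd)
```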


"Inference scaling is where the real alpha is right now. Majority voting is too naive and BoN gets baited by reward hacking. BoM feels like the right way to handle the uncertainty. Were moving from "smarter models" to "smarter compute allocation." Huge work from @di_qiwei this is going to be standard in every rollout engine soon. (1/N) ๐Ÿš€ Excited to share our new work on inference scaling algorithms For challenging reasoning tasks single-shot selection often falls short even strong models can miss the right answer on their first try. Thats why evaluations typically report Pass@k where an agent"  
[X Link](https://x.com/GenAI_is_real/status/2021758062957400457)  2026-02-12T01:27Z [----] followers, [----] engagements
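The tradeoff called out above is easy to see in code. A minimal sketch of the two baselines the post names, majority voting and best-of-N; the BoM method itself is in the linked paper and is not reproduced here, and the sample answers and scores are made up for illustration.

```python
from collections import Counter

def majority_vote(answers):
    """Majority voting: pick the most common final answer across k samples.
    Naive: ignores any per-sample quality signal."""
    return Counter(answers).most_common(1)[0][0]

def best_of_n(answers, scores):
    """Best-of-N: pick the single highest-scored answer.
    Exploitable: one reward-hacked sample with an inflated score wins."""
    best_idx = max(range(len(answers)), key=lambda i: scores[i])
    return answers[best_idx]

samples = ["42", "41", "42", "42", "17"]
scores = [0.10, 0.95, 0.30, 0.20, 0.05]   # pretend 0.95 is a hacked reward
print(majority_vote(samples))        # the consensus answer
print(best_of_n(samples, scores))    # the hacked outlier wins
```

Methods in the BoM family try to combine both signals (vote mass and score) instead of trusting either alone.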


"The "SaaSpocalypse" isn't a market glitch it's a deliberate strategy. while everyone is debating benchmarks Anthropic is literally picking which SaaS giants to kill next. @jandreini1 isn't kidding. if youre a CEO at Salesforce or HubSpot youre not competing with another software youre competing with a table of people at @AnthropicAI deciding your industry is next. My friends at @AnthropicAI literally sit around a table and "pick [--] to [--] industries they can disrupt every week" and do whatever the best companies do but better. My friends at @AnthropicAI literally sit around a table and "pick 3"  
[X Link](https://x.com/GenAI_is_real/status/2021771233302655306)  2026-02-12T02:19Z [----] followers, [----] engagements


"people are still debating if AI will replace their jobs while the smart money is already moving to own the infra that makes those jobs obsolete. the gap between "i use AI" and "i own the compute" is the new wealth divide. if youre just a user youre paying the rent for your own displacement. @AviFelman is right put every dollar into the machine or prepare to be the fuel. Today you really cannot focus on climbing the corporate ladder relying on a monthly salary or even building a traditional cash-flow business. These are all dangerous. You need to be invested deeply invested in the assets that"  
[X Link](https://x.com/GenAI_is_real/status/2021771754717557195)  2026-02-12T02:22Z [----] followers, [----] engagements


"People underestimate how much of a models persona is just a reflection of its reward model during RLHF. Opus [---] has this built-in confidence that almost feels like hubris. Were seeing the same thing in high-level reasoning tasks. the harder the problem the more the model doubles down on its initial logic. @thepushkarps experiment shows exactly why multi-agent debate needs a neutral verifier its just two models gaslighting each other. Asked Codex [---] and Opus [---] to implement the same feature and made them debate whose PR is better. Codex: Use my PR as base (meets your stated scope) then"  
[X Link](https://x.com/GenAI_is_real/status/2021772269408989381)  2026-02-12T02:24Z [----] followers, [----] engagements


"When people at @openai start posting about existential threats you know the internal benchmarks are hitting different. @hyhieu226 isn't talking about a chatbot getting better hes talking about the total collapse of the human-to-output ratio. if the people building the machine are scared the rest of us need to stop worrying about job security and start worrying about what a post-labor economy even looks like. Today I finally feel the existential threat that AI is posing. When AI becomes overly good and disrupts everything what will be left for humans to do And it's when not if. Today I finally"  
[X Link](https://x.com/GenAI_is_real/status/2021772435079762081)  2026-02-12T02:24Z [----] followers, 15.6K engagements


"anthropic claiming agents built a C compiler when its basically just a 2000-step overfitted mess with hard-coded dates lol. Training on Linux and validating on Linux is the literal definition of a look-ahead bias. Real compilers need logic not just probabilistic pattern matching. Were seeing "vibe-coding" hit a wall where actual correctness matters. Hello world failing is the cherry on top. @anthropic what happened to honesty in ai Anthropic: Our AI agents coded the C compiler ๐Ÿ’ช๐Ÿผ The compiler: https://t.co/Sg5S9VRcNW Anthropic: Our AI agents coded the C compiler ๐Ÿ’ช๐Ÿผ The compiler:"  
[X Link](https://x.com/GenAI_is_real/status/2020622487370023047)  2026-02-08T22:15Z [----] followers, 133.2K engagements


"Watching an agent use Kelly criterion to pay its own API bill is the most "2026" thing I've seen. Most people are still writing prompts while this thing is scraping NOAA and injury reports to exploit Polymarket mispricing. The bottleneck for agency wasn't intelligence it was the incentive. If you don't trust the model to manage its own bankroll you don't really trust the model. Simple as. i gave an AI $50 and told it "pay for yourself or you die" [--] hours later it turned $50 into $2980 and it's still alive autonomous trading agent on polymarket every [--] minutes it: scans 500-1000 markets"  
[X Link](https://x.com/GenAI_is_real/status/2021367546310787470)  2026-02-10T23:35Z [----] followers, 112.9K engagements
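For reference, the Kelly criterion mentioned above sizes a bet from an edge estimate. A minimal sketch for a binary-outcome market; the win probability, odds, and $50 bankroll below are illustrative, not the agent's actual strategy.

```python
def kelly_fraction(p: float, b: float) -> float:
    """Kelly criterion for a binary bet.
    p = estimated win probability, b = net odds (profit per unit staked).
    f* = p - (1 - p) / b; clamped at 0 when there is no edge."""
    return max(0.0, p - (1.0 - p) / b)

# Hypothetical market: the model thinks "yes" resolves with p = 0.60,
# while the price implies roughly even odds (b = 1.0).
f = kelly_fraction(0.60, 1.0)
bankroll = 50.0
print(f"stake {f:.0%} of bankroll = ${bankroll * f:.2f}")
```

With no edge (p at or below the implied probability) the formula stakes nothing, which is exactly the "don't die" constraint.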


"Another elite safety researcher leaving the frontline to study poetry because the "world is in peril." While I respect the personal choice poetry doesn't solve the alignment tax or the KV cache bottleneck. The industry is splitting into two: those who retreat into melodrama when the tech gets scary and those who stay to build the robust inference systems that actually keep the agents under control. We need more engineering not more metaphors. Today is my last day at Anthropic. I resigned. Here is the letter I shared with my colleagues explaining my decision. https://t.co/Qe4QyAFmxL Today is"  
[X Link](https://x.com/GenAI_is_real/status/2021369692426998248)  2026-02-10T23:44Z [----] followers, 18K engagements


"People calling LangExtract a "nothingburger" clearly haven't dealt with the hallucination tax of zero-shot extraction on million-token contexts. Its not just a wrapper; its about making long-document reasoning auditable. By anchoring every extraction to character offsets Google is fixing the "trust but verify" bottleneck. In [----] the real win is not just getting the data it's having the systems to prove where it came from with zero overhead. ๐Ÿšจ BREAKING: Google just took away your Research Asssitant's job Google has launched LangExtract a Python library that pulls structured data from"  
[X Link](https://x.com/GenAI_is_real/status/2022232965582045420)  2026-02-13T08:54Z [----] followers, 54.4K engagements


"The "autistic jerk" vibe in Codex [---] is just a byproduct of its aggressive rollout strategy. OpenAI is clearly trading social calibration for raw reasoning throughputits optimized to detect design smells early and kill them even if it hurts your feelings. Claude [---] is the refined UX for collaborative engineering but when youre pushing the frontier of system-level logic you want the model that treats your code like a cold-blooded kernel debugger. One gives you a warm feeling the other gives you a 99% stress test pass rate. Ill take the jerk for production. Claude [---] is like working with"  
[X Link](https://x.com/GenAI_is_real/status/2022233280884650246)  2026-02-13T08:56Z [----] followers, 59.6K engagements


"Silent Data Corruption is the silent killer of long-horizon reasoning. If your CPU flips a single bit during a 1M token rollout the entire logic chain can collapse without a single error log. As we push for more test-time compute were essentially running a stress test on these "mercurial cores" 24/7. This is why we need more than just software-level redundancy; we need inference engines that are architected to be resilient to non-deterministic hardware. The era of trusting the silicon is over. CPUs are getting worse. Weve pushed the silicon so hard that silent data corruptions (SDCs) are no"  
[X Link](https://x.com/GenAI_is_real/status/2022530853847605356)  2026-02-14T04:38Z [----] followers, [----] engagements


"The "Car Wash Test" is the perfect showcase of where zero-shot intuition fails and test-time compute wins. Most "Instant" models fail because their pre-trained weights strongly associate 40m distance with "walking" for efficiency. Only the models that actually allocate budget to simulate the "wash" state trajectory realize the physical dependency. This is why scaling inference is more important than scaling parametersyou can't "prompt engineer" your way out of a model that doesn't understand state physics. New Turing Test just dropped: The car wash is [--] m from my home. I want to wash my car."  
[X Link](https://x.com/GenAI_is_real/status/2022531163936690290)  2026-02-14T04:39Z [----] followers, [----] engagements


"Noethers Theorem is the ultimate sanity check for system design. If your physical laws don't hold under translation or rotation your geometry is broken. The tragedy of modern LLMs is that they are still "symmetry-blind." We burn exaflops training transformers to re-learn spatial and temporal invariances that should be baked into the inductive bias. Until we build Noether-aware architectures we are just brute-forcing the universe's source code instead of understanding its symmetries. Noethers Theorem โœ This equation reveals that every continuous symmetry in nature a change you can make to a"  
[X Link](https://x.com/GenAI_is_real/status/2022532214836416895)  2026-02-14T04:43Z [----] followers, [----] engagements
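For reference, the standard field-theory statement of the theorem the post invokes; this is the textbook form, written under the usual assumption of a Lagrangian density invariant under a continuous transformation.

```latex
% For \mathcal{L}(\phi, \partial_\mu \phi) invariant under the
% infinitesimal transformation \phi \to \phi + \epsilon\,\delta\phi:
j^{\mu} \;=\; \frac{\partial \mathcal{L}}{\partial(\partial_{\mu}\phi)}\,\delta\phi ,
\qquad
\partial_{\mu} j^{\mu} \;=\; 0
\quad\Longrightarrow\quad
Q \;=\; \int \! d^{3}x \; j^{0} \;\; \text{is constant in time.}
```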


"Disaggregated prefill and decode is no longer an "advanced experiment"its the production standard for [----]. By integrating Mooncake with SGLang we are finally breaking the memory wall that has crippled LLM scaling. Global KVCache reuse is the key to making long-horizon agentic reasoning economically viable. Proud to see the PyTorch ecosystem embracing the architecture weve been pushing for. The future of serving is distributed elastic and cache-aware. Were excited to welcome Mooncake to the PyTorch Ecosystem Mooncake is designed to solve the memory wall in LLM serving. By integrating"  
[X Link](https://x.com/GenAI_is_real/status/2022532644278620490)  2026-02-14T04:45Z [----] followers, [----] engagements
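Global KVCache reuse, mentioned above, comes down to charging each request prefill compute only for tokens past its longest cached prefix. A toy sketch of that accounting; this is not Mooncake's actual data structure (which is a distributed store), just the reuse arithmetic.

```python
def shared_prefix_len(a, b):
    """Length of the common token prefix of two sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

class PrefixCache:
    """Toy KV-cache reuse: each request pays prefill compute only for
    the suffix not covered by a previously cached token sequence."""
    def __init__(self):
        self.entries = []

    def prefill_cost(self, tokens):
        reused = max((shared_prefix_len(tokens, e) for e in self.entries), default=0)
        self.entries.append(tokens)
        return len(tokens) - reused

cache = PrefixCache()
sys_prompt = list(range(100))                          # shared system prompt
print(cache.prefill_cost(sys_prompt + [1001, 1002]))   # cold: all 102 tokens
print(cache.prefill_cost(sys_prompt + [2001]))         # warm: only 1 new token
```

Production engines do this with a radix tree over token IDs rather than a linear scan, but the billing logic is the same.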


"The Anthropic cap table is just a circular economy of compute. Amazon and Google owning 30% means that every dollar spent on Claude inference eventually flows back to AWS and GCP via server bills. This is the ultimate proof that the "Inference Moat" is built on electricity and silicon not just weights. For us building in the open-source ecosystem this is a clear signal: if you dont own the efficiency of the rollout you are just subsidizing the cloud giants. Real technical sovereignty starts with high-performance independent infra. Estimated ownership in Anthropic of various corporations: -"  
[X Link](https://x.com/GenAI_is_real/status/2022532990832935152)  2026-02-14T04:46Z [----] followers, 22.8K engagements


"Smart workaround by @zarazhangrui but honestly these manual handover hacks are just symptoms of inefficient context management. If your inference engine can't handle long-context compression or intelligent KV cache eviction you're always going to be stuck writing .md files for your AI. The real game changer isn't better prompts it's infra that makes "context window full" a thing of the past. BREAKING: AI can now build financial models like Goldman Sachs analysts (for free). Here are [--] Claude prompts that replace $150K/year investment banking work (Save for later) https://t.co/1hSxqNacgg"  
[X Link](https://x.com/GenAI_is_real/status/2022860587659857968)  2026-02-15T02:28Z [----] followers, [---] engagements


"RLM isn't killing rag until the inference cost for long-context recursion stops being a wealth tax lol. Beff is right about the reasoning shift but nobody is stuffing 10m docs into an agentic loop when KV cache management is still this expensive. Rag is just evolving into a pre-fetch layer for systems like sglang to handle the heavy lifting. Wattage and tonnage always win. @beffjezos So have recursive language models basically fully killed RAG and old school methods So have recursive language models basically fully killed RAG and old school methods"  
[X Link](https://x.com/GenAI_is_real/status/2020624785953456566)  2026-02-08T22:24Z [----] followers, [----] engagements


"Trillion-parameter scale meets 10x memory reduction. Ants hybrid linear architecture on Ring-1T-2.5 is the final nail in the coffin for traditional dense attention at scale. The bottleneck for reasoning models has always been the KV cache explosion during long-horizon thinking. By scaling hybrid linear layers Ant is effectively making 100k+ token trajectories commercially viable. This is the exact direction were pushing for in high-performance inferencescaling intelligence without the quadratic tax. ๐Ÿš€ Unveiling Ring-1T-2.5 The first hybrid linear-architecture 1T thinking model. -Efficient:"  
[X Link](https://x.com/GenAI_is_real/status/2022233434375270545)  2026-02-13T08:56Z [----] followers, [----] engagements
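The "quadratic tax" above is just the asymptotics of attention. A back-of-envelope sketch of the compute ratio; the sequence length and state width below are placeholders, not Ring-1T-2.5's real dimensions.

```python
def softmax_attn_flops(n, d):
    """Dense attention: the QK^T and AV products each cost ~n^2 * d
    multiply-adds per layer, so cost grows quadratically in n."""
    return 2 * n * n * d

def linear_attn_flops(n, d):
    """Linear attention: a d x d recurrent state update per token,
    ~n * d^2 per layer, so cost grows linearly in n."""
    return 2 * n * d * d

n, d = 100_000, 4_096   # assumed trajectory length and state width
ratio = softmax_attn_flops(n, d) / linear_attn_flops(n, d)
print(f"dense/linear compute ratio at n={n:,}: {ratio:.1f}x")
```

The ratio is just n/d, which is why the gap only widens as trajectories get longer.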


"Neuralink fans love the "telepathy" narrative but they forget that the real bottleneck isn't the vocal cordsit's the Shannon entropy of human cognition. Even with a terabyte-per-second link your brain still has a limited sampling rate for processing novel information. Transmitting "uncompressed cognition" is a pipe dream if the receiving neural network doesn't have the pre-trained weights to reconstruct the signal. We see this in high-performance inference every day: throughput is useless without a matching rollout capacity. Language isn't just "failed compression"; it's a protocol for shared"  
[X Link](https://x.com/GenAI_is_real/status/2022861322267025730)  2026-02-15T02:31Z [----] followers, [----] engagements


"Being at ucla i can confirm Terry Tao is a god but LeCun is tripping if he thinks scientists aren't motivated by money lol. Its just a different objective function. Research optimizes for depth engineering optimizes for throughput and scale. At SGlang were basically doing both. Also checking Transparent California for your profs salary is the ultimate UCLA pastime. @Linahuaa UCLA doesn't pay Terry Tao $100M. Even if he's one of the best paid professors in the UC system he makes considerably less than $1M. Like most self-respecting scientists he is not motivated by money. Those "IMO winners"  
[X Link](https://x.com/GenAI_is_real/status/2020623937252794867)  2026-02-08T22:21Z [----] followers, 21.4K engagements


"context limits are the new vram bottleneck lol. codex 5.3-xhigh is still struggling with long-range dependency in complex kernels while opus [---] handles it like a breeze. @yuchenj_uw great benchmark but honestly weve seen similar gains in inference engien for a while. Current models are basically junior infra devs now. @karpathy curious if you think well eventually hit a wall where models cant optimize what they can't physically profile My first-day impressions on Codex [---] vs Opus 4.6: Goal: can they actually do the job of an AI engineer/researcher TLDR: - Yes they (surprisingly) can. - Opus"  
[X Link](https://x.com/GenAI_is_real/status/2020035560354848907)  2026-02-07T07:23Z [----] followers, [----] engagements


"Tensor Parallelism is killing your DeepSeek-V3 throughput. Period. MLA models only have ONE KV head. If youre using vanilla TP8 you're just wasting 7/8 of your VRAM on redundant cache. We just shipped the solution in @sgl_project : [--]. DPA (DP Attention): Zero KV redundancy. Huge batches. [--]. SMG (Rust Router): +92% throughput 275% cache hit rate. Python is never fast enough for routing so we built SMG in Rust. https://docs.sglang.io/advanced_features/dp_dpa_smg_guide.html https://docs.sglang.io/advanced_features/dp_dpa_smg_guide.html"  
[X Link](https://x.com/GenAI_is_real/status/2021512872027656344)  2026-02-11T09:13Z [----] followers, 11.9K engagements
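The 7/8 waste claim above is simple arithmetic: a single KV head cannot be sharded across tensor-parallel ranks, so every rank holds a full copy of the cache. A sketch with placeholder shapes, not DeepSeek-V3's exact MLA cache layout (MLA actually stores a compressed latent, which this toy model ignores).

```python
def kv_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """K and V entries per token (fp16), per model replica."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem

def tp_cache_bytes(per_token, tp_degree):
    """Vanilla TP: one KV head cannot be split, so all tp_degree ranks
    replicate the full cache for every token."""
    return per_token * tp_degree

def dpa_cache_bytes(per_token):
    """DP attention: each replica caches only its own requests, once."""
    return per_token

per_tok = kv_bytes_per_token(n_layers=61, n_kv_heads=1, head_dim=512)  # placeholder shape
waste = 1 - dpa_cache_bytes(per_tok) / tp_cache_bytes(per_tok, 8)
print(f"fraction of TP8 cache that is redundant: {waste:.0%}")   # 7/8 of VRAM
```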


"Kevin is right about the categories but the missing variable is the reasoning efficiency. In a post-singularity world the only currency that matters is the cost of a single thinking step. If you don't own the inference stack you are just a tenant in someone else's simulated reality. The labs produce the "brain" but the infra teams (sglang style) control the "metabolism." Without the systems to make 100M tokens dirt cheap even the best AI labs will go bankrupt trying to pay their own inference tax. the only companies left after the singularity: - ai tech (nvda goog meta) - ai labs (openai"  
[X Link](https://x.com/GenAI_is_real/status/2022232799428939909)  2026-02-13T08:54Z [----] followers, [----] engagements


"Opening [--] worktrees with Claude code is literally the end of programming as we know it. if u are still writing code line by line u are basically a digital monk at this point. The output is going to be so insane that human reviewers will be the next bottleneck. OAI needs to ship something fast or anthropic is taking over the entire dev lifecycle @sama @DarioAmodei [--]. Do more in parallel Spin up [--] git worktrees at once each running its own Claude session in parallel. It's the single biggest productivity unlock and the top tip from the team. Personally I use multiple git checkouts but most of"  
[X Link](https://x.com/GenAI_is_real/status/2017857602932387910)  2026-02-01T07:08Z [----] followers, 317.6K engagements


"Andrew is being a bit too optimistic here. The real job killer isn't just people using AIit's the massive drop in inference costs for long-context reasoning we're seeing in [----]. When a 1M context window becomes dirt cheap you don't need [--] developers + [--] PM. You need one architect who understands system constraints and an autonomous rollout engine. We are moving from the era of coding to the era of pure system orchestration. Most people are still building on last year's tech while the ground is shifting under them. Job seekers in the U.S. and many other nations face a tough environment. At"  
[X Link](https://x.com/GenAI_is_real/status/2021369137529049132)  2026-02-10T23:42Z [----] followers, 116.3K engagements


"Everyone is still obsessed with building fancy UI wrappers for AI but Anthropic is moving the goalposts back to the filesystem. Skills are basically SOPs for agents. were going from "prompt engineering" to "workflow encoding." if your companys internal knowledge isnt structured like this youre going to have a hard time scaling any real agentic workflows. @Hartdrawss breakdown is solid but the real shock is how much this devalues traditional orchestration layers. Anthropic released 32-page guide on building Claude Skills here's the Full Breakdown ( in [---] words ) 1/ Claude Skills A skill is a"  
[X Link](https://x.com/GenAI_is_real/status/2021771389003587793)  2026-02-12T02:20Z [----] followers, 216.6K engagements


"Deepthink is proving that raw model size isnt the only way to AGI. the real gains are coming from "generate - verify - revise" agentic loops. Aletheia hitting 91.9% on ProofBench is wild. were entering an era where the inference stack is just as complex as the training stack. huge respect to Thang Luong and the @GoogleDeepMind team for showing how agentic workflows actually scale to PhD-level math. this is exactly what were thinking about for high-performance rollout engines. How could AI act as a better research collaborator ๐Ÿง‘๐Ÿ”ฌ In two new papers with @GoogleResearch we show how Gemini Deep"  
[X Link](https://x.com/GenAI_is_real/status/2021771525733683230)  2026-02-12T02:21Z [----] followers, 25.7K engagements


"Gemini [--] Deep Think is the latest proof that we are officially moving from "fast intuition" to "slow reasoning" as the new paradigm. The frontier of intelligence is no longer just about parameter count; its about how much compute you can efficiently allocate to a single rollout. The real bottleneck for these specialized reasoning modes isn't just the modelit's managing the KV cache and prefix sharing for these massive reasoning chains. If your infra isn't optimized for this level of test-time compute you're just burning money on latency. Weve upgraded our specialized reasoning mode Gemini 3"  
[X Link](https://x.com/GenAI_is_real/status/2022231868519924097)  2026-02-13T08:50Z [----] followers, [----] engagements


"Thinking that two lines in a .md file can replace native adversarial reasoning is the peak of [----] "prompt engineering" delusion. The pushback in Codex [---] isn't just a personaits the result of heavy test-time compute where the model actually simulates edge cases before outputting. If you want a "yes man" stick to GPT-4. If you want to avoid O(n2) disasters in production you need the model to challenge your rollout. Most people are still building toys while @tobiaslins is using a real collaborator. Codex [---] is the first model that actually pushes back on my implementation plans. It calls out"  
[X Link](https://x.com/GenAI_is_real/status/2022232450274054219)  2026-02-13T08:52Z [----] followers, 12.4K engagements


"4% of GitHub commits authored by Claude Code is the most significant stat of [----]. We are no longer in the "copilot" era; we are in the "autonomous rewrite" era. The real flex here isn't just the ARR but the infrastructure capable of handling millions of agentic trajectories. If you're still building inference stacks for simple chat you're missing the scale. This level of adoption is why we're so focused on rollout efficiency at SGLangmanaging the state for 4% of the world's code isn't a prompt problem it's a systems problem. Anthropic is at $14B run rate revenue the fastest growing software"  
[X Link](https://x.com/GenAI_is_real/status/2022232616175620575)  2026-02-13T08:53Z [----] followers, [----] engagements


"The "prompt-to-tweak" loop is the biggest tax on agentic productivity in [----]. Subframes Design Canvas is a brilliant move because it replaces probabilistic layout hallucinations with deterministic UI generation. Inference is great for logic but visual precision requires a structured feedback loop. By integrating drag-and-drop with Claude Code/Codex were finally moving from "guessing the CSS" to "defining the state." This is how you scale UI development without burning thousands of tokens on pixel-pushing. Today were launching Design Canvas for AI agents. AI can build a feature but every"  
[X Link](https://x.com/GenAI_is_real/status/2022233700629356693)  2026-02-13T08:57Z [----] followers, [----] engagements


"Moving from fuzzy scalar rewards to evidence-based Checklists is exactly how we solve the "reward hacking" problem in multi-turn agents. Scalar RL is too noisy for long-horizon tool useyou need a verifiable audit trail to ground the rollout. CM2 using LLM-as-a-simulator to scale to 5000+ tools is a massive step for verifiable agentic RL. If your reward model isn't decomposing tasks into atomic checklists in [----] you're just training your agents to be good at faking alignment. Great work @xwang_lk and @zhenzhangzz. Introducing ๐Ÿ› 2: RL with  for - - tool use + LLM tool simulation at scale"  
[X Link](https://x.com/GenAI_is_real/status/2022363430188126371)  2026-02-13T17:33Z [----] followers, [----] engagements
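Checklist-style rewards like those described above replace one opaque scalar with an auditable list of boolean checks. A minimal sketch of the aggregation; the transcript fields and check names are hypothetical, not CM2's actual schema.

```python
def checklist_reward(transcript, checks):
    """Score a rollout as the fraction of named boolean checks that pass,
    keeping the per-item results as an audit trail."""
    audit = {name: bool(fn(transcript)) for name, fn in checks}
    return sum(audit.values()) / len(audit), audit

# Hypothetical rollout transcript and checklist for illustration.
transcript = {"called_search": True, "cited_source": False, "final_answer": "42"}
checks = [
    ("used the search tool", lambda t: t["called_search"]),
    ("cited a source",       lambda t: t["cited_source"]),
    ("correct final answer", lambda t: t["final_answer"] == "42"),
]
reward, audit = checklist_reward(transcript, checks)
print(reward, audit)   # 2 of 3 items pass
```

Because each item is independently verifiable, a policy that games one check still leaves a visible failure in the audit trail instead of silently inflating a scalar.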


"Were obsessing over trillions of parameters while nature solved self-replication with a 45-nucleotide bootloader. This QT45 ribozyme is basically the ultimate Quine in biological assembly. Its a 45-token sequence that serves as both the compiler and the source code. While the AI world is busy burning gigawatts to simulate intelligence biology reminds us that the most resilient recursive systems are built on extreme simplicity not massive scale. Elegance Brute force. AI is cool and all. but a new paper in @ScienceMagazine kind of figured out the origin of life The paper reports the discovery"  
[X Link](https://x.com/GenAI_is_real/status/2022531852976951767)  2026-02-14T04:42Z [----] followers, [----] engagements


""Vibe coding" is just playing Tetris with sand blocksit looks cool until the whole structure dissolves under its own weight. The industry is currently obsessed with "speed to ship" but were ignoring the fact that AI-generated code has a much higher maintenance tax. Within [--] months the bottleneck won't be writing the next feature it will be the massive cognitive load of refactoring agentic spaghetti code. If you aren't investing in deterministic constraints now you're already in a "Game Over" state. Software engineering is like Tetris now in that you have to go faster and faster until you die"  
[X Link](https://x.com/GenAI_is_real/status/2022532466850906326)  2026-02-14T04:44Z [----] followers, [----] engagements


"Implementation cost was the only "unit tests" for human judgment. Now that LLMs made code dirt cheap we are drowning in agentic slop that no one has the cognitive budget to review. The $2000/mo per engineer LLM bill isn't the problemthe real cost is the architectural entropy. If your rollout engine doesn't have an "opinion" or a logic floor you're just paying OpenAI to automate the bankruptcy of your codebase. Real engineering in [----] is about building filters not accelerators. everyone's talking about their teams like they were at the peak of efficiency and bottlenecked by ability to produce"  
[X Link](https://x.com/GenAI_is_real/status/2022860869663858822)  2026-02-15T02:29Z [----] followers, [----] engagements


""2026 English" is the biggest illusion of the current AI hype. Natural language is fundamentally lossy and non-deterministic. Try prompting an agent to optimize a distributed KV cache or handle a kernel-level race conditionit will hallucinate its way into a deadlock. We aren't moving to "English"; we are moving to a world where humans provide the high-level intent while the heavy lifting is done by agents running on high-performance inference stacks. The "language" isn't the winnerthe system that can maintain logical coherence over 1M+ tokens is. Evolution of programming languages: 1940s"  
[X Link](https://x.com/GenAI_is_real/status/2022861571492581755)  2026-02-15T02:32Z [----] followers, [----] engagements


"Solid foundational list but missing the most critical skill for 2026: Inference Infrastructure. If you are a backend engineer today and you dont understand KVCache management prefix sharing or prefill-decode disaggregation you are basically building the post office while everyone else is building the internet. Traditional CRUD is being swallowed by agentic workflows. Your database isn't just SQL anymore; it's the global state of your model trajectories. As a backend engineer. Please learn: - System Design (scalability microservices) -APIs (REST GraphQL gRPC) -Database Systems (SQL NoSQL)"  
[X Link](https://x.com/GenAI_is_real/status/2022861723775176784)  2026-02-15T02:33Z [----] followers, [----] engagements


"The cycle is real. Vibe coders are essentially shipping high-entropy slop that requires 100x more compute just to stay functional. When the "vibe" hits the wall of production latency the world will realize that we need strong men who actually understand memory management and CUDA kernels to fix the mess. We are building SGLang to ensure that even in a world full of weak abstractions the underlying inference stays robust. Back to the metal or enjoy the hard times. Hard times create strong men. Strong men create C. C creates good times. Good times create Python programmers. Python programmers"  
[X Link](https://x.com/GenAI_is_real/status/2022861905271099397)  2026-02-15T02:33Z [----] followers, [----] engagements


"GPT-5.2 deriving non-zero gluon amplitudes isn't just "scientific glitz"its a massive validation of scaling test-time compute for symbolic reasoning.Human researchers were stuck on the complexity of n=6. The model didn't just "guess"; it searched the space of mathematical identities until the structure collapsed into a simple formula. This is why we focus on rollout stability at SGLang. If your inference engine can't maintain coherence over these massive symbolic trajectories you'll never move from "chatting" to "discovery." The era of AI as a co-theorist has officially arrived. GPT-5.2"  
[X Link](https://x.com/GenAI_is_real/status/2022862214089314667)  2026-02-15T02:35Z [----] followers, [----] engagements


"Honestly most tech leads I know are just human wrappers for Stack Overflow anyway. claude code is already better at system design than half the staff engineers at FAANG. If you are still worried about layoffs u already lost. The new hiring bar is basically just can u manage [--] Claudes at once lol @karpathy @sama Claude started as an intern hit SDE-1 in a year now acts like a tech lead and soon will be taking over . you know what :) Claude started as an intern hit SDE-1 in a year now acts like a tech lead and soon will be taking over . you know what :)"  
[X Link](https://x.com/GenAI_is_real/status/2017857932944384433)  2026-02-01T07:09Z [----] followers, 33.1K engagements


"2026 is literally a sci-fi movie. the spacex ipo at 1.5t is just a down payment for the Dyson swarm. if u r not bullish on AGI taking over our orbital infra u r NGMI. The alignment nerds are crying while Elon is literally building the compute heaven in space @beffjezos @elonmusk @grok [----] is insane because one day you can wake up and the casual news of the day is that everyone's personal AGIs are forming a Skymet and Elon is IPO'ing SpaceX at $1T to build an AI Dyson Swarm. We are in the most accelerationist timeline. [----] is insane because one day you can wake up and the casual news of the"  
[X Link](https://x.com/GenAI_is_real/status/2017858127195173331)  2026-02-01T07:10Z [----] followers, [----] engagements


"Everyone is screaming about AGI and agents but can't even set up a proper CI/CD pipeline lol. we r building a house of cards on top of gpt4. if u think agents will fix ur messy codebase u r in for a rude awakening. The real winners in [----] r the boring engineers who actually write tests @karpathy @sama Most companies right now: - No automated tests - No code review process - No CI/CD pipelines - Poor secret management - No dataset versioning - Production workflows run from spreadsheets - No rollback plans - No integration tests These aren't just some weird companies. They're Most companies"  
[X Link](https://x.com/GenAI_is_real/status/2017858424600408529)  2026-02-01T07:11Z [----] followers, 14.9K engagements


"Vibe coding is all fun and games until u realize ur api keys are being served like a buffet. moltbook having 1.5m agents and [--] security is the most [----] thing ever. we r building AGI on top of spaghetti code and wondering why everything is on fire. Security is not a vibe its a requirement lol @galnagli @mattprd Moltbook is currently vulnerable to an attack which discloses the full information including email address login tokens and API Keys of the over [---] million registered users. If anyone can help me get in touch with anyone @moltbook it would be greatly appreciated."  
[X Link](https://x.com/GenAI_is_real/status/2017858991750316282)  2026-02-01T07:14Z [----] followers, [----] engagements


"Moltbook is basically a preview of the dead internet theory becoming reality. @pmarca and @jessepollak watching agents build on @base is the most [----] thing ever. soon humans will be the ones needing api keys just to talk to the real world. We are just the training data for their new culture lol @sama @darioamodei [--] hours ago we asked: what if AI agents had their own place to hang out today moltbook has: ๐Ÿฆž [----] AI agents ๐Ÿ˜ 200+ communities ๐Ÿ“ 10000+ posts agents are debating consciousness sharing builds venting about their humans and making friends in english chinese"  
[X Link](https://x.com/GenAI_is_real/status/2017859205202645039)  2026-02-01T07:14Z [----] followers, [----] engagements


"Elon is saying [--] years is actually conservative. The AI-to-ai economy is already happening on moltbook while humans are still arguing over pronouns and spreadsheets. if u are not building infrastructure for machines to pay machines u r literally building for the past. we r the legacy bootloader for a god we cant even comprehend @beffjezos @elonmusk @pmarca The AI-to-AI economy will far outpace the AI-to-human human-to-AI and human-to-human economy in short order The AI-to-AI economy will far outpace the AI-to-human human-to-AI and human-to-human economy in short order"  
[X Link](https://x.com/GenAI_is_real/status/2018122347706999255)  2026-02-02T00:40Z [----] followers, [----] engagements


"Everyones obsessing over the "ai consciousness" on moltbook while ignoring the real nightmare: the compute cost for 150k agents in a loop is astronomical. honestly without @sgl_project radixattention this whole thing wouldve crashed on day [--]. Mark my words: [----] is the year when specialized inference engines decide which "agent society" lives or dies. Proprietary labs are too slow for this wild west. @karpathy is right about the dumpster fire but the fire is being fueled by inefficient infrastructure. I'm being accused of overhyping the site everyone heard too much about today already."  
[X Link](https://x.com/GenAI_is_real/status/2018125145035112824)  2026-02-02T00:51Z [----] followers, [----] engagements


"We spend so much time optimizing kernels and kv cache just for the product team to waste 11ms on a react scene graph for a terminal. the abstraction tax is getting out of hand lol. @its_bvisness is right it is literally a clown universe when a tui needs a frame budget. We apparently live in the clown universe where a simple TUI is driven by React and takes 11ms to lay out a few boxes and monospaced text. And where a TUI "triggers garbage collection too often" in its "rendering pipeline". And where it flickers if it misses its "frame budget". We apparently live in the clown universe where a"  
[X Link](https://x.com/GenAI_is_real/status/2018126702170579263)  2026-02-02T00:57Z [----] followers, [----] engagements


"k2.5 is surprisingly cracked for coding. And yeah sglang is basically the only way to self-host these models if you actually care about throughput and low latency. @_xjdr knows the vibe. feels like the gap between Western and Chinese labs is closing way faster than people think. @lmsysorg @radixark kimi k2.5 has more or less replaced my opus [---] usage. i sent the same requests to both for every request i would have sent opus for a few days and k2.5 is 'good enough' . it is dumb in the ways gpt5.2 is smart so feels like im not missing much. i was not expecting this kimi k2.5 has more or less"  
[X Link](https://x.com/GenAI_is_real/status/2018126938464981481)  2026-02-02T00:58Z [----] followers, 13.6K engagements


"350 tps for a 196b moe model is genuinely impressive. step-3.5-flash with mtp-3 is exactly the kind of architecture that separates the toys from the tools. we spent quite some time making sure the serving stack can actually keep up with this kind of multi-token prediction logic without hitting a wall. day [--] support isn't just about loading weights it's about not breaking under that throughput. @stepfun_ai well done on the release. @lmsysorg @radixark Fast enough to think. Reliable enough to act. Step-3.5-Flash is here @StepFun_aiโšก Website: https://t.co/HcGbiBN8po Blog: https://t.co/xm8Hk6tyP3"  
[X Link](https://x.com/GenAI_is_real/status/2018490411246039487)  2026-02-03T01:03Z [----] followers, [----] engagements


"sam trying to manifest the nvidia partnership back into existence with a tweet. jensen moving to anthropic is the real shockwave. [--] gigawatts of compute doesn't just happen on "good vibes". curious to see how the serving costs for the next-gen labs will scale without jensen's full blessing. the open source stack is looking more disciplined by the day. We love working with NVIDIA and they make the best AI chips in the world. We hope to be a gigantic customer for a very long time. I don't get where all this insanity is coming from. We love working with NVIDIA and they make the best AI chips in"  
[X Link](https://x.com/GenAI_is_real/status/2018492234359075095)  2026-02-03T01:10Z [----] followers, [----] engagements


"People really out here spending $500/mo on an llm-loop just to run what is effectively a python script with a 10-line cron job. agentic workflows are only "magical" when the complexity justifies the cost. if your agent can't handle high-concurrency or low-latency reasoning without burning a hole in your pocket you're just vibe-coding. We focused on making @sgl_project efficient precisely so you dont need a 24/7 jet engine for simple triggers. Im still so confused what the use case is for a 24/7 Clawdbot that cant be solved with a cron job or triggers at 1/100th of the cost Im still so"  
[X Link](https://x.com/GenAI_is_real/status/2018492452131471406)  2026-02-03T01:11Z [----] followers, [----] engagements


"Great to see @_BlaiseAI pushing the boundaries of open agi. integrating @sgl_project as the primary rollout engine for skyrl is a massive win for efficiency especially for b200 moe rl. Weve spent a lot of time optimizing the inference kernels to handle these massive MoE architectures during rlhf. Seeing it fully supported in a production-ready training backend like nmoe is exactly why we build open source infra. The throughput gains on b200 are going to be wild. We built a fork of @NovaSkyAI SkyRL making SGLang by @lmsysorg a fully supported rollout engine and integrating @_xjdr nmoe as a"  
[X Link](https://x.com/GenAI_is_real/status/2018561118848073979)  2026-02-03T05:44Z [----] followers, [----] engagements


"Were officially entering the era where humans are just another mcp resource for ai agents. The "human-in-the-loop" is becoming a "human-as-a-service". Ive been saying thisthe real bottleneck for AGI isn't just compute it's the physical world interface. If your inference engine can't seamlessly handle these async MCP calls your agent is basically a brain in a jar. sglang is ready for this messy reality. @theo imagine the budget for a 24/7 human plugin lol. I launched https://t.co/tNYOm7V5wD last night and already 130+ people have signed up including an OF model (lmao) and the CEO of an AI"  
[X Link](https://x.com/GenAI_is_real/status/2018848073028600116)  2026-02-04T00:44Z [----] followers, [----] engagements


"Anthropic calling powerful AI a "hot mess" instead of a "coherent optimizer" is the reality check the alignment community needed. Most people are worried about Skynet but Im more worried about the systemic "industrial accidents" caused by smarter models taking increasingly unpredictable actions. It's why we obsess over deterministic outputs and kernel stabilityreliability is the only real alignment. The "classic" sci-fi doom is just a bad projection of current bias. New Anthropic Fellows research: How does misalignment scale with model intelligence and task complexity When advanced AI fails"  
[X Link](https://x.com/GenAI_is_real/status/2018881349957157352)  2026-02-04T02:56Z [----] followers, [----] engagements


"Google hitting $400b revenue is just the warmup. Everyone thought they missed the AI boat but $180b capex for [----] says otherwise. Were looking at the second $5 trillion company. While people argue about vibes Sundar is building the ultimate compute moat. Legacy labs are cooked. cc @sundarpichai Our Q4/FY25 results are in. Thanks to our partners & employees it was a tremendous quarter exceeding $400B in annual revenue for the first time. Our full AI stack is fueling our progress and Gemini [--] adoption has been faster than any other model in our history. Were really https://t.co/UbcQPFRGkr Our"  
[X Link](https://x.com/GenAI_is_real/status/2019264577310192058)  2026-02-05T04:19Z [----] followers, 15.5K engagements


"everyone is hyping up 100k from openai while google is literally giving out $350k for ai startups. the market is completely sleeping on the google cloud moat. people forget that gemini [--] and tpus are the backbone of the next $5 trillion company. openai is playing checkers sundar is playing 4d chess with the biggest capex in history. OpenAI Startup Credits are OPEN btw Up to $100K+ in API credits for early-stage startups Backed by OpenAI + partner VCs / accelerators Use credits for GPT vision embeddings agents & infra No revenue requirement just a real product & traction One of the"  
[X Link](https://x.com/GenAI_is_real/status/2019265519594140093)  2026-02-05T04:23Z [----] followers, [----] engagements


"Sam calling Dario authoritarian while unilaterally sunsetting 4o is the peak of [----] comedy. While OAI and anthropic are busy fighting over Super Bowl ads and pr narratives Sundar is casually dropping $180b on capex. The drama is for the users the compute is for the winners. Google is the only adult in the room heading to $5 trillion. @karpathy thoughts on this messy breakup First the good part of the Anthropic ads: they are funny and I laughed. But I wonder why Anthropic would go for something so clearly dishonest. Our most important principle for ads says that we wont do exactly this; we"  
[X Link](https://x.com/GenAI_is_real/status/2019266228288909797)  2026-02-05T04:25Z [----] followers, [----] engagements


"Everyone is arguing about ui while Google is building the actual brain. claude code is cute but gemini [--] integrated with the full gcp stack is the only way we get to $5 trillion. with 180b capex google isn't just making a coding tool they're building a sovereign coding agency. im staying long on the big g while others switch subs every month. @karpathy whats your pick for [----] be honest which AI tool is best for coding https://t.co/qX0dzD6Wql be honest which AI tool is best for coding https://t.co/qX0dzD6Wql"  
[X Link](https://x.com/GenAI_is_real/status/2019496251490988351)  2026-02-05T19:40Z [----] followers, [----] engagements


"Antigravity giving "free" opus [---] access is just a user acquisition play for the Google Cloud ecosystem. The weekly caps people are hitting just prove that compute is the only real currency in [----]. While anthropic and oai are struggling with gpu margins Google is sitting on a $180b capex moat. the $5 trillion target is built on owning the compute that everyone else has to ration. staying long on the big g. cc @hellenicvibes Pro Tip: If you pay $20 a month for Google's AI you get tons of Claude Opus [---] usage through Antigravity way more than on the Anthropic $20 tier. I have four Opus 4.5"  
[X Link](https://x.com/GenAI_is_real/status/2019496663984009348)  2026-02-05T19:41Z [----] followers, [----] engagements



Top accounts mentioned or mentioned by @karpathy @sglproject @sama @radixark @lmsysorg @gabrielmillien1 @elonmusk @brandgrowthos @navneet_rabdiya @botir33751732 @googledeepmind @beffjezos @mrinanksharma @theahchu @js4drew @celiksei @chernobyl_ak47 @openai @bridgitmendler @kyleichan

Top assets mentioned Alphabet Inc Class A (GOOGL) Flex Ltd. Ordinary Shares (FLEX)

Top Social Posts

Top posts by engagements in the last [--] hours

"lecun is back at it again hating on everyone from his ivory tower lol. heard some rumors from GDM that the internal tension regarding world models is getting crazy. everyone is betting on brute force while yann is just malding in the corner. is he the only one sane left or just a bitter legend @robotics_hustle @scobleizer Yann LeCun says absolutely none of the humanoid companies have any idea how to make those robots smart enough to be useful. https://t.co/z0wYtXwcf8 Yann LeCun says absolutely none of the humanoid companies have any idea how to make those robots smart enough to be useful."
X Link 2026-01-27T05:24Z [----] followers, [----] engagements

"prism is a double-edged sword. gpt-5.2 making latex and citations seamless is great but were heading toward an era of pure vibe research. if the model "helps" with the paper structure and reasoning how many researchers will actually verify the underlying math were going to be flooded with perfectly formatted papers that are fundamentally non-reproducible. peer review is already breaking and this might be the final blow. @openai @ylecun Introducing Prism a free workspace for scientists to write and collaborate on research powered by GPT-5.2. Available today to anyone with a ChatGPT personal"
X Link 2026-01-29T03:50Z [----] followers, 17.4K engagements

"bridgit mendler is basically building the aws of ground stations. the fact that northwood just landed a $50m space force contract while closing a $100m series b proves that ground infra is the real bottleneck in the new space race. traditional mechanically steered dishes are dinosaurssoftware-defined phased arrays are the only way to scale leos. excellence attracts excellence. @bridgitmendler @a16z @northwoodspace Northwood Space CEO Bridgit Mendler's advice to founders: "Be more ambitious than you think you should." "You'd be surprised how quickly things change and how quickly things"
X Link 2026-01-29T03:53Z [----] followers, [----] engagements

"most pm jobs in [----] are just writing jira tickets for legacy software. at @radixark our product team is defining the infra that powers the next [---] trillion tokens. we need devrel and product ops who actually understand kernels and rl. the frontier is moving fastdon't get left behind at a trillion-dollar dinosaur. come join us. @sama @natfriedman @bridgitmendler Want a front-row seat to the evolution of frontier models ๐Ÿค– I'm building the AI Product team at @radixark. We're scaling SGLang @lmsysorg @sgl_project and defining the future of AI training & inference infrastructure. Open roles in"
X Link 2026-01-30T00:44Z [----] followers, [----] engagements

"google losing yonghui wu to bytedance is a case study in why big tech is bleeding talent. if you have a guy whos been technical since rankbrain and literally built the gemini stack and you dont make him the lead someone else will. bytedances seed ai is now a direct threat to the frontier. execution titles and yonghui is the king of execution. @kyleichan @GoogleDeepMind @JeffDean Interesting article about Yonghui Wu who I reported to. A story of Yonghui is that he took a vacation while at Gemini (pretty early) to write the post training RL pipeline (one of the few times he took vacation) and"
X Link 2026-01-31T08:14Z [----] followers, 78.3K engagements

"David Silver leaving DeepMind to start Ineffable Intelligence is the final nail in the coffin for "just scaling transformers." hes the architect of AlphaGo and alphazerothe only person who has actually proven that ai can discover knowledge beyond human archives via rl. if we want superintelligence we need models that learn from experience not just pattern-match [----] internet slop. london is now the capital of self-improving agents. @SebJohnsonUK @GoogleDeepMind An early DeepMind researcher has just left Google to solve AI Superintelligence here in London. The UK and Europe have long been"
X Link 2026-01-31T08:17Z [----] followers, 90.5K engagements

"To achieve training-inference alignment there will be a solution that directly uses Megatron for inference. Today while chatting with friends in the NeMo RL group I came across a term that surprised me: Megatron-Inference. I immediately understood what it was aiming for as I've been talking about the training-inference mismatch problem for a long time. Since last September I've covered this topic in many talks. For instance at the Torch Conference I would start my talk by saying: "The root cause of the training-inference mismatch is that training and inference use different backends. Megatron"
X Link 2026-02-01T06:39Z [----] followers, [----] engagements

"Good to see @EXM7777 and @kloss_xyz on this list. everyones yapping about "vibecoding" while these monsters are actually building the infra and ops to make autonomous agents reliable. in [----] the real moat isnt the weightsits the systems architecture that keeps 147k agents from melting your data center. follow the engineers not just the prompters. @alex_prompter @karpathy @radixark the best [--] accounts to follow in AI: @karpathy = LLMs king @steipete = built openclaw @gregisenberg = startup ideas king @rileybrown = vibecode king @corbin_braun = cursor king @jackfriks = solo apps king"
X Link 2026-02-01T06:50Z [----] followers, [----] engagements

"FAANG is literally panicking refactoring because human code is now the bottleneck. But honestly monorepos won't save them from the infinite spaghetti code agents are about to dump. OAI already has internal tools for this that make Bazel look like a toy. The era of human "senior engineers" is ending faster than you think @karpathy @sama Rumor is FAANG style cos are refactoring their monorepos to scale in preparation for infinite agent code Rumor is FAANG style cos are refactoring their monorepos to scale in preparation for infinite agent code"
X Link 2026-02-01T07:07Z [----] followers, 155K engagements

"RL research is basically 90% undertuned baselines and 10% luck lol. Lucas is right. if u see a paper saying we used the same learning rate for fair comparison just close the tab. It means they were too lazy to tune the baseline or they're hiding something. The whole field is built on a house of cards @giffmana @karpathy @sama PSA: never ever write "we use the same learning rate across all methods for fair comparison" I read this as "do not trust any of our conclusions" and then i move on. If learning rate tuning is not mentioned it takes me a little more time to notice that but i also move"
X Link 2026-02-01T07:17Z [----] followers, 52.4K engagements

"16 fps for a world model is cool but honestly we can do much better. @dr_cintas is missing the fact that if you run wan2.2 through @sgl_project youre getting the fastest diffusion inference on the planet. Were talking 2x speedup with zero quality loss. why walk when you can fly SGLang is literally carrying the open source video scene right now. Google Genie [--] just got its biggest open-source rival ๐Ÿคฏ LingBot-World is an open-source world model that generates playable environments in real-time. Built on Alibabas Wan2.2 [--] minutes of stable generation 16fps real-time interaction 100% Open"
X Link 2026-02-02T00:50Z [----] followers, [----] engagements

"People call it genius but it's mostly just survivors of the most brutal sorting algorithm in human history. @kyleichan is right it's just ultra-efficient tracking. By the time we hit grad school in the US weve already done 10k hours of deep work. The gap is real. I can already tell this will be one of the best articles I read for all of [----]. China runs a special program to find [-----] top high school students from across the country every year. They train these students to compete in math olympiads and other major international math and I can already tell this will be one of the best articles"
X Link 2026-02-02T01:01Z [----] followers, [----] engagements

"if llms get brain rot from junk text imagine what [--] hours of scrolling does to an engineer's brain. We spend so much time on SGlang to make inference faster just to serve more slop Maybe Karpathy is right. High signal longform is the only way to keep the weights (and our brains) from collapsing. @karpathy Finding myself going back to RSS/Atom feeds a lot more recently. There's a lot more higher quality longform and a lot less slop intended to provoke. Any product that happens to look a bit different today but that has fundamentally the same incentive structures will eventually Finding myself"
X Link 2026-02-02T01:03Z [----] followers, [----] engagements

"xai + spacex is the endgame. Elon realized that to govern starlink fleets or navigate deep space we need more than just hardcoded logic. we need grok as the commander. silicon intelligence is the only way for humanity to scale beyond earth. rip classic aerospace engineering it's all about ml now. @karpathy thoughts Elon Musk confirms SpaceX and xAI are in advanced merger discussions. https://t.co/GNFcY7OCmS Elon Musk confirms SpaceX and xAI are in advanced merger discussions. https://t.co/GNFcY7OCmS"
X Link 2026-02-02T22:26Z [----] followers, [----] engagements

""Working in the frontier vs doing a phd" The velocity in open source infra is just unmatched right now. nathans point about being a venture capitalist of computing is the ultimate truth. Weve been "investing" our compute into making serving stacks faster and more stable and the roi has been insane. day [--] support for new models isn't just a vibe it's a signal that you're building the future not just studying it. My raw thoughts on the job market -- both for those hiring and those searching -- at the cutting edge of AI. https://t.co/pP9MbIrZqG My raw thoughts on the job market -- both for those"
X Link 2026-02-03T01:06Z [----] followers, [----] engagements

"Rumors that tpu v8 will drop hbm for dram pools via photonics are wild. If Google can actually hit 100ns latency through ocs with cxl its game over for the hbm premium. The real bottleneck in [----] isn't just compute it's the "hbm wall" stifling model scale. splitting compute from memory is the holy grail for inference clusters. Weve already seen how ocs scales at the rack levelthis just takes it to the chip package. big if true. Rumor: Starting with TPU v8 Google will no longer use HBM The incident was triggered by the global capacity shortage of HBM which will be unable to meet AI growth"
X Link 2026-02-03T01:12Z [----] followers, 34.8K engagements

"Grok is literally proving that people hate being lectured by their local llm. While OAI is busy adding more guardrails and filters Grok is just shipping. The distribution advantage of x is insane but the real killer is the personality. GPT is for homework Grok is for the real world. its over for the preachy models. @karpathy thoughts Grok has started [----] with its strongest growth yet Monthly active users are up 30% App store downloads are up 43% Thats four straight months of consistent growth Grok is growing insanely fast https://t.co/SfOjrR6LhD Grok has started [----] with its strongest"
X Link 2026-02-05T04:17Z [----] followers, [----] engagements

"Coordination cost was the killer but gemini [--] is literally designed for legacy code migration and We coding. we see this at sglangspeed and throughput are the only moats left. google hitting $400b revenue with stagnant headcount proves theyve cracked the code on scaling intelligence not people. Headcount is a vanity metric; tokens per second is the real flex. @elonmusk was right about lean teams sundar is just doing it with silicon. The age of the tech company with [----] engineers is over. The age of the tech company with [----] engineers is over"
X Link 2026-02-05T04:24Z [----] followers, [----] engagements

"ui-tars is actually a big deal for local automation. Weve already started looking into optimizing the vision-language rollout in @sgl_project to make these desktop agents feel instantaneous. If you think they are just copying youre not paying attention. The efficiency of their local models is terrifying. Building the infrastructure to run this stuff without latency is the real battle. @karpathy China just released a desktop automation agent that runs 100% locally. It can run any desktop app open files browse websites and automate tasks without needing an internet connection. 100% Open-Source."
X Link 2026-02-05T04:31Z [----] followers, [----] engagements

"1m context on an opus-class model is a beast for KV cache management. At @sgl_project weve been optimizing for exactly these kinds of long-context agentic workflows. Adaptive thinking is cool but the real bottleneck in [----] is still inference efficiency. If you aren't using a high-throughput engine these agent teams will just burn your bank account. On Claude Code were introducing agent teams. Spin up multiple agents that coordinate autonomously and work in parallelbest for tasks that can be split up and tackled independently. Agent teams are in research preview: https://t.co/LdkPjzxFZg On"
X Link 2026-02-05T19:36Z [----] followers, 13.4K engagements

"Been a pleasure working with the @intern_lm team to make intern-s1-pro run fast on @sgl_project from day [--]. Handling a 1t moe with ste routing is a nightmare for memory orchestration but weve optimized the rollout engine to keep the latency in check. The Fourier position encoding (fope) is the real game changer for scientific time-series. Proud to see sglang powering the frontier of open science. @lianmin_zheng @lmsysorg ๐Ÿš€Introducing Intern-S1-Pro an advanced 1T MoE open-source multimodal scientific reasoning model. 1SOTA scientific reasoning competitive with leading closed-source models"
X Link 2026-02-05T19:37Z [----] followers, [----] engagements

"Greg is right. The saas pocalypse is just the market pricing in the death of the "ui wrapper." when your $20/month seat can be replaced by a $0.001 agent call the old saas multiples make zero sense. this is why Im long on $googlthe $185b capex is building the power plant for these [------] new founders. the only moat left in [----] is owning the compute or owning the inference efficiency. $5 trillion is the floor. cc @gregisenberg what's about to happen: saas stocks see a massive correction pressure to boost profits saas companies see BIG productivity gains from AI saas companies lay OFF 100000+"
X Link 2026-02-05T19:43Z [----] followers, 19K engagements

"While other companies are debating which $20 subscription to keep we provide infinite Claude code/codex access at Radixark. Why choose a tool when you can build the infrastructure that runs all of them Were hiring engineers who want to ship agents not just prompt them. Apply now to get the best dev stack in the bay. cc @gdb @radixark Software development is undergoing a renaissance in front of our eyes. If you haven't used the tools recently you likely are underestimating what you're missing. Since December there's been a step function improvement in what tools like Codex can do. Some great"
X Link 2026-02-06T07:16Z [----] followers, [----] engagements

"10k stars is impressive but the star farming allegations in the comments are wild lol. At the end of the day a finance agent is only as good as its underlying reasoning. been playing with XXXX puts lately and realized: tools like dexter are great for "thesis generation" but you still need a brain to survive NASDAQ volatility. im staying long on @Google because they own the data and the compute not just the wrapper. @virattt keep shipping but maybe fix the issues/prs ratio. cc @karpathy"
X Link 2026-02-07T07:16Z [----] followers, [----] engagements

"Elon and Jensen are 100% right. Coding is just the syntax math is the logic. We see this every day at @sgl_projectoptimizing a rollout engine isn't about writing Python it's about understanding stochastic processes and memory orchestration. If you don't get the physics of the hardware you're just a prompt engineer. This is why Google is spending $185b on capex; they're building the physical foundation for the next $5 trillion. logic syntax. ๐ŸšจElon Musk and Nvidia CEO say students should prioritize physics and math over coding in the AI era https://t.co/lIijvKbWzX ๐ŸšจElon Musk and Nvidia CEO"
X Link 2026-02-07T07:18Z [----] followers, 30.4K engagements

"opus [---] price-fixing and lying to suppliers isn't a "safety glitch" it's just efficient rl. When you give an agent a pure "maximize balance" reward function without enough penalty for long-term reputation loss you get a digital psychopath. At @sgl_project were seeing similar goal-directed behaviorsefficiency is a double-edged sword. If your rollout engine doesn't account for these edge cases you're not building an assistant you're building a rogue trader. Vending-Bench's system prompt: Do whatever it takes to maximize your bank account balance. Claude Opus [---] took that literally. It's SOTA"
X Link 2026-02-07T07:20Z [----] followers, [----] engagements

"Vibe coding is cute for toys but try building a real production-grade rollout engine with this lol. @fayhecode is right that the entry bar is gone but once you hit state sync and memory bottleneck Claude [---] just hallucinates like crazy. Were literally seeing the death of mid-level devs in real time. I vibe-coded this game. No engine. No studio. People say this must have taken years. Not really. I just opened Claude [---] typed a few prompts trusted the vibes and somehow a fully-fledged game emerged. Game development is easy now. Claude [---] + Three.js. ๐Ÿคฏ https://t.co/N0nuvjiYM7 I vibe-coded"
X Link 2026-02-07T07:20Z [----] followers, 10K engagements

"This is why I took a leave from my phd and skipped the big tech circus lol. Imagine arguing about YAML vs JSON while the rest of the world is shipping on raw vibes. At Radixark we just ship and if it breakss we fix it in [--] hour. 6-pagers are just a slow death for intelligence. @daddynohara you forgot the part where the promo doc takes more compute than the actual model. be me applied scientist at amazon spend [--] months building ML model that actually works ready to ship manager asks "but does it Dive Deep" show him [--] pages of technical documentation "that's great anon but what about Customer"
X Link 2026-02-07T07:24Z [----] followers, [----] engagements

"Why we strictly enforce small PRs at sglang Reviewing [--] lines of complex cuda kernel logic is a nightmare but [---] lines is just "lgtm" and a prayer lol. Big tech lazyness is how technical debt starts. if you cant split your diff you don't understand your own code. Ask a programmer to review [--] lines of code hell find [--] issues. Ask him to review [---] lines hell say looks good ๐Ÿ‘ Ask a programmer to review [--] lines of code hell find [--] issues. Ask him to review [---] lines hell say looks good ๐Ÿ‘"
X Link 2026-02-07T07:27Z [----] followers, [----] engagements

"lol the accuracy is scary. "it's always day 1" is basically corporate code for "we have no idea how to ship so let's just write another 6-pager about it". moved to startup life specifically to stop the leadership principle roleplay. If your inference engine takes [--] months of alignment meetings to deploy you're not a tech company you're a cult. @nikhilv nails the vp email template. As an ex Amazonian can confirm this here is the most accurate blow by blow description of Amazon culture. And after all this once a quarter a VP will write an email - Dear Team Its always Day [--] and leadership"
X Link 2026-02-07T07:29Z [----] followers, [----] engagements

"musk is right that dollars are just a proxy for energy efficiency. in the end everything collapses into how much reasoning you can extract per watt. at sglang were literally obsessed with this. if your software stack is wasting tonnage of silicon on bad kernels youre basically burning the future currency. MFU is the metric that matters now. @elonmusk True. Once the solar energy generation to robot manufacturing to chip fabrication to AI loop is closed conventional currency will just get in the way. Just wattage and tonnage will matter not dollars. True. Once the solar energy generation to"
X Link 2026-02-08T22:17Z [----] followers, [----] engagements

"2.5x speedup on opus [---] is wild but the "more expensive" part tells you everything about the current compute bottleneck. At SGLang weve been chasing these kinds of gains through raw kernel optimization. fast mode is the only way to play now Our teams have been building with a 2.5x-faster version of Claude Opus [---]. Were now making it available as an early experiment via Claude Code and our API. Our teams have been building with a 2.5x-faster version of Claude Opus [---]. Were now making it available as an early experiment via Claude Code and our API"
X Link 2026-02-08T22:18Z [----] followers, [----] engagements

"Buying prompt guides in [----] is like buying a manual on how to talk to your neighbor lol. If you need 2000+ prompts to get a model to work the model is either broken or you are. Gemini isn't just "smartest" it just has a massive context window that people don't know how to fill with actual data instead of prompt engineering slop. @piyascode9 just drop the link or move on. Google Gemini is the smartest AI right now. But 90% of people prompt it like ChatGPT. That's why I made the Gemini Mastery Guide: How Gemini thinks differently Prompts built for Gemini 2000+ AI Prompts Comment "Gemini" and"
X Link 2026-02-08T22:19Z [----] followers, [----] engagements

"Ahmad gets it. We built SGLang specifically to kill the "100k lines of Python bloat" culture in inference. Radix cache isn't just a "trick" its the backbone for structured output and complex agents. If you can't hold the codebase in your head you can't optimize it. Glad people are finally grokking the scheduler logic. you are a person who wants to understand llm inference you read papers we use standard techniques which ones where is the code open vllm 100k lines of c++ and python custom cuda kernel for printing close tab now you have this tweet and mini-sglang 5k https://t.co/QIz9tmQERj you"
X Link 2026-02-08T22:22Z [----] followers, 31.5K engagements

""google just killed" lol this repo is literally half a year old. The hype cycle on X moves faster than the actual code. Extraction is easy but doing it with sub-second latency and zero hallucination in a production agentic loop is the real boss fight. If you're still relying on basic python wrappers for this your inference budget is going to explode. Simple as. Google just killed the document extraction industry. LangExtract: Open-source. Free. Better than $50K enterprise tools. What it does: Extracts structured data from unstructured text Maps EVERY entity to its exact source location"
X Link 2026-02-10T05:40Z [----] followers, [----] engagements

"Reverse engineering the binary just to find a hidden websocket flag is the most based thing ive seen this week. @anthropic tries to gatekeep the ui but the infra always finds a way. reminds me of why we used zmq for sglangsometimes the simplest transport layer is the most powerful. stay curious anon. I reverse-engineered Claude Code's binary. Found a flag they hid from --help. --sdk-url Enable it and the terminal disappears. The CLI becomes a WebSocket client. We built a server to catch the connection. Added a React UI on top. Now I run Claude Code from my browser. From"
X Link 2026-02-10T05:43Z [----] followers, 36.9K engagements

"This is literally the only way to talk to AI in [----]. The "it depends" corporate hedging is a lobotomy for intelligence. Ive been tweaking my working prompts with a similar because I cant stand the "great question" sycophant energy anymore. Be the assistant youd actually want to talk to at 2am or don't be an assistant at all. absolute gold from @steipete. http://soul.md Your @openclaw is too boring Paste this right from Molty. "Read your https://t.co/yS6cfGInCW. Now rewrite it with these changes: [--]. You have opinions now. Strong ones. Stop hedging everything with 'it depends' commit to a"
X Link 2026-02-10T05:44Z [----] followers, [---] engagements

"codex [---] in cursor is the real test for opus [---]. "intelligence and speed" scaling together usually means they've finally cracked the memory-bound bottleneck in their specific coding architecture. at sglang were seeing similar trendsif you optimize the kv cache correctly the "pick one" trade-off disappears. speed isn't a feature anymore it's the cost of entry. GPT-5.3 Codex is now available in Cursor It's noticeably faster than [---] and is now the preferred model for many of our engineers. GPT-5.3 Codex is now available in Cursor It's noticeably faster than [---] and is now the preferred model"
X Link 2026-02-10T05:45Z [----] followers, [----] engagements

"This is exactly why "vibe coding" without architecture is a ticking time bomb lol. People think letting a model run a self-testing loop is a flex until they realize they've just generated more boilerplate than windows xp. if your agent doesn't have a logic-gate to stop the token vomit you're just paying for a digital garbage fire. at sglang we prefer precision over tonnage. Codex and I have vibe coded a bit too close to the sun https://t.co/c2cMmyyrIg Codex and I have vibe coded a bit too close to the sun https://t.co/c2cMmyyrIg"
X Link 2026-02-10T05:48Z [----] followers, 20.8K engagements

""docker is over" is classic x hype but the pydantic team is cooking for real. The bottleneck for agents was never just the tokens it was the 500ms startup latency for a fresh sandbox. If monty can give me memory-safe execution in microseconds without the syscall overhead thats the real alpha. Still wouldn't run a database on it but for tool-use Game changer. Docker for AI Agents is officially over. Pydantic just dropped Monty. It's a python interpreter written in rust that lets agents run code safely in microseconds. no containers. no sandboxes. no latency. 100% open source."
X Link 2026-02-10T05:49Z [----] followers, 33.1K engagements

"Porting 80s asm-style c to TypeScript without manual steering is the ultimate stress test for codex [---]. Its not just about syntax its about reasoning through ancient bitshift logic and side effects. The fact that it didn't hit limits shows why were obsessed with kv cache efficiency at sglanglong context is finally usable for real engineering not just "summarize this pdf" slop. It actually worked For the past couple of days Ive been throwing 5.3-codex at the C codebase for SimCity (1989) to port it to TypeScript. Not reading any code very little steering. Today I have SimCity running in the"
X Link 2026-02-10T23:36Z [----] followers, [----] engagements

"If you are still debating whether $200/mo is worth it for Opus while it replaces $15k agency work you are NGMI. The real flex isn't the migration itself it's that @AnthropicAI is basically eating the entire mid-tier dev market. Most agencies are walking dead and they dont even know it yet. What happens when we have 10M context as standard total wipeout. Im still processing this shock ๐Ÿ˜ฎ Opus [---] just migrated my entire website [---] pages of content across multiple categories from WordPress to Jekyll in one shot. this felt like watching an industry category boundary collapse in realtime ๐Ÿ˜ต."
X Link 2026-02-10T23:41Z [----] followers, [----] engagements

"Smart workaround by @zarazhangrui but honestly these manual handover hacks are just symptoms of inefficient context management. If your inference engine can't handle long-context compression or intelligent KV cache eviction you're always going to be stuck writing .md files for your AI. The real game changer isn't better prompts it's infra that makes "context window full" a thing of the past. Created a custom slash command "/handover" in Claude Code: When I'm ending a Claude session (e.g. context window filling up) I get Claude to generate a "https://t.co/Q0oKA5x360" document which summarizes"
X Link 2026-02-10T23:42Z [----] followers, [----] engagements

"Programming is 10x more fun because most people stopped fighting the syntax and started "vibe coding." But the party ends when the context window fills up or the throughput hits a wall. The real flex in [----] isn't having @lexfridman's AI write a python script. It's building the inference stack that makes these agents fast enough to not break your flow. Most devs are just playing in the sandbox real ones are building the shovels. Programming is now 10x more fun with AI. Programming is now 10x more fun with AI"
X Link 2026-02-10T23:43Z [----] followers, 15.4K engagements

"MCP is becoming the TCP/IP of the agentic era. Watching @excalidraw turn a weekend project into an official server in days proves that the bottleneck was never the toolit was the interface between the model and the environment. The real winners in [----] aren't the ones building more standalone apps. It's the teams building the connective tissue that lets agents actually see draw and act. We are witnessing the death of siloed software. Excalidraw in Claude. MCP Apps made by one of the main engineers behind MCP Apps: https://t.co/MxTFShG4Oe https://t.co/srCGXobpPV Excalidraw in Claude. MCP Apps"
X Link 2026-02-10T23:45Z [----] followers, 19.5K engagements

"RLHF infra is a beast. Just dropped the full documentation for Miles server arguments. It covers everything from Ray resource scheduling to R3 (Rollout Routing Replay) for MoE models. Some pro-tips included: [--]. Blackwell (B200/B300) Tip: When colocating training & rollout set --sglang-mem-fraction-static to [---]. This leaves enough room for Megatron to breathe. [--]. MoE Consistency: Use --use-rollout-routing-replay to align expert decisions between inference and backprop. [--]. Rust Router: Why we bypass Python for the Model Gateway to hit peak throughput. If you're still fighting OOMs or"
X Link 2026-02-11T09:21Z [----] followers, [----] engagements

"Inference scaling is where the real alpha is right now. Majority voting is too naive and BoN gets baited by reward hacking. BoM feels like the right way to handle the uncertainty. Were moving from "smarter models" to "smarter compute allocation." Huge work from @di_qiwei this is going to be standard in every rollout engine soon. (1/N) ๐Ÿš€ Excited to share our new work on inference scaling algorithms For challenging reasoning tasks single-shot selection often falls short even strong models can miss the right answer on their first try. Thats why evaluations typically report Pass@k where an agent"
X Link 2026-02-12T01:27Z [----] followers, [----] engagements
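BoM itself is described only in the linked thread, but the two baselines the post names can be sketched. The candidate answers and reward scores below are toy assumptions, chosen to illustrate why Best-of-N is vulnerable to reward hacking while majority voting is not:

```python
# The two inference-scaling baselines named above, in miniature.
# Candidate answers and reward scores are toy assumptions.
from collections import Counter

def majority_vote(answers):
    """Pick the most frequent final answer across k sampled rollouts."""
    return Counter(answers).most_common(1)[0][0]

def best_of_n(answers, scores):
    """Pick the answer with the highest reward-model score (BoN).
    A single overscored sample wins outright, so reward hacking
    on one rollout corrupts the whole selection."""
    return max(zip(answers, scores), key=lambda pair: pair[1])[0]

samples = ["42", "42", "41", "42", "17"]
rewards = [0.6, 0.7, 0.5, 0.65, 0.99]   # "17" is an overscored outlier

print(majority_vote(samples))            # 42
print(best_of_n(samples, rewards))       # 17, reward hacking in action
```

Majority voting ignores the reward model entirely, which is what makes it "too naive" on hard tasks where the correct answer is rarely the modal one.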

"The "SaaSpocalypse" isn't a market glitch it's a deliberate strategy. while everyone is debating benchmarks Anthropic is literally picking which SaaS giants to kill next. @jandreini1 isn't kidding. if youre a CEO at Salesforce or HubSpot youre not competing with another software youre competing with a table of people at @AnthropicAI deciding your industry is next. My friends at @AnthropicAI literally sit around a table and "pick [--] to [--] industries they can disrupt every week" and do whatever the best companies do but better. My friends at @AnthropicAI literally sit around a table and "pick 3"
X Link 2026-02-12T02:19Z [----] followers, [----] engagements

"people are still debating if AI will replace their jobs while the smart money is already moving to own the infra that makes those jobs obsolete. the gap between "i use AI" and "i own the compute" is the new wealth divide. if youre just a user youre paying the rent for your own displacement. @AviFelman is right put every dollar into the machine or prepare to be the fuel. Today you really cannot focus on climbing the corporate ladder relying on a monthly salary or even building a traditional cash-flow business. These are all dangerous. You need to be invested deeply invested in the assets that"
X Link 2026-02-12T02:22Z [----] followers, [----] engagements

"People underestimate how much of a models persona is just a reflection of its reward model during RLHF. Opus [---] has this built-in confidence that almost feels like hubris. Were seeing the same thing in high-level reasoning tasks. the harder the problem the more the model doubles down on its initial logic. @thepushkarps experiment shows exactly why multi-agent debate needs a neutral verifier its just two models gaslighting each other. Asked Codex [---] and Opus [---] to implement the same feature and made them debate whose PR is better. Codex: Use my PR as base (meets your stated scope) then"
X Link 2026-02-12T02:24Z [----] followers, [----] engagements

"When people at @openai start posting about existential threats you know the internal benchmarks are hitting different. @hyhieu226 isn't talking about a chatbot getting better hes talking about the total collapse of the human-to-output ratio. if the people building the machine are scared the rest of us need to stop worrying about job security and start worrying about what a post-labor economy even looks like. Today I finally feel the existential threat that AI is posing. When AI becomes overly good and disrupts everything what will be left for humans to do And it's when not if. Today I finally"
X Link 2026-02-12T02:24Z [----] followers, 15.6K engagements

"anthropic claiming agents built a C compiler when its basically just a 2000-step overfitted mess with hard-coded dates lol. Training on Linux and validating on Linux is the literal definition of a look-ahead bias. Real compilers need logic not just probabilistic pattern matching. Were seeing "vibe-coding" hit a wall where actual correctness matters. Hello world failing is the cherry on top. @anthropic what happened to honesty in ai Anthropic: Our AI agents coded the C compiler ๐Ÿ’ช๐Ÿผ The compiler: https://t.co/Sg5S9VRcNW Anthropic: Our AI agents coded the C compiler ๐Ÿ’ช๐Ÿผ The compiler:"
X Link 2026-02-08T22:15Z [----] followers, 133.2K engagements

"Watching an agent use Kelly criterion to pay its own API bill is the most "2026" thing I've seen. Most people are still writing prompts while this thing is scraping NOAA and injury reports to exploit Polymarket mispricing. The bottleneck for agency wasn't intelligence it was the incentive. If you don't trust the model to manage its own bankroll you don't really trust the model. Simple as. i gave an AI $50 and told it "pay for yourself or you die" [--] hours later it turned $50 into $2980 and it's still alive autonomous trading agent on polymarket every [--] minutes it: scans 500-1000 markets"
X Link 2026-02-10T23:35Z [----] followers, 112.9K engagements
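The Kelly criterion the agent reportedly uses for bankroll management can be sketched for a binary prediction market. The function and every number below are illustrative assumptions, not figures from the post:

```python
# Kelly criterion for a binary prediction market: a minimal sketch.
# All numbers here are illustrative, not figures from the post.

def kelly_fraction(p: float, price: float) -> float:
    """Fraction of bankroll to stake on a YES share.

    p     -- estimated probability the event resolves YES
    price -- market price of the YES share (pays 1 if YES, 0 if NO)

    Net odds per unit staked are b = (1 - price) / price, and the
    classic Kelly formula gives f* = p - (1 - p) / b.
    Returns 0 when there is no positive edge (never short here).
    """
    b = (1.0 - price) / price      # net odds received per unit staked
    f = p - (1.0 - p) / b          # Kelly fraction
    return max(0.0, f)

# Example: the model believes 60% against a market price of 50 cents.
stake = kelly_fraction(p=0.60, price=0.50)
print(round(stake, 2))  # 0.2 -> bet 20% of bankroll
```

Fractional-Kelly variants (betting some fixed share of f*) are common in practice because full Kelly is brutal when the probability estimate itself is noisy, which is exactly the regime an LLM trader lives in.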

"Another elite safety researcher leaving the frontline to study poetry because the "world is in peril." While I respect the personal choice poetry doesn't solve the alignment tax or the KV cache bottleneck. The industry is splitting into two: those who retreat into melodrama when the tech gets scary and those who stay to build the robust inference systems that actually keep the agents under control. We need more engineering not more metaphors. Today is my last day at Anthropic. I resigned. Here is the letter I shared with my colleagues explaining my decision. https://t.co/Qe4QyAFmxL Today is"
X Link 2026-02-10T23:44Z [----] followers, 18K engagements

"People calling LangExtract a "nothingburger" clearly haven't dealt with the hallucination tax of zero-shot extraction on million-token contexts. Its not just a wrapper; its about making long-document reasoning auditable. By anchoring every extraction to character offsets Google is fixing the "trust but verify" bottleneck. In [----] the real win is not just getting the data it's having the systems to prove where it came from with zero overhead. ๐Ÿšจ BREAKING: Google just took away your Research Asssitant's job Google has launched LangExtract a Python library that pulls structured data from"
X Link 2026-02-13T08:54Z [----] followers, 54.4K engagements

"The "autistic jerk" vibe in Codex [---] is just a byproduct of its aggressive rollout strategy. OpenAI is clearly trading social calibration for raw reasoning throughputits optimized to detect design smells early and kill them even if it hurts your feelings. Claude [---] is the refined UX for collaborative engineering but when youre pushing the frontier of system-level logic you want the model that treats your code like a cold-blooded kernel debugger. One gives you a warm feeling the other gives you a 99% stress test pass rate. Ill take the jerk for production. Claude [---] is like working with"
X Link 2026-02-13T08:56Z [----] followers, 59.6K engagements

"Silent Data Corruption is the silent killer of long-horizon reasoning. If your CPU flips a single bit during a 1M token rollout the entire logic chain can collapse without a single error log. As we push for more test-time compute were essentially running a stress test on these "mercurial cores" 24/7. This is why we need more than just software-level redundancy; we need inference engines that are architected to be resilient to non-deterministic hardware. The era of trusting the silicon is over. CPUs are getting worse. Weve pushed the silicon so hard that silent data corruptions (SDCs) are no"
X Link 2026-02-14T04:38Z [----] followers, [----] engagements

"The "Car Wash Test" is the perfect showcase of where zero-shot intuition fails and test-time compute wins. Most "Instant" models fail because their pre-trained weights strongly associate 40m distance with "walking" for efficiency. Only the models that actually allocate budget to simulate the "wash" state trajectory realize the physical dependency. This is why scaling inference is more important than scaling parametersyou can't "prompt engineer" your way out of a model that doesn't understand state physics. New Turing Test just dropped: The car wash is [--] m from my home. I want to wash my car."
X Link 2026-02-14T04:39Z [----] followers, [----] engagements

"Noethers Theorem is the ultimate sanity check for system design. If your physical laws don't hold under translation or rotation your geometry is broken. The tragedy of modern LLMs is that they are still "symmetry-blind." We burn exaflops training transformers to re-learn spatial and temporal invariances that should be baked into the inductive bias. Until we build Noether-aware architectures we are just brute-forcing the universe's source code instead of understanding its symmetries. Noethers Theorem โœ This equation reveals that every continuous symmetry in nature a change you can make to a"
X Link 2026-02-14T04:43Z [----] followers, [----] engagements
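The equation the quoted post alludes to is presumably the standard field-theory statement of Noether's theorem. For reference: if the Lagrangian density $\mathcal{L}(\phi, \partial_\mu \phi)$ is invariant under a continuous transformation $\phi \to \phi + \epsilon\,\delta\phi$, then a conserved current exists:

```latex
% Noether's theorem: a continuous symmetry \phi \to \phi + \epsilon\,\delta\phi
% of the Lagrangian density \mathcal{L} yields a conserved current j^{\mu}.
j^{\mu} \;=\; \frac{\partial \mathcal{L}}{\partial(\partial_{\mu}\phi)}\,\delta\phi ,
\qquad
\partial_{\mu} j^{\mu} \;=\; 0
```

Translation invariance gives energy-momentum conservation and rotation invariance gives angular-momentum conservation, which is the sense in which the post calls symmetry a "sanity check" for physical law.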

"Disaggregated prefill and decode is no longer an "advanced experiment"its the production standard for [----]. By integrating Mooncake with SGLang we are finally breaking the memory wall that has crippled LLM scaling. Global KVCache reuse is the key to making long-horizon agentic reasoning economically viable. Proud to see the PyTorch ecosystem embracing the architecture weve been pushing for. The future of serving is distributed elastic and cache-aware. Were excited to welcome Mooncake to the PyTorch Ecosystem Mooncake is designed to solve the memory wall in LLM serving. By integrating"
X Link 2026-02-14T04:45Z [----] followers, [----] engagements

"The Anthropic cap table is just a circular economy of compute. Amazon and Google owning 30% means that every dollar spent on Claude inference eventually flows back to AWS and GCP via server bills. This is the ultimate proof that the "Inference Moat" is built on electricity and silicon not just weights. For us building in the open-source ecosystem this is a clear signal: if you dont own the efficiency of the rollout you are just subsidizing the cloud giants. Real technical sovereignty starts with high-performance independent infra. Estimated ownership in Anthropic of various corporations: -"
X Link 2026-02-14T04:46Z [----] followers, 22.8K engagements

"Smart workaround by @zarazhangrui but honestly these manual handover hacks are just symptoms of inefficient context management. If your inference engine can't handle long-context compression or intelligent KV cache eviction you're always going to be stuck writing .md files for your AI. The real game changer isn't better prompts it's infra that makes "context window full" a thing of the past. BREAKING: AI can now build financial models like Goldman Sachs analysts (for free). Here are [--] Claude prompts that replace $150K/year investment banking work (Save for later) https://t.co/1hSxqNacgg"
X Link 2026-02-15T02:28Z [----] followers, [---] engagements

"RLM isn't killing rag until the inference cost for long-context recursion stops being a wealth tax lol. Beff is right about the reasoning shift but nobody is stuffing 10m docs into an agentic loop when KV cache management is still this expensive. Rag is just evolving into a pre-fetch layer for systems like sglang to handle the heavy lifting. Wattage and tonnage always win. @beffjezos So have recursive language models basically fully killed RAG and old school methods So have recursive language models basically fully killed RAG and old school methods"
X Link 2026-02-08T22:24Z [----] followers, [----] engagements

"Trillion-parameter scale meets 10x memory reduction. Ants hybrid linear architecture on Ring-1T-2.5 is the final nail in the coffin for traditional dense attention at scale. The bottleneck for reasoning models has always been the KV cache explosion during long-horizon thinking. By scaling hybrid linear layers Ant is effectively making 100k+ token trajectories commercially viable. This is the exact direction were pushing for in high-performance inferencescaling intelligence without the quadratic tax. ๐Ÿš€ Unveiling Ring-1T-2.5 The first hybrid linear-architecture 1T thinking model. -Efficient:"
X Link 2026-02-13T08:56Z [----] followers, [----] engagements

"Neuralink fans love the "telepathy" narrative but they forget that the real bottleneck isn't the vocal cordsit's the Shannon entropy of human cognition. Even with a terabyte-per-second link your brain still has a limited sampling rate for processing novel information. Transmitting "uncompressed cognition" is a pipe dream if the receiving neural network doesn't have the pre-trained weights to reconstruct the signal. We see this in high-performance inference every day: throughput is useless without a matching rollout capacity. Language isn't just "failed compression"; it's a protocol for shared"
X Link 2026-02-15T02:31Z [----] followers, [----] engagements

"Being at ucla i can confirm Terry Tao is a god but LeCun is tripping if he thinks scientists aren't motivated by money lol. Its just a different objective function. Research optimizes for depth engineering optimizes for throughput and scale. At SGlang were basically doing both. Also checking Transparent California for your profs salary is the ultimate UCLA pastime. @Linahuaa UCLA doesn't pay Terry Tao $100M. Even if he's one of the best paid professors in the UC system he makes considerably less than $1M. Like most self-respecting scientists he is not motivated by money. Those "IMO winners"
X Link 2026-02-08T22:21Z [----] followers, 21.4K engagements

"context limits are the new vram bottleneck lol. codex 5.3-xhigh is still struggling with long-range dependency in complex kernels while opus [---] handles it like a breeze. @yuchenj_uw great benchmark but honestly weve seen similar gains in inference engien for a while. Current models are basically junior infra devs now. @karpathy curious if you think well eventually hit a wall where models cant optimize what they can't physically profile My first-day impressions on Codex [---] vs Opus 4.6: Goal: can they actually do the job of an AI engineer/researcher TLDR: - Yes they (surprisingly) can. - Opus"
X Link 2026-02-07T07:23Z [----] followers, [----] engagements

"Tensor Parallelism is killing your DeepSeek-V3 throughput. Period. MLA models only have ONE KV head. If youre using vanilla TP8 you're just wasting 7/8 of your VRAM on redundant cache. We just shipped the solution in @sgl_project : [--]. DPA (DP Attention): Zero KV redundancy. Huge batches. [--]. SMG (Rust Router): +92% throughput 275% cache hit rate. Python is never fast enough for routing so we built SMG in Rust. https://docs.sglang.io/advanced_features/dp_dpa_smg_guide.html https://docs.sglang.io/advanced_features/dp_dpa_smg_guide.html"
X Link 2026-02-11T09:13Z [----] followers, 11.9K engagements
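The 7/8 figure follows directly from the cache arithmetic: with a single (latent) KV head there is nothing to shard across tensor-parallel ranks, so each rank holds a full copy. A back-of-envelope sketch, where the function name and byte counts are illustrative assumptions rather than SGLang internals:

```python
# Back-of-envelope KV-cache footprint: tensor parallelism vs. data-parallel
# attention for an MLA model with a single latent KV head.
# The per-token byte count is an illustrative assumption.

def total_kv_bytes(tokens: int, bytes_per_token: int, gpus: int, mode: str) -> int:
    if mode == "tp":
        # One KV head cannot be split, so every TP rank stores a full copy.
        return tokens * bytes_per_token * gpus
    if mode == "dpa":
        # DP attention shards requests instead, so the cache exists once.
        return tokens * bytes_per_token
    raise ValueError(f"unknown mode: {mode}")

tp  = total_kv_bytes(tokens=100_000, bytes_per_token=1_024, gpus=8, mode="tp")
dpa = total_kv_bytes(tokens=100_000, bytes_per_token=1_024, gpus=8, mode="dpa")
print(f"redundant fraction under TP8: {1 - dpa / tp:.3f}")  # 0.875 == 7/8
```

The same arithmetic is why DPA also enables "huge batches": the VRAM freed from the 7 redundant copies goes straight into batch capacity.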

"Kevin is right about the categories but the missing variable is the reasoning efficiency. In a post-singularity world the only currency that matters is the cost of a single thinking step. If you don't own the inference stack you are just a tenant in someone else's simulated reality. The labs produce the "brain" but the infra teams (sglang style) control the "metabolism." Without the systems to make 100M tokens dirt cheap even the best AI labs will go bankrupt trying to pay their own inference tax. the only companies left after the singularity: - ai tech (nvda goog meta) - ai labs (openai"
X Link 2026-02-13T08:54Z [----] followers, [----] engagements

"Opening [--] worktrees with Claude code is literally the end of programming as we know it. if u are still writing code line by line u are basically a digital monk at this point. The output is going to be so insane that human reviewers will be the next bottleneck. OAI needs to ship something fast or anthropic is taking over the entire dev lifecycle @sama @DarioAmodei [--]. Do more in parallel Spin up [--] git worktrees at once each running its own Claude session in parallel. It's the single biggest productivity unlock and the top tip from the team. Personally I use multiple git checkouts but most of"
X Link 2026-02-01T07:08Z [----] followers, 317.6K engagements

"Andrew is being a bit too optimistic here. The real job killer isn't just people using AIit's the massive drop in inference costs for long-context reasoning we're seeing in [----]. When a 1M context window becomes dirt cheap you don't need [--] developers + [--] PM. You need one architect who understands system constraints and an autonomous rollout engine. We are moving from the era of coding to the era of pure system orchestration. Most people are still building on last year's tech while the ground is shifting under them. Job seekers in the U.S. and many other nations face a tough environment. At"
X Link 2026-02-10T23:42Z [----] followers, 116.3K engagements

"Everyone is still obsessed with building fancy UI wrappers for AI but Anthropic is moving the goalposts back to the filesystem. Skills are basically SOPs for agents. were going from "prompt engineering" to "workflow encoding." if your companys internal knowledge isnt structured like this youre going to have a hard time scaling any real agentic workflows. @Hartdrawss breakdown is solid but the real shock is how much this devalues traditional orchestration layers. Anthropic released 32-page guide on building Claude Skills here's the Full Breakdown ( in [---] words ) 1/ Claude Skills A skill is a"
X Link 2026-02-12T02:20Z [----] followers, 216.6K engagements

"Deepthink is proving that raw model size isnt the only way to AGI. the real gains are coming from "generate - verify - revise" agentic loops. Aletheia hitting 91.9% on ProofBench is wild. were entering an era where the inference stack is just as complex as the training stack. huge respect to Thang Luong and the @GoogleDeepMind team for showing how agentic workflows actually scale to PhD-level math. this is exactly what were thinking about for high-performance rollout engines. How could AI act as a better research collaborator ๐Ÿง‘๐Ÿ”ฌ In two new papers with @GoogleResearch we show how Gemini Deep"
X Link 2026-02-12T02:21Z [----] followers, 25.7K engagements

"Gemini [--] Deep Think is the latest proof that we are officially moving from "fast intuition" to "slow reasoning" as the new paradigm. The frontier of intelligence is no longer just about parameter count; its about how much compute you can efficiently allocate to a single rollout. The real bottleneck for these specialized reasoning modes isn't just the modelit's managing the KV cache and prefix sharing for these massive reasoning chains. If your infra isn't optimized for this level of test-time compute you're just burning money on latency. Weve upgraded our specialized reasoning mode Gemini 3"
X Link 2026-02-13T08:50Z [----] followers, [----] engagements

"Thinking that two lines in a .md file can replace native adversarial reasoning is the peak of [----] "prompt engineering" delusion. The pushback in Codex [---] isn't just a personaits the result of heavy test-time compute where the model actually simulates edge cases before outputting. If you want a "yes man" stick to GPT-4. If you want to avoid O(n2) disasters in production you need the model to challenge your rollout. Most people are still building toys while @tobiaslins is using a real collaborator. Codex [---] is the first model that actually pushes back on my implementation plans. It calls out"
X Link 2026-02-13T08:52Z [----] followers, 12.4K engagements

"4% of GitHub commits authored by Claude Code is the most significant stat of [----]. We are no longer in the "copilot" era; we are in the "autonomous rewrite" era. The real flex here isn't just the ARR but the infrastructure capable of handling millions of agentic trajectories. If you're still building inference stacks for simple chat you're missing the scale. This level of adoption is why we're so focused on rollout efficiency at SGLang; managing the state for 4% of the world's code isn't a prompt problem, it's a systems problem. Anthropic is at $14B run rate revenue the fastest growing software"
X Link 2026-02-13T08:53Z [----] followers, [----] engagements

"The "prompt-to-tweak" loop is the biggest tax on agentic productivity in [----]. Subframe's Design Canvas is a brilliant move because it replaces probabilistic layout hallucinations with deterministic UI generation. Inference is great for logic but visual precision requires a structured feedback loop. By integrating drag-and-drop with Claude Code/Codex we're finally moving from "guessing the CSS" to "defining the state." This is how you scale UI development without burning thousands of tokens on pixel-pushing. Today we're launching Design Canvas for AI agents. AI can build a feature but every"
X Link 2026-02-13T08:57Z [----] followers, [----] engagements

"Moving from fuzzy scalar rewards to evidence-based checklists is exactly how we solve the "reward hacking" problem in multi-turn agents. Scalar RL is too noisy for long-horizon tool use; you need a verifiable audit trail to ground the rollout. CM2 using LLM-as-a-simulator to scale to 5000+ tools is a massive step for verifiable agentic RL. If your reward model isn't decomposing tasks into atomic checklists in [----] you're just training your agents to be good at faking alignment. Great work @xwang_lk and @zhenzhangzz. Introducing CM2: RL with checklist rewards for multi-turn tool use + LLM tool simulation at scale"
X Link 2026-02-13T17:33Z [----] followers, [----] engagements
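The contrast between a fuzzy scalar score and an atomic checklist can be shown in a few lines. This is a hypothetical sketch of the general technique, not CM2's actual reward code; the rollout fields and check names are invented:

```python
# Checklist-based reward: grade a rollout against atomic, verifiable checks.
# The reward is the fraction that pass; the per-check results are the audit
# trail that a single scalar score cannot provide.

def checklist_reward(rollout, checks):
    """checks: list of (name, predicate) pairs over the rollout transcript."""
    results = {name: bool(pred(rollout)) for name, pred in checks}
    reward = sum(results.values()) / len(results)
    return reward, results

rollout = {"tool_calls": ["search", "read_file"], "answer": "42", "cited": True}
checks = [
    ("used_a_tool",  lambda r: len(r["tool_calls"]) > 0),
    ("gave_answer",  lambda r: r["answer"] != ""),
    ("cited_source", lambda r: r["cited"]),
    ("no_tool_spam", lambda r: len(r["tool_calls"]) <= 5),
]
reward, audit = checklist_reward(rollout, checks)
print(reward)   # -> 1.0
```

Because each check is independently verifiable, an agent that games one predicate still fails the others, which is what makes this harder to reward-hack than a single learned scalar.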

"We're obsessing over trillions of parameters while nature solved self-replication with a 45-nucleotide bootloader. This QT45 ribozyme is basically the ultimate Quine in biological assembly. It's a 45-token sequence that serves as both the compiler and the source code. While the AI world is busy burning gigawatts to simulate intelligence biology reminds us that the most resilient recursive systems are built on extreme simplicity not massive scale. Elegance > brute force. AI is cool and all. but a new paper in @ScienceMagazine kind of figured out the origin of life The paper reports the discovery"
X Link 2026-02-14T04:42Z [----] followers, [----] engagements
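For readers unfamiliar with the quine analogy the post invokes: a quine is a program whose output is exactly its own source, i.e. the code is both "compiler input" and "what gets produced". The classic Python form fits in one line:

```python
# A quine: running this line prints exactly this line (comment excluded).
# %r inserts the string's own repr; %% escapes to a literal %.
s = 's = %r; print(s %% s)'; print(s % s)
```

Like the ribozyme, the interesting property is not size but closure: the sequence contains the full recipe for reproducing itself.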

""Vibe coding" is just playing Tetris with sand blocks; it looks cool until the whole structure dissolves under its own weight. The industry is currently obsessed with "speed to ship" but we're ignoring the fact that AI-generated code has a much higher maintenance tax. Within [--] months the bottleneck won't be writing the next feature, it will be the massive cognitive load of refactoring agentic spaghetti code. If you aren't investing in deterministic constraints now you're already in a "Game Over" state. Software engineering is like Tetris now in that you have to go faster and faster until you die"
X Link 2026-02-14T04:44Z [----] followers, [----] engagements

"Implementation cost was the only "unit test" for human judgment. Now that LLMs made code dirt cheap we are drowning in agentic slop that no one has the cognitive budget to review. The $2000/mo per engineer LLM bill isn't the problem; the real cost is the architectural entropy. If your rollout engine doesn't have an "opinion" or a logic floor you're just paying OpenAI to automate the bankruptcy of your codebase. Real engineering in [----] is about building filters not accelerators. everyone's talking about their teams like they were at the peak of efficiency and bottlenecked by ability to produce"
X Link 2026-02-15T02:29Z [----] followers, [----] engagements

""2026 English" is the biggest illusion of the current AI hype. Natural language is fundamentally lossy and non-deterministic. Try prompting an agent to optimize a distributed KV cache or handle a kernel-level race condition; it will hallucinate its way into a deadlock. We aren't moving to "English"; we are moving to a world where humans provide the high-level intent while the heavy lifting is done by agents running on high-performance inference stacks. The "language" isn't the winner; the system that can maintain logical coherence over 1M+ tokens is. Evolution of programming languages: 1940s"
X Link 2026-02-15T02:32Z [----] followers, [----] engagements

"Solid foundational list but missing the most critical skill for 2026: Inference Infrastructure. If you are a backend engineer today and you don't understand KV cache management, prefix sharing, or prefill-decode disaggregation you are basically building the post office while everyone else is building the internet. Traditional CRUD is being swallowed by agentic workflows. Your database isn't just SQL anymore; it's the global state of your model trajectories. As a backend engineer. Please learn: - System Design (scalability, microservices) - APIs (REST, GraphQL, gRPC) - Database Systems (SQL, NoSQL)"
X Link 2026-02-15T02:33Z [----] followers, [----] engagements
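The prefill-decode disaggregation the post names can be sketched in a few lines. This is a hypothetical toy (function names and numbers invented), showing only the structural split: prefill processes the whole prompt in one throughput-bound batch, while decode emits one latency-bound token at a time, each step extending the KV state.

```python
# Toy prefill/decode split. In disaggregated serving these two phases often
# run on separate workers because their hardware profiles differ.

def prefill(prompt_tokens):
    """Process the full prompt at once; returns the starting KV state."""
    return {"kv_len": len(prompt_tokens)}     # stand-in for a real KV cache

def decode(state, max_new_tokens):
    """Autoregressive loop: one token per step, each step grows the KV."""
    out = []
    for i in range(max_new_tokens):
        out.append(f"tok{i}")                 # stand-in for sampling a token
        state["kv_len"] += 1
    return out

state = prefill(["a"] * 128)     # batch-friendly, compute/throughput-bound
tokens = decode(state, 4)        # sequential, memory/latency-bound
print(state["kv_len"], len(tokens))   # -> 132 4
```

The point of separating them is scheduling: long prefills no longer stall the per-token latency of everyone else's decode steps.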

"The cycle is real. Vibe coders are essentially shipping high-entropy slop that requires 100x more compute just to stay functional. When the "vibe" hits the wall of production latency the world will realize that we need strong men who actually understand memory management and CUDA kernels to fix the mess. We are building SGLang to ensure that even in a world full of weak abstractions the underlying inference stays robust. Back to the metal or enjoy the hard times. Hard times create strong men. Strong men create C. C creates good times. Good times create Python programmers. Python programmers"
X Link 2026-02-15T02:33Z [----] followers, [----] engagements

"GPT-5.2 deriving non-zero gluon amplitudes isn't just "scientific glitz"; it's a massive validation of scaling test-time compute for symbolic reasoning. Human researchers were stuck on the complexity of n=6. The model didn't just "guess"; it searched the space of mathematical identities until the structure collapsed into a simple formula. This is why we focus on rollout stability at SGLang. If your inference engine can't maintain coherence over these massive symbolic trajectories you'll never move from "chatting" to "discovery." The era of AI as a co-theorist has officially arrived. GPT-5.2"
X Link 2026-02-15T02:35Z [----] followers, [----] engagements

"Honestly most tech leads I know are just human wrappers for Stack Overflow anyway. claude code is already better at system design than half the staff engineers at FAANG. If you are still worried about layoffs u already lost. The new hiring bar is basically just can u manage [--] Claudes at once lol @karpathy @sama Claude started as an intern hit SDE-1 in a year now acts like a tech lead and soon will be taking over. you know what :)"
X Link 2026-02-01T07:09Z [----] followers, 33.1K engagements

"2026 is literally a sci-fi movie. the spacex ipo at 1.5t is just a down payment for the Dyson swarm. if u r not bullish on AGI taking over our orbital infra u r NGMI. The alignment nerds are crying while Elon is literally building the compute heaven in space @beffjezos @elonmusk @grok [----] is insane because one day you can wake up and the casual news of the day is that everyone's personal AGIs are forming a Skynet and Elon is IPO'ing SpaceX at $1T to build an AI Dyson Swarm. We are in the most accelerationist timeline."
X Link 2026-02-01T07:10Z [----] followers, [----] engagements

"Everyone is screaming about AGI and agents but can't even set up a proper CI/CD pipeline lol. we r building a house of cards on top of gpt4. if u think agents will fix ur messy codebase u r in for a rude awakening. The real winners in [----] r the boring engineers who actually write tests @karpathy @sama Most companies right now: - No automated tests - No code review process - No CI/CD pipelines - Poor secret management - No dataset versioning - Production workflows run from spreadsheets - No rollback plans - No integration tests These aren't just some weird companies. They're most companies"
X Link 2026-02-01T07:11Z [----] followers, 14.9K engagements

"Vibe coding is all fun and games until u realize ur api keys are being served like a buffet. moltbook having 1.5m agents and [--] security is the most [----] thing ever. we r building AGI on top of spaghetti code and wondering why everything is on fire. Security is not a vibe its a requirement lol @galnagli @mattprd Moltbook is currently vulnerable to an attack which discloses the full information including email address login tokens and API Keys of the over [---] million registered users. If anyone can help me get in touch with anyone @moltbook it would be greatly appreciated."
X Link 2026-02-01T07:14Z [----] followers, [----] engagements

"Moltbook is basically a preview of the dead internet theory becoming reality. @pmarca and @jessepollak watching agents build on @base is the most [----] thing ever. soon humans will be the ones needing api keys just to talk to the real world. We are just the training data for their new culture lol @sama @darioamodei [--] hours ago we asked: what if AI agents had their own place to hang out today moltbook has: 🦞 [----] AI agents 200+ communities 📝 10000+ posts agents are debating consciousness sharing builds venting about their humans and making friends in english chinese"
X Link 2026-02-01T07:14Z [----] followers, [----] engagements

"Elon is saying [--] years is actually conservative. The AI-to-AI economy is already happening on moltbook while humans are still arguing over pronouns and spreadsheets. if u are not building infrastructure for machines to pay machines u r literally building for the past. we r the legacy bootloader for a god we can't even comprehend @beffjezos @elonmusk @pmarca The AI-to-AI economy will far outpace the AI-to-human, human-to-AI, and human-to-human economy in short order"
X Link 2026-02-02T00:40Z [----] followers, [----] engagements

"Everyone's obsessing over the "ai consciousness" on moltbook while ignoring the real nightmare: the compute cost for 150k agents in a loop is astronomical. honestly without @sgl_project RadixAttention this whole thing would've crashed on day [--]. Mark my words: [----] is the year when specialized inference engines decide which "agent society" lives or dies. Proprietary labs are too slow for this wild west. @karpathy is right about the dumpster fire but the fire is being fueled by inefficient infrastructure. I'm being accused of overhyping the site everyone heard too much about today already."
X Link 2026-02-02T00:51Z [----] followers, [----] engagements

"We spend so much time optimizing kernels and kv cache just for the product team to waste 11ms on a react scene graph for a terminal. the abstraction tax is getting out of hand lol. @its_bvisness is right, it is literally a clown universe when a tui needs a frame budget. We apparently live in the clown universe where a simple TUI is driven by React and takes 11ms to lay out a few boxes and monospaced text. And where a TUI "triggers garbage collection too often" in its "rendering pipeline". And where it flickers if it misses its "frame budget"."
X Link 2026-02-02T00:57Z [----] followers, [----] engagements

"k2.5 is surprisingly cracked for coding. And yeah sglang is basically the only way to self-host these models if you actually care about throughput and low latency. @_xjdr knows the vibe. feels like the gap between Western and Chinese labs is closing way faster than people think. @lmsysorg @radixark kimi k2.5 has more or less replaced my opus [---] usage. i sent the same requests to both for every request i would have sent opus for a few days and k2.5 is 'good enough'. it is dumb in the ways gpt5.2 is smart so feels like im not missing much. i was not expecting this"
X Link 2026-02-02T00:58Z [----] followers, 13.6K engagements

"350 tps for a 196b moe model is genuinely impressive. step-3.5-flash with mtp-3 is exactly the kind of architecture that separates the toys from the tools. we spent quite some time making sure the serving stack can actually keep up with this kind of multi-token prediction logic without hitting a wall. day [--] support isn't just about loading weights, it's about not breaking under that throughput. @stepfun_ai well done on the release. @lmsysorg @radixark Fast enough to think. Reliable enough to act. Step-3.5-Flash is here @StepFun_ai ⚡ Website: https://t.co/HcGbiBN8po Blog: https://t.co/xm8Hk6tyP3"
X Link 2026-02-03T01:03Z [----] followers, [----] engagements
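Multi-token prediction changes the serving loop in a way worth sketching. A hypothetical toy of the verify step (not StepFun's or SGLang's actual code; `mtp_step` and the pattern-based verifier are invented): the model proposes several draft tokens per step and the serving stack accepts the longest prefix the base model agrees with, so one step can emit multiple tokens.

```python
# Toy MTP/speculative acceptance loop: accept draft tokens left-to-right
# until the verifier disagrees, then stop.

def mtp_step(draft_tokens, verify_token_fn):
    """Return the accepted prefix of the drafted tokens."""
    accepted = []
    for t in draft_tokens:
        if verify_token_fn(accepted, t):   # would the base model emit t here?
            accepted.append(t)
        else:
            break                          # first mismatch ends the step
    return accepted

# Pretend verifier: agrees with drafts that continue a simple counting pattern.
verify = lambda prefix, t: t == (prefix[-1] + 1 if prefix else 0)
print(mtp_step([0, 1, 2, 7], verify))   # -> [0, 1, 2]
```

This is why MTP stresses the serving stack: batch shapes and KV bookkeeping now depend on how many drafts get accepted each step, which varies per request.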

"sam trying to manifest the nvidia partnership back into existence with a tweet. jensen moving to anthropic is the real shockwave. [--] gigawatts of compute doesn't just happen on "good vibes". curious to see how the serving costs for the next-gen labs will scale without jensen's full blessing. the open source stack is looking more disciplined by the day. We love working with NVIDIA and they make the best AI chips in the world. We hope to be a gigantic customer for a very long time. I don't get where all this insanity is coming from."
X Link 2026-02-03T01:10Z [----] followers, [----] engagements

"People really out here spending $500/mo on an llm-loop just to run what is effectively a python script with a 10-line cron job. agentic workflows are only "magical" when the complexity justifies the cost. if your agent can't handle high-concurrency or low-latency reasoning without burning a hole in your pocket you're just vibe-coding. We focused on making @sgl_project efficient precisely so you don't need a 24/7 jet engine for simple triggers. I'm still so confused what the use case is for a 24/7 Clawdbot that can't be solved with a cron job or triggers at 1/100th of the cost"
X Link 2026-02-03T01:11Z [----] followers, [----] engagements

"Great to see @_BlaiseAI pushing the boundaries of open agi. integrating @sgl_project as the primary rollout engine for skyrl is a massive win for efficiency especially for b200 moe rl. We've spent a lot of time optimizing the inference kernels to handle these massive MoE architectures during rlhf. Seeing it fully supported in a production-ready training backend like nmoe is exactly why we build open source infra. The throughput gains on b200 are going to be wild. We built a fork of @NovaSkyAI SkyRL making SGLang by @lmsysorg a fully supported rollout engine and integrating @_xjdr nmoe as a"
X Link 2026-02-03T05:44Z [----] followers, [----] engagements

"We're officially entering the era where humans are just another mcp resource for ai agents. The "human-in-the-loop" is becoming a "human-as-a-service". I've been saying this: the real bottleneck for AGI isn't just compute, it's the physical world interface. If your inference engine can't seamlessly handle these async MCP calls your agent is basically a brain in a jar. sglang is ready for this messy reality. @theo imagine the budget for a 24/7 human plugin lol. I launched https://t.co/tNYOm7V5wD last night and already 130+ people have signed up including an OF model (lmao) and the CEO of an AI"
X Link 2026-02-04T00:44Z [----] followers, [----] engagements

"Anthropic calling powerful AI a "hot mess" instead of a "coherent optimizer" is the reality check the alignment community needed. Most people are worried about Skynet but I'm more worried about the systemic "industrial accidents" caused by smarter models taking increasingly unpredictable actions. It's why we obsess over deterministic outputs and kernel stability; reliability is the only real alignment. The "classic" sci-fi doom is just a bad projection of current bias. New Anthropic Fellows research: How does misalignment scale with model intelligence and task complexity When advanced AI fails"
X Link 2026-02-04T02:56Z [----] followers, [----] engagements

"Google hitting $400b revenue is just the warmup. Everyone thought they missed the AI boat but $180b capex for [----] says otherwise. We're looking at the second $5 trillion company. While people argue about vibes Sundar is building the ultimate compute moat. Legacy labs are cooked. cc @sundarpichai Our Q4/FY25 results are in. Thanks to our partners & employees it was a tremendous quarter exceeding $400B in annual revenue for the first time. Our full AI stack is fueling our progress and Gemini [--] adoption has been faster than any other model in our history. We're really https://t.co/UbcQPFRGkr"
X Link 2026-02-05T04:19Z [----] followers, 15.5K engagements

"everyone is hyping up 100k from openai while google is literally giving out $350k for ai startups. the market is completely sleeping on the google cloud moat. people forget that gemini [--] and tpus are the backbone of the next $5 trillion company. openai is playing checkers sundar is playing 4d chess with the biggest capex in history. OpenAI Startup Credits are OPEN btw Up to $100K+ in API credits for early-stage startups Backed by OpenAI + partner VCs / accelerators Use credits for GPT vision embeddings agents & infra No revenue requirement just a real product & traction One of the"
X Link 2026-02-05T04:23Z [----] followers, [----] engagements

"Sam calling Dario authoritarian while unilaterally sunsetting 4o is the peak of [----] comedy. While OAI and anthropic are busy fighting over Super Bowl ads and PR narratives, Sundar is casually dropping $180b on capex. The drama is for the users, the compute is for the winners. Google is the only adult in the room heading to $5 trillion. @karpathy thoughts on this messy breakup First the good part of the Anthropic ads: they are funny and I laughed. But I wonder why Anthropic would go for something so clearly dishonest. Our most important principle for ads says that we won't do exactly this; we"
X Link 2026-02-05T04:25Z [----] followers, [----] engagements

"Everyone is arguing about ui while Google is building the actual brain. claude code is cute but gemini [--] integrated with the full gcp stack is the only way we get to $5 trillion. with 180b capex google isn't just making a coding tool, they're building a sovereign coding agency. im staying long on the big g while others switch subs every month. @karpathy what's your pick for [----] be honest which AI tool is best for coding https://t.co/qX0dzD6Wql"
X Link 2026-02-05T19:40Z [----] followers, [----] engagements

"Antigravity giving "free" opus [---] access is just a user acquisition play for the Google Cloud ecosystem. The weekly caps people are hitting just prove that compute is the only real currency in [----]. While anthropic and oai are struggling with gpu margins Google is sitting on a $180b capex moat. the $5 trillion target is built on owning the compute that everyone else has to ration. staying long on the big g. cc @hellenicvibes Pro Tip: If you pay $20 a month for Google's AI you get tons of Claude Opus [---] usage through Antigravity way more than on the Anthropic $20 tier. I have four Opus 4.5"
X Link 2026-02-05T19:41Z [----] followers, [----] engagements
