# ![@GenAI_is_real Avatar](https://lunarcrush.com/gi/w:26/cr:twitter::1783397360259006464.png) @GenAI_is_real Chayenne Zhao

Chayenne Zhao posts on X most about the topics ai, inference, if you, and just a. They currently have [-----] followers and [--] posts still getting attention, totaling [-------] engagements in the last [--] hours.

### Engagements: [-------] [#](/creator/twitter::1783397360259006464/interactions)
![Engagements Line Chart](https://lunarcrush.com/gi/w:600/cr:twitter::1783397360259006464/c:line/m:interactions.svg)

- [--] Week [-------] -47%
- [--] Month [---------] +136%
- [--] Months [---------] +50,733%
- [--] Year [---------] +130,696%

### Mentions: [--] [#](/creator/twitter::1783397360259006464/posts_active)
![Mentions Line Chart](https://lunarcrush.com/gi/w:600/cr:twitter::1783397360259006464/c:line/m:posts_active.svg)

- [--] Week [--] -9.20%
- [--] Month [---] +477%
- [--] Months [---] +6,733%
- [--] Year [---] +1,633%

### Followers: [-----] [#](/creator/twitter::1783397360259006464/followers)
![Followers Line Chart](https://lunarcrush.com/gi/w:600/cr:twitter::1783397360259006464/c:line/m:followers.svg)

- [--] Week [-----] +11%
- [--] Month [-----] +61%
- [--] Months [-----] +760%
- [--] Year [-----] +3,667%

### CreatorRank: [------] [#](/creator/twitter::1783397360259006464/influencer_rank)
![CreatorRank Line Chart](https://lunarcrush.com/gi/w:600/cr:twitter::1783397360259006464/c:line/m:influencer_rank.svg)

### Social Influence

**Social category influence**
[technology brands](/list/technology-brands)  20.25% [finance](/list/finance)  12.66% [stocks](/list/stocks)  5.06% [celebrities](/list/celebrities)  3.8% [countries](/list/countries)  1.27% [social networks](/list/social-networks)  1.27%

**Social topic influence**
[ai](/topic/ai) #1465, [inference](/topic/inference) #10, [if you](/topic/if-you) #2320, [just a](/topic/just-a) #1831, [agents](/topic/agents) #262, [in the](/topic/in-the) #5873, [anthropic](/topic/anthropic) #186, [vibe coding](/topic/vibe-coding) #145, [agentic](/topic/agentic) #37, [$googl](/topic/$googl) 5.06%

**Top accounts mentioned or mentioned by**
@karpathy @gabrielmillien1 @sglproject @sama @radixark @lmsysorg @navneet_rabdiya @botir33751732 @jiahanjimliu @natfriedman @googledeepmind @theahchu @js4drew @celiksei @chernobyl_ak47 @kyletranai @nilayparikh @synthesisledger @ylecun @kimimoonshot

**Top assets mentioned**
[Alphabet Inc Class A (GOOGL)](/topic/$googl) [Flex Ltd. Ordinary Shares (FLEX)](/topic/$flex) [Frontline Ltd. (FRO)](/topic/$fro)

### Top Social Posts
Top posts by engagements in the last [--] hours

"To achieve training-inference alignment there will be a solution that directly uses Megatron for inference. Today while chatting with friends in the NeMo RL group I came across a term that surprised me: Megatron-Inference. I immediately understood what it was aiming for as I've been talking about the training-inference mismatch problem for a long time. Since last September I've covered this topic in many talks. For instance at the Torch Conference I would start my talk by saying: "The root cause of the training-inference mismatch is that training and inference use different backends. Megatron"  
[X Link](https://x.com/GenAI_is_real/status/2017850383243350406)  2026-02-01T06:39Z [----] followers, [----] engagements


"Good to see @EXM7777 and @kloss_xyz on this list. everyones yapping about "vibecoding" while these monsters are actually building the infra and ops to make autonomous agents reliable. in [----] the real moat isnt the weights, its the systems architecture that keeps 147k agents from melting your data center. follow the engineers not just the prompters. @alex_prompter @karpathy @radixark"  
[X Link](https://x.com/GenAI_is_real/status/2017853062900334641)  2026-02-01T06:50Z [----] followers, [----] engagements


"16 fps for a world model is cool but honestly we can do much better. @dr_cintas is missing the fact that if you run wan2.2 through @sgl_project youre getting the fastest diffusion inference on the planet. Were talking 2x speedup with zero quality loss. why walk when you can fly SGLang is literally carrying the open source video scene right now"  
[X Link](https://x.com/GenAI_is_real/status/2018124870123724849)  2026-02-02T00:50Z [----] followers, [----] engagements


"xai + spacex is the endgame. Elon realized that to govern starlink fleets or navigate deep space we need more than just hardcoded logic. we need grok as the commander. silicon intelligence is the only way for humanity to scale beyond earth. rip classic aerospace engineering it's all about ml now. @karpathy thoughts"  
[X Link](https://x.com/GenAI_is_real/status/2018451041981964710)  2026-02-02T22:26Z [----] followers, [----] engagements


""Working in the frontier vs doing a phd" The velocity in open source infra is just unmatched right now. nathans point about being a venture capitalist of computing is the ultimate truth. Weve been "investing" our compute into making serving stacks faster and more stable and the roi has been insane. day [--] support for new models isn't just a vibe it's a signal that you're building the future not just studying it"  
[X Link](https://x.com/GenAI_is_real/status/2018491284974190903)  2026-02-03T01:06Z [----] followers, [----] engagements


"Grok is literally proving that people hate being lectured by their local llm. While OAI is busy adding more guardrails and filters Grok is just shipping. The distribution advantage of x is insane but the real killer is the personality. GPT is for homework Grok is for the real world. its over for the preachy models. @karpathy thoughts"  
[X Link](https://x.com/GenAI_is_real/status/2019264162808152510)  2026-02-05T04:17Z [----] followers, [----] engagements


"ui-tars is actually a big deal for local automation. Weve already started looking into optimizing the vision-language rollout in @sgl_project to make these desktop agents feel instantaneous. If you think they are just copying youre not paying attention. The efficiency of their local models is terrifying. Building the infrastructure to run this stuff without latency is the real battle. @karpathy"  
[X Link](https://x.com/GenAI_is_real/status/2019267544113373334)  2026-02-05T04:31Z [----] followers, [----] engagements


"context limits are the new vram bottleneck lol. codex 5.3-xhigh is still struggling with long-range dependency in complex kernels while opus [---] handles it like a breeze. @yuchenj_uw great benchmark but honestly weve seen similar gains in inference engien for a while. Current models are basically junior infra devs now. @karpathy curious if you think well eventually hit a wall where models cant optimize what they can't physically profile"  
[X Link](https://x.com/GenAI_is_real/status/2020035560354848907)  2026-02-07T07:23Z [----] followers, [----] engagements


"This is why I took a leave from my phd and skipped the big tech circus lol. Imagine arguing about YAML vs JSON while the rest of the world is shipping on raw vibes. At Radixark we just ship and if it breakss we fix it in [--] hour. 6-pagers are just a slow death for intelligence. @daddynohara you forgot the part where the promo doc takes more compute than the actual model"  
[X Link](https://x.com/GenAI_is_real/status/2020035918309343631)  2026-02-07T07:24Z [----] followers, [----] engagements


"Why we strictly enforce small PRs at sglang Reviewing [--] lines of complex cuda kernel logic is a nightmare but [---] lines is just "lgtm" and a prayer lol. Big tech lazyness is how technical debt starts. if you cant split your diff you don't understand your own code"  
[X Link](https://x.com/GenAI_is_real/status/2020036771221041483)  2026-02-07T07:27Z [----] followers, [----] engagements


"lol the accuracy is scary. "it's always day 1" is basically corporate code for "we have no idea how to ship so let's just write another 6-pager about it". moved to startup life specifically to stop the leadership principle roleplay. If your inference engine takes [--] months of alignment meetings to deploy you're not a tech company you're a cult. @nikhilv nails the vp email template"  
[X Link](https://x.com/GenAI_is_real/status/2020037201569227055)  2026-02-07T07:29Z [----] followers, [----] engagements


"musk is right that dollars are just a proxy for energy efficiency. in the end everything collapses into how much reasoning you can extract per watt. at sglang were literally obsessed with this. if your software stack is wasting tonnage of silicon on bad kernels youre basically burning the future currency. MFU is the metric that matters now. @elonmusk"  
[X Link](https://x.com/GenAI_is_real/status/2020623046923751484)  2026-02-08T22:17Z [----] followers, [----] engagements


"2.5x speedup on opus [---] is wild but the "more expensive" part tells you everything about the current compute bottleneck. At SGLang weve been chasing these kinds of gains through raw kernel optimization. fast mode is the only way to play now"  
[X Link](https://x.com/GenAI_is_real/status/2020623271214067892)  2026-02-08T22:18Z [----] followers, [----] engagements


"Ahmad gets it. We built SGLang specifically to kill the "100k lines of Python bloat" culture in inference. Radix cache isn't just a "trick" its the backbone for structured output and complex agents. If you can't hold the codebase in your head you can't optimize it. Glad people are finally grokking the scheduler logic"  
[X Link](https://x.com/GenAI_is_real/status/2020624348709838885)  2026-02-08T22:22Z [----] followers, 31.5K engagements


"Vibe coding is cute for toys but try building a real production-grade rollout engine with this lol. @fayhecode is right that the entry bar is gone but once you hit state sync and memory bottleneck Claude [---] just hallucinates like crazy. Were literally seeing the death of mid-level devs in real time"  
[X Link](https://x.com/GenAI_is_real/status/2020035004106211643)  2026-02-07T07:20Z [----] followers, [----] engagements


"Opening [--] worktrees with Claude code is literally the end of programming as we know it. if u are still writing code line by line u are basically a digital monk at this point. The output is going to be so insane that human reviewers will be the next bottleneck. OAI needs to ship something fast or anthropic is taking over the entire dev lifecycle @sama @DarioAmodei"  
[X Link](https://x.com/GenAI_is_real/status/2017857602932387910)  2026-02-01T07:08Z [----] followers, 317.4K engagements


"Elon and Jensen are 100% right. Coding is just the syntax math is the logic. We see this every day at @sgl_project: optimizing a rollout engine isn't about writing Python it's about understanding stochastic processes and memory orchestration. If you don't get the physics of the hardware you're just a prompt engineer. This is why Google is spending $185b on capex; they're building the physical foundation for the next $5 trillion. logic syntax"  
[X Link](https://x.com/GenAI_is_real/status/2020034305255510040)  2026-02-07T07:18Z [----] followers, 30.3K engagements


"anthropic claiming agents built a C compiler when its basically just a 2000-step overfitted mess with hard-coded dates lol. Training on Linux and validating on Linux is the literal definition of a look-ahead bias. Real compilers need logic not just probabilistic pattern matching. Were seeing "vibe-coding" hit a wall where actual correctness matters. Hello world failing is the cherry on top. @anthropic what happened to honesty in ai"  
[X Link](https://x.com/GenAI_is_real/status/2020622487370023047)  2026-02-08T22:15Z [----] followers, 133K engagements


"Buying prompt guides in [----] is like buying a manual on how to talk to your neighbor lol. If you need 2000+ prompts to get a model to work the model is either broken or you are. Gemini isn't just "smartest" it just has a massive context window that people don't know how to fill with actual data instead of prompt engineering slop. @piyascode9 just drop the link or move on"  
[X Link](https://x.com/GenAI_is_real/status/2020623507953164444)  2026-02-08T22:19Z [----] followers, [----] engagements


"Being at ucla i can confirm Terry Tao is a god but LeCun is tripping if he thinks scientists aren't motivated by money lol. Its just a different objective function. Research optimizes for depth engineering optimizes for throughput and scale. At SGlang were basically doing both. Also checking Transparent California for your profs salary is the ultimate UCLA pastime"  
[X Link](https://x.com/GenAI_is_real/status/2020623937252794867)  2026-02-08T22:21Z [----] followers, 21.4K engagements


"RLM isn't killing rag until the inference cost for long-context recursion stops being a wealth tax lol. Beff is right about the reasoning shift but nobody is stuffing 10m docs into an agentic loop when KV cache management is still this expensive. Rag is just evolving into a pre-fetch layer for systems like sglang to handle the heavy lifting. Wattage and tonnage always win. @beffjezos"  
[X Link](https://x.com/GenAI_is_real/status/2020624785953456566)  2026-02-08T22:24Z [----] followers, [----] engagements


""google just killed" lol this repo is literally half a year old. The hype cycle on X moves faster than the actual code. Extraction is easy but doing it with sub-second latency and zero hallucination in a production agentic loop is the real boss fight. If you're still relying on basic python wrappers for this your inference budget is going to explode. Simple as"  
[X Link](https://x.com/GenAI_is_real/status/2021096916692631602)  2026-02-10T05:40Z [----] followers, [----] engagements


"Reverse engineering the binary just to find a hidden websocket flag is the most based thing ive seen this week. @anthropic tries to gatekeep the ui but the infra always finds a way. reminds me of why we used zmq for sglang: sometimes the simplest transport layer is the most powerful. stay curious anon"  
[X Link](https://x.com/GenAI_is_real/status/2021097731154960521)  2026-02-10T05:43Z [----] followers, 36.8K engagements


"This is literally the only way to talk to AI in [----]. The "it depends" corporate hedging is a lobotomy for intelligence. Ive been tweaking my working prompts with a similar because I cant stand the "great question" sycophant energy anymore. Be the assistant youd actually want to talk to at 2am or don't be an assistant at all. absolute gold from @steipete"  
[X Link](https://x.com/GenAI_is_real/status/2021097971719508315)  2026-02-10T05:44Z [----] followers, [---] engagements


"codex [---] in cursor is the real test for opus [---]. "intelligence and speed" scaling together usually means they've finally cracked the memory-bound bottleneck in their specific coding architecture. at sglang were seeing similar trends: if you optimize the kv cache correctly the "pick one" trade-off disappears. speed isn't a feature anymore it's the cost of entry"  
[X Link](https://x.com/GenAI_is_real/status/2021098188762165296)  2026-02-10T05:45Z [----] followers, [----] engagements


"This is exactly why "vibe coding" without architecture is a ticking time bomb lol. People think letting a model run a self-testing loop is a flex until they realize they've just generated more boilerplate than windows xp. if your agent doesn't have a logic-gate to stop the token vomit you're just paying for a digital garbage fire. at sglang we prefer precision over tonnage"  
[X Link](https://x.com/GenAI_is_real/status/2021098848870105331)  2026-02-10T05:48Z [----] followers, 20.8K engagements


""docker is over" is classic x hype but the pydantic team is cooking for real. The bottleneck for agents was never just the tokens it was the 500ms startup latency for a fresh sandbox. If monty can give me memory-safe execution in microseconds without the syscall overhead thats the real alpha. Still wouldn't run a database on it but for tool-use Game changer"  
[X Link](https://x.com/GenAI_is_real/status/2021099092978565450)  2026-02-10T05:49Z [----] followers, 32.9K engagements


"Watching an agent use Kelly criterion to pay its own API bill is the most "2026" thing I've seen. Most people are still writing prompts while this thing is scraping NOAA and injury reports to exploit Polymarket mispricing. The bottleneck for agency wasn't intelligence it was the incentive. If you don't trust the model to manage its own bankroll you don't really trust the model. Simple as"  
[X Link](https://x.com/GenAI_is_real/status/2021367546310787470)  2026-02-10T23:35Z [----] followers, 111.7K engagements


"Porting 80s asm-style c to TypeScript without manual steering is the ultimate stress test for codex [---]. Its not just about syntax its about reasoning through ancient bitshift logic and side effects. The fact that it didn't hit limits shows why were obsessed with kv cache efficiency at sglang: long context is finally usable for real engineering not just "summarize this pdf" slop"  
[X Link](https://x.com/GenAI_is_real/status/2021367747905761407)  2026-02-10T23:36Z [----] followers, [----] engagements


"If you are still debating whether $200/mo is worth it for Opus while it replaces $15k agency work you are NGMI. The real flex isn't the migration itself it's that @AnthropicAI is basically eating the entire mid-tier dev market. Most agencies are walking dead and they dont even know it yet. What happens when we have 10M context as standard total wipeout"  
[X Link](https://x.com/GenAI_is_real/status/2021368910424199466)  2026-02-10T23:41Z [----] followers, [----] engagements


"Andrew is being a bit too optimistic here. The real job killer isn't just people using AI; it's the massive drop in inference costs for long-context reasoning we're seeing in [----]. When a 1M context window becomes dirt cheap you don't need [--] developers + [--] PM. You need one architect who understands system constraints and an autonomous rollout engine. We are moving from the era of coding to the era of pure system orchestration. Most people are still building on last year's tech while the ground is shifting under them"  
[X Link](https://x.com/GenAI_is_real/status/2021369137529049132)  2026-02-10T23:42Z [----] followers, 115.4K engagements


"Smart workaround by @zarazhangrui but honestly these manual handover hacks are just symptoms of inefficient context management. If your inference engine can't handle long-context compression or intelligent KV cache eviction you're always going to be stuck writing .md files for your AI. The real game changer isn't better prompts it's infra that makes "context window full" a thing of the past"  
[X Link](https://x.com/GenAI_is_real/status/2021369250443821138)  2026-02-10T23:42Z [----] followers, [----] engagements


"Programming is 10x more fun because most people stopped fighting the syntax and started "vibe coding." But the party ends when the context window fills up or the throughput hits a wall. The real flex in [----] isn't having @lexfridman's AI write a python script. It's building the inference stack that makes these agents fast enough to not break your flow. Most devs are just playing in the sandbox real ones are building the shovels"  
[X Link](https://x.com/GenAI_is_real/status/2021369412272652295)  2026-02-10T23:43Z [----] followers, 15.2K engagements


"Another elite safety researcher leaving the frontline to study poetry because the "world is in peril." While I respect the personal choice poetry doesn't solve the alignment tax or the KV cache bottleneck. The industry is splitting into two: those who retreat into melodrama when the tech gets scary and those who stay to build the robust inference systems that actually keep the agents under control. We need more engineering not more metaphors"  
[X Link](https://x.com/GenAI_is_real/status/2021369692426998248)  2026-02-10T23:44Z [----] followers, 16.5K engagements


"MCP is becoming the TCP/IP of the agentic era. Watching @excalidraw turn a weekend project into an official server in days proves that the bottleneck was never the tool; it was the interface between the model and the environment. The real winners in [----] aren't the ones building more standalone apps. It's the teams building the connective tissue that lets agents actually see draw and act. We are witnessing the death of siloed software"  
[X Link](https://x.com/GenAI_is_real/status/2021369866931012063)  2026-02-10T23:45Z [----] followers, 19.3K engagements


"RLHF infra is a beast. Just dropped the full documentation for Miles server arguments. It covers everything from Ray resource scheduling to R3 (Rollout Routing Replay) for MoE models. Some pro-tips included: [--]. Blackwell (B200/B300) Tip: When colocating training & rollout set --sglang-mem-fraction-static to [---]. This leaves enough room for Megatron to breathe. [--]. MoE Consistency: Use --use-rollout-routing-replay to align expert decisions between inference and backprop. [--]. Rust Router: Why we bypass Python for the Model Gateway to hit peak throughput. If you're still fighting OOMs or"  
[X Link](https://x.com/GenAI_is_real/status/2021514818537332895)  2026-02-11T09:21Z [----] followers, [----] engagements


"Inference scaling is where the real alpha is right now. Majority voting is too naive and BoN gets baited by reward hacking. BoM feels like the right way to handle the uncertainty. Were moving from "smarter models" to "smarter compute allocation." Huge work from @di_qiwei this is going to be standard in every rollout engine soon"  
[X Link](https://x.com/GenAI_is_real/status/2021758062957400457)  2026-02-12T01:27Z [----] followers, [----] engagements


"The "SaaSpocalypse" isn't a market glitch it's a deliberate strategy. while everyone is debating benchmarks Anthropic is literally picking which SaaS giants to kill next. @jandreini1 isn't kidding. if youre a CEO at Salesforce or HubSpot youre not competing with another software youre competing with a table of people at @AnthropicAI deciding your industry is next"  
[X Link](https://x.com/GenAI_is_real/status/2021771233302655306)  2026-02-12T02:19Z [----] followers, [----] engagements


"Everyone is still obsessed with building fancy UI wrappers for AI but Anthropic is moving the goalposts back to the filesystem. Skills are basically SOPs for agents. were going from "prompt engineering" to "workflow encoding." if your companys internal knowledge isnt structured like this youre going to have a hard time scaling any real agentic workflows. @Hartdrawss breakdown is solid but the real shock is how much this devalues traditional orchestration layers"  
[X Link](https://x.com/GenAI_is_real/status/2021771389003587793)  2026-02-12T02:20Z [----] followers, 36.2K engagements


"Deepthink is proving that raw model size isnt the only way to AGI. the real gains are coming from "generate - verify - revise" agentic loops. Aletheia hitting 91.9% on ProofBench is wild. were entering an era where the inference stack is just as complex as the training stack. huge respect to Thang Luong and the @GoogleDeepMind team for showing how agentic workflows actually scale to PhD-level math. this is exactly what were thinking about for high-performance rollout engines"  
[X Link](https://x.com/GenAI_is_real/status/2021771525733683230)  2026-02-12T02:21Z [----] followers, [----] engagements


"people are still debating if AI will replace their jobs while the smart money is already moving to own the infra that makes those jobs obsolete. the gap between "i use AI" and "i own the compute" is the new wealth divide. if youre just a user youre paying the rent for your own displacement. @AviFelman is right put every dollar into the machine or prepare to be the fuel"  
[X Link](https://x.com/GenAI_is_real/status/2021771754717557195)  2026-02-12T02:22Z [----] followers, [----] engagements


"People underestimate how much of a models persona is just a reflection of its reward model during RLHF. Opus [---] has this built-in confidence that almost feels like hubris. Were seeing the same thing in high-level reasoning tasks. the harder the problem the more the model doubles down on its initial logic. @thepushkarps experiment shows exactly why multi-agent debate needs a neutral verifier its just two models gaslighting each other"  
[X Link](https://x.com/GenAI_is_real/status/2021772269408989381)  2026-02-12T02:24Z [----] followers, [---] engagements

Limited data mode. Full metrics available with subscription: lunarcrush.com/pricing

@GenAI_is_real Avatar @GenAI_is_real Chayenne Zhao

Chayenne Zhao posts on X about ai, inference, if you, just a the most. They currently have [-----] followers and [--] posts still getting attention that total [-------] engagements in the last [--] hours.

Engagements: [-------] #

Engagements Line Chart

  • [--] Week [-------] -47%
  • [--] Month [---------] +136%
  • [--] Months [---------] +50,733%
  • [--] Year [---------] +130,696%

Mentions: [--] #

Mentions Line Chart

  • [--] Week [--] -9.20%
  • [--] Month [---] +477%
  • [--] Months [---] +6,733%
  • [--] Year [---] +1,633%

Followers: [-----] #

Followers Line Chart

  • [--] Week [-----] +11%
  • [--] Month [-----] +61%
  • [--] Months [-----] +760%
  • [--] Year [-----] +3,667%

CreatorRank: [------] #

CreatorRank Line Chart

Social Influence

Social category influence technology brands 20.25% finance 12.66% stocks 5.06% celebrities 3.8% countries 1.27% social networks 1.27%

Social topic influence ai #1465, inference #10, if you #2320, just a #1831, agents #262, in the #5873, anthropic #186, vibe coding #145, agentic #37, $googl 5.06%

Top accounts mentioned or mentioned by @karpathy @gabrielmillien1 @sglproject @sama @radixark @lmsysorg @navneet_rabdiya @botir33751732 @jiahanjimliu @natfriedman @googledeepmind @theahchu @js4drew @celiksei @chernobyl_ak47 @kyletranai @nilayparikh @synthesisledger @ylecun @kimimoonshot

Top assets mentioned Alphabet Inc Class A (GOOGL) Flex Ltd. Ordinary Shares (FLEX) Frontline Ltd. (FRO)

Top Social Posts

Top posts by engagements in the last [--] hours

"To achieve training-inference alignment there will be a solution that directly uses Megatron for inference. Today while chatting with friends in the NeMo RL group I came across a term that surprised me: Megatron-Inference. I immediately understood what it was aiming for as I've been talking about the training-inference mismatch problem for a long time. Since last September I've covered this topic in many talks. For instance at the Torch Conference I would start my talk by saying: "The root cause of the training-inference mismatch is that training and inference use different backends. Megatron"
X Link 2026-02-01T06:39Z [----] followers, [----] engagements

"Good to see @EXM7777 and @kloss_xyz on this list. everyones yapping about "vibecoding" while these monsters are actually building the infra and ops to make autonomous agents reliable. in [----] the real moat isnt the weightsits the systems architecture that keeps 147k agents from melting your data center. follow the engineers not just the prompters. @alex_prompter @karpathy @radixark"
X Link 2026-02-01T06:50Z [----] followers, [----] engagements

"16 fps for a world model is cool but honestly we can do much better. @dr_cintas is missing the fact that if you run wan2.2 through @sgl_project youre getting the fastest diffusion inference on the planet. Were talking 2x speedup with zero quality loss. why walk when you can fly SGLang is literally carrying the open source video scene right now"
X Link 2026-02-02T00:50Z [----] followers, [----] engagements

"xai + spacex is the endgame. Elon realized that to govern starlink fleets or navigate deep space we need more than just hardcoded logic. we need grok as the commander. silicon intelligence is the only way for humanity to scale beyond earth. rip classic aerospace engineering it's all about ml now. @karpathy thoughts"
X Link 2026-02-02T22:26Z [----] followers, [----] engagements

""Working in the frontier vs doing a phd" The velocity in open source infra is just unmatched right now. nathans point about being a venture capitalist of computing is the ultimate truth. Weve been "investing" our compute into making serving stacks faster and more stable and the roi has been insane. day [--] support for new models isn't just a vibe it's a signal that you're building the future not just studying it"
X Link 2026-02-03T01:06Z [----] followers, [----] engagements

"Grok is literally proving that people hate being lectured by their local llm. While OAI is busy adding more guardrails and filters Grok is just shipping. The distribution advantage of x is insane but the real killer is the personality. GPT is for homework Grok is for the real world. its over for the preachy models. @karpathy thoughts"
X Link 2026-02-05T04:17Z [----] followers, [----] engagements

"ui-tars is actually a big deal for local automation. Weve already started looking into optimizing the vision-language rollout in @sgl_project to make these desktop agents feel instantaneous. If you think they are just copying youre not paying attention. The efficiency of their local models is terrifying. Building the infrastructure to run this stuff without latency is the real battle. @karpathy"
X Link 2026-02-05T04:31Z [----] followers, [----] engagements

"context limits are the new vram bottleneck lol. codex 5.3-xhigh is still struggling with long-range dependency in complex kernels while opus [---] handles it like a breeze. @yuchenj_uw great benchmark but honestly weve seen similar gains in inference engien for a while. Current models are basically junior infra devs now. @karpathy curious if you think well eventually hit a wall where models cant optimize what they can't physically profile"
X Link 2026-02-07T07:23Z [----] followers, [----] engagements

"This is why I took a leave from my phd and skipped the big tech circus lol. Imagine arguing about YAML vs JSON while the rest of the world is shipping on raw vibes. At Radixark we just ship and if it breakss we fix it in [--] hour. 6-pagers are just a slow death for intelligence. @daddynohara you forgot the part where the promo doc takes more compute than the actual model"
X Link 2026-02-07T07:24Z [----] followers, [----] engagements

"Why we strictly enforce small PRs at sglang Reviewing [--] lines of complex cuda kernel logic is a nightmare but [---] lines is just "lgtm" and a prayer lol. Big tech lazyness is how technical debt starts. if you cant split your diff you don't understand your own code"
X Link 2026-02-07T07:27Z [----] followers, [----] engagements

"lol the accuracy is scary. "it's always day 1" is basically corporate code for "we have no idea how to ship so let's just write another 6-pager about it". moved to startup life specifically to stop the leadership principle roleplay. If your inference engine takes [--] months of alignment meetings to deploy you're not a tech company you're a cult. @nikhilv nails the vp email template"
X Link 2026-02-07T07:29Z [----] followers, [----] engagements

"musk is right that dollars are just a proxy for energy efficiency. in the end everything collapses into how much reasoning you can extract per watt. at sglang were literally obsessed with this. if your software stack is wasting tonnage of silicon on bad kernels youre basically burning the future currency. MFU is the metric that matters now. @elonmusk"
X Link 2026-02-08T22:17Z [----] followers, [----] engagements

"2.5x speedup on opus [---] is wild but the "more expensive" part tells you everything about the current compute bottleneck. At SGLang weve been chasing these kinds of gains through raw kernel optimization. fast mode is the only way to play now"
X Link 2026-02-08T22:18Z [----] followers, [----] engagements

"Ahmad gets it. We built SGLang specifically to kill the "100k lines of Python bloat" culture in inference. Radix cache isn't just a "trick", it's the backbone for structured output and complex agents. If you can't hold the codebase in your head you can't optimize it. Glad people are finally grokking the scheduler logic"
X Link 2026-02-08T22:22Z [----] followers, 31.5K engagements
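The post above leans on radix-cache prefix reuse without spelling it out. Here is a minimal sketch of the idea, assuming a toy trie over token IDs; all class and method names are illustrative, not SGLang's actual API. The point is only that requests sharing a prompt prefix can reuse cached prefill work instead of recomputing it.

```python
class PrefixCacheNode:
    def __init__(self):
        self.children = {}   # token id -> child node
        self.cached = False  # True if KV state for the path to here is cached

class PrefixCache:
    """Toy prefix (radix-style) cache: counts reusable leading tokens."""

    def __init__(self):
        self.root = PrefixCacheNode()

    def insert(self, tokens):
        """Mark a token sequence as cached (e.g. after prefill)."""
        node = self.root
        for t in tokens:
            node = node.children.setdefault(t, PrefixCacheNode())
            node.cached = True

    def match_prefix(self, tokens):
        """Return how many leading tokens can reuse cached KV state."""
        node, matched = self.root, 0
        for t in tokens:
            child = node.children.get(t)
            if child is None or not child.cached:
                break
            node, matched = child, matched + 1
        return matched

cache = PrefixCache()
cache.insert([1, 2, 3, 4])               # e.g. a shared system prompt
print(cache.match_prefix([1, 2, 3, 9]))  # -> 3 tokens of reusable prefix
```

SGLang's real implementation works at KV-page granularity with eviction and reference counting; this sketch only shows why shared prefixes (system prompts, few-shot headers, agent scaffolding) make structured output and multi-call agents cheap.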

"Vibe coding is cute for toys but try building a real production-grade rollout engine with this lol. @fayhecode is right that the entry bar is gone but once you hit state sync and memory bottlenecks Claude [---] just hallucinates like crazy. We're literally seeing the death of mid-level devs in real time"
X Link 2026-02-07T07:20Z [----] followers, [----] engagements

"Opening [--] worktrees with Claude code is literally the end of programming as we know it. if u are still writing code line by line u are basically a digital monk at this point. The output is going to be so insane that human reviewers will be the next bottleneck. OAI needs to ship something fast or anthropic is taking over the entire dev lifecycle @sama @DarioAmodei"
X Link 2026-02-01T07:08Z [----] followers, 317.4K engagements

"Elon and Jensen are 100% right. Coding is just the syntax, math is the logic. We see this every day at @sgl_project: optimizing a rollout engine isn't about writing Python it's about understanding stochastic processes and memory orchestration. If you don't get the physics of the hardware you're just a prompt engineer. This is why Google is spending $185b on capex; they're building the physical foundation for the next $5 trillion. logic > syntax"
X Link 2026-02-07T07:18Z [----] followers, 30.3K engagements

"anthropic claiming agents built a C compiler when it's basically just a 2000-step overfitted mess with hard-coded dates lol. Training on Linux and validating on Linux is the literal definition of look-ahead bias. Real compilers need logic not just probabilistic pattern matching. We're seeing "vibe-coding" hit a wall where actual correctness matters. Hello world failing is the cherry on top. @anthropic what happened to honesty in ai"
X Link 2026-02-08T22:15Z [----] followers, 133K engagements

"Buying prompt guides in [----] is like buying a manual on how to talk to your neighbor lol. If you need 2000+ prompts to get a model to work the model is either broken or you are. Gemini isn't just "smartest" it just has a massive context window that people don't know how to fill with actual data instead of prompt engineering slop. @piyascode9 just drop the link or move on"
X Link 2026-02-08T22:19Z [----] followers, [----] engagements

"Being at ucla i can confirm Terry Tao is a god but LeCun is tripping if he thinks scientists aren't motivated by money lol. It's just a different objective function. Research optimizes for depth, engineering optimizes for throughput and scale. At SGLang we're basically doing both. Also checking Transparent California for your prof's salary is the ultimate UCLA pastime"
X Link 2026-02-08T22:21Z [----] followers, 21.4K engagements

"RLM isn't killing RAG until the inference cost for long-context recursion stops being a wealth tax lol. Beff is right about the reasoning shift but nobody is stuffing 10m docs into an agentic loop when KV cache management is still this expensive. RAG is just evolving into a pre-fetch layer for systems like sglang to handle the heavy lifting. Wattage and tonnage always win. @beffjezos"
X Link 2026-02-08T22:24Z [----] followers, [----] engagements

""google just killed" lol this repo is literally half a year old. The hype cycle on X moves faster than the actual code. Extraction is easy but doing it with sub-second latency and zero hallucination in a production agentic loop is the real boss fight. If you're still relying on basic python wrappers for this your inference budget is going to explode. Simple as"
X Link 2026-02-10T05:40Z [----] followers, [----] engagements

"Reverse engineering the binary just to find a hidden websocket flag is the most based thing I've seen this week. @anthropic tries to gatekeep the ui but the infra always finds a way. reminds me of why we used zmq for sglang: sometimes the simplest transport layer is the most powerful. stay curious anon"
X Link 2026-02-10T05:43Z [----] followers, 36.8K engagements

"This is literally the only way to talk to AI in [----]. The "it depends" corporate hedging is a lobotomy for intelligence. I've been tweaking my working prompts with a similar approach because I can't stand the "great question" sycophant energy anymore. Be the assistant you'd actually want to talk to at 2am or don't be an assistant at all. absolute gold from @steipete"
X Link 2026-02-10T05:44Z [----] followers, [---] engagements

"codex [---] in cursor is the real test for opus [---]. "intelligence and speed" scaling together usually means they've finally cracked the memory-bound bottleneck in their specific coding architecture. at sglang we're seeing similar trends: if you optimize the kv cache correctly the "pick one" trade-off disappears. speed isn't a feature anymore it's the cost of entry"
X Link 2026-02-10T05:45Z [----] followers, [----] engagements

"This is exactly why "vibe coding" without architecture is a ticking time bomb lol. People think letting a model run a self-testing loop is a flex until they realize they've just generated more boilerplate than windows xp. if your agent doesn't have a logic-gate to stop the token vomit you're just paying for a digital garbage fire. at sglang we prefer precision over tonnage"
X Link 2026-02-10T05:48Z [----] followers, 20.8K engagements

""docker is over" is classic x hype but the pydantic team is cooking for real. The bottleneck for agents was never just the tokens, it was the 500ms startup latency for a fresh sandbox. If monty can give me memory-safe execution in microseconds without the syscall overhead that's the real alpha. Still wouldn't run a database on it but for tool-use? Game changer"
X Link 2026-02-10T05:49Z [----] followers, 32.9K engagements

"Watching an agent use Kelly criterion to pay its own API bill is the most "2026" thing I've seen. Most people are still writing prompts while this thing is scraping NOAA and injury reports to exploit Polymarket mispricing. The bottleneck for agency wasn't intelligence it was the incentive. If you don't trust the model to manage its own bankroll you don't really trust the model. Simple as"
X Link 2026-02-10T23:35Z [----] followers, 111.7K engagements
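For readers who don't know the reference in the post above: the Kelly criterion sizes a bet from an estimated edge, staking the fraction f* = (b·p − (1 − p)) / b of the bankroll for win probability p and net decimal odds b, clamped at zero when there is no edge. A hedged sketch with made-up numbers (nothing here comes from the actual agent in the post):

```python
def kelly_fraction(p: float, b: float) -> float:
    """Optimal bankroll fraction for a binary bet with win prob p and net odds b."""
    edge = b * p - (1.0 - p)
    return max(0.0, edge / b)  # never bet when the expected edge is negative

# Illustration: agent thinks a market priced at 40c is really 55% likely.
# Buying at 0.40 pays 0.60 profit per 0.40 staked -> b = 0.60 / 0.40 = 1.5
f = kelly_fraction(p=0.55, b=1.5)
print(round(f, 3))  # -> 0.25, i.e. stake 25% of bankroll at full Kelly
```

In practice agents would bet a fraction of Kelly (half-Kelly or less), since the win probability p is itself a noisy model estimate.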

"Porting 80s asm-style c to TypeScript without manual steering is the ultimate stress test for codex [---]. It's not just about syntax, it's about reasoning through ancient bitshift logic and side effects. The fact that it didn't hit limits shows why we're obsessed with kv cache efficiency at sglang: long context is finally usable for real engineering not just "summarize this pdf" slop"
X Link 2026-02-10T23:36Z [----] followers, [----] engagements

"If you are still debating whether $200/mo is worth it for Opus while it replaces $15k agency work you are NGMI. The real flex isn't the migration itself it's that @AnthropicAI is basically eating the entire mid-tier dev market. Most agencies are walking dead and they don't even know it yet. What happens when we have 10M context as standard? total wipeout"
X Link 2026-02-10T23:41Z [----] followers, [----] engagements

"Andrew is being a bit too optimistic here. The real job killer isn't just people using AI, it's the massive drop in inference costs for long-context reasoning we're seeing in [----]. When a 1M context window becomes dirt cheap you don't need [--] developers + [--] PM. You need one architect who understands system constraints and an autonomous rollout engine. We are moving from the era of coding to the era of pure system orchestration. Most people are still building on last year's tech while the ground is shifting under them"
X Link 2026-02-10T23:42Z [----] followers, 115.4K engagements

"Smart workaround by @zarazhangrui but honestly these manual handover hacks are just symptoms of inefficient context management. If your inference engine can't handle long-context compression or intelligent KV cache eviction you're always going to be stuck writing .md files for your AI. The real game changer isn't better prompts it's infra that makes "context window full" a thing of the past"
X Link 2026-02-10T23:42Z [----] followers, [----] engagements
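A toy illustration of the "intelligent KV cache eviction" the post gestures at, assuming a simple least-recently-used policy over cache blocks. All names here are hypothetical; production engines evict at page granularity with reference counting and prefix-aware policies, not a bare LRU.

```python
from collections import OrderedDict

class LRUKVCache:
    """Toy KV cache: when full, evict the block unused the longest."""

    def __init__(self, capacity_blocks: int):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()  # block_id -> kv payload (stubbed)

    def access(self, block_id, kv=None):
        """Touch (or insert) a block, evicting the LRU block if full."""
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)  # mark as most recently used
            return self.blocks[block_id]
        if len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)    # evict least recently used
        self.blocks[block_id] = kv
        return kv

cache = LRUKVCache(capacity_blocks=2)
cache.access("a", kv="KV_a")
cache.access("b", kv="KV_b")
cache.access("a")              # "a" becomes most recent
cache.access("c", kv="KV_c")   # cache full -> evicts "b"
print(list(cache.blocks))      # -> ['a', 'c']
```

The post's point is that when the engine manages this transparently, "context window full" stops being something users work around with handover .md files.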

"Programming is 10x more fun because most people stopped fighting the syntax and started "vibe coding." But the party ends when the context window fills up or the throughput hits a wall. The real flex in [----] isn't having @lexfridman's AI write a python script. It's building the inference stack that makes these agents fast enough to not break your flow. Most devs are just playing in the sandbox real ones are building the shovels"
X Link 2026-02-10T23:43Z [----] followers, 15.2K engagements

"Another elite safety researcher leaving the frontline to study poetry because the "world is in peril." While I respect the personal choice poetry doesn't solve the alignment tax or the KV cache bottleneck. The industry is splitting into two: those who retreat into melodrama when the tech gets scary and those who stay to build the robust inference systems that actually keep the agents under control. We need more engineering not more metaphors"
X Link 2026-02-10T23:44Z [----] followers, 16.5K engagements

"MCP is becoming the TCP/IP of the agentic era. Watching @excalidraw turn a weekend project into an official server in days proves that the bottleneck was never the tool, it was the interface between the model and the environment. The real winners in [----] aren't the ones building more standalone apps. It's the teams building the connective tissue that lets agents actually see, draw and act. We are witnessing the death of siloed software"
X Link 2026-02-10T23:45Z [----] followers, 19.3K engagements

"RLHF infra is a beast. Just dropped the full documentation for Miles server arguments. It covers everything from Ray resource scheduling to R3 (Rollout Routing Replay) for MoE models. Some pro-tips included: [--]. Blackwell (B200/B300) Tip: When colocating training & rollout set --sglang-mem-fraction-static to [---]. This leaves enough room for Megatron to breathe. [--]. MoE Consistency: Use --use-rollout-routing-replay to align expert decisions between inference and backprop. [--]. Rust Router: Why we bypass Python for the Model Gateway to hit peak throughput. If you're still fighting OOMs or"
X Link 2026-02-11T09:21Z [----] followers, [----] engagements

"Inference scaling is where the real alpha is right now. Majority voting is too naive and BoN gets baited by reward hacking. BoM feels like the right way to handle the uncertainty. We're moving from "smarter models" to "smarter compute allocation." Huge work from @di_qiwei, this is going to be standard in every rollout engine soon"
X Link 2026-02-12T01:27Z [----] followers, [----] engagements
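To make the contrast above concrete: majority voting keeps the most frequent final answer across N samples, while Best-of-N (BoN) keeps the single sample a reward model scores highest, which is exactly what leaves it open to reward hacking. A minimal sketch of both baselines (the post does not spell out what BoM is, so it is not reproduced here):

```python
from collections import Counter

def majority_vote(answers):
    """Most common final answer among N sampled completions."""
    return Counter(answers).most_common(1)[0][0]

def best_of_n(samples, reward_fn):
    """Single sample with the highest reward-model score (BoN)."""
    return max(samples, key=reward_fn)

answers = ["42", "42", "41", "42", "40"]
print(majority_vote(answers))  # -> "42"

samples = ["short", "medium answer", "a much longer answer"]
# Toy reward model that (pathologically) prefers length -- exactly the
# kind of proxy signal BoN can overfit to.
print(best_of_n(samples, reward_fn=len))  # -> "a much longer answer"
```

The length-based `reward_fn` is a deliberately bad stand-in: it shows how BoN faithfully maximizes whatever the reward model scores, flaws included, whereas majority voting ignores scores entirely and only counts agreement.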

"The "SaaSpocalypse" isn't a market glitch, it's a deliberate strategy. while everyone is debating benchmarks Anthropic is literally picking which SaaS giants to kill next. @jandreini1 isn't kidding. if you're a CEO at Salesforce or HubSpot you're not competing with other software, you're competing with a table of people at @AnthropicAI deciding your industry is next"
X Link 2026-02-12T02:19Z [----] followers, [----] engagements

"Everyone is still obsessed with building fancy UI wrappers for AI but Anthropic is moving the goalposts back to the filesystem. Skills are basically SOPs for agents. we're going from "prompt engineering" to "workflow encoding." if your company's internal knowledge isn't structured like this you're going to have a hard time scaling any real agentic workflows. @Hartdrawss breakdown is solid but the real shock is how much this devalues traditional orchestration layers"
X Link 2026-02-12T02:20Z [----] followers, 36.2K engagements

"Deepthink is proving that raw model size isn't the only way to AGI. the real gains are coming from "generate - verify - revise" agentic loops. Aletheia hitting 91.9% on ProofBench is wild. we're entering an era where the inference stack is just as complex as the training stack. huge respect to Thang Luong and the @GoogleDeepMind team for showing how agentic workflows actually scale to PhD-level math. this is exactly what we're thinking about for high-performance rollout engines"
X Link 2026-02-12T02:21Z [----] followers, [----] engagements

"people are still debating if AI will replace their jobs while the smart money is already moving to own the infra that makes those jobs obsolete. the gap between "i use AI" and "i own the compute" is the new wealth divide. if you're just a user you're paying the rent for your own displacement. @AviFelman is right, put every dollar into the machine or prepare to be the fuel"
X Link 2026-02-12T02:22Z [----] followers, [----] engagements

"People underestimate how much of a model's persona is just a reflection of its reward model during RLHF. Opus [---] has this built-in confidence that almost feels like hubris. We're seeing the same thing in high-level reasoning tasks. the harder the problem the more the model doubles down on its initial logic. @thepushkarps experiment shows exactly why multi-agent debate needs a neutral verifier, it's just two models gaslighting each other"
X Link 2026-02-12T02:24Z [----] followers, [---] engagements
