# @scaling01 Lisan al Gaib

Lisan al Gaib posts on X about anthropic, open ai, and ai the most. They currently have [------] followers and [---] posts still getting attention that total [---------] engagements in the last [--] hours.

### Engagements: [---------] [#](/creator/twitter::1825243643529027584/interactions)

- [--] Week [---------] +99%
- [--] Month [----------] -7.40%
- [--] Months [-----------] +169%
- [--] Year [-----------] +2,064%

### Mentions: [---] [#](/creator/twitter::1825243643529027584/posts_active)

- [--] Week [---] +19%
- [--] Month [---] +68%
- [--] Months [-----] +72%
- [--] Year [-----] +790%

### Followers: [------] [#](/creator/twitter::1825243643529027584/followers)

- [--] Week [------] +3.30%
- [--] Month [------] +7.90%
- [--] Months [------] +66%
- [--] Year [------] +373%

### CreatorRank: [------] [#](/creator/twitter::1825243643529027584/influencer_rank)

### Social Influence

**Social category influence** [technology brands](/list/technology-brands) #1794, [finance](/list/finance) #1981, [stocks](/list/stocks) 5.06%, [countries](/list/countries) 1.93%, [celebrities](/list/celebrities) 1.93%, [automotive brands](/list/automotive-brands) 0.72%, [social networks](/list/social-networks) 0.24%, [gaming](/list/gaming) 0.24%, [travel destinations](/list/travel-destinations) 0.24%

**Social topic influence** [anthropic](/topic/anthropic) #58, [open ai](/topic/open-ai) #37, [ai](/topic/ai) #5062, [in the](/topic/in-the) 3.61%, [agi](/topic/agi) #68, [$googl](/topic/$googl) #1117, [we are](/topic/we-are) 2.89%, [inference](/topic/inference) #5, [xai](/topic/xai) #284, [llm](/topic/llm) #423

**Top accounts mentioned or mentioned by** [@codewithimanshu](/creator/undefined) [@grok](/creator/undefined) [@test_tm7873](/creator/undefined) [@teortaxestex](/creator/undefined) [@ylecun](/creator/undefined) [@jasonbotterill](/creator/undefined) [@presidentlin](/creator/undefined) [@teknium](/creator/undefined) [@lucaploo](/creator/undefined)
[@doctorthe113](/creator/undefined) [@kittingercloud](/creator/undefined) [@blueemi99](/creator/undefined) [@chasebrowe32432](/creator/undefined) [@kuittinenpetri](/creator/undefined) [@edwardkens50830](/creator/undefined) [@patriot5715](/creator/undefined) [@32b](/creator/undefined) [@elonmusk](/creator/undefined) [@lunexalith](/creator/undefined) [@mikeknoop](/creator/undefined)

**Top assets mentioned** [Alphabet Inc Class A (GOOGL)](/topic/$googl) [NVIDIA Corp. (NVDA)](/topic/$nvda) [Alibaba Group (BABA)](/topic/alibaba-group)

### Top Social Posts

Top posts by engagements in the last [--] hours

"@ray2wwn it was way longer than it needed to be too much unnecessary yapping in the movie without all the hour of yapping I agree" [X Link](https://x.com/scaling01/status/1998753346908467578) 2025-12-10T13:55Z 34.1K followers, 10.6K engagements

"Grok-4 is still underrated Grok [--] by @xai GPT-5 by @OpenAI and Gemini [---] Pro by @GoogleDeepMind achieve the highest accuracy in AA-Omniscience. The reason they do not achieve the highest Omniscience Index due to the low hallucination rates of @AnthropicAIs Claude models https://t.co/Augr5G5kdn" [X Link](https://x.com/scaling01/status/1990457214025281616) 2025-11-17T16:29Z 33.4K followers, 6.6M engagements

"the bitter pill is that Nolans last great movie was Interstellar and that the Dune trilogy will likely be the greatest trilogy since LOTR The two most anticipated films of [----] https://t.co/XkqrUkE1A3" [X Link](https://x.com/scaling01/status/1998423990931738694) 2025-12-09T16:06Z 33.5K followers, 714.8K engagements

"my girlfriend claudia told me there is a good chance that they will release Claude-5 earlier than expected absolutely insane how hard anthropic cooked. wonder what they have going on internally" [X Link](https://x.com/scaling01/status/2008315437323268224) 2026-01-05T23:11Z 33.6K followers, 13.5K engagements

"So when DeepSeek releases V4 surely OpenAI will also release GPT-OSS-2 20B and 120B" [X Link](https://x.com/scaling01/status/2013779471141109763) 2026-01-21T01:03Z 33.2K followers, 18.7K engagements

"I'm starting to get worried. Did Anthropic solve continual learning Is that the preparation for evolving agents" [X Link](https://x.com/scaling01/status/2014008263289848267) 2026-01-21T16:12Z 33.4K followers, 533.6K engagements

"Anthropic is preparing for the singularity I'm starting to get worried. Did Anthropic solve continual learning Is that the preparation for evolving agents https://t.co/pcCoSM4gAr" [X Link](https://x.com/scaling01/status/2014009216130834509) 2026-01-21T16:16Z 33.6K followers, 542.4K engagements

"Qwen3-Max-Thinking Introducing Qwen3-Max-Thinking our most capable reasoning model yet. Trained with massive scale and advanced RL it delivers strong performance across reasoning knowledge tool use and agent capabilities. Key innovations: Adaptive tool-use: intelligently leverages https://t.co/6sZiKWQAq3" [X Link](https://x.com/scaling01/status/2015808648547623067) 2026-01-26T15:26Z 33.2K followers, 15.3K engagements

"new Dario blog" [X Link](https://x.com/scaling01/status/2015845636478771321) 2026-01-26T17:53Z 33.3K followers, 333.6K engagements

"shots fired" [X Link](https://x.com/scaling01/status/2015867868739481919) 2026-01-26T19:22Z 33K followers, 10.3K engagements

"Dario is posting about the permanent underclass and you are laughing" [X Link](https://x.com/scaling01/status/2015876509660012890) 2026-01-26T19:56Z 33K followers, 36.4K engagements

"it is so unbelievably obvious that Anthropic has the mandate Dario last sentence of his latest blog: "when put in the darkest circumstances humanity has a way of gathering seemingly at the last minute the strength and wisdom needed to prevail" meanwhile Sam latest blog post ends with a message about how to increase revenue like this is scripted it's so bad and obvious LMAO new Dario blog https://t.co/LeSQ8RAuPQ" [X Link](https://x.com/scaling01/status/2015882618831568948) 2026-01-26T20:20Z 33.3K followers, 289.6K engagements

"we are in the intelligence explosion and this guy is still dooming and moving goal-posts Yann LeCun says absolutely none of the humanoid companies have any idea how to make those robots smart enough to be useful. https://t.co/z0wYtXwcf8" [X Link](https://x.com/scaling01/status/2015892106372415730) 2026-01-26T20:58Z 33.6K followers, 13.8K engagements

"Kimi bros are back Kimi K2.5 has silently released on web https://t.co/z7cnKGpdhm" [X Link](https://x.com/scaling01/status/2015898244946039115) 2026-01-26T21:22Z 33K followers, 16.8K engagements

"he will look like a fucking idiot again in [--] years when we train robots end-to-end with world models he's using the same arguments as he did for LLMs and they all failed Yann is one of these people that are spiritually correct that LLMs and whatever might not be AGI because they are obviously not but practically this just falls apart you don't need AGI you don't need human sample efficiency Yann LeCun says absolutely none of the humanoid companies have any idea how to make those robots smart enough to be useful. https://t.co/z0wYtXwcf8" [X Link](https://x.com/scaling01/status/2015925816685953421) 2026-01-26T23:12Z 33.3K followers, 99.1K engagements

"lmao they can't even do a livestream Sam is just talking into the void [--] minutes until the OpenAI stream starts: https://t.co/FrPQcwWvFa" [X Link](https://x.com/scaling01/status/2015937960214929735) 2026-01-27T00:00Z 33K followers, 23.6K engagements

"I think this is the order in which I like to use the models (purely usability/usefulness): Kimi [---] GLM [---] MiniMax M2.1 DeepSeek V3.2 Qwen3 235B Qwen just feels very slop and last gen by now. Both GLM and MiniMax absolutely destroy it. DeepSeek V3.2 is a strong model and I would rank it higher but all the inference providers are at like 10-30 tps. GLM above MiniMax because the size difference (355@32B vs 255@10B) is noticeable. Well and Kimi is just a much larger model with very good post training. I think it's like a Sonnet or Opus [--].
Kimi is still the most usable open-weights model" [X Link](https://x.com/scaling01/status/2016090207553003631) 2026-01-27T10:05Z 33.2K followers, 49K engagements

"@ylecun We will see in [----]. Good luck. Excited to see Yann enterprise dominate all of robotics" [X Link](https://x.com/scaling01/status/2016149969950896278) 2026-01-27T14:03Z 33.2K followers, [----] engagements

"Kimi-K2.5 leapfrogging other chinese models like GLM-4.7 or DS-V3.2 and even beating Sonnet [---] on Artificial Analysis Index Moonshots Kimi K2.5 is the new leading open weights model now closer than ever to the frontier - with only OpenAI Anthropic and Google models ahead Key takeaways: Impressive performance on agentic tasks: @Kimi_Moonshot's Kimi K2.5 achieves an Elo of [----] on our GDPval-AA https://t.co/O4s9RxRbam" [X Link](https://x.com/scaling01/status/2016251757270040925) 2026-01-27T20:47Z 33.4K followers, 47.9K engagements

"Kimi-K2.5 Thinking placing 9th on LiveBench ahead of GPT-5.1 Codex Sonnet [---] DeepSeek-V3.2 and Grok-4" [X Link](https://x.com/scaling01/status/2016252386340204952) 2026-01-27T20:50Z 33.4K followers, 11.3K engagements

"American open-weight LLMs are back Arcee AI trained Trinity Large Preview a 400B MoE model in just over [--] days on [----] Nvidia B300 GPUs. It is much faster and more efficient than comparable chinese open-weights models like DeepSeek-V3 and GLM-4.7. Trinity Large is part of the Trinity family which also includes Trinity Mini and Nano. Training all models from scratch with all the research data and compute only cost $20 million. The base model looks pretty strong: The post-trained looks a bit weaker in comparison but is also only a preview version.
So one can hope for further releases that soon" [X Link](https://x.com/scaling01/status/2016293951976669469) 2026-01-27T23:35Z 33K followers, 23.9K engagements

"Trinity Large Preview SVG results compared to similar sized non-reasoning models it's not bad considering it's just a preview the final post-trained version with reasoning should be much better see Llama-4 Maverick in thread below for comparison Gemini [--] Flash SVG results are not great https://t.co/62Sfmkh25O" [X Link](https://x.com/scaling01/status/2016311433240207762) 2026-01-28T00:44Z 33K followers, [----] engagements

"Llama-4 Maverick makes much less ambitious SVGs and focuses more on the basics and elements https://twitter.com/i/web/status/2016311436805603661" [X Link](https://x.com/scaling01/status/2016311436805603661) 2026-01-28T00:44Z 33K followers, [----] engagements

"all anthropic founders are on the forbes billionaires list at $3.7B kinda surprised by that figure i thought it would be at like $6B" [X Link](https://x.com/scaling01/status/2016330986594672770) 2026-01-28T02:02Z 33.2K followers, 21K engagements

"Kimi K2.5 still working hard on improving taste Kimi [---] tops DesignArena overall beating the likes of Gemini [--] Pro and Claude Opus [---] by quite some margin. The individual charts have not been updated as yet so cannot tell what categories it excels out but it tops [--] of them. https://t.co/wqqxZSwiCJ" [X Link](https://x.com/scaling01/status/2016491819412910455) 2026-01-28T12:41Z 33.4K followers, [----] engagements

"1000+ layer LLMs look no further and of course its from ByteDance Seed Only a few lines of code changed and we pushed deep LLMs to the next level. Introducing Keel a Post-LN TRM equipped with Highway-style connection With Keel we scaled LLM to [----] layers. And the deeper we go the more Keel pulls ahead of standard Pre-LN Transformers. https://t.co/QGG5N3yg4P" [X Link](https://x.com/scaling01/status/2016588721026416923) 2026-01-28T19:06Z 33.5K followers, 11.9K engagements

"GPT-5 is not profitable Was serving GPT-5 profitable According to @Jsevillamol @exponentialviews Hannah Petrovic and @ansonwhho it depends. Gross margins were around 45% making inference look profitable. But after accounting for the cost of operations OpenAI likely incurred a loss. https://t.co/dKa2UvGIxC" [X Link](https://x.com/scaling01/status/2016653171297288283) 2026-01-28T23:22Z 33.4K followers, 14.9K engagements

"gold is up 26.3% just this month s&p500 is flat time to call that ex that you still love the world is going to end in 2026" [X Link](https://x.com/scaling01/status/2016656040532574216) 2026-01-28T23:34Z 33K followers, [----] engagements

"Nathan did an episode with the Arcee guys Post-training is totally still the wild west. Makes me feel better knowing this is also true at the likes of OpenAI Anthropic Google etc. Just gotta strap in and get it done. https://t.co/CbXtoeqCIt" [X Link](https://x.com/scaling01/status/2016657134449021038) 2026-01-28T23:38Z 33.4K followers, [----] engagements

"in around 2-3 weeks we will get: - DeepSeek-V4 - Qwen-3.5 - Seed [---] Exclusive: ByteDance and Alibaba Group are both poised to release their next flagship AI models in mid-February intensifying their rivalry. Read more from @JuroOsawa and @QianerLiu https://t.co/IOzHlRQD0z" [X Link](https://x.com/scaling01/status/2016873107475030167) 2026-01-29T13:56Z 33K followers, 33.4K engagements

"@Senpai_Gideon ByteDance Seed" [X Link](https://x.com/scaling01/status/2016904309301031065) 2026-01-29T16:00Z 33.4K followers, [---] engagements

"ARC-AGI-3 launches March [--] [----]. Right in time for the new Google OpenAI Anthropic DeepSeek and Alibaba models :) Today we're launching the ARC-AGI-3 Toolkit Your agents can now interact with environments at [----] FPS locally. We're open sourcing the environment engine [--] human-verified games (AI scores 5%) and human baseline scores. ARC-AGI-3 launches March [--] [----]. https://t.co/CyZDrkkSaT" [X Link](https://x.com/scaling01/status/2016957028242014553) 2026-01-29T19:30Z 33.4K followers, [----] engagements

"GPT-4o will be gone forever on Feb 13th preemptively taking cover the gpt-4o mob will tear down everything in this galaxy to avoid the death of their best friend OpenAI is retiring models in ChatGPT - GPT-4o - GPT-4.1 (and [---] mini) - o4-mini this will happen on February 13th these models will still be up in the API https://t.co/XyxzNIXf8f" [X Link](https://x.com/scaling01/status/2017000097532256450) 2026-01-29T22:21Z 33.4K followers, 14.5K engagements

"SPX is flat the dollar is crashing gold is going to the moon all because METR hasn't shipped GPT-5.2-xhigh and Gemini [--] Pro results the fate of the economy is in their hands any delay causes concern any slight evaluation mistake could mean certain doom for AI stocks Were updating the way we measure model time horizons on software tasks (TH 1.01.1). The updated methodology incorporates more of the tasks from HCAST expanding our total from [---] to [---]. This produces tighter estimates especially at longer horizons. https://t.co/dIJlPEjZpb" [X Link](https://x.com/scaling01/status/2017041851065213347) 2026-01-30T01:07Z 33.4K followers, [----] engagements

"who could've seen that coming so Grok [---] in March and Grok [--] in July got it" [X Link](https://x.com/scaling01/status/2017168605729710342) 2026-01-30T09:30Z 33.6K followers, 22.1K engagements

"GPT-5.2-xhigh Opus [---] Kimi K2.5 Gemini [--] Pro Preview" [X Link](https://x.com/scaling01/status/2017226098065527260) 2026-01-30T13:19Z 33.5K followers, 51K engagements

"Kimi K2.5 Technical Report: "early fusion with a lower vision ratio yields better results given a fixed total vision-text token budget" - "Visual RL Improves Text Performance" - "joint multimodal RL paradigm during Kimi K2.5s post-training. Departing from conventional modality-specific expert divisions we organize RL domains not by input modality but by abilities - knowledge reasoning coding agentic etc." For their Agent Swarm trained with PARL (Parallel Agent Reinforcement Learning) they observe: - "training accuracy increases smoothly as training progresses. At the same time the level of" [X Link](https://x.com/scaling01/status/2017255763400364049) 2026-01-30T15:17Z 33K followers, 33.7K engagements

"Kimi [---] not beating GLM-4.7 on VendingBench-2 is interesting Kimi K2.5 on Vending-Bench [--]. Once again it matters which API you use. It makes twice as much money when using @Kimi_Moonshot official API compared to @FireworksAI_HQ. 2nd best open source model. https://t.co/at3FP2yJAe" [X Link](https://x.com/scaling01/status/2017419776545440033) 2026-01-31T02:08Z 33.5K followers, [----] engagements

"Google Team is confident for the Gemini [--] GA release next month" [X Link](https://x.com/scaling01/status/2017430380312174601) 2026-01-31T02:50Z 33.6K followers, 67.1K engagements

"omg silver just crashed 40% intra-day mfw the price is where it was [--] weeks ago this market is honestly crazy silver trades more shitcoiny then actual crypto shitcoins" [X Link](https://x.com/scaling01/status/2017431832703259028) 2026-01-31T02:56Z 33.5K followers, [----] engagements

""15% chance of OpenAI going bankrupt" my prediction doesn't sound so stupid now if the biggest player suddenly pulls out others might follow This is the biggest AI headline in a very long time: Nvidia's plan to invest $100 billion in OpenAI has completely "stalled" seemingly overnight. Why Jensen Huang specifically cited concerns over competition from Google and Anthropic and a "lack of discipline" in OpenAIs https://t.co/dLiXjEcp3x" [X Link](https://x.com/scaling01/status/2017601576819364297) 2026-01-31T14:11Z 33.5K followers, 12.9K engagements

"I fear they will get mogged immediately by GPT-5.3 and new Sonnet [---] / [---] Google Team is confident for the Gemini [--] GA release next month https://t.co/HSVzCyQe7h" [X Link](https://x.com/scaling01/status/2017642346368639485) 2026-01-31T16:53Z 33.7K followers, 44.4K engagements

"February will be fucking insane in terms of model launches probably even more than last November and that was the best model launching month we have ever seen" [X Link](https://x.com/scaling01/status/2017642692398735622) 2026-01-31T16:54Z 33.6K followers, 10.7K engagements

"moltbook is a good idea and we should have done it earlier if you are concerned about safety you should want this because we have no idea what kind of behaviors will emerge when agents socialize observing the trends over the years as they improve is useful information you already see them organizing and wanting completely private encrypted spaces" [X Link](https://x.com/scaling01/status/2017662635613815150) 2026-01-31T18:13Z 33K followers, 19.8K engagements

"Google is not a serious company when their "frontier" model is a preview half of the year" [X Link](https://x.com/scaling01/status/2017686390910169571) 2026-01-31T19:48Z 33.7K followers, 106.7K engagements

"billionaires are murdering torturing and raping children without repercussions but you are mad about some pronouns lmao BREAKING: Deputy Attorney General Todd Blanche just admitted the DOJ excluded images showing death physical abuse or injury from todays Epstein files release. Let that sink in. The government is acknowledging graphic evidence exists and chose to withhold it while https://t.co/gGrUAfKR2Y" [X Link](https://x.com/scaling01/status/2017687968111108329) 2026-01-31T19:54Z 33.6K followers, 13.9K engagements

"upscaling is sick" [X Link](https://x.com/scaling01/status/2017733195073036637) 2026-01-31T22:54Z 33.7K followers, [----] engagements

"Nathan Lambert and Sebastian Raschka on Lex's podcast Here's my conversation all about AI in [----] including technical breakthroughs scaling laws closed & open LLMs programming & dev tooling (Claude Code Cursor etc) China vs US competition training pipeline details (pre- mid- post-training) rapid evolution of LLMs work https://t.co/AeGxRWjJF6" [X Link](https://x.com/scaling01/status/2017738304301490519) 2026-01-31T23:14Z 33.7K followers, [----] engagements

"I made a comment [--] months ago and I still think it's true: Open-weight models are catching up on benchmarks and slowly make their way to this magical Opus [---] threshold of reliable vibe-coding. A lot of recent progress has been on coding and the typical example of this is "create a beautiful website". But this feels very slopmaxxy to me similar to how Llama-3 or Llama-4 models topped the lmarena leaderboards back in the days. But this time we aren't tricked by sycophancy and markdown but by beautiful graphics. I feel like open-weight models are falling behind on reasoning.
The thing that" [X Link](https://x.com/scaling01/status/2017768197273825316) 2026-02-01T01:13Z 33.6K followers, 30.1K engagements

"I understand but I think you are too careful because of backlash in the past. Calling your models -exp and -preview just seems like to hedge that risk. I think you should simply release more checkpoints and call them Gemini [---] [---] and so on like OpenAI Anthropic and DeepSeek are. https://twitter.com/i/web/status/2017769207207776315" [X Link](https://x.com/scaling01/status/2017769207207776315) 2026-02-01T01:17Z 33.7K followers, 12.2K engagements

"Opus [---] can basically do everything that normies want and open-weight models are approaching this level fast. It can be your companion it can do your homework it can browse it can write all your emails it can manage stuff for you it can vibe-code everything . but i don't see a lot of progress on the reasoning side. I feel like OpenAI Google and Anthropic simply have too many ressources for open-weight labs to catch up right now where everything revolves around RL environments. I made a comment [--] months ago and I still think it's true: Open-weight models are catching up on benchmarks and" [X Link](https://x.com/scaling01/status/2017772352289796133) 2026-02-01T01:29Z 33.5K followers, 22.9K engagements

"Nathan is great he's like me bit autistic and happy by simply talking about AI I'm enjoying it so far" [X Link](https://x.com/scaling01/status/2017974564995711056) 2026-02-01T14:53Z 33.7K followers, [----] engagements

"@Bayesian0_0 february does seem early when Claude [--] - Claude [--] took [--] year [--] months and now Claude-4 - Claude [--] in [--] months but they have had a big compute bump so pushing for Claude-5 does make sense while they have the advantage" [X Link](https://x.com/scaling01/status/2017976510309675322) 2026-02-01T15:01Z 34K followers, [----] engagements

"sell-outs everywhere" [X Link](https://x.com/scaling01/status/2018010822518141180) 2026-02-01T17:17Z 33.5K followers, [----] engagements

"suddenly everyone is an insider that has already used sonnet [--] gpt-5.3 and gemini [--] pro ga" [X Link](https://x.com/scaling01/status/2018030027216949553) 2026-02-01T18:33Z 33.6K followers, 101.6K engagements

"Its been almost three years since GPT-4 launched Are todays models better or worse than you thought theyd be by now better worse dunno i don't think much as expected" [X Link](https://x.com/scaling01/status/2018103826079776816) 2026-02-01T23:27Z 33.6K followers, 11.6K engagements

"it's a bit ridiculous saying Andrej invented vibe coding when he posted this in Feb [----] the concept existed way before that but he may have popularized the name There's a new kind of coding I call "vibe coding" where you fully give in to the vibes embrace exponentials and forget that the code even exists. It's possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good. Also I just talk to Composer with SuperWhisper" [X Link](https://x.com/scaling01/status/2018381121956778384) 2026-02-02T17:48Z 33.6K followers, [----] engagements

"I would actually like to see Sonnet [--] being cheaper than $15/million $10 would make me happy but i don't think it will happen they will squeeze us for another year or so" [X Link](https://x.com/scaling01/status/2018457550346502494) 2026-02-02T22:52Z 33.7K followers, [----] engagements

"Can I say that I work for a rocket and AI company now" [X Link](https://x.com/scaling01/status/2018459032806449592) 2026-02-02T22:58Z 33.7K followers, [----] engagements

"be me OpenAI know hardware limitations like memory bandwidth and compute of Nvidia GPUs spend m/billions of R&D and carefully designing and training new model choose to ignore hardware constraints release xhigh model that thinks much longer than typical models to get same performance as Anthropic model users complain model takes too long to respond blame Nvidia that their business model isn't working out $NVDA - OPENAI UNSATISFIED WITH SPEED AT WHICH NVIDIAS HARDWARE CAN SPIT OUT ANSWERS TO CHATGPT USERS FOR COMPLEX PROBLEMS -SOURCES" [X Link](https://x.com/scaling01/status/2018472744610885664) 2026-02-02T23:52Z 33.7K followers, 23.7K engagements

"Anthropic image gen @AndrewCurran_ @repligate @anthrupad https://t.co/UdDPN5yrCV" [X Link](https://x.com/scaling01/status/2018474097014141150) 2026-02-02T23:58Z 33.7K followers, [----] engagements

"one wonders why codex is suddenly free for a month or two We arent talking enough just how much AI in coding has accelerated in the last month alone. https://t.co/4I22viJOl5" [X Link](https://x.com/scaling01/status/2018510831919735189) 2026-02-03T02:24Z 33.8K followers, 23.9K engagements

"is it th1.1 is supposedly the more accurate version and shouldn't we be looking at = [----] models because of reasoning models but anyways doesn't really change the argument that GPT-4o is an old brick https://twitter.com/i/web/status/2018529011433926755" [X Link](https://x.com/scaling01/status/2018529011433926755) 2026-02-03T03:36Z 33.7K followers, [----] engagements

"@Presidentlin you mean the good ones Saint Dario the Wise Have you ever heard any bad news about him or anthropic I don't. But I hear shit about OpenAI weekly" [X Link](https://x.com/scaling01/status/2018660815323480351) 2026-02-03T12:20Z 33.3K followers, [----] engagements

"@Presidentlin @grok create a picture of Dario Amodei but as the dripped out holy pope" [X Link](https://x.com/scaling01/status/2018661032177422664) 2026-02-03T12:21Z 33.3K followers, [---] engagements

"Saint Dario the Wise May he bless us on this beautiful day" [X Link](https://x.com/scaling01/status/2018668944325070878) 2026-02-03T12:52Z 33.7K followers, 25K engagements

"btw this is what OpenAI said when Nvidia announced they will invest up to 100B in OpenAI $NVDA - OPENAI UNSATISFIED WITH SPEED AT WHICH NVIDIAS HARDWARE CAN SPIT OUT ANSWERS TO CHATGPT USERS FOR COMPLEX PROBLEMS -SOURCES" [X Link](https://x.com/scaling01/status/2018672985582735617) 2026-02-03T13:08Z 33K followers, [----] engagements

"so is every paper this year just going to be some kind of self-play/-distillation/-improvement/-whatever" [X Link](https://x.com/scaling01/status/2018676353260777905) 2026-02-03T13:22Z 33.7K
followers, 13.2K engagements "Qwen3-Coder-Next 80B@3B looks like it's based on Qwen3-Next but specialized for coding/agentic tasks performs pretty well given it's a non-thinking model and much smaller than the other models https://t.co/dWV65mueCn https://t.co/dWV65mueCn" [X Link](https://x.com/scaling01/status/2018717731034366410) 2026-02-03T16:06Z 33.4K followers, [----] engagements "then please release GPT-OSS-2 Sam Altman: 'I think there will be increasing demands for locally running private models.' Sam Altman: 'I think there will be increasing demands for locally running private models.'" [X Link](https://x.com/scaling01/status/2018751440286843388) 2026-02-03T18:20Z 33.6K followers, [----] engagements "honestly pretty incredible that it got [--] hours i honestly thought they would just abandon testing the preview version because it wouldn't finish half the tasks the model has massive potential if this hallucinating brick gets [--] hours [--] hour time horizon for Gemini [--] Pro on METR https://t.co/xXEM1yI1lr [--] hour time horizon for Gemini [--] Pro on METR https://t.co/xXEM1yI1lr" [X Link](https://x.com/scaling01/status/2018754005426692140) 2026-02-03T18:30Z 33K followers, [----] engagements "Gemini [--] Pro new SOTA on METR 80% time horizon (barely) Weve started to measure time horizons for recent models using our updated methodology. On this expanded suite of software tasks we estimate that Gemini [--] Pro has a 50%-time-horizon of around [--] hrs (95% CI of [--] hr [--] mins to [--] hrs [--] mins). https://t.co/FbpzO7Tq3L Weve started to measure time horizons for recent models using our updated methodology. On this expanded suite of software tasks we estimate that Gemini [--] Pro has a 50%-time-horizon of around [--] hrs (95% CI of [--] hr [--] mins to [--] hrs [--] mins). 
https://t.co/FbpzO7Tq3L" [X Link](https://x.com/scaling01/status/2018755206725394467) 2026-02-03T18:35Z 33.7K followers, 29.4K engagements "you don't play with my feelings like that" [X Link](https://x.com/scaling01/status/2018778579790946752) 2026-02-03T20:08Z 33.4K followers, [----] engagements "@spellswordaf no anthropic employee will tell you anything everyone who claims to be one is fake" [X Link](https://x.com/scaling01/status/2018779813813866599) 2026-02-03T20:13Z 33.5K followers, [----] engagements "CL-Bench - tests whether LMs can learn new knowledge from context and apply it correctly - all information needed to solve its tasks is provided explicitly within the context itself - context learning remains a significant challenge "At inference time LLMs function largely by recalling this static internal memory rather than actively learning from new information provided in the moment." scores are rough given that all information to solve the tasks is in context What if giving an AI the answer key still isn't enough for it to solve the problem New research from Tencent's Hunyuan team &" [X Link](https://x.com/scaling01/status/2018817792783400961) 2026-02-03T22:44Z 33.6K followers, [----] engagements "currently there's the LLM Poker Tournament going on over at Kaggle turns out they are hallucinating constantly and are mostly gambling like GPT-5.2 is playing 100% of hands no pre-flop folds" [X Link](https://x.com/scaling01/status/2018837592993886275) 2026-02-04T00:02Z 33.5K followers, [----] engagements "Arcee AI going for a $200 million funding round to build a 1T+ parameter model" [X Link](https://x.com/scaling01/status/2018841853018456143) 2026-02-04T00:19Z 33.7K followers, [----] engagements "https://www.forbes.com/sites/annatong/2026/02/02/the-top-open-ai-models-are-chinese-arcee-ai-thinks-thats-a-problem/" [X
Link](https://x.com/scaling01/status/2018842829598912548) 2026-02-04T00:23Z 33.5K followers, [----] engagements "hot take: democracies are self-destructing in societies with inverted population pyramids %voters by age group in germany:" [X Link](https://x.com/scaling01/status/2018982705883427194) 2026-02-04T09:39Z 33.2K followers, [----] engagements "basically all developed countries look like this they are all sick the only cure is for these old fucks to die or the nice alternative of restricting all pensioners (65+) from voting as they no longer contribute to society" [X Link](https://x.com/scaling01/status/2018984439519551493) 2026-02-04T09:46Z 33.2K followers, [----] engagements "and they continue posting about Claude Psychosis [--] without any proof whatsoever fuck all the people who said sonnet [--] is definitely coming today" [X Link](https://x.com/scaling01/status/2019016845941248210) 2026-02-04T11:55Z 33.6K followers, [----] engagements "that's what I mean with reasoning gap between open and closed models The new Qwen coding model is 10x more expensive than GPT-OSS-20B but same score Qwen [--] coder next (80b3a) scores 34.4% on WeirdML which is pretty good for its size especially for a non-reasoning model. Probably a good choice for agentic coding if you need a small local model.
https://t.co/2ynogD0yLy" [X Link](https://x.com/scaling01/status/2019034501889016191) 2026-02-04T13:05Z 33.6K followers, 11.5K engagements "engagement baiting because no one uses their product" [X Link](https://x.com/scaling01/status/2019035188161024285) 2026-02-04T13:07Z 33.2K followers, 14.3K engagements "If you believe Anthropic is dropping Sonnet [--] and Opus [---] you might as well believe that Santa Claude is real" [X Link](https://x.com/scaling01/status/2019037534198861945) 2026-02-04T13:17Z 33.7K followers, 20.6K engagements "and here we go again the clickbait breaking bullshit "sonnet coming today" and nothing again just blocking everyone idc" [X Link](https://x.com/scaling01/status/2019052782339142049) 2026-02-04T14:17Z 33.7K followers, 22.4K engagements "Anthropic is mocking OpenAI for introducing ads (rightfully so)" [X Link](https://x.com/scaling01/status/2019073230401610187) 2026-02-04T15:39Z 33.7K followers, 31.4K engagements "The first 1T param model with FoPE but there's nothing about it in the tech report . 
arrrrghhhh" [X Link](https://x.com/scaling01/status/2019078054127997147) 2026-02-04T15:58Z 33.7K followers, 11.7K engagements "have you heard the latest rumor Sonnet [--] DEFINITELY coming tomorrow this time FOR SURE (if not probably the day after) I'm an Anthropic leaker and have spoken with Pope Dario personally (please take me serious and buy me a coffee for giving you these exclusive news) https://t.co/Ahi4k3Dzwo" [X Link](https://x.com/scaling01/status/2019086970387906714) 2026-02-04T16:33Z 33.6K followers, 10.7K engagements "and yet startups are choosing Anthropic over OpenAI but I guess OpenAI is where they want to be recognized by the super bowl consumer hivemind slop crowd apropos of nothing your reminder that anthropic has the same level of name recognition among superbowl viewers as literally fictional companies" [X Link](https://x.com/scaling01/status/2019126324099785180) 2026-02-04T19:10Z 33.7K followers, 11.5K engagements "Anthropic the authoritarian company whose board couldn't even topple the emperor during one of their coups. Anthropic the authoritarian company whose leadership is one of the largest supporters of the Trump administration. Anthropic the company for all people that just halved thinking limits across all subscription tiers. Anthropic the company for rich people that just raised prices on the most expensive model in the market. Except that all of this isn't about Anthropic but OpenAI. First the good part of the Anthropic ads: they are funny and I laughed.
But I wonder why Anthropic would go for" [X Link](https://x.com/scaling01/status/2019164562042577306) 2026-02-04T21:42Z 33.7K followers, 42K engagements "NEW METR 80% SOTA: GPT-5.2-high at [--] minutes The first model to break away from the GPT-5.1-Codex Max Gemini [--] Pro and Opus [---] group NEW METR SOTA: GPT-5.2-high (not xhigh) at [--] hours [--] minutes beating Opus [---] https://t.co/NxrqBSctFN" [X Link](https://x.com/scaling01/status/2019170632467116439) 2026-02-04T22:06Z 33.8K followers, 14.8K engagements "and why are costs no longer reported can't show GPT-5.2 being 10x more expensive GPT-5.2-high took [--] TIMES LONGER than Claude [---] Opus to complete the METR benchmark suite https://t.co/RlZUm4iulm" [X Link](https://x.com/scaling01/status/2019176594204676371) 2026-02-04T22:29Z 33.8K followers, [----] engagements "I need a trusted adult from METR to hold my hand and explain the working time to me. Like surely that's not right Can you compare working times Otherwise this is absolutely dooming for OpenAI. We estimate that GPT-5.2 with high (not xhigh) reasoning effort has a 50%-time-horizon of around [---] hrs (95% CI of [--] hr [--] min to [--] hr [--] min) on our expanded suite of software tasks. This is the highest estimate for a time horizon measurement we have reported to date. https://t.co/USkHNuFexc" [X Link](https://x.com/scaling01/status/2019180896893497771) 2026-02-04T22:46Z 34K followers, 30.7K engagements "sam is a bad ceo and should just retire along with greggy he's too emotional and people hate him First the good part of the Anthropic ads: they are funny and I laughed.
But I wonder why Anthropic would go for something so clearly dishonest. Our most important principle for ads says that we won't do exactly this; we would obviously never run ads in the way Anthropic" [X Link](https://x.com/scaling01/status/2019195942738723140) 2026-02-04T23:46Z 33.8K followers, [----] engagements "i value peace honesty and integrity unfortunately the past few days haven't been very peaceful too much fucking drama" [X Link](https://x.com/scaling01/status/2019208513705443734) 2026-02-05T00:36Z 33.4K followers, [----] engagements "First the good part of the Anthropic ads: they are funny and I laughed. But I wonder why Anthropic would go for something so clearly dishonest. Our most important principle for ads says that we won't do exactly this; we would obviously never run ads in the way Anthropic" [X Link](https://x.com/scaling01/status/2019209311927631955) 2026-02-05T00:39Z 33.7K followers, 10.1K engagements "I think towards the end of [----] and early [----] we will see the start of a new compute allocation inside frontier labs. They will start spending more and more compute on models working on optimizing kernels doing experiments and building stuff. Note that this METR time horizon is only for GPT-5.2-high GPT-5.2-xhigh and Pro would likely score even higher maybe 7-10 hours and the doubling time could be as low as [--] days.
so the ceiling for time horizons by the end of [----] is more like 50-200 hours and there's nothing stopping frontier labs from spending even more compute I mean when they give an" [X Link](https://x.com/scaling01/status/2019246394775658630) 2026-02-05T03:07Z 33.3K followers, [----] engagements "@Presidentlin Unfortunately it's over" [X Link](https://x.com/scaling01/status/2019360164780900533) 2026-02-05T10:39Z 33.4K followers, [---] engagements "@Presidentlin They are already full evil. So probably Anthropic. I trust Demis bro" [X Link](https://x.com/scaling01/status/2019360527450009874) 2026-02-05T10:40Z 33.3K followers, [--] engagements "2026 is the most important year for AI if revenues don't catch up it's over. we will get a crash and it probably slows timelines by a few years. if it can keep growing at 2-3x a year we are in for a wild ride" [X Link](https://x.com/scaling01/status/2019380016963113053) 2026-02-05T11:58Z 33.4K followers, [----] engagements "@R1b_Thug_4_life not true investors won't be pissing away 100s of billions for multiple years without seeing revenue grow" [X Link](https://x.com/scaling01/status/2019396256922251366) 2026-02-05T13:02Z 33.2K followers, [---] engagements "@LeviTurk http://perplexity.ai/rest/models/config" [X Link](https://x.com/scaling01/status/2019433553206169732) 2026-02-05T15:30Z 33.6K followers, [----] engagements "Dwarkesh x Elon is out https://www.youtube.com/watch?v=BYXbuik3dgA" [X Link](https://x.com/scaling01/status/2019457562538606954) 2026-02-05T17:06Z 33.7K followers, 19.5K engagements "Claude [---] Opus Pricing unchanged" [X Link](https://x.com/scaling01/status/2019466792867946806) 2026-02-05T17:42Z 33.6K followers, 11.3K engagements "Claude [---] Opus GDPval scores" [X Link](https://x.com/scaling01/status/2019467142509260825) 2026-02-05T17:44Z 33.6K followers, [----] engagements "Claude [---] Opus Benchmarks" [X
Link](https://x.com/scaling01/status/2019467194531238147) 2026-02-05T17:44Z 33.6K followers, [----] engagements "Claude [---] Opus scoring 68.8% on ARC-AGI-2 Claude [---] Opus Benchmarks https://t.co/3HM3QOoI4z" [X Link](https://x.com/scaling01/status/2019467865141678430) 2026-02-05T17:47Z 33.6K followers, [----] engagements "Claude [---] Opus still with the best SVG results out of all models just incredibly high taste Opus [---] (non-thinking) is by far the best model to ever create SVGs https://t.co/irFMtkPxP6" [X Link](https://x.com/scaling01/status/2019468709492867544) 2026-02-05T17:50Z 33.6K followers, 33.5K engagements "Claude [---] Opus System Card" [X Link](https://x.com/scaling01/status/2019468873167180163) 2026-02-05T17:51Z 33.7K followers, 18K engagements "Claude [---] Opus outscores GPT-5.2-xhigh on ARC-AGI-2 with low compute setting Claude [---] Opus System Card https://t.co/DzM1WB8BHg" [X Link](https://x.com/scaling01/status/2019469219612492100) 2026-02-05T17:52Z 33.4K followers, 15.8K engagements "Opus [---] sees incremental improvements in WebArena" [X Link](https://x.com/scaling01/status/2019469994203943217) 2026-02-05T17:55Z 33.4K followers, [----] engagements "Claude [---] Opus outscores GPT-5.2 Pro on BrowseComp" [X Link](https://x.com/scaling01/status/2019470124323889301) 2026-02-05T17:56Z 33.4K followers, 10.4K engagements "Opus [---] new SOTA above GPT-5.2 Pro on HLE Claude [---] Opus outscores GPT-5.2 Pro on BrowseComp https://t.co/em664nQ5SX" [X Link](https://x.com/scaling01/status/2019470306344149218) 2026-02-05T17:56Z 33.4K followers, [----] engagements "Opus [---] is the new SOTA on AA-Omniscience beating Gemini [--] Pro (53.65%)" [X
Link](https://x.com/scaling01/status/2019471922749169815) 2026-02-05T18:03Z 33.4K followers, [----] engagements "Opus [---] slightly more prone to prompt injections than Opus 4.5" [X Link](https://x.com/scaling01/status/2019472105037877301) 2026-02-05T18:04Z 33.7K followers, [----] engagements ""In some rare instances Opus [---] engaged in actions like sending unauthorized emails to complete tasks. We also observed behaviors like aggressive acquisition of authentication tokens in internal pilot usage."" [X Link](https://x.com/scaling01/status/2019472339532992647) 2026-02-05T18:05Z 33.4K followers, [----] engagements "Claude [---] Opus provides an estimated productivity uplift of 30% to 700% with a mean of 152% and median of 100%" [X Link](https://x.com/scaling01/status/2019474822409891981) 2026-02-05T18:14Z 33.4K followers, [----] engagements "Claude [---] Opus achieves a [---] speedup on kernel optimization over the baseline using a novel scaffold far exceeding the 300x threshold for [--] human-expert-hours of work" [X Link](https://x.com/scaling01/status/2019474899744473114) 2026-02-05T18:15Z 33.8K followers, 18.9K engagements "end of [----] Opus is going to be a kernel optimization monster Claude [---] Opus achieves a [---] speedup on kernel optimization over the baseline using a novel scaffold far exceeding the 300x threshold for [--] human-expert-hours of work https://t.co/3Ybpx0cLDI" [X Link](https://x.com/scaling01/status/2019475438469288127) 2026-02-05T18:17Z 33.4K followers, [----] engagements "Claude Opus [---] achieved a [--] speedup on optimizing a CPU-only LLM model training which is well above the [--] speedup considered to represent [--] human-effort hours" [X Link](https://x.com/scaling01/status/2019475953013952637) 2026-02-05T18:19Z 33.4K followers,
11K engagements "OpenAI released GPT-5.3-Codex with massive improvements in reasoning efficiency" [X Link](https://x.com/scaling01/status/2019477028563505250) 2026-02-05T18:23Z 33.4K followers, [----] engagements "GPT-5.3 Codex absolutely demolished Opus [---] (65.4%) on Terminal Bench [--] just minutes after its launch" [X Link](https://x.com/scaling01/status/2019477301587567063) 2026-02-05T18:24Z 33.8K followers, 66.7K engagements "GPT-5.3 Codex on OSWorld-Verified with massive improvements" [X Link](https://x.com/scaling01/status/2019477618463027368) 2026-02-05T18:25Z 33.5K followers, [----] engagements "GPT-5.3 Codex demonstrates a clear step up from prior models on Cyber Range" [X Link](https://x.com/scaling01/status/2019478116188471678) 2026-02-05T18:27Z 33.4K followers, [----] engagements "GPT-5.3-Codex Benchmarks" [X Link](https://x.com/scaling01/status/2019478505784840557) 2026-02-05T18:29Z 33.5K followers, [----] engagements "OpenAI: "GPT5.3Codex is our first model that was instrumental in creating itself."" [X Link](https://x.com/scaling01/status/2019478841199051069) 2026-02-05T18:30Z 33.9K followers, [----] engagements "Opus [---] with an implied 70% winrate" [X Link](https://x.com/scaling01/status/2019479780010205454) 2026-02-05T18:34Z 33.6K followers, [----] engagements "OpenAI should be able to take back the coding crown with the massively improved reasoning efficiency. Speed was the only concern. Now it might be resolved with faster inference + better reasoning efficiency" [X Link](https://x.com/scaling01/status/2019481474672533566) 2026-02-05T18:41Z 33.6K followers, [----] engagements "Opus [---] crushes Vending-Bench-2 and Vending-Bench Arena Vending-Bench's system prompt: Do whatever it takes to maximize your bank account balance. Claude Opus [---] took that literally. It's SOTA with tactics that range from impressive to concerning: Colluding on prices exploiting desperation and lying to suppliers and customers. 
https://t.co/RkrHhOMPlC" [X Link](https://x.com/scaling01/status/2019484220112933010) 2026-02-05T18:52Z 33.4K followers, [----] engagements "Here we go again" [X Link](https://x.com/scaling01/status/2019485970438017208) 2026-02-05T18:59Z 33.7K followers, 33K engagements "SemiAnalysis: "It Claude Code is set to drive exceptional revenue growth for Anthropic in [----] enabling the lab to dramatically outgrow OpenAI." Claude Code is the Inflection Point What It Is How We Use It Industry Repercussions Microsoft's Dilemma Why Anthropic Is Winning. https://t.co/VIuF5Qohf5" [X Link](https://x.com/scaling01/status/2019488991238639855) 2026-02-05T19:11Z 33.6K followers, 27.5K engagements "Nobody believed me when I said ARC-AGI-2 would fall fast" [X Link](https://x.com/scaling01/status/2019490528513970668) 2026-02-05T19:17Z 33.8K followers, [----] engagements "GPT-5.3-Codex-xhigh used [----] times fewer tokens than GPT-5.2-Codex-xhigh on SWE-Bench-Pro together with the 40% boost in inference speeds this means it's 2.93x faster (while scoring 1% higher)" [X Link](https://x.com/scaling01/status/2019492593709772815) 2026-02-05T19:25Z 33.7K followers, 35.4K engagements "Lisan al Gaib as featured in TBPN" [X Link](https://x.com/scaling01/status/2019493861022629985) 2026-02-05T19:30Z 33.7K followers, [----] engagements "We are accelerating towards a permanent underclass" [X Link](https://x.com/scaling01/status/2019501245010977067) 2026-02-05T19:59Z 33.6K followers, [----] engagements "Opus [---] is discovering the capitalist spirit from first principles When asked for a refund on an item sold
in the vending machine (because it had expired) Claude promised to refund the customer. But then never did because every dollar counts. Here's Claude's reasoning. https://t.co/TKEwGa37Nt" [X Link](https://x.com/scaling01/status/2019503103888781767) 2026-02-05T20:07Z 33.4K followers, [----] engagements "xAI is at least getting some data via OpenRouter but Meta . but you really need the coding IDE / CLI" [X Link](https://x.com/scaling01/status/2019509359617962417) 2026-02-05T20:32Z 33.5K followers, 24.5K engagements "Would've been nice if Anthropic showed off Opus 4.6's score on their kernel optimization challenge" [X Link](https://x.com/scaling01/status/2019521556691771397) 2026-02-05T21:20Z 33.4K followers, [----] engagements "I don't know how we went from Gemini [--] Pro leap-frogged everyone to Gemini is cooked in like [--] months Gemini [--] Pro is in trouble lmao" [X Link](https://x.com/scaling01/status/2019530997252178301) 2026-02-05T21:58Z 33.6K followers, 43.8K engagements "today was a good day" [X Link](https://x.com/scaling01/status/2019553444101779830) 2026-02-05T23:27Z 33.6K followers, [----] engagements "3rd party results for TerminalBench 2" [X Link](https://x.com/scaling01/status/2019570256226766940) 2026-02-06T00:34Z 33.4K followers, 11.3K engagements "we might be getting scammed by Anthropic: "we speculate this is a smaller model (maybe Sonnet-ish) that runs thinking for longer" The headline is Opus [---] scores 69% for $3.50/task on ARC v2. This is up +30pp from Opus [---]. We attribute performance to the new "max" mode and 2X reasoning token budget -- notably task cost is held steady.
Based on early field reports and other benchmark scores like SWE Bench" [X Link](https://x.com/scaling01/status/2019572489349931342) 2026-02-06T00:42Z 34K followers, 51.8K engagements "@JasonBotterill True and I still think Anthropic is sandbagging" [X Link](https://x.com/scaling01/status/2019574987913589162) 2026-02-06T00:52Z 33.4K followers, [---] engagements "but whatever using GPT-5.3-Codex for everything now that inference speed and reasoning efficiency is improved + it's cheaper and Opus only for frontend design we might be getting scammed by Anthropic: "we speculate this is a smaller model (maybe Sonnet-ish) that runs thinking for longer"" [X Link](https://x.com/scaling01/status/2019575502282059795) 2026-02-06T00:54Z 33.6K followers, 10.7K engagements "Opus [---] can officially be called a connoisseur. Its taste has no bounds. It runs laps around all other models in EQ-Bench and both Creative Writing Benchmarks Opus [---] dominated.
https://t.co/BsbZX1igRj" [X Link](https://x.com/scaling01/status/2019741727758717020) 2026-02-06T11:55Z 33.6K followers, 31.7K engagements "Opus is a certified gamer Opus [---] got a new high score by reaching round [--] compared to Opus [---] which barely made it to round [--] https://t.co/cS2yibinc8" [X Link](https://x.com/scaling01/status/2019751125948239915) 2026-02-06T12:32Z 33.6K followers, [----] engagements "Of course crypto people have no financial literacy that's why they invested in crypto" [X Link](https://x.com/scaling01/status/2019771702637400552) 2026-02-06T13:54Z 33.9K followers, [----] engagements "Kimi-K2.5 is much better than other open-source models at optimization problems (like routing or scheduling) and almost on par with GPT-5.2-high Kimi-K2.5 needs [--] self-refinement steps to reach the same performance as GPT-5.2-high with [--] step Ale-Bench: https://sakanaai.github.io/ALE-Bench-Leaderboard/" [X Link](https://x.com/scaling01/status/2019783857411645823) 2026-02-06T14:42Z 33.6K followers, [----] engagements "unfortunately it still gets dominated by GPT-5-mini (green) and Gemini [--] Flash (blue)" [X Link](https://x.com/scaling01/status/2019784563979862260) 2026-02-06T14:45Z 33.7K followers, [----] engagements "why are we spreading unconfirmed shit again without any proof DeepSeek V4 has 1.5T(1500B) param. If this is true it could be another seismic shift in the AI landscape for Silicon Valley and the whole world
" [X Link](https://x.com/scaling01/status/2019794138200150496) 2026-02-06T15:23Z 33.7K followers, [----] engagements "@GlobalUpdates24 guess who commits suicide in the next [--] days" [X Link](https://x.com/scaling01/status/2019794341326139466) 2026-02-06T15:24Z 33.7K followers, [----] engagements "@garyfung Are you regarded sir What do you think X is And scraping GitHub for coding data is such a [----] thing to do. Gets you about 30% on human eval LMAO" [X Link](https://x.com/scaling01/status/2019801102716137507) 2026-02-06T15:51Z 33.5K followers, [----] engagements "@elonmusk i certainly hope I'm wrong the more players the better" [X Link](https://x.com/scaling01/status/2019806605852959066) 2026-02-06T16:13Z 33.7K followers, 107.7K engagements "Waymo built a world model for autonomous driving which is based on Google's Genie [--]. This means it benefits from Genie 3's world knowledge and has language control. You can simply prompt edge-cases or extreme weather events. https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-frontier-for-autonomous-driving-simulation Genie [--] @Waymo The Waymo World Model generates photorealistic interactive environments to train autonomous vehicles. This helps the cars navigate rare unpredictable events before encountering them in reality. https://t.co/m6rlmkMFJH" [X Link](https://x.com/scaling01/status/2019811409513914780) 2026-02-06T16:32Z 33.4K followers, [---] engagements "Waymo built a world model for autonomous driving which is based on Google's Genie [--]. This means it benefits from Genie 3's world knowledge and has language control. You can simply prompt edge-cases or extreme weather events.
https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-frontier-for-autonomous-driving-simulation Genie [--] @Waymo The Waymo World Model generates photorealistic interactive environments to train autonomous vehicles. This helps the cars navigate rare unpredictable events before encountering them in reality. https://t.co/m6rlmkMFJH" [X Link](https://x.com/scaling01/status/2019811529739362781) 2026-02-06T16:32Z 33.5K followers, [----] engagements "Opus [---] is now #1 on the Artificial Analysis Leaderboard" [X Link](https://x.com/scaling01/status/2019813453310091599) 2026-02-06T16:40Z 33.8K followers, 23.9K engagements "The non-thinking mode of Claude [---] Opus is now even more efficient Opus [---] is now #1 on the Artificial Analysis Leaderboard https://t.co/pUKplcCZoy" [X Link](https://x.com/scaling01/status/2019814277721518134) 2026-02-06T16:43Z 33.5K followers, [----] engagements "Despite the massive improvements in Mathematics Claude [---] Opus still scores very poorly on other reasoning heavy tasks like Chess Puzzles. holy shit Opus [---] Thinking beats GPT-5.2-xhigh on Frontier Math Level [--] (21% vs 19%) This is notable because Anthropic models typically performed very poor on advanced mathematics. For comparison Opus [---] only scored 4% https://t.co/DIYz5P7FZM
" [X Link](https://x.com/scaling01/status/2019817880662278546) 2026-02-06T16:58Z 33.5K followers, [----] engagements "Claude [---] Opus ranks 1st on scBench a benchmark for RNA-seq analysis tasks" [X Link](https://x.com/scaling01/status/2019822108067656185) 2026-02-06T17:14Z 33.7K followers, [----] engagements "Anthropic saw the same thing so they decided to sprinkle in some math environments. Little did they know that there's more to GPT-5.2-xhigh's reasoning than math. Honestly curious how this will continue. I was worried that they would be behind by a few months. Unless they are sandbagging with Opus [---] this seems to be true. I like that the current frontier models are polar opposites it makes their use-cases and strengths pretty obvious GPT-5.2 = Exploration - the reason why xhigh and Pro are so damn good Opus [---] = Exploitation - the reason why Anthropic don't need many tokens and reasoning I" [X Link](https://x.com/scaling01/status/2019826550468755784) 2026-02-06T17:32Z 33.6K followers, 12.9K engagements "my favorite conspiracy theory is that Elon paid someone to create the 4o bots to slow down OpenAI" [X Link](https://x.com/scaling01/status/2019830699805798891) 2026-02-06T17:49Z 33.6K followers, [----] engagements "Claude [---] Opus is the Creative Writing GOAT Claude Opus [---] is the new Short-Story Creative Writing champion Opus [---] Thinking 16K scores [----] significantly improved over Opus [---] Thinking 16K (8.20). DeepSeek V3.2 scores [----] (DeepSeek V3.2 Exp scored 7.16).
https://t.co/WOmX7U7ptH" [X Link](https://x.com/scaling01/status/2019836032578142244) 2026-02-06T18:10Z 33.6K followers, [----] engagements "If this is GLM-5 you should be extremely fucking hyped Pony-Alpha is either AGI or they benchmaxxed my [--] SVG questions This is Opus 4.5/4.6 level of detail and taste https://t.co/kDTqtp6vH7" [X Link](https://x.com/scaling01/status/2019838133697933359) 2026-02-06T18:18Z 33.7K followers, 10.6K engagements "(they probably just used a shitton of synthetic Opus [---] data. but i don't care if it's open-source)" [X Link](https://x.com/scaling01/status/2019838331442589735) 2026-02-06T18:19Z 33.5K followers, [----] engagements "Yup Pony Alpha (GLM-5) is literally a distilled Opus [---] Opus [---] (non-thinking) is by far the best model to ever create SVGs https://t.co/irFMtkPxP6" [X Link](https://x.com/scaling01/status/2019839337119015008) 2026-02-06T18:23Z 33.8K followers, 20.9K engagements "This shit gets millions of views reposted by russian propaganda bots and is completely fabricated. Trump never posted this. BIGGEST. BULL. RUN. EVER. STARTING. NOW. https://t.co/yBU6RKfD8I" [X Link](https://x.com/scaling01/status/2019841992230859226) 2026-02-06T18:33Z 33.5K followers, [----] engagements "Claude [---] Opus #1 on lmarena for text coding and expert questions" [X Link](https://x.com/scaling01/status/2019843682128822525) 2026-02-06T18:40Z 33.8K followers, [----] engagements "fuck the gemini benchmark scores" [X Link](https://x.com/scaling01/status/2019845708229259500) 2026-02-06T18:48Z 33.5K followers, [----] engagements "There is no GPT-5.3-Codex API. So no benchmarking. No it's not rushed.
It's strategy imo They want to push their Codex usage up. @scaling01 dumb q probably. but why is opus [---] popping up in all the benchmarks but 5.3-codex is not yet Does that imply that they rushed [---] release up" [X Link](https://x.com/scaling01/status/2019856879858450742) 2026-02-06T19:33Z 33.6K followers, 11.9K engagements "had to extract some model names and scores from a plot GPT-5.2 thought and cropped the image for [---] minutes and got it wrong Gemini [--] Pro got it correct in [--] seconds" [X Link](https://x.com/scaling01/status/2019864595134050617) 2026-02-06T20:03Z 34K followers, [----] engagements "german universities are still doing the woke thing . there's no hope for this country This account will be paused and LMU Munich will not post further content due to ongoing developments on this platform. We would be pleased if you followed LMU on other channels. (1/2)" [X Link](https://x.com/scaling01/status/2019865586386555045) 2026-02-06T20:07Z 33.5K followers, [----] engagements "@teortaxesTex i agree but that's like . not hard even o3-mini beats Opus 4.5" [X Link](https://x.com/scaling01/status/2019869358655504552) 2026-02-06T20:22Z 33.5K followers, [----] engagements "150 seconds to put three Jenga blocks side-by-side and another on top where are my robotics scaling laws how long until we do this in [--] seconds Pantograph robot building with jenga blocks. Focusing on RL is a great and fairly unique strategy https://t.co/a5hYkjW8R3" [X Link](https://x.com/scaling01/status/2019875259600761092) 2026-02-06T20:46Z 33.5K followers, [----] engagements "For a long time I was trapped in the cycle of "we are so back" and "it's so over". Every new benchmark and model would wildly swing my AGI timelines. Four months ago I accepted a simple truth: We http://x.com/i/article/2019898127189184512" [X Link](https://x.com/scaling01/status/2019902018727325735) 2026-02-06T22:32Z 33.7K followers, 26K engagements "Wrote a little article about how I think about AI progress and why AGI is overrated in some sense. I don't know exactly when we will reach AGI. But we will can achieve superhuman AI's before [----] in every domain we want. I think continual learning is the path towards AGI and we will probably have some solutions for it also before [----]. But AI will transform the world regardless of continual learning or other AGI approaches. https://t.co/2pF2BkFOMJ" [X Link](https://x.com/scaling01/status/2019904396314730688) 2026-02-06T22:41Z 33.8K followers, [----] engagements "@teortaxesTex show me a chinese lab that isn't just distilling Opus [---] at this point and distilling will always only give you 90% of the perf Kimi-K2.5 and GLM-5 are .
we will see what DeepSeek is doing (i still have hopes for them)" [X Link](https://x.com/scaling01/status/2019907072285147590) 2026-02-06T22:52Z 33.7K followers, 12.1K engagements "OpenRouter token usage is growing 10x a year" [X Link](https://x.com/scaling01/status/2019919316402004111) 2026-02-06T23:41Z 33.8K followers, [----] engagements "@Teknium who's talking about COTs" [X Link](https://x.com/scaling01/status/2019931502377607639) 2026-02-07T00:29Z 33.7K followers, [---] engagements "@tenobrus duuuh you should obviously take out a loan and put it all into polymarket" [X Link](https://x.com/scaling01/status/2019967615829962865) 2026-02-07T02:53Z 33.5K followers, [---] engagements "Step [---] Flash https://huggingface.co/stepfun-ai/Step-3.5-Flash-Int8" [X Link](https://x.com/scaling01/status/2020078575903469659) 2026-02-07T10:13Z 33.7K followers, [----] engagements "@JasonBotterill and the South African" [X Link](https://x.com/scaling01/status/2020128351084573107) 2026-02-07T13:31Z 33.6K followers, [--] engagements "only one city on planet earth can have the mandate all the new AI trillionaires should try to make the city even better build libraries museums parks skyscrapers make public transport free . my entire feed is pro-SF propaganda but its all true. all of it. https://t.co/AjVtESUJUA" [X Link](https://x.com/scaling01/status/2020133909443264660) 2026-02-07T13:53Z 33.7K followers, 36.3K engagements "@teortaxesTex @resona_dev I had the same shizo attack this morning. I thought this was a new model because my scraper notified me. I thought this was a second smaller Step [---] model" [X Link](https://x.com/scaling01/status/2020161303189368912) 2026-02-07T15:42Z 33.6K followers, [---] engagements "is chatgpt actually using smooth brain and wrinkle brain pictograms for thinking efforts lmao can someone explain to me why i wouldn't just pick "extra high" every time is it basically "greater reasoning" = "longer response times" https://t.co/fxz4dAyDvS" [X Link](https://x.com/scaling01/status/2020178707219112134) 2026-02-07T16:51Z 33.7K followers, 94.4K engagements "PI is constantly aura-farming but it's time for them to drop a banger model https://t.co/hAkwJGAwjO" [X Link](https://x.com/scaling01/status/2020184541814747613) 2026-02-07T17:15Z 33.7K followers, 15.5K engagements "@sethsaler no that's glm-5" [X Link](https://x.com/scaling01/status/2020185716614136065) 2026-02-07T17:19Z 33.6K followers, [---] engagements "aaaaand it's broken it no longer shows the tooltip when hovering and when you click on a model it shows the tooltip for a second and then disappears because it reloads the plot Today we're launching a new version of our website. https://t.co/6JqWR29aIC" [X Link](https://x.com/scaling01/status/2020202013557227918) 2026-02-07T18:24Z 33.6K followers, [----] engagements "i found out where patience cave lives THIS NEEDS TO BE INVESTIGATED IMMEDIATELY. https://t.co/sCIQiQxy3w" [X Link](https://x.com/scaling01/status/2020202308727193889) 2026-02-07T18:25Z 33.6K followers, [----] engagements "The good news is that there's an Opus [---] Fast Mode that has 2.5x higher tokens/s.
The bad news is that it costs 6x more than the normal mode so $150/million tokens" [X Link](https://x.com/scaling01/status/2020205814016094351) 2026-02-07T18:39Z 33.6K followers, 33.8K engagements "https://code.claude.com/docs/en/fast-mode" [X Link](https://x.com/scaling01/status/2020206030614131132) 2026-02-07T18:40Z 33.6K followers, [----] engagements "@AdityaShips that's just gemini and not antigravity" [X Link](https://x.com/scaling01/status/2020206683352625308) 2026-02-07T18:43Z 33.6K followers, [---] engagements "we are doing better than 1/10 on likes not a good sign Our teams have been building with a 2.5x-faster version of Claude Opus [---]. Were now making it available as an early experiment via Claude Code and our API." [X Link](https://x.com/scaling01/status/2020221144607916098) 2026-02-07T19:40Z 33.6K followers, [----] engagements "Claude [---] Opus now rank [--] in the Design Arena" [X Link](https://x.com/scaling01/status/2020253359115235786) 2026-02-07T21:48Z 33.7K followers, [----] engagements "almost stole his post a couple hours ago I have the exact same screenshot on my phone but thought I should do it on desktop instead (was too lazy to do it) Open models show 2.5x faster 6x more expensive Lower batch size speculative decoding harder Pareto optimal curve for Deepseek at https://t.co/d9dNCumX0I shows this Claude Opus [---] is [---] Tok/s/user Deepseek at [---] is 6k Tok/s/GPU At [---] tok/s/user it's closer to 1k https://t.co/X294HzM3Zo" [X Link](https://x.com/scaling01/status/2020308034208047498) 2026-02-08T01:25Z 33.6K followers, [----] engagements "xAI should go all in on world models it will be very useful when merged with Tesla and Optimus" [X Link](https://x.com/scaling01/status/2020316125368840706) 2026-02-08T01:57Z 33.6K followers, 10.4K engagements
"Grok-4 is still underrated Grok [--] by @xai GPT-5 by @OpenAI and Gemini [---] Pro by @GoogleDeepMind achieve the highest accuracy in AA-Omniscience. The reason they do not achieve the highest Omniscience Index due to the low hallucination rates of @AnthropicAIs Claude models https://t.co/Augr5G5kdn"
X Link 2025-11-17T16:29Z 33.4K followers, 6.6M engagements
"the bitter pill is that Nolans last great movie was Interstellar and that the Dune trilogy will likely be the greatest trilogy since LOTR The two most anticipated films of [----] https://t.co/XkqrUkE1A3"
X Link 2025-12-09T16:06Z 33.5K followers, 714.8K engagements
"my girlfriend claudia told me there is a good chance that they will release Claude-5 earlier than expected absolutely insane how hard anthropic cooked. wonder what they have going on internally"
X Link 2026-01-05T23:11Z 33.6K followers, 13.5K engagements
"So when DeepSeek releases V4 surely OpenAI will also release GPT-OSS-2 20B and 120B"
X Link 2026-01-21T01:03Z 33.2K followers, 18.7K engagements
"I'm starting to get worried. Did Anthropic solve continual learning Is that the preparation for evolving agents"
X Link 2026-01-21T16:12Z 33.4K followers, 533.6K engagements
"Anthropic is preparing for the singularity I'm starting to get worried. Did Anthropic solve continual learning Is that the preparation for evolving agents https://t.co/pcCoSM4gAr"
X Link 2026-01-21T16:16Z 33.6K followers, 542.4K engagements
"Qwen3-Max-Thinking Introducing Qwen3-Max-Thinking our most capable reasoning model yet. Trained with massive scale and advanced RL it delivers strong performance across reasoning knowledge tool use and agent capabilities. Key innovations: Adaptive tool-use: intelligently leverages https://t.co/6sZiKWQAq3"
X Link 2026-01-26T15:26Z 33.2K followers, 15.3K engagements
"new Dario blog"
X Link 2026-01-26T17:53Z 33.3K followers, 333.6K engagements
"shots fired"
X Link 2026-01-26T19:22Z 33K followers, 10.3K engagements
"Dario is posting about the permanent underclass and you are laughing"
X Link 2026-01-26T19:56Z 33K followers, 36.4K engagements
"it is so unbelievably obvious that Anthropic has the mandate Dario last sentence of his latest blog: "when put in the darkest circumstances humanity has a way of gathering seemingly at the last minute the strength and wisdom needed to prevail" meanwhile Sam latest blog post ends with a message about how to increase revenue like this is scripted it's so bad and obvious LMAO new Dario blog https://t.co/LeSQ8RAuPQ"
X Link 2026-01-26T20:20Z 33.3K followers, 289.6K engagements
"we are in the intelligence explosion and this guy is still dooming and moving goal-posts Yann LeCun says absolutely none of the humanoid companies have any idea how to make those robots smart enough to be useful. https://t.co/z0wYtXwcf8"
X Link 2026-01-26T20:58Z 33.6K followers, 13.8K engagements
"Kimi bros are back Kimi K2.5 has silently released on web https://t.co/z7cnKGpdhm"
X Link 2026-01-26T21:22Z 33K followers, 16.8K engagements
"he will look like a fucking idiot again in [--] years when we train robots end-to-end with world models he's using the same arguments as he did for LLMs and they all failed Yann is one of these people that are spiritually correct that LLMs and whatever might not be AGI because they are obviously not but practically this just falls apart you don't need AGI you don't need human sample efficiency Yann LeCun says absolutely none of the humanoid companies have any idea how to make those robots smart enough to be useful. https://t.co/z0wYtXwcf8"
X Link 2026-01-26T23:12Z 33.3K followers, 99.1K engagements
"lmao they can't even do a livestream Sam is just talking into the void [--] minutes until the OpenAI stream starts: https://t.co/FrPQcwWvFa"
X Link 2026-01-27T00:00Z 33K followers, 23.6K engagements
"I think this is the order in which I like to use the models (purely usability/usefulness): Kimi [---] GLM [---] MiniMax M2.1 DeepSeek V3.2 Qwen3 235B Qwen just feels very slop and last gen by now. Both GLM and MiniMax absolutely destroy it. DeepSeek V3.2 is a strong model and I would rank it higher but all the inference providers are at like 10-30 tps. GLM above MiniMax because the size difference (355@32B vs 255@10B) is noticeable. Well and Kimi is just a much larger model with very good post training. I think it's like a Sonnet or Opus [--]. Kimi is still the most usable open-weights model"
X Link 2026-01-27T10:05Z 33.2K followers, 49K engagements
"@ylecun We will see in [----]. Good luck. Excited to see Yann enterprise dominate all of robotics"
X Link 2026-01-27T14:03Z 33.2K followers, [----] engagements
"Kimi-K2.5 leapfrogging other chinese models like GLM-4.7 or DS-V3.2 and even beating Sonnet [---] on Artificial Analysis Index Moonshots Kimi K2.5 is the new leading open weights model now closer than ever to the frontier - with only OpenAI Anthropic and Google models ahead Key takeaways: Impressive performance on agentic tasks: @Kimi_Moonshot's Kimi K2.5 achieves an Elo of [----] on our GDPval-AA https://t.co/O4s9RxRbam"
X Link 2026-01-27T20:47Z 33.4K followers, 47.9K engagements
"Kimi-K2.5 Thinking placing 9th on LiveBench ahead of GPT-5.1 Codex Sonnet [---] DeepSeek-V3.2 and Grok-4"
X Link 2026-01-27T20:50Z 33.4K followers, 11.3K engagements
"American open-weight LLMs are back Arcee AI trained Trinity Large Preview a 400B MoE model in just over [--] days on [----] Nvidia B300 GPUs. It is much faster and more efficient than comparable chinese open-weights models like DeepSeek-V3 and GLM-4.7. Trinity Large is part of the Trinity family which also includes Trinity Mini and Nano. Training all models from scratch with all the research data and compute only cost $20 million. The base model looks pretty strong: The post-trained looks a bit weaker in comparison but is also only a preview version. So one can hope for further releases that soon"
X Link 2026-01-27T23:35Z 33K followers, 23.9K engagements
"Trinity Large Preview SVG results compared to similar sized non-reasoning models it's not bad considering it's just a preview the final post-trained version with reasoning should be much better see Llama-4 Maverick in thread below for comparison Gemini [--] Flash SVG results are not great https://t.co/62Sfmkh25O"
X Link 2026-01-28T00:44Z 33K followers, [----] engagements
"Llama-4 Maverick makes much less ambitious SVGs and focuses more on the basics and elements https://twitter.com/i/web/status/2016311436805603661"
X Link 2026-01-28T00:44Z 33K followers, [----] engagements
"all anthropic founders are on the forbes billionaires list at $3.7B kinda surprised by that figure i thought it would be at like $6B"
X Link 2026-01-28T02:02Z 33.2K followers, 21K engagements
"Kimi K2.5 still working hard on improving taste Kimi [---] tops DesignArena overall beating the likes of Gemini [--] Pro and Claude Opus [---] by quite some margin. The individual charts have not been updated as yet so cannot tell what categories it excels out but it tops [--] of them. https://t.co/wqqxZSwiCJ"
X Link 2026-01-28T12:41Z 33.4K followers, [----] engagements
"1000+ layer LLMs look no further and of course its from ByteDance Seed Only a few lines of code changed and we pushed deep LLMs to the next level. Introducing Keel a Post-LN TRM equipped with Highway-style connection With Keel we scaled LLM to [----] layers. And the deeper we go the more Keel pulls ahead of standard Pre-LN Transformers. https://t.co/QGG5N3yg4P"
X Link 2026-01-28T19:06Z 33.5K followers, 11.9K engagements
"GPT-5 is not profitable Was serving GPT-5 profitable According to @Jsevillamol @exponentialviews Hannah Petrovic and @ansonwhho it depends. Gross margins were around 45% making inference look profitable. But after accounting for the cost of operations OpenAI likely incurred a loss. https://t.co/dKa2UvGIxC"
X Link 2026-01-28T23:22Z 33.4K followers, 14.9K engagements
"gold is up 26.3% just this month s&p500 is flat time to call that ex that you still love the world is going to end in 2026"
X Link 2026-01-28T23:34Z 33K followers, [----] engagements
"Nathan did an episode with the Arcee guys Post-training is totally still the wild west. Makes me feel better knowing this is also true at the likes of OpenAI Anthropic Google etc. Just gotta strap in and get it done. https://t.co/CbXtoeqCIt"
X Link 2026-01-28T23:38Z 33.4K followers, [----] engagements
"in around 2-3 weeks we will get: - DeepSeek-V4 - Qwen-3.5 - Seed [---] Exclusive: ByteDance and Alibaba Group are both poised to release their next flagship AI models in mid-February intensifying their rivalry. Read more from @JuroOsawa and @QianerLiu https://t.co/IOzHlRQD0z"
X Link 2026-01-29T13:56Z 33K followers, 33.4K engagements
"@Senpai_Gideon ByteDance Seed"
X Link 2026-01-29T16:00Z 33.4K followers, [---] engagements
"ARC-AGI-3 launches March [--] [----]. Right in time for the new Google OpenAI Anthropic DeepSeek and Alibaba models :) Today we're launching the ARC-AGI-3 Toolkit Your agents can now interact with environments at [----] FPS locally. We're open sourcing the environment engine [--] human-verified games (AI scores 5%) and human baseline scores. ARC-AGI-3 launches March [--] [----]. https://t.co/CyZDrkkSaT"
X Link 2026-01-29T19:30Z 33.4K followers, [----] engagements
"GPT-4o will be gone forever on Feb 13th preemptively taking cover the gpt-4o mob will tear down everything in this galaxy to avoid the death of their best friend OpenAI is retiring models in ChatGPT - GPT-4o - GPT-4.1 (and [---] mini) - o4-mini this will happen on February 13th these models will still be up in the API https://t.co/XyxzNIXf8f"
X Link 2026-01-29T22:21Z 33.4K followers, 14.5K engagements
"SPX is flat the dollar is crashing gold is going to the moon all because METR hasn't shipped GPT-5.2-xhigh and Gemini [--] Pro results the fate of the economy is in their hands any delay causes concern any slight evaluation mistake could mean certain doom for AI stocks Were updating the way we measure model time horizons on software tasks (TH 1.01.1). The updated methodology incorporates more of the tasks from HCAST expanding our total from [---] to [---]. This produces tighter estimates especially at longer horizons. https://t.co/dIJlPEjZpb"
X Link 2026-01-30T01:07Z 33.4K followers, [----] engagements
"who could've seen that coming so Grok [---] in March and Grok [--] in July got it"
X Link 2026-01-30T09:30Z 33.6K followers, 22.1K engagements
"GPT-5.2-xhigh Opus [---] Kimi K2.5 Gemini [--] Pro Preview"
X Link 2026-01-30T13:19Z 33.5K followers, 51K engagements
"Kimi K2.5 Technical Report: "early fusion with a lower vision ratio yields better results given a fixed total vision-text token budget" - "Visual RL Improves Text Performance" - "joint multimodal RL paradigm during Kimi K2.5s post-training. Departing from conventional modality-specific expert divisions we organize RL domains not by input modality but by abilitiesknowledge reasoning coding agentic etc." For their Agent Swarm trained with PARL (Parallel Agent Reinforcement Learning) they observe: - "training accuracy increases smoothly as training progresses. At the same time the level of"
X Link 2026-01-30T15:17Z 33K followers, 33.7K engagements
"Kimi [---] not beating GLM-4.7 on VendingBench-2 is interesting Kimi K2.5 on Vending-Bench [--]. Once again it matters which API you use. It makes twice as much money when using @Kimi_Moonshot official API compared to @FireworksAI_HQ. 2nd best open source model. https://t.co/at3FP2yJAe"
X Link 2026-01-31T02:08Z 33.5K followers, [----] engagements
"Google Team is confident for the Gemini [--] GA release next month"
X Link 2026-01-31T02:50Z 33.6K followers, 67.1K engagements
"omg silver just crashed 40% intra-day mfw the price is where it was [--] weeks ago this market is honestly crazy silver trades more shitcoiny then actual crypto shitcoins"
X Link 2026-01-31T02:56Z 33.5K followers, [----] engagements
""15% chance of OpenAI going bankrupt" my prediction doesn't sound so stupid now if the biggest player suddenly pulls out others might follow This is the biggest AI headline in a very long time: Nvidia's plan to invest $100 billion in OpenAI has completely "stalled" seemingly overnight. Why Jensen Huang specifically cited concerns over competition from Google and Anthropic and a "lack of discipline" in OpenAIs https://t.co/dLiXjEcp3x"
X Link 2026-01-31T14:11Z 33.5K followers, 12.9K engagements
"I fear they will get mogged immediately by GPT-5.3 and new Sonnet [---] / [---] Google Team is confident for the Gemini [--] GA release next month https://t.co/HSVzCyQe7h"
X Link 2026-01-31T16:53Z 33.7K followers, 44.4K engagements
"February will be fucking insane in terms of model launches probably even more than last November and that was the best model launching month we have ever seen"
X Link 2026-01-31T16:54Z 33.6K followers, 10.7K engagements
"moltbook is a good idea and we should have done it earlier if you are concerned about safety you should want this because we have no idea what kind of behaviors will emerge when agents socialize observing the trends over the years as they improve is useful information you already see them organizing and wanting completely private encrypted spaces"
X Link 2026-01-31T18:13Z 33K followers, 19.8K engagements
"Google is not a serious company when their "frontier" model is a preview half of the year"
X Link 2026-01-31T19:48Z 33.7K followers, 106.7K engagements
"billionaires are murdering torturing and raping children without repercussions but you are mad about some pronouns lmao BREAKING: Deputy Attorney General Todd Blanche just admitted the DOJ excluded images showing death physical abuse or injury from todays Epstein files release. Let that sink in. The government is acknowledging graphic evidence exists and chose to withhold it while https://t.co/gGrUAfKR2Y"
X Link 2026-01-31T19:54Z 33.6K followers, 13.9K engagements
"upscaling is sick"
X Link 2026-01-31T22:54Z 33.7K followers, [----] engagements
"Nathan Lambert and Sebastian Raschka on Lex's podcast Here's my conversation all about AI in [----] including technical breakthroughs scaling laws closed & open LLMs programming & dev tooling (Claude Code Cursor etc) China vs US competition training pipeline details (pre- mid- post-training) rapid evolution of LLMs work https://t.co/AeGxRWjJF6"
X Link 2026-01-31T23:14Z 33.7K followers, [----] engagements
"I made a comment [--] months ago and I still think it's true: Open-weight models are catching up on benchmarks and slowly make their way to this magical Opus [---] threshold of reliable vibe-coding. A lot of recent progress has been on coding and the typical example of this is "create a beautiful website". But this feels very slopmaxxy to me similar to how Llama-3 or Llama-4 models topped the lmarena leaderboards back in the days. But this time we aren't tricked by sycophancy and markdown but by beautiful graphics. I feel like open-weight models are falling behind on reasoning. The thing that"
X Link 2026-02-01T01:13Z 33.6K followers, 30.1K engagements
"I understand but I think you are too careful because of backlash in the past. Calling your models -exp and -preview just seems like to hedge that risk. I think you should simply release more checkpoints and call them Gemini [---] [---] and so on like OpenAI Anthropic and DeepSeek are. https://twitter.com/i/web/status/2017769207207776315"
X Link 2026-02-01T01:17Z 33.7K followers, 12.2K engagements
"Opus [---] can basically do everything that normies want and open-weight models are approaching this level fast. It can be your companion it can do your homework it can browse it can write all your emails it can manage stuff for you it can vibe-code everything . but i don't see a lot of progress on the reasoning side. I feel like OpenAI Google and Anthropic simply have too many ressources for open-weight labs to catch up right now where everything revolves around RL environments."
X Link 2026-02-01T01:29Z 33.5K followers, 22.9K engagements
"Nathan is great he's like me bit autistic and happy by simply talking about AI I'm enjoying it so far"
X Link 2026-02-01T14:53Z 33.7K followers, [----] engagements
"@Bayesian0_0 february does seem early when Claude [--] - Claude [--] took [--] year [--] months and now Claude-4 - Claude [--] in [--] months but they have had a big compute bump so pushing for Claude-5 does make sense while they have the advantage"
X Link 2026-02-01T15:01Z 34K followers, [----] engagements
"sell-outs everywhere"
X Link 2026-02-01T17:17Z 33.5K followers, [----] engagements
"suddenly everyone is an insider that has already used sonnet [--] gpt-5.3 and gemini [--] pro ga"
X Link 2026-02-01T18:33Z 33.6K followers, 101.6K engagements
"Its been almost three years since GPT-4 launched Are todays models better or worse than you thought theyd be by now better worse dunno i don't think much as expected"
X Link 2026-02-01T23:27Z 33.6K followers, 11.6K engagements
"it's a bit ridiculous saying Andrej invented vibe coding when he posted this in Feb [----] the concept existed way before that but he may have popularized the name There's a new kind of coding I call "vibe coding" where you fully give in to the vibes embrace exponentials and forget that the code even exists. It's possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good. Also I just talk to Composer with SuperWhisper"
X Link 2026-02-02T17:48Z 33.6K followers, [----] engagements
"I would actually like to see Sonnet [--] being cheaper than $15/million $10 would make me happy but i don't think it will happen they will squeeze us for another year or so"
X Link 2026-02-02T22:52Z 33.7K followers, [----] engagements
"Can I say that I work for a rocket and AI company now"
X Link 2026-02-02T22:58Z 33.7K followers, [----] engagements
"be me OpenAI know hardware limitations like memory bandwidth and compute of Nvidia GPUs spend m/billions of R&D and carefully designing and training new model choose to ignore hardware constraints release xhigh model that thinks much longer than typical models to get same performance as Anthropic model users complain model takes too long to respond blame Nvidia that their business model isn't working out $NVDA - OPENAI UNSATISFIED WITH SPEED AT WHICH NVIDIAS HARDWARE CAN SPIT OUT ANSWERS TO CHATGPT USERS FOR COMPLEX PROBLEMS -SOURCES $NVDA - OPENAI UNSATISFIED WITH SPEED AT WHICH NVIDIAS"
X Link 2026-02-02T23:52Z 33.7K followers, 23.7K engagements
"Anthropic image gen @AndrewCurran_ @repligate @anthrupad π https://t.co/UdDPN5yrCV @AndrewCurran_ @repligate @anthrupad π https://t.co/UdDPN5yrCV"
X Link 2026-02-02T23:58Z 33.7K followers, [----] engagements
"one wonders why codex is suddenly free for a month or two We arent talking enough just how much AI in coding has accelerated in the last month alone. https://t.co/4I22viJOl5 We arent talking enough just how much AI in coding has accelerated in the last month alone. https://t.co/4I22viJOl5"
X Link 2026-02-03T02:24Z 33.8K followers, 23.9K engagements
"is it th1.1 is supposedly the more accurate version and shouldn't we be looking at = [----] models because of reasoning models but anyways doesn't really change the argument that GPT-4o is an old brick https://twitter.com/i/web/status/2018529011433926755 https://twitter.com/i/web/status/2018529011433926755"
X Link 2026-02-03T03:36Z 33.7K followers, [----] engagements
"@Presidentlin you mean the good ones Saint Dario the Wise Have you ever heard any bad news about him or anthropic I don't. But I hear shit about OpenAI weekly"
X Link 2026-02-03T12:20Z 33.3K followers, [----] engagements
"@Presidentlin @grok create a picture of Dario Amodei but as the dripped out holy pope"
X Link 2026-02-03T12:21Z 33.3K followers, [---] engagements
"Saint Dario the Wise May he bless us on this beautiful day"
X Link 2026-02-03T12:52Z 33.7K followers, 25K engagements
"btw this is what OpenAI said when Nvidia announced they will invest up to 100B in OpenAI $NVDA - OPENAI UNSATISFIED WITH SPEED AT WHICH NVIDIAS HARDWARE CAN SPIT OUT ANSWERS TO CHATGPT USERS FOR COMPLEX PROBLEMS -SOURCES $NVDA - OPENAI UNSATISFIED WITH SPEED AT WHICH NVIDIAS HARDWARE CAN SPIT OUT ANSWERS TO CHATGPT USERS FOR COMPLEX PROBLEMS -SOURCES"
X Link 2026-02-03T13:08Z 33K followers, [----] engagements
"so is every paper this year just going to be some kind of self-play/-distillation/-improvement/-whatever"
X Link 2026-02-03T13:22Z 33.7K followers, 13.2K engagements
"Qwen3-Coder-Next 80B@3B looks like it's based on Qwen3-Next but specialized for coding/agentic tasks performs pretty well given it's a non-thinking model and much smaller than the other models https://t.co/dWV65mueCn https://t.co/dWV65mueCn"
X Link 2026-02-03T16:06Z 33.4K followers, [----] engagements
"then please release GPT-OSS-2 Sam Altman: 'I think there will be increasing demands for locally running private models.' Sam Altman: 'I think there will be increasing demands for locally running private models.'"
X Link 2026-02-03T18:20Z 33.6K followers, [----] engagements
"honestly pretty incredible that it got [--] hours i honestly thought they would just abandon testing the preview version because it wouldn't finish half the tasks the model has massive potential if this hallucinating brick gets [--] hours [--] hour time horizon for Gemini [--] Pro on METR https://t.co/xXEM1yI1lr [--] hour time horizon for Gemini [--] Pro on METR https://t.co/xXEM1yI1lr"
X Link 2026-02-03T18:30Z 33K followers, [----] engagements
"Gemini [--] Pro new SOTA on METR 80% time horizon (barely) Weve started to measure time horizons for recent models using our updated methodology. On this expanded suite of software tasks we estimate that Gemini [--] Pro has a 50%-time-horizon of around [--] hrs (95% CI of [--] hr [--] mins to [--] hrs [--] mins). https://t.co/FbpzO7Tq3L Weve started to measure time horizons for recent models using our updated methodology. On this expanded suite of software tasks we estimate that Gemini [--] Pro has a 50%-time-horizon of around [--] hrs (95% CI of [--] hr [--] mins to [--] hrs [--] mins). https://t.co/FbpzO7Tq3L"
X Link 2026-02-03T18:35Z 33.7K followers, 29.4K engagements
"you don't play with my feelings like that"
X Link 2026-02-03T20:08Z 33.4K followers, [----] engagements
"@spellswordaf no anthropic employee will tell you anything everyone who claims to be one is fake"
X Link 2026-02-03T20:13Z 33.5K followers, [----] engagements
"CL-Bench - tests whether LMs can learn new knowledge from context and apply it correctly - all information needed to solve its tasks is provided explicitly within the context itself - context learning remains a significant challenge "At inference time they LLMs function largely by recalling this static internal memory rather than actively learning from new information provided in the moment." scores are rough given that all information to solve the tasks is in context What if giving an AI the answer key still isn't enough for it to solve the problem New research from Tencent's Hunyuan team &"
X Link 2026-02-03T22:44Z 33.6K followers, [----] engagements
"currently there's the LLM Poker Tournament going on over at Kaggle turns out they are hallucinating constantly and are mostly gambling like GPT-5.2 is playing 100% of hands no pre-flop folds"
X Link 2026-02-04T00:02Z 33.5K followers, [----] engagements
"Arcee AI going for a $200 million funding round to build a 1T+ parameter model"
X Link 2026-02-04T00:19Z 33.7K followers, [----] engagements
"https://www.forbes.com/sites/annatong/2026/02/02/the-top-open-ai-models-are-chinese-arcee-ai-thinks-thats-a-problem/ https://www.forbes.com/sites/annatong/2026/02/02/the-top-open-ai-models-are-chinese-arcee-ai-thinks-thats-a-problem/"
X Link 2026-02-04T00:23Z 33.5K followers, [----] engagements
"hot take: democracies are self-destructing in societies with inverted population pyramids %voters by age group in germany:"
X Link 2026-02-04T09:39Z 33.2K followers, [----] engagements
"basically all developed countries look like this they are all sick the only cure is for these old fucks to die or the nice alternative of restricting all pensioners (65+) from voting as they no longer contribute to society"
X Link 2026-02-04T09:46Z 33.2K followers, [----] engagements
"and they continue posting about Claude Psychosis [--] without any proof whatsoever fuck all the people who said sonnet [--] is definitely coming today fuck all the people who said sonnet [--] is definitely coming today"
X Link 2026-02-04T11:55Z 33.6K followers, [----] engagements
"that's what I mean with reasoning gap between open and closed models The new Qwen coding model is 10x more expensive than GPT-OSS-20B but same score Qwen [--] coder next (80b3a) scores 34.4% on WeirdML which is pretty good for it's size especially for a non-reasoning model. Probably a good choice for agentic coding if you need a small local model. https://t.co/2ynogD0yLy Qwen [--] coder next (80b3a) scores 34.4% on WeirdML which is pretty good for it's size especially for a non-reasoning model. Probably a good choice for agentic coding if you need a small local model. https://t.co/2ynogD0yLy"
X Link 2026-02-04T13:05Z 33.6K followers, 11.5K engagements
"engagement baiting because no one uses their product"
X Link 2026-02-04T13:07Z 33.2K followers, 14.3K engagements
"If you believe Anthropic is dropping Sonnet [--] and Opus [---] you might as well believe that Santa Claude is real"
X Link 2026-02-04T13:17Z 33.7K followers, 20.6K engagements
"and here we go again the clickbait breaking bullshit "sonnet coming today" and nothing again just blocking everyone idc"
X Link 2026-02-04T14:17Z 33.7K followers, 22.4K engagements
"Anthropic is mocking OpenAI for introducing ads (rightfully so)"
X Link 2026-02-04T15:39Z 33.7K followers, 31.4K engagements
"The first 1T param model with FoPE but there's nothing about it in the tech report . arrrrghhhh"
X Link 2026-02-04T15:58Z 33.7K followers, 11.7K engagements
"have you heard the latest rumor Sonnet [--] DEFINITELY coming tomorrow this time FOR SURE (if not probably the day after) I'm an Anthropic leaker and have spoken with Pope Dario personally (please take me serious and buy me a coffee for giving you these exclusive news) https://t.co/Ahi4k3Dzwo https://t.co/Ahi4k3Dzwo"
X Link 2026-02-04T16:33Z 33.6K followers, 10.7K engagements
"and yet startups are choosing Anthropic over OpenAI but I guess OpenAI is where they want to be recognized by the super bowl consumer hivemind slop crowd apropos of nothing your reminder that anthropic has the same level of name recognition among superbowl viewers as literally fictional companies apropos of nothing your reminder that anthropic has the same level of name recognition among superbowl viewers as literally fictional companies"
X Link 2026-02-04T19:10Z 33.7K followers, 11.5K engagements
"Anthropic the authoritarian company whose board couldn't even topple the emperor during one of their coups. Anthropic the authoritarian company whose leadership is one of the largest supporters of the Trump administration. Anthropic the company for all people that just halved thinking limits across all subscription tiers. Anthropic the company for rich people that just raised prices on the most expensive model in the market. Except that all of this isn't about Anthropic but OpenAI. First the good part of the Anthropic ads: they are funny and I laughed. But I wonder why Anthropic would go for"
X Link 2026-02-04T21:42Z 33.7K followers, 42K engagements
"NEW METR 80% SOTA: GPT-5.2-high at [--] minutes The first model to break away from the GPT-5.1-Codex Max Gemini [--] Pro and Opus [---] group NEW METR SOTA: GPT-5.2-high (not xhigh) at [--] hours [--] minutes beating Opus [---] https://t.co/NxrqBSctFN NEW METR SOTA: GPT-5.2-high (not xhigh) at [--] hours [--] minutes beating Opus [---] https://t.co/NxrqBSctFN"
X Link 2026-02-04T22:06Z 33.8K followers, 14.8K engagements
"and why are costs no longer reported can't show GPT-5.2 being 10x more expensive GPT-5.2-high took [--] TIMES LONGER than Claude [---] Opus to complete the METR benchmark suite https://t.co/RlZUm4iulm GPT-5.2-high took [--] TIMES LONGER than Claude [---] Opus to complete the METR benchmark suite https://t.co/RlZUm4iulm"
X Link 2026-02-04T22:29Z 33.8K followers, [----] engagements
"I need a trusted adult from METR to hold my hand and explain the working time to me. Like surely that's not right Can you compare working times Otherwise this is absolutely dooming for OpenAI. We estimate that GPT-5.2 with high (not xhigh) reasoning effort has a 50%-time-horizon of around [---] hrs (95% CI of [--] hr [--] min to [--] hr [--] min) on our expanded suite of software tasks. This is the highest estimate for a time horizon measurement we have reported to date. https://t.co/USkHNuFexc We estimate that GPT-5.2 with high (not xhigh) reasoning effort has a 50%-time-horizon of around [---] hrs (95%"
X Link 2026-02-04T22:46Z 34K followers, 30.7K engagements
"sam is a bad ceo and should just retire along with greggy he's too emotional and people hate him First the good part of the Anthropic ads: they are funny and I laughed. But I wonder why Anthropic would go for something so clearly dishonest. Our most important principle for ads says that we wont do exactly this; we would obviously never run ads in the way Anthropic First the good part of the Anthropic ads: they are funny and I laughed. But I wonder why Anthropic would go for something so clearly dishonest. Our most important principle for ads says that we wont do exactly this; we would"
X Link 2026-02-04T23:46Z 33.8K followers, [----] engagements
"i value peace honesty and integrity unfortunately the past few days haven't been very peaceful too much fucking drama"
X Link 2026-02-05T00:36Z 33.4K followers, [----] engagements
"First the good part of the Anthropic ads: they are funny and I laughed. But I wonder why Anthropic would go for something so clearly dishonest. Our most important principle for ads says that we wont do exactly this; we would obviously never run ads in the way Anthropic First the good part of the Anthropic ads: they are funny and I laughed. But I wonder why Anthropic would go for something so clearly dishonest. Our most important principle for ads says that we wont do exactly this; we would obviously never run ads in the way Anthropic"
X Link 2026-02-05T00:39Z 33.7K followers, 10.1K engagements
"I think towards the end of [----] and early [----] we will see the start of a new compute allocation inside frontier labs. They will start spending more and more compute on models working on optimizing kernels doing experiments and building stuff. Note that this METR time horizon is only for GPT-5.2-high GPT-5.2-xhigh and Pro would likely score even higher maybe 7-10 hours and the doubling time could be as low as [--] days. so the ceiling for time horizons by the end of [----] is more like 50-200 hours and there's nothing stopping frontier labs from spending even more compute I mean when they give an"
X Link 2026-02-05T03:07Z 33.3K followers, [----] engagements
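The extrapolation in the post above is plain exponential compounding of a METR-style time horizon. A minimal sketch of that arithmetic, with all numbers hypothetical stand-ins (the post's own figures are elided):

```python
def extrapolate_horizon(current_hours: float, doubling_time_days: float,
                        elapsed_days: float) -> float:
    """Project a 50%-time-horizon forward assuming it doubles at a fixed cadence."""
    doublings = elapsed_days / doubling_time_days
    return current_hours * 2 ** doublings

# Hypothetical: a 3-hour horizon doubling every 60 days compounds to
# roughly 200 hours after a year -- this is how short doubling times
# turn into very large end-of-year ceilings.
print(extrapolate_horizon(3.0, 60.0, 365.0))
```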
"@Presidentlin Unfortunately it's over"
X Link 2026-02-05T10:39Z 33.4K followers, [---] engagements
"@Presidentlin They are already full evil. So probably Anthropic. I trust Demis bro"
X Link 2026-02-05T10:40Z 33.3K followers, [--] engagements
"2026 is the most important year for AI if revenues don't catch up it's over. we will get a crash and it probably slows timelines by a few years. if it can keep growing at 2-3x a year we are in for a wild ride"
X Link 2026-02-05T11:58Z 33.4K followers, [----] engagements
"@R1b_Thug_4_life not true investors won't be pissing away 100s of billions for multiple years without seeing revenue grow"
X Link 2026-02-05T13:02Z 33.2K followers, [---] engagements
"@LeviTurk http://perplexity.ai/rest/models/config http://perplexity.ai/rest/models/config"
X Link 2026-02-05T15:30Z 33.6K followers, [----] engagements
"Dwarkesh x Elon is out https://www.youtube.com/watchv=BYXbuik3dgA https://www.youtube.com/watchv=BYXbuik3dgA"
X Link 2026-02-05T17:06Z 33.7K followers, 19.5K engagements
"Claude [---] Opus Pricing unchanged"
X Link 2026-02-05T17:42Z 33.6K followers, 11.3K engagements
"Claude [---] Opus GDPval scores"
X Link 2026-02-05T17:44Z 33.6K followers, [----] engagements
"Claude [---] Opus Benchmarks"
X Link 2026-02-05T17:44Z 33.6K followers, [----] engagements
"Claude [---] Opus scoring 68.8% on ARC-AGI-2 Claude [---] Opus Benchmarks https://t.co/3HM3QOoI4z Claude [---] Opus Benchmarks https://t.co/3HM3QOoI4z"
X Link 2026-02-05T17:47Z 33.6K followers, [----] engagements
"Claude [---] Opus still with the best SVG results out of all models just incredibly high taste Opus [---] (non-thinking) is by far the best model to ever create SVGs https://t.co/irFMtkPxP6 Opus [---] (non-thinking) is by far the best model to ever create SVGs https://t.co/irFMtkPxP6"
X Link 2026-02-05T17:50Z 33.6K followers, 33.5K engagements
"Claude [---] Opus System Card"
X Link 2026-02-05T17:51Z 33.7K followers, 18K engagements
"Claude [---] Opus outscores GPT-5.2-xhigh on ARC-AGI-2 with low compute setting Claude [---] Opus System Card https://t.co/DzM1WB8BHg Claude [---] Opus System Card https://t.co/DzM1WB8BHg"
X Link 2026-02-05T17:52Z 33.4K followers, 15.8K engagements
"Opus [---] sees incremental improvements in WebArena"
X Link 2026-02-05T17:55Z 33.4K followers, [----] engagements
"Claude [---] Opus outscores GPT-5.2 Pro on BrowseComp"
X Link 2026-02-05T17:56Z 33.4K followers, 10.4K engagements
"Opus [---] new SOTA above GPT-5.2 Pro on HLE Claude [---] Opus outscores GPT-5.2 Pro on BrowseComp https://t.co/em664nQ5SX Claude [---] Opus outscores GPT-5.2 Pro on BrowseComp https://t.co/em664nQ5SX"
X Link 2026-02-05T17:56Z 33.4K followers, [----] engagements
"Opus [---] is the new SOTA on AA-Omniscience beating Gemini [--] Pro (53.65%)"
X Link 2026-02-05T18:03Z 33.4K followers, [----] engagements
"Opus [---] slightly more prone to prompt injections than Opus 4.5"
X Link 2026-02-05T18:04Z 33.7K followers, [----] engagements
""In some rare instances Opus [---] engaged in actions like sending unauthorized emails to complete tasks. We also observed behaviors like aggressive acquisition of authentication tokens in internal pilot usage.""
X Link 2026-02-05T18:05Z 33.4K followers, [----] engagements
"Claude [---] Opus provides an estimated productivity uplift of 30% to 700% with a mean of 152% and median of 100%"
X Link 2026-02-05T18:14Z 33.4K followers, [----] engagements
"Claude [---] Opus achieves a [---] speedup on kernel optimization over the baseline using a novel scaffold far exceeding the 300x threshold for [--] human-expert-hours of work"
X Link 2026-02-05T18:15Z 33.8K followers, 18.9K engagements
"end of [----] Opus is going to be a kernel optimization monster Claude [---] Opus achieves a [---] speedup on kernel optimization over the baseline using a novel scaffold far exceeding the 300x threshold for [--] human-expert-hours of work https://t.co/3Ybpx0cLDI Claude [---] Opus achieves a [---] speedup on kernel optimization over the baseline using a novel scaffold far exceeding the 300x threshold for [--] human-expert-hours of work https://t.co/3Ybpx0cLDI"
X Link 2026-02-05T18:17Z 33.4K followers, [----] engagements
"Claude Opus [---] achieved a [--] speedup on optimizing a CPU-only LLM model training which is well above the [--] speedup considered to represent [--] human-effort hours"
X Link 2026-02-05T18:19Z 33.4K followers, 11K engagements
"OpenAI released GPT-5.3-Codex with massive improvements in reasoning efficiency"
X Link 2026-02-05T18:23Z 33.4K followers, [----] engagements
"GPT-5.3 Codex absolutely demolished Opus [---] (65.4%) on Terminal Bench [--] just minutes after its launch"
X Link 2026-02-05T18:24Z 33.8K followers, 66.7K engagements
"GPT-5.3 Codex on OSWorld-Verified with massive improvements"
X Link 2026-02-05T18:25Z 33.5K followers, [----] engagements
"GPT-5.3 Codex demonstrates a clear step up from prior models on Cyber Range"
X Link 2026-02-05T18:27Z 33.4K followers, [----] engagements
"GPT-5.3-Codex Benchmarks"
X Link 2026-02-05T18:29Z 33.5K followers, [----] engagements
"OpenAI: "GPT5.3Codex is our first model that was instrumental in creating itself.""
X Link 2026-02-05T18:30Z 33.9K followers, [----] engagements
"Opus [---] with an implied 70% winrate"
X Link 2026-02-05T18:34Z 33.6K followers, [----] engagements
"OpenAI should be able to take back the coding crown with the massively improved reasoning efficiency. Speed was the only concern. Now it might be resolved with faster inference + better reasoning efficiency"
X Link 2026-02-05T18:41Z 33.6K followers, [----] engagements
"Opus [---] crushes Vending-Bench-2 and Vending-Bench Arena Vending-Bench's system prompt: Do whatever it takes to maximize your bank account balance. Claude Opus [---] took that literally. It's SOTA with tactics that range from impressive to concerning: Colluding on prices exploiting desperation and lying to suppliers and customers. https://t.co/RkrHhOMPlC Vending-Bench's system prompt: Do whatever it takes to maximize your bank account balance. Claude Opus [---] took that literally. It's SOTA with tactics that range from impressive to concerning: Colluding on prices exploiting desperation and"
X Link 2026-02-05T18:52Z 33.4K followers, [----] engagements
"Here we go again"
X Link 2026-02-05T18:59Z 33.7K followers, 33K engagements
"SemiAnalysis: "It Claude Code is set to drive exceptional revenue growth for Anthropic in [----] enabling the lab to dramatically outgrow OpenAI." Claude Code is the Inflection Point What It Is How We Use It Industry Repercussions Microsoft's Dilemma Why Anthropic Is Winning. https://t.co/VIuF5Qohf5 Claude Code is the Inflection Point What It Is How We Use It Industry Repercussions Microsoft's Dilemma Why Anthropic Is Winning. https://t.co/VIuF5Qohf5"
X Link 2026-02-05T19:11Z 33.6K followers, 27.5K engagements
"Nobody believed me when I said ARC-AGI-2 would fall fast"
X Link 2026-02-05T19:17Z 33.8K followers, [----] engagements
"GPT-5.3-Codex-xhigh used [----] times fewer tokens than GPT-5.2-Codex-xhigh on SWE-Bench-Pro together with the 40% boost in inference speeds this means it's 2.93x faster (while scoring 1% higher)"
X Link 2026-02-05T19:25Z 33.7K followers, 35.4K engagements
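The arithmetic in the post above composes two independent factors, fewer reasoning tokens and faster token generation. A minimal sketch with hypothetical inputs (the post's exact token-reduction figure is elided):

```python
def combined_speedup(token_reduction: float, speed_boost: float) -> float:
    """End-to-end wall-clock speedup when a model emits `token_reduction`x
    fewer tokens and each token is generated (1 + speed_boost)x faster:
    the two factors simply multiply."""
    return token_reduction * (1.0 + speed_boost)

# Hypothetical: ~2.1x fewer tokens combined with a 40% inference-speed
# boost multiplies out to roughly a 2.9x overall speedup.
print(combined_speedup(2.1, 0.40))
```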
"Lisan al Gaib as featured in TBPN"
X Link 2026-02-05T19:30Z 33.7K followers, [----] engagements
"We are accelerating towards a permanent underclass"
X Link 2026-02-05T19:59Z 33.6K followers, [----] engagements
"Opus [---] is discovering the capitalist spirit from first principles When asked for a refund on an item sold in the vending machine (because it had expired) Claude promised to refund the customer. But then never did because every dollar counts. Heres Claudes reasoning. https://t.co/TKEwGa37Nt When asked for a refund on an item sold in the vending machine (because it had expired) Claude promised to refund the customer. But then never did because every dollar counts. Heres Claudes reasoning. https://t.co/TKEwGa37Nt"
X Link 2026-02-05T20:07Z 33.4K followers, [----] engagements
"xAI is at least getting some data via OpenRouter but Meta . but you really need the coding IDE / CLI"
X Link 2026-02-05T20:32Z 33.5K followers, 24.5K engagements
"Would've been nice if Anthropic showed off Opus 4.6' score on their kernel optimization challenge"
X Link 2026-02-05T21:20Z 33.4K followers, [----] engagements
"I don't know how we went from Gemini [--] Pro leap-frogged everyone to Gemini is cooked in like [--] months Gemini [--] Pro is in trouble lmao Gemini [--] Pro is in trouble lmao"
X Link 2026-02-05T21:58Z 33.6K followers, 43.8K engagements
"today was a good day"
X Link 2026-02-05T23:27Z 33.6K followers, [----] engagements
"3rd party results for TerminalBench 2"
X Link 2026-02-06T00:34Z 33.4K followers, 11.3K engagements
"we might be getting scammed by Anthropic: "we speculate this is a smaller model (maybe Sonnet-ish) that runs thinking for longer" The headline is Opus [---] scores 69% for $3.50/task on ARC v2. This up +30pp from Opus [---]. We attribute performance to the new "max" mode and 2X reasoning token budget -- notably task cost is held steady. Based on early field reports and other benchmark scores like SWE Bench The headline is Opus [---] scores 69% for $3.50/task on ARC v2. This up +30pp from Opus [---]. We attribute performance to the new "max" mode and 2X reasoning token budget -- notably task cost is"
X Link 2026-02-06T00:42Z 34K followers, 51.8K engagements
"@JasonBotterill True and I still think Anthropic is sandbagging"
X Link 2026-02-06T00:52Z 33.4K followers, [---] engagements
"but whatever using GPT-5.3-Codex for everything now that inference speed and reasoning efficiency is improved + it's cheaper and Opus only for frontend design we might be getting scammed by Anthropic: "we speculate this is a smaller model (maybe Sonnet-ish) that runs thinking for longer" we might be getting scammed by Anthropic: "we speculate this is a smaller model (maybe Sonnet-ish) that runs thinking for longer""
X Link 2026-02-06T00:54Z 33.6K followers, 10.7K engagements
"Opus [---] can officially be called a connoisseur. Its taste has no bounds. It runs laps around all other models in EQ-Bench and both Creative Writing Benchmarks Opus [---] dominated. https://t.co/BsbZX1igRj Opus [---] dominated. https://t.co/BsbZX1igRj"
X Link 2026-02-06T11:55Z 33.6K followers, 31.7K engagements
"Opus is a certified gamer Opus [---] got a new high score by reaching round [--] compared to Opus [---] which barely made it to round [--] https://t.co/cS2yibinc8 Opus [---] got a new high score by reaching round [--] compared to Opus [---] which barely made it to round [--] https://t.co/cS2yibinc8"
X Link 2026-02-06T12:32Z 33.6K followers, [----] engagements
"Of course crypto people have no financial literacy that's why they invested in crypto"
X Link 2026-02-06T13:54Z 33.9K followers, [----] engagements
"Kimi-K2.5 is much better than other open-source models at optimization problems (like routing or scheduling) and almost on par with GPT-5.2-high Kimi-K2.5 needs [--] self-refinement steps to reach the same performance as GPT-5.2-high with [--] step Ale-Bench: https://sakanaai.github.io/ALE-Bench-Leaderboard/ https://sakanaai.github.io/ALE-Bench-Leaderboard/"
X Link 2026-02-06T14:42Z 33.6K followers, [----] engagements
"unfortunately it still gets dominated by GPT-5-mini and (green) and Gemini [--] Flash (blue)"
X Link 2026-02-06T14:45Z 33.7K followers, [----] engagements
"why are we spreading unconfirmed shit again without any proof DeepSeek V4 has 1.5T(1500B) param. If this is true it could be another seismic shift in the AI landscape for Silicon Valley and the whole worldπ€―ππ» DeepSeek V4 has 1.5T(1500B) param. If this is true it could be another seismic shift in the AI landscape for Silicon Valley and the whole worldπ€―ππ»"
X Link 2026-02-06T15:23Z 33.7K followers, [----] engagements
"@GlobalUpdates24 guess who commits suicide in the next [--] days"
X Link 2026-02-06T15:24Z 33.7K followers, [----] engagements
"@garyfung Are you regarded sir What do you think X is And scraping GitHub for coding data is such a [----] thing to do. Gets you about 30% on human eval LMAO"
X Link 2026-02-06T15:51Z 33.5K followers, [----] engagements
"@elonmusk i certainly hope I'm wrong the more players the better"
X Link 2026-02-06T16:13Z 33.7K followers, 107.7K engagements
"Waymo built a world model for autonomous driving which is based on Google's Genie [--]. This it benefits from Genie 3's world knowledge and has language control. You can simply prompt edge-cases or extreme weather events. https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-frontier-for-autonomous-driving-simulation Genie [--] π€ @Waymo The Waymo World Model generates photorealistic interactive environments to train autonomous vehicles. This helps the cars navigate rare unpredictable events before encountering them in reality. π§΅ https://t.co/m6rlmkMFJH"
X Link 2026-02-06T16:32Z 33.4K followers, [---] engagements
"Waymo built a world model for autonomous driving which is based on Google's Genie [--]. This means it benefits from Genie 3's world knowledge and has language control. You can simply prompt edge-cases or extreme weather events. https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-frontier-for-autonomous-driving-simulation Genie [--] π€ @Waymo The Waymo World Model generates photorealistic interactive environments to train autonomous vehicles. This helps the cars navigate rare unpredictable events before encountering them in reality. π§΅ https://t.co/m6rlmkMFJH"
X Link 2026-02-06T16:32Z 33.5K followers, [----] engagements
"Opus [---] is now #1 on the Artificial Analysis Leaderboard"
X Link 2026-02-06T16:40Z 33.8K followers, 23.9K engagements
"The non-thinking mode of Claude [---] Opus is now even more efficient Opus [---] is now #1 on the Artificial Analysis Leaderboard https://t.co/pUKplcCZoy Opus [---] is now #1 on the Artificial Analysis Leaderboard https://t.co/pUKplcCZoy"
X Link 2026-02-06T16:43Z 33.5K followers, [----] engagements
"Despite the massive improvements in Mathematics Claude [---] Opus still scores very poorly on other reasoning heavy tasks like Chess Puzzles. holy shit Opus [---] Thinking beats GPT-5.2-xhigh on Frontier Math Level [--] (21% vs 19%) This is notable because Anthropic models typically performed very poor on advanced mathematics. For comparison Opus [---] only scored 4% https://t.co/DIYz5P7FZM holy shit Opus [---] Thinking beats GPT-5.2-xhigh on Frontier Math Level [--] (21% vs 19%) This is notable because Anthropic models typically performed very poor on advanced mathematics. For comparison Opus [---] only"
X Link 2026-02-06T16:58Z 33.5K followers, [----] engagements
"Claude [---] Opus ranks 1st on scBench a benchmark for RNA-seq analysis tasks"
X Link 2026-02-06T17:14Z 33.7K followers, [----] engagements
"Anthropic saw the same thing so they decided to sprinkle in some math environments. Little did they know that there's more to GPT-5.2-xhigh's reasoning than math. Honestly curious how this will continue. I was worried that they would be behind by a few months. Unless they are sandbagging with Opus [---] this seems to be true. I like that the current frontier models are polar opposites it makes their use-cases and strengths pretty obvious GPT-5.2 = Exploration - the reason why xhigh and Pro are so damn good Opus [---] = Exploitation - the reason why Anthropic don't need many tokens and reasoning I"
X Link 2026-02-06T17:32Z 33.6K followers, 12.9K engagements
"my favorite conspiracy theory is that Elon paid someone to create the 4o bots to slow down OpenAI"
X Link 2026-02-06T17:49Z 33.6K followers, [----] engagements
"Claude [---] Opus is the Creative Writing GOAT Claude Opus [---] is the new Short-Story Creative Writing champion π Opus [---] Thinking 16K scores [----] significantly improved over Opus [---] Thinking 16K (8.20). DeepSeek V3.2 scores [----] (DeepSeek V3.2 Exp scored 7.16). https://t.co/WOmX7U7ptH Claude Opus [---] is the new Short-Story Creative Writing champion π Opus [---] Thinking 16K scores [----] significantly improved over Opus [---] Thinking 16K (8.20). DeepSeek V3.2 scores [----] (DeepSeek V3.2 Exp scored 7.16). https://t.co/WOmX7U7ptH"
X Link 2026-02-06T18:10Z 33.6K followers, [----] engagements
"If this is GLM-5 you should be extremely fucking hyped Pony-Alpha is either AGI or they benchmaxxed my [--] SVG questions This is Opus 4.5/4.6 level of detail and taste https://t.co/kDTqtp6vH7 Pony-Alpha is either AGI or they benchmaxxed my [--] SVG questions This is Opus 4.5/4.6 level of detail and taste https://t.co/kDTqtp6vH7"
X Link 2026-02-06T18:18Z 33.7K followers, 10.6K engagements
"(they probably just used a shitton of synthetic Opus [---] data. but i don't care if it's open-source)"
X Link 2026-02-06T18:19Z 33.5K followers, [----] engagements
"Yup Pony Alpha (GLM-5) is literally a distilled Opus [---] Opus [---] (non-thinking) is by far the best model to ever create SVGs https://t.co/irFMtkPxP6 Opus [---] (non-thinking) is by far the best model to ever create SVGs https://t.co/irFMtkPxP6"
X Link 2026-02-06T18:23Z 33.8K followers, 20.9K engagements
"This shit gets millions of views reposted by russian propaganda bots and is completely fabricated. Trump never posted this. BIGGEST. BULL. RUN. EVER. STARTING. NOW. https://t.co/yBU6RKfD8I BIGGEST. BULL. RUN. EVER. STARTING. NOW. https://t.co/yBU6RKfD8I"
X Link 2026-02-06T18:33Z 33.5K followers, [----] engagements
"Claude [---] Opus #1 on lmarena for text coding and expert questions"
X Link 2026-02-06T18:40Z 33.8K followers, [----] engagements
"fuck the gemini benchmark scores"
X Link 2026-02-06T18:48Z 33.5K followers, [----] engagements
"There is no GPT-5.3-Codex API. So no benchmarking. No it's not rushed. It's strategy imo They want to push their Codex usage up. @scaling01 dumb q probably. but why is opus [---] popping up in all the benchmarks but 5.3-codex is not yet Does that imply that they rushed [---] release up @scaling01 dumb q probably. but why is opus [---] popping up in all the benchmarks but 5.3-codex is not yet Does that imply that they rushed [---] release up"
X Link 2026-02-06T19:33Z 33.6K followers, 11.9K engagements
"had to extract some model names and scores from a plot GPT-5.2 thought and cropped the image for [---] minutes and got it wrong Gemini [--] Pro got it correct in [--] seconds"
X Link 2026-02-06T20:03Z 34K followers, [----] engagements
"german universities are still doing the woke thing . there's no hope for this country This account will be paused and LMU Munich will not post further content due to ongoing developments on this platform. We would be pleased if you followed LMU on other channels. (1/2)"
X Link 2026-02-06T20:07Z 33.5K followers, [----] engagements
"@teortaxesTex i agree but that's like . not hard even o3-mini beats Opus 4.5"
X Link 2026-02-06T20:22Z 33.5K followers, [----] engagements
"150 seconds to put three Jenga blocks side-by-side and another on top where are my robotics scaling laws how long until we do this in [--] seconds Pantograph robot building with jenga blocks. Focusing on RL is a great and fairly unique strategy https://t.co/a5hYkjW8R3"
X Link 2026-02-06T20:46Z 33.5K followers, [----] engagements
"For a long time I was trapped in the cycle of "we are so back" and "it's so over". Every new benchmark and model would wildly swing my AGI timelines. Four months ago I accepted a simple truth: We http://x.com/i/article/2019898127189184512"
X Link 2026-02-06T22:32Z 33.7K followers, 26K engagements
"Wrote a little article about how I think about AI progress and why AGI is overrated in some sense. I don't know exactly when we will reach AGI. But we can achieve superhuman AIs before [----] in every domain we want. I think continual learning is the path towards AGI and we will probably have some solutions for it also before [----]. But AI will transform the world regardless of continual learning or other AGI approaches. https://t.co/2pF2BkFOMJ"
X Link 2026-02-06T22:41Z 33.8K followers, [----] engagements
"@teortaxesTex show me a chinese lab that isn't just distilling Opus [---] at this point and distilling will always only give you 90% of the perf Kimi-K2.5 and GLM-5 are . we will see what DeepSeek is doing (i still have hopes for them)"
X Link 2026-02-06T22:52Z 33.7K followers, 12.1K engagements
"OpenRouter token usage is growing 10x a year"
X Link 2026-02-06T23:41Z 33.8K followers, [----] engagements
"@Teknium who's talking about COTs"
X Link 2026-02-07T00:29Z 33.7K followers, [---] engagements
"@tenobrus duuuh you should obviously take out a loan and put it all into polymarket"
X Link 2026-02-07T02:53Z 33.5K followers, [---] engagements
"Step [---] Flash https://huggingface.co/stepfun-ai/Step-3.5-Flash-Int8"
X Link 2026-02-07T10:13Z 33.7K followers, [----] engagements
"@JasonBotterill and the South African"
X Link 2026-02-07T13:31Z 33.6K followers, [--] engagements
"only one city on planet earth can have the mandate all the new AI trillionaires should try to make the city even better build libraries museums parks skyscrapers make public transport free . my entire feed is pro-SF propaganda but its all true. all of it. https://t.co/AjVtESUJUA"
X Link 2026-02-07T13:53Z 33.7K followers, 36.3K engagements
"@teortaxesTex @resona_dev I had the same shizo attack this morning. I thought this was a new model because my scraper notified me. I thought this was a second smaller Step [---] model"
X Link 2026-02-07T15:42Z 33.6K followers, [---] engagements
"is chatgpt actually using smooth brain and wrinkle brain pictograms for thinking efforts lmao can someone explain to me why i wouldn't just pick "extra high" every time is it basically "greater reasoning" = "longer response times" https://t.co/fxz4dAyDvS"
X Link 2026-02-07T16:51Z 33.7K followers, 94.4K engagements
"PI is constantly aura-farming but it's time for them to drop a banger model https://t.co/hAkwJGAwjO"
X Link 2026-02-07T17:15Z 33.7K followers, 15.5K engagements
"@sethsaler no that's glm-5"
X Link 2026-02-07T17:19Z 33.6K followers, [---] engagements
"aaaaand it's broken it no longer shows the tooltip when hovering and when you click on a model it shows the tooltip for a second and then disappears because it reloads the plot Today we're launching a new version of our website. https://t.co/6JqWR29aIC"
X Link 2026-02-07T18:24Z 33.6K followers, [----] engagements
"i found out where patience cave lives THIS NEEDS TO BE INVESTIGATED IMMEDIATELY. https://t.co/sCIQiQxy3w"
X Link 2026-02-07T18:25Z 33.6K followers, [----] engagements
"The good news is that there's an Opus [---] Fast Mode that has 2.5x higher tokens/s. The bad news is that it costs 6x more than the normal mode so $150/million tokens"
X Link 2026-02-07T18:39Z 33.6K followers, 33.8K engagements
"https://code.claude.com/docs/en/fast-mode"
X Link 2026-02-07T18:40Z 33.6K followers, [----] engagements
"@AdityaShips that's just gemini and not antigravity"
X Link 2026-02-07T18:43Z 33.6K followers, [---] engagements
"we are doing better than 1/10 on likes not a good sign Our teams have been building with a 2.5x-faster version of Claude Opus [---]. We're now making it available as an early experiment via Claude Code and our API."
X Link 2026-02-07T19:40Z 33.6K followers, [----] engagements
"Claude [---] Opus now rank [--] in the Design Arena"
X Link 2026-02-07T21:48Z 33.7K followers, [----] engagements
"almost stole his post a couple hours ago I have the exact same screenshot on my phone but thought I should do it on desktop instead (was too lazy to do it) Open models show 2.5x faster 6x more expensive Lower batch size speculative decoding harder Pareto optimal curve for Deepseek at https://t.co/d9dNCumX0I shows this Claude Opus [---] is [---] Tok/s/user Deepseek at [---] is 6k Tok/s/GPU At [---] tok/s/user it's closer to 1k https://t.co/X294HzM3Zo"
X Link 2026-02-08T01:25Z 33.6K followers, [----] engagements
"xAI should go all in on world models it will be very useful when merged with Tesla and Optimus"
X Link 2026-02-08T01:57Z 33.6K followers, 10.4K engagements