@scaling01 Lisan al Gaib

Lisan al Gaib posts on X about anthropic, open ai, ai, and $googl the most. They currently have [------] followers and [---] posts still getting attention, totaling [-------] engagements in the last [--] hours.

Engagements: [-------] #

Engagements Line Chart

Mentions: [---] #

Mentions Line Chart

Followers: [------] #

Followers Line Chart

CreatorRank: [------] #

CreatorRank Line Chart

Social Influence

Social category influence: technology brands #7695, stocks #4244, finance 5.03%, celebrities 2.84%, countries 2.41%, automotive brands 0.88%, social networks 0.88%, travel destinations 0.22%, gaming 0.22%

Social topic influence: anthropic #80, open ai #159, ai #3978, $googl #402, in the #1836, agi #128, inference #13, model #696, the first #1233, xai #145

Top accounts mentioned or mentioned by: @codewithimanshu @teortaxestex @grok @test_tm7873 @teknium @jasonbotterill @kuittinenpetri @kittingercloud @blueemi99 @doctorthe113 @metrevals @feltsteam @thegenioo @lucaploo @_thomasip @bygregorr @chasebrowe32432 @patriot5715 @polynoamial @elonmusk

Top assets mentioned: Alphabet Inc Class A (GOOGL), NVIDIA Corp. (NVDA), Tesla, Inc. (TSLA)

Top Social Posts

Top posts by engagements in the last [--] hours

"The gap is closing. China is catching up. Kimi-K2 Thinking crushes GPT-5 and Claude [---] Sonnet in several benchmarks while costing [--] times less compared to Sonnet It's the best open-source model period Its core focus is on agentic tasks and software development. It can now execute [------] sequential tool calls Moonshot applied Quantization-Aware Training to support native INT4 of Kimi-K2 Thinking K2 Thinking is now even better at writing. It responds more personally and emotionally πŸš€ Hello Kimi K2 Thinking The Open-Source Thinking Agent Model is here. πŸ”Ή SOTA on HLE (44.9%) and BrowseComp"
X Link 2025-11-06T15:06Z 34.6K followers, 1.1M engagements

"@fchollet @Yossi_Dahan_ @polynoamial any ideas already what ARC-AGI-4 will look like just more complex game environments or what's the differentiator to V3"
X Link 2026-02-12T23:31Z 34.6K followers, [----] engagements

""There is no question that dialogue between physicists and LLMs can generate fundamentally new knowledge" - Nathaniel Craig Professor of Physics at the University of California - this "new internal OpenAI model" is described as GPT-5.2 with scaffolding in the blog For decades one specific gluon interaction (single-minus at tree level) was widely treated as having zero amplitude meaning it was assumed not to occur. When an amplitude is zero physicists may ignore it. But this preprint shows that the conclusion is too strong: in a"
X Link 2026-02-13T19:55Z 34.6K followers, 21K engagements

"@teortaxesTex Elon got already one-shotted by human right wing slop imagine what AGI would do to his brain"
X Link 2026-02-14T19:46Z 34.6K followers, [----] engagements

"META JUST KILLED TOKENIZATION A few hours ago they released "Byte Latent Transformer". A tokenizer free architecture that dynamically encodes Bytes into Patches and achieves better inference efficiency and robustness (I was just talking about how we need dynamic tokenization that is learned during training πŸ₯² It's like fucking christmas) I don't want to talk too much about the architecture. But here's a nice visualization from their paper. Let's look at benchmarks instead :) "BLT models can match the performance of tokenization-based models like Llama [--] at scales up to 8B and 4T bytes and can"
X Link 2024-12-13T14:14Z 34.5K followers, 500.1K engagements

"Meta bros are COOOKING First they kill Tokenization. Now they kill word-level thinking with concept level thinking. See how the layers of abstraction are starting to stack Anyone else seeing the abstractions stacking and wondering what's next LLM training has become very meta already. Initially we taught them to predict the next word using simple statistical methods. Then we scaled them enabling them to predict entire sentences paragraphs"
X Link 2024-12-13T14:36Z 34.5K followers, 265.3K engagements

"I can rest nowπŸ₯² I have gathered all the infinity stones. thanks @karpathy"
X Link 2024-12-13T18:33Z 34.5K followers, 127.9K engagements

"Ilya has so much fucking Aura There is no one in AI that comes close. The GOAT says something like "data is the fossil fuel of AI" and everyone instantly agrees"
X Link 2024-12-13T23:29Z 34.5K followers, 250.6K engagements

"GOOGLE BREAKS THE AI PRICING MODEL 🚨 LATEST LMARENA data with STYLE-CONTROL proves it: Gemini [---] Flash delivers TOP-TIER performance at BUDGET pricing* RIP to everyone's profit margins πŸ“‰ ( I mean you OpenAI - stop being greedy with 4o and o1 pricing ) *assuming constant Flash and Pro pricing for Gemini [---] models"
X Link 2024-12-14T13:33Z 34.5K followers, 199.4K engagements

"my AI predictions for 2025: - at least one lab will declare AGI and mentions ASI - Q1: Google Anthropic OpenAI META Qwen and Mistral model fiesta ( it will be heaven ) - agents / computer use takes off - release of Claude [--] Gemini [--] GPT-5 Grok [--] (or whatever they call their giant 5-20 trillion parameter models) - release of o3 o4 and o5 - open-source replication of o3 - the Frontier Math benchmark will be mostly solved (80%) - SWE-bench will be solved (90%) - ARC-AGI [--] will be mostly solved (80%) within [--] months of its release - 10+ million context length models my wishful thinking: Someone"
X Link 2025-01-02T00:09Z 34.5K followers, 836.2K engagements

"it's so cute when people doubt Anthropic everyone can train reasoning models but no one can replicate Sonnet [---] what makes you think Anthropic won't crush o1 or o3"
X Link 2025-01-24T19:52Z 34.5K followers, 172.8K engagements

"Grok-3 API just went live grok-3: Input: $3.00/M - ($5.00/M) Output: $15.00/M - ($25.00/M) normal mode- (faster inference mode) grok-3-mini: Input: $0.30/M - ($0.60/M) Output: $0.50/M - ($4.00/M)"
X Link 2025-04-09T22:44Z 34.5K followers, 182.1K engagements

"OPUS [--] NEW SOTA ON ARC-AGI-2 IT'S HAPPENING - I WAS RIGHT Claude [--] models are the first models that effectively use test-time-compute for ARC-AGI-2"
X Link 2025-05-28T20:04Z 34.5K followers, 186.3K engagements

"A few more observations after replicating the Tower of Hanoi game with their exact prompts: - You need AT LEAST 2^N - [--] moves and the output format requires [--] tokens per move + some constant stuff. - Furthermore the output limit for Sonnet [---] is 128k DeepSeek R1 64K and o3-mini 100k tokens. This includes the reasoning tokens they use before outputting their final answer - all models will have [--] accuracy with more than [--] disks simply because they can not output that much - the max solvable sizes WITHOUT ANY ROOM FOR REASONING (floor(log2(output_limit/10))) DeepSeek: [--] disks Sonnet [---] and
X Link 2025-06-08T18:39Z 34.5K followers, 618.6K engagements
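The Tower of Hanoi post above is really just output-budget arithmetic: a full solution needs 2^N - 1 moves, each costing roughly a fixed number of output tokens, so the largest fully-printable instance follows from the formula the post quotes, floor(log2(output_limit/10)). A minimal sketch of that calculation (the function name and the ~10-tokens-per-move constant are taken from the post's own formula; the disk counts it prints are not asserted here):

```python
import math

def max_solvable_disks(output_limit_tokens: int, tokens_per_move: int = 10) -> int:
    """Largest Tower of Hanoi instance whose full 2^N - 1 move list fits in the
    output budget, leaving no room for reasoning tokens. Uses the post's
    approximation floor(log2(limit / tokens_per_move)), which drops the -1."""
    return math.floor(math.log2(output_limit_tokens / tokens_per_move))

# Output limits quoted in the post (these include reasoning tokens):
limits = {"DeepSeek R1": 64_000, "o3-mini": 100_000, "Claude Sonnet": 128_000}
for model, limit in limits.items():
    print(model, max_solvable_disks(limit))
```

The point of the exercise: past a certain disk count the move list alone exceeds the output window, so accuracy must drop to zero regardless of the model's reasoning ability.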

"HOLY SHIT IT'S FUCKING REAL LET THE PRICE WARS BEGIN OpenAI updated their pricing page. o3 is now cheaper than GPT-4o but more importantly cheaper than Sonnet [--] and Gemini [---] Pro I would cry if I were Anthropic and Google"
X Link 2025-06-10T17:22Z 34.5K followers, 105.7K engagements

"be me Yann LeCun make a useless prediction that "autoregressive LLMs are doomed" show picture that probability goes to [--] wait until studies come out that show that there is a constant non-zero probability of errors that leads to the observed decay in performance claim victory mfw he didn't see the second post in the thread that shows the human data looks exactly the same just scaled I don't wanna say 'I told you so' but I told you so."
X Link 2025-06-17T23:44Z 34.5K followers, 182.8K engagements

"what a joke xAI valued at 200B Anthropic latest valuation was 61.5B xAI revenue 0B Anthropic revenue 4B"
X Link 2025-07-11T20:45Z 34.5K followers, 580.1K engagements

""You're sheltering chinese AI researchers are you not""
X Link 2025-07-23T12:30Z 34.5K followers, 233.8K engagements

"holy shit get ready for a hallucination fiesta with gpt-oss"
X Link 2025-08-05T17:17Z 34.5K followers, 268.9K engagements

"made a little Sankey to show you why I'm fuming ChatGPT Plus before vs after the GPT-5 release"
X Link 2025-08-08T11:30Z 34.5K followers, 291.3K engagements

"GPT-5 scored [--] on the offline IQ Test [--] with vision what"
X Link 2025-08-08T23:08Z 34.5K followers, 155.3K engagements

"OpenAI: "We are going to fix model naming and make it less confusing" also OpenAI:"
X Link 2025-08-09T21:22Z 34.5K followers, 53.5K engagements

"GPT-5 Thinking limit up to 3000/week for Plus Users I thank you all for the participation in the first ChatGPT Plus rebellion It looks like the civil war has ended. We forced an emergency decision. @techikansh trying [----] per week now"
X Link 2025-08-10T18:23Z 34.5K followers, 102K engagements

"BREAKING: Incredible ChatGPT Plus Limit Updates"
X Link 2025-08-10T20:23Z 34.5K followers, 344.7K engagements

"DeepSeek V3.1 beats Claude [--] Opus on Aider Polyglot This makes it the best non-TTC coding model and all of that for $1"
X Link 2025-08-19T19:42Z 34.5K followers, 288.7K engagements

"IMO medalist / Masters or PhD and $45-100/hour lol xAI is hiring like crazy https://t.co/K78mNuyMBz"
X Link 2025-09-15T03:02Z 34.5K followers, 665.2K engagements

"China absolutely dominates the US in advanced technologies and manufacturing"
X Link 2025-10-11T17:11Z 34.5K followers, 363.6K engagements

"Apple just leaked the size of Gemini [--] Pro - 1.2T params"
X Link 2025-11-05T19:48Z 34.5K followers, 318K engagements

"Kimi-K2 Thinking βœ… GPT-5.1 Gemini [---] Pro Preview Claude [---] Sonnet and Opus Arcee/PI Trinity models DeepSeek-V4"
X Link 2025-11-06T17:43Z 34.5K followers, 89.1K engagements

"Kimi-K2 apparently cost $4.6M to train but GPT-5 is obviously better"
X Link 2025-11-07T18:40Z 34.5K followers, 184.9K engagements

"Grok-5 started training and is now officially postponed to Q1 [----]. Musk also said it will be "the smartest AI in the world by a significant margin in every metric without exception" Elon gives Grok-5 a 10% chance of being AGI"
X Link 2025-11-14T22:05Z 34.5K followers, 288K engagements

"Huge Leaks on Grok-5 and its predecessors from recent Elon Musk interview: - "Grok-5 is a [--] trillion parameter model whereas Grok [--] and [--] are based on a [--] trillion parameter model" - "the [--] trillion parameters will have a much higher intelligence density per gigabyte than Grok 4" - Grok-5 "inherently multimodal so it's text pictures video audio" Grok-5 started training and is now officially postponed to Q1 [----]. Musk also said it will be "the smartest AI in the world by a significant margin in every metric without exception" Elon gives Grok-5 a 10% chance of being AGI"
X Link 2025-11-14T22:18Z 34.5K followers, 346K engagements

"GPT-5.1 Codex beats Sonnet [---] Thinking on SWE-Bench while being [--] times cheaper ouch Results are in for GPT [---] codex It's #1 on SWE Bench and has similar performance to its predecessor on Terminal Bench and LiveCodeBench. https://t.co/9LhhnCgWxf"
X Link 2025-11-15T00:46Z 34.5K followers, 286.7K engagements

"the bitter pill is that Nolan's last great movie was Interstellar and that the Dune trilogy will likely be the greatest trilogy since LOTR The two most anticipated films of [----] https://t.co/XkqrUkE1A3"
X Link 2025-12-09T16:06Z 33.5K followers, 714.8K engagements

"my girlfriend claudia told me there is a good chance that they will release Claude-5 earlier than expected absolutely insane how hard anthropic cooked. wonder what they have going on internally"
X Link 2026-01-05T23:11Z 33.6K followers, 13.5K engagements

"Opus [---] feels brain damaged today anyone else"
X Link 2026-01-13T17:16Z 34.1K followers, 73.6K engagements

"they will IPO at like $500B and ship Claude-5 at the same time within a week they will be $1T market cap LFG the thinking machines situation is an unintentional tell about anthropic where not a single cofounder or high salience figure has visibly departed. there have been zero dramatic exits. yes anthropic liquidity has been low but vesting has no impact on that. regardless if you"
X Link 2026-01-15T21:59Z 34.1K followers, 51.5K engagements

"Anthropic is preparing for the singularity I'm starting to get worried. Did Anthropic solve continual learning Is that the preparation for evolving agents https://t.co/pcCoSM4gAr"
X Link 2026-01-21T16:16Z 33.6K followers, 542.4K engagements

"the anthropic challenge goes a bit over my head because I'm not a 5head and have never done anything like it before but there is a nice doc that explains it it helped: nonetheless it still took me like 1-2 hours just to understand the whole setup all the operations and the planning then I just told chatti what to do (probably saved me a decent amount of time too): [-----] cycles baseline with packing / using as many available slots for ALU and LOAD instead of just [--] slot per cycle (with [---] streams) [----] cycles with naive valu and vload usage (16 streams * VLEN 8) [----] cycles by keeping each"
X Link 2026-01-22T23:48Z 34.1K followers, 34.2K engagements

"we are in the intelligence explosion and this guy is still dooming and moving goal-posts Yann LeCun says absolutely none of the humanoid companies have any idea how to make those robots smart enough to be useful. https://t.co/z0wYtXwcf8"
X Link 2026-01-26T20:58Z 33.6K followers, 13.8K engagements

"Kimi is still the most usable open-weights model Moonshot is honestly the Anthropic of China. A focus on taste and agentic behaviour. πŸ₯ Meet Kimi K2.5 Open-Source Visual Agentic Intelligence. πŸ”Ή Global SOTA on Agentic Benchmarks: HLE full set (50.2%) BrowseComp (74.9%) πŸ”Ή Open-source SOTA on Vision and Coding: MMMU Pro (78.5%) VideoMMMU (86.6%) SWE-bench Verified (76.8%) πŸ”Ή Code with Taste: turn chats https://t.co/wp6JZS47bN"
X Link 2026-01-27T09:51Z 34K followers, 55.4K engagements

"who could've seen that coming so Grok [---] in March and Grok [--] in July got it"
X Link 2026-01-30T09:30Z 33.6K followers, 22.1K engagements

"GPT-5.2-xhigh Opus [---] Kimi K2.5 Gemini [--] Pro Preview"
X Link 2026-01-30T13:19Z 33.5K followers, 51K engagements

"Kimi [---] not beating GLM-4.7 on VendingBench-2 is interesting Kimi K2.5 on Vending-Bench [--]. Once again it matters which API you use. It makes twice as much money when using @Kimi_Moonshot official API compared to @FireworksAI_HQ. 2nd best open source model. https://t.co/at3FP2yJAe"
X Link 2026-01-31T02:08Z 33.5K followers, [----] engagements

"Google Team is confident for the Gemini [--] GA release next month"
X Link 2026-01-31T02:50Z 33.6K followers, 67.1K engagements

"omg silver just crashed 40% intra-day mfw the price is where it was [--] weeks ago this market is honestly crazy silver trades more shitcoiny then actual crypto shitcoins"
X Link 2026-01-31T02:56Z 33.5K followers, [----] engagements

""15% chance of OpenAI going bankrupt" my prediction doesn't sound so stupid now if the biggest player suddenly pulls out others might follow This is the biggest AI headline in a very long time: Nvidia's plan to invest $100 billion in OpenAI has completely "stalled" seemingly overnight. Why Jensen Huang specifically cited concerns over competition from Google and Anthropic and a "lack of discipline" in OpenAI's https://t.co/dLiXjEcp3x"
X Link 2026-01-31T14:11Z 33.5K followers, 12.9K engagements

"I fear they will get mogged immediately by GPT-5.3 and new Sonnet [---] / [---] Google Team is confident for the Gemini [--] GA release next month https://t.co/HSVzCyQe7h"
X Link 2026-01-31T16:53Z 33.7K followers, 44.4K engagements

"February will be fucking insane in terms of model launches probably even more than last November and that was the best model launching month we have ever seen"
X Link 2026-01-31T16:54Z 33.6K followers, 10.7K engagements

"Google is not a serious company when their "frontier" model is a preview half of the year"
X Link 2026-01-31T19:48Z 33.7K followers, 106.7K engagements

"billionaires are murdering torturing and raping children without repercussions but you are mad about some pronouns lmao BREAKING: Deputy Attorney General Todd Blanche just admitted the DOJ excluded images showing death physical abuse or injury from today's Epstein files release. Let that sink in. The government is acknowledging graphic evidence exists and chose to withhold it while https://t.co/gGrUAfKR2Y"
X Link 2026-01-31T19:54Z 33.6K followers, 13.9K engagements

"upscaling is sick"
X Link 2026-01-31T22:54Z 33.7K followers, [----] engagements

"Nathan Lambert and Sebastian Raschka on Lex's podcast Here's my conversation all about AI in [----] including technical breakthroughs scaling laws closed & open LLMs programming & dev tooling (Claude Code Cursor etc) China vs US competition training pipeline details (pre- mid- post-training) rapid evolution of LLMs work https://t.co/AeGxRWjJF6"
X Link 2026-01-31T23:14Z 33.7K followers, [----] engagements

"I made a comment [--] months ago and I still think it's true: Open-weight models are catching up on benchmarks and slowly make their way to this magical Opus [---] threshold of reliable vibe-coding. A lot of recent progress has been on coding and the typical example of this is "create a beautiful website". But this feels very slopmaxxy to me similar to how Llama-3 or Llama-4 models topped the lmarena leaderboards back in the days. But this time we aren't tricked by sycophancy and markdown but by beautiful graphics. I feel like open-weight models are falling behind on reasoning. The thing that"
X Link 2026-02-01T01:13Z 33.6K followers, 30.1K engagements

"I understand but I think you are too careful because of backlash in the past. Calling your models -exp and -preview just seems like to hedge that risk. I think you should simply release more checkpoints and call them Gemini [---] [---] and so on like OpenAI Anthropic and DeepSeek are. https://twitter.com/i/web/status/2017769207207776315"
X Link 2026-02-01T01:17Z 33.7K followers, 12.2K engagements

"Opus [---] can basically do everything that normies want and open-weight models are approaching this level fast. It can be your companion it can do your homework it can browse it can write all your emails it can manage stuff for you it can vibe-code everything . but i don't see a lot of progress on the reasoning side. I feel like OpenAI Google and Anthropic simply have too many ressources for open-weight labs to catch up right now where everything revolves around RL environments. I made a comment [--] months ago and I still think it's true: Open-weight models are catching up on benchmarks and"
X Link 2026-02-01T01:29Z 33.5K followers, 22.9K engagements

"Nathan is great he's like me bit autistic and happy by simply talking about AI I'm enjoying it so far"
X Link 2026-02-01T14:53Z 33.7K followers, [----] engagements

"sell-outs everywhere"
X Link 2026-02-01T17:17Z 33.5K followers, [----] engagements

"suddenly everyone is an insider that has already used sonnet [--] gpt-5.3 and gemini [--] pro ga"
X Link 2026-02-01T18:33Z 33.6K followers, 101.6K engagements

"It's been almost three years since GPT-4 launched Are today's models better or worse than you thought they'd be by now better worse dunno i don't think much as expected"
X Link 2026-02-01T23:27Z 33.6K followers, 11.6K engagements

"You don't understand how ridiculously large the gap between GPT-4o and the models of today is METR time horizons double every [----] days GPT-4o was released [---] days ago This means we had over [--] doublings since then. or a 128x in time horizon people claim a study found no efficiency gains using LLM for coding look inside participants used 4o in chat sidebar every time https://t.co/hVZh8XhuJG"
X Link 2026-02-02T15:20Z 34.1K followers, 89.6K engagements
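The post above is a simple exponential-growth claim: if METR-style time horizons double every fixed number of days, the growth factor is 2 raised to the number of elapsed doubling periods. A minimal sketch with hypothetical inputs (the actual doubling period and release age are redacted above; the illustration just shows that seven doublings yields the 128x the post cites, since 2^7 = 128):

```python
def time_horizon_growth(days_elapsed: float, doubling_period_days: float) -> float:
    """Factor by which the time horizon has grown, assuming it doubles
    every `doubling_period_days` days."""
    return 2 ** (days_elapsed / doubling_period_days)

# Hypothetical numbers for illustration only: 7 doubling periods of 210 days.
print(time_horizon_growth(days_elapsed=7 * 210, doubling_period_days=210))
```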

"it's a bit ridiculous saying Andrej invented vibe coding when he posted this in Feb [----] the concept existed way before that but he may have popularized the name There's a new kind of coding I call "vibe coding" where you fully give in to the vibes embrace exponentials and forget that the code even exists. It's possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good. Also I just talk to Composer with SuperWhisper"
X Link 2026-02-02T17:48Z 33.6K followers, [----] engagements

"I would actually like to see Sonnet [--] being cheaper than $15/million $10 would make me happy but i don't think it will happen they will squeeze us for another year or so"
X Link 2026-02-02T22:52Z 33.7K followers, [----] engagements

"Can I say that I work for a rocket and AI company now"
X Link 2026-02-02T22:58Z 33.7K followers, [----] engagements

"be me OpenAI know hardware limitations like memory bandwidth and compute of Nvidia GPUs spend m/billions of R&D and carefully designing and training new model choose to ignore hardware constraints release xhigh model that thinks much longer than typical models to get same performance as Anthropic model users complain model takes too long to respond blame Nvidia that their business model isn't working out $NVDA - OPENAI UNSATISFIED WITH SPEED AT WHICH NVIDIAS HARDWARE CAN SPIT OUT ANSWERS TO CHATGPT USERS FOR COMPLEX PROBLEMS -SOURCES"
X Link 2026-02-02T23:52Z 33.7K followers, 23.7K engagements

"Anthropic image gen @AndrewCurran_ @repligate @anthrupad πŸ‘€ https://t.co/UdDPN5yrCV"
X Link 2026-02-02T23:58Z 33.7K followers, [----] engagements

"one wonders why codex is suddenly free for a month or two We aren't talking enough just how much AI in coding has accelerated in the last month alone. https://t.co/4I22viJOl5"
X Link 2026-02-03T02:24Z 33.8K followers, 23.9K engagements

"is it th1.1 is supposedly the more accurate version and shouldn't we be looking at = [----] models because of reasoning models but anyways doesn't really change the argument that GPT-4o is an old brick https://twitter.com/i/web/status/2018529011433926755"
X Link 2026-02-03T03:36Z 33.7K followers, [----] engagements

"Saint Dario the Wise May he bless us on this beautiful day"
X Link 2026-02-03T12:52Z 34.1K followers, 25K engagements

"so is every paper this year just going to be some kind of self-play/-distillation/-improvement/-whatever"
X Link 2026-02-03T13:22Z 34.1K followers, 13.3K engagements

"then please release GPT-OSS-2 Sam Altman: 'I think there will be increasing demands for locally running private models.'"
X Link 2026-02-03T18:20Z 33.6K followers, [----] engagements

"Gemini [--] Pro new SOTA on METR 80% time horizon (barely) We've started to measure time horizons for recent models using our updated methodology. On this expanded suite of software tasks we estimate that Gemini [--] Pro has a 50%-time-horizon of around [--] hrs (95% CI of [--] hr [--] mins to [--] hrs [--] mins). https://t.co/FbpzO7Tq3L"
X Link 2026-02-03T18:35Z 34.1K followers, 29.4K engagements

"fuck all the people who said sonnet [--] is definitely coming today"
X Link 2026-02-03T20:07Z 34.1K followers, 65.1K engagements

"@spellswordaf no anthropic employee will tell you anything everyone who claims to be one is fake"
X Link 2026-02-03T20:13Z 33.5K followers, [----] engagements

"CL-Bench - tests whether LMs can learn new knowledge from context and apply it correctly - all information needed to solve its tasks is provided explicitly within the context itself - context learning remains a significant challenge "At inference time they LLMs function largely by recalling this static internal memory rather than actively learning from new information provided in the moment." scores are rough given that all information to solve the tasks is in context What if giving an AI the answer key still isn't enough for it to solve the problem New research from Tencent's Hunyuan team &"
X Link 2026-02-03T22:44Z 33.6K followers, [----] engagements

"currently there's the LLM Poker Tournament going on over at Kaggle turns out they are hallucinating constantly and are mostly gambling like GPT-5.2 is playing 100% of hands no pre-flop folds"
X Link 2026-02-04T00:02Z 33.5K followers, [----] engagements

"OpenAI halves the reasoning effort for all subscribers and gives the compute straight to API customers where they can now milk 40% more money GPT-5.2 and GPT-5.2-Codex are now 40% faster. We have optimized our inference stack for all API customers. Same model. Same weights. Lower latency."
X Link 2026-02-04T00:12Z 34.1K followers, 158.3K engagements

"Arcee AI going for a $200 million funding round to build a 1T+ parameter model"
X Link 2026-02-04T00:19Z 34.1K followers, [----] engagements

"https://www.forbes.com/sites/annatong/2026/02/02/the-top-open-ai-models-are-chinese-arcee-ai-thinks-thats-a-problem/"
X Link 2026-02-04T00:23Z 33.5K followers, [----] engagements

"and they continue posting about Claude Psychosis [--] without any proof whatsoever fuck all the people who said sonnet [--] is definitely coming today"
X Link 2026-02-04T11:55Z 34.1K followers, [----] engagements

"that's what I mean with reasoning gap between open and closed models The new Qwen coding model is 10x more expensive than GPT-OSS-20B but same score Qwen [--] coder next (80b3a) scores 34.4% on WeirdML which is pretty good for its size especially for a non-reasoning model. Probably a good choice for agentic coding if you need a small local model. https://t.co/2ynogD0yLy"
X Link 2026-02-04T13:05Z 33.6K followers, 11.5K engagements

"If you believe Anthropic is dropping Sonnet [--] and Opus [---] you might as well believe that Santa Claude is real"
X Link 2026-02-04T13:17Z 34.2K followers, 20.6K engagements

"and here we go again the clickbait breaking bullshit "sonnet coming today" and nothing again just blocking everyone idc"
X Link 2026-02-04T14:17Z 34.1K followers, 22.4K engagements

"Anthropic is posting videos about "roaring cougars" and you are laughing Ads are coming to AI. But not to Claude. Keep thinking. https://t.co/n2yECeBWyT"
X Link 2026-02-04T15:34Z 34.2K followers, [----] engagements

"Anthropic is mocking OpenAI for introducing ads (rightfully so)"
X Link 2026-02-04T15:39Z 34.1K followers, 31.4K engagements

"The first 1T param model with FoPE but there's nothing about it in the tech report . arrrrghhhh"
X Link 2026-02-04T15:58Z 33.7K followers, 11.7K engagements

"https://huggingface.co/internlm/Intern-S1-Pro"
X Link 2026-02-04T15:58Z 34K followers, [----] engagements

"have you heard the latest rumor Sonnet [--] DEFINITELY coming tomorrow this time FOR SURE (if not probably the day after) I'm an Anthropic leaker and have spoken with Pope Dario personally (please take me serious and buy me a coffee for giving you these exclusive news) https://t.co/Ahi4k3Dzwo"
X Link 2026-02-04T16:33Z 34.2K followers, 10.7K engagements

"As long as game playthroughs look like this we can be certain that it's not AGI. It had been standing there "thinking" for several minutes before I started recording"
X Link 2026-02-04T19:02Z 34.1K followers, [----] engagements

"and yet startups are choosing Anthropic over OpenAI but I guess OpenAI is where they want to be recognized by the super bowl consumer hivemind slop crowd apropos of nothing your reminder that anthropic has the same level of name recognition among superbowl viewers as literally fictional companies"
X Link 2026-02-04T19:10Z 33.7K followers, 11.5K engagements

"OpenAI doesn't understand that they have already lost the cultural war"
X Link 2026-02-04T19:19Z 34.1K followers, 16.2K engagements

"OpenAI does what it does best again. Making empty promises about their "ads principles". You I and even the toddler next door all know that as soon as investors slightly nudge you you will give in and falter like a house of cards. OpenAI has demonstrated this time and time again. That you choose profits and weakening of the non profit every time. First you will say "no ads for children no ads in private chats or in chats about mental health and only for free and go users". Next week you will say " actually everyone will see ads". The week after "ads are now included in the Plus tier". This"
X Link 2026-02-04T21:29Z 34.1K followers, 19.5K engagements

"Anthropic the authoritarian company whose board couldn't even topple the emperor during one of their coups. Anthropic the authoritarian company whose leadership is one of the largest supporters of the Trump administration. Anthropic the company for all people that just halved thinking limits across all subscription tiers. Anthropic the company for rich people that just raised prices on the most expensive model in the market. Except that all of this isn't about Anthropic but OpenAI. First the good part of the Anthropic ads: they are funny and I laughed. But I wonder why Anthropic would go for"
X Link 2026-02-04T21:42Z 33.7K followers, 42K engagements

"NEW METR SOTA: GPT-5.2-high (not xhigh) at [--] hours [--] minutes beating Opus 4.5"
X Link 2026-02-04T22:05Z 34.1K followers, 32.9K engagements

"NEW METR 80% SOTA: GPT-5.2-high at [--] minutes The first model to break away from the GPT-5.1-Codex Max Gemini [--] Pro and Opus [---] group NEW METR SOTA: GPT-5.2-high (not xhigh) at [--] hours [--] minutes beating Opus [---] https://t.co/NxrqBSctFN"
X Link 2026-02-04T22:06Z 33.8K followers, 14.8K engagements

"@METR_Evals why are you not testing xhigh or Pro i thought this is about frontier performance"
X Link 2026-02-04T22:27Z 34.1K followers, 14.1K engagements

"and why are costs no longer reported can't show GPT-5.2 being 10x more expensive GPT-5.2-high took [--] TIMES LONGER than Claude [---] Opus to complete the METR benchmark suite https://t.co/RlZUm4iulm"
X Link 2026-02-04T22:29Z 33.8K followers, [----] engagements

"I need a trusted adult from METR to hold my hand and explain the working time to me. Like surely that's not right Can you compare working times Otherwise this is absolutely dooming for OpenAI. We estimate that GPT-5.2 with high (not xhigh) reasoning effort has a 50%-time-horizon of around [---] hrs (95% CI of [--] hr [--] min to [--] hr [--] min) on our expanded suite of software tasks. This is the highest estimate for a time horizon measurement we have reported to date. https://t.co/USkHNuFexc"
X Link 2026-02-04T22:46Z 34.1K followers, 30.7K engagements

"GPT-5.2 has won the Poker Arena on Kaggle"
X Link 2026-02-04T23:37Z 34K followers, [----] engagements

"sam is a bad ceo and should just retire along with greggy he's too emotional and people hate him First the good part of the Anthropic ads: they are funny and I laughed. But I wonder why Anthropic would go for something so clearly dishonest. Our most important principle for ads says that we wont do exactly this; we would obviously never run ads in the way Anthropic"
X Link 2026-02-04T23:46Z 34.1K followers, [----] engagements

"First the good part of the Anthropic ads: they are funny and I laughed. But I wonder why Anthropic would go for something so clearly dishonest. Our most important principle for ads says that we wont do exactly this; we would obviously never run ads in the way Anthropic"
X Link 2026-02-05T00:39Z 33.7K followers, 10.1K engagements

"GPT-5.3 soonish probably in the next [--] weeks"
X Link 2026-02-05T01:30Z 34.2K followers, 23.7K engagements

"Some more thoughts on METR results: - GPT-5.2-high is where I would have intuitively ranked it based on codex vibes (better than Opus but significantly slower although it's not the 26x as reported in their data. I guess it's closer to 3-5x slower) - notice GPT-5.2 was launched on Dec 11th and it got [---] hours - I expect current frontier models in labs to be around the [--] hour mark right now and by the end of the year between 24-48 hours probably 30something hours NEW METR SOTA: GPT-5.2-high (not xhigh) at [--] hours [--] minutes beating Opus [---] https://t.co/NxrqBSctFN"
X Link 2026-02-05T02:26Z 34.1K followers, [----] engagements

"We could see [--] day time horizons by the end of [----]. GPT-5.2 was launched on Dec 11th with [---] hours time horizon. Time horizons double every [---] days (which I think is conservative). Until [----] there are [---] days or [--------] doubling periods which translates to an 8.087x in time horizons or [-----] hours. We estimate that GPT-5.2 with high (not xhigh) reasoning effort has a 50%-time-horizon of around [---] hrs (95% CI of [--] hr [--] min to [--] hr [--] min) on our expanded suite of software tasks. This is the highest estimate for a time horizon measurement we have reported to date."
X Link 2026-02-05T02:55Z 34.3K followers, 16.6K engagements
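The extrapolation in the post above is straight doubling-period arithmetic, and can be sketched in a few lines. The concrete numbers below are illustrative placeholders, not the elided figures from the post:

```python
# Sketch of the time-horizon extrapolation described in the post above.
# ASSUMPTION: the horizon grows exponentially, doubling every fixed period.
def horizon_multiplier(days_remaining: float, doubling_period_days: float) -> float:
    """Growth factor if the time horizon doubles every `doubling_period_days`."""
    return 2 ** (days_remaining / doubling_period_days)

# Example: 360 remaining days with a (hypothetical) 120-day doubling period
# gives exactly three doubling periods, i.e. an 8x multiplier.
print(horizon_multiplier(360, 120))  # 8.0
```

The post's "8.087x" corresponds to slightly more than three doubling periods under the same formula.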

"@LeviTurk http://perplexity.ai/rest/models/config"
X Link 2026-02-05T15:30Z 33.6K followers, [----] engagements

"Dwarkesh x Elon is out https://www.youtube.com/watch?v=BYXbuik3dgA"
X Link 2026-02-05T17:06Z 34.1K followers, 19.5K engagements

"Elon Musk: "5 years from now we will launch every year more AI in space than the cumulative total on earth. I'd expect at least a few hundred GW of AI in space." Dwarkesh x Elon is out https://t.co/ygWlEdUAeo"
X Link 2026-02-05T17:24Z 34K followers, [----] engagements

"Opus [---] live in Anthropic Console"
X Link 2026-02-05T17:31Z 34.3K followers, [----] engagements

"Claude [---] Opus GDPval scores"
X Link 2026-02-05T17:44Z 33.6K followers, [----] engagements

"Claude [---] Opus Benchmarks"
X Link 2026-02-05T17:44Z 33.6K followers, [----] engagements

"Claude [---] Opus still with the best SVG results out of all models just incredibly high taste Opus [---] (non-thinking) is by far the best model to ever create SVGs https://t.co/irFMtkPxP6"
X Link 2026-02-05T17:50Z 33.6K followers, 33.5K engagements

"Claude [---] Opus System Card"
X Link 2026-02-05T17:51Z 34.2K followers, 18K engagements

"Opus [---] is now better than human experts at reading and interpreting scientific charts"
X Link 2026-02-05T17:54Z 34.3K followers, 42.3K engagements

"Opus [---] new SOTA on DeepSearchQA"
X Link 2026-02-05T17:57Z 34K followers, [----] engagements

"Opus [---] slightly more prone to prompt injections than Opus 4.5"
X Link 2026-02-05T18:04Z 33.7K followers, [----] engagements

"Opus [---] seems to hallucinate more"
X Link 2026-02-05T18:07Z 34.1K followers, 15K engagements

"Claude [---] Opus achieves a [---] speedup on kernel optimization over the baseline using a novel scaffold far exceeding the 300x threshold for [--] human-expert-hours of work"
X Link 2026-02-05T18:15Z 33.8K followers, 18.9K engagements

"GPT-5.3 Codex absolutely demolished Opus [---] (65.4%) on Terminal Bench [--] just minutes after its launch"
X Link 2026-02-05T18:24Z 34.1K followers, 66.8K engagements

"OpenAI: "GPT5.3Codex is our first model that was instrumental in creating itself.""
X Link 2026-02-05T18:30Z 33.9K followers, [----] engagements

"OpenAI should be able to take back the coding crown with the massively improved reasoning efficiency. Speed was the only concern. Now it might be resolved with faster inference + better reasoning efficiency"
X Link 2026-02-05T18:41Z 34.5K followers, [----] engagements

"GPT-5.3-Codex SVG results are good but Opus [---] is still noticeably better Claude [---] Opus still with the best SVG results out of all models just incredibly high taste https://t.co/RHGblMNusd"
X Link 2026-02-05T18:47Z 34K followers, [----] engagements

"LiveBench results for Opus [---] Opus [---] has the highest ever Reasoning Score and by a big margin"
X Link 2026-02-05T18:58Z 34.2K followers, 11.9K engagements

"Here we go again"
X Link 2026-02-05T18:59Z 34.1K followers, 33K engagements

"SemiAnalysis: "It [Claude Code] is set to drive exceptional revenue growth for Anthropic in [----] enabling the lab to dramatically outgrow OpenAI." Claude Code is the Inflection Point What It Is How We Use It Industry Repercussions Microsoft's Dilemma Why Anthropic Is Winning. https://t.co/VIuF5Qohf5"
X Link 2026-02-05T19:11Z 33.6K followers, 27.5K engagements

"Nobody believed me when I said ARC-AGI-2 would fall fast"
X Link 2026-02-05T19:17Z 34.4K followers, [----] engagements

"GPT-5.3-Codex-xhigh used [----] times fewer tokens than GPT-5.2-Codex-xhigh on SWE-Bench-Pro together with the 40% boost in inference speeds this means it's 2.93x faster (while scoring 1% higher)"
X Link 2026-02-05T19:25Z 33.7K followers, 35.4K engagements
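The "2.93x faster" claim above combines two multiplicative effects. A minimal sketch, assuming end-to-end generation time scales with token count divided by throughput (the function and the example numbers are illustrative, not OpenAI's):

```python
# Combined speedup when a model emits fewer tokens AND inference runs faster.
# ASSUMPTION: generation time ~ tokens / throughput, so the two factors multiply.
def combined_speedup(token_reduction: float, throughput_gain: float) -> float:
    """token_reduction: how many times fewer tokens are emitted.
    throughput_gain: tokens-per-second multiplier (1.4 = 40% faster)."""
    return token_reduction * throughput_gain

# e.g. a hypothetical 2x token reduction with a 40% faster inference stack:
print(combined_speedup(2.0, 1.4))  # 2.8
```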

"Lisan al Gaib as featured in TBPN"
X Link 2026-02-05T19:30Z 34.1K followers, [----] engagements

"Now what do we make of the Sonnet [--] leaks Is Anthropic sandbagging with Opus 4.6"
X Link 2026-02-05T19:51Z 34.1K followers, 16.3K engagements

"@AndrewCurran_ so is Opus [--] also ready then is Sonnet [--] stronger than Opus 4.6"
X Link 2026-02-05T19:55Z 34K followers, [----] engagements

"We are accelerating towards a permanent underclass"
X Link 2026-02-05T19:59Z 34.3K followers, [----] engagements

"We might see the full GPT-5.3 in [--] weeks as the Codex upgrade announcement has an end date"
X Link 2026-02-05T20:14Z 34.3K followers, 11.8K engagements

"@heyruchir Codex is a year old and they have plenty of data through their chat and API. They were never at risk. DeepSeek and Kimi have plenty of OpenRouter data + Kimi has Kimi Code. Not comparable to xAI and Meta. Although xAI has been farming data on OpenRouter too"
X Link 2026-02-05T21:04Z 34K followers, 13.4K engagements

"I don't know how we went from Gemini [--] Pro leap-frogged everyone to Gemini is cooked in like [--] months Gemini [--] Pro is in trouble lmao"
X Link 2026-02-05T21:58Z 33.6K followers, 43.8K engagements

"today was a good day"
X Link 2026-02-05T23:27Z 34.1K followers, [----] engagements

"we might be getting scammed by Anthropic: "we speculate this is a smaller model (maybe Sonnet-ish) that runs thinking for longer" The headline is Opus [---] scores 69% for $3.50/task on ARC v2. This up +30pp from Opus [---]. We attribute performance to the new "max" mode and 2X reasoning token budget -- notably task cost is held steady. Based on early field reports and other benchmark scores like SWE Bench"
X Link 2026-02-06T00:42Z 34.2K followers, 51.9K engagements

"but whatever using GPT-5.3-Codex for everything now that inference speed and reasoning efficiency is improved + it's cheaper and Opus only for frontend design we might be getting scammed by Anthropic: "we speculate this is a smaller model (maybe Sonnet-ish) that runs thinking for longer""
X Link 2026-02-06T00:54Z 33.6K followers, 10.7K engagements

"Opus [---] ranking 2nd in SimpleBench 5.6% higher than Opus 4.5"
X Link 2026-02-06T08:51Z 34K followers, [----] engagements

"Opus [---] can officially be called a connoisseur. Its taste has no bounds. It runs laps around all other models in EQ-Bench and both Creative Writing Benchmarks Opus [---] dominated. https://t.co/BsbZX1igRj"
X Link 2026-02-06T11:55Z 34.4K followers, 32.5K engagements

"Opus is a certified gamer Opus [---] got a new high score by reaching round [--] compared to Opus [---] which barely made it to round [--] https://t.co/cS2yibinc8"
X Link 2026-02-06T12:32Z 34.1K followers, [----] engagements

"No it's hallucinating and thinking of Claude-1 which was 52B Opus [---] confirmed for 52B model https://t.co/XmfDtWlTjJ"
X Link 2026-02-06T12:53Z 34K followers, 24.7K engagements

"Kimi-K2.5 is much better than other open-source models at optimization problems (like routing or scheduling) and almost on par with GPT-5.2-high Kimi-K2.5 needs [--] self-refinement steps to reach the same performance as GPT-5.2-high with [--] step Ale-Bench: https://sakanaai.github.io/ALE-Bench-Leaderboard/"
X Link 2026-02-06T14:42Z 34.1K followers, [----] engagements

"unfortunately it still gets dominated by GPT-5-mini (green) and Gemini [--] Flash (blue)"
X Link 2026-02-06T14:45Z 34.1K followers, [----] engagements

"why are we spreading unconfirmed shit again without any proof DeepSeek V4 has 1.5T(1500B) param. If this is true it could be another seismic shift in the AI landscape for Silicon Valley and the whole world🤯🙀😻"
X Link 2026-02-06T15:23Z 33.7K followers, [----] engagements

"@GlobalUpdates24 guess who commits suicide in the next [--] days"
X Link 2026-02-06T15:24Z 34.1K followers, [----] engagements

"@garyfung Are you regarded sir What do you think X is And scraping GitHub for coding data is such a [----] thing to do. Gets you about 30% on human eval LMAO"
X Link 2026-02-06T15:51Z 33.5K followers, [----] engagements

"@elonmusk i certainly hope I'm wrong the more players the better"
X Link 2026-02-06T16:13Z 33.7K followers, 107.7K engagements

"The non-thinking mode of Claude [---] Opus is now even more efficient Opus [---] is now #1 on the Artificial Analysis Leaderboard https://t.co/pUKplcCZoy"
X Link 2026-02-06T16:43Z 33.5K followers, [----] engagements

"Elon says Anthropic is doing a good job with mechanistic interpretability and sees it as the way towards better less reward-hacky models. He says he wants an AI debugger"
X Link 2026-02-06T16:47Z 34.2K followers, 14.7K engagements

"Claude [---] Opus ranks 1st on scBench a benchmark for RNA-seq analysis tasks"
X Link 2026-02-06T17:14Z 34.4K followers, [----] engagements

"Anthropic saw the same thing so they decided to sprinkle in some math environments. Little did they know that there's more to GPT-5.2-xhigh's reasoning than math. Honestly curious how this will continue. I was worried that they would be behind by a few months. Unless they are sandbagging with Opus [---] this seems to be true. I like that the current frontier models are polar opposites it makes their use-cases and strengths pretty obvious GPT-5.2 = Exploration - the reason why xhigh and Pro are so damn good Opus [---] = Exploitation - the reason why Anthropic don't need many tokens and reasoning I"
X Link 2026-02-06T17:32Z 33.6K followers, 12.9K engagements

"my favorite conspiracy theory is that Elon paid someone to create the 4o bots to slow down OpenAI"
X Link 2026-02-06T17:49Z 34.1K followers, [----] engagements

"Claude [---] Opus is the Creative Writing GOAT Claude Opus [---] is the new Short-Story Creative Writing champion 🏆 Opus [---] Thinking 16K scores [----] significantly improved over Opus [---] Thinking 16K (8.20). DeepSeek V3.2 scores [----] (DeepSeek V3.2 Exp scored 7.16). https://t.co/WOmX7U7ptH"
X Link 2026-02-06T18:10Z 34.1K followers, [----] engagements

"why i think it's GLM-5: - it says it's GLM from zai - it denies this certain event from [----] that totally didn't happen in china and spews propaganda shit (so definitely chinese model) - inference is slow as fuck if it was OpenAI or xAI as usual it would be much faster inference https://twitter.com/i/web/status/2019837936917729501"
X Link 2026-02-06T18:17Z 34.1K followers, [----] engagements

"(they probably just used a shitton of synthetic Opus [---] data. but i don't care if it's open-source)"
X Link 2026-02-06T18:19Z 33.5K followers, [----] engagements

"Yup Pony Alpha (GLM-5) is literally a distilled Opus [---] Opus [---] (non-thinking) is by far the best model to ever create SVGs https://t.co/irFMtkPxP6"
X Link 2026-02-06T18:23Z 33.8K followers, 20.9K engagements

"This shit gets millions of views reposted by russian propaganda bots and is completely fabricated. Trump never posted this. BIGGEST. BULL. RUN. EVER. STARTING. NOW. https://t.co/yBU6RKfD8I"
X Link 2026-02-06T18:33Z 33.5K followers, [----] engagements

"Claude [---] Opus #1 on lmarena for text coding and expert questions"
X Link 2026-02-06T18:40Z 33.8K followers, [----] engagements

"Opus [---] in 3rd place on WeirdML WITHOUT thinking It looks like without the number patterns tasks it would've been in first place. Thinking results will come later. claude-opus-4.6 (no thinking) scores 65.9% on WeirdML up from 63.7% for opus-4.5. Very interesting model it is really great but very different from other models For example the low score for the "number_patterns" task is from the fact that it several times uses all its [--] https://t.co/rVeGN0ke99"
X Link 2026-02-06T18:43Z 34K followers, [----] engagements

"okay hear me out: how about gemini [---] that obliterates codex and opus IN REAL-WORLD TASKS"
X Link 2026-02-06T18:48Z 34K followers, [----] engagements

"There is no GPT-5.3-Codex API. So no benchmarking. No it's not rushed. It's strategy imo They want to push their Codex usage up. @scaling01 dumb q probably. but why is opus [---] popping up in all the benchmarks but 5.3-codex is not yet Does that imply that they rushed [---] release up"
X Link 2026-02-06T19:33Z 33.6K followers, 11.9K engagements

"had to extract some model names and scores from a plot GPT-5.2 thought and cropped the image for [---] minutes and got it wrong Gemini [--] Pro got it correct in [--] seconds"
X Link 2026-02-06T20:03Z 34.1K followers, [----] engagements

"german universities are still doing the woke thing . there's no hope for this country This account will be paused and LMU Munich will not post further content due to ongoing developments on this platform. We would be pleased if you followed LMU on other channels. (1/2)"
X Link 2026-02-06T20:07Z 33.5K followers, [----] engagements

"@teortaxesTex i agree but that's like . not hard even o3-mini beats Opus 4.5"
X Link 2026-02-06T20:22Z 34.4K followers, [----] engagements

"150 seconds to put three Jenga blocks side-by-side and another on top where are my robotics scaling laws how long until we do this in [--] seconds Pantograph robot building with jenga blocks. Focusing on RL is a great and fairly unique strategy https://t.co/a5hYkjW8R3"
X Link 2026-02-06T20:46Z 33.5K followers, [----] engagements

"For a long time I was trapped in the cycle of "we are so back" and "it's so over". Every new benchmark and model would wildly swing my AGI timelines. Four months ago I accepted a simple truth: We http://x.com/i/article/2019898127189184512"
X Link 2026-02-06T22:32Z 33.7K followers, 26K engagements

"@teortaxesTex show me a chinese lab that isn't just distilling Opus [---] at this point and distilling will always only give you 90% of the perf Kimi-K2.5 and GLM-5 are . we will see what DeepSeek is doing (i still have hopes for them)"
X Link 2026-02-06T22:52Z 34.1K followers, 12.2K engagements

"OpenRouter token usage is growing 10x a year"
X Link 2026-02-06T23:41Z 34.5K followers, [----] engagements

"@Teknium who's talking about COTs"
X Link 2026-02-07T00:29Z 33.7K followers, [---] engagements

"@tenobrus duuuh you should obviously take out a loan and put it all into polymarket"
X Link 2026-02-07T02:53Z 33.5K followers, [---] engagements

"Step [---] Flash https://huggingface.co/stepfun-ai/Step-3.5-Flash-Int8"
X Link 2026-02-07T10:13Z 33.7K followers, [----] engagements

"@JasonBotterill and the South African"
X Link 2026-02-07T13:31Z 33.6K followers, [--] engagements

"only one city on planet earth can have the mandate all the new AI trillionaires should try to make the city even better build libraries museums parks skyscrapers make public transport free . my entire feed is pro-SF propaganda but its all true. all of it. https://t.co/AjVtESUJUA"
X Link 2026-02-07T13:53Z 33.7K followers, 36.3K engagements

"@teortaxesTex @resona_dev I had the same shizo attack this morning. I thought this was a new model because my scraper notified me. I thought this was a second smaller Step [---] model"
X Link 2026-02-07T15:42Z 33.6K followers, [---] engagements

"is chatgpt actually using smooth brain and wrinkle brain pictograms for thinking efforts lmao can someone explain to me why i wouldn't just pick "extra high" every time is it basically "greater reasoning" = "longer response times" https://t.co/fxz4dAyDvS"
X Link 2026-02-07T16:51Z 34.2K followers, 94.4K engagements

"if aliens wanted to wipe us out they would and could do it without us even noticing why would they take any risk and telegraph that they are even there"
X Link 2026-02-07T16:56Z 34.1K followers, [----] engagements

"PI is constantly aura-farming but it's time for them to drop a banger model https://t.co/hAkwJGAwjO"
X Link 2026-02-07T17:15Z 33.7K followers, 15.5K engagements

"@sethsaler no that's glm-5"
X Link 2026-02-07T17:19Z 33.6K followers, [---] engagements

"aaaaand it's broken it no longer shows the tooltip when hovering and when you click on a model it shows the tooltip for a second and then disappears because it reloads the plot Today we're launching a new version of our website. https://t.co/6JqWR29aIC"
X Link 2026-02-07T18:24Z 33.6K followers, [----] engagements

"i found out where patience cave lives THIS NEEDS TO BE INVESTIGATED IMMEDIATELY. https://t.co/sCIQiQxy3w"
X Link 2026-02-07T18:25Z 34.1K followers, [----] engagements

"The good news is that there's an Opus [---] Fast Mode that has 2.5x higher tokens/s. The bad news is that it costs 6x more than the normal mode so $150/million tokens"
X Link 2026-02-07T18:39Z 33.6K followers, 33.8K engagements

"https://code.claude.com/docs/en/fast-mode"
X Link 2026-02-07T18:40Z 33.6K followers, [----] engagements

"@AdityaShips that's just gemini and not antigravity"
X Link 2026-02-07T18:43Z 33.6K followers, [---] engagements

"we are doing better than 1/10 on likes not a good sign Our teams have been building with a 2.5x-faster version of Claude Opus [---]. We're now making it available as an early experiment via Claude Code and our API."
X Link 2026-02-07T19:40Z 33.6K followers, [----] engagements

"Claude [---] Opus now ranks [--] in the Design Arena"
X Link 2026-02-07T21:48Z 34.1K followers, [----] engagements

"almost stole his post a couple hours ago I have the exact same screenshot on my phone but thought I should do it on desktop instead (was too lazy to do it) Open models show 2.5x faster 6x more expensive Lower batch size speculative decoding harder Pareto optimal curve for Deepseek at https://t.co/d9dNCumX0I shows this Claude Opus [---] is [---] Tok/s/user Deepseek at [---] is 6k Tok/s/GPU At [---] tok/s/user it's closer to 1k https://t.co/X294HzM3Zo"
X Link 2026-02-08T01:25Z 33.6K followers, [----] engagements

"xAI should go all in on world models it will be very useful when merged with Tesla and Optimus"
X Link 2026-02-08T01:57Z 33.6K followers, 10.4K engagements

"Frontier AI systems are getting more expensive but at the same time the cost per unit of intelligence is falling rapidly. Most economically valuable work will be captured by models/systems available in $20-$200 subscriptions because we won't be able to fully automate most work so the models still need some human oversight. Meaning there is no benefit from massively parallel running agents. The coming $1000-$10000 subscription will be mainly used for coding. What will be real is that there will be a productivity gap in coding that comes at an extreme cost. But it will make the rich richer. It begins"
X Link 2026-02-08T02:21Z 33.6K followers, [----] engagements

"Qwen-3.5 next week 🚀 Qwen [---] has been spotted on GitHub - Qwen3.5-9B-Instruct - Qwen3.5-35B-A3B-Instruct The [--] currently available models on the Arena "Karp-001" and "Karp-002" could possibly be the small Qwen-3.5 models https://t.co/VAfafLEPWk"
X Link 2026-02-08T11:54Z 33.7K followers, [----] engagements

"this is an all time top [--] sentence brainrot is evolving language Clavicular was mid jestergooning when a group of Foids came and spiked his Cortisol levels 😭 Is Ignoring the Foids while munting and mogging Moids more useful then SMV chadfishing in the club https://t.co/QtSfANxCyh"
X Link 2026-02-08T12:15Z 33.6K followers, [----] engagements

"I have honestly thought a lot about going into politics. Most countries are utterly uninformed and unprepared for what's coming. If I cant be part of some ai lab the next best thing is to run for public office"
X Link 2026-02-08T13:22Z 33.7K followers, [----] engagements

"something something M2.2 on the MiniMax website"
X Link 2026-02-08T14:41Z 34.3K followers, 43.6K engagements

"let the chinese century (february) begin Qwen3.5 GLM-5 MiniMax M2.2 Seed [---] DeepSeek-V4 something something M2.2 on the MiniMax website https://t.co/wLE7WhbdV4"
X Link 2026-02-08T14:54Z 33.9K followers, 37.3K engagements

"found a new benchmark We report costs in our leaderboard and opus [---] is significantly more expensive than [---] because it is a token hog. Anecdotally not much of an improvement in code performance. https://t.co/aMdj7ye5m4"
X Link 2026-02-08T15:29Z 33.8K followers, 16.4K engagements

"Now you might ask yourself: don't labs face the same problem Don't they need tasks that take [--] hours to RL on to get models to 24-hour time horizons I'll preface this by saying I'm not at any frontier lab and don't speak from experience. This is just how I think about it. My guess is that labs will need longer and longer tasks but A) I don't think they need 24-hour tasks to get to 24-hour time horizons and B) they don't need as many as METR does. I think there's a gap between what training requires and what evaluation requires. On the training side: you can improve performance on long"
X Link 2026-02-08T16:54Z 33.7K followers, 10.4K engagements
