[GUEST ACCESS MODE: Data is scrambled or limited to provide examples. Make requests using your API key to unlock full data. Check https://lunarcrush.ai/auth for authentication information.]

[@scaling01](/creator/twitter/scaling01)
"their grid is better they have more kilometers of HV transmission lines more than twice the energy production a much greener grid with renewables more batteries but it's not just energy it's also transportation and 5G internet buildout more paved roads more ships more rail . most importantly they have higher growth rates in the buildout on all of these things go sleep kasey"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1942016174134620387) 2025-07-07 00:21:51 UTC 16.3K followers, 375.9K engagements


"@Meme_God_069 It has been a few hours since release. More providers will add it and optimize. Also it's basically free at $0.15/$0.85"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947582950201299375) 2025-07-22 09:02:14 UTC 16.2K followers, XXX engagements


"It's just weird honestly. It suggests to me that GPT-5 isn't GPT-4.5 + RL but just another update of o3 which is presumably GPT-4.1 + RL + potentially distilled from GPT-4.5-reasoning. Maybe we will get a Christmas Chonky with reasoning running on Blackwell"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946892465631281250) 2025-07-20 11:18:29 UTC 16.1K followers, 6333 engagements


"probably nothing Kimi-K2 just above o3-mini Grok-3 Claude X Sonnet and o1 on PHYBench"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1944845893124981076) 2025-07-14 19:46:09 UTC 16.1K followers, 48.7K engagements


"chinese math olympiad team lost again to the better chinese olympiad team"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947679474289369551) 2025-07-22 15:25:47 UTC 16.3K followers, 16.4K engagements


"rest I must in exile I'm going to live until GPT-5 returns"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1945642899493810582) 2025-07-17 00:33:10 UTC 16.1K followers, 3745 engagements


"OpenAI researcher: "We are releasing GPT-5 soon" We are so fucking back"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946483768866357517) 2025-07-19 08:14:29 UTC 16K followers, 5845 engagements


"I told you Mistral is dying Qwen is our new open-source king"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946275804716863676) 2025-07-18 18:28:06 UTC 16.3K followers, 3500 engagements


"More OpenAI Agent mode benchmarks:: WebArena: XX% BrowserComp: 69%"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1945896072821268979) 2025-07-17 17:19:11 UTC 16.1K followers, 1454 engagements


"- Sama is already talking about AGI all the time not long before he says it - Q1 model fiesta: happened - agents / computer use: literally yesterday - o3 yes o4 in the coming X weeks (GPT-5) o5 end of year - o3 replication coming: Kimi-K2 reasoner or R2 - SWE-Bench XX% EOY: confident - ARC-AGI-2 XX% by end of year: coin-flip - Frontier Math 80%: unsure but with Gold IMO it seems slightly closer - 10+ million context length models: Llama-4 but not really"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946545953407902019) 2025-07-19 12:21:35 UTC 16.3K followers, 34.3K engagements


""As the race to AGI intensifies the national security state will get involved""  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1948040236367278190) 2025-07-23 15:19:19 UTC 16.3K followers, 4080 engagements


"@kalomaze I thought it would be way faster. Is this already with VLLM or SGLang They should be faster right"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1945845971813876137) 2025-07-17 14:00:06 UTC 16K followers, XXX engagements


"@sama That's a lot of agents Assuming o3 fits on one 8xH100 you would have 125k instances of o3. Batch size won't be huge with reasoning models maybe 4-8 So 500k to X million o3 agents could run on those newly deployed GPUs. But there will also be plenty of H200 B200 and GB200"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947067181474164889) 2025-07-20 22:52:45 UTC 16.3K followers, 26K engagements
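A minimal sketch of the back-of-envelope math in that post, assuming the fleet it implies (125k 8xH100 nodes, i.e. roughly one million H100-class GPUs); the `gpu_count`, `gpus_per_instance`, and batch-size values are the post's own assumptions, not measured numbers.

```python
# Back-of-envelope agent capacity, following the post's assumptions:
# one o3-style instance per 8xH100 node and a batch size of 4-8
# concurrent requests per instance. gpu_count is inferred from "125k instances".
gpu_count = 1_000_000            # assumed H100-class GPUs in the fleet
gpus_per_instance = 8            # one instance per 8xH100 node (post's assumption)
batch_low, batch_high = 4, 8     # concurrent requests per instance (post's assumption)

instances = gpu_count // gpus_per_instance
print(f"{instances:,} instances")                                                     # 125,000
print(f"{instances * batch_low:,} to {instances * batch_high:,} concurrent agents")   # 500,000 to 1,000,000
```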


"OpenAI AI has more employees with IMO gold medals than the rest of the world combined"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946491162123960421) 2025-07-19 08:43:51 UTC 16.1K followers, 2060 engagements


"100% confirmation that the OpenAI open-source model release was delayed because of Kimi-K2 translation: our model sucks gets badly beaten by Kimi-K2 need to train a better one"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1943849403002839386) 2025-07-12 01:46:27 UTC 16.3K followers, 49.1K engagements


"Generative AI is the fastest adopted technology in history"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1945975245153742896) 2025-07-17 22:33:47 UTC 16.3K followers, 9790 engagements


"GPT-4 was released XXX days or XXX years ago we are getting old"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1944880249201991950) 2025-07-14 22:02:40 UTC 16.2K followers, 2901 engagements


""we need XXXXX% gross profit margin XX% aren't enough" but sure it's all for the benefit of all of humanity"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946919598592286899) 2025-07-20 13:06:18 UTC 16.1K followers, 2514 engagements


"Kimi-K2 Technical Report is out to reveal all the secrets"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947384137892966693) 2025-07-21 19:52:13 UTC 16.3K followers, 24.4K engagements


"Kimi-K2 better than Grok-4 it's so over for benchmaxxers"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1944055665082687756) 2025-07-12 15:26:04 UTC 16.2K followers, 190.4K engagements


"o4-mini is a bit cleaner but overall design and details is easily a win for Qwen3-Coder"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947746437321621778) 2025-07-22 19:51:52 UTC 16.3K followers, XXX engagements


"If GPT-6 is in training (which I don't believe) it would mean that they no longer scale compute 100x between major releases"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946888451183255863) 2025-07-20 11:02:32 UTC 16.2K followers, 12.7K engagements


"@sama the next X weeks are a good time to release all your models :)"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946576372672913617) 2025-07-19 14:22:27 UTC 16.3K followers, 11.5K engagements


"what if I told you that OpenAI Google Anthropic and xAI will all be working together in a few years"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947685542742606304) 2025-07-22 15:49:54 UTC 16.3K followers, 101.2K engagements


"I made it on the ARC-AGI-3 leaderboard I honestly don't know how people got below XXX I made a few mistakes but XXX of them"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946577546675392885) 2025-07-19 14:27:07 UTC 16.1K followers, 4203 engagements


"@dylan522p Is that still up-to-date where is xAI on this list"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947777201279258683) 2025-07-22 21:54:07 UTC 16.3K followers, XXX engagements


"We are nearing the singularity and are taking off "We don't plan to release anything with this level of math capabilities for several months." Soon it will be years and then it will be never. The competitive advantage is too great"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946487707749818656) 2025-07-19 08:30:08 UTC 16.2K followers, 21.3K engagements


"Nothing on GPT-5 Sam was just there to taunt us"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1945898343692964280) 2025-07-17 17:28:12 UTC 16.2K followers, 5471 engagements


"There is a XX% chance that ChatGPT agent will actually gamble away your life savings if you asked it"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1945930617775882728) 2025-07-17 19:36:27 UTC 16.2K followers, 16.6K engagements


"Qwen-3 arch: 480B total params 35B active XX layers GQA with XX heads XXX experts X active experts 6144 hidden dim shallower than Qwen3-235B-A22B"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947776314242699773) 2025-07-22 21:50:35 UTC 16.3K followers, 2439 engagements


"@_ueaj yes but some things are not generalizable by more params so eventually we will have to go to more expensive archs/attention mechanisms the brain has a tiny context window but a ridiculous amount of compute"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946986865619325395) 2025-07-20 17:33:36 UTC 16.1K followers, XX engagements


"I just did my taxes for the very first time in my life and I totally understand my parents now. I don't know how they did it without AI or an accountant. It would've taken me like 10x longer if I had to Google everything"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947051017129837030) 2025-07-20 21:48:31 UTC 16.2K followers, 2047 engagements


"The updated Qwen3-235B-A22B is now the best non-reasoning models period. It beats Kimi-K2 Claude-4 Opus and DeepSeek V3 on multiple benchmarks like GPQA AIME ARC-AGI LiveCodeBench or BFCLv3 just to name a few"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947350866840748521) 2025-07-21 17:40:01 UTC 16.3K followers, 20.5K engagements


"We are about to smash the METR trend AI can now productively work for at least XXX hours in a row. Nearly 3x as long as o3"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946484863168569359) 2025-07-19 08:18:49 UTC 16.1K followers, 2616 engagements


"The White House just released America's AI Action Plan. I've read the whole thing. This document makes it very clear that this is about "winning the AI race" and even compare it to the cold war era. It's a paper about national-security Here are the most important quotes: - Just like we won the space race it is imperative that the United States and its allies win this race. - Americas AI Action Plan has three pillars: innovation infrastructure and international diplomacy and security. Pillar I - Innovation: - Led by the Department of Commerce revise the NIST AI Risk Management Framework to"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1948037110662848925) 2025-07-23 15:06:54 UTC 16.3K followers, 15.6K engagements


"TLDR Kimi-K2 Technical Report: - 1.04T@32B parameter DeepSeekV3 style MoE with MLA higher sparsity but half the attention heads - trained on 15.5T tokens XXX billion tokens @ 4k XX billion tokens @ 32k extension with YaRN - use of MuonClip optimizer including QK-Clip to help training stability - general refinorcement learning framework including RLVR and self-critque for non-verifiable domains - large-scale agentic data synthesis pipeline QK-Clip: constrains attention logits by rescaling the key and query projection weights post-update Sparsity Scaling Laws: - higher sparsity yields"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947400424622866793) 2025-07-21 20:56:56 UTC 16.3K followers, 10.4K engagements
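A minimal sketch of the QK-Clip idea as described in that TLDR (not the report's actual code): after a weight update, if the largest attention logit observed exceeds a cap, the query and key projection matrices are both rescaled by the square root of the ratio so the logits come back under the cap. The threshold value, function name, and in-place API below are illustrative assumptions.

```python
import torch

def qk_clip_(w_q: torch.Tensor, w_k: torch.Tensor, max_logit: float, tau: float = 100.0) -> None:
    """Illustrative post-update QK-Clip: attention logits are bilinear in W_q and W_k,
    so scaling each matrix by sqrt(tau / max_logit) scales the logits by tau / max_logit,
    pulling the largest observed logit back down to the cap tau."""
    if max_logit > tau:
        scale = (tau / max_logit) ** 0.5
        w_q.mul_(scale)
        w_k.mul_(scale)

# toy usage: suppose the largest attention logit seen this step was 250
w_q, w_k = torch.randn(512, 512), torch.randn(512, 512)
qk_clip_(w_q, w_k, max_logit=250.0)
```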


"The Interactive DeepResearch Reports by Kimi-K2 look pretty sleek"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1944444946246726000) 2025-07-13 17:12:55 UTC 16.3K followers, 87.9K engagements


"A few more observations after replicating the Tower of Hanoi game with their exact prompts: - You need AT LEAST 2N - X moves and the output format requires XX tokens per move + some constant stuff. - Furthermore the output limit for Sonnet XXX is 128k DeepSeek R1 64K and o3-mini 100k tokens. This includes the reasoning tokens they use before outputting their final answer - all models will have X accuracy with more than XX disks simply because they can not output that much - the max solvable sizes WITHOUT ANY ROOM FOR REASONING (floor(log2(output_limit/10))) DeepSeek: XX disks Sonnet XXX and"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1931783050511126954) 2025-06-08 18:39:04 UTC 16.2K followers, 612.5K engagements
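A minimal sketch of the output-budget ceiling that post computes: Tower of Hanoi on N disks needs at least 2^N - 1 moves, and the divisor in the post's formula implies a budget of roughly 10 output tokens per move, so the largest instance a model can even print is floor(log2(output_limit / 10)). The output limits below are the ones quoted in the post; the helper name and the per-move token count are assumptions for illustration.

```python
import math

def max_printable_disks(output_limit_tokens: int, tokens_per_move: int = 10) -> int:
    """Largest Tower of Hanoi instance whose full move list fits in the output budget:
    2^N - 1 moves at roughly tokens_per_move tokens each, ignoring any tokens spent on
    reasoning. Mirrors the post's floor(log2(output_limit / 10))."""
    return math.floor(math.log2(output_limit_tokens / tokens_per_move))

# Output limits quoted in the post (reasoning tokens also count against these).
for model, limit in [("Claude Sonnet", 128_000), ("o3-mini", 100_000), ("DeepSeek R1", 64_000)]:
    print(f"{model}: at most {max_printable_disks(limit)} disks")
```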


"I don't think Americans understand how far ahead Chinas infrastructure is"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1942005210580205856) 2025-07-06 23:38:17 UTC 16.3K followers, 18.5M engagements


"Not a mystery. Orcas are cracked they have dialects podspecific hunting strategies and even use tools"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1945510653579395246) 2025-07-16 15:47:40 UTC 16.2K followers, 2063 engagements


"Qwen3-235B beats GPT-4.5 while being XXX times cheaper"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947581307698991308) 2025-07-22 08:55:42 UTC 16.3K followers, 6035 engagements


"Janitor has spoken GPT-5 will come in hot Remember the night before the last day of shipmas We were all disappointed but Janitor tried to tell us. Then came o3-preview"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946903149283012870) 2025-07-20 12:00:57 UTC 16.1K followers, 2320 engagements


"@MarMa10134863 if they are smart they spend the extra XXX milly to make them all storm proof"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947680490690842947) 2025-07-22 15:29:49 UTC 16.3K followers, XX engagements


"The White House finally saw the chart: "American energy capacity has stagnated since the 1970s while China has rapidly built out their grid. Americas path to AI dominance depends on changing this troubling trend" - Quote from the AI Action Plan by the White House"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1948038229808079032) 2025-07-23 15:11:21 UTC 16.3K followers, XXX engagements


"Inverse Scaling in Test-Time Compute by Anthropic So are reasoning models cooked No they cited the Apple Tower of Hanoi paper. And it looks more like an Anthropic skill issue to me since o3's performance decreases in only X benchmark while Opus X has decreased performance in X benchmarks"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947625084513845429) 2025-07-22 11:49:39 UTC 16.3K followers, 1667 engagements


"STOP SPREADING FAKE NEWS This image is from 2009 in Shanghai. Completely different city and over XX years ago and only a single person died because it was still in development. Furthermore it had nothing to do with concrete of the building itself but with the ground/foundation"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1942209026080583998) 2025-07-07 13:08:10 UTC 16.3K followers, 85.9K engagements


"more hillclimbing on ARC-AGI-3 If the guy who got XXX did it unassisted without computer or notes that would be impressive"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946616408235749465) 2025-07-19 17:01:32 UTC 16.3K followers, 2978 engagements


"what the fuck is Gemini XXX Pro doing down there"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947414658199007237) 2025-07-21 21:53:30 UTC 16.3K followers, 2719 engagements


"@test_tm7873 solutions for mensa norway are everywhere"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1944072891793805542) 2025-07-12 16:34:31 UTC 16.2K followers, 2654 engagements


"It's not GPT-5 Repeat: not GPT-5 (I ragequit this is confirmed by trusted sources)"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1945640155890483359) 2025-07-17 00:22:16 UTC 16.2K followers, 103.8K engagements


"two labs independently got IMO gold and you are gooning anon"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946543658032931208) 2025-07-19 12:12:27 UTC 16K followers, 5129 engagements


"Gary Marcus strikes again: "No pure LLM is anywhere near getting a silver medal in a math olympiad" "Pure deep learning had a good run but it's time to move on" 😂😂😂"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946530148813025544) 2025-07-19 11:18:46 UTC 16.3K followers, 47.9K engagements


"We would have Llama-4 Behemoth and reasoning models if chinese models didn't exist"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946373609967988917) 2025-07-19 00:56:45 UTC 16.2K followers, 1187 engagements


"1e28 flops is XXXX days on this machine you could train GPT-4 in a few hours"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947712193795133933) 2025-07-22 17:35:48 UTC 16.3K followers, 11.2K engagements


"Kimi-K2 now #1 on EQ-Bench and Creative Writing beating o3 This could mean that Moonshot has really high-quality data and is not just using synthetic data from OpenAIAnthropic or Google"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1944361306078822500) 2025-07-13 11:40:34 UTC 16.2K followers, 58K engagements


"Qwen3-Coder-480B-A35B-Instruct Also known as Qwen-Coder-Plus with X million tokens input and 65k tokens output but apparently without thinking"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947733375332040885) 2025-07-22 18:59:58 UTC 16.3K followers, 19.3K engagements


"Incredible model. XX minutes thinking and $XXXX later to respond with "base64" It's so over for GPT-5"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1943339529236287952) 2025-07-10 16:00:23 UTC 16.1K followers, 66.3K engagements


"@TheCinesthetic tarantino is clueless its a bad scene the soldiers see the explosions coming but just stand in place waiting to get blasted and the aircraft are just invisible there's just random explosions lmao"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947627626803171499) 2025-07-22 11:59:46 UTC 16.3K followers, XXX engagements


"@basedjensen @a_karvonen Both don't show general intelligence lol It's RL'd to solve maths there's nothing general about it. Aesthetics don't matter as long as the proof is correct reproducible and not just brute force"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946886020378862052) 2025-07-20 10:52:53 UTC 16.1K followers, XXX engagements


"GPT-5 will still have low medium and high reasoning budgets"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946535207206551746) 2025-07-19 11:38:52 UTC 16K followers, 11K engagements


"OpenAI got gold on the International Math Olympiad with an experimental model (not GPT-5)"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946479386934366659) 2025-07-19 07:57:04 UTC 16.1K followers, 1433 engagements


"Religious believers and LLM doubters are so similar. They can't recognize change and their world has only shrunk never grown"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946751959249089008) 2025-07-20 02:00:10 UTC 16.1K followers, 2407 engagements


"Qwen-3 Coder randomly makes some text on the left sidebar larger and not aligned"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947746566808474027) 2025-07-22 19:52:23 UTC 16.3K followers, 1163 engagements


"@Infopulsed So do you also think that we are doomed Because in theory the AI arms race is starting and we are all fucked in 3-5 years"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947579846965547355) 2025-07-22 08:49:54 UTC 16.2K followers, XX engagements


"@zephyr_z9 @teortaxesTex @natolambert Yeah. Just checked their page lol But ByteDance is cooking on all fronts not only language models"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1944788934921486362) 2025-07-14 15:59:49 UTC 16.1K followers, XXX engagements


"insert where is qwen meme But I'm glad they finally shipped it after like X months. It's going to be interesting comparing Gemini and Qwen"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1944814664069583263) 2025-07-14 17:42:03 UTC 16.2K followers, 18.5K engagements


"Grok-4 trained on 100k H100 with 10x more compute than Grok-3 and 100x more compute than Grok-2"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1943160965119279204) 2025-07-10 04:10:50 UTC 16.1K followers, 397.7K engagements


"@ Zuck all they need is X months to build a frontier coding model"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947773545733394439) 2025-07-22 21:39:35 UTC 16.3K followers, 15.6K engagements


"I understand that. But I still think the dynamics are that if AI is truly as disruptive as it sounds then frontier performance will get privatized. Of course the public is going to get nice distilled economical models but the big boi uncensored models where you can add arbitrary compute are going to be locked to a select few"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946581420819779677) 2025-07-19 14:42:31 UTC 16K followers, XXX engagements


"Zuck just offered Sam Altman a $X trillion pay package to join Meta's superintelligence team"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947323274146275665) 2025-07-21 15:50:22 UTC 16.3K followers, 33.1K engagements


"another interesting benchmark by Lech and o3 is at the top"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947714819521712161) 2025-07-22 17:46:14 UTC 16.3K followers, 1516 engagements


"@Mohsine_Mahzi eughhh I don't want that I want to see the model names. I just want to keep my model selector"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946906395661754571) 2025-07-20 12:13:51 UTC 16.1K followers, 1447 engagements


"Is anyone actually using these goofy ass glasses"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947470409865302138) 2025-07-22 01:35:02 UTC 16.3K followers, 1185 engagements


"the only thing you need to understand about AI: it's currently improving exponentially in ALL domains - knowledge coding mathematics self-driving cars computer-use browsing video understanding"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947419483276025996) 2025-07-21 22:12:40 UTC 16.3K followers, 2402 engagements


"ARC-AGI-3 scores X% for AI XXX% for humans now live with API where you can test your agent:"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946261191782797717) 2025-07-18 17:30:02 UTC 16.2K followers, 30.6K engagements


"He can't be serious. He posted this right before OpenAI announced they got Gold in the IMO. Truly the Jim Cramer of AI"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946528532772909415) 2025-07-19 11:12:21 UTC 16.3K followers, 39.7K engagements


"As long as GPT-5 is a noticeable step up from o3 I don't care what models they used It would only make my heart a bit happier if I knew there is a reasoning Chonky"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946893586592657880) 2025-07-20 11:22:57 UTC 16.1K followers, 1582 engagements


"Qwen about to release a 480B MoE for coding with X million context "Qwen3-Coder-480B-A35B-Instruct is a powerful coding-specialized language model excelling in code generation tool use and agentic tasks.""  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947732150872084693) 2025-07-22 18:55:06 UTC 16.3K followers, 123.1K engagements


"btw I don't want a model router I want to be able to select the models I use"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946903963200262523) 2025-07-20 12:04:11 UTC 16.3K followers, 51.2K engagements


"The whole GPT-5 model routing thing just sounds like a way of saving compute/money and ripping of plus and pro users. Sama already talked about using a credit based system and I'm sure they also talked about raising prices at some point"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946889286780907772) 2025-07-20 11:05:52 UTC 16.2K followers, 5670 engagements


"hot take: non-reasoning models are more elegant than reasoning models"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947779680263639215) 2025-07-22 22:03:58 UTC 16.3K followers, 30K engagements


"A very high bar. Considering only a maximum of XX people can receive a Nobel Prize each year I would argue that this is a superhuman feat"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946489420195991795) 2025-07-19 08:36:56 UTC 16.1K followers, 3751 engagements


"what a joke xAI valued at 200B Anthropic latest valuation was 61.5B xAI revenue 0B Anthropic revenue 4B"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1943773619211194539) 2025-07-11 20:45:19 UTC 16.3K followers, 579K engagements


"the spice fields are part of my empire period"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946223434465239372) 2025-07-18 15:00:00 UTC 16.1K followers, 2441 engagements


"The White House doesn't want you to use DeepSeek Qwen or Kimi by focusing evaluations on censorship and alignment with the CCP instead of capabilities and usefulness"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1948039683440877658) 2025-07-23 15:17:08 UTC 16.3K followers, 3585 engagements


"GPT-5 expectations: - SOTA on most benchmarks (#1 on my meta-benchmark) - specifically: SOTA on ARCAGI2 and METR - 2025 knowledge cutoff - longer context window (400k) - fully multimodal (text/image/audio + video input) - sane output pricing: = $XX / 1M tokens nicetohaves: - fewer hallucinations than o3 - less sycophancy - no more barrage of em dashes - clean code style no weird multiline comments"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946335109277274344) 2025-07-18 22:23:45 UTC 16.2K followers, 1947 engagements


"@polynoamial @SherylHsu02 So it's like o3-pro or Grok-4 some parallel approach with multiple agents"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946482273274167363) 2025-07-19 08:08:32 UTC 16.1K followers, 4073 engagements


"GPT-4o was never the "top" model lore accurate timeline for the newbies: GPT-3.5 GPT-4 GPT-4 Turbo Opus X Sonnet XXX o1 Gemini XXX Pro and right now im so fucking confused what the actual best model is Gemini XXX Pro Opus X and o3 are all sick but probably o3-high / o3-pro if you want to count parallel cheating"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947411249408385359) 2025-07-21 21:39:57 UTC 16.3K followers, 5548 engagements


"@OfficialLoganK Can we expect a delivery in July or later in August"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946990846290575659) 2025-07-20 17:49:25 UTC 16.1K followers, 1475 engagements


"Grok-4 falling behind Gemini XXX Pro on SimpleBench"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946339886878859774) 2025-07-18 22:42:44 UTC 16.3K followers, 96.8K engagements


"Official OpenAI Agent mode benchmarks in one thread Coming to Pro Plus and Team users"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1945897524507611541) 2025-07-17 17:24:57 UTC 16.2K followers, 25.9K engagements


"Top OpenAI researcher says that: "We're close to AI substantially contributing to scientific discovery" after an internal model reaches Gold in the International Math Olympiad"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946481763339137033) 2025-07-19 08:06:30 UTC 16.1K followers, 2257 engagements


"Kimi K2 is here and it's insane It's the best OS non-thinking model and one of the best non-thinking models competitive with GPT-4.1 Sonnet X and Opus X X trillion params 32B active trained with Muon optimizer on 15.5T tokens It is also much cheaper than all the alternatives: $0.60/million input and $2.50/million output tokens vs GPT-4.1 at $2/$8 vs Sonnet X at $3/$15"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1943689306339496198) 2025-07-11 15:10:17 UTC 16K followers, 34.8K engagements
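A small cost comparison using the per-million-token prices quoted in that post; the 2M-input / 0.5M-output workload is purely a hypothetical example.

```python
# (input, output) price in USD per million tokens, as quoted in the post.
prices = {
    "Kimi K2": (0.60, 2.50),
    "GPT-4.1": (2.00, 8.00),
    "Sonnet":  (3.00, 15.00),
}

input_m, output_m = 2.0, 0.5  # hypothetical workload: 2M input, 0.5M output tokens

for model, (p_in, p_out) in prices.items():
    print(f"{model}: ${input_m * p_in + output_m * p_out:.2f}")
# Kimi K2: $2.45, GPT-4.1: $8.00, Sonnet: $13.50
```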


"@luke_chaj @sama Idk I think you are memory bound with 100-200k context per user"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947232029717729563) 2025-07-21 09:47:48 UTC 16.3K followers, XXX engagements


"Google got the IMO Gold without AlphaProof or AlphaGeometry with a fine-tuned version of Gemini DeepThink"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947338259626864991) 2025-07-21 16:49:55 UTC 16.2K followers, 8285 engagements


"you are better off asking o3 than ChatGPT agent to build a genetically modified supervirus"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1945931094412452087) 2025-07-17 19:38:21 UTC 16.1K followers, 1450 engagements


"my AI predictions for 2025: - at least one lab will declare AGI and mentions ASI - Q1: Google Anthropic OpenAI META Qwen and Mistral model fiesta ( it will be heaven ) - agents / computer use takes off - release of Claude X Gemini X GPT-5 Grok X (or whatever they call their giant 5-20 trillion parameter models) - release of o3 o4 and o5 - open-source replication of o3 - the Frontier Math benchmark will be mostly solved (80%) - SWE-bench will be solved (90%) - ARC-AGI X will be mostly solved (80%) within X months of it's release - 10+ million context length models my wishful thinking: Someone"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1874608907508752546) 2025-01-02 00:09:26 UTC 16.3K followers, 343.3K engagements


"Remember Elon firing against OpenAI for not being open-source So where are the Grok-2 and Grok-3 weights"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1943485492852375635) 2025-07-11 01:40:24 UTC 16.1K followers, 129.9K engagements


"blinky blinky towers grey suburbia every kid knows this"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1942041577943175425) 2025-07-07 02:02:48 UTC 16.1K followers, 221.4K engagements


"ChatGPT Agent has lower performance than o3 on PaperBench SWE-Bench verified OpenAI PRs and OpenAI Research Engineer Interview questions"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1945932154455695752) 2025-07-17 19:42:33 UTC 16.2K followers, 5089 engagements


"Both Google DeepMind and OpenAI got a gold medal at the IMO"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946661941797060843) 2025-07-19 20:02:28 UTC 16.1K followers, 9490 engagements


""AI is just autocomplete" meanwhile: Zuck offered at least XX OpenAI researchers pay packages of $XXX million"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947019290604908985) 2025-07-20 19:42:27 UTC 16.3K followers, 9258 engagements


"@GregKamradt it seems like the documentation for getting started with building ARC-AGI-3 agents is outdated: the classes Agent and Observation don't exist and the init file looks completely different too"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947060421543756196) 2025-07-20 22:25:53 UTC 16.2K followers, XXX engagements


"Somehow ChatGPT agent has higher hallucination rates but what is XXXXX vs XXXXX lol"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1945929949602328711) 2025-07-17 19:33:48 UTC 16.1K followers, 2546 engagements


"Zuck just offered himself $XX billion. he declined"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947356602538832051) 2025-07-21 18:02:48 UTC 16.3K followers, 3070 engagements


"@GregKamradt maybe also add an option to use openrouter for public testing I instantly ran into rate limit as i don't use OpenAI API that much"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947063052139901340) 2025-07-20 22:36:20 UTC 16.2K followers, XXX engagements


"No clue where this guy got the rest of the points and I'm not finding out without making notes and spending another hour on this"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946653849089187925) 2025-07-19 19:30:19 UTC 16.1K followers, 1386 engagements


"Qwen3-235B-A22B scored XX% on ARC-AGI-1 without thinking That's the same level as Gemini XXX Pro Sonnet X or o3-low with thinking. But it might be trained on it if not then it's insane"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1947351789222711455) 2025-07-21 17:43:41 UTC 16.3K followers, 35.6K engagements


"Are you ready for the Grok-4 coding model Aider-Polyglot results look good Grok-4 beating o3"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1944104802914385998) 2025-07-12 18:41:19 UTC 16.1K followers, 4808 engagements


"HEY FUCKERS HOW ABOUT FIXING YOUR APP AND RELEASING GPT-5"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1945972489416315204) 2025-07-17 22:22:50 UTC 16.2K followers, 3474 engagements


"I bet in "no emoji" arena it would even beat GPT-4o and GPT-4.5 making it the best non-thinking model"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1945877502607405396) 2025-07-17 16:05:23 UTC 16.1K followers, 1268 engagements


"Grok-4 confirmed to have a 256K context window"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1943170092012818608) 2025-07-10 04:47:06 UTC 16.2K followers, 1676 engagements


"why is the input for ARC-AGI-3 in slow motion like please speed it up i don't have the whole day"  
![@scaling01 Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::1825243643529027584.png) [@scaling01](/creator/x/scaling01) on [X](/post/tweet/1946266870564126985) 2025-07-18 17:52:36 UTC 16.2K followers, 1307 engagements


"Is anyone actually using these goofy ass glasses"
@scaling01 Avatar @scaling01 on X 2025-07-22 01:35:02 UTC 16.3K followers, 1185 engagements

"the only thing you need to understand about AI: it's currently improving exponentially in ALL domains - knowledge coding mathematics self-driving cars computer-use browsing video understanding"
@scaling01 Avatar @scaling01 on X 2025-07-21 22:12:40 UTC 16.3K followers, 2402 engagements

"ARC-AGI-3 scores X% for AI XXX% for humans now live with API where you can test your agent:"
@scaling01 Avatar @scaling01 on X 2025-07-18 17:30:02 UTC 16.2K followers, 30.6K engagements

"He can't be serious. He posted this right before OpenAI announced they got Gold in the IMO. Truly the Jim Cramer of AI"
@scaling01 Avatar @scaling01 on X 2025-07-19 11:12:21 UTC 16.3K followers, 39.7K engagements

"As long as GPT-5 is a noticeable step up from o3 I don't care what models they used It would only make my heart a bit happier if I knew there is a reasoning Chonky"
@scaling01 Avatar @scaling01 on X 2025-07-20 11:22:57 UTC 16.1K followers, 1582 engagements

"Qwen about to release a 480B MoE for coding with X million context "Qwen3-Coder-480B-A35B-Instruct is a powerful coding-specialized language model excelling in code generation tool use and agentic tasks.""
@scaling01 Avatar @scaling01 on X 2025-07-22 18:55:06 UTC 16.3K followers, 123.1K engagements

"btw I don't want a model router I want to be able to select the models I use"
@scaling01 Avatar @scaling01 on X 2025-07-20 12:04:11 UTC 16.3K followers, 51.2K engagements

"The whole GPT-5 model routing thing just sounds like a way of saving compute/money and ripping of plus and pro users. Sama already talked about using a credit based system and I'm sure they also talked about raising prices at some point"
@scaling01 Avatar @scaling01 on X 2025-07-20 11:05:52 UTC 16.2K followers, 5670 engagements

"hot take: non-reasoning models are more elegant than reasoning models"
@scaling01 Avatar @scaling01 on X 2025-07-22 22:03:58 UTC 16.3K followers, 30K engagements

"A very high bar. Considering only a maximum of XX people can receive a Nobel Prize each year I would argue that this is a superhuman feat"
@scaling01 Avatar @scaling01 on X 2025-07-19 08:36:56 UTC 16.1K followers, 3751 engagements

"what a joke xAI valued at 200B Anthropic latest valuation was 61.5B xAI revenue 0B Anthropic revenue 4B"
@scaling01 Avatar @scaling01 on X 2025-07-11 20:45:19 UTC 16.3K followers, 579K engagements

"the spice fields are part of my empire period"
@scaling01 Avatar @scaling01 on X 2025-07-18 15:00:00 UTC 16.1K followers, 2441 engagements

"The White House doesn't want you to use DeepSeek Qwen or Kimi by focusing evaluations on censorship and alignment with the CCP instead of capabilities and usefulness"
@scaling01 Avatar @scaling01 on X 2025-07-23 15:17:08 UTC 16.3K followers, 3585 engagements

"GPT-5 expectations: - SOTA on most benchmarks (#1 on my meta-benchmark) - specifically: SOTA on ARCAGI2 and METR - 2025 knowledge cutoff - longer context window (400k) - fully multimodal (text/image/audio + video input) - sane output pricing: = $XX / 1M tokens nicetohaves: - fewer hallucinations than o3 - less sycophancy - no more barrage of em dashes - clean code style no weird multiline comments"
@scaling01 Avatar @scaling01 on X 2025-07-18 22:23:45 UTC 16.2K followers, 1947 engagements

"@polynoamial @SherylHsu02 So it's like o3-pro or Grok-4 some parallel approach with multiple agents"
@scaling01 Avatar @scaling01 on X 2025-07-19 08:08:32 UTC 16.1K followers, 4073 engagements

"GPT-4o was never the "top" model lore accurate timeline for the newbies: GPT-3.5 GPT-4 GPT-4 Turbo Opus X Sonnet XXX o1 Gemini XXX Pro and right now im so fucking confused what the actual best model is Gemini XXX Pro Opus X and o3 are all sick but probably o3-high / o3-pro if you want to count parallel cheating"
@scaling01 Avatar @scaling01 on X 2025-07-21 21:39:57 UTC 16.3K followers, 5548 engagements

"@OfficialLoganK Can we expect a delivery in July or later in August"
@scaling01 Avatar @scaling01 on X 2025-07-20 17:49:25 UTC 16.1K followers, 1475 engagements

"Grok-4 falling behind Gemini XXX Pro on SimpleBench"
@scaling01 Avatar @scaling01 on X 2025-07-18 22:42:44 UTC 16.3K followers, 96.8K engagements

"Official OpenAI Agent mode benchmarks in one thread Coming to Pro Plus and Team users"
@scaling01 Avatar @scaling01 on X 2025-07-17 17:24:57 UTC 16.2K followers, 25.9K engagements

"Top OpenAI researcher says that: "We're close to AI substantially contributing to scientific discovery" after an internal model reaches Gold in the International Math Olympiad"
@scaling01 Avatar @scaling01 on X 2025-07-19 08:06:30 UTC 16.1K followers, 2257 engagements

"Kimi K2 is here and it's insane It's the best OS non-thinking model and one of the best non-thinking models competitive with GPT-4.1 Sonnet X and Opus X X trillion params 32B active trained with Muon optimizer on 15.5T tokens It is also much cheaper than all the alternatives: $0.60/million input and $2.50/million output tokens vs GPT-4.1 at $2/$8 vs Sonnet X at $3/$15"
@scaling01 Avatar @scaling01 on X 2025-07-11 15:10:17 UTC 16K followers, 34.8K engagements
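To make the price gap concrete, a small sketch comparing per-request cost at the per-million-token rates quoted in the post. The example workload (1M input tokens, 100k output tokens) is an arbitrary assumption, and the Sonnet tier number is scrambled in the source.

```python
# Cost comparison at the per-million-token prices quoted above.
# The token counts in the example workload are assumed for illustration.

PRICES = {  # (USD per 1M input tokens, USD per 1M output tokens)
    "Kimi K2": (0.60, 2.50),
    "GPT-4.1": (2.00, 8.00),
    "Sonnet":  (3.00, 15.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost for one request at the listed per-million-token rates."""
    in_price, out_price = PRICES[model]
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price

# Example workload: 1M input tokens, 100k output tokens (assumed).
for model in PRICES:
    print(f"{model}: ${request_cost(model, 1_000_000, 100_000):.2f}")
# Kimi K2: $0.85, GPT-4.1: $2.80, Sonnet: $4.50
```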

"@luke_chaj @sama Idk I think you are memory bound with 100-200k context per user"
@scaling01 Avatar @scaling01 on X 2025-07-21 09:47:48 UTC 16.3K followers, XXX engagements

"Google got the IMO Gold without AlphaProof or AlphaGeometry with a fine-tuned version of Gemini DeepThink"
@scaling01 Avatar @scaling01 on X 2025-07-21 16:49:55 UTC 16.2K followers, 8285 engagements

"you are better off asking o3 than ChatGPT agent to build a genetically modified supervirus"
@scaling01 Avatar @scaling01 on X 2025-07-17 19:38:21 UTC 16.1K followers, 1450 engagements

"my AI predictions for 2025: - at least one lab will declare AGI and mentions ASI - Q1: Google Anthropic OpenAI META Qwen and Mistral model fiesta ( it will be heaven ) - agents / computer use takes off - release of Claude X Gemini X GPT-5 Grok X (or whatever they call their giant 5-20 trillion parameter models) - release of o3 o4 and o5 - open-source replication of o3 - the Frontier Math benchmark will be mostly solved (80%) - SWE-bench will be solved (90%) - ARC-AGI X will be mostly solved (80%) within X months of it's release - 10+ million context length models my wishful thinking: Someone"
@scaling01 Avatar @scaling01 on X 2025-01-02 00:09:26 UTC 16.3K followers, 343.3K engagements

"Remember Elon firing against OpenAI for not being open-source So where are the Grok-2 and Grok-3 weights"
@scaling01 Avatar @scaling01 on X 2025-07-11 01:40:24 UTC 16.1K followers, 129.9K engagements

"blinky blinky towers grey suburbia every kid knows this"
@scaling01 Avatar @scaling01 on X 2025-07-07 02:02:48 UTC 16.1K followers, 221.4K engagements

"ChatGPT Agent has lower performance than o3 on PaperBench SWE-Bench verified OpenAI PRs and OpenAI Research Engineer Interview questions"
@scaling01 Avatar @scaling01 on X 2025-07-17 19:42:33 UTC 16.2K followers, 5089 engagements

"Both Google DeepMind and OpenAI got a gold medal at the IMO"
@scaling01 Avatar @scaling01 on X 2025-07-19 20:02:28 UTC 16.1K followers, 9490 engagements

""AI is just autocomplete" meanwhile: Zuck offered at least XX OpenAI researchers pay packages of $XXX million"
@scaling01 Avatar @scaling01 on X 2025-07-20 19:42:27 UTC 16.3K followers, 9258 engagements

"@GregKamradt it seems like the documentation for getting started with building ARC-AGI-3 agents is outdated: the classes Agent and Observation don't exist and the init file looks completely different too"
@scaling01 Avatar @scaling01 on X 2025-07-20 22:25:53 UTC 16.2K followers, XXX engagements

"Somehow ChatGPT agent has higher hallucination rates but what is XXXXX vs XXXXX lol"
@scaling01 Avatar @scaling01 on X 2025-07-17 19:33:48 UTC 16.1K followers, 2546 engagements

"Zuck just offered himself $XX billion. he declined"
@scaling01 Avatar @scaling01 on X 2025-07-21 18:02:48 UTC 16.3K followers, 3070 engagements

"@GregKamradt maybe also add an option to use openrouter for public testing I instantly ran into rate limit as i don't use OpenAI API that much"
@scaling01 Avatar @scaling01 on X 2025-07-20 22:36:20 UTC 16.2K followers, XXX engagements

"No clue where this guy got the rest of the points and I'm not finding out without making notes and spending another hour on this"
@scaling01 Avatar @scaling01 on X 2025-07-19 19:30:19 UTC 16.1K followers, 1386 engagements

"Qwen3-235B-A22B scored XX% on ARC-AGI-1 without thinking That's the same level as Gemini XXX Pro Sonnet X or o3-low with thinking. But it might be trained on it if not then it's insane"
@scaling01 Avatar @scaling01 on X 2025-07-21 17:43:41 UTC 16.3K followers, 35.6K engagements

"Are you ready for the Grok-4 coding model Aider-Polyglot results look good Grok-4 beating o3"
@scaling01 Avatar @scaling01 on X 2025-07-12 18:41:19 UTC 16.1K followers, 4808 engagements

"HEY FUCKERS HOW ABOUT FIXING YOUR APP AND RELEASING GPT-5"
@scaling01 Avatar @scaling01 on X 2025-07-17 22:22:50 UTC 16.2K followers, 3474 engagements

"I bet in "no emoji" arena it would even beat GPT-4o and GPT-4.5 making it the best non-thinking model"
@scaling01 Avatar @scaling01 on X 2025-07-17 16:05:23 UTC 16.1K followers, 1268 engagements

"Grok-4 confirmed to have a 256K context window"
@scaling01 Avatar @scaling01 on X 2025-07-10 04:47:06 UTC 16.2K followers, 1676 engagements

"why is the input for ARC-AGI-3 in slow motion like please speed it up i don't have the whole day"
@scaling01 Avatar @scaling01 on X 2025-07-18 17:52:36 UTC 16.2K followers, 1307 engagements
