@scaling01
"@eigenron @karpathy you basically just want a binary counter you could also use square waves but square waves are just build out of inifinitely many sinusoids so sinusoids are more fundamental + they are smooth this is a good video" @scaling01 on X 2025-07-27 20:04:25 UTC 17.6K followers, 3685 engagements
"imagine if Lobster is the open-source model" @scaling01 on X 2025-07-25 23:15:12 UTC 17.6K followers, 13.9K engagements
"Lobster is still pretty fast with 80-100tks/s maybe there's an even bigger model 👀" @scaling01 on X 2025-07-25 23:21:55 UTC 17.6K followers, 7149 engagements
"I told you Mistral is dying Qwen is our new open-source king" @scaling01 on X 2025-07-18 18:28:06 UTC 17.6K followers, 3505 engagements
"- Sama is already talking about AGI all the time not long before he says it - Q1 model fiesta: happened - agents / computer use: literally yesterday - o3 yes o4 in the coming X weeks (GPT-5) o5 end of year - o3 replication coming: Kimi-K2 reasoner or R2 - SWE-Bench XX% EOY: confident - ARC-AGI-2 XX% by end of year: coin-flip - Frontier Math 80%: unsure but with Gold IMO it seems slightly closer - 10+ million context length models: Llama-4 but not really" @scaling01 on X 2025-07-19 12:21:35 UTC 17.6K followers, 34.3K engagements
"@sama That's a lot of agents Assuming o3 fits on one 8xH100 you would have 125k instances of o3. Batch size won't be huge with reasoning models maybe 4-8 So 500k to X million o3 agents could run on those newly deployed GPUs. But there will also be plenty of H200 B200 and GB200" @scaling01 on X 2025-07-20 22:52:45 UTC 17.6K followers, 27.5K engagements
"The thing is the model says its from OpenAI but Google and Anthropic models no longer say they are from OpenAI. They have their own data now so it's very unlikely its one of them. Probabilities from which lab it is: XX% OpenAI X% one of the chinese labs they are still heavily using OAI data X% completely new lab" @scaling01 on X 2025-07-25 23:14:03 UTC 17.6K followers, 27.6K engagements
"and yes I believe they are based on GPT-4.1" @scaling01 on X 2025-07-25 15:10:06 UTC 17.6K followers, 9695 engagements
"decent if you compare it only to non-reasoning models but nowhere near the XX% the Qwen team reported but pretty bad against o4-mini or Gemini XXX Flash" @scaling01 on X 2025-07-24 18:46:16 UTC 17.6K followers, 5782 engagements
"The new OpenAI models are insane. Sonnet straight up gets destroyed by Summit in this comparison. (CLICK ON THE POST TO SEE THE CORRECT ORDER OF THE IMAGES) Summit - Desert with lone tree: Sonnet- Desert with lone tree: Summit - Landscape: Sonnet - Lanscape:" @scaling01 on X 2025-07-27 00:51:06 UTC 17.6K followers, 53.6K engagements
"this is what I imagine the self-driving looks like in all other cars except Waymo and Tesla" @scaling01 on X 2025-07-24 09:05:40 UTC 17.4K followers, 2454 engagements
"Generative AI is the fastest adopted technology in history" @scaling01 on X 2025-07-17 22:33:47 UTC 17.6K followers, 9967 engagements
"GPT-4 was released XXX days or XXX years ago we are getting old" @scaling01 on X 2025-07-14 22:02:40 UTC 17.5K followers, 2916 engagements
"Kimi-K2 Technical Report is out to reveal all the secrets" @scaling01 on X 2025-07-21 19:52:13 UTC 17.6K followers, 24.9K engagements
"the hype will be off the charts if it's the open-source model and GPT-5 is even stronger" @scaling01 on X 2025-07-25 23:53:26 UTC 17.6K followers, 23K engagements
"o4-mini is a bit cleaner but overall design and details is easily a win for Qwen3-Coder" @scaling01 on X 2025-07-22 19:51:52 UTC 17.6K followers, XXX engagements
"@sama the next X weeks are a good time to release all your models :)" @scaling01 on X 2025-07-19 14:22:27 UTC 17.6K followers, 11.7K engagements
"what if I told you that OpenAI Google Anthropic and xAI will all be working together in a few years" @scaling01 on X 2025-07-22 15:49:54 UTC 17.6K followers, 232.9K engagements
"I made it on the ARC-AGI-3 leaderboard I honestly don't know how people got below XXX I made a few mistakes but XXX of them" @scaling01 on X 2025-07-19 14:27:07 UTC 17.6K followers, 4275 engagements
"GPT-5 casually building cookie clicker with all features in X minutes" @scaling01 on X 2025-07-25 18:16:16 UTC 17.6K followers, 74.7K engagements
"There is a XX% chance that ChatGPT agent will actually gamble away your life savings if you asked it" @scaling01 on X 2025-07-17 19:36:27 UTC 17.4K followers, 16.7K engagements
"got lucky two times in a row zenith is definitely a thinking model and not a "-mini" model other models like o4-mini o3 and Opus-4 Thinking can solve the multiplication question but a lot of the chinese models Grok-4 and even Gemini XXX Pro can't do this the double base64 encoding thing tells me it's not a mini model because they completely break apart with a second layer of encoding - but zenith gets most of the message right to this day the only models that can reliably do double base64 encoding are Sonnet and Opus" @scaling01 on X 2025-07-26 21:56:21 UTC 17.6K followers, 28.6K engagements
"The vibe shift has been incredible to watch over the last few days. We went from GPT-5 will be disappointing to GPT-5 will be another GPT-4 moment" @scaling01 on X 2025-07-26 17:48:31 UTC 17.6K followers, 57.9K engagements
"be me Theo Von interviewing Sam Altman ask him what nuclear fusion is Sam: "smashing atoms together" fuck cool i bet a lot of people would watch that Sam: "it's pretty hard to watch two atoms" DUDE WHAT IF ME MAKE IT LIKE THESE SPERM RACES" @scaling01 on X 2025-07-23 20:27:31 UTC 17.6K followers, 3430 engagements
"and subscribe to the best AI channel on YouTube:" @scaling01 on X 2025-07-18 22:44:12 UTC 17.5K followers, 7502 engagements
"Anthropic valued at over $150B roast xAI valuation XX times compare constantly to Anthropic spam "ANTHROPIC IS UNDERVALUED" button profit" @scaling01 on X 2025-07-25 20:27:22 UTC 17.6K followers, 7052 engagements
"The White House just released America's AI Action Plan. I've read the whole thing. This document makes it very clear that this is about "winning the AI race" and even compare it to the cold war era. It's a paper about national-security Here are the most important quotes: - Just like we won the space race it is imperative that the United States and its allies win this race. - Americas AI Action Plan has three pillars: innovation infrastructure and international diplomacy and security. Pillar I - Innovation: - Led by the Department of Commerce revise the NIST AI Risk Management Framework to" @scaling01 on X 2025-07-23 15:06:54 UTC 17.6K followers, 47.1K engagements
"@heyruchir GPT-4.5 is 5T but it's kinda old" @scaling01 on X 2025-07-25 19:29:12 UTC 17.6K followers, 1907 engagements
"40% chance that this is the prologue to WW3 happening around 2028-2032" @scaling01 on X 2025-07-23 22:30:25 UTC 17.6K followers, 5629 engagements
"I just prompted a bunch of LLMs to create a website that explains Transformers to kids. They were honestly all terrible and confusing. Somehow I didn't get Summit or Lobster in like XX retries. Would love to see how GPT-5 models handle this task what do you think is the correct answer here" @scaling01 on X 2025-07-27 23:30:32 UTC 17.6K followers, 2801 engagements
"I don't think Americans understand how far ahead Chinas infrastructure is" @scaling01 on X 2025-07-06 23:38:17 UTC 17.6K followers, 18.6M engagements
"i have tried like XX times to get summit and zenith never gotten zenith and X times summit but it timed out every single time and no response" @scaling01 on X 2025-07-26 00:23:39 UTC 17.6K followers, 8974 engagements
"Zenith doesn't seem as good in SVGs as Summit. Summit: Zenith:" @scaling01 on X 2025-07-27 00:57:30 UTC 17.6K followers, 13.2K engagements
"would be hella awkward if Lobster Nectarine and Starfish weren't the GPT-5 models good luck to your NVDA stock if they aren't GPT-5" @scaling01 on X 2025-07-25 22:15:18 UTC 17.6K followers, 33.1K engagements
"anybody know what model kraken-072125-2 is on lmarena i just tried the SVG thing and it says Claude is it really Anthropic or some chinese lab" @scaling01 on X 2025-07-26 23:40:10 UTC 17.6K followers, 9689 engagements
"The White House finally saw the chart: "American energy capacity has stagnated since the 1970s while China has rapidly built out their grid. Americas path to AI dominance depends on changing this troubling trend" - Quote from the AI Action Plan by the White House" @scaling01 on X 2025-07-23 15:11:21 UTC 17.3K followers, 2433 engagements
"Inverse Scaling in Test-Time Compute by Anthropic So are reasoning models cooked No they cited the Apple Tower of Hanoi paper. And it looks more like an Anthropic skill issue to me since o3's performance decreases in only X benchmark while Opus X has decreased performance in X benchmarks" @scaling01 on X 2025-07-22 11:49:39 UTC 17.6K followers, 1730 engagements
"Official HLE scores for Grok-4 and Grok-4 Heavy destroying o3 and Gemini XXX Pro" @scaling01 on X 2025-07-10 04:33:43 UTC 17.6K followers, 1395 engagements
"more hillclimbing on ARC-AGI-3 If the guy who got XXX did it unassisted without computer or notes that would be impressive" @scaling01 on X 2025-07-19 17:01:32 UTC 17.6K followers, 3125 engagements
"what the fuck is Gemini XXX Pro doing down there" @scaling01 on X 2025-07-21 21:53:30 UTC 17.6K followers, 2777 engagements
"You know Bruce Wayne and Tony Stark "only" had $XX billion and were superheroes Imagine what Elon can do with $XXX billion" @scaling01 on X 2025-07-22 22:30:00 UTC 17.6K followers, 3083 engagements
"@__int32 because they want to kill Anthropic" @scaling01 on X 2025-07-25 17:55:47 UTC 17.6K followers, 2821 engagements
"two labs independently got IMO gold and you are gooning anon" @scaling01 on X 2025-07-19 12:12:27 UTC 17.4K followers, 5196 engagements
"even harder question and summit got it not a huge signal but at least we know they are better at multiplication than pretty much all model" @scaling01 on X 2025-07-26 22:09:08 UTC 17.6K followers, 17.7K engagements
"I can't stop playing with this tool that GPT-5 made it's so mesmerizing" @scaling01 on X 2025-07-26 20:58:59 UTC 17.6K followers, 39.6K engagements
"1e28 flops is XXXX days on this machine you could train GPT-4 in a few hours" @scaling01 on X 2025-07-22 17:35:48 UTC 17.4K followers, 14.1K engagements
"@MinuteMovies3 not sure the whole o3-alpha thing is confusing af" @scaling01 on X 2025-07-25 15:11:08 UTC 17.6K followers, 3644 engagements
"idk does Sonnet win this Summit - NY skyline: Sonnet - NY skyline:" @scaling01 on X 2025-07-27 00:54:55 UTC 17.6K followers, 5103 engagements
"Lobster - GPT-5 Nectarine - GPT-5-mini Starfish - GPT-5-nano" @scaling01 on X 2025-07-25 15:04:14 UTC 17.6K followers, 142.9K engagements
"Zuck showing the engineers his plans for datacenters in tents" @scaling01 on X 2025-07-24 20:43:09 UTC 17.6K followers, 1547 engagements
"with GPT-5 I mean Lobster on web.lmarena dot ai if Lobster is anything else but GPT-5 then whoever cooked this shit up will be my new GOAT" @scaling01 on X 2025-07-25 23:05:32 UTC 17.6K followers, 31.8K engagements
"@shaunralston not verbatim but XX% something like this: Create an interactive website for kids that explains the Transformer architecture and it's components in a playful way. Be creative" @scaling01 on X 2025-07-27 23:34:00 UTC 17.6K followers, XXX engagements
"The most disappointing thing would be if we safely reached ASI; it figures out the fundamental laws of the universe but we couldn't do anything cool with them. Imagine being stuck on a planet in an infinite doomed universe" @scaling01 on X 2025-07-24 22:13:59 UTC 17.6K followers, 4569 engagements
"@sama are labor camps $120B for anti-immigrantion tax cuts for the rich no more medicaid and no more snap also very cool" @scaling01 on X 2025-07-05 15:29:52 UTC 17.6K followers, 29.2K engagements
"Religious believers and LLM doubters are so similar. They can't recognize change and their world has only shrunk never grown" @scaling01 on X 2025-07-20 02:00:10 UTC 17.6K followers, 2448 engagements
"@eigenron if your data doesn't have any positional structure then LLMs won't learn those sinusoidal patterns because they don't improve loss but I think all languages have some kind of structure like subject-verb-object or variants of this" @scaling01 on X 2025-07-27 23:22:13 UTC 17.6K followers, XXX engagements
"basically this: Create a stunning interactive animation of a neural network or brain-like graph structureuse artistic colors smooth transitions and beautiful visuals. The page should feel alive immersive and impressive with no buttonsjust scrolling or continuous animation. Make it breathtaking. then i just prompted for improvements but Lobster was 100x from the start" @scaling01 on X 2025-07-25 23:06:44 UTC 17.6K followers, 14.8K engagements
"I think it will get a new highscore on LisanBench" @scaling01 on X 2025-07-25 15:19:58 UTC 17.6K followers, 6342 engagements
"@Infopulsed So do you also think that we are doomed Because in theory the AI arms race is starting and we are all fucked in 3-5 years" @scaling01 on X 2025-07-22 08:49:54 UTC 17.6K followers, XX engagements
"Seems like the new models are really based on GPT-4.1 series. They have the same knowledge cut-off of June 2024" @scaling01 on X 2025-07-27 02:03:54 UTC 17.6K followers, 28.6K engagements
"I hope GPT-5 will finally be smart enough to have real conversations without inconsistencies" @scaling01 on X 2025-07-24 21:55:15 UTC 17.6K followers, 2522 engagements
"Zuck just offered Sam Altman a $X trillion pay package to join Meta's superintelligence team" @scaling01 on X 2025-07-21 15:50:22 UTC 17.5K followers, 33.2K engagements
"another interesting benchmark by Lech and o3 is at the top" @scaling01 on X 2025-07-22 17:46:14 UTC 17.6K followers, 1581 engagements
"Is anyone actually using these goofy ass glasses" @scaling01 on X 2025-07-22 01:35:02 UTC 17.6K followers, 1220 engagements
"the only thing you need to understand about AI: it's currently improving exponentially in ALL domains - knowledge coding mathematics self-driving cars computer-use browsing video understanding" @scaling01 on X 2025-07-21 22:12:40 UTC 17.6K followers, 2480 engagements
"He can't be serious. He posted this right before OpenAI announced they got Gold in the IMO. Truly the Jim Cramer of AI" @scaling01 on X 2025-07-19 11:12:21 UTC 17.6K followers, 39.8K engagements
"Qwen about to release a 480B MoE for coding with X million context "Qwen3-Coder-480B-A35B-Instruct is a powerful coding-specialized language model excelling in code generation tool use and agentic tasks."" @scaling01 on X 2025-07-22 18:55:06 UTC 17.6K followers, 131.8K engagements
"btw I don't want a model router I want to be able to select the models I use" @scaling01 on X 2025-07-20 12:04:11 UTC 17.6K followers, 51.7K engagements
"hot take: non-reasoning models are more elegant than reasoning models" @scaling01 on X 2025-07-22 22:03:58 UTC 17.4K followers, 43.5K engagements
"I'm back and Gemini XXX Pro is still the king (no glaze) I did some more manual data cleaning and scrapped the shitty "average scaled score" and replaced it with Glicko-2 rating system with params: INITIAL_RATING = 1500 INITIAL_RD = XXX INITIAL_VOL = XXXX TAU () = XXX Furthermore I increased the minimum number of appearances from X to XX benchmarks to make it more stable. The labels show the lower XX% ratings (a conservative lower skill estimate) and in brackets the number of benchmarks the model appeared in. Below this post I attached the full table with mu sigma lower XX% ratings and number" @scaling01 on X 2025-05-05 13:50:54 UTC 17.5K followers, 65.1K engagements
"Why is Grok-4 locked behind the Premium+ subscription Throw us Premium plebs a bone Elon" @scaling01 on X 2025-07-27 13:13:57 UTC 17.6K followers, 35.6K engagements
"Never mind. They still suck at creating sheet music But I think they might be a bit better at respecting the time signature" @scaling01 on X 2025-07-27 11:28:57 UTC 17.6K followers, 5252 engagements
"GPT-5 expectations: - SOTA on most benchmarks (#1 on my meta-benchmark) - specifically: SOTA on ARCAGI2 and METR - 2025 knowledge cutoff - longer context window (400k) - fully multimodal (text/image/audio + video input) - sane output pricing: = $XX / 1M tokens nicetohaves: - fewer hallucinations than o3 - less sycophancy - no more barrage of em dashes - clean code style no weird multiline comments" @scaling01 on X 2025-07-18 22:23:45 UTC 17.5K followers, 1964 engagements
"Introducing LisanBench LisanBench is a simple scalable and precise benchmark designed to evaluate large language models on knowledge forward-planning constraint adherence memory and attention and long context reasoning and "stamina". "I see possible futures all at once. Our enemies are all around us and in so many futures they prevail. But I do see a way there is a narrow way through." - Paul Atreides How it works: Models are given a starting English word and must generate the longest possible sequence of valid English words. Each subsequent word in the chain must: - Differ from the previous" @scaling01 on X 2025-05-30 17:54:52 UTC 17.6K followers, 78.9K engagements
"Claude XXX is going to be released in the next X months" @scaling01 on X 2025-07-24 19:49:53 UTC 17.6K followers, 1446 engagements
"Grok-4 falling behind Gemini XXX Pro on SimpleBench" @scaling01 on X 2025-07-18 22:42:44 UTC 17.6K followers, 97K engagements
"@petergostev i didn't get a chance to test o3-alpha" @scaling01 on X 2025-07-25 23:15:48 UTC 17.6K followers, 2492 engagements
"GPT-5 vs Grok-4 exact same prompts wildly different output in web lmarena" @scaling01 on X 2025-07-25 22:52:11 UTC 17.6K followers, 550.4K engagements
"The first step towards nationalizing AI developments just happened. "Priority access (for the Department of Defense) to computing resources in the event of a national emergency so that DOD is prepared to fully leverage these technologies during a significant conflict"" @scaling01 on X 2025-07-23 15:13:23 UTC 17.3K followers, 3298 engagements
"Now they have lost it completely. 3x valuation of Anthropic and still not even a fraction of the revenue. $80B in March magic farts and giggles happen $200B in July" @scaling01 on X 2025-07-11 20:42:25 UTC 17.5K followers, 16.5K engagements
"Qwen3-235B-Thinking caught up to Gemini XXX Pro and o3 (at least on benchmarks)" @scaling01 on X 2025-07-25 10:37:17 UTC 17.6K followers, 10.7K engagements
"played X hour with GPT-5 on lmarena literally same prompts for both models and Grok-4 just falls apart while GPT-5 creates art" @scaling01 on X 2025-07-25 21:49:59 UTC 17.6K followers, 293.1K engagements
"I have a low probability on misaligned AI killing us all because it's much more likely we will do it ourselves and it's going to happen within the next 5-10 years If we are still alive in XX years my probability of humanity becoming a type X civilization rise astronomically" @scaling01 on X 2025-07-23 22:47:13 UTC 17.5K followers, 2711 engagements
"you are better off asking o3 than ChatGPT agent to build a genetically modified supervirus" @scaling01 on X 2025-07-17 19:38:21 UTC 17.6K followers, 1492 engagements
"This only got X likes back then but it's now very real. The US is in an AI arms race with China" @scaling01 on X 2025-07-23 14:45:57 UTC 17.6K followers, 2678 engagements
"my AI predictions for 2025: - at least one lab will declare AGI and mentions ASI - Q1: Google Anthropic OpenAI META Qwen and Mistral model fiesta ( it will be heaven ) - agents / computer use takes off - release of Claude X Gemini X GPT-5 Grok X (or whatever they call their giant 5-20 trillion parameter models) - release of o3 o4 and o5 - open-source replication of o3 - the Frontier Math benchmark will be mostly solved (80%) - SWE-bench will be solved (90%) - ARC-AGI X will be mostly solved (80%) within X months of it's release - 10+ million context length models my wishful thinking: Someone" @scaling01 on X 2025-01-02 00:09:26 UTC 17.6K followers, 365.3K engagements
"GPT-5 DELAYED UNTIL AUGUST OPENAI OPEN-SOURCE MODEL NEXT WEEK GPT-5 GPT-5 mini will be available in ChatGPT GPT-5 nano only in the API" @scaling01 on X 2025-07-24 16:34:41 UTC 17.6K followers, 35.1K engagements
"@eigenron yes NNs can learn any function try to predict the next token but you are just given a bag of words instead of an ordered list of words it's pretty hard so learning positional embedding reduces loss by a lot" @scaling01 on X 2025-07-27 23:04:06 UTC 17.6K followers, XXX engagements
"ChatGPT Agent has lower performance than o3 on PaperBench SWE-Bench verified OpenAI PRs and OpenAI Research Engineer Interview questions" @scaling01 on X 2025-07-17 19:42:33 UTC 17.6K followers, 5120 engagements
""AI is just autocomplete" meanwhile: Zuck offered at least XX OpenAI researchers pay packages of $XXX million" @scaling01 on X 2025-07-20 19:42:27 UTC 17.4K followers, 9292 engagements
"Tents as datacenters. Now for real: X storm and it's over" @scaling01 on X 2025-07-24 14:40:24 UTC 17.4K followers, 2547 engagements
"@_ueaj can't compare tks/s to prod models we don't know their inference stack they could be using GB200 with batchsize X or other ridiculous setups as long as they are in testing phase" @scaling01 on X 2025-07-25 23:28:02 UTC 17.6K followers, XXX engagements
"yes and I told you so that you understand that using sinusoids is the simplest and most general form for positional embeddings square waves/binary encoding is worse simply setting one component to the index doesn't work either try to come up with anything else that has the properties we want and is easy to learn considering the transformer arch" @scaling01 on X 2025-07-27 22:05:39 UTC 17.6K followers, XXX engagements
"Anthropic seems to be falling behind in everything that is not coding" @scaling01 on X 2025-07-25 10:51:06 UTC 17.6K followers, 5233 engagements
"Somehow ChatGPT agent has higher hallucination rates but what is XXXXX vs XXXXX lol" @scaling01 on X 2025-07-17 19:33:48 UTC 17.5K followers, 2573 engagements
""You're sheltering chinese AI researchers are you not"" @scaling01 on X 2025-07-23 12:30:21 UTC 17.6K followers, 232.3K engagements
"good night it's going to be an exciting week" @scaling01 on X 2025-07-28 02:39:17 UTC 17.6K followers, 5361 engagements
"HEY FUCKERS HOW ABOUT FIXING YOUR APP AND RELEASING GPT-5" @scaling01 on X 2025-07-17 22:22:50 UTC 17.6K followers, 3756 engagements
"I bet in "no emoji" arena it would even beat GPT-4o and GPT-4.5 making it the best non-thinking model" @scaling01 on X 2025-07-17 16:05:23 UTC 17.6K followers, 1639 engagements
"I actually don't like it Sounds like an arrogant kid that just learned about metaphors/similes but I'm autistic and have never read a book in my life so what do I know" @scaling01 on X 2025-07-27 03:02:25 UTC 17.6K followers, 6699 engagements
"why is the input for ARC-AGI-3 in slow motion like please speed it up i don't have the whole day" @scaling01 on X 2025-07-18 17:52:36 UTC 17.5K followers, 1322 engagements
"knowledge benchmarks are so fucking useless they are all faulty and contaminated" @scaling01 on X 2025-07-24 18:35:09 UTC 17.6K followers, 1845 engagements