[GUEST ACCESS MODE: Data is scrambled or limited to provide examples. Make requests using your API key to unlock full data. Check https://lunarcrush.ai/auth for authentication information.]

[@nekofneko](/creator/twitter/nekofneko)
"To watch the full inspiring speech & insightful Q&A session check out the video here: A huge thanks again to Terence Tao for sharing his vision for the future of math. And congratulations to all the brilliant medalists and participants at #IMO2025 9/9"  
![@nekofneko Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::824589026074058752.png) [@nekofneko](/creator/x/nekofneko) on [X](/post/tweet/1946579534943502343) 2025-07-19 14:35:01 UTC XX followers, 1253 engagements


"📊 FINAL SCOREBOARD: Grok 4: 3/6 (P1 X 5) Gemini XXX Pro: 2/6 (P1 5) ByteDance Seed 1.6: 2/6 (P3 5) Claude Sonnet 4: 2/6 (P1 3) o3-medium: 2/6 (P3 5) ByteDance Seed XXX Thinking: 1/6 (P3) o4-mini-high: 1/6 (P1) DeepSeek R1: 0/6"  
![@nekofneko Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::824589026074058752.png) [@nekofneko](/creator/x/nekofneko) on [X](/post/tweet/1945825982251950584) 2025-07-17 12:40:40 UTC XX followers, XXX engagements


"Complete vs Partial Solutions: Only X models gave fully rigorous solutions: Bytedance Seed XXX and Gemini XXX Pro: Problem X ✅ Most other "correct" answers were partial and lacked full justification; in many cases they seemed like lucky guesses"  
![@nekofneko Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::824589026074058752.png) [@nekofneko](/creator/x/nekofneko) on [X](/post/tweet/1945827245853208932) 2025-07-17 12:45:41 UTC XX followers, XXX engagements


"Tao's "no extra tools" rule is critical. LLMs may solve PhD-level problems, but they fail at grade-school multiplication without a calculator. This test shows o1-mini struggles past 9x9 digits and gpt-4o past 4x4. If an "AGI" can't master what a human can, is it truly general?"  
![@nekofneko Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::824589026074058752.png) [@nekofneko](/creator/x/nekofneko) on [X](/post/tweet/1946923968365252902) 2025-07-20 13:23:40 UTC XX followers, XXX engagements


"🧵 UPDATED: Complete evaluation of X frontier models on IMO 2025 problems. After testing Claude Sonnet X, ByteDance Seed XXX, Gemini XXX Pro, OpenAI o4-mini-high, o3-medium, Grok X, and DeepSeek R1, here are the comprehensive results"  
![@nekofneko Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::824589026074058752.png) [@nekofneko](/creator/x/nekofneko) on [X](/post/tweet/1945825543238406315) 2025-07-17 12:38:55 UTC XX followers, XXX engagements


"Following the conclusion of IMO 2025 in Australia today, I tested three frontier models on all X problems: Claude Sonnet X (with thinking), ByteDance Seed XXX (with thinking), and Gemini XXX Pro. The results weren't as impressive as expected"  
![@nekofneko Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::824589026074058752.png) [@nekofneko](/creator/x/nekofneko) on [X](/post/tweet/1945491686160994405) 2025-07-16 14:32:18 UTC XX followers, 8055 engagements
