[GUEST ACCESS MODE: Data is scrambled or limited to provide examples. Make requests using your API key to unlock full data. Check https://lunarcrush.ai/auth for authentication information.]

[@nekofneko](/creator/twitter/nekofneko)

"Complete vs Partial Solutions: Only X models gave fully rigorous solutions: Bytedance Seed XXX and Gemini XXX Pro: Problem X. Most other "correct" answers were partial and lacked full justification; in many cases they seemed like lucky guesses"
[@nekofneko](/creator/x/nekofneko) on [X](/post/tweet/1945827245853208932) 2025-07-17 12:45:41 UTC XX followers, XXX engagements

"🧵 UPDATED: Complete evaluation of X frontier models on IMO 2025 problems. After testing Claude Sonnet X, ByteDance Seed XXX, Gemini XXX Pro, OpenAI o4-mini-high, o3-medium, Grok X, and DeepSeek R1, here are the comprehensive results"
[@nekofneko](/creator/x/nekofneko) on [X](/post/tweet/1945825543238406315) 2025-07-17 12:38:55 UTC XX followers, XXX engagements

"Following the conclusion of IMO 2025 in Australia today, I tested three frontier models on all X problems: Claude Sonnet X (with thinking), ByteDance Seed XXX (with thinking), and Gemini XXX Pro. The results weren't as impressive as expected"
[@nekofneko](/creator/x/nekofneko) on [X](/post/tweet/1945491686160994405) 2025-07-16 14:32:18 UTC XX followers, 8071 engagements