Simon Willison (@simonw) on X, 111.2K followers
Created: 2025-07-19 16:29:13 UTC
The most notable thing about this result is that this unnamed experimental reasoning model achieved this score without any tool usage at all - it looks like it's just another classic next-token-predicting LLM with a bunch of reinforcement learning layered on top.
Related Topics: the worlds coins ai open ai llm