Simon Willison (@simonw) on X, 111.2K followers. Created: 2025-07-19 16:29:13 UTC

The most notable thing about this result is that this unnamed experimental reasoning model achieved this score without any tool usage at all - it looks like it's just another classic next-token-predicting LLM with a bunch of reinforcement learning layered on top

Related Topics the worlds coins ai open ai llm