LunarCrush LLM | post/tweet::1945930814727831738

[GUEST ACCESS MODE: Data is scrambled or limited to provide examples. Make requests using your API key to unlock full data. Check https://lunarcrush.ai/auth for authentication information.]

![rohanpaul_ai Avatar](https://lunarcrush.com/gi/w:24/cr:twitter::2588345408.png) Rohan Paul [@rohanpaul_ai](/creator/twitter/rohanpaul_ai) on x 73.6K followers
Created: 2025-07-17 19:37:14 UTC

2025 IMO (International Mathematical Olympiad) LLM results are in.

---

The benchmark's mission is the rigorous assessment of the reasoning and generalization capabilities of LLMs on new math problems that the models have not seen during training.

It applies a uniform scoring procedure so that results do not depend on anything provider-specific.

During evaluation, each model tackles every problem X times, and MathArena reports the average score together with the total cost in USD for those runs.
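To make that reporting concrete, here is a minimal Python sketch of the arithmetic. It is not MathArena's actual code. The `Run` record, the `summarize` function, and all numbers are hypothetical, and it assumes "average score" means each problem's mean over its attempts, summed across problems.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Run:
    """One attempt by a model at one problem (hypothetical record layout)."""
    problem: str
    score: float     # points awarded to this attempt
    cost_usd: float  # API cost of this attempt


def summarize(runs):
    """Return (average score, total cost in USD) for one model.

    Assumption: the average score is the per-problem mean over attempts,
    summed across problems. The cost is summed over every attempt.
    """
    scores_by_problem = {}
    for run in runs:
        scores_by_problem.setdefault(run.problem, []).append(run.score)
    average_score = sum(mean(scores) for scores in scores_by_problem.values())
    total_cost = sum(run.cost_usd for run in runs)
    return average_score, total_cost


# Made-up data: two problems, two attempts each.
example_runs = [
    Run("P1", 7.0, 0.5),
    Run("P1", 3.0, 0.5),
    Run("P2", 0.0, 0.25),
    Run("P2", 2.0, 0.25),
]
average_score, total_cost = summarize(example_runs)
print(average_score, total_cost)  # 6.0 1.5
```

Averaging over repeated attempts smooths out run-to-run variance, while summing cost over every attempt shows what that repetition costs.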

![](https://pbs.twimg.com/media/GwFE95obcAAoEuU.jpg)

XXXXX engagements

![Engagements Line Chart](https://lunarcrush.com/gi/w:600/p:tweet::1945930814727831738/c:line.svg)

**Related Topics**
[capabilities](/topic/capabilities)
[llm](/topic/llm)

[Post Link](https://x.com/rohanpaul_ai/status/1945930814727831738)
