[GUEST ACCESS MODE: Data is scrambled or limited to provide examples. Make requests using your API key to unlock full data. Check https://lunarcrush.ai/auth for authentication information.]

Lisan al Gaib [@scaling01](/creator/twitter/scaling01) on x 17.7K followers
Created: 2025-07-22 11:49:39 UTC

Inverse Scaling in Test-Time Compute by Anthropic
So are reasoning models cooked?
No, they cited the Apple Tower of Hanoi paper.
And it looks more like an Anthropic skill issue to me, since o3's performance decreases in only X benchmark, while Opus X has decreased performance in X benchmarks.

XXXXX engagements

**Related Topics** [hanoi](/topic/hanoi) [gaib](/topic/gaib)

[Post Link](https://x.com/scaling01/status/1947625084513845429)