LunarCrush LLM | post/tweet::1942275969932484997

![aiDotEngineer Avatar](https://lunarcrush.com/gi/w:24/cr:twitter::1376308155626360837.png) AI Engineer [@aiDotEngineer](/creator/twitter/aiDotEngineer) on x 28K followers
Created: 2025-07-07 17:34:11 UTC

🆕 Training Agentic Reasoners

today's feature is @willccbb's triumphant return to the AIE stage RL track - now as part of @PrimeIntellect! 

A lot of agent builders are basically doing "RL by hand". He concisely explains current RL algorithms in one slide (!) but then argues that RL - particularly for open models - is stuck in math and code Q&A land

the new hotness is multi-turn agentic RL, and the new verifiers library is the ultimate toolkit for building an agent and turning it into an RL loop.

More people should be exploring building better agent models and Will + PI is enabling that for everyone!

![](https://pbs.twimg.com/media/GvRYjLnaYAE_M5s.jpg)

XXXXXX engagements

![Engagements Line Chart](https://lunarcrush.com/gi/w:600/p:tweet::1942275969932484997/c:line.svg)

**Related Topics**
[coins ai](/topic/coins-ai)

[Post Link](https://x.com/aiDotEngineer/status/1942275969932484997)
