AI Engineer (@aiDotEngineer) on X, 28K followers
Created: 2025-07-07 17:34:11 UTC
🆕 Training Agentic Reasoners
Today's feature is @willccbb's triumphant return to the AIE stage RL track - now as part of @PrimeIntellect!
A lot of agent builders are basically doing "RL by hand". He concisely explains current RL algorithms in one slide (!) but then argues that RL - particularly for open models - is stuck in math and code Q&A land.
The new hotness is multi-turn agentic RL, and the new verifiers library is the ultimate toolkit for building an agent and turning it into an RL loop.
More people should be exploring building better agent models and Will + PI is enabling that for everyone!
Post Link: https://x.com/aiDotEngineer/status/1942275969932484997