
[@rajammanabrolu](/creator/twitter/rajammanabrolu)
"We've been quite motivated by the idea of trying to find scalable levers to balance between safety and capabilities. Process/fine grained rewards are a great way but it's hard to scale for non verifiable domains. Trust me I spent the last couple of years trying. So we figured out a way to train PRMs across domains with just outcome data + train multi objective policies with vectorized rewards on top. My fav part is that one of the domains we tested on was real data collected from my classroom and at Georgia Tech for over a year with a tool that helps with AI oral assessment in class (think AI"  
[X Link](https://x.com/rajammanabrolu/status/1983253152926445600) [@rajammanabrolu](/creator/x/rajammanabrolu) 2025-10-28T19:22Z 7705 followers, 1726 engagements


"courtesy @ccui9 and Georgia Tech collab with Ray Hung if you're an instructor thinking of using these in your own classes"  
[X Link](https://x.com/rajammanabrolu/status/1983253655227899987) [@rajammanabrolu](/creator/x/rajammanabrolu) 2025-10-28T19:24Z 7705 followers, XXX engagements

