joao (@jay_wooow) on X, 8480 followers
Created: 2025-07-18 15:54:25 UTC
You cannot learn that which you cannot sample.
Crank up the temperature to train more curious agents. Simple and effective.
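To make the temperature point concrete, here is a minimal sketch of temperature-scaled softmax sampling probabilities. The logits are made-up scores for three candidate actions, not values from the paper, and `softmax_with_temperature` is an illustrative helper name rather than anything from the authors' code.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Return sampling probabilities for `logits` divided by `temperature`."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate actions (assumption for illustration).
logits = [4.0, 2.0, 0.0]

cold = softmax_with_temperature(logits, 0.5)  # sharper: mass piles onto the top action
hot = softmax_with_temperature(logits, 2.0)   # flatter: low-scoring actions get sampled more

print([round(p, 3) for p in cold])
print([round(p, 3) for p in hot])
```

At the higher temperature the least likely action receives more probability mass, so trajectories that would almost never appear at a low temperature start showing up in the self-generated data.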
From “Training a Generally Curious Agent”:
We design a diverse set of tasks where an LLM agent needs strategic information gathering to succeed, then train an LLM on self-generated data to prefer higher performing trajectories. The resulting behavior learned can transfer zero-shot to unseen tasks, showcasing its potential to build general decision making agents.
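One way to read "prefer higher performing trajectories" is as building (chosen, rejected) pairs from sampled trajectories on the same task. The sketch below assumes that reading. The `Trajectory` fields, the task names, the scores, and the `min_gap` parameter are all hypothetical, and the quoted passage does not say which training objective consumes such pairs.

```python
from dataclasses import dataclass
from itertools import combinations

@dataclass
class Trajectory:
    task_id: str
    actions: list
    score: float  # hypothetical task-success score; higher is better

def build_preference_pairs(trajectories, min_gap=0.0):
    """Pair each higher-scoring trajectory with a lower-scoring one from the same task."""
    by_task = {}
    for traj in trajectories:
        by_task.setdefault(traj.task_id, []).append(traj)

    pairs = []
    for group in by_task.values():
        for a, b in combinations(group, 2):
            if abs(a.score - b.score) <= min_gap:
                continue  # ties or near-ties give no clear preference
            chosen, rejected = (a, b) if a.score > b.score else (b, a)
            pairs.append((chosen, rejected))
    return pairs

# Made-up self-generated samples (assumption for illustration).
sample = [
    Trajectory("twenty_questions", ["ask A", "guess"], 1.0),
    Trajectory("twenty_questions", ["guess"], 0.0),
    Trajectory("twenty_questions", ["ask B", "guess"], 1.0),
    Trajectory("bandit", ["pull 1"], 0.2),
]

pairs = build_preference_pairs(sample)
for chosen, rejected in pairs:
    print(chosen.task_id, chosen.actions, ">", rejected.actions)
```

A task with only one sampled trajectory, or only tied scores, yields no pairs. That is the tweet's point in miniature: if sampling never produces a better trajectory, there is nothing to prefer.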
Related Topics: llm
Post link: https://x.com/jay_wooow/status/1946237128993656863