[GUEST ACCESS MODE: Data is scrambled or limited to provide examples. Make requests using your API key to unlock full data. Check https://lunarcrush.ai/auth for authentication information.]

![jay_wooow Avatar](https://lunarcrush.com/gi/w:24/cr:twitter::238065796.png) joao [@jay_wooow](/creator/twitter/jay_wooow) on x 8480 followers
Created: 2025-07-18 15:54:25 UTC

You cannot learn that which you cannot sample

Crank up the temperature to train more curious agents. Simple and effective.
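As a rough illustration of why temperature matters for sampling, the sketch below applies temperature scaling to a toy set of logits and compares the entropy of the resulting distributions. The logit values and function names are assumptions made up for this example and are not taken from the paper.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw logits into probabilities after dividing by a temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; higher means a more spread-out distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Toy logits for three candidate actions (illustrative values only).
logits = [2.0, 1.0, 0.1]
cold = softmax_with_temperature(logits, temperature=0.5)
hot = softmax_with_temperature(logits, temperature=1.5)

print("T=0.5:", [round(p, 3) for p in cold], "entropy:", round(entropy(cold), 3))
print("T=1.5:", [round(p, 3) for p in hot], "entropy:", round(entropy(hot), 3))
```

At the higher temperature the least likely action receives more probability mass, so rarely chosen behaviors are more likely to appear in the sampled data at all.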

From “Training a Generally Curious Agent”:

We design a diverse set of tasks where an LLM agent needs strategic information gathering to succeed, then train an LLM on self-generated data to prefer higher performing trajectories. The resulting behavior learned can transfer zero-shot to unseen tasks, showcasing its potential to build general decision making agents.
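One plausible way to turn "prefer higher performing trajectories" into training data is to pair the best and worst self-generated attempts on each task. The sketch below assumes that reading; the `Trajectory` type, its fields, and `build_preference_pairs` are hypothetical names for illustration, not the paper's actual pipeline.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Trajectory:
    task_id: str   # hypothetical task identifier
    text: str      # the agent's full interaction transcript
    score: float   # hypothetical task performance, higher is better


def build_preference_pairs(trajectories):
    """Return (chosen, rejected) pairs, one per task with a clear winner."""
    by_task = defaultdict(list)
    for traj in trajectories:
        by_task[traj.task_id].append(traj)

    pairs = []
    for group in by_task.values():
        best = max(group, key=lambda t: t.score)
        worst = min(group, key=lambda t: t.score)
        # Skip tasks where every sample scored the same: no preference signal.
        if best.score > worst.score:
            pairs.append((best, worst))
    return pairs


# Made-up samples: task "a" has a clear winner, task "b" is a tie.
samples = [
    Trajectory("a", "asks two clarifying questions, then answers", 1.0),
    Trajectory("a", "guesses immediately", 0.0),
    Trajectory("b", "attempt one", 0.5),
    Trajectory("b", "attempt two", 0.5),
]
pairs = build_preference_pairs(samples)
for chosen, rejected in pairs:
    print(chosen.task_id, "| chosen:", chosen.text, "| rejected:", rejected.text)
```

Pairs like these could feed a preference-based fine-tuning objective. Tasks where all samples score the same contribute nothing, which is one reason sampling diversity matters.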


XXX engagements

![Engagements Line Chart](https://lunarcrush.com/gi/w:600/p:tweet::1946237128993656863/c:line.svg)

**Related Topics**
[llm](/topic/llm)

[Post Link](https://x.com/jay_wooow/status/1946237128993656863)
