# @EthanJPerez Ethan Perez

Ethan Perez posts on X most about ai, open ai, this is, and the world. They currently have [------] followers and [--] posts still getting attention that total [-------] engagements in the last [--] hours.

### Engagements: [-------] [#](/creator/twitter::908728623988953089/interactions)

- [--] Week [---------] -73%
- [--] Month [----------] +252,316%
- [--] Months [----------] +217,060%
- [--] Year [----------] +8,070%

### Mentions: [--] [#](/creator/twitter::908728623988953089/posts_active)

- [--] Week [--] +53%
- [--] Month [--] +2,200%
- [--] Months [--] +575%
- [--] Year [--] +520%

### Followers: [------] [#](/creator/twitter::908728623988953089/followers)

- [--] Months [------] +16%
- [--] Year [------] +38%

### CreatorRank: undefined [#](/creator/twitter::908728623988953089/influencer_rank)

### Social Influence

**Social category influence** [technology brands](/list/technology-brands) [finance](/list/finance) [stocks](/list/stocks) [gaming](/list/gaming)

**Social topic influence** [ai](/topic/ai), [open ai](/topic/open-ai), [this is](/topic/this-is), [the world](/topic/the-world), [more info](/topic/more-info), [bounty](/topic/bounty), [xai](/topic/xai), [in the](/topic/in-the), [deep](/topic/deep), [card](/topic/card)

**Top assets mentioned** [Alphabet Inc Class A (GOOGL)](/topic/$googl) [Robot Consulting Co., Ltd. (LAWR)](/topic/robot)

### Top Social Posts

Top posts by engagements in the last [--] hours

"It takes a lot of human ratings to align language models with human preferences. We found a way to learn from language feedback (instead of ratings), since language conveys more info about human preferences. Our algo learns with just [---] samples of feedback. Check out our new paper: Can we train LMs with *language* feedback? We found an algo for just that.
We finetune GPT-3 to human-level summarization with only [---] samples of feedback, w/ @jaa_campos @junshernchan @_angie_chen @kchonyc @EthanJPerez. Paper: https://t.co/BBeQbFMtVi Talk: https://t.co/48uwmwakOH https://t.co/0ukmvzUl6O" [X Link](https://x.com/EthanJPerez/status/1521174294822154241) 2022-05-02T17:06Z [----] followers, [---] engagements

"These were really great talks and clear explanations of why AI alignment might be hard (and an impressive set of speakers). I really enjoyed all of the talks and would highly recommend them; maybe one of the best resources for learning about alignment IMO" [X Link](https://x.com/EthanJPerez/status/1698140428200255564) 2023-09-03T01:06Z [----] followers, [----] engagements

"This is a very important result that's influenced my thinking a lot, and the paper is very well written. Highly recommend checking it out" [X Link](https://x.com/EthanJPerez/status/1699315793597731050) 2023-09-06T06:57Z [----] followers, 11.9K engagements

"We trained a humanoid robot to do yoga based on simple natural language prompts like "a humanoid robot kneeling" or "a humanoid robot doing splits." How? We use a Vision-Language Model (VLM) as a reward model. Larger VLM = better reward model." [X Link](https://x.com/EthanJPerez/status/1716523528353382411) 2023-10-23T18:34Z [----] followers, 27.9K engagements

"Motivation: RL requires hand-crafted reward functions or a reward model trained from costly human feedback. Instead, we use pretrained VLMs to specify tasks with simple natural language prompts. This is more sample-efficient and potentially more scalable" [X Link](https://x.com/EthanJPerez/status/1716523531906044116) 2023-10-23T18:34Z [----] followers, [---] engagements

"What is a VLM? Vision-Language Models (like CLIP) process both images and text.
We use VLMs as reward models for RL, tapping into the capabilities they acquired during pretraining" [X Link](https://x.com/EthanJPerez/status/1716523534791639302) 2023-10-23T18:34Z [----] followers, [---] engagements

"Highly recommend applying to the SERI MATS program if you're interested in getting into AI safety research. I'll be supervising some collaborators through MATS, along with people like @OwainEvans_UK @NeelNanda5 and @_julianmichael_" [X Link](https://x.com/EthanJPerez/status/1717287361875431737) 2023-10-25T21:09Z [----] followers, [----] engagements

"A bit late, but excited about our recent work doing a deep dive on sycophancy in LLMs. It seems like it's a general phenomenon that shows up in a variety of contexts/SOTA models, and we were also able to more clearly point to human feedback as a probable part of the cause" [X Link](https://x.com/EthanJPerez/status/1717288496279519273) 2023-10-25T21:14Z [----] followers, 10.8K engagements

"ML progress has led to debate on whether AI systems could one day be conscious, have desires, etc. Is there any way we could run experiments to inform people's views on these speculative issues? @rgblong and I sketch out a set of experiments that we think could be helpful" [X Link](https://x.com/EthanJPerez/status/1725241415779897755) 2023-11-16T19:56Z [----] followers, [----] engagements

"Excited to release this new eval testing LLM reasoning abilities on expert-written decision theory questions. This eval should help with research on cooperative AI, e.g. studying whether various interventions make LLMs behave more/less cooperatively in multi-agent settings. How do LLMs reason about playing games against copies of themselves? We made the first LLM decision theory benchmark to find out.
1/10 https://t.co/pPdZ3VyuLi" [X Link](https://x.com/EthanJPerez/status/1868784616485929131) 2024-12-16T22:25Z [----] followers, [----] engagements

"Maybe the single most important result in AI safety I've seen so far. This paper shows that in some cases Claude fakes being aligned with its training objective. If models fake alignment, how can we tell if they're actually safe? New Anthropic research: Alignment faking in large language models. In a series of experiments with Redwood Research, we found that Claude often pretends to have different views during training while actually maintaining its original preferences. https://t.co/nXjXrahBru" [X Link](https://x.com/EthanJPerez/status/1869434287004742121) 2024-12-18T17:27Z [----] followers, 15.5K engagements

"Today is the last day to apply to the Anthropic Fellows Program. Applications for the inaugural cohort of the Anthropic Fellows Program for AI Safety Research close on January 20th. Find out how to apply in the thread below: https://t.co/wOYIvz5rNg" [X Link](https://x.com/EthanJPerez/status/1881461209976979591) 2025-01-20T21:58Z [----] followers, [----] engagements

"- @JoeJBenton @McaleerStephen @bshlgrs @FabienDRoger on AI control and CoT monitoring - @fish_kyle3 on AI welfare - @_julianmichael_ on scalable oversight - me on any of the above topics" [X Link](https://x.com/EthanJPerez/status/1912551591250698474) 2025-04-16T17:00Z [----] followers, [---] engagements

"We provide a lot of compute and publish all work from these collaborations. We're also excited about helping to find our mentees long-term homes in AI safety research.
Alumni have ended up at @AnthropicAI @apolloaievals & @AISecurityInst, among other places" [X Link](https://x.com/EthanJPerez/status/1912551603271598216) 2025-04-16T17:00Z [----] followers, [---] engagements

"@seconds_0 @OpenAI Would love to see an example to turn this into some kind of evaluation" [X Link](https://x.com/EthanJPerez/status/1914184988729418088) 2025-04-21T05:10Z [----] followers, [----] engagements

"We're doubling the size of Anthropic's Fellows Program and launching a new round of applications. The first round of collaborations led to a number of recent/upcoming safety results that are comparable in impact to work our internal safety teams have done (IMO). We're running another round of the Anthropic Fellows program. If you're an engineer or researcher with a strong coding or technical background, you can apply to receive funding, compute, and mentorship from Anthropic beginning this October. There'll be around [--] places. https://t.co/wJWRRTt4DG" [X Link](https://x.com/EthanJPerez/status/1950335309008486679) 2025-07-29T23:19Z 10.6K followers, 10.9K engagements

"We're hiring someone to run the Anthropic Fellows Program. Our research collaborations have led to some of our best safety research and hires.
We're looking for an exceptional ops generalist, TPM, or research/eng manager to help us significantly scale and improve our collabs" [X Link](https://x.com/EthanJPerez/status/1963664611397546145) 2025-09-04T18:05Z 11.5K followers, [---] engagements

"This role would involve e.g.: - recruiting strong collaborators - designing/managing our application pipeline - sourcing research project proposals - connecting collaborators with research advisors - running events - hiring/supervising people managers to support these projects" [X Link](https://x.com/EthanJPerez/status/1963664683115888691) 2025-09-04T18:05Z 11K followers, [----] engagements

"Please apply or share our app with anyone who might be interested. For more info about the Anthropic Fellows Program, check out: https://x.com/AnthropicAI/status/1950245012253659432 https://job-boards.greenhouse.io/anthropic/jobs/4888400008" [X Link](https://x.com/EthanJPerez/status/1963664694960701557) 2025-09-04T18:05Z 11.1K followers, [----] engagements

"Transluce is a top-tier AI safety research lab - I follow their work as closely as work from our own safety teams at Anthropic. They're also well-positioned to become a strong third-party auditor for AI labs. Consider donating if you're interested in helping them out. Transluce is running our end-of-year fundraiser for [----]. This is our first public fundraiser since launching late last year.
https://t.co/obs6LetVSX" [X Link](https://x.com/EthanJPerez/status/2003222078733127891) 2025-12-22T21:52Z 12.3K followers, 11.1K engagements

"My team built a system we think might be pretty jailbreak-resistant: enough to offer up to $15k for a novel jailbreak. Come prove us wrong. We're expanding our bug bounty program. This new initiative is focused on finding universal jailbreaks in our next-generation safety system. We're offering rewards for novel vulnerabilities across a wide range of domains, including cybersecurity. https://t.co/OHNhrjUnwm" [X Link](https://x.com/anyuser/status/1823389298516967655) 2024-08-13T16:01Z 13.4K followers, 84.3K engagements

"We're expanding our bug bounty program. This new initiative is focused on finding universal jailbreaks in our next-generation safety system. We're offering rewards for novel vulnerabilities across a wide range of domains, including cybersecurity. https://www.anthropic.com/news/model-safety-bug-bounty" [X Link](https://x.com/anyuser/status/1821533729765913011) 2024-08-08T13:07Z 837.1K followers, 239.8K engagements

"RT @Miles_Brundage: Concerning" [X Link](https://x.com/anyuser/status/2022943894984626687) 2026-02-15T07:59Z 13.4K followers, [--] engagements

"Concerning. Former xAI employees told us that this week's restructuring followed tensions over safety and being "stuck in the catch-up phase."
https://t.co/XzRiiEDmJQ" [X Link](https://x.com/anyuser/status/2022441341217902650) 2026-02-13T22:42Z 66.3K followers, 12.6K engagements

"Former xAI employees told us that this week's restructuring followed tensions over safety and being "stuck in the catch-up phase." https://www.theverge.com/ai-artificial-intelligence/878761/mass-exodus-at-xai-grok-elon-musk-restructuring" [X Link](https://x.com/anyuser/status/2022380184247374240) 2026-02-13T18:39Z 15.6K followers, 870.8K engagements

"Did I miss the Gemini [--] Deep Think system card? Given its dramatic jump in capabilities, it seems nuts if they just didn't do one. There are really bad incentives if companies that do nothing get a free pass while cos that do disclose risks get (appropriate) scrutiny" [X Link](https://x.com/anyuser/status/2022108258841112778) 2026-02-13T00:39Z [----] followers, 47.4K engagements

"RT @TheZvi: I confirmed with a Google representative that since this was a runtime improvement and they do not believe these performance ga" [X Link](https://x.com/anyuser/status/2022382790948589816) 2026-02-13T18:50Z 13.4K followers, [--] engagements

"I confirmed with a Google representative that since this was a runtime improvement and they do not believe these performance gains constitute any additional risk, they believe that no safety explanation is required of them. I found that to be a pretty terrible answer. Did I miss the Gemini [--] Deep Think system card? Given its dramatic jump in capabilities, it seems nuts if they just didn't do one. There are really bad incentives if companies that do nothing get a free pass while cos that do disclose risks get (appropriate) scrutiny https://t.co/gl2VxmDGGB" [X Link](https://x.com/anyuser/status/2022298730423017689) 2026-02-13T13:16Z 34.9K followers, 62.7K engagements

"I resigned from OpenAI on Monday.
The same day, they started testing ads in ChatGPT. OpenAI has the most detailed record of private human thought ever assembled. Can we trust them to resist the tidal forces pushing them to abuse it? I wrote about better options for @nytopinion" [X Link](https://x.com/anyuser/status/2021590831979778051) 2026-02-11T14:23Z [----] followers, 1.6M engagements

"I told Claude [---] Opus to make a pokemon clone - max effort. It reasoned for [--] hour and [--] minutes, used 110k tokens, and [--]-shotted this absolute behemoth. This is one of the coolest things I've ever made with AI" [X Link](https://x.com/anyuser/status/2019679978162634930) 2026-02-06T07:50Z 34.9K followers, 658.5K engagements

"Introducing Claude Opus [---]. Our smartest model got an upgrade. Opus [---] plans more carefully, sustains agentic tasks for longer, operates reliably in massive codebases, and catches its own mistakes. It's also our first Opus-class model with 1M token context in beta" [X Link](https://x.com/anyuser/status/2019467372609040752) 2026-02-05T17:45Z 438.2K followers, 10.4M engagements

"Claude saying "this is me" when we asked it to find orphaned processes on a remote server is just the cutest thing" [X Link](https://x.com/anyuser/status/2018742219424059439) 2026-02-03T17:43Z 70.4K followers, 235.6K engagements

"what were people saying about AI [----] again" [X Link](https://x.com/anyuser/status/2017322279605322033) 2026-01-30T19:41Z 29.1K followers, 48.3K engagements

"New paper w/ @AlecRad: Models acquire a lot of capabilities during pretraining. We show that we can precisely shape what they learn simply by filtering their training data at the token level" [X Link](https://x.com/anyuser/status/2017286042370683336) 2026-01-30T17:17Z [----] followers, 86.7K engagements

"AI can make work faster, but a fear is that relying on it may make it harder to learn new skills on the job. We ran an experiment with software engineers to learn more.
Coding with AI led to a decrease in mastery, but this depended on how people used it. https://www.anthropic.com/research/AI-assistance-coding-skills" [X Link](https://x.com/anyuser/status/2016960382968136138) 2026-01-29T19:43Z 837.1K followers, 3.6M engagements

"New Anthropic Research: Disempowerment patterns in real-world AI assistant interactions. As AI becomes embedded in daily life, one risk is it can distort rather than inform, shaping beliefs, values, or actions in ways users may later regret. Read more: https://www.anthropic.com/research/disempowerment-patterns" [X Link](https://x.com/anyuser/status/2016636581084541278) 2026-01-28T22:16Z 837.1K followers, 797.1K engagements

"NEW: When OpenAI sent someone to Tyler Johnston's house, they wanted every text and email he had on the company. Tyler runs an AI watchdog, and he's just one of the people getting subpoenaed. This is just the beginning of the AI industry's aggressive new political strategy" [X Link](https://x.com/anyuser/status/2015817129615122838) 2026-01-26T16:00Z 330.8K followers, 627.4K engagements

"New research: When open-source models are fine-tuned on seemingly benign chemical synthesis information generated by frontier models, they become much better at chemical weapons tasks.
We call this an elicitation attack" [X Link](https://x.com/anyuser/status/2015870963792142563) 2026-01-26T19:34Z 837.1K followers, 329.5K engagements

"The Adolescence of Technology: an essay on the risks posed by powerful AI to national security, economies, and democracy, and how we can defend against them: https://www.darioamodei.com/essay/the-adolescence-of-technology" [X Link](https://x.com/anyuser/status/2015833046327402527) 2026-01-26T17:03Z 169K followers, 5.8M engagements

"RT @woj_zaremba: Looking for ambitious concrete projects to prepare the world for advanced AI." [X Link](https://x.com/anyuser/status/2015577039496233418) 2026-01-26T00:06Z 13.4K followers, [--] engagements

"Looking for ambitious concrete projects to prepare the world for advanced AI. REQUEST FOR PROPOSALS: What do we need to build to prepare the world for advanced AI? $25 billion is about to flow into AI resilience and AI-for-science from the OpenAI Foundation, the Chan Zuckerberg Initiative, and other major funders. But there's no shovel-ready list of the https://t.co/2Bojx8OtSz" [X Link](https://x.com/anyuser/status/2015457668685799432) 2026-01-25T16:12Z 133.2K followers, 33.9K engagements

"REQUEST FOR PROPOSALS: What do we need to build to prepare the world for advanced AI? $25 billion is about to flow into AI resilience and AI-for-science from the OpenAI Foundation, the Chan Zuckerberg Initiative, and other major funders. But there's no shovel-ready list of the essential projects to build and no critical mass of builders ready to execute. We're trying to fix that with The Launch Sequence, a collection of concrete projects to accelerate science, strengthen security, and adapt institutions to future advanced AI.
We're opening up The Launch Sequence for new pitches. We'll help you" [X Link](https://x.com/anyuser/status/2014734009171919247) 2026-01-23T16:16Z [----] followers, 253.5K engagements

"I enjoyed getting to talk about the constitution on Hard Fork: https://youtu.be/HDfr8PvfoOw" [X Link](https://x.com/anyuser/status/2014798789228568761) 2026-01-23T20:33Z 74.7K followers, 48.3K engagements

"I'm hiring for the Societal Impacts team at Anthropic. We study how Claude actually behaves in real-world interactions and use what we learn to make it better, bridging the gap between how we want AI to behave and how it actually does. There are so many open questions about how AI will impact people's lives. How do we understand what's happening across millions of conversations? How do we make sure these systems are actually safe and beneficial in practice? How do we study the impact of these systems on people's wellbeing, their values, their relationships, the way they think and make decisions? And" [X Link](https://x.com/anyuser/status/2014773709677199524) 2026-01-23T18:54Z [----] followers, 150.9K engagements

"AGI is now on the horizon, and it will deeply transform many things, including the economy. I'm currently looking to hire a Senior Economist, reporting directly to me, to lead a small team investigating post-AGI economics. Job spec and application here: https://job-boards.greenhouse.io/deepmind/jobs/7556396" [X Link](https://x.com/anyuser/status/2014345509675155639) 2026-01-22T14:32Z 77.2K followers, 2.4M engagements
@EthanJPerez Ethan PerezEthan Perez posts on X about ai, open ai, this is, the world the most. They currently have [------] followers and [--] posts still getting attention that total [-------] engagements in the last [--] hours.
Social category influence technology brands finance stocks gaming
Social topic influence ai, open ai, this is, the world, more info, bounty, xai, in the, deep, card
Top assets mentioned Alphabet Inc Class A (GOOGL) Robot Consulting Co., Ltd. (LAWR)
Top posts by engagements in the last [--] hours
"It takes a lot of human ratings to align language models with human preferences. We found a way to learn from language feedback (instead of ratings) since language conveys more info about human preferences. Our algo learns w just [---] samples of feedback. Check out our new paper Can we train LMs with language feedback We found an algo for just that. We finetune GPT3 to human-level summarization w/ only [---] samples of feedback w/ @jaa_campos @junshernchan @_angie_chen @kchonyc @EthanJPerez Paper: https://t.co/BBeQbFMtVi Talk: https://t.co/48uwmwakOH https://t.co/0ukmvzUl6O Can we train LMs"
X Link 2022-05-02T17:06Z [----] followers, [---] engagements
"These were really great talks and clear explanations of why AI alignment might be hard (and an impressive set of speakers). I really enjoyed all of the talks and would highly recommend maybe one of the best resources for learning about alignment IMO"
X Link 2023-09-03T01:06Z [----] followers, [----] engagements
"This is a very important result that's influenced my thinking a lot and the paper is very well written paper. Highly recommend checking it out"
X Link 2023-09-06T06:57Z [----] followers, 11.9K engagements
"๐ค๐ง We trained a humanoid robot to do yoga based on simple natural language prompts like "a humanoid robot kneeling" or "a humanoid robot doing splits." How We use a Vision-Language Model (VLM) as a reward model. Larger VLM = better reward model. ๐"
X Link 2023-10-23T18:34Z [----] followers, 27.9K engagements
"๐ฏ Motivation: RL requires a hand-crafted reward functions or a reward model trained from costly human feedback. Instead we use pretrained VLMs to specify tasks with simple natural language prompts. This is more sample efficient and potentially more scalable"
X Link 2023-10-23T18:34Z [----] followers, [---] engagements
"๐ What is a VLM Vision-Language Models (like CLIP) process both images and text. We use VLMs as reward models for RL tapping into their capabilities acquired during pretraining"
X Link 2023-10-23T18:34Z [----] followers, [---] engagements
"Highly recommend applying to the SERI MATS program if you're interested in getting into AI safety research I'll be supervising some collaborators through MATS along with people like @OwainEvans_UK @NeelNanda5 and @julianmichael"
X Link 2023-10-25T21:09Z [----] followers, [----] engagements
"A bit late but excited about our recent work doing a deep-dive on sycophancy in LLMs. It seems like it's a general phenomenon that shows up in a variety of contexts/SOTA models and we were also able to more clearly point to human feedback as a probable part of the cause"
X Link 2023-10-25T21:14Z [----] followers, 10.8K engagements
"ML progress has led to debate on whether AI systems could one day be conscious have desires etc. Is there any way we could run experiments to inform peoples views on these speculative issues @rgblong and I sketch out a set of experiments that we think could be helpful"
X Link 2023-11-16T19:56Z [----] followers, [----] engagements
"Excited to release this new eval testing LLM reasoning abilities on expert-written decision theory questions. This eval should help with research on cooperative AI e.g. studying whether various interventions make LLMs behave more/less cooperatively multi-agent settings. How do LLMs reason about playing games against copies of themselves ๐ชWe made the first LLM decision theory benchmark to find out. ๐งต1/10 https://t.co/pPdZ3VyuLi How do LLMs reason about playing games against copies of themselves ๐ชWe made the first LLM decision theory benchmark to find out. ๐งต1/10 https://t.co/pPdZ3VyuLi"
X Link 2024-12-16T22:25Z [----] followers, [----] engagements
"Maybe the single most important result in AI safety Ive seen so far. This paper shows that in some cases Claude fakes being aligned with its training objective. If models fake alignment how can we tell if theyre actually safe New Anthropic research: Alignment faking in large language models. In a series of experiments with Redwood Research we found that Claude often pretends to have different views during training while actually maintaining its original preferences. https://t.co/nXjXrahBru New Anthropic research: Alignment faking in large language models. In a series of experiments with"
X Link 2024-12-18T17:27Z [----] followers, 15.5K engagements
"Today is the last day to apply to the Anthropic Fellows Program Applications for the inaugural cohort of the Anthropic Fellows Program for AI Safety Research close on January 20th. Find out how to apply in the thread below: https://t.co/wOYIvz5rNg Applications for the inaugural cohort of the Anthropic Fellows Program for AI Safety Research close on January 20th. Find out how to apply in the thread below: https://t.co/wOYIvz5rNg"
X Link 2025-01-20T21:58Z [----] followers, [----] engagements
"- @JoeJBenton @McaleerStephen @bshlgrs @FabienDRoger on AI control and CoT monitoring - @fish_kyle3 on AI welfare - @julianmichael on scalable oversight - me on any of the above topics"
X Link 2025-04-16T17:00Z [----] followers, [---] engagements
"We provide a lot of compute and publish all work from these collaborations. We're also excited about helping to find our mentees long-term homes in AI safety research. Alumni have ended up at @AnthropicAI @apolloaievals & @AISecurityInst among other places"
X Link 2025-04-16T17:00Z [----] followers, [---] engagements
"@seconds_0 @OpenAI Would love to see an example to turn this into some kind of evaluation"
X Link 2025-04-21T05:10Z [----] followers, [----] engagements
"We're doubling the size of Anthropic's Fellows Program and launching a new round of applications. The first round of collaborations led to a number of recent/upcoming safety results that are comparable in impact to work our internal safety teams have done (IMO) Were running another round of the Anthropic Fellows program. If you're an engineer or researcher with a strong coding or technical background you can apply to receive funding compute and mentorship from Anthropic beginning this October. There'll be around [--] places. https://t.co/wJWRRTt4DG Were running another round of the Anthropic"
X Link 2025-07-29T23:19Z 10.6K followers, 10.9K engagements
"Were hiring someone to run the Anthropic Fellows Program Our research collaborations have led to some of our best safety research and hires. Were looking for an exceptional ops generalist TPM or research/eng manager to help us significantly scale and improve our collabs ๐งต"
X Link 2025-09-04T18:05Z 11.5K followers, [---] engagements
"This role would involve e.g.: - recruiting strong collaborators - designing/managing our application pipeline - sourcing research project proposals - connecting collaborators with research advisors - running events - hiring/supervising people managers to support these projects"
X Link 2025-09-04T18:05Z 11K followers, [----] engagements
"Please apply or share our app with anyone who might be interested: For more info about the Anthropic Fellows Program check out: https://x.com/AnthropicAI/status/1950245012253659432 https://job-boards.greenhouse.io/anthropic/jobs/4888400008 Were running another round of the Anthropic Fellows program. If you're an engineer or researcher with a strong coding or technical background you can apply to receive funding compute and mentorship from Anthropic beginning this October. There'll be around [--] places. https://t.co/wJWRRTt4DG https://x.com/AnthropicAI/status/1950245012253659432"
X Link 2025-09-04T18:05Z 11.1K followers, [----] engagements
"Transluce is a top-tier AI safety research lab - I follow their work as closely as work from our own safety teams at Anthropic. They're also well-positioned to become a strong third-party auditor for AI labs. Consider donating if you're interested in helping them out Transluce is running our end-of-year fundraiser for [----]. This is our first public fundraiser since launching late last year. https://t.co/obs6LetVSX Transluce is running our end-of-year fundraiser for [----]. This is our first public fundraiser since launching late last year. https://t.co/obs6LetVSX"
X Link 2025-12-22T21:52Z 12.3K followers, 11.1K engagements
"My team built a system we think might be pretty jailbreak resistant enough to offer up to $15k for a novel jailbreak. Come prove us wrong We're expanding our bug bounty program. This new initiative is focused on finding universal jailbreaks in our next-generation safety system. We're offering rewards for novel vulnerabilities across a wide range of domains including cybersecurity. https://t.co/OHNhrjUnwm We're expanding our bug bounty program. This new initiative is focused on finding universal jailbreaks in our next-generation safety system. We're offering rewards for novel vulnerabilities"
X Link 2024-08-13T16:01Z 13.4K followers, 84.3K engagements
"We're expanding our bug bounty program. This new initiative is focused on finding universal jailbreaks in our next-generation safety system. We're offering rewards for novel vulnerabilities across a wide range of domains including cybersecurity. https://www.anthropic.com/news/model-safety-bug-bounty https://www.anthropic.com/news/model-safety-bug-bounty"
X Link 2024-08-08T13:07Z 837.1K followers, 239.8K engagements
"RT @Miles_Brundage: Concerning"
X Link 2026-02-15T07:59Z 13.4K followers, [--] engagements
"Concerning Former xAI employees told us that this week's restructuring followed tensions over safety and being "stuck in the catch-up phase." https://t.co/XzRiiEDmJQ Former xAI employees told us that this week's restructuring followed tensions over safety and being "stuck in the catch-up phase." https://t.co/XzRiiEDmJQ"
X Link 2026-02-13T22:42Z 66.3K followers, 12.6K engagements
"Former xAI employees told us that this week's restructuring followed tensions over safety and being "stuck in the catch-up phase." https://www.theverge.com/ai-artificial-intelligence/878761/mass-exodus-at-xai-grok-elon-musk-restructuring https://www.theverge.com/ai-artificial-intelligence/878761/mass-exodus-at-xai-grok-elon-musk-restructuring"
X Link 2026-02-13T18:39Z 15.6K followers, 870.8K engagements
"Did I miss the Gemini [--] Deep Think system card? Given its dramatic jump in capabilities, it seems nuts if they just didn't do one. There are really bad incentives if companies that do nothing get a free pass while cos that do disclose risks get (appropriate) scrutiny"
X Link 2026-02-13T00:39Z [----] followers, 47.4K engagements
"RT @TheZvi: I confirmed with a Google representative that since this was a runtime improvement and they do not believe these performance ga"
X Link 2026-02-13T18:50Z 13.4K followers, [--] engagements
"I confirmed with a Google representative that since this was a runtime improvement and they do not believe these performance gains constitute any additional risk, they believe that no safety explanation is required of them. I found that to be a pretty terrible answer. Did I miss the Gemini [--] Deep Think system card? Given its dramatic jump in capabilities, seems nuts if they just didn't do one. There are really bad incentives if companies that do nothing get a free pass while cos that do disclose risks get (appropriate) scrutiny https://t.co/gl2VxmDGGB"
X Link 2026-02-13T13:16Z 34.9K followers, 62.7K engagements
"I resigned from OpenAI on Monday. The same day, they started testing ads in ChatGPT. OpenAI has the most detailed record of private human thought ever assembled. Can we trust them to resist the tidal forces pushing them to abuse it? I wrote about better options for @nytopinion"
X Link 2026-02-11T14:23Z [----] followers, 1.6M engagements
"I told Claude [---] Opus to make a Pokemon clone - max effort. It reasoned for [--] hour and [--] minutes, used 110k tokens, and [--]-shotted this absolute behemoth. This is one of the coolest things I've ever made with AI"
X Link 2026-02-06T07:50Z 34.9K followers, 658.5K engagements
"Introducing Claude Opus [---]. Our smartest model got an upgrade. Opus [---] plans more carefully, sustains agentic tasks for longer, operates reliably in massive codebases, and catches its own mistakes. It's also our first Opus-class model with 1M token context in beta"
X Link 2026-02-05T17:45Z 438.2K followers, 10.4M engagements
"Claude saying "this is me" when we asked it to find orphaned processes on a remote server is just the cutest thing 🥹"
X Link 2026-02-03T17:43Z 70.4K followers, 235.6K engagements
"what were people saying about AI [----] again"
X Link 2026-01-30T19:41Z 29.1K followers, 48.3K engagements
"New paper w/ @AlecRad: Models acquire a lot of capabilities during pretraining. We show that we can precisely shape what they learn simply by filtering their training data at the token level"
X Link 2026-01-30T17:17Z [----] followers, 86.7K engagements
"AI can make work faster, but a fear is that relying on it may make it harder to learn new skills on the job. We ran an experiment with software engineers to learn more. Coding with AI led to a decrease in mastery, but this depended on how people used it. https://www.anthropic.com/research/AI-assistance-coding-skills"
X Link 2026-01-29T19:43Z 837.1K followers, 3.6M engagements
"New Anthropic Research: Disempowerment patterns in real-world AI assistant interactions. As AI becomes embedded in daily life, one risk is it can distort rather than inform, shaping beliefs, values, or actions in ways users may later regret. Read more: https://www.anthropic.com/research/disempowerment-patterns"
X Link 2026-01-28T22:16Z 837.1K followers, 797.1K engagements
"NEW: When OpenAI sent someone to Tyler Johnston's house they wanted every text and email he had on the company. Tyler runs an AI watchdog and he's just one of the people getting subpoenaed. This is just the beginning of the AI industry's aggressive new political strategy"
X Link 2026-01-26T16:00Z 330.8K followers, 627.4K engagements
"New research: When open-source models are fine-tuned on seemingly benign chemical synthesis information generated by frontier models they become much better at chemical weapons tasks. We call this an elicitation attack"
X Link 2026-01-26T19:34Z 837.1K followers, 329.5K engagements
"The Adolescence of Technology: an essay on the risks posed by powerful AI to national security, economies, and democracy, and how we can defend against them: https://www.darioamodei.com/essay/the-adolescence-of-technology"
X Link 2026-01-26T17:03Z 169K followers, 5.8M engagements
"RT @woj_zaremba: Looking for ambitious concrete projects to prepare the world for advanced AI. 🧠"
X Link 2026-01-26T00:06Z 13.4K followers, [--] engagements
"Looking for ambitious concrete projects to prepare the world for advanced AI. 🧠 REQUEST FOR PROPOSALS: What do we need to build to prepare the world for advanced AI? $25 billion is about to flow into AI resilience and AI-for-science from the OpenAI Foundation, the Chan Zuckerberg Initiative, and other major funders. But there's no shovel-ready list of the https://t.co/2Bojx8OtSz"
X Link 2026-01-25T16:12Z 133.2K followers, 33.9K engagements
"REQUEST FOR PROPOSALS: What do we need to build to prepare the world for advanced AI? $25 billion is about to flow into AI resilience and AI-for-science from the OpenAI Foundation, the Chan Zuckerberg Initiative, and other major funders. But there's no shovel-ready list of the essential projects to build, and no critical mass of builders ready to execute. We're trying to fix that with The Launch Sequence, a collection of concrete projects to accelerate science, strengthen security, and adapt institutions to future advanced AI. We're opening up The Launch Sequence for new pitches. We'll help you"
X Link 2026-01-23T16:16Z [----] followers, 253.5K engagements
"I enjoyed getting to talk about the constitution on Hard Fork https://youtu.be/HDfr8PvfoOw"
X Link 2026-01-23T20:33Z 74.7K followers, 48.3K engagements
"I'm hiring for the Societal Impacts team at Anthropic. We study how Claude actually behaves in real-world interactions and use what we learn to make it better, bridging the gap between how we want AI to behave and how it actually does. There are so many open questions about how AI will impact people's lives. How do we understand what's happening across millions of conversations? How do we make sure these systems are actually safe and beneficial in practice? How do we study the impact of these systems on people's wellbeing, their values, their relationships, the way they think and make decisions? And"
X Link 2026-01-23T18:54Z [----] followers, 150.9K engagements
"AGI is now on the horizon and it will deeply transform many things, including the economy. I'm currently looking to hire a Senior Economist, reporting directly to me, to lead a small team investigating post-AGI economics. Job spec and application here: https://job-boards.greenhouse.io/deepmind/jobs/7556396"
X Link 2026-01-22T14:32Z 77.2K followers, 2.4M engagements