# @polynoamial Noam Brown

Noam Brown posts on X about ai, open ai, agi, math the most. They currently have [-------] followers and [---] posts still getting attention that total [-------] engagements in the last [--] hours.

### Engagements: [-------] [#](/creator/twitter::825088493764407298/interactions)

- [--] Week [---------] +11%
- [--] Month [----------] +1,675%
- [--] Months [----------] +59%
- [--] Year [----------] +68%

### Mentions: [--] [#](/creator/twitter::825088493764407298/posts_active)

- [--] Week [--] +42%
- [--] Month [--] +93%
- [--] Months [---] +11%
- [--] Year [---] +59%

### Followers: [-------] [#](/creator/twitter::825088493764407298/followers)

- [--] Week [-------] +0.99%
- [--] Month [-------] +2.90%
- [--] Months [-------] +11%
- [--] Year [-------] +38%

### CreatorRank: [------] [#](/creator/twitter::825088493764407298/influencer_rank)

### Social Influence

**Social category influence** [technology brands](/list/technology-brands) 45% [finance](/list/finance) 10% [social networks](/list/social-networks) 4% [stocks](/list/stocks) #2140 [gaming](/list/gaming) 1% [countries](/list/countries) 1%

**Social topic influence** [ai](/topic/ai) #759, [open ai](/topic/open-ai) #1022, [agi](/topic/agi) #115, [math](/topic/math) 6%, [the world](/topic/the-world) 6%, [gold](/topic/gold) 6%, [level](/topic/level) 5%, [if you](/topic/if-you) 5%, [imo](/topic/imo) #5, [inference](/topic/inference) #8

**Top accounts mentioned or mentioned by** [@openai](/creator/undefined) [@fchollet](/creator/undefined) [@ghidorah_x](/creator/undefined) [@grokton](/creator/undefined) [@scaling01](/creator/undefined) [@k4l1_89](/creator/undefined) [@lateinteraction](/creator/undefined) [@antisimplistic](/creator/undefined) [@openais](/creator/undefined) [@deanwball](/creator/undefined) [@karpathy](/creator/undefined) [@kimmonismus](/creator/undefined) [@sholtodouglas](/creator/undefined) [@googledeepmind](/creator/undefined) [@kevinwang3290](/creator/undefined)
[@hideeveryflower](/creator/undefined) [@ricky_neace](/creator/undefined) [@teknium](/creator/undefined) [@michael_druggan](/creator/undefined) [@superbbias](/creator/undefined)

**Top assets mentioned** [Alphabet Inc Class A (GOOGL)](/topic/$googl)

### Top Social Posts

Top posts by engagements in the last [--] hours

"I'm often asked how to land a research job at a frontier AI lab. It's hard, especially without a research background, but I like to point to @kellerjordan0 as an example showing it can be done. Keller graduated from UCSD with no publication record and was working at an AI content moderation startup when he landed a cold call with @bneyshabur (who was at Google) and presented an idea to improve upon Behnam's recent paper. Behnam agreed to mentor him, which led to an ICLR paper. Sadly there's less open research today but improving upon a researcher's published work is a great way to demonstrate" [X Link](https://x.com/polynoamial/status/2014084431062114744) 2026-01-21T21:15Z 100.7K followers, 703.5K engagements

"Had to cut this one for space: 2019: AI can't create art - creativity is uniquely human" [X Link](https://x.com/polynoamial/status/2015924757489975421) 2026-01-26T23:08Z 100.7K followers, 23.4K engagements

"Codex is writing all my code these days and I've fully switched to using the Codex app. The velocity of the Codex team is really impressive. I am Tibo and I have an incredible team. Codex would not exist without them and they cooked. Enjoy the new Codex app access through your free/go ChatGPT plan and 2X rate limits on other plans. Can't wait to hear what you do with it. https://t.co/Lwg13vEJDn https://t.co/c7AaRCenoQ I am Tibo and I have an incredible team. Codex would not exist without them and they cooked.
Enjoy the new Codex app access through your free/go ChatGPT plan and 2X rate limits" [X Link](https://x.com/polynoamial/status/2018387805341380848) 2026-02-02T18:15Z 100.7K followers, 142.1K engagements

"Every [--] months I hear people claim @OpenAI isn't doing real research and is just incrementally improving ChatGPT. I even heard it right before 🍓/o1. In my opinion @OpenAI is the best frontier lab to do research at today. Building an AI research intern in [----] is not hype. How does OpenAI balance long-term research bets with product-forward research fundamentals I've been getting this question a lot lately, usually framed as a suggestion that Jakub (@merettm) and I are pushing an increasingly product-focused agenda. That characterization is" [X Link](https://x.com/polynoamial/status/2018792698107634108) 2026-02-03T21:04Z 100.7K followers, 103.8K engagements

"I worked in quant trading for a year after undergrad but didn't want my lifetime contribution to humanity to be making equity markets marginally more efficient. Taking a paycut to pursue AI research was my best life decision. Today you don't even need to take a paycut to do it.
be you work in HFT shaving nanoseconds off latency or extracting bps from models have existential dread see this tweet wonder if your skills could be better used making AGI apply to attend this party meet the openai team build AGI" [X Link](https://x.com/polynoamial/status/1911925322486104535) 2025-04-14T23:31Z 100.7K followers, 322.5K engagements

"Here's the 80% success rate plot" [X Link](https://x.com/polynoamial/status/2019182634379931869) 2026-02-04T22:53Z 100.7K followers, 27K engagements

"You can read about GPT-5.3-Codex here: https://openai.com/index/introducing-gpt-5-3-codex/" [X Link](https://x.com/polynoamial/status/2019476537418912099) 2026-02-05T18:21Z 100.7K followers, [----] engagements

"RT @deanwball: The use of GPT-5 as evidence that "AI is slowing down" is a legendary example of mass delusion. Yes its release timing gave" [X Link](https://x.com/polynoamial/status/2020262825525231895) 2026-02-07T22:26Z 100.7K followers, [--] engagements

"There have been fair questions on whether LLM contributions to STEM are overhyped but I've spoken with physicists about this result and they've told me it is a truly significant research contribution, roughly at the level of a solid journal paper, and GPT-5.2 played a key role. GPT-5.2 derived a new result in theoretical physics. We're releasing the result in a preprint with researchers from @the_IAS @VanderbiltU @Cambridge_Uni and @Harvard. It shows that a gluon interaction many physicists expected would not occur can arise under specific" [X Link](https://x.com/polynoamial/status/2022413904757035167) 2026-02-13T20:53Z 100.7K followers, 122.1K engagements

"If you haven't disabled voice authentication for your bank account and had a conversation with your family about AI voice impersonation yet now would be a good time.
We're sharing our learnings from a small-scale preview of Voice Engine a model which uses text input and a single 15-second audio sample to generate natural-sounding speech that closely resembles the original speaker. https://t.co/yLsfGaVtrZ" [X Link](https://x.com/polynoamial/status/1773799870890918358) 2024-03-29T19:50Z 100.2K followers, 162.5K engagements

"I've heard people claim that Sam is just drumming up hype but from what I've seen everything he's saying matches the median view of @OpenAI researchers on the ground. Sam Altman says the pathway to AGI is now clear and "we actually know what do" it will be easier to get to Level [--] Innovating AI than he initially thought and "things are going to go a lot faster than people are appreciating right now" https://t.co/LOhzP0SA8h" [X Link](https://x.com/polynoamial/status/1855037689533178289) 2024-11-09T00:00Z 100.4K followers, 845.2K engagements

"We announced @OpenAI o1 just [--] months ago. Today we announced o3. We have every reason to believe this trajectory will continue" [X Link](https://x.com/polynoamial/status/1870172996650053653) 2024-12-20T18:22Z 100.2K followers, 1.9M engagements

"Scaling pretraining and scaling thinking are two different dimensions of improvement. They are complementary, not in competition" [X Link](https://x.com/polynoamial/status/1895207166799401178) 2025-02-27T20:19Z 100.3K followers, 130.5K engagements

"It's deeply concerning that one of the best AI researchers I've worked with @kaicathyc was denied a U.S. green card today. A Canadian who's lived and contributed here for [--] years now has to leave.
We're risking America's AI leadership when we turn away talent like this" [X Link](https://x.com/polynoamial/status/1915765141846515883) 2025-04-25T13:49Z 100.6K followers, 2.6M engagements

"You don't need a PhD to be a great AI researcher. Even @OpenAI's Chief Research Officer doesn't have a PhD" [X Link](https://x.com/polynoamial/status/1939103344985022676) 2025-06-28T23:27Z 100.1K followers, 1.3M engagements

"Congrats to the GDM team on their IMO result. I think their parallel success highlights how fast AI progress is. Their approach was a bit different than ours but I think that shows there are many research directions for further progress. Some thoughts on our model and results 🧵" [X Link](https://x.com/polynoamial/status/1947398531259523481) 2025-07-21T20:49Z 100.3K followers, 502.8K engagements

"Self play works so well in chess, go, and poker because those games are two-player zero-sum. That simplifies a lot of problems. The real world is messier which is why we haven't seen many successes from self play in LLMs yet. Btw @karpathy did great and I mostly agree with him .@karpathy says that LLMs currently lack the cultural accumulation and self-play that propelled humans out of the savannah: Culture: Why can't an LLM write a book for the other LLMs? Why can't other LLMs read this LLM's book and be inspired by it or shocked by it? Self https://t.co/InZwkzW9U2" [X Link](https://x.com/polynoamial/status/1980653716085825626) 2025-10-21T15:13Z 100.2K followers, 346.3K engagements

"@deanwball I hope policymakers will consider all of this going forward when deciding whose opinions to trust" [X Link](https://x.com/polynoamial/status/2020263694530486692) 2026-02-07T22:29Z 100.1K followers, [----] engagements

"1) The RSP says that AI R&D-4 calls for both ASL-3 and an "affirmative case" on misalignment. The model is released under ASL-3 but no affirmative case has been released. 2) My critique is primarily about methodology.
My issue isn't with the safeguards around model release but rather that it's bad methodology to primarily rely on an internal survey to determine model capabilities and then follow up only with respondents that give a certain response. https://twitter.com/i/web/status/2021306884490396098" [X Link](https://x.com/polynoamial/status/2021306884490396098) 2026-02-10T19:34Z 100.5K followers, [----] engagements

"Today I'm excited to share with you all the fruit of our effort at @OpenAI to create AI models capable of truly general reasoning: OpenAI's new o1 model series (aka 🍓) Let me explain 🧵 1/" [X Link](https://x.com/polynoamial/status/1834280155730043108) 2024-09-12T17:17Z 100.7K followers, 2.8M engagements

"@OpenAI o1 is trained with RL to think before responding via a private chain of thought. The longer it thinks the better it does on reasoning tasks. This opens up a new dimension for scaling. We're no longer bottlenecked by pretraining. We can now scale inference compute too" [X Link](https://x.com/polynoamial/status/1834280425457426689) 2024-09-12T17:18Z 100.7K followers, 1.5M engagements

"GPT-5-Codex is 10x faster for the easiest queries and will think 2x longer for the hardest queries that benefit most from more compute. We're releasing GPT-5-Codex a version of GPT-5 further optimized for agentic coding in Codex. Available in the Codex CLI IDE Extension web mobile and for code reviews in Github. https://t.co/OVGrUovgHN" [X Link](https://x.com/polynoamial/status/1967667644905251156) 2025-09-15T19:11Z 100.7K followers, 161.9K engagements

"Social media tends to frame AI debate into two caricatures: (A) Skeptics who think LLMs are doomed and AI is a bunch of hype.
(B) Fanatics who think we have all the ingredients and superintelligence is imminent. But if you read what leading researchers actually say (beyond the headlines) there's a surprising amount of convergence: 1) The current paradigm is likely sufficient for massive economic and societal impact even without further research breakthroughs. 2) More research breakthroughs are probably needed to achieve AGI/ASI. (Continual learning and sample efficiency are two examples that" [X Link](https://x.com/polynoamial/status/1994439121243169176) 2025-11-28T16:11Z 100.7K followers, 1.3M engagements

"I vibecoded an open-source poker river solver over the holiday break. The code is 100% written by Codex and I also made a version with Claude Code to compare. Overall these tools allowed me to iterate much faster in a domain I know well. But I also felt I couldn't fully trust them. They'd make mistakes and encounter bugs but rather than acknowledging it they'd often think it wasn't a big deal or on occasion just straight up try to gaslight me into thinking nothing is wrong. In one memorable debugging session with Claude Code I asked it as a sanity check what the expected value would be of an" [X Link](https://x.com/polynoamial/status/2008277764093157623) 2026-01-05T20:41Z 100.7K followers, 422.3K engagements

"1987: AI can't win at chess - planning is uniquely human 1997: AI can't win at Go - intuition is uniquely human 2016: AI can't win at poker - bluffing is uniquely human 2023: AI can't get IMO gold - reasoning is uniquely human 2026: AI can't make wise decisions - judgment is uniquely human" [X Link](https://x.com/polynoamial/status/2015874457307644058) 2026-01-26T19:48Z 100.7K followers, 967.8K engagements

"It's fun watching Doug try to contain his exasperation with the bots' "logic" and actions in the replays. Clearly we still have a long way to go until LLMs master poker. Happy to see GPT-5.2 is the champion though The results are in.
[------] hands later we have a winner in the AI Poker Showdown from @kaggle I don't want to spoil the results here but we had an awesome match in the finals between o3 and GPT [---] Watch the video to see who wins: https://t.co/zHPwb2DDlh https://t.co/eeMaaPdJ8R" [X Link](https://x.com/polynoamial/status/2019177248683942207) 2026-02-04T22:32Z 100.7K followers, 26K engagements

"GPT-5.2 evals are finally out for METR and it's state-of-the-art. Here's the linear-scale plot. The 80% success-rate plot (below) is even more stark. We estimate that GPT-5.2 with high (not xhigh) reasoning effort has a 50%-time-horizon of around [---] hrs (95% CI of [--] hr [--] min to [--] hr [--] min) on our expanded suite of software tasks. This is the highest estimate for a time horizon measurement we have reported to date. https://t.co/USkHNuFexc" [X Link](https://x.com/polynoamial/status/2019182632391831662) 2026-02-04T22:53Z 100.7K followers, 582K engagements

"GPT-5.3-Codex's much better token efficiency *AND* faster inference is the biggest story of this release. Folks at @OpenAI worked hard to improve this and it will only get better from here. GPT-5.3-Codex is here *Best coding performance (57% SWE-Bench Pro 76% TerminalBench [---] 64% OSWorld). *Mid-task steerability and live updates during tasks. *Faster Less than half the tokens of 5.2-Codex for same tasks and 25% faster per token *Good computer use. GPT-5.3-Codex is here *Best coding performance (57% SWE-Bench Pro 76% TerminalBench [---] 64% OSWorld).
*Mid-task steerability and live updates" [X Link](https://x.com/polynoamial/status/2019476535044948419) 2026-02-05T18:21Z 100.7K followers, 152.6K engagements

"When GPT-5 was released some folks claimed AI progress was hitting a wall whereas others said progress would continue. GPT-5.2 was released [--] months ago. GPT-5.3-Codex was released [--] days ago and is twice as token efficient for coding. It's clear who turned out to be correct" [X Link](https://x.com/polynoamial/status/2020236875496321526) 2026-02-07T20:42Z 100.7K followers, 372.3K engagements

"@VictorTaelin Yes. I think by the end of the year the main challenge for @METR_Evals will be measuring horizons that long" [X Link](https://x.com/polynoamial/status/2020305698438148346) 2026-02-08T01:16Z 100.7K followers, 100.1K engagements

"I appreciate @Anthropic's honesty in their latest system card but the content of it does not give me confidence that the company will act responsibly with deployment of advanced AI models: -They primarily relied on an internal survey to determine whether Opus [---] crossed their autonomous AI R&D-4 threshold (and would thus require stronger safeguards to release under their Responsible Scaling Policy). This wasn't even an external survey of an impartial 3rd party but rather a survey of Anthropic employees.
-When 5/16 internal survey respondents initially gave an assessment that suggested" [X Link](https://x.com/polynoamial/status/2021266471406666231) 2026-02-10T16:54Z 100.7K followers, 186.3K engagements

"@Anthropic System card is here: https://www-cdn.anthropic.com/14e4fb01875d2a69f646fa5e574dea2b1c0ff7b5.pdf" [X Link](https://x.com/polynoamial/status/2021267177995637228) 2026-02-10T16:57Z 100.7K followers, 14K engagements

"@fchollet How long do you think it will be before ARC-3 is saturated?" [X Link](https://x.com/polynoamial/status/2022054165577707580) 2026-02-12T21:04Z 100.7K followers, 22.2K engagements

"Francois Chollet: "AGI 2030" Folks often point to @fchollet as an AGI skeptic but he's said multiple times that he thinks it arrives within [--] years. The AI we have today is not AGI but it's progressing very quickly. @Yossi_Dahan_ @polynoamial ARC-4 is in the works to be released early [----]. ARC-5 is also planned. The final ARC will probably be 6-7. The point is to keep making benchmarks until it is no longer possible to propose something that humans can do and AI can't. AGI [----]." [X Link](https://x.com/polynoamial/status/2022089151374668155) 2026-02-12T23:23Z 100.7K followers, 105.6K engagements

"After the IMO results last summer some dismissed it as high school math. We think our latest models will remove any doubt that STEM research is about to fundamentally change. Mathematicians created a set of [--] research questions that arose naturally from their own research. Only they know the answers and they gave the world a week to use LLMs to try to solve them. We think our latest models make it possible to solve several of them. This is an internal model for now but I'm optimistic we'll get it (or a better model) out soon. Very excited about the "First Proof" challenge.
I believe novel" [X Link](https://x.com/polynoamial/status/2022527227049742779) 2026-02-14T04:24Z 100.7K followers, 321.2K engagements

"Perhaps a 🌶️ take but I think the criticisms of @GoogleDeepMind's release are missing the point and the real problem is that AI labs and safety orgs need to adapt to a world where intelligence is a function of inference compute. When Google says that Deep Think poses no new risks beyond Gemini [--] Pro they probably mean that Deep Think is a scaffold of Gemini [--] Pro that anyone externally could have constructed on their own anyway. In other words the capabilities of Deep Think have always been available to anyone willing to pay for Deep Think amounts of inference simply by scaffolding a bunch of" [X Link](https://x.com/polynoamial/status/2022818095879065610) 2026-02-14T23:39Z 100.7K followers, 75.1K engagements

"Lots of vague AI hype on social media these days. There are good reasons to be optimistic about further progress but plenty of unsolved research problems remain" [X Link](https://x.com/polynoamial/status/1880333390525919722) 2025-01-17T19:16Z 100K followers, 132K engagements

"It can be hard to feel the AGI until you see an AI surpass top humans in a domain you care deeply about. Competitive coders will feel it within a couple years. Paul is early but I think writers will feel it too. Everyone will have their Lee Sedol moment at a different time. Writer of the legendary movie Taxi Driver is having an existential crisis about AI https://t.co/5H89SWUKn9" [X Link](https://x.com/polynoamial/status/1881039073558806617) 2025-01-19T18:00Z 100K followers, 442.4K engagements

"Making it a little easier for the whole world to see what we're seeing.
big news: the free tier of chatgpt is going to get o3-mini (and the plus tier will get tons of o3-mini usage)" [X Link](https://x.com/polynoamial/status/1882483254743470098) 2025-01-23T17:39Z 100K followers, 103.5K engagements

"Algorithmic breakthroughs and scaling are complementary, not in competition. The former bends the performance vs compute curve while the latter moves further along the curve" [X Link](https://x.com/polynoamial/status/1884368245408555085) 2025-01-28T22:29Z 100K followers, 75.2K engagements

"We at @OpenAI are proud to release o3-mini including for the FREE tier. On many evals it outperforms o1. We're shifting the entire cost-intelligence curve. Model intelligence will continue to go up and the cost for the same intelligence will continue to go down" [X Link](https://x.com/polynoamial/status/1885408714334597552) 2025-01-31T19:24Z 100K followers, 165.9K engagements

"o1 was released less than [--] months ago. o3-mini was released [--] days ago. Deep Research was released today. It's a powerful tool and I can't wait to see what the world does with it but AI will continue to progress rapidly from here" [X Link](https://x.com/polynoamial/status/1886223995877339568) 2025-02-03T01:23Z 100K followers, 89.3K engagements

"There's a lot of talk of LLMs "saturating all the evals" but there's plenty of evals people could make where LLMs would do poorly: -Beat a Zelda game -Make a profit in a prediction market -Write a stand-up set that's original and funny I'm bullish on AI but we're far from done" [X Link](https://x.com/polynoamial/status/1887560706117279807) 2025-02-06T17:55Z 100.1K followers, 175.3K engagements

"When we briefed people on 🍓 before o1-preview's release seeing the CoT live was usually the "aha" moment for them that made it clear this was going to be a big deal.
These aren't the raw CoTs but it's a big step closer and I'm glad we can share that experience with the world. Updated chain of thought in OpenAI o3-mini for free and paid users and in o3-mini-high for paid users. https://t.co/uF4XTBGpC5" [X Link](https://x.com/polynoamial/status/1887621287616651429) 2025-02-06T21:55Z 100K followers, 175.6K engagements

"I think a lot of the misinformation about what LLMs can and cannot do comes from the fact that progress in the field is much faster than academia is used to. Recently computer scientist Binghui Peng and his team proved mathematically that there may be a hard limit to LLMs' compositional task-solving abilities. https://t.co/LDAhMz0Xy5 https://t.co/cxOMkioMDm" [X Link](https://x.com/polynoamial/status/1888467178879627546) 2025-02-09T05:57Z 100K followers, 184.4K engagements

"A lot of folks seem surprised so I want to point out: the robot in this video is likely teleoperated, not autonomous (would be great if @1x_tech / @ericjang11 could confirm that). But it's still a very impressive demo of the hardware especially considering how affordable it is. Introducing NEO Gamma. Another step closer to home. https://t.co/Fiu2ohbIiP" [X Link](https://x.com/polynoamial/status/1893032730344226899) 2025-02-21T20:19Z 100K followers, 161.3K engagements

"There's a long history of AI doing well at games but that's typically involved the AI *training* on that game. What makes this result so cool and significant is that the model was never trained on Pokemon and yet still does well.
A few researchers at Anthropic have over the past year had a part-time obsession with a peculiar problem. Can Claude play Pokémon? A thread: https://t.co/K8SkNXCxYJ" [X Link](https://x.com/polynoamial/status/1894433384195412256) 2025-02-25T17:04Z 100K followers, 146.8K engagements

"Seeing these creative writing outputs has been a real "feel the AGI" moment for some folks at @OpenAI. The pessimist line lately has been only stuff like code and math will keep getting better; the fuzzy subjective bits will stall. Nope. The tide is rising everywhere. we trained a new model that is good at creative writing (not sure yet how/when it will get released). this is the first time i have been really struck by something written by AI; it got the vibe of metafiction so right. PROMPT: Please write a metafictional literary short story" [X Link](https://x.com/polynoamial/status/1899658588626579627) 2025-03-12T03:07Z 100K followers, 185.2K engagements

"This isn't quite true. Test-time compute helps when verification is easier than generation (e.g. sudoku) but if the task is "When was George Washington born" and you don't know, no amount of thinking will get you to the correct answer. You're bottlenecked by verification. A crucial point that everyone should be internalizing: in the age of test-time search it's pretty much always possible to reach any level of capability by simply expending more compute. So it's not just about "can you do it". The key is how efficiently you can do it." [X Link](https://x.com/polynoamial/status/1904534656588136743) 2025-03-25T14:03Z 100K followers, 224.8K engagements

"Our latest @OpenAI model GPT-4.1 achieves 55% on SWE-Bench Verified *without being a reasoning model*.
@michpokrass and team did an amazing job on this (New reasoning models coming soon too.)" [X Link](https://x.com/polynoamial/status/1911831926241153170) 2025-04-14T17:20Z 100K followers, 212.5K engagements

"This is worth tuning in for Livestream in o3 hours." [X Link](https://x.com/polynoamial/status/1912510161635488230) 2025-04-16T14:15Z 100K followers, 108K engagements

"Today we're releasing @OpenAI o3/o4-mini. The eval numbers are SOTA (2700 Elo is among the top [---] competition coders) But what I'm most excited about is the stuff we can't benchmark. I expect o3/o4-mini will aid scientists in their research and I'm excited to see what they do" [X Link](https://x.com/polynoamial/status/1912556828099244454) 2025-04-16T17:20Z 100K followers, 64.2K engagements

"Our new @OpenAI o3 and o4-mini models further confirm that scaling inference improves intelligence and that scaling RL shifts up the whole compute vs. intelligence curve. There is still a lot of room to scale both of these further" [X Link](https://x.com/polynoamial/status/1912564068168450396) 2025-04-16T17:49Z 100K followers, 131.8K engagements

"We did not solve math. For example our models are still not great at writing proofs. o3 and o4-mini are nowhere close to getting International Mathematics Olympiad gold medals. AI has *solved* math. OpenAI did it with o4 Not "is close to solving math" Not "is competitive at math" *SOLVED* This is far bigger than anyone realizes. Let me explain why. First you need to understand some historical context. Typically with AI/ML you know that you're https://t.co/5TUIECG91d" [X Link](https://x.com/anyuser/status/1912575974782423164) 2025-04-16T18:37Z 100K followers, 346.6K engagements

"I recently made this plot for a talk I gave on AI progress and it helped me appreciate how quickly AI models are improving.
I know there's still a lot of benchmarks where progress is flat but progress on Codeforces was quite flat for a long time too" [X Link](https://x.com/polynoamial/status/1918746853866127700) 2025-05-03T19:17Z 100K followers, 207.2K engagements

"People often ask me: will reasoning models ever move beyond easily verifiable tasks? I tell them we already have empirical proof that they can and we released a product around it: @OpenAI Deep Research" [X Link](https://x.com/polynoamial/status/1922344909412929672) 2025-05-13T17:35Z 100K followers, 99.5K engagements

"Input is now $2 per 1M and Output is now $8 per 1M. The cost vs intelligence curve will continue to improve rapidly. we dropped the price of o3 by 80% excited to see what people will do with it now. think you'll also be happy with o3-pro pricing for the performance :)" [X Link](https://x.com/polynoamial/status/1932463115867914741) 2025-06-10T15:41Z 100K followers, 155.8K engagements

"I'm fortunate to be able to devote my career to researching AI and building reasoning models like o3 for the world to use.
If you want to join us in pushing forward the intelligence frontier we're hiring at @OpenAI" [X Link](https://x.com/polynoamial/status/1932600979113005300) 2025-06-11T00:49Z 100K followers, 118.2K engagements

"When this happened in [----] almost everyone in AI thought it was outrageously overpriced still can't believe DeepMind was acquired for $400M" [X Link](https://x.com/polynoamial/status/1936276912424599719) 2025-06-21T04:16Z 100K followers, 374.4K engagements

"AI researchers will literally negotiate $100 million comp packages by themselves but they won't play poker for more than $50 buy-ins" [X Link](https://x.com/polynoamial/status/1940275777981100448) 2025-07-02T05:06Z 100K followers, 199.6K engagements

"Something to watch out for when evaluating tool-using agents: they can "cheat" by browsing the web and simply looking up the answer key. The @OpenAI ChatGPT Agent team had to take special care to mitigate this risk. ChatGPT agent's capabilities are reflected in its state-of-the-art performance on academic and real-world task evaluations like data modeling spreadsheet editing and investment banking. https://t.co/t52TvkjhwF" [X Link](https://x.com/polynoamial/status/1945910153301164100) 2025-07-17T18:15Z 100K followers, 75.2K engagements

"Today we at @OpenAI achieved a milestone that many considered years away: gold medal-level performance on the [----] IMO with a general reasoning LLM, under the same time limits as humans, without tools. As remarkable as that sounds it's even more significant than the headline 🧵 1/N I'm excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world's most prestigious math competition, the International Math Olympiad (IMO).
https://t.co/SG3k6EknaC"
[X Link](https://x.com/polynoamial/status/1946478249187377206) 2025-07-19T07:52Z 100K followers, 1.2M engagements

"When you work at a frontier lab you usually know where frontier capabilities are months before anyone else. But this result is brand new using recently developed techniques. It was a surprise even to many researchers at OpenAI. Today everyone gets to see where the frontier is"
[X Link](https://x.com/polynoamial/status/1946478260482625627) 2025-07-19T07:52Z 100K followers, 77.3K engagements

"I think it's safe to say this @OpenAI IMO gold result came as a bit of a surprise to folks"
[X Link](https://x.com/polynoamial/status/1946485373124608491) 2025-07-19T08:20Z 100K followers, 464.5K engagements

"It takes us a few months to turn the experimental research frontier into a product. But progress is so fast that a few months can mean a big difference in capabilities. So all the models underperform humans on the new International Mathematical Olympiad questions and Grok-4 is especially bad on it even with best-of-n selection Unbelievable https://t.co/Z06oFaZ8Sc"
[X Link](https://x.com/anyuser/status/1946509154752811136) 2025-07-19T09:55Z 100K followers, 132.9K engagements

"It's truly a privilege to be able to wake up every morning see where the latest intelligence frontier is and help push it a little further"
[X Link](https://x.com/anyuser/status/1946614116480671840) 2025-07-19T16:52Z 100K followers, 161.2K engagements

"It can be hard to feel the AGI until you see an AI master a domain you care deeply about. Everyone will have their Lee Sedol moment at a different time.
the openai IMO news hit me pretty heavy this weekend i'm still in the acute phase of the impact i think i consider myself a professional mathematician (a characterization some actual professional mathematicians might take issue with but my party my rules) and i don't think i"
[X Link](https://x.com/polynoamial/status/1947878967639433568) 2025-07-23T04:38Z 100K followers, 162.7K engagements

"Our new @OpenAI open models Our open models are here. Both of them. https://t.co/9tFxefOXcg"
[X Link](https://x.com/polynoamial/status/1952778238368887184) 2025-08-05T17:06Z 100.1K followers, 150.3K engagements

"Hmm I wonder what this could be LIVE5TREAM THURSDAY 10AM PT"
[X Link](https://x.com/anyuser/status/1953145902442659913) 2025-08-06T17:27Z 100K followers, 107.1K engagements

"I'm more optimistic than ever that we at @OpenAI can eliminate hallucinations. There's still more research to be done but GPT-5 is solid progress"
[X Link](https://x.com/polynoamial/status/1953517966978322545) 2025-08-07T18:05Z 100.1K followers, 113.5K engagements

"In my opinion the most important takeaway from this result is that our @OpenAI International Math Olympiad (IMO) gold model is also our best competitive coding model. 🧵 1/n I'm thrilled to share that our @OpenAI reasoning system scored high enough to achieve gold 🥇🥇 in one of the world's top programming competitions - the [----] International Olympiad in Informatics (IOI) - placing first among AI participants 👨‍💻👨‍💻 https://t.co/k3RQxIzXPH"
[X Link](https://x.com/anyuser/status/1954966398989635668) 2025-08-11T18:01Z 100K followers, 167.8K engagements

"@kimmonismus Does it still work if you raise the table [--] inches"
[X Link](https://x.com/anyuser/status/1955411293772583190) 2025-08-12T23:29Z 99.1K followers, 668.7K engagements

"I recently chatted with a VC who believed AGI was coming and would disrupt a lot of jobs but not *their* job. Of course AIs could write code and review contracts. But making accurate calibrated predictions about the future? *That's* uniquely human. 🔮 Introducing Prophet Arena the AI benchmark for general predictive intelligence. That is, can AI truly predict the future by connecting today's dots? What makes it special? - It can't be hacked. Most benchmarks saturate over time but here models face live unseen https://t.co/rhwR5WlU9d"
[X Link](https://x.com/polynoamial/status/1957175457343848512) 2025-08-17T20:19Z 100K followers, 184.6K engagements

"GPT-5 Thinking definitely isn't perfect but it's the first AI model I can trust more than many common sources of truth on the internet"
[X Link](https://x.com/polynoamial/status/1959730758140174445) 2025-08-24T21:33Z 100K followers, 134.1K engagements

"To all undergrads interested in learning about AI: be wary of taking Intro to AI as your first AI course. In many programs the class you actually want first is Intro to Machine Learning. AI technology has exploded in the past [--] years thanks to deep neural networks.
Yet at many schools the Intro to AI curriculum has barely changed from what it was in [----] and spends often only a few lectures on machine learning. Unfortunately revamping Intro to AI is controversial at many universities and inertia tends to dominate. Dont decide which course to take based on the name alone. Instead check the" [X Link](https://x.com/polynoamial/status/1961065513745789423) 2025-08-28T13:57Z 100K followers, 151.3K engagements "@emollick People deserved to know what was coming. There were many factors to consider but a big one was that it didn't feel right to hide a development like this from the world for so long" [X Link](https://x.com/anyuser/status/1964375558009319858) 2025-09-06T17:10Z 100K followers, 55K engagements "When we at @OpenAI released o1-preview a year ago it would think for seconds. Today our best reasoning models can think for hours browse the web and write code. But there's a lot of room to push reasoning even further. I'm excited for what the next year will bring @OpenAI o1 is trained with RL to think before responding via a private chain of thought. The longer it thinks the better it does on reasoning tasks. This opens up a new dimension for scaling. Were no longer bottlenecked by pretraining. We can now scale inference compute too. https://t.co/niqRO9hhg1 @OpenAI o1 is trained with RL to" [X Link](https://x.com/polynoamial/status/1966527147469598794) 2025-09-12T15:39Z 100K followers, 224.8K engagements "Julian was co-first author on AlphaGo AlphaZero and MuZero. He doesn't have a major twitter presence but he's been at the forefront of AI exponential progress for more than a decade. As a researcher at a frontier lab Im often surprised by how unaware of current AI progress public discussions are. 
I wrote a post to summarize studies of recent progress and what we should expect in the next 1-2 years: https://t.co/B7438Z9lOF As a researcher at a frontier lab Im often surprised by how unaware of current AI progress public discussions are. I wrote a post to summarize studies of recent progress and" [X Link](https://x.com/polynoamial/status/1972167347088904371) 2025-09-28T05:11Z 100.1K followers, 327.7K engagements "My new hobby is asking GPT-5 Thinking to find errors in every @Wikipedia page I read. Interestingly almost every page I checked has at least one error. ๐งต" [X Link](https://x.com/anyuser/status/1973780497261371533) 2025-10-02T16:01Z 100K followers, 342.7K engagements "Below is a deep dive into why self play works for two-player zero-sum (2p0s) games like Go/Poker/Starcraft but is so much harder to use in "real world" domains. tl;dr: self play converges to minimax in 2p0sgames and minimax is really useful in those games. Every finite 2p0s game has a minimax equilibrium which is essentially an unbeatable strategy in expectation (assuming the players alternate sides). In rock paper scissors for example minimax is 1/3rd on each action. Is minimax what we want Not necessarily. If you're playing minimax in Rock Paper Scissors when most opponents' strategies" [X Link](https://x.com/polynoamial/status/1980697004658556972) 2025-10-21T18:05Z 100.1K followers, 314.5K engagements "Today we at @OpenAI are releasing GPT-5.1-Codex-Max which can work autonomously for more than a day over millions of tokens. Pretraining hasn't hit a wall and neither has test-time compute. Congrats to my teammates @kevinleestone & @mikegmalek for helping to make it possible" [X Link](https://x.com/polynoamial/status/1991212955250327768) 2025-11-19T18:32Z 99.6K followers, 465.5K engagements "@_NathanCalvin Its important to know what their limits are. 
I definitely don't trust it blindly"
[X Link](https://x.com/polynoamial/status/2018446441228959782) 2026-02-02T22:08Z 99.4K followers, [----] engagements

"@daniel_mac8 @OpenAI I think to truly automate research we'll need new capabilities"
[X Link](https://x.com/polynoamial/status/2018797614838464732) 2026-02-03T21:23Z 99.6K followers, 29.7K engagements

"@_sholtodouglas @jekbradbury @GoogleDeepMind @andy_l_jones @OpenAI @kevin_wang3290 A word on comp: I know folks who become quants to make money but [--] years later ask what they're doing with their life. We're at a special time in history. In AI research you can help positively guide the most important tech of our time *and* get paid well https://x.com/polynoamial/status/1911925322486104535 I worked in quant trading for a year after undergrad but didn't want my lifetime contribution to humanity to be making equity markets marginally more efficient. Taking a paycut to pursue AI research was my"
[X Link](https://x.com/polynoamial/status/2014085135268757985) 2026-01-21T21:18Z 100.7K followers, 28.8K engagements

"Labs like @OpenAI also hire researchers straight out of undergrad like @kevin_wang3290 though the bar is high. Kevin was highly recommended by his advisor and was first author on a NeurIPS [----] paper. There's a lot of bad NeurIPS papers but we could tell this was a great one. (Indeed after he joined OpenAI his paper was one of [--] out of [----] to receive a Best Paper award.) His advisor's recommendation counted for a lot because it can be hard to evaluate a researcher just based on a resume or even a paper. https://x.com/kevin_wang3290/status/1902753430583525727"
[X Link](https://x.com/polynoamial/status/2014084657030349005) 2026-01-21T21:16Z 100.7K followers, 21.9K engagements

"12 years ago I tried making my first poker AI in college and dreamed of beating the world's best pros. After seven years of a PhD I'm excited to announce that I finally did it! It's been quite an adventure.
Looking forward to the next one Facebook AI and @CarnegieMellon researchers have built Pluribus the first AI bot to beat elite poker pros in [--] player Texas Holdem. This breakthrough is the first major benchmark outside of [--] player games and were sharing specifics on how we built it. https://t.co/zId9x4VBqc https://t.co/u89irNcxEK Facebook AI and @CarnegieMellon researchers have built" [X Link](https://x.com/polynoamial/status/1149382073381195776) 2019-07-11T18:16Z 100.2K followers, [----] engagements "6 years ago today AlphaGo beat Lee Sedol in a milestone for AI. Typically deep learning gets the credit but it's important to know that nobody has *ever* trained a raw NN that's superhuman in Go. All superhuman Go bots require tree search. IMO planning is underappreciated in AI" [X Link](https://x.com/polynoamial/status/1501534834950213632) 2022-03-09T12:26Z 100.2K followers, [----] engagements "I've been listening to @lexfridman's podcast for a long time but it was truly an amazing experience to sit down with him myself and talk about our latest research in multi-agent AI for Poker Diplomacy & more Here's my conversation with Noam Brown (@polynoamial) co-creator of AI systems that achieve superhuman level performance in games of poker and Diplomacy that involves strategic negotiations with humans. This was a fascinating technical conversation. 
https://t.co/e6BArJjnag https://t.co/m9O592F2Cu Here's my conversation with Noam Brown (@polynoamial) co-creator of AI systems that achieve" [X Link](https://x.com/polynoamial/status/1600261114637750272) 2022-12-06T22:49Z 100.2K followers, [----] engagements "This meme summarizes the paper nicely OpenAI presents: Competitive Programming with Large Reasoning Models - Competed live at IOI [----] - o3 achieved gold - General-purpose o3 surpasses o1 w/ hand-crafted pipelines specialized for coding resultss https://t.co/zuZPq0rZJF OpenAI presents: Competitive Programming with Large Reasoning Models - Competed live at IOI [----] - o3 achieved gold - General-purpose o3 surpasses o1 w/ hand-crafted pipelines specialized for coding resultss https://t.co/zuZPq0rZJF" [X Link](https://x.com/polynoamial/status/1889541408065028421) 2025-02-12T05:05Z 100.2K followers, 89K engagements "Deep Research has been a big hit but was previously limited to pro subscribers. Now all plus users can try it for themselves Deep research is now rolling out to all ChatGPT Plus Team Edu and Enterprise users ๐พ Deep research is now rolling out to all ChatGPT Plus Team Edu and Enterprise users ๐พ" [X Link](https://x.com/polynoamial/status/1894458074267881811) 2025-02-25T18:42Z 100.2K followers, 136.5K engagements "6 years ago AI pioneer and now Turing Award winner @RichardSSutton distilled [--] years of AI into a simple Bitter Lesson: general methods that scale with data and compute ultimately win. 
With the rise of AI agents it's an important lesson to keep in mind: https://www.cs.utexas.edu/eunsol/courses/data/bitter_lesson.pdf https://www.cs.utexas.edu/eunsol/courses/data/bitter_lesson.pdf https://t.co/dW1mBPdS2f Machines that learn from experience were explored by Alan Turing almost eighty years ago which makes it particularly gratifying and humbling to receive an award in his name for reviving this" [X Link](https://x.com/polynoamial/status/1897693005601292491) 2025-03-06T16:57Z 100.2K followers, 112.1K engagements "Memory isn't just another product feature. It signals a shift from episodic interactions (think a call center) to evolving ones (more like a colleague or friend). Still a lot of research to do but it's a step toward fundamentally changing how we interact with LLMs. Starting today memory in ChatGPT can now reference all of your past chats to provide more personalized responses drawing on your preferences and interests to make it even more helpful for writing getting advice learning and beyond. https://t.co/s9BrWl94iY Starting today memory in ChatGPT can now reference all of your past chats to" [X Link](https://x.com/polynoamial/status/1910379351759347860) 2025-04-10T17:08Z 100.4K followers, 251K engagements "Theres an old joke in AI: as soon as machines outperform humans at something it stops being considered AI. Glad to see poker solvers have reached that point. ChatGPT totally sucks at poker. It knows it sucks if you ask it. Today's newsletter is a deeper dive into why with some speculation about what this means for AI capabilities. https://t.co/rmB7YwVm6D ChatGPT totally sucks at poker. It knows it sucks if you ask it. Today's newsletter is a deeper dive into why with some speculation about what this means for AI capabilities. https://t.co/rmB7YwVm6D" [X Link](https://x.com/polynoamial/status/1925394934468853792) 2025-05-22T03:34Z 100.2K followers, 82.1K engagements "Their bet allowed for formal math AI systems (like AlphaProof). 
In [----] almost nobody thought an LLM could be IMO gold level by [----]. We are seeing much faster AI progress than **Paul Christiano** and **Yudkowsky** predicted who had gold in [----] at 8% and 16% respectively by methods that are more general than expected We are seeing much faster AI progress than **Paul Christiano** and **Yudkowsky** predicted who had gold in [----] at 8% and 16% respectively by methods that are more general than expected" [X Link](https://x.com/polynoamial/status/1946517375060500651) 2025-07-19T10:28Z 100.2K followers, 170.1K engagements "@Mihonarium 1) We posted *after* the closing ceremony. It was livestreamed so this is easy to confirm. 2) We weren't in touch with IMO. I spoke with one organizer before the post to let him know. He requested we wait until after the closing ceremony ends to respect the kids and we did" [X Link](https://x.com/polynoamial/status/1947024171860476264) 2025-07-20T20:01Z 100.2K followers, 102.3K engagements "Considering the technology and the pace of progress I think this is quite sane. This is insane. AI capex might account for a larger share of GDP than basically any technology since the railroad. Basically its a mini-wartime economy but the guns are chips and the tanks are databases https://t.co/E11IxmYtOv This is insane. AI capex might account for a larger share of GDP than basically any technology since the railroad. Basically its a mini-wartime economy but the guns are chips and the tanks are databases https://t.co/E11IxmYtOv" [X Link](https://x.com/polynoamial/status/1952043966121365813) 2025-08-03T16:28Z 100.2K followers, 101.7K engagements "@adcock_brett Nice It's great to see more robotics demos like this. And yeah there's tons more ways to demonstrate environment robustness. 
I'd love to see how far you all can push it" [X Link](https://x.com/polynoamial/status/1956491291304632637) 2025-08-15T23:00Z 100.3K followers, 22K engagements "12/12 problems solved which would be equivalent to a 1st place performance. GPT-5's solutions were responsible for solving 11/12 of them. 1/n Im really excited to share that our @OpenAI reasoning system got a perfect score of 12/12 during the [----] ICPC World Finals the premier collegiate programming competition where top university teams from around the world solve complex algorithmic problems. This would have https://t.co/MA5KQdIxCj 1/n Im really excited to share that our @OpenAI reasoning system got a perfect score of 12/12 during the [----] ICPC World Finals the premier collegiate" [X Link](https://x.com/polynoamial/status/1968369005116408149) 2025-09-17T17:38Z 100.2K followers, 73.3K engagements "A good example is @_sholtodouglas at @GoogleDeepMind. He's quiet on Twitter doesn't have any flashy first-author publications and has only been in the field for [---] years but people in AI know he was one of the most important people behind Gemini's success @eladgil @patrickc In AI at least the real [--] under [--] imo you have never heard of. They are [--] layers down the org chart from the CEO. They are usually not on Twitter they have an unmaintained LinkedIn they dont go on podcasts and they maybe published at one point but dont do so anymore. They @eladgil @patrickc In AI at least the real 30" [X Link](https://x.com/polynoamial/status/1748839866740154476) 2024-01-20T22:48Z 100.5K followers, 513.2K engagements "This is on the scale of the Apollo Program and Manhattan Project when measured as a fraction of GDP. This kind of investment only happens when the science is carefully vetted and people believe it will succeed and be completely transformative. I agree its the right time. 
Announcing The Stargate Project. The Stargate Project is a new company which intends to invest $500 billion over the next four years building new AI infrastructure for OpenAI in the United States. We will begin deploying $100 billion immediately. This infrastructure will secure"
[X Link](https://x.com/polynoamial/status/1881833454213767600) 2025-01-21T22:37Z 100.6K followers, 926.5K engagements
Top posts by engagements in the last [--] hours
"I'm often asked how to land a research job at a frontier AI lab. It's hard especially without a research background but I like to point to @kellerjordan0 as an example showing it can be done. Keller graduated from UCSD with no publication record and was working at an AI content moderation startup when he landed a cold call with @bneyshabur (who was at Google) and presented an idea to improve upon Behnam's recent paper. Behnam agreed to mentor him which led to an ICLR paper. Sadly there's less open research today but improving upon a researcher's published work is a great way to demonstrate"
X Link 2026-01-21T21:15Z 100.7K followers, 703.5K engagements
"Had to cut this one for space: 2019: AI can't create art - creativity is uniquely human"
X Link 2026-01-26T23:08Z 100.7K followers, 23.4K engagements
"Codex is writing all my code these days and I've fully switched to using the Codex app. The velocity of the Codex team is really impressive. I am Tibo and I have an incredible team. Codex would not exist without them and they cooked. Enjoy the new Codex app access through your free/go ChatGPT plan and 2X rate limits on other plans. Can't wait to hear what you do with it. https://t.co/Lwg13vEJDn https://t.co/c7AaRCenoQ"
X Link 2026-02-02T18:15Z 100.7K followers, 142.1K engagements
"Every [--] months I hear people claim @OpenAI isn't doing real research and is just incrementally improving ChatGPT. I even heard it right before 🍓/o1. In my opinion @OpenAI is the best frontier lab to do research at today. Building an AI research intern in [----] is not hype. How does OpenAI balance long-term research bets with product-forward research fundamentals? I've been getting this question a lot lately usually framed as a suggestion that Jakub (@merettm) and I are pushing an increasingly product-focused agenda. That characterization is"
X Link 2026-02-03T21:04Z 100.7K followers, 103.8K engagements
"I worked in quant trading for a year after undergrad but didn't want my lifetime contribution to humanity to be making equity markets marginally more efficient. Taking a paycut to pursue AI research was my best life decision. Today you don't even need to take a paycut to do it. be you work in HFT shaving nanoseconds off latency or extracting bps from models have existential dread see this tweet wonder if your skills could be better used making AGI apply to attend this party meet the openai team build AGI"
X Link 2025-04-14T23:31Z 100.7K followers, 322.5K engagements
"Here's the 80% success rate plot"
X Link 2026-02-04T22:53Z 100.7K followers, 27K engagements
"You can read about GPT-5.3-Codex here: https://openai.com/index/introducing-gpt-5-3-codex/"
X Link 2026-02-05T18:21Z 100.7K followers, [----] engagements
"RT @deanwball: The use of GPT-5 as evidence that "AI is slowing down" is a legendary example of mass delusion. Yes its release timing gave"
X Link 2026-02-07T22:26Z 100.7K followers, [--] engagements
"There have been fair questions on whether LLM contributions to STEM are overhyped but I've spoken with physicists about this result and they've told me it is a truly significant research contribution roughly at the level of a solid journal paper and GPT-5.2 played a key role. GPT-5.2 derived a new result in theoretical physics. We're releasing the result in a preprint with researchers from @the_IAS @VanderbiltU @Cambridge_Uni and @Harvard. It shows that a gluon interaction many physicists expected would not occur can arise under specific"
X Link 2026-02-13T20:53Z 100.7K followers, 122.1K engagements
"If you haven't disabled voice authentication for your bank account and had a conversation with your family about AI voice impersonation yet now would be a good time. We're sharing our learnings from a small-scale preview of Voice Engine a model which uses text input and a single 15-second audio sample to generate natural-sounding speech that closely resembles the original speaker. https://t.co/yLsfGaVtrZ"
X Link 2024-03-29T19:50Z 100.2K followers, 162.5K engagements
"I've heard people claim that Sam is just drumming up hype but from what I've seen everything he's saying matches the median view of @OpenAI researchers on the ground. Sam Altman says the pathway to AGI is now clear and "we actually know what do" it will be easier to get to Level [--] Innovating AI than he initially thought and "things are going to go a lot faster than people are appreciating right now" https://t.co/LOhzP0SA8h"
X Link 2024-11-09T00:00Z 100.4K followers, 845.2K engagements
"We announced @OpenAI o1 just [--] months ago. Today we announced o3. We have every reason to believe this trajectory will continue"
X Link 2024-12-20T18:22Z 100.2K followers, 1.9M engagements
"Scaling pretraining and scaling thinking are two different dimensions of improvement. They are complementary not in competition"
X Link 2025-02-27T20:19Z 100.3K followers, 130.5K engagements
"It's deeply concerning that one of the best AI researchers I've worked with @kaicathyc was denied a U.S. green card today. A Canadian who's lived and contributed here for [--] years now has to leave. We're risking America's AI leadership when we turn away talent like this"
X Link 2025-04-25T13:49Z 100.6K followers, 2.6M engagements
"You don't need a PhD to be a great AI researcher. Even @OpenAI's Chief Research Officer doesn't have a PhD"
X Link 2025-06-28T23:27Z 100.1K followers, 1.3M engagements
"Congrats to the GDM team on their IMO result! I think their parallel success highlights how fast AI progress is. Their approach was a bit different than ours but I think that shows there are many research directions for further progress. Some thoughts on our model and results 🧵"
X Link 2025-07-21T20:49Z 100.3K followers, 502.8K engagements
"Self play works so well in chess, Go, and poker because those games are two-player zero-sum. That simplifies a lot of problems. The real world is messier which is why we haven't seen many successes from self play in LLMs yet. Btw @karpathy did great and I mostly agree with him. .@karpathy says that LLMs currently lack the cultural accumulation and self-play that propelled humans out of the savannah: Culture: Why can't an LLM write a book for the other LLMs? Why can't other LLMs read this LLM's book and be inspired by it or shocked by it? Self https://t.co/InZwkzW9U2"
X Link 2025-10-21T15:13Z 100.2K followers, 346.3K engagements
"@deanwball I hope policymakers will consider all of this going forward when deciding whose opinions to trust"
X Link 2026-02-07T22:29Z 100.1K followers, [----] engagements
"1) The RSP says that AI R&D-4 calls for both ASL-3 and an "affirmative case" on misalignment. The model is released under ASL-3 but no affirmative case has been released. 2) My critique is primarily about methodology. My issue isn't with the safeguards around model release but rather that it's bad methodology to primarily rely on an internal survey to determine model capabilities and then follow up only with respondents that give a certain response. https://twitter.com/i/web/status/2021306884490396098"
X Link 2026-02-10T19:34Z 100.5K followers, [----] engagements
"Today I'm excited to share with you all the fruit of our effort at @OpenAI to create AI models capable of truly general reasoning: OpenAI's new o1 model series (aka 🍓). Let me explain 🧵 1/"
X Link 2024-09-12T17:17Z 100.7K followers, 2.8M engagements
"@OpenAI o1 is trained with RL to think before responding via a private chain of thought. The longer it thinks the better it does on reasoning tasks. This opens up a new dimension for scaling. We're no longer bottlenecked by pretraining. We can now scale inference compute too"
X Link 2024-09-12T17:18Z 100.7K followers, 1.5M engagements
"GPT-5-Codex is 10x faster for the easiest queries and will think 2x longer for the hardest queries that benefit most from more compute. We're releasing GPT-5-Codex, a version of GPT-5 further optimized for agentic coding in Codex. Available in the Codex CLI, IDE Extension, web, mobile, and for code reviews in GitHub. https://t.co/OVGrUovgHN"
X Link 2025-09-15T19:11Z 100.7K followers, 161.9K engagements
"Social media tends to frame AI debate into two caricatures: (A) Skeptics who think LLMs are doomed and AI is a bunch of hype. (B) Fanatics who think we have all the ingredients and superintelligence is imminent. But if you read what leading researchers actually say (beyond the headlines) there's a surprising amount of convergence: 1) The current paradigm is likely sufficient for massive economic and societal impact even without further research breakthroughs. 2) More research breakthroughs are probably needed to achieve AGI/ASI. (Continual learning and sample efficiency are two examples that"
X Link 2025-11-28T16:11Z 100.7K followers, 1.3M engagements
"I vibecoded an open-source poker river solver over the holiday break. The code is 100% written by Codex and I also made a version with Claude Code to compare. Overall these tools allowed me to iterate much faster in a domain I know well. But I also felt I couldn't fully trust them. They'd make mistakes and encounter bugs but rather than acknowledging it they'd often think it wasn't a big deal or on occasion just straight up try to gaslight me into thinking nothing is wrong. In one memorable debugging session with Claude Code I asked it as a sanity check what the expected value would be of an"
X Link 2026-01-05T20:41Z 100.7K followers, 422.3K engagements
"1987: AI can't win at chess - planning is uniquely human
1997: AI can't win at Go - intuition is uniquely human
2016: AI can't win at poker - bluffing is uniquely human
2023: AI can't get IMO gold - reasoning is uniquely human
2026: AI can't make wise decisions - judgment is uniquely human"
X Link 2026-01-26T19:48Z 100.7K followers, 967.8K engagements
"It's fun watching Doug try to contain his exasperation with the bots' "logic" and actions in the replays. Clearly we still have a long way to go until LLMs master poker. Happy to see GPT-5.2 is the champion though. The results are in. [------] hands later we have a winner in the AI Poker Showdown from @kaggle I don't want to spoil the results here but we had an awesome match in the finals between o3 and GPT [---] Watch the video to see who wins: https://t.co/zHPwb2DDlh https://t.co/eeMaaPdJ8R"
X Link 2026-02-04T22:32Z 100.7K followers, 26K engagements
"GPT-5.2 evals are finally out for METR and it's state-of-the-art. Here's the linear-scale plot. The 80% success-rate plot (below) is even more stark. We estimate that GPT-5.2 with high (not xhigh) reasoning effort has a 50%-time-horizon of around [---] hrs (95% CI of [--] hr [--] min to [--] hr [--] min) on our expanded suite of software tasks. This is the highest estimate for a time horizon measurement we have reported to date. https://t.co/USkHNuFexc"
X Link 2026-02-04T22:53Z 100.7K followers, 582K engagements
"GPT-5.3-Codex's much better token efficiency AND faster inference is the biggest story of this release. Folks at @OpenAI worked hard to improve this and it will only get better from here. GPT-5.3-Codex is here *Best coding performance (57% SWE-Bench Pro, 76% TerminalBench [---], 64% OSWorld). *Mid-task steerability and live updates during tasks. *Faster: less than half the tokens of 5.2-Codex for same tasks and 25% faster per token. *Good computer use."
X Link 2026-02-05T18:21Z 100.7K followers, 152.6K engagements
"When GPT-5 was released some folks claimed AI progress was hitting a wall whereas others said progress would continue. GPT-5.2 was released [--] months ago. GPT-5.3-Codex was released [--] days ago and is twice as token efficient for coding. It's clear who turned out to be correct"
X Link 2026-02-07T20:42Z 100.7K followers, 372.3K engagements
"@VictorTaelin Yes. I think by the end of the year the main challenge for @METR_Evals will be measuring horizons that long"
X Link 2026-02-08T01:16Z 100.7K followers, 100.1K engagements
"I appreciate @Anthropic's honesty in their latest system card but the content of it does not give me confidence that the company will act responsibly with deployment of advanced AI models: -They primarily relied on an internal survey to determine whether Opus [---] crossed their autonomous AI R&D-4 threshold (and would thus require stronger safeguards to release under their Responsible Scaling Policy). This wasn't even an external survey of an impartial 3rd party but rather a survey of Anthropic employees. -When 5/16 internal survey respondents initially gave an assessment that suggested"
X Link 2026-02-10T16:54Z 100.7K followers, 186.3K engagements
"@Anthropic System card is here: https://www-cdn.anthropic.com/14e4fb01875d2a69f646fa5e574dea2b1c0ff7b5.pdf"
X Link 2026-02-10T16:57Z 100.7K followers, 14K engagements
"@fchollet How long do you think it will be before ARC-3 is saturated?"
X Link 2026-02-12T21:04Z 100.7K followers, 22.2K engagements
"Francois Chollet: "AGI 2030" Folks often point to @fchollet as an AGI skeptic but he's said multiple times that he thinks it arrives within [--] years. The AI we have today is not AGI but it's progressing very quickly. @Yossi_Dahan_ @polynoamial ARC-4 is in the works to be released early [----]. ARC-5 is also planned. The final ARC will probably be 6-7. The point is to keep making benchmarks until it is no longer possible to propose something that humans can do and AI can't. AGI [----]."
X Link 2026-02-12T23:23Z 100.7K followers, 105.6K engagements
"After the IMO results last summer some dismissed it as high school math. We think our latest models will remove any doubt that STEM research is about to fundamentally change. Mathematicians created a set of [--] research questions that arose naturally from their own research. Only they know the answers and they gave the world a week to use LLMs to try to solve them. We think our latest models make it possible to solve several of them. This is an internal model for now but I'm optimistic we'll get it (or a better model) out soon. Very excited about the "First Proof" challenge. I believe novel"
X Link 2026-02-14T04:24Z 100.7K followers, 321.2K engagements
"Perhaps a 🌶️ take but I think the criticisms of @GoogleDeepMind's release are missing the point and the real problem is that AI labs and safety orgs need to adapt to a world where intelligence is a function of inference compute. When Google says that Deep Think poses no new risks beyond Gemini [--] Pro they probably mean that Deep Think is a scaffold of Gemini [--] Pro that anyone externally could have constructed on their own anyway. In other words the capabilities of Deep Think have always been available to anyone willing to pay for Deep Think amounts of inference simply by scaffolding a bunch of"
X Link 2026-02-14T23:39Z 100.7K followers, 75.1K engagements
"Lots of vague AI hype on social media these days. There are good reasons to be optimistic about further progress but plenty of unsolved research problems remain"
X Link 2025-01-17T19:16Z 100K followers, 132K engagements
"It can be hard to feel the AGI until you see an AI surpass top humans in a domain you care deeply about. Competitive coders will feel it within a couple years. Paul is early but I think writers will feel it too. Everyone will have their Lee Sedol moment at a different time. Writer of the legendary movie Taxi Driver is having an existential crisis about AI https://t.co/5H89SWUKn9"
X Link 2025-01-19T18:00Z 100K followers, 442.4K engagements
"Making it a little easier for the whole world to see what we're seeing. big news: the free tier of chatgpt is going to get o3-mini (and the plus tier will get tons of o3-mini usage)"
X Link 2025-01-23T17:39Z 100K followers, 103.5K engagements
"Algorithmic breakthroughs and scaling are complementary not in competition. The former bends the performance vs compute curve while the latter moves further along the curve"
X Link 2025-01-28T22:29Z 100K followers, 75.2K engagements
"We at @OpenAI are proud to release o3-mini including for the FREE tier. On many evals it outperforms o1. We're shifting the entire cost-intelligence curve. Model intelligence will continue to go up and the cost for the same intelligence will continue to go down"
X Link 2025-01-31T19:24Z 100K followers, 165.9K engagements
"o1 was released less than [--] months ago. o3-mini was released [--] days ago. Deep Research was released today. It's a powerful tool and I can't wait to see what the world does with it but AI will continue to progress rapidly from here"
X Link 2025-02-03T01:23Z 100K followers, 89.3K engagements
"There's a lot of talk of LLMs "saturating all the evals" but there's plenty of evals people could make where LLMs would do poorly: -Beat a Zelda game -Make a profit in a prediction market -Write a stand-up set that's original and funny I'm bullish on AI but we're far from done"
X Link 2025-02-06T17:55Z 100.1K followers, 175.3K engagements
"When we briefed people on 🍓 before o1-preview's release seeing the CoT live was usually the "aha" moment for them that made it clear this was going to be a big deal. These aren't the raw CoTs but it's a big step closer and I'm glad we can share that experience with the world. Updated chain of thought in OpenAI o3-mini for free and paid users and in o3-mini-high for paid users. https://t.co/uF4XTBGpC5"
X Link 2025-02-06T21:55Z 100K followers, 175.6K engagements
"I think a lot of the misinformation about what LLMs can and cannot do comes from the fact that progress in the field is much faster than academia is used to. Recently computer scientist Binghui Peng and his team proved mathematically that there may be a hard limit to LLMs' compositional task-solving abilities. https://t.co/LDAhMz0Xy5 https://t.co/cxOMkioMDm"
X Link 2025-02-09T05:57Z 100K followers, 184.4K engagements
"A lot of folks seem surprised so I want to point out: the robot in this video is likely teleoperated not autonomous (would be great if @1x_tech / @ericjang11 could confirm that). But it's still a very impressive demo of the hardware especially considering how affordable it is. Introducing NEO Gamma. Another step closer to home. https://t.co/Fiu2ohbIiP"
X Link 2025-02-21T20:19Z 100K followers, 161.3K engagements
"There's a long history of AI doing well at games but that's typically involved the AI training on that game. What makes this result so cool and significant is that the model was never trained on Pokemon and yet still does well. A few researchers at Anthropic have over the past year had a part-time obsession with a peculiar problem. Can Claude play Pokémon? A thread: https://t.co/K8SkNXCxYJ"
X Link 2025-02-25T17:04Z 100K followers, 146.8K engagements
"Seeing these creative writing outputs has been a real "feel the AGI" moment for some folks at @OpenAI. The pessimist line lately has been only stuff like code and math will keep getting better; the fuzzy subjective bits will stall. Nope. The tide is rising everywhere. we trained a new model that is good at creative writing (not sure yet how/when it will get released). this is the first time i have been really struck by something written by AI; it got the vibe of metafiction so right. PROMPT: Please write a metafictional literary short story"
X Link 2025-03-12T03:07Z 100K followers, 185.2K engagements
"This isn't quite true. Test-time compute helps when verification is easier than generation (e.g. sudoku) but if the task is "When was George Washington born?" and you don't know, no amount of thinking will get you to the correct answer. You're bottlenecked by verification. A crucial point that everyone should be internalizing: in the age of test-time search it's pretty much always possible to reach any level of capability by simply expending more compute. So it's not just about "can you do it". The key is how efficiently you can do it."
X Link 2025-03-25T14:03Z 100K followers, 224.8K engagements
"Our latest @OpenAI model GPT-4.1 achieves 55% on SWE-Bench Verified without being a reasoning model. @michpokrass and team did an amazing job on this (New reasoning models coming soon too.)"
X Link 2025-04-14T17:20Z 100K followers, 212.5K engagements
"This is worth tuning in for. Livestream in o3 hours."
X Link 2025-04-16T14:15Z 100K followers, 108K engagements
"Today we're releasing @OpenAI o3/o4-mini. The eval numbers are SOTA (2700 Elo is among the top [---] competition coders) But what I'm most excited about is the stuff we can't benchmark. I expect o3/o4-mini will aid scientists in their research and I'm excited to see what they do"
X Link 2025-04-16T17:20Z 100K followers, 64.2K engagements
"Our new @OpenAI o3 and o4-mini models further confirm that scaling inference improves intelligence and that scaling RL shifts up the whole compute vs. intelligence curve. There is still a lot of room to scale both of these further"
X Link 2025-04-16T17:49Z 100K followers, 131.8K engagements
"We did not solve math. For example our models are still not great at writing proofs. o3 and o4-mini are nowhere close to getting International Mathematics Olympiad gold medals. AI has solved math. OpenAI did it with o4 Not "is close to solving math" Not "is competitive at math" SOLVED This is far bigger than anyone realizes. Let me explain why. First you need to understand some historical context. Typically with AI/ML you know that you're https://t.co/5TUIECG91d"
X Link 2025-04-16T18:37Z 100K followers, 346.6K engagements
"I recently made this plot for a talk I gave on AI progress and it helped me appreciate how quickly AI models are improving. I know there's still a lot of benchmarks where progress is flat but progress on Codeforces was quite flat for a long time too"
X Link 2025-05-03T19:17Z 100K followers, 207.2K engagements
"People often ask me: will reasoning models ever move beyond easily verifiable tasks? I tell them we already have empirical proof that they can and we released a product around it: @OpenAI Deep Research"
X Link 2025-05-13T17:35Z 100K followers, 99.5K engagements
"Input is now $2 per 1M and Output is now $8 per 1M. The cost vs intelligence curve will continue to improve rapidly. we dropped the price of o3 by 80% excited to see what people will do with it now. think you'll also be happy with o3-pro pricing for the performance :)"
X Link 2025-06-10T15:41Z 100K followers, 155.8K engagements
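The per-million prices quoted in the post above translate into a per-request cost with simple arithmetic. The sketch below is illustrative only: it assumes the quoted figures are US dollars per 1M tokens, and the function name `request_cost_usd` and the token counts in the usage note are hypothetical, not taken from the post.

```python
# Prices quoted in the post, assumed here to be USD per 1M tokens.
INPUT_USD_PER_MILLION = 2.0
OUTPUT_USD_PER_MILLION = 8.0


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts (hypothetical helper)."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost
```

For instance, `request_cost_usd(500_000, 250_000)` adds half a million input tokens at the input rate to a quarter million output tokens at the output rate.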
"I'm fortunate to be able to devote my career to researching AI and building reasoning models like o3 for the world to use. If you want to join us in pushing forward the intelligence frontier we're hiring at @OpenAI"
X Link 2025-06-11T00:49Z 100K followers, 118.2K engagements
"When this happened in [----] almost everyone in AI thought it was outrageously overpriced. still can't believe DeepMind was acquired for $400M"
X Link 2025-06-21T04:16Z 100K followers, 374.4K engagements
"AI researchers will literally negotiate $100 million comp packages by themselves but they won't play poker for more than $50 buy-ins"
X Link 2025-07-02T05:06Z 100K followers, 199.6K engagements
"Something to watch out for when evaluating tool-using agents: they can "cheat" by browsing the web and simply looking up the answer key. The @OpenAI ChatGPT Agent team had to take special care to mitigate this risk. ChatGPT agent's capabilities are reflected in its state-of-the-art performance on academic and real-world task evaluations like data modeling, spreadsheet editing, and investment banking. https://t.co/t52TvkjhwF"
X Link 2025-07-17T18:15Z 100K followers, 75.2K engagements
"Today we at @OpenAI achieved a milestone that many considered years away: gold medal-level performance on the [----] IMO with a general reasoning LLM, under the same time limits as humans, without tools. As remarkable as that sounds, it's even more significant than the headline 🧵 1/N I'm excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world's most prestigious math competition, the International Math Olympiad (IMO). https://t.co/SG3k6EknaC"
X Link 2025-07-19T07:52Z 100K followers, 1.2M engagements
"When you work at a frontier lab you usually know where frontier capabilities are months before anyone else. But this result is brand new using recently developed techniques. It was a surprise even to many researchers at OpenAI. Today everyone gets to see where the frontier is"
X Link 2025-07-19T07:52Z 100K followers, 77.3K engagements
"I think it's safe to say this @OpenAI IMO gold result came as a bit of a surprise to folks"
X Link 2025-07-19T08:20Z 100K followers, 464.5K engagements
"It takes us a few months to turn the experimental research frontier into a product. But progress is so fast that a few months can mean a big difference in capabilities. So all the models underperform humans on the new International Mathematical Olympiad questions and Grok-4 is especially bad on it even with best-of-n selection Unbelievable https://t.co/Z06oFaZ8Sc"
X Link 2025-07-19T09:55Z 100K followers, 132.9K engagements
"It's truly a privilege to be able to wake up every morning, see where the latest intelligence frontier is, and help push it a little further"
X Link 2025-07-19T16:52Z 100K followers, 161.2K engagements
"It can be hard to feel the AGI until you see an AI master a domain you care deeply about. Everyone will have their Lee Sedol moment at a different time. the openai IMO news hit me pretty heavy this weekend i'm still in the acute phase of the impact i think i consider myself a professional mathematician (a characterization some actual professional mathematicians might take issue with but my party my rules) and i don't think i"
X Link 2025-07-23T04:38Z 100K followers, 162.7K engagements
"Our new @OpenAI open models Our open models are here. Both of them. https://t.co/9tFxefOXcg"
X Link 2025-08-05T17:06Z 100.1K followers, 150.3K engagements
"Hmm I wonder what this could be LIVE5TREAM THURSDAY 10AM PT"
X Link 2025-08-06T17:27Z 100K followers, 107.1K engagements
"I'm more optimistic than ever that we at @OpenAI can eliminate hallucinations. There's still more research to be done but GPT-5 is solid progress"
X Link 2025-08-07T18:05Z 100.1K followers, 113.5K engagements
"In my opinion the most important takeaway from this result is that our @OpenAI International Math Olympiad (IMO) gold model is also our best competitive coding model. 🧵 1/n I'm thrilled to share that our @OpenAI reasoning system scored high enough to achieve gold 🥇🥇 in one of the world's top programming competitions - the [----] International Olympiad in Informatics (IOI) - placing first among AI participants 👨‍💻👨‍💻 https://t.co/k3RQxIzXPH"
X Link 2025-08-11T18:01Z 100K followers, 167.8K engagements
"@kimmonismus Does it still work if you raise the table [--] inches?"
X Link 2025-08-12T23:29Z 99.1K followers, 668.7K engagements
"I recently chatted with a VC who believed AGI was coming and would disrupt a lot of jobs but not their job. Of course AIs could write code and review contracts. But making accurate calibrated predictions about the future? That's uniquely human. 🔮 Introducing Prophet Arena, the AI benchmark for general predictive intelligence. That is, can AI truly predict the future by connecting today's dots? What makes it special - It can't be hacked. Most benchmarks saturate over time but here models face live unseen https://t.co/rhwR5WlU9d"
X Link 2025-08-17T20:19Z 100K followers, 184.6K engagements
"GPT-5 Thinking definitely isn't perfect but it's the first AI model I can trust more than many common sources of truth on the internet"
X Link 2025-08-24T21:33Z 100K followers, 134.1K engagements
"To all undergrads interested in learning about AI: be wary of taking Intro to AI as your first AI course. In many programs the class you actually want first is Intro to Machine Learning. AI technology has exploded in the past [--] years thanks to deep neural networks. Yet at many schools the Intro to AI curriculum has barely changed from what it was in [----] and often spends only a few lectures on machine learning. Unfortunately revamping Intro to AI is controversial at many universities and inertia tends to dominate. Don't decide which course to take based on the name alone. Instead check the"
X Link 2025-08-28T13:57Z 100K followers, 151.3K engagements
"@emollick People deserved to know what was coming. There were many factors to consider but a big one was that it didn't feel right to hide a development like this from the world for so long"
X Link 2025-09-06T17:10Z 100K followers, 55K engagements
"When we at @OpenAI released o1-preview a year ago it would think for seconds. Today our best reasoning models can think for hours, browse the web, and write code. But there's a lot of room to push reasoning even further. I'm excited for what the next year will bring. @OpenAI o1 is trained with RL to think before responding via a private chain of thought. The longer it thinks the better it does on reasoning tasks. This opens up a new dimension for scaling. We're no longer bottlenecked by pretraining. We can now scale inference compute too. https://t.co/niqRO9hhg1"
X Link 2025-09-12T15:39Z 100K followers, 224.8K engagements
"Julian was co-first author on AlphaGo, AlphaZero, and MuZero. He doesn't have a major twitter presence but he's been at the forefront of AI exponential progress for more than a decade. As a researcher at a frontier lab I'm often surprised by how unaware of current AI progress public discussions are. I wrote a post to summarize studies of recent progress and what we should expect in the next 1-2 years: https://t.co/B7438Z9lOF"
X Link 2025-09-28T05:11Z 100.1K followers, 327.7K engagements
"My new hobby is asking GPT-5 Thinking to find errors in every @Wikipedia page I read. Interestingly almost every page I checked has at least one error. 🧵"
X Link 2025-10-02T16:01Z 100K followers, 342.7K engagements
"Below is a deep dive into why self play works for two-player zero-sum (2p0s) games like Go/Poker/Starcraft but is so much harder to use in "real world" domains. tl;dr: self play converges to minimax in 2p0s games and minimax is really useful in those games. Every finite 2p0s game has a minimax equilibrium which is essentially an unbeatable strategy in expectation (assuming the players alternate sides). In rock paper scissors for example minimax is 1/3rd on each action. Is minimax what we want? Not necessarily. If you're playing minimax in Rock Paper Scissors when most opponents' strategies"
X Link 2025-10-21T18:05Z 100.1K followers, 314.5K engagements
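The rock-paper-scissors claim in the post above (self play converging to 1/3 on each action) can be checked with a small simulation. This is a minimal sketch, not the method described in the thread: it assumes regret matching as the learning rule, which the post does not name, and `PAYOFF`, `strategy_from_regrets`, and `self_play` are hypothetical names chosen for illustration. The two players start from deliberately lopsided regrets so the run is not trivially uniform from the first step.

```python
# Payoff to the player choosing the row action against the column action.
# Actions: 0 = rock, 1 = paper, 2 = scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def strategy_from_regrets(regrets):
    """Regret matching: play each action in proportion to its positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / 3] * 3  # no positive regret: fall back to uniform play


def self_play(iterations=50_000):
    """Run two regret-matching learners against each other; return average strategies."""
    # Asymmetric starting regrets (an assumption for this demo only).
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
    strategy_sums = [[0.0] * 3, [0.0] * 3]
    for _ in range(iterations):
        # Both strategies are fixed before either player updates (simultaneous update).
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in (0, 1):
            opponent = strategies[1 - player]
            # The game is symmetric, so both players can read the same table.
            action_values = [
                sum(PAYOFF[a][b] * opponent[b] for b in range(3)) for a in range(3)
            ]
            expected = sum(strategies[player][a] * action_values[a] for a in range(3))
            for a in range(3):
                regrets[player][a] += action_values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]
```

Calling `self_play()` returns one averaged strategy per player. It is the average over iterations, not the last iterate, that approaches the minimax strategy; the per-iteration strategies can keep cycling.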
"Today we at @OpenAI are releasing GPT-5.1-Codex-Max which can work autonomously for more than a day over millions of tokens. Pretraining hasn't hit a wall and neither has test-time compute. Congrats to my teammates @kevinleestone & @mikegmalek for helping to make it possible"
X Link 2025-11-19T18:32Z 99.6K followers, 465.5K engagements
"@_NathanCalvin It's important to know what their limits are. I definitely don't trust it blindly"
X Link 2026-02-02T22:08Z 99.4K followers, [----] engagements
"@daniel_mac8 @OpenAI I think to truly automate research we'll need new capabilities"
X Link 2026-02-03T21:23Z 99.6K followers, 29.7K engagements
"@_sholtodouglas @jekbradbury @GoogleDeepMind @andy_l_jones @OpenAI @kevin_wang3290 A word on comp: I know folks who become quants to make money but [--] years later ask what they're doing with their life. We're at a special time in history. In AI research you can help positively guide the most important tech of our time and get paid well https://x.com/polynoamial/status/1911925322486104535 I worked in quant trading for a year after undergrad but didn't want my lifetime contribution to humanity to be making equity markets marginally more efficient. Taking a paycut to pursue AI research was my"
X Link 2026-01-21T21:18Z 100.7K followers, 28.8K engagements
"Labs like @OpenAI also hire researchers straight out of undergrad like @kevin_wang3290 though the bar is high. Kevin was highly recommended by his advisor and was first author on a NeurIPS [----] paper. There's a lot of bad NeurIPS papers but we could tell this was a great one. (Indeed after he joined OpenAI his paper was one of [--] out of [----] to receive a Best Paper award.) His advisor's recommendation counted for a lot because it can be hard to evaluate a researcher just based on a resume or even a paper. https://x.com/kevin_wang3290/status/1902753430583525727"
X Link 2026-01-21T21:16Z 100.7K followers, 21.9K engagements
"12 years ago I tried making my first poker AI in college and dreamed of beating the world's best pros. After seven years of a PhD I'm excited to announce that I finally did it. It's been quite an adventure. Looking forward to the next one. Facebook AI and @CarnegieMellon researchers have built Pluribus, the first AI bot to beat elite poker pros in [--] player Texas Hold'em. This breakthrough is the first major benchmark outside of [--] player games and we're sharing specifics on how we built it. https://t.co/zId9x4VBqc https://t.co/u89irNcxEK"
X Link 2019-07-11T18:16Z 100.2K followers, [----] engagements
"6 years ago today AlphaGo beat Lee Sedol in a milestone for AI. Typically deep learning gets the credit but it's important to know that nobody has ever trained a raw NN that's superhuman in Go. All superhuman Go bots require tree search. IMO planning is underappreciated in AI"
X Link 2022-03-09T12:26Z 100.2K followers, [----] engagements
"I've been listening to @lexfridman's podcast for a long time but it was truly an amazing experience to sit down with him myself and talk about our latest research in multi-agent AI for Poker, Diplomacy & more. Here's my conversation with Noam Brown (@polynoamial), co-creator of AI systems that achieve superhuman level performance in games of poker and Diplomacy that involves strategic negotiations with humans. This was a fascinating technical conversation. https://t.co/e6BArJjnag https://t.co/m9O592F2Cu"
X Link 2022-12-06T22:49Z 100.2K followers, [----] engagements
"This meme summarizes the paper nicely. OpenAI presents: Competitive Programming with Large Reasoning Models - Competed live at IOI [----] - o3 achieved gold - General-purpose o3 surpasses o1 w/ hand-crafted pipelines specialized for coding results https://t.co/zuZPq0rZJF"
X Link 2025-02-12T05:05Z 100.2K followers, 89K engagements
"Deep Research has been a big hit but was previously limited to pro subscribers. Now all plus users can try it for themselves. Deep research is now rolling out to all ChatGPT Plus, Team, Edu, and Enterprise users"
X Link 2025-02-25T18:42Z 100.2K followers, 136.5K engagements
"6 years ago AI pioneer and now Turing Award winner @RichardSSutton distilled [--] years of AI into a simple Bitter Lesson: general methods that scale with data and compute ultimately win. With the rise of AI agents it's an important lesson to keep in mind: https://www.cs.utexas.edu/eunsol/courses/data/bitter_lesson.pdf https://t.co/dW1mBPdS2f Machines that learn from experience were explored by Alan Turing almost eighty years ago which makes it particularly gratifying and humbling to receive an award in his name for reviving this"
X Link 2025-03-06T16:57Z 100.2K followers, 112.1K engagements
"Memory isn't just another product feature. It signals a shift from episodic interactions (think a call center) to evolving ones (more like a colleague or friend). Still a lot of research to do but it's a step toward fundamentally changing how we interact with LLMs. Starting today memory in ChatGPT can now reference all of your past chats to provide more personalized responses, drawing on your preferences and interests to make it even more helpful for writing, getting advice, learning, and beyond. https://t.co/s9BrWl94iY"
X Link 2025-04-10T17:08Z 100.4K followers, 251K engagements
"There's an old joke in AI: as soon as machines outperform humans at something it stops being considered AI. Glad to see poker solvers have reached that point. ChatGPT totally sucks at poker. It knows it sucks if you ask it. Today's newsletter is a deeper dive into why with some speculation about what this means for AI capabilities. https://t.co/rmB7YwVm6D"
X Link 2025-05-22T03:34Z 100.2K followers, 82.1K engagements
"Their bet allowed for formal math AI systems (like AlphaProof). In [----] almost nobody thought an LLM could be IMO gold level by [----]. We are seeing much faster AI progress than Paul Christiano and Yudkowsky predicted who had gold in [----] at 8% and 16% respectively by methods that are more general than expected"
X Link 2025-07-19T10:28Z 100.2K followers, 170.1K engagements
"@Mihonarium 1) We posted after the closing ceremony. It was livestreamed so this is easy to confirm. 2) We weren't in touch with IMO. I spoke with one organizer before the post to let him know. He requested we wait until after the closing ceremony ends to respect the kids and we did"
X Link 2025-07-20T20:01Z 100.2K followers, 102.3K engagements
"Considering the technology and the pace of progress I think this is quite sane. This is insane. AI capex might account for a larger share of GDP than basically any technology since the railroad. Basically it's a mini-wartime economy but the guns are chips and the tanks are databases https://t.co/E11IxmYtOv"
X Link 2025-08-03T16:28Z 100.2K followers, 101.7K engagements
"@adcock_brett Nice It's great to see more robotics demos like this. And yeah there's tons more ways to demonstrate environment robustness. I'd love to see how far you all can push it"
X Link 2025-08-15T23:00Z 100.3K followers, 22K engagements
"12/12 problems solved which would be equivalent to a 1st place performance. GPT-5's solutions were responsible for solving 11/12 of them. 1/n I'm really excited to share that our @OpenAI reasoning system got a perfect score of 12/12 during the [----] ICPC World Finals, the premier collegiate programming competition where top university teams from around the world solve complex algorithmic problems. This would have https://t.co/MA5KQdIxCj"
X Link 2025-09-17T17:38Z 100.2K followers, 73.3K engagements
"A good example is @_sholtodouglas at @GoogleDeepMind. He's quiet on Twitter, doesn't have any flashy first-author publications, and has only been in the field for [---] years but people in AI know he was one of the most important people behind Gemini's success. @eladgil @patrickc In AI at least the real [--] under [--] imo you have never heard of. They are [--] layers down the org chart from the CEO. They are usually not on Twitter, they have an unmaintained LinkedIn, they don't go on podcasts, and they maybe published at one point but don't do so anymore. They"
X Link 2024-01-20T22:48Z 100.5K followers, 513.2K engagements
"This is on the scale of the Apollo Program and Manhattan Project when measured as a fraction of GDP. This kind of investment only happens when the science is carefully vetted and people believe it will succeed and be completely transformative. I agree it's the right time. Announcing The Stargate Project. The Stargate Project is a new company which intends to invest $500 billion over the next four years building new AI infrastructure for OpenAI in the United States. We will begin deploying $100 billion immediately. This infrastructure will secure"
X Link 2025-01-21T22:37Z 100.6K followers, 926.5K engagements