# @a1zhang Alex L Zhang

Alex L Zhang posts on X about future, ai, environment, and inference the most. They currently have [------] followers and [---] posts still getting attention that total [-----] engagements in the last [--] hours.

### Engagements: [-----]

- [--] Week [---------] -13%
- [--] Month [---------] +25%
- [--] Months [---------] +3,226%
- [--] Year [---------] +266,535%

### Mentions: [--]

- [--] Week [--] -21%
- [--] Month [--] +31%
- [--] Months [--] +135%
- [--] Year [---] +3,267%

### Followers: [------]

- [--] Week [------] +7.30%
- [--] Month [------] +33%
- [--] Months [------] +133%
- [--] Year [------] +166%

### CreatorRank: [-------]

### Social Influence

**Social category influence** [technology brands](/list/technology-brands) 15% [stocks](/list/stocks) 6% [social networks](/list/social-networks) 4% [finance](/list/finance) 3% [gaming](/list/gaming) 3% [cryptocurrencies](/list/cryptocurrencies) 1%

**Social topic influence** [future](/topic/future) 7%, [ai](/topic/ai) 6%, [environment](/topic/environment) #541, [inference](/topic/inference) 5%, [if you](/topic/if-you) 5%, [llm](/topic/llm) 5%, [we are](/topic/we-are) 4%, [context window](/topic/context-window) 4%, [in the](/topic/in-the) 3%, [this is](/topic/this-is) 3%

**Top accounts mentioned or mentioned by** [@gpumode](/creator/undefined) [@lateinteraction](/creator/undefined) [@raw_works](/creator/undefined) [@shihwesley](/creator/undefined) [@simonguozirui](/creator/undefined) [@msirovatka](/creator/undefined) [@philipp__k](/creator/undefined) [@amd](/creator/undefined) [@swishmoe](/creator/undefined) [@tensorfi](/creator/undefined) [@socialtranxiety](/creator/undefined) [@rawworks](/creator/undefined) [@marksaroufim](/creator/undefined) [@noahziems](/creator/undefined) [@jacobli99](/creator/undefined) [@primeintellect](/creator/undefined) [@laudeinstitute](/creator/undefined) [@sashimikun_void](/creator/undefined) [@dotdotjames](/creator/undefined) [@irl_danb](/creator/undefined)

**Top assets mentioned** [Alphabet Inc Class A (GOOGL)](/topic/$googl) [Dell Technologies, Inc. (DELL)](/topic/dell) [Oasys (OAS)](/topic/oasys)

### Top Social Posts

Top posts by engagements in the last [--] hours

"another related direction I'll be paying attention to this year :) Memory is probably the biggest challenge for building practical AI agents. Thrilled to share our work exploring a shift from manually defining memory for each domain enabling agents to design better memory mechanisms for themselves. Meta-learning memory designs unlocks Memory is probably the biggest challenge for building practical AI agents. Thrilled to share our work exploring a shift from manually defining memory for each domain enabling agents to design better memory mechanisms for themselves. Meta-learning memory designs" [X Link](https://x.com/a1zhang/status/2021684304787464398) 2026-02-11T20:34Z 29.3K followers, 28K engagements
"The prompt being symbolic is extremely important and I think there's a bit of confusion on what that means. We are claiming that for the REPL / environment that the RLM interacts with, its inputs (i.e. prompt, and we call it this way to emphasize the idea that an RLM is by construction a language model) need to have some kind of symbolic handle in this environment to be used. Putting the book in a file is an equivalent statement. The file is the symbolic handle to the RLM prompt which the RLM can access through its REPL. This point is made even more clear in using a bash file system as the" [X Link](https://x.com/a1zhang/status/2022077206084591997) 2026-02-12T22:35Z 29.3K followers, [----] engagements

"@socialtranxiety yeah it's called recursive meta cognition by some folks at deepmind" [X Link](https://x.com/a1zhang/status/2020265947031114147) 2026-02-07T22:38Z 29.3K followers, [----] engagements

"This is true if you're clear on what is an instruction and what is raw data. In the blog we explain that we separate out the query (put directly in context) and the raw data. In my RLM implementation on GitHub this is also specified in this way (i.e. there is an option to pass in instructions directly to the LM). This is all a convenience in cases where you know exactly what the instructions are. This is not a safe assumption to make in many cases though. The reason we don't distinguish these two in the definition of an RLM is that in many cases it is not easy to distinguish nor is it" [X Link](https://x.com/a1zhang/status/2022305515431133304) 2026-02-13T13:43Z 29.3K followers, [---] engagements

"@a1zhang apologies from my content team @a1zhang apologies from my content team" [X Link](https://x.com/a1zhang/status/2012226766505451782) 2026-01-16T18:13Z 29.3K followers, 13K engagements

"Specifically we were lucky that LongBench-Pro, a separate source of long-context problems separate from our eval tasks, was released recently. We are excited about future results that train larger models as RLMs but the fact that this small model picks up these recursive strategies from a tiny amount of data is pretty encouraging Read more in appendix A of the paper: https://arxiv.org/pdf/2512.24601 https://arxiv.org/pdf/2512.24601" [X Link](https://x.com/a1zhang/status/2016923297712144523) 2026-01-29T17:16Z 29K followers, [----] engagements

"The model was trained on trajectories using a fixed system prompt and following the structure of our RLM repo. We recommend using vLLM with our inference code to use it out of the box. Open model: https://huggingface.co/mit-oasys/rlm-qwen3-8b-v0.1 https://github.com/alexzhang13/rlm https://huggingface.co/mit-oasys/rlm-qwen3-8b-v0.1 https://github.com/alexzhang13/rlm" [X Link](https://x.com/a1zhang/status/2016923299754709012) 2026-01-29T17:16Z 29.1K followers, [----] engagements
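The post above points to the open RLM-Qwen3-8B checkpoint and recommends the repo's inference code. As a rough illustration only, here is a minimal sketch of loading that checkpoint with vLLM's standard offline API; the repo's own code supplies the fixed system prompt and REPL scaffolding the model was trained with, and the prompt string below is just a placeholder.

```python
# Sketch, not the repo's inference code: load the open checkpoint with vLLM's
# standard offline API. The system/REPL prompt used in training comes from
# https://github.com/alexzhang13/rlm; the string below is a placeholder.
from vllm import LLM, SamplingParams

llm = LLM(model="mit-oasys/rlm-qwen3-8b-v0.1")
params = SamplingParams(temperature=0.6, max_tokens=1024)

outputs = llm.generate(["<system/REPL prompt from the RLM repo goes here>"], params)
print(outputs[0].outputs[0].text)
```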
"Second we expanded the writeup with an extra section (and corresponding results). Building on an earlier discussion we solidify the three defining properties of an RLM: [--]. A symbolic handle to the prompt [--]. Access to a persistent Turing-complete environment that contains this handle [--]. The ability to perform symbolic recursion within this environment We also add new results for a CodeAct baseline that has access to sub-calls. This contrasts RLMs to existing agents with sub-calls. (Spoiler: the latter works much less effectively as you scale up the context length.)" [X Link](https://x.com/a1zhang/status/2016923303554797581) 2026-01-29T17:16Z 29.2K followers, [----] engagements

"We are excited by all the support and future work to be done on RLMs. Thanks again to Laude Institute, Prime Intellect, and Modal for their support of this research. Please let us know how RLMs do in your own domains and where they can improve :) https://arxiv.org/abs/2512.24601 https://arxiv.org/abs/2512.24601" [X Link](https://x.com/a1zhang/status/2016923305454800945) 2026-01-29T17:16Z 29.2K followers, [----] engagements

"I came across this work that implicitly implements an RLM in a DSL that executes both code and natural language instructions. Super cool https://elliecheng.com/blog/2026/01/20/enabling-rlm-with-shared-program-state/ https://elliecheng.com/blog/2026/01/20/enabling-rlm-with-shared-program-state/" [X Link](https://x.com/a1zhang/status/2017284304309538939) 2026-01-30T17:10Z 29.1K followers, 10.3K engagements

"Some of the benchmarks in our paper provide simple patterns where this isn't the case (e.g. OOLONG where you have to chunk and loop over sub-calls) and Claude Code wouldn't natively do this without specific instruction. There is also the more obvious friction of trying to apply Claude Code to a generic task and it being 1) overkill and 2) an awkward interface for doing so But yep I also think current models work as RLMs but the real long term value is in performance down the road following this strategy :)" [X Link](https://x.com/a1zhang/status/2020370355290788000) 2026-02-08T05:33Z 28.9K followers, [----] engagements

"hmm Recursive Language Models (RLMs) let agents manage 10M+ tokens by delegating tasks recursively. This Google Cloud Community Article explains why ADK was the perfect choice for re-implementing the original RLM codebase in a more enterprise-ready format https://t.co/p3MsNtLVJL https://t.co/CBMj1xbxD3 Recursive Language Models (RLMs) let agents manage 10M+ tokens by delegating tasks recursively. This Google Cloud Community Article explains why ADK was the perfect choice for re-implementing the original RLM codebase in a more enterprise-ready format https://t.co/p3MsNtLVJL" [X Link](https://x.com/a1zhang/status/2020593391009136737) 2026-02-08T20:19Z 29.3K followers, 113K engagements

"If you're dealing with a 100M context window there wasn't anything to cache to begin with. You'd want a natural way for an LM to handle this which an RLM provides. As for general token costs RLMs can allow handling a system without the individual LM calls looking at the entire context. Suppose there exists an LM that can ingest 100M context windows -- it is forced to look at all of it even if it doesn't need to. https://twitter.com/i/web/status/2020627077624742021 https://twitter.com/i/web/status/2020627077624742021" [X Link](https://x.com/a1zhang/status/2020627077624742021) 2026-02-08T22:33Z 29K followers, [---] engagements

"@lateinteraction @thkostolansky I hand drew mine in high school LOL" [X Link](https://x.com/a1zhang/status/2022075201648046095) 2026-02-12T22:27Z 29.3K followers, [----] engagements
"Fair questions don't worry I would point you to a bunch of prior tweets or videos I've talked on but they're scattered at this point. [--]. RLM = subagents perhaps it is an argument on how exactly a subagent calling system should look (I'm not claiming optimality or anything it's more based on intuition). [--]. CC currently doesn't do exactly what the RLM describes. I think this is a misconception and I would hope (it's a beta feature afaik) it moves closer to this. CC explicitly calls sub-agents as a tool (e.g. Opus [---] will directly output JSON(call sub agent)) which differs from writing programs in" [X Link](https://x.com/a1zhang/status/2022479615567057197) 2026-02-14T01:14Z 29.3K followers, [----] engagements

"Providing some responses I want to preface this all by saying the comparison of CC to RLM is a little fuzzy and not the correct framing to me because a lot of CC is a highly post-trained task-specific scaffold that shares a lot of similarities to the defn of an RLM. In fact there are small tweaks that can be made to CC for it to fit the definition of an RLM. To illustrate this point there are a few existing plugins for integrating RLMs into CC / OC. [--]. Agreed except there are some notable limitations of CC-style sub-agents. It relies on the root model being entirely correct about its order /" [X Link](https://x.com/a1zhang/status/2022517956777800016) 2026-02-14T03:47Z 29.3K followers, [---] engagements

"What if scaling the context windows of frontier LLMs is much easier than it sounds? We're excited to share our work on Recursive Language Models (RLMs). A new inference strategy where LLMs can decompose and recursively interact with input prompts of seemingly unbounded length as a REPL environment. On the OOLONG benchmark RLMs with GPT-5-mini outperforms GPT-5 by over 110% gains (more than double) on 132k-token sequences and is cheaper to query on average. On the BrowseComp-Plus benchmark RLMs with GPT-5 can take in 10M+ tokens as their prompt and answer highly compositional queries without" [X Link](https://x.com/a1zhang/status/1978469116542337259) 2025-10-15T14:32Z 29.3K followers, 951.1K engagements

"Much like the switch in [----] from language models to reasoning models we think [----] will be all about the switch to Recursive Language Models (RLMs). It turns out that models can be far more powerful if you allow them to treat *their own prompts* as an object in an external environment which they understand and manipulate by writing code that invokes LLMs. Our full paper on RLMs is now available with much more expansive experiments compared to our initial blogpost from October [----] https://arxiv.org/pdf/2512.24601 https://arxiv.org/pdf/2512.24601" [X Link](https://x.com/a1zhang/status/2007198916073136152) 2026-01-02T21:14Z 29.3K followers, 2M engagements

"RLMs are our bitter-lesson-pilled approach to inference-time scaling and they can scale the context size of LLMs by orders of magnitude. From the outside an RLM exposes the same interface as a language model. It accepts a string prompt and produces a string response. But internally RLMs do not feed the prompt directly to the Transformer. Instead they set up the LLM in a REPL environment where the prompt *is placed into a variable* and then allow the LLM to write code to peek into, break up, and recursively invoke itself over snippets of the prompt." [X Link](https://x.com/a1zhang/status/2007198918891974661) 2026-01-02T21:14Z 29.3K followers, 61.9K engagements
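The post above describes the core loop: the prompt lives as a variable in a persistent REPL, and the model writes code that slices it and calls an LM (or itself) over snippets. A minimal sketch of that control loop is below; it is not the implementation from the paper or the github.com/alexzhang13/rlm repo, and every helper name here (`llm`, `run_rlm`, `extract_code`, the `FINAL:` marker, the `out` convention) is hypothetical.

```python
# Minimal sketch of an RLM-style control loop -- illustrative only, not the
# official implementation. `llm()` stands in for any text-in/text-out model
# call; all other names are hypothetical.
import re

def llm(prompt: str) -> str:
    """Placeholder for a base language-model call (API or local)."""
    raise NotImplementedError

def extract_code(reply: str) -> str:
    """Pull the first fenced code block out of the model's reply."""
    match = re.search(r"`{3}(?:python)?\n(.*?)`{3}", reply, re.DOTALL)
    return match.group(1) if match else reply

def run_rlm(prompt: str, query: str, max_steps: int = 20) -> str:
    # The (possibly huge) prompt is never shown to the model directly; it lives
    # as a variable inside a persistent namespace that the model's code can inspect.
    env = {"prompt": prompt, "llm": llm, "rlm": run_rlm}
    transcript = (
        "You are in a Python REPL. The variable `prompt` holds the full input. "
        "Write code that slices or searches it and calls llm(...) or rlm(...) "
        "on snippets; store anything you want to see in `out`. When done, "
        "reply with FINAL: <answer>.\n"
        f"Task: {query}"
    )
    for _ in range(max_steps):
        reply = llm(transcript)
        if "FINAL:" in reply:
            return reply.split("FINAL:", 1)[1].strip()
        try:
            exec(extract_code(reply), env)            # run the model-written code
            observation = str(env.get("out", ""))[:2000]
        except Exception as e:
            observation = f"Error: {e}"
        transcript += f"\n{reply}\nObservation: {observation}\n"
    return "No answer produced."
```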
"@DavidFSWD @lateinteraction awesome we'll also be releasing some code for people to play with soon" [X Link](https://x.com/a1zhang/status/2007324307589083587) 2026-01-03T05:33Z 29.3K followers, 13.5K engagements

"We just updated the RLM paper with some new stuff. First we just released RLM-Qwen3-8B the first natively recursive language model (at tiny scale). We post-trained Qwen3-8B using only [----] RLM trajectories from unrelated domains to our evaluation benchmarks. RLM-Qwen3-8B works well across several tasks and delivers a pretty large boost over using an RLM scaffold with the underlying Qwen3-8B model off-the-shelf and even larger gains over using Qwen3-8B directly for long-context problems. https://twitter.com/i/web/status/2016923294461476873" [X Link](https://x.com/a1zhang/status/2016923294461476873) 2026-01-29T17:16Z 29.3K followers, 67.6K engagements

"This is bar for bar one of the core pieces of intuition behind how we came up with RLMs why they currently work and why they are so promising for this year Lots of exciting stuff will be released soon :) Why do coding agents work so well and what would it take to replicate their success in other domains One important and under-appreciated reason is that agentic coding is a type of neurosymbolic AI. The main weakness of LLMs is that they are statistical machines and struggle at Why do coding agents work so well and what would it take to replicate their success in other domains One important" [X Link](https://x.com/a1zhang/status/2018417254103200165) 2026-02-02T20:12Z 29.3K followers, 30.2K engagements

"while procrastinating on research I decided it's finally time to add RLMs to pypi pip install rlms" [X Link](https://x.com/a1zhang/status/2020263239653945849) 2026-02-07T22:27Z 29.3K followers, 79.5K engagements

"Maybe I can provide some intuition but lmk if it's unclear I am trying to refine how I explain this anyways To start I think the RLM idea is super simple but elegant (I'm biased obviously). The paper argues that future language models 1) do not need to think about context window limits; 2) will have reasoning chains that mix code (symbolic) and neural LMs (fuzzy). RLMs are what we think minimally such a system should look like. Explicitly it is an LM REPL + prompt where the REPL contains the prompt and sub-agents as a *function inside the REPL*. This last part is quite important because it" [X Link](https://x.com/a1zhang/status/2020365316698571122) 2026-02-08T05:13Z 29.3K followers, 59.2K engagements

"RT @raw_works: out of curiosity after reading this i started benchmarking rlm and dspy.rlm on longmemeval tl;dr - i think i might have a" [X Link](https://x.com/a1zhang/status/2021381235013206191) 2026-02-11T00:30Z 29.3K followers, [--] engagements

"Who's up to build an RLM agent for ARC-AGI-3 Bounty available https://github.com/arcprize/ARC-AGI-3-Agents Who's up to build an RLM agent for ARC-AGI-3 Bounty available https://github.com/arcprize/ARC-AGI-3-Agents" [X Link](https://x.com/a1zhang/status/2022075073046487083) 2026-02-12T22:27Z 29.3K followers, [----] engagements

"Funnily enough I tried to dabble with ARC AGI before and with very little success Super cool to see well designed RLMs achieving SOTA :) We set a new ARC-AGI-2 SotA: 85.28% using an Agentica agent (350 lines) that writes and runs code. https://t.co/tohFfBZb2P We set a new ARC-AGI-2 SotA: 85.28% using an Agentica agent (350 lines) that writes and runs code. https://t.co/tohFfBZb2P" [X Link](https://x.com/a1zhang/status/2022079197095899588) 2026-02-12T22:43Z 29.3K followers, 17.7K engagements
"We've extended the ML Valentine's dataset and found it a new home: https://t.co/VzSbaH2sl2 My favorite this year @a1zhang's RLM Because I think about it over and over and over again. https://t.co/XrWLPrhkC0 We've extended the ML Valentine's dataset and found it a new home: https://t.co/VzSbaH2sl2 My favorite this year @a1zhang's RLM Because I think about it over and over and over again. https://t.co/XrWLPrhkC0" [X Link](https://x.com/a1zhang/status/2022770995417817590) 2026-02-14T20:32Z 29.3K followers, [----] engagements

"We @AuricSource solved 8/10 problems from the #1stProof benchmark (Abouzaid et al. arXiv:2602.05192) all with Lean [--] formal verification. Q4 & Q6: substantial partial QED with precise remaining gaps. The twist AI agents did the heavy lifting reasoning proving and https://t.co/clorfPuy76 We @AuricSource solved 8/10 problems from the #1stProof benchmark (Abouzaid et al. arXiv:2602.05192) all with Lean [--] formal verification. Q4 & Q6: substantial partial QED with precise remaining gaps. The twist AI agents did the heavy lifting reasoning proving and https://t.co/clorfPuy76" [X Link](https://x.com/a1zhang/status/2023526374569554176) 2026-02-16T22:34Z 29.3K followers, [----] engagements

"Also for cuEquivariance kernels we explicitly ban them because they are closed source and this competition in particular focuses on inference which has a lot of room to be faster. We only use them as a benchmark reference. We don't use them as the correctness reference because they have precision errors w.r.t. a pure FP32 PyTorch implementation" [X Link](https://x.com/a1zhang/status/1937711764486885676) 2025-06-25T03:17Z 12.3K followers, [---] engagements

"I've been looking forward to this one for a long time LOL @simonguozirui I'm giving a talk at GPU mode tomorrow. Feel free to join the livestream: https://t.co/2UWmxdjNEc I'm giving a talk at GPU mode tomorrow. Feel free to join the livestream: https://t.co/2UWmxdjNEc" [X Link](https://x.com/a1zhang/status/1939026077117989161) 2025-06-28T18:20Z 12.3K followers, [---] engagements

"Very much a noob question but for benchmarking CUDA code speed we generally have to clear caches so multiple repeated runs are fair. If I were to benchmark CPU code speed (e.g. on AlgoTune) does a similar principle apply? And how easy is it to do this in say Python" [X Link](https://x.com/a1zhang/status/1943596638796132476) 2025-07-11T09:02Z 12.4K followers, [----] engagements
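The benchmarking question above is about keeping repeated timed runs fair by clearing caches between them. On the GPU side, a common approach (used for example by Triton's do_bench utility) is to overwrite a large dummy buffer between iterations to flush the L2 cache and to time with CUDA events. A minimal PyTorch sketch is below; the buffer size and iteration counts are illustrative only.

```python
# Sketch of one common way to get fair repeated CUDA timings: flush the GPU L2
# cache between iterations by overwriting a large dummy buffer, and time with
# CUDA events. Sizes and iteration counts are illustrative.
import torch

def bench_ms(fn, iters: int = 100, warmup: int = 10) -> float:
    cache = torch.empty(256 * 1024 * 1024, dtype=torch.int8, device="cuda")  # ~256 MB
    for _ in range(warmup):
        fn()
    times = []
    for _ in range(iters):
        cache.zero_()                                  # evict fn's working set from L2
        start = torch.cuda.Event(enable_timing=True)
        end = torch.cuda.Event(enable_timing=True)
        start.record()
        fn()
        end.record()
        torch.cuda.synchronize()
        times.append(start.elapsed_time(end))          # milliseconds
    return sum(times) / len(times)

# Example usage:
# a = torch.randn(4096, 4096, device="cuda"); b = torch.randn(4096, 4096, device="cuda")
# print(bench_ms(lambda: a @ b), "ms")
```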
"As a reminder the first game is just clicking on a green button. Many games in the VideoGameBench list require clicking on buttons characters etc. The mapping from text actions -- spatial coordinates is non-trivial and all evaluated models but Claude [---] struggle" [X Link](https://x.com/a1zhang/status/1943845634035134585) 2025-07-12T01:31Z 12.4K followers, [--] engagements

"The second game requires moving a square through a grid-world maze using the arrow-keys. Solving these games directly translates to games like Pokemon where the agent must navigate around a map. Surprisingly though all evaluated frontier VLMs struggle on this task" [X Link](https://x.com/a1zhang/status/1943845637633847383) 2025-07-12T01:31Z 12.4K followers, [--] engagements

"The practice game involves dragging a mouse in a pre-specified pattern. A lot of RTS games like Age of Empires / Civ require this ability. Unlike clicking which requires moving to a specified location the trajectory of moving the mouse matters in this game" [X Link](https://x.com/a1zhang/status/1943845640666329226) 2025-07-12T01:31Z 12.4K followers, [---] engagements

"Bro actually denied OpenAI an AlphaGo moment LOL @FakePsyho is him. Huge congrats" [X Link](https://x.com/a1zhang/status/1945640329304313862) 2025-07-17T00:22Z 12.4K followers, 13.9K engagements

"LM reasoning benchmark idea: have it beat a Hardcore Nuzlocke run of Pokémon Run & Bun or a Kaizo ROM hack Give it access to search online use damage calculators etc. People spend literally hundreds of hours meticulously planning battles managing their available mons etc" [X Link](https://x.com/a1zhang/status/1948252706503557201) 2025-07-24T05:23Z 12.4K followers, [----] engagements

"The focus wouldn't be on navigating the world (we've proven this can be done with Gemini Plays Pokemon) rather this is a mix of the recent Pokemon Showdown agents + a much longer horizon reasoning task of planning out runs accounting for bad luck etc" [X Link](https://x.com/a1zhang/status/1948252713084735965) 2025-07-24T05:23Z 12.4K followers, [---] engagements

"@KLieret Congrats on the launch" [X Link](https://x.com/a1zhang/status/1948627483513000250) 2025-07-25T06:12Z 12.4K followers, [---] engagements

"if anyone can help no idea why I don't know this but is there free TTS software where you throw in a huge website or PDF (long doc say like the CUDA programming guide long) and it spits out reasonable audio that isn't just monotone reading" [X Link](https://x.com/a1zhang/status/1951686540620410984) 2025-08-02T16:48Z 12.5K followers, [----] engagements

"in addition to all the amazing content for the next [--] weeks on GPU MODE there's also this amazing 5-week course starting soon on all the juicy secrets to 100B+ scale model training there's a shit ton of content in here and a ton of amazing lecturers + free compute" [X Link](https://x.com/a1zhang/status/1958162504296591715) 2025-08-20T13:41Z 12.6K followers, [----] engagements

"Excited to announce the SECOND @GPU_MODE x @AMD $100K kernel competition: DISTRIBUTED KERNELS You now get free access to a full **8xMI300 node** to optimize all2all gemm + reducescatter and allreduce + gemm kernels -- all relevant to frontier LMs Go compete now" [X Link](https://x.com/a1zhang/status/1960773095599562987) 2025-08-27T18:35Z 12.9K followers, [----] engagements
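The competition post above names collective-plus-GEMM patterns (all2all + gemm, reducescatter + gemm, allreduce + gemm) as the kernels to optimize. For orientation only, here is an unfused PyTorch reference of two of those patterns, the kind of baseline a fused competition kernel would be measured against; it assumes a torch.distributed NCCL/RCCL process group is already initialized, and the shapes are placeholders.

```python
# Unfused reference sketches for two patterns named in the competition post
# (fused kernels target steps like these). Assumes
# torch.distributed.init_process_group(...) has already been called.
import torch
import torch.distributed as dist

def allreduce_then_gemm(partial_act: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:
    # Each rank holds a partial activation (e.g. from a tensor-parallel layer);
    # sum it across ranks, then apply the next layer's GEMM.
    dist.all_reduce(partial_act, op=dist.ReduceOp.SUM)
    return partial_act @ weight

def gemm_then_reduce_scatter(act: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:
    # GEMM first, then sum across ranks while scattering row-shards of the result.
    out = act @ weight
    shard = torch.empty(out.shape[0] // dist.get_world_size(), out.shape[1],
                        dtype=out.dtype, device=out.device)
    dist.reduce_scatter_tensor(shard, out, op=dist.ReduceOp.SUM)
    return shard
```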
"hi if you're interested in using or writing mega kernels for AI (one big GPU kernel for an entire model) you should tune in to today's @GPU_MODE livestream today in [--] hours we have the authors of MPK talking about their awesome new compiler for mega kernels see you there :)" [X Link](https://x.com/a1zhang/status/1966899549517095176) 2025-09-13T16:19Z 13K followers, 17.1K engagements

"Claude [---] Sonnet cleans up all the UI for our hacky typescript leaderboard in minutes btw the @GPU_MODE kernel competition is live NOW we have tons of available gpus (free of charge) hosted on @modal_labs -- see @charles_irl's triton code here (you could be next)" [X Link](https://x.com/a1zhang/status/1894179014153019843) 2025-02-25T00:14Z 11.2K followers, [----] engagements

"More cracked submissions to the @AMD x @GPU_MODE leaderboard 18k+ submissions since the beginning FP8 GEMM: A battle btwn Seb and Snektron for the top spot with a 25% faster kernel since [--] weeks ago Single-device MoE: multiple ppl are now 100-600x faster than PyTorch ref A few days after @AnushElangovan's tweet about a crazy-fast 183.429s kernel for FP8 GEMM on MI300X an EVEN FASTER submission has emerged We're already at 5000+ submissions in just the first week of the $100k (cash btw) AMD MI300X kernel-writing competition -- join now https://t.co/qCaQzgIuw3 A few days after @AnushElangovan's" [X Link](https://x.com/a1zhang/status/1920917804313616749) 2025-05-09T19:04Z 23.1K followers, [----] engagements

"I'll be at the AMD Advancing AI conference on June [--] where we'll be announcing the winners of the $100K @AMD x @GPU_MODE competition ALSO the amazing @marksaroufim and @m_sirovatka are presenting on writing fast kernels so pull up DM if you wanna meet up :)" [X Link](https://x.com/anyuser/status/1931562074989023647) 2025-06-08T04:01Z 23.1K followers, [----] engagements

"kind of a surreal moment being on stage with Lisa Su as she announces & thanks us for the competition we built the past year building w/ @m_sirovatka @marksaroufim Ben & Erik (all in our free time :p) on @GPU_MODE has been genuinely incredible can't thank you guys enough" [X Link](https://x.com/a1zhang/status/1933558754831999318) 2025-06-13T16:15Z 23.1K followers, 21.7K engagements

"Announcing a new @GPU_MODE kernel writing competition: our first featuring both NVIDIA and AMD hardware The first problem will be the Triangle Multiplication operator essential to the AlphaFold models It's a particularly tricky problem with no good public implementation" [X Link](https://x.com/a1zhang/status/1937626194385522886) 2025-06-24T21:37Z 23.1K followers, 26.7K engagements

"sadly won't be at ICML but have [--] papers that you should check out KernelBench which @simonguozirui will be presenting at the main conference + the @GPU_MODE leaderboards OSS infra at the CODEML workshop (7/19) that @m_sirovatka will be giving an oral for" [X Link](https://x.com/anyuser/status/1944392542146945249) 2025-07-13T13:44Z 23.1K followers, [----] engagements

"announcing the @GPU_MODE x @scaleml summer speaker series happening next week a 5-day series where top researchers will teach about the algorithmic and systems-level advances that underpin gpt-oss all content will be live-streamed & recorded for FREE on GPU MODE's YouTube" [X Link](https://x.com/a1zhang/status/1957514870368399818) 2025-08-18T18:48Z 23.1K followers, 29.6K engagements

"We are live This will be a super long session with two amazing speakers so feel free to stop by and ask any questions you may have :) https://www.youtube.com/watch?v=LMk8nqIFXLo We are ending strong with GPU Programming [--] talks today back to back First @exists_forall for intro to CUDA and then @simran_s_arora for Thunder Kittens Today at: 1:00pm EST / 11:00am PT - https://t.co/2S7r5oxzI1 https://t.co/hcQRVFeYQz https://www.youtube.com/watch?v=LMk8nqIFXLo We are ending strong with GPU Programming [--] talks today back to back First @exists_forall for intro to CUDA and then" [X Link](https://x.com/a1zhang/status/1961476137852277003) 2025-08-29T17:08Z 16.3K followers, [----] engagements

"it's insane to me how little attention the llm.q repo has it's a fully C/C++/CUDA implementation of multi-gpu (zero + fsdp) quantized LLM training with support for selective AC it's genuinely the coolest OSS thing I've seen this year (what's crazier is [--] person wrote it)" [X Link](https://x.com/a1zhang/status/1973160442873913488) 2025-09-30T22:58Z 16.5K followers, 41.8K engagements

"the AF3 trimul kernel @GPU_MODE competition has ended - big congrats to the winners we wrote a blogpost on the GPU MODE website detailing 1) why this particular problem is so nasty to optimize 2) how we intended for it to go 3) what participants wrote https://www.gpumode.com/v2/news https://www.gpumode.com/v2/news" [X Link](https://x.com/a1zhang/status/1975628150496493708) 2025-10-07T18:23Z 23.1K followers, [----] engagements

"Blogpost: There are several examples of interesting behavior that emerge from RLMs that can be found in the blogpost. We wrote a visualizer to make these examples clearer and highlight different strategies that these models can take.
OOLONG: BCP: Id like to thank my wonderful advisor @lateinteraction the person Ive spammed on Slack while working on this @noahziems and the rest of my labmates @jacobli99 @dianetc_ for their support and discussion in this project https://arxiv.org/abs/2508.06600 https://openreview.net/forumid=lrDr6dmXOX https://alexzhang13.github.io/blog/2025/rlm/" [X Link](https://x.com/a1zhang/status/1978469131407290859) 2025-10-15T14:32Z 16.3K followers, 23.6K engagements "So you can think of RLMs as a generalization of this idea and several others. In our post we wanted to show that code execution was the right instantiation of this more general idea. Most prior works don't really consider the context / prompt our the root LM in the way that we've been framing it (which makes sense they're very agentic in nature). In terms of why the REPL environment is different than the setup you described in some sense we don't want to manually engineer tools for the RLMs. We want full flexibility and using code is the most flexible form of this (e.g. using regex queries to" [X Link](https://x.com/a1zhang/status/1978509860946772061) 2025-10-15T17:14Z 16.3K followers, [----] engagements ""Why is this not just an agent with access to your file system e.g. something like SWE-Agent" Perhaps because we implemented it with a REPL environment there seems to be some link to coding but an RLM is entirely task-agnostic. Think of it as an extension of the bitter lesson -- our design of how we handle context (e.g. how we design agents) should entirely be up to an LM not a human. When you think about an agent with file system or terminal access it is generally given tools to look around some codebase execute code write tests etc. An RLM like an LM call is a function from text -- text." [X Link](https://x.com/a1zhang/status/1978536295828762892) 2025-10-15T18:59Z 16.3K followers, 31K engagements "Lots of folks have been asking for a gist or simple notebook to try out RLMs. While we work on some more exciting experiments here's a self-contained minimal version I quickly put together for people to build on top of. Happy hacking :) https://github.com/alexzhang13/rlm https://github.com/alexzhang13/rlm" [X Link](https://x.com/a1zhang/status/1978948676287340753) 2025-10-16T22:18Z 16.4K followers, 31.4K engagements "The last thing I'll clarify before I log off for a bit to go study for my midterms Q1: Why has such a simple idea like RLMs not been formalized / used in this way Honestly I think a lot of the replies are the answer to this question. There is this hard stuck notion of "agents" in our field -- the idea that an LM call is the smallest unit / function of execution we have available and everything on top has to be some kind of human-designed scaffold to get it to work. 
We happened to stumble upon the idea of handling huge contexts when we designed agents for SWE but because we were so insistent" [X Link](https://x.com/a1zhang/status/1979055081178673164) 2025-10-17T05:21Z 16.4K followers, 82.1K engagements "context switching away from RLMs for today but not entirely from the blog format feeling insane FOMO that I can't be at the @GPU_MODE IRL event or PTC this year but in other slightly related news it's been [--] year since KernelBench @simonguozirui and I are releasing a long post tmr (when the IRL event is happening) on our perspective on what's happened since with all the ideas & conclusions we've come to about GPU codegen unpacking all the good (and reward hacked) results that have emerged and the shift from performance to correctness hopefully I at least see lots of pictures of the event from" [X Link](https://x.com/a1zhang/status/1981379905918046440) 2025-10-23T15:19Z 16.5K followers, [----] engagements "Thanks for having me on I also answered a bunch about what Ive been thinking and some more expanded thoughts on RLMs vs. X in Claude Code CodeAct Tool calling etc. Happy to answer anything else too One of our best talks yet. Thanks @a1zhang for the amazing presentation + Q&A on Recursive Language Models If you're interested in how we can get agents to handle near-infinite contexts this one is a must. Watch the recording here https://t.co/v3N2mHoeSU One of our best talks yet. Thanks @a1zhang for the amazing presentation + Q&A on Recursive Language Models If you're interested in how we can get" [X Link](https://x.com/a1zhang/status/1984406507824599368) 2025-10-31T23:45Z 21.9K followers, [----] engagements "The wait is over Were so excited to announce the @GPU_MODE x @NVIDIA kernel optimization competition for NVFP4 kernels on Blackwell B200s We will be awarding NVIDIA DGX Sparks & RTX 50XX series GPUs for individual rankings on each problem as well as a Dell Pro Max with NVIDIA GB300 (an absolutely insane prize btw) for the grand prize winner of the entire competition Winners will also be invited to GTC [----] for free. The competition will feature [--] different single-device kernels over a [--] month period; we highly encourage competitors to try out and use CuTe DSL / CUTLASS [---] on these problems" [X Link](https://x.com/a1zhang/status/1985434030473437213) 2025-11-03T19:48Z 16.5K followers, 16.6K engagements "lots of fun announcements / news from @GPU_MODE this week i sadly wasn't able to go but they've written up a wonderful blogpost on the IRL hackathon in SF a couple weeks back + all the cool winner projects. go have a read on the website" [X Link](https://x.com/a1zhang/status/1986202810384355690) 2025-11-05T22:43Z 16.5K followers, [----] engagements "very exciting news congrats to @hardmaru and the rest of the super talented folks at Sakana on the raise (if u have the option to intern there you should i learned sm there pre-phd) Announcing our Series B π https://t.co/6BpYSq5uc4 https://t.co/QvVbNpGPei Announcing our Series B π https://t.co/6BpYSq5uc4 https://t.co/QvVbNpGPei" [X Link](https://x.com/a1zhang/status/1990209934915781108) 2025-11-17T00:06Z 16.4K followers, 39.9K engagements "huge step towards solving low-bit (NVFP4 / FP4) training. 
FP8 training has been a thing for a while but anything lower has traditionally been unstable not affiliated at all but I've seen Jack in the weeds for months trying out different strategies and writing CUTLASS kernels to get this to work **fast** too Training LLMs with NVFP4 is hard because FP4 has so few values that I can fit them all in this post: [--] [---] [--] [---] [--] [--] [--] [--]. But what if I told you that reducing this range even further could actually unlock better training + quantization performance Introducing Four https://t.co/T39wKIFF4O" [X Link](https://x.com/a1zhang/status/1995915904786256271) 2025-12-02T18:00Z 16.3K followers, 16.9K engagements "Super cool work I'm very excited to see more efforts getting sandboxing / inference for RLMs to work We're also cooking something for this soon :) Couldn't get the recursive language model idea out of my head after seeing @JoshPurtell post about @a1zhang and @lateinteraction's research.so here's Aleph https://t.co/6L132vZv9Q API-less (Claude Desktop Cursor Windsurf friendly) MCP server Couldn't get the recursive language model idea out of my head after seeing @JoshPurtell post about @a1zhang and @lateinteraction's research.so here's Aleph https://t.co/6L132vZv9Q API-less (Claude Desktop" [X Link](https://x.com/a1zhang/status/2000667455627084025) 2025-12-15T20:41Z 16.4K followers, [----] engagements "There's a lot of important things But just to name the obvious ones: The goal in the end is 1) a pluggable standard either library or engine to slot in any LM API or local LM and run it as an RLM (also manage the sandbox handle the LM calls so they're async etc.) and 2) RLMs lend themselves well as a new axis of scaling reasoning. We want to standardize our design of (1) so the input/output pairs that an LM sees when used as an RLM are predictable so we can apply a lot of the recent training techniques like Quiet-STaR to train models that behave efficiently as RLMs" [X Link](https://x.com/a1zhang/status/2000670997557452999) 2025-12-15T20:55Z 16.4K followers, [---] engagements "We experiment on several different tasks of varying levels of complexity with one closed and one open LLM. We apply well-known task agnostic baselines including context compaction (summarization) CodeAct with a retriever and of course the base LLM itself. Across all tasks and a closed and open frontier model RLMs outperform other methods at a cheaper or comparable cost" [X Link](https://x.com/a1zhang/status/2007198922000019867) 2026-01-02T21:14Z 22.8K followers, 14.5K engagements "We hypothesize that model performance degrades on long context tasks at different rates depending on the complexity of the task. We scale GPT-5 and RLM(GPT-5) performance at input context lengths from 8K to 1M tokens on [--] tasks of increasing difficulty and show that the base model performance degrades pretty dramatically while RLMs maintain strong performance and can handle contexts well beyond the window of the base model. https://twitter.com/i/web/status/2007198924868608360 https://twitter.com/i/web/status/2007198924868608360" [X Link](https://x.com/a1zhang/status/2007198924868608360) 2026-01-02T21:14Z 22.8K followers, 13.7K engagements "We are excited by all the current interest and future work on RLMs including the recent blogpost by @PrimeIntellect and @omouamoua. 
This work would not be possible without support from the @LaudeInstitute and help from labmates @NoahZiems @jacobli99 and @nlp_mit Paper Link: https://arxiv.org/pdf/2512.24601 https://arxiv.org/pdf/2512.24601" [X Link](https://x.com/a1zhang/status/2007198927800418314) 2026-01-02T21:14Z 19.9K followers, [----] engagements "on the naming choice of RLMs its fair to say that RLMs arent a new architecture / new NN so the naming is confusing. it was a point of discussion with my advisor @lateinteraction when we were exploring this idea a few months back. I think though that what we think of as a model can be reframed beyond just a NN ultimately a big goal and motivation for the naming is to challenge the idea that a Transformer / NN is the lowest level abstraction of a language model. LMs are great & have gotten us very far but were getting to the regime where theyre very expensive to improve. increasing physical" [X Link](https://x.com/a1zhang/status/2007234677430456766) 2026-01-02T23:36Z 19.8K followers, 16.5K engagements "Sorry that line is meant to read as that long = 100M characters. It cant be changed because thats what was used for the experiments but by all means for any future experiments this ambiguity can be removed. The input context variable can be just a string it can be pre-chunked and it can also be an arbitrary object (e.g. a list of strings images etc.). The idea is that for RLMs you just dump inputs and it should be able to handle them appropriately. https://twitter.com/i/web/status/2007488143847542792 https://twitter.com/i/web/status/2007488143847542792" [X Link](https://x.com/a1zhang/status/2007488143847542792) 2026-01-03T16:24Z 19.7K followers, [----] engagements "RLMs natively handle prompts well beyond the context window of the base model. There are additional benefits w.r.t. context rot and raw performance (which is where future work will see the biggest gains) but out the box the immediate benefit is just handling huge contexts almost for free. https://twitter.com/i/web/status/2007489143530561719 https://twitter.com/i/web/status/2007489143530561719" [X Link](https://x.com/a1zhang/status/2007489143530561719) 2026-01-03T16:28Z 21.1K followers, [---] engagements "I was considering waiting a while to polish this first but decided it'd be better to just release an initial version to get better community feedback and squash bugs This is the official RLM repo with native support for cloud-based and local REPLs. https://github.com/alexzhang13/rlm https://github.com/alexzhang13/rlm" [X Link](https://x.com/a1zhang/status/2007566581409144852) 2026-01-03T21:35Z 23K followers, 119.6K engagements "@lakshyaag feel free to make any PRs if you figure out something cool I also should add some more robust workflows for public PRs. but will do this later for now will accept / review jank PRs" [X Link](https://x.com/a1zhang/status/2007569293710372951) 2026-01-03T21:46Z 19.6K followers, [----] engagements "@mitch_troy Yeah we could do this (and also explicitly implement this) for people to play with. 
Let me see if I can whip something up quickly as a separate type of sandbox" [X Link](https://x.com/a1zhang/status/2007578066730844558) 2026-01-03T22:21Z 19.6K followers, [----] engagements "I'm also going to be unofficially answering questions people have on alphaXiv over the course of the next few weeks so please drop any questions or comments on the paper you have directly there :) "Recursive Language Models" A potentially big direction for LLMs in [----] from MIT researchers In their approach a prompt isn't run directly instead it's stored as a variable in an external Python REPL and the language model writes code to inspect/slice/decompose that long https://t.co/AFhRb8sd3z "Recursive Language Models" A potentially big direction for LLMs in [----] from MIT researchers In their" [X Link](https://x.com/a1zhang/status/2007586495855776201) 2026-01-03T22:54Z 19.8K followers, 35.3K engagements "@lateinteraction genuinely so much more coming from this lab [----] is not ready for MIT OASYS" [X Link](https://x.com/a1zhang/status/2007679894596268509) 2026-01-04T05:06Z 20.9K followers, [----] engagements "@Devarsh786 This is likely one of the biggest problems to tackle with multi-LM call systems. Like most systems problems I suspect a lot of these kinds of issues can be solved with asynchrony / clever pipelining of LM calls but it's WIP and something I want to fix in the OSS inference code" [X Link](https://x.com/a1zhang/status/2007802271052996755) 2026-01-04T13:12Z 19.7K followers, [---] engagements "I also want to point out that we did add one of these "things that failed" sections to the Appendix inspired by the YOLOv3 paper More of a future Q but for writing papers would ppl prefer I make it long and detailed or just short and straight to the point Much like the switch in [----] from language models to reasoning models we think [----] will be all about the switch to Recursive Language Models (RLMs). It turns out that models can be far more powerful if you allow them to treat *their own prompts* as an object in an external https://t.co/6jCyZiLeQl Much like the switch in [----] from language" [X Link](https://x.com/a1zhang/status/2007867323714261470) 2026-01-04T17:30Z 19.9K followers, 12.6K engagements "Completely fair point def did not mean to make it clickbait haha IMO from my view "Recursive LM" sounds like an LM that is calling an LM rather than an in-architecture component (e.g. in AlphaFold3 they do what you're likely thinking of as an architecture change and call it "recycling"). I think overall LLM + scaffold = new model likely isn't going to catch on it's just in this specific instance the word "recursion" implies something else. https://twitter.com/i/web/status/2007953477318988066 https://twitter.com/i/web/status/2007953477318988066" [X Link](https://x.com/a1zhang/status/2007953477318988066) 2026-01-04T23:13Z 19.7K followers, [---] engagements "For those interested in making OSS contributions to the RLM repo I've added a bunch of random thoughts and TODOs of what to add in a *messy* Markdown file on the GH repo. Feel free to tackle any of them or any other things you think are meaningful. I'll be pretty active here or on the repo. Once I finish some other related work I might open up a Discord channel or something for people who want to make longer standing contributions to the repo / discuss the direction of where to take it.
Cheers https://github.com/alexzhang13/rlm/blob/main/CONTRIBUTING.md" [X Link](https://x.com/a1zhang/status/2007968184565928331) 2026-01-05T00:11Z 19.9K followers, 16.6K engagements "@_serinuntius Merged :) Very embarrassing on my part to not have caught that hopefully most people who forked fixed it on their own" [X Link](https://x.com/a1zhang/status/2008720331557794219) 2026-01-07T02:00Z 19.8K followers, [--] engagements "@alex__mackenzie Do you have an open repo for this UI I quite like it haha" [X Link](https://x.com/a1zhang/status/2008989857750860095) 2026-01-07T19:51Z 19.8K followers, [----] engagements "I'm still unsure about what exactly the UI for an RLM trajectory should look like. I want it to be extremely easy for someone to just look at (ideally with no clicking) and just see what the model is doing. The hard part is for visualizing sub-calls (especially if those sub-calls are RLMs and not LMs) but lmk if you have thoughts on this https://twitter.com/i/web/status/2008991712304136275 https://twitter.com/i/web/status/2008991712304136275" [X Link](https://x.com/a1zhang/status/2008991712304136275) 2026-01-07T19:58Z 22K followers, [---] engagements "@SwishMoe @lateinteraction Super cool This is also an open issue on the RLM repo to add multimodal support if youre interested :)" [X Link](https://x.com/a1zhang/status/2010513525303586854) 2026-01-12T00:45Z 21.4K followers, [----] engagements "I suppose I work at DeepMind now sorry @lateinteraction DeepMind just did the unthinkable. They built an AI that doesn't need RAG and it has perfect memory of everything it's ever read. It's called Recursive Language Models and it might mark the death of traditional context windows forever. Here's how it works (and why it matters https://t.co/mWc5EpKe59 DeepMind just did the unthinkable. They built an AI that doesn't need RAG and it has perfect memory of everything it's ever read. It's called Recursive Language Models and it might mark the death of traditional context windows forever. Here's" [X Link](https://x.com/a1zhang/status/2010756361940852908) 2026-01-12T16:50Z 23K followers, 120.1K engagements "im sorry guys for messing up your feeds New paper dropped by Anthropic: "Fractal Language Models" It DESTROYS the context window narrative. The LLM doesn't just respond it splits into self similar copies No tokens but models arguing compressing until the prompt is not read but self reconstructed /satire @a1zhang https://t.co/qSIzdvynw4 New paper dropped by Anthropic: "Fractal Language Models" It DESTROYS the context window narrative. The LLM doesn't just respond it splits into self similar copies No tokens but models arguing compressing until the prompt is not read but self reconstructed" [X Link](https://x.com/a1zhang/status/2012974062281105562) 2026-01-18T19:43Z 23.1K followers, 72.8K engagements "@yoavgo sadly incentives" [X Link](https://x.com/a1zhang/status/2013045785085042884) 2026-01-19T00:28Z 22.8K followers, [----] engagements "Be sure to check this out RLMs + DSPy :p The dspy.RLM module is now released π Install DSPy 3.1.2 to try it. Usage is plug-and-play with your existing Signatures. A little example of it helping @lateinteraction and I figure out some scattered backlogs: https://t.co/Avgx04sNJP The dspy.RLM module is now released π Install DSPy 3.1.2 to try it. Usage is plug-and-play with your existing Signatures. 
A little example of it helping @lateinteraction and I figure out some scattered backlogs: https://t.co/Avgx04sNJP" [X Link](https://x.com/a1zhang/status/2013379266545615130) 2026-01-19T22:33Z 23.2K followers, 22.6K engagements "yep; but what about conditional parallel tool calling or doing tool / sub calls only if the entries you're looking at contain some special property we could continue adding specialized functionality to serve each of these purposes or just let the model write the code to do this" [X Link](https://x.com/a1zhang/status/2014456533103608063) 2026-01-22T21:53Z 23K followers, [---] engagements "does anyone have any favorite blogs / papers that goes into detail about successfully FT / RLing a model on a domain-specific task like specifically just a small task nothing super fancy that requires an exorbitant amount of data" [X Link](https://x.com/a1zhang/status/2009743506311381341) 2026-01-09T21:46Z 23.3K followers, 16.6K engagements "guys ok weve really lost the plot here LOL looks like Ive left DeepMind and returned to MIT but now Ive changed the name to Recursive Meta Cognition R.I.P. basic prompting. MIT just dropped a technique that makes ChatGPT reason like a team of experts instead of one overconfident intern. Its called Recursive Meta-Cognition and it outperforms standard prompts by 110%. Heres the prompt (and why this changes everything) π https://t.co/caDTB52TS9 R.I.P. basic prompting. MIT just dropped a technique that makes ChatGPT reason like a team of experts instead of one overconfident intern. Its called" [X Link](https://x.com/a1zhang/status/2011972152388407640) 2026-01-16T01:21Z 29.2K followers, 112.4K engagements "Fundamentally what really is the difference between an RLM and S=context folding Codex Claude Code Terminus agents etc. This is the last and most important RLM post I'll make for a while to finally answer all the "this is trivially obvious" from HackerNews Reddit X etc. I know there's a lot of noise rn but this is the one thread I'd rly ask you not to skip For a while I didn't have a super clear answer to this. and no it's NOT that: [--]. CC sub-agents are user-defined while the LM defines the sub-agent in RLMs. this is a minor difference that I suspect Anthropic will phase out at some point 2." [X Link](https://x.com/a1zhang/status/2014337263287804260) 2026-01-22T14:00Z 23.5K followers, 49.4K engagements "This will be my second time coming on alphaXiv for RLMs (first time being after the blogpost) so like last time I'll try to answer everything / have good unfiltered conversation What if LLMs could reason over arbitrarily long inputs Join us tomorrow for our AI4Science talk with @a1zhang on Recursive Language Models (RLMs) a new inference time strategy that scales far beyond traditional context windows Register in the link below π https://t.co/gpI56SrDus What if LLMs could reason over arbitrarily long inputs Join us tomorrow for our AI4Science talk with @a1zhang on Recursive Language Models" [X Link](https://x.com/a1zhang/status/2014493912430207079) 2026-01-23T00:22Z 23.3K followers, 17.7K engagements "I've been super excited to get Daytona sandboxes support integrated into the RLM repository (which now works in the official repo). Check out this awesome guide on RLMs with *depth1* with code examples from @daytonaio https://www.daytona.io/docs/en/recursive-language-models/ We just dropped our guide to Recursive Language Models. 
Where every agent and sub-agent gets its own @daytonaio sandbox - at UNLIMITED recursion depth https://t.co/vFGNS9gzkc https://www.daytona.io/docs/en/recursive-language-models/ We just dropped our guide to Recursive Language Models. Where every agent and sub-agent" [X Link](https://x.com/a1zhang/status/2015820458709471640) 2026-01-26T16:13Z 23.4K followers, 13.7K engagements "We have lots of plans in [----] for GPU MODE please read if you're interested in joining and/or helping us out :) GPU MODE 2026: we're post-training Kernel LLMs in public and are building all the infra we need to make GPU programming more accessible to all. We're doing this in close collaboration with some of my favorite communities @PrimeIntellect @modal and @LambdaAPI [----] recap: 26K GPU MODE 2026: we're post-training Kernel LLMs in public and are building all the infra we need to make GPU programming more accessible to all. We're doing this in close collaboration with some of my favorite" [X Link](https://x.com/a1zhang/status/2015830101842128909) 2026-01-26T16:52Z 23.4K followers, [----] engagements "Also most people won't read this but Mark has a much nicer and expanded version of his tweet linked (also can go to GPU MODE website and look in news tab) https://t.co/2jC070VX9U https://t.co/2jC070VX9U" [X Link](https://x.com/a1zhang/status/2015878196734964146) 2026-01-26T20:03Z 23.4K followers, [----] engagements "@raw_works Oh cool I can play around with this on some evals of my own Did you put up a PR for this" [X Link](https://x.com/a1zhang/status/2016567148567572741) 2026-01-28T17:40Z 23.5K followers, [----] engagements "PSA I've left MIT for Harvard now Your AI is lying to you with complete confidence. Harvard & MIT just proved ChatGPT hallucinates 110% less when you force it to argue with itself. The technique is called "Recursive Meta-Cognition" and it's embarrassingly simple. Here's how to make AI actually think: https://t.co/oXKqvcEpbw Your AI is lying to you with complete confidence. Harvard & MIT just proved ChatGPT hallucinates 110% less when you force it to argue with itself. The technique is called "Recursive Meta-Cognition" and it's embarrassingly simple. Here's how to make AI actually think:" [X Link](https://x.com/a1zhang/status/2016568049407554011) 2026-01-28T17:44Z 23.3K followers, 34.3K engagements "@irl_danB @lateinteraction Ah yes the visualizer can be found in the codebase. I am in the process of figuring out the best way to upload these trajectories / uniformly format them but I have a sneak peek up: https://alexzhang13.github.io/rlm-examples/ https://alexzhang13.github.io/rlm-examples/" [X Link](https://x.com/a1zhang/status/2016965697008468004) 2026-01-29T20:04Z 23.3K followers, [---] engagements
@a1zhang Alex L ZhangAlex L Zhang posts on X about future, ai, environment, inference the most. They currently have [------] followers and [---] posts still getting attention that total [-----] engagements in the last [--] hours.
Social category influence technology brands 15% stocks 6% social networks 4% finance 3% gaming 3% cryptocurrencies 1%
Social topic influence future 7%, ai 6%, environment #541, inference 5%, if you 5%, llm 5%, we are 4%, context window 4%, in the 3%, this is 3%
Top accounts mentioned or mentioned by @gpumode @lateinteraction @raw_works @shihwesley @simonguozirui @msirovatka @philipp__k @amd @swishmoe @tensorfi @socialtranxiety @rawworks @marksaroufim @noahziems @jacobli99 @primeintellect @laudeinstitute @sashimikun_void @dotdotjames @irl_danb
Top assets mentioned Alphabet Inc Class A (GOOGL) Dell Technologies, Inc. (DELL) Oasys (OAS)
Top posts by engagements in the last [--] hours
"another related direction Ill be paying attention to this year :) Memory is probably the biggest challenge for building practical AI agents. Thrilled to share our work exploring a shift from manually defining memory for each domain enabling agents to design better memory mechanisms for themselves. Meta-learning memory designs unlocks Memory is probably the biggest challenge for building practical AI agents. Thrilled to share our work exploring a shift from manually defining memory for each domain enabling agents to design better memory mechanisms for themselves. Meta-learning memory designs"
X Link 2026-02-11T20:34Z 29.3K followers, 28K engagements
"The prompt being symbolic is extremely important and I think theres a bit of confusion on what that means. We are claiming that for the REPL / environment that the RLM interacts with its inputs (i.e. prompt and we call it this way to emphasize the idea that an RLM is by construction a language model) needs to have some kind of symbolic handle in this environment to be used. Putting the book in a file is an equivalent statement. The file is the symbolic handle to the RLM prompt which the RLM can access through its REPL. This point is made even more clear in using a bash file system as the"
X Link 2026-02-12T22:35Z 29.3K followers, [----] engagements
"@socialtranxiety yeah it's called recursive meta cognition by some folks at deepmind"
X Link 2026-02-07T22:38Z 29.3K followers, [----] engagements
"This is true if youre clear on what is an instruction and what is raw data. In the blog we explain that we separate out the query (put directly in context) and the raw data. In my RLM implementation on GitHub this is also specified in this way (i.e. there is an option to pass in instructions directly to the LM). This is all a convenience in cases where you know exactly what the instructions are. This is not a safe assumption to make in many cases though. The reason we dont distinguish these two in the definition of an RLM is that in many cases it is not easy to distinguish nor is it"
X Link 2026-02-13T13:43Z 29.3K followers, [---] engagements
"πππ @a1zhang apologies from my content team @a1zhang apologies from my content team"
X Link 2026-01-16T18:13Z 29.3K followers, 13K engagements
"Specifically we were lucky that LongBench-Pro a separate source of long-context problems separate from our eval tasks was released recently. We are excited about future results that train larger models as RLMs but the fact that this small model picks up these recursive strategies from a tiny amount of data is pretty encouraging Read more in appendix A of the paper: https://arxiv.org/pdf/2512.24601 https://arxiv.org/pdf/2512.24601"
X Link 2026-01-29T17:16Z 29K followers, [----] engagements
"The model was trained trajectories using a fixed system prompt and following the structure of our RLM repo. We recommend using vLLM with our inference code to use it out of the box. Open model: https://huggingface.co/mit-oasys/rlm-qwen3-8b-v0.1 https://github.com/alexzhang13/rlm https://huggingface.co/mit-oasys/rlm-qwen3-8b-v0.1 https://github.com/alexzhang13/rlm"
X Link 2026-01-29T17:16Z 29.1K followers, [----] engagements
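For context, here is a hedged sketch of loading the released checkpoint with plain vLLM. The RLM scaffold (REPL, fixed system prompt, sub-call handling) comes from the repo's inference code; this only shows that the weights load like any other Hugging Face model, and the sampling settings are placeholders.

```python
from vllm import LLM, SamplingParams

# Load the released weights; the RLM scaffold itself lives in the repo's inference code.
llm = LLM(model="mit-oasys/rlm-qwen3-8b-v0.1")
params = SamplingParams(temperature=0.6, max_tokens=512)  # placeholder settings

outputs = llm.generate(["Summarize the plot of Hamlet in two sentences."], params)
print(outputs[0].outputs[0].text)
```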
"Second we expanded the writeup with an extra section (and corresponding results). Building on an earlier discussion we solidify the three defining properties of an RLM: [--]. A symbolic handle to the prompt [--]. Access to a persistent Turing-complete environment that contains this handle [--]. The ability to perform symbolic recursion within this environment We also add new results for a CodeAct baseline that has access to sub-calls. This contrasts RLMs to existing agents with sub-calls. (Spoiler: the latter works much less effectively as you scale up the context length.)"
X Link 2026-01-29T17:16Z 29.2K followers, [----] engagements
"We are excited by all the support and future work to be done on RLMs Thanks again to Laude Institute Prime Intellect and Modal for their support of this research. Please let us know how RLMs do in your own domains and where they can improve :) https://arxiv.org/abs/2512.24601 https://arxiv.org/abs/2512.24601"
X Link 2026-01-29T17:16Z 29.2K followers, [----] engagements
"I came across this work that implicitly implements an RLM in a DSL that executes both code and natural language instructions. Super cool https://elliecheng.com/blog/2026/01/20/enabling-rlm-with-shared-program-state/ https://elliecheng.com/blog/2026/01/20/enabling-rlm-with-shared-program-state/"
X Link 2026-01-30T17:10Z 29.1K followers, 10.3K engagements
"Some of the benchmarks in our paper provide simple patterns where this isn't the case (e.g. OOLONG where you have to chunk and loop over sub-calls) and Claude Code wouldn't natively do this without specific instruction. There is also the more obvious friction of trying to apply Claude Code to a generic task and it being 1) overkill and 2) an awkward interface for doing so But yep I also think current models work as RLMs but the real long term value is in performance down the road following this strategy :)"
X Link 2026-02-08T05:33Z 28.9K followers, [----] engagements
"hmm Recursive Language Models (RLMs) let agents manage 10M+ tokens by delegating tasks recursively. This Google Cloud Community Article explains why ADK was the perfect choice for re-implementing the original RLM codebase in a more enterprise-ready format https://t.co/p3MsNtLVJL https://t.co/CBMj1xbxD3 Recursive Language Models (RLMs) let agents manage 10M+ tokens by delegating tasks recursively. This Google Cloud Community Article explains why ADK was the perfect choice for re-implementing the original RLM codebase in a more enterprise-ready format https://t.co/p3MsNtLVJL"
X Link 2026-02-08T20:19Z 29.3K followers, 113K engagements
"If you're dealing with a 100M context window there wasn't anything to cache to begin with. You'd want a natural way for an LM to handle this which an RLM provides. As for general token costs RLMs can allow handling a system without the individual LM calls looking at the entire context. Suppose there exists an LM that can ingest 100M context windows -- it is forced to look at all of it even if it doesn't need to. https://twitter.com/i/web/status/2020627077624742021 https://twitter.com/i/web/status/2020627077624742021"
X Link 2026-02-08T22:33Z 29K followers, [---] engagements
"@lateinteraction @thkostolansky I hand drew mine in high school LOL"
X Link 2026-02-12T22:27Z 29.3K followers, [----] engagements
"Fair questions dont worry I would point you to a bunch of prior tweets or videos Ive talked on but theyre scattered at this point. [--]. RLM = subagents perhaps it is an argument on how exactly a subagent calling system should look (Im not claiming optimality or anything its more based on intuition). [--]. CC currently doesnt do exactly what the RLM describes. I think this is a misconception and I would hope (its a beta feature afaik) it moves closer to this. CC explicitly calls sub-agents as a tool (eg. Opus [---] will directly output JSON(call sub agent)) which differs from writing programs in"
X Link 2026-02-14T01:14Z 29.3K followers, [----] engagements
"Providing some responses I want to preface this all by saying the comparison of CC to RLM is a little fuzzy and not the correct framing to me because a lot of CC is a highly post-trained task-specific scaffold that shares a lot of similarities to the defn of an RLM. In fact there are small tweaks that can be made to CC for it to fit the definition of an RLM. To illustrate this point there are a few existing plugins for integrating RLMs into CC / OC. [--]. Agreed except there are some notable limitations of CC-style sub-agents. It relies on the root model being entirely correct about its order /"
X Link 2026-02-14T03:47Z 29.3K followers, [---] engagements
"What if scaling the context windows of frontier LLMs is much easier than it sounds Were excited to share our work on Recursive Language Models (RLMs). A new inference strategy where LLMs can decompose and recursively interact with input prompts of seemingly unbounded length as a REPL environment. On the OOLONG benchmark RLMs with GPT-5-mini outperforms GPT-5 by over 110% gains (more than double) on 132k-token sequences and is cheaper to query on average. On the BrowseComp-Plus benchmark RLMs with GPT-5 can take in 10M+ tokens as their prompt and answer highly compositional queries without"
X Link 2025-10-15T14:32Z 29.3K followers, 951.1K engagements
"Much like the switch in [----] from language models to reasoning models we think [----] will be all about the switch to Recursive Language Models (RLMs). It turns out that models can be far more powerful if you allow them to treat their own prompts as an object in an external environment which they understand and manipulate by writing code that invokes LLMs Our full paper on RLMs is now availablewith much more expansive experiments compared to our initial blogpost from October [----] https://arxiv.org/pdf/2512.24601 https://arxiv.org/pdf/2512.24601"
X Link 2026-01-02T21:14Z 29.3K followers, 2M engagements
"RLMs are our bitter-lesson-pilled approach to inference-time scaling and they can scale the context size of LLMs by orders of magnitude From the outside an RLM exposes the same interface as a language model. It accepts a string prompt and produces a string response. But internally RLMs do not feed the prompt directly to the Transformer. Instead they set up the LLM in a REPL environment where the prompt is placed into a variable and then allow the LLM to write code to peek into break up and recursively invoke itself over snippets of the prompt."
X Link 2026-01-02T21:14Z 29.3K followers, 61.9K engagements
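A minimal sketch of the loop described above, under stated assumptions: `chat()` stands in for whatever LM API is being used, the REPL is a bare `exec` namespace rather than a sandbox, and the `FINAL:` convention is invented for the example. It illustrates the control flow, not the official implementation.

```python
import contextlib, io

def chat(messages: list[dict]) -> str:
    """Stand-in for any LM API call; plug in your provider of choice here."""
    raise NotImplementedError

def rlm(prompt: str, max_steps: int = 20) -> str:
    # The full prompt lives only in the REPL namespace, never in the root LM's context.
    env = {"prompt": prompt,
           "llm": lambda p: chat([{"role": "user", "content": p}])}
    history = [{"role": "system", "content": (
        "You are in a Python REPL. `prompt` holds the full user input and `llm(text)` "
        "queries a sub-model. Reply with code to run and print what you want to see, "
        "or reply `FINAL: <answer>` when finished.")}]
    for _ in range(max_steps):
        reply = chat(history)
        if reply.startswith("FINAL:"):
            return reply[len("FINAL:"):].strip()
        buf = io.StringIO()
        try:
            with contextlib.redirect_stdout(buf):
                exec(reply, env)          # run the model-written code
            observation = buf.getvalue()
        except Exception as e:
            observation = f"Error: {e!r}"
        history += [{"role": "assistant", "content": reply},
                    {"role": "user", "content": observation[:2000]}]  # only a slice comes back
    return "step limit reached"
```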
"@DavidFSWD @lateinteraction awesome well also be releasing some code for people to play with soon"
X Link 2026-01-03T05:33Z 29.3K followers, 13.5K engagements
"We just updated the RLM paper with some new stuff. First we just released RLM-Qwen3-8B the first natively recursive language model (at tiny scale). We post-trained Qwen3-8B using only [----] RLM trajectories from unrelated domains to our evaluation benchmarks. RLM-Qwen3-8B works well across several tasks and delivers a pretty large boost over using an RLM scaffold with the underlying Qwen3-8B model off-the-shelf and even larger gains over directly using Qwen3-8B directly for long-context problems. https://twitter.com/i/web/status/2016923294461476873"
X Link 2026-01-29T17:16Z 29.3K followers, 67.6K engagements
"This is bar for bar one of the core pieces of intuition behind how we came up with RLMs why they currently work and why they are so promising for this year Lots of exciting stuff will be released soon :) Why do coding agents work so well and what would it take to replicate their success in other domains One important and under-appreciated reason is that agentic coding is a type of neurosymbolic AI. The main weakness of LLMs is that they are statistical machines and struggle at Why do coding agents work so well and what would it take to replicate their success in other domains One important"
X Link 2026-02-02T20:12Z 29.3K followers, 30.2K engagements
"while procrastinating on research I decided it's finally time to add RLMs to pypi pip install rlms"
X Link 2026-02-07T22:27Z 29.3K followers, 79.5K engagements
"Maybe I can provide some intuition but lmk if its unclear I am trying to refine how I explain this anyways To start I think the RLM idea is super simple but elegant (I'm biased obviously). The paper argues that future language models 1) do not need to think about context window limits; 2) will have reasoning chains that mix code (symbolic) and neural LMs (fuzzy). RLMs are what we think minimally such a system should look like. Explicitly it is an LM REPL + prompt where the REPL contains the prompt and sub-agents as a function inside the REPL. This last part is quite important because it"
X Link 2026-02-08T05:13Z 29.3K followers, 59.2K engagements
"RT @raw_works: out of curiosity after reading this i started benchmarking rlm and dspy.rlm on longmemeval tl;dr - i think i might have a"
X Link 2026-02-11T00:30Z 29.3K followers, [--] engagements
"π Whos up to build an RLM agent for ARC-AGI-3 Bounty available https://github .com/arcprize/ARC-AGI-3-Agents Whos up to build an RLM agent for ARC-AGI-3 Bounty available https://github .com/arcprize/ARC-AGI-3-Agents"
X Link 2026-02-12T22:27Z 29.3K followers, [----] engagements
"Funnily enough I tried to dabble with ARC AGI before and with very little success Super cool to well designed RLMs achieving SOTA :) We set a new ARC-AGI-2 SotA: 85.28% using an Agentica agent (350 lines) that writes and runs code. https://t.co/tohFfBZb2P We set a new ARC-AGI-2 SotA: 85.28% using an Agentica agent (350 lines) that writes and runs code. https://t.co/tohFfBZb2P"
X Link 2026-02-12T22:43Z 29.3K followers, 17.7K engagements
"Weve extended the ML Valentine's π dataset and found it a new home: https://t.co/VzSbaH2sl2 β€π€ My favorite this year @a1zhangs RLM π Because I think about it over and over and over again. https://t.co/XrWLPrhkC0 Weve extended the ML Valentine's π dataset and found it a new home: https://t.co/VzSbaH2sl2 β€π€ My favorite this year @a1zhangs RLM π Because I think about it over and over and over again. https://t.co/XrWLPrhkC0"
X Link 2026-02-14T20:32Z 29.3K followers, [----] engagements
"We @AuricSource solved 8/10 problems from the #1stProof benchmark (Abouzaid et al. arXiv:2602.05192) all with Lean [--] formal verification. Q4 & Q6: substantial partial QED with precise remaining gaps. The twist AI agents did the heavy lifting reasoning proving and https://t.co/clorfPuy76 We @AuricSource solved 8/10 problems from the #1stProof benchmark (Abouzaid et al. arXiv:2602.05192) all with Lean [--] formal verification. Q4 & Q6: substantial partial QED with precise remaining gaps. The twist AI agents did the heavy lifting reasoning proving and https://t.co/clorfPuy76"
X Link 2026-02-16T22:34Z 29.3K followers, [----] engagements
"Also for cuEquivariance kernels we explicitly ban them because they are closed source and this competition in particular focuses on inference which has a lot of room to be faster. We only use them as a benchmark reference. We dont use them as the correctness reference because they have precision errors w.rt. a pure FP32 PyTorch implementation"
X Link 2025-06-25T03:17Z 12.3K followers, [---] engagements
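An illustrative version of the kind of correctness check described above (not the competition's actual harness): compare a candidate kernel's output against a pure FP32 PyTorch reference under a tolerance. The toy op and tolerances are placeholders.

```python
import torch

def reference_fp32(x: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
    # Toy stand-in op; the real competition problems are far more involved.
    return (x.float() @ w.float()).relu()

def check(candidate_fn, x, w, rtol=1e-3, atol=1e-3):
    ref = reference_fp32(x, w)
    out = candidate_fn(x, w).float()
    torch.testing.assert_close(out, ref, rtol=rtol, atol=atol)  # raises on mismatch

x, w = torch.randn(128, 256), torch.randn(256, 64)
check(lambda a, b: (a @ b).relu(), x, w)  # a trivially correct "kernel" passes
```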
"Ive been looking forward to this one for a long time LOL @simonguozirui I'm giving a talk at GPU mode tomorrow. Feel free to join the livestream: https://t.co/2UWmxdjNEc I'm giving a talk at GPU mode tomorrow. Feel free to join the livestream: https://t.co/2UWmxdjNEc"
X Link 2025-06-28T18:20Z 12.3K followers, [---] engagements
"Very much a noob question but for benchmarking CUDA code speed we generally have to clear caches so multiple repeated runs are fair. If I were to benchmark CPU code speed (e.g. on AlgoTune) does a similar principle apply And how easy is it to do this in say Python"
X Link 2025-07-11T09:02Z 12.4K followers, [----] engagements
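One common answer, sketched below under the assumption that allocating fresh inputs per run is an acceptable way to avoid measuring warm-cache behavior (this is not AlgoTune's harness): warm up once, then time repeated runs on newly created data and report the min and median.

```python
import random
import statistics
import time

def bench(fn, make_input, repeats=20):
    fn(make_input())                      # warm-up: page faults, allocator, any JIT
    times = []
    for _ in range(repeats):
        x = make_input()                  # fresh data so caches aren't pre-warmed by the last run
        t0 = time.perf_counter()
        fn(x)
        times.append(time.perf_counter() - t0)
    return min(times), statistics.median(times)

best, median = bench(sorted, lambda: [random.random() for _ in range(100_000)])
print(f"best {best * 1e3:.2f} ms, median {median * 1e3:.2f} ms")
```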
"As a reminder the first game is just clicking on a green button. Many games in the VideoGameBench list require clicking on buttons characters etc. The mapping from text actions -- spatial coordinates is non-trivial and all evaluated models but Claude [---] struggle"
X Link 2025-07-12T01:31Z 12.4K followers, [--] engagements
"The second game requires moving a square through a grid-world maze using the arrow-keys. Solving these games directly translates to games like Pokemon where the agent must navigate around a map. Surprisingly though all evaluated frontier VLMs struggle on this task"
X Link 2025-07-12T01:31Z 12.4K followers, [--] engagements
"The practice game involves dragging a mouse in a pre-specified pattern. A lot of RTS games like Age of Empires / Civ require this ability. Unlike clicking which requires moving to a specified location the trajectory of moving the mouse matters in this game"
X Link 2025-07-12T01:31Z 12.4K followers, [---] engagements
"Bro actually denied OpenAI an AlphaGo moment LOL @FakePsyho is him. Huge congratsππ"
X Link 2025-07-17T00:22Z 12.4K followers, 13.9K engagements
"LM reasoning benchmark idea: have it beat a Hardcore Nuzlocke run of Pokmon Run & Bun or a Kaizo ROM hack Give it access to search online use damage calculators etc. People spend literally hundreds of hours meticulously planning battles managing their available mons etc"
X Link 2025-07-24T05:23Z 12.4K followers, [----] engagements
"The focus wouldnt be on navigating the world (weve proven this can be done with Gemini Plays Pokemon) rather this is a mix of the recent Pokemon Showdown agents + a much longer horizon reasoning task of planning out runs accounting for bad luck etc"
X Link 2025-07-24T05:23Z 12.4K followers, [---] engagements
"@KLieret Congrats on the launch"
X Link 2025-07-25T06:12Z 12.4K followers, [---] engagements
"if anyone can help no idea why I dont know this but is there free TTS software where you throw in a huge website or PDF (long doc say like the CUDA programming guide long) and it spits out reasonable audio that isnt just monotone reading"
X Link 2025-08-02T16:48Z 12.5K followers, [----] engagements
"in addition to all the amazing content for the next [--] weeks on GPU MODE there's also this amazing 5-week course starting soon on all the juicy secrets to 100B+ scale model training there's a shit ton of content in here and a ton of amazing lecturers + free compute"
X Link 2025-08-20T13:41Z 12.6K followers, [----] engagements
"Excited to announce the SECOND @GPU_MODE x @AMD $100K kernel competition: β‘DISTRIBUTED KERNELS You now get free access to a full 8xMI300 node to optimize all2all gemm + reducescatter and allreduce + gemm kernels -- all relevant to frontier LMs Go compete now π§΅"
X Link 2025-08-27T18:35Z 12.9K followers, [----] engagements
"hi if youre interested in using or writing mega kernels for AI (one big GPU kernel for an entire model) you should tune in to todays @GPU_MODE livestream today in [--] hours we have the authors of MPK talking about their awesome new compiler for mega kernels see you there :)"
X Link 2025-09-13T16:19Z 13K followers, 17.1K engagements
"Claude [---] Sonnet cleans up all the UI for our hacky typescript leaderboard in minutes π btw the @GPU_MODE kernel competition is live NOW we have tons of available gpus (free of charge) hosted on @modal_labs -- see @charles_irl's triton code taking π₯ here (you could be next)"
X Link 2025-02-25T00:14Z 11.2K followers, [----] engagements
"More cracked submissions to the @AMD x @GPU_MODE leaderboard β18k+ submissions since the beginning πFP8 GEMM: A battle btwn Seb and Snektron for π₯ with a 25% faster kernel since [--] weeks ago π€― Single-device MoE: multiple ppl are now 100-600x faster than PyTorch ref A few days after @AnushElangovan's tweet about a crazy-fast 183.429s kernel for FP8 GEMM on MI300X an EVEN FASTER submission has emergedβ‘ We're already at 5000+ submissions in just the first week of the $100k (cash π΅ btw) AMD MI300X kernel-writing competition -- join now https://t.co/qCaQzgIuw3 A few days after @AnushElangovan's"
X Link 2025-05-09T19:04Z 23.1K followers, [----] engagements
"I'll be at the AMD Advancing AI conference on June [--] where we'll be announcing the winners of the $100K @AMD x @GPU_MODE competition ALSO the amazing @marksaroufim and @m_sirovatka are presenting on writing fast kernels so pull up π¨DM if you wanna meet up :)"
X Link 2025-06-08T04:01Z 23.1K followers, [----] engagements
"kind of a surreal moment being on stage with Lisa Su as she announces & thanks us for the competition we built the past year building w/ @m_sirovatka @marksaroufim Ben & Erik (all in our free time :p) on @GPU_MODE has been genuinely incredible cant thank you guys enough β€"
X Link 2025-06-13T16:15Z 23.1K followers, 21.7K engagements
"Announcing a new @GPU_MODE kernel writing competition: our first featuring both NVIDIA and AMD hardware The first problem will be the Triangle Multiplication operator essential to the AlphaFold 𧬠models It's a particularly tricky problem with no good public implementation"
X Link 2025-06-24T21:37Z 23.1K followers, 26.7K engagements
"sadly wont be at ICML but have [--] papers that you should check out KernelBench which @simonguozirui will be presenting at the main conference _ + the @GPU_MODE leaderboards OSS infra at the CODEML workshop (7/19) that @m_sirovatka will be giving an oral for Lots of πΏ"
X Link 2025-07-13T13:44Z 23.1K followers, [----] engagements
"announcing the @GPU_MODE x @scaleml summer speaker series happening next week a 5-day series where top researchers will teach about the algorithmic and systems-level advances that underpin gpt-oss all content will be live-streamed & recorded for FREE on GPU MODE's YouTube"
X Link 2025-08-18T18:48Z 23.1K followers, 29.6K engagements
"We are live This will be a super long session with two amazing speakers so feel free to stop by and ask any questions you may have :) π: https://www.youtube.com/watchv=LMk8nqIFXLo We are ending strong with GPU Programming π [--] talks today back to back First @exists_forall for intro to CUDA and then @simran_s_arora for Thunder Kittens π Today at: 1:00pm EST / 11:00am PT - https://t.co/2S7r5oxzI1 https://t.co/hcQRVFeYQz https://www.youtube.com/watchv=LMk8nqIFXLo We are ending strong with GPU Programming π [--] talks today back to back First @exists_forall for intro to CUDA and then"
X Link 2025-08-29T17:08Z 16.3K followers, [----] engagements
"it's insane to me how little attention the llm.q repo has it's a fully C/C++/CUDA implementation of multi-gpu (zero + fsdp) quantized LLM training with support for selective AC it's genuinely the coolest OSS thing I've seen this year (what's crazier is [--] person wrote it)"
X Link 2025-09-30T22:58Z 16.5K followers, 41.8K engagements
"the AF3 trimul kernel @GPU_MODE competition has ended - big congrats to the winners we wrote a blogpost on the GPU MODE website detailing 1) why this particular problem is so nasty to optimize 2) how we intended for it to go 3) what participants wrote https://www.gpumode.com/v2/news https://www.gpumode.com/v2/news"
X Link 2025-10-07T18:23Z 23.1K followers, [----] engagements
"Blogpost: There are several examples of interesting behavior that emerge from RLMs that can be found in the blogpost. We wrote a visualizer to make these examples clearer and highlight different strategies that these models can take. OOLONG: BCP: Id like to thank my wonderful advisor @lateinteraction the person Ive spammed on Slack while working on this @noahziems and the rest of my labmates @jacobli99 @dianetc_ for their support and discussion in this project https://arxiv.org/abs/2508.06600 https://openreview.net/forumid=lrDr6dmXOX https://alexzhang13.github.io/blog/2025/rlm/"
X Link 2025-10-15T14:32Z 16.3K followers, 23.6K engagements
"So you can think of RLMs as a generalization of this idea and several others. In our post we wanted to show that code execution was the right instantiation of this more general idea. Most prior works don't really consider the context / prompt our the root LM in the way that we've been framing it (which makes sense they're very agentic in nature). In terms of why the REPL environment is different than the setup you described in some sense we don't want to manually engineer tools for the RLMs. We want full flexibility and using code is the most flexible form of this (e.g. using regex queries to"
X Link 2025-10-15T17:14Z 16.3K followers, [----] engagements
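The regex point is easy to picture. Below is the kind of snippet an RLM might write inside its REPL (illustrative only): `prompt` stands in for the huge input held in the environment, and the `llm` sub-call is left commented out since it depends on the provider.

```python
import re

# `prompt` stands in for the huge input held in the REPL environment.
prompt = ("filler text " * 10_000) + "Quarterly revenue was $4.2B. " + ("filler " * 10_000)

hits = [m.start() for m in re.finditer(r"quarterly revenue", prompt, re.IGNORECASE)]
snippets = [prompt[max(0, i - 200): i + 200] for i in hits[:10]]
print(f"{len(hits)} match(es) out of {len(prompt)} characters scanned")

# Only the narrowed-down snippets would be sent to sub-calls, e.g.:
# for s in snippets:
#     print(llm(f"Does this passage state a revenue figure?\n\n{s}"))
```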
""Why is this not just an agent with access to your file system e.g. something like SWE-Agent" Perhaps because we implemented it with a REPL environment there seems to be some link to coding but an RLM is entirely task-agnostic. Think of it as an extension of the bitter lesson -- our design of how we handle context (e.g. how we design agents) should entirely be up to an LM not a human. When you think about an agent with file system or terminal access it is generally given tools to look around some codebase execute code write tests etc. An RLM like an LM call is a function from text -- text."
X Link 2025-10-15T18:59Z 16.3K followers, 31K engagements
"Lots of folks have been asking for a gist or simple notebook to try out RLMs. While we work on some more exciting experiments here's a self-contained minimal version I quickly put together for people to build on top of. Happy hacking :) https://github.com/alexzhang13/rlm https://github.com/alexzhang13/rlm"
X Link 2025-10-16T22:18Z 16.4K followers, 31.4K engagements
"The last thing I'll clarify before I log off for a bit to go study for my midterms Q1: Why has such a simple idea like RLMs not been formalized / used in this way Honestly I think a lot of the replies are the answer to this question. There is this hard stuck notion of "agents" in our field -- the idea that an LM call is the smallest unit / function of execution we have available and everything on top has to be some kind of human-designed scaffold to get it to work. We happened to stumble upon the idea of handling huge contexts when we designed agents for SWE but because we were so insistent"
X Link 2025-10-17T05:21Z 16.4K followers, 82.1K engagements
"context switching away from RLMs for today but not entirely from the blog format feeling insane FOMO that I can't be at the @GPU_MODE IRL event or PTC this year but in other slightly related news it's been [--] year since KernelBench @simonguozirui and I are releasing a long post tmr (when the IRL event is happening) on our perspective on what's happened since with all the ideas & conclusions we've come to about GPU codegen unpacking all the good (and reward hacked) results that have emerged and the shift from performance to correctness hopefully I at least see lots of pictures of the event from"
X Link 2025-10-23T15:19Z 16.5K followers, [----] engagements
"Thanks for having me on I also answered a bunch about what Ive been thinking and some more expanded thoughts on RLMs vs. X in Claude Code CodeAct Tool calling etc. Happy to answer anything else too One of our best talks yet. Thanks @a1zhang for the amazing presentation + Q&A on Recursive Language Models If you're interested in how we can get agents to handle near-infinite contexts this one is a must. Watch the recording here https://t.co/v3N2mHoeSU One of our best talks yet. Thanks @a1zhang for the amazing presentation + Q&A on Recursive Language Models If you're interested in how we can get"
X Link 2025-10-31T23:45Z 21.9K followers, [----] engagements
"The wait is over Were so excited to announce the @GPU_MODE x @NVIDIA kernel optimization competition for NVFP4 kernels on Blackwell B200s We will be awarding NVIDIA DGX Sparks & RTX 50XX series GPUs for individual rankings on each problem as well as a Dell Pro Max with NVIDIA GB300 (an absolutely insane prize btw) for the grand prize winner of the entire competition Winners will also be invited to GTC [----] for free. The competition will feature [--] different single-device kernels over a [--] month period; we highly encourage competitors to try out and use CuTe DSL / CUTLASS [---] on these problems"
X Link 2025-11-03T19:48Z 16.5K followers, 16.6K engagements
"lots of fun announcements / news from @GPU_MODE this week i sadly wasn't able to go but they've written up a wonderful blogpost on the IRL hackathon in SF a couple weeks back + all the cool winner projects. go have a read on the website"
X Link 2025-11-05T22:43Z 16.5K followers, [----] engagements
"very exciting news congrats to @hardmaru and the rest of the super talented folks at Sakana on the raise (if u have the option to intern there you should i learned sm there pre-phd) Announcing our Series B π https://t.co/6BpYSq5uc4 https://t.co/QvVbNpGPei Announcing our Series B π https://t.co/6BpYSq5uc4 https://t.co/QvVbNpGPei"
X Link 2025-11-17T00:06Z 16.4K followers, 39.9K engagements
"huge step towards solving low-bit (NVFP4 / FP4) training. FP8 training has been a thing for a while but anything lower has traditionally been unstable not affiliated at all but I've seen Jack in the weeds for months trying out different strategies and writing CUTLASS kernels to get this to work fast too Training LLMs with NVFP4 is hard because FP4 has so few values that I can fit them all in this post: [--] [---] [--] [---] [--] [--] [--] [--]. But what if I told you that reducing this range even further could actually unlock better training + quantization performance Introducing Four https://t.co/T39wKIFF4O"
X Link 2025-12-02T18:00Z 16.3K followers, 16.9K engagements
"Super cool work I'm very excited to see more efforts getting sandboxing / inference for RLMs to work We're also cooking something for this soon :) Couldn't get the recursive language model idea out of my head after seeing @JoshPurtell post about @a1zhang and @lateinteraction's research.so here's Aleph https://t.co/6L132vZv9Q API-less (Claude Desktop Cursor Windsurf friendly) MCP server Couldn't get the recursive language model idea out of my head after seeing @JoshPurtell post about @a1zhang and @lateinteraction's research.so here's Aleph https://t.co/6L132vZv9Q API-less (Claude Desktop"
X Link 2025-12-15T20:41Z 16.4K followers, [----] engagements
"There's a lot of important things But just to name the obvious ones: The goal in the end is 1) a pluggable standard either library or engine to slot in any LM API or local LM and run it as an RLM (also manage the sandbox handle the LM calls so they're async etc.) and 2) RLMs lend themselves well as a new axis of scaling reasoning. We want to standardize our design of (1) so the input/output pairs that an LM sees when used as an RLM are predictable so we can apply a lot of the recent training techniques like Quiet-STaR to train models that behave efficiently as RLMs"
X Link 2025-12-15T20:55Z 16.4K followers, [---] engagements
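One way such a pluggable layer could look, sketched under assumptions (this is not the repo's actual interface): any provider that satisfies a small client protocol can be dropped into the same scaffold, whether it wraps a cloud API or a local server.

```python
from typing import Protocol

class LMClient(Protocol):
    def complete(self, prompt: str) -> str: ...

class EchoClient:
    """Trivial local stand-in, used only to show the shape of the interface."""
    def complete(self, prompt: str) -> str:
        return prompt[:100]

def run_as_rlm(client: LMClient, prompt: str) -> str:
    # A real scaffold would also set up the REPL, sandboxing, and async sub-calls.
    return client.complete(f"(root call over {len(prompt)} chars) {prompt[:500]}")

print(run_as_rlm(EchoClient(), "some very long input " * 1_000))
```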
"We experiment on several different tasks of varying levels of complexity with one closed and one open LLM. We apply well-known task agnostic baselines including context compaction (summarization) CodeAct with a retriever and of course the base LLM itself. Across all tasks and a closed and open frontier model RLMs outperform other methods at a cheaper or comparable cost"
X Link 2026-01-02T21:14Z 22.8K followers, 14.5K engagements
"We hypothesize that model performance degrades on long context tasks at different rates depending on the complexity of the task. We scale GPT-5 and RLM(GPT-5) performance at input context lengths from 8K to 1M tokens on [--] tasks of increasing difficulty and show that the base model performance degrades pretty dramatically while RLMs maintain strong performance and can handle contexts well beyond the window of the base model. https://twitter.com/i/web/status/2007198924868608360 https://twitter.com/i/web/status/2007198924868608360"
X Link 2026-01-02T21:14Z 22.8K followers, 13.7K engagements
"We are excited by all the current interest and future work on RLMs including the recent blogpost by @PrimeIntellect and @omouamoua. This work would not be possible without support from the @LaudeInstitute and help from labmates @NoahZiems @jacobli99 and @nlp_mit Paper Link: https://arxiv.org/pdf/2512.24601 https://arxiv.org/pdf/2512.24601"
X Link 2026-01-02T21:14Z 19.9K followers, [----] engagements
"on the naming choice of RLMs its fair to say that RLMs arent a new architecture / new NN so the naming is confusing. it was a point of discussion with my advisor @lateinteraction when we were exploring this idea a few months back. I think though that what we think of as a model can be reframed beyond just a NN ultimately a big goal and motivation for the naming is to challenge the idea that a Transformer / NN is the lowest level abstraction of a language model. LMs are great & have gotten us very far but were getting to the regime where theyre very expensive to improve. increasing physical"
X Link 2026-01-02T23:36Z 19.8K followers, 16.5K engagements
"Sorry that line is meant to read as that long = 100M characters. It cant be changed because thats what was used for the experiments but by all means for any future experiments this ambiguity can be removed. The input context variable can be just a string it can be pre-chunked and it can also be an arbitrary object (e.g. a list of strings images etc.). The idea is that for RLMs you just dump inputs and it should be able to handle them appropriately. https://twitter.com/i/web/status/2007488143847542792 https://twitter.com/i/web/status/2007488143847542792"
X Link 2026-01-03T16:24Z 19.7K followers, [----] engagements
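A small illustration of that flexibility, with a hypothetical helper not taken from the repo: the context handle passed to the environment can be a raw string, a pre-chunked list, or an arbitrary object, and the RLM inspects whatever it receives with code.

```python
# Hypothetical helper (not from the repo) used only to make the point concrete.
def make_env(context) -> dict:
    return {"prompt": context}

env_a = make_env("one enormous string ... " * 1_000)                    # raw string
env_b = make_env(["chunk 1 ...", "chunk 2 ...", "chunk 3 ..."])          # pre-chunked list
env_c = make_env({"pages": ["..."], "images": ["img_001.png"]})          # arbitrary object

for env in (env_a, env_b, env_c):
    print(type(env["prompt"]).__name__)   # str, list, dict -- the RLM inspects each with code
```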
"RLMs natively handle prompts well beyond the context window of the base model. There are additional benefits w.r.t. context rot and raw performance (which is where future work will see the biggest gains) but out the box the immediate benefit is just handling huge contexts almost for free. https://twitter.com/i/web/status/2007489143530561719 https://twitter.com/i/web/status/2007489143530561719"
X Link 2026-01-03T16:28Z 21.1K followers, [---] engagements
"I was considering waiting a while to polish this first but decided it'd be better to just release an initial version to get better community feedback and squash bugs This is the official RLM repo with native support for cloud-based and local REPLs. https://github.com/alexzhang13/rlm https://github.com/alexzhang13/rlm"
X Link 2026-01-03T21:35Z 23K followers, 119.6K engagements
"@lakshyaag feel free to make any PRs if you figure out something cool I also should add some more robust workflows for public PRs. but will do this later for now will accept / review jank PRs"
X Link 2026-01-03T21:46Z 19.6K followers, [----] engagements
"@mitch_troy Yeah we could do this (and also explicitly implement this) for people to play with. Let me see if I can whip something up quickly as a separate type of sandbox"
X Link 2026-01-03T22:21Z 19.6K followers, [----] engagements
"Im also going to be unofficially answering questions people have on alphaXiv over the course of the next few weeks so please drop and questions or comments on the paper you have directly there :) "Recursive Language Models" A potentially big direction for LLMs in [----] from MIT researchers In their approach a prompt isnt run directly instead its stored as a variable in an external Python REPL and the language model writes code to inspect/slice/decompose that long https://t.co/AFhRb8sd3z "Recursive Language Models" A potentially big direction for LLMs in [----] from MIT researchers In their"
X Link 2026-01-03T22:54Z 19.8K followers, 35.3K engagements
"@lateinteraction genuinely so much more coming from this lab π [----] is not ready for MIT OASYS"
X Link 2026-01-04T05:06Z 20.9K followers, [----] engagements
"@Devarsh786 This is likely one of the biggest problems to tackle with multi-LM call systems. Like most systems problems I suspect a lot of these kinds of issues can be solved with asynchrony / clever pipelining of LM calls but its WIP and something I want to fix in the OSS infefence code"
X Link 2026-01-04T13:12Z 19.7K followers, [---] engagements
"I also want to point out that we did add one of these "things that failed" sections to the Appendix inspired by the YOLOv3 paper More of a future Q but for writing papers would ppl prefer I make it long and detailed or just short and straight to the point Much like the switch in [----] from language models to reasoning models we think [----] will be all about the switch to Recursive Language Models (RLMs). It turns out that models can be far more powerful if you allow them to treat their own prompts as an object in an external https://t.co/6jCyZiLeQl Much like the switch in [----] from language"
X Link 2026-01-04T17:30Z 19.9K followers, 12.6K engagements
"Completely fair point def did not mean to make it clickbait haha IMO from my view "Recursive LM" sounds like an LM that is calling an LM rather than an in-architecture component (e.g. in AlphaFold3 they do what you're likely thinking of as an architecture change and call it "recycling"). I think overall LLM + scaffold = new model likely isn't going to catch on it's just in this specific instance the word "recursion" implies something else. https://twitter.com/i/web/status/2007953477318988066 https://twitter.com/i/web/status/2007953477318988066"
X Link 2026-01-04T23:13Z 19.7K followers, [---] engagements
"For those interested in making OSS contributions to the RLM repo I've added a bunch of random thoughts and TODOs of what to add in a messy Markdown file on the GH repo. Feel free to tackle any of them or any other things you think are meaningful. I'll be pretty active here or on the repo. Once I finish some other related work I might open up a Discord channel or something for people who want to make longer standing contributions to the repo / discuss the direction of where to take it. Cheers https://github.com/alexzhang13/rlm/blob/main/CONTRIBUTING.md"
X Link 2026-01-05T00:11Z 19.9K followers, 16.6K engagements
"@_serinuntius Merged :) Very embarrassing on my part to not have caught that hopefully most people who forked fixed it on their own"
X Link 2026-01-07T02:00Z 19.8K followers, [--] engagements
"@alex__mackenzie Do you have an open repo for this UI I quite like it haha"
X Link 2026-01-07T19:51Z 19.8K followers, [----] engagements
"I'm still unsure about what exactly the UI for an RLM trajectory should look like. I want it to be extremely easy for someone to just look at (ideally with no clicking) and just see what the model is doing. The hard part is for visualizing sub-calls (especially if those sub-calls are RLMs and not LMs) but lmk if you have thoughts on this https://twitter.com/i/web/status/2008991712304136275 https://twitter.com/i/web/status/2008991712304136275"
X Link 2026-01-07T19:58Z 22K followers, [---] engagements
"@SwishMoe @lateinteraction Super cool This is also an open issue on the RLM repo to add multimodal support if youre interested :)"
X Link 2026-01-12T00:45Z 21.4K followers, [----] engagements
"I suppose I work at DeepMind now sorry @lateinteraction DeepMind just did the unthinkable. They built an AI that doesn't need RAG and it has perfect memory of everything it's ever read. It's called Recursive Language Models and it might mark the death of traditional context windows forever. Here's how it works (and why it matters https://t.co/mWc5EpKe59 DeepMind just did the unthinkable. They built an AI that doesn't need RAG and it has perfect memory of everything it's ever read. It's called Recursive Language Models and it might mark the death of traditional context windows forever. Here's"
X Link 2026-01-12T16:50Z 23K followers, 120.1K engagements
"im sorry guys for messing up your feeds New paper dropped by Anthropic: "Fractal Language Models" It DESTROYS the context window narrative. The LLM doesn't just respond it splits into self similar copies No tokens but models arguing compressing until the prompt is not read but self reconstructed /satire @a1zhang https://t.co/qSIzdvynw4 New paper dropped by Anthropic: "Fractal Language Models" It DESTROYS the context window narrative. The LLM doesn't just respond it splits into self similar copies No tokens but models arguing compressing until the prompt is not read but self reconstructed"
X Link 2026-01-18T19:43Z 23.1K followers, 72.8K engagements
"@yoavgo sadly incentives"
X Link 2026-01-19T00:28Z 22.8K followers, [----] engagements
"Be sure to check this out RLMs + DSPy :p The dspy.RLM module is now released π Install DSPy 3.1.2 to try it. Usage is plug-and-play with your existing Signatures. A little example of it helping @lateinteraction and I figure out some scattered backlogs: https://t.co/Avgx04sNJP The dspy.RLM module is now released π Install DSPy 3.1.2 to try it. Usage is plug-and-play with your existing Signatures. A little example of it helping @lateinteraction and I figure out some scattered backlogs: https://t.co/Avgx04sNJP"
X Link 2026-01-19T22:33Z 23.2K followers, 22.6K engagements
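A hedged usage sketch follows. The exact dspy.RLM signature isn't shown in the post, so this assumes it follows the usual DSPy module convention (a signature string in, a Prediction out), as "plug-and-play with your existing Signatures" suggests; the model name and field names are placeholders.

```python
import dspy

dspy.configure(lm=dspy.LM("openai/gpt-5-mini"))       # any provider DSPy supports

# Assumed calling convention, mirroring modules like dspy.ChainOfThought.
rlm = dspy.RLM("context, question -> answer")
long_context = "\n".join(f"[{i:05d}] request ok" for i in range(50_000))
pred = rlm(context=long_context, question="How many log lines are there?")
print(pred.answer)
```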
"yep; but what about conditional parallel tool calling or doing tool / sub calls only if the entries you're looking at contain some special property we could continue adding specialized functionality to serve each of these purposes or just let the model write the code to do this"
X Link 2026-01-22T21:53Z 23K followers, [---] engagements
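Concretely, this is the kind of code the model itself could write inside the REPL instead of relying on a purpose-built tool (illustrative; the `llm` sub-call and the `entries` data are stand-ins): filter by the special property, then fan out sub-calls in parallel only for the matches.

```python
from concurrent.futures import ThreadPoolExecutor

def llm(text: str) -> str:                       # stand-in sub-call for the sketch
    return f"summary of {len(text)} chars"

entries = [{"id": i, "flagged": i % 7 == 0, "body": "..." * i} for i in range(100)]
flagged = [e for e in entries if e["flagged"]]   # the "special property" filter

with ThreadPoolExecutor(max_workers=8) as pool:  # parallel sub-calls, only where needed
    summaries = list(pool.map(lambda e: llm(e["body"]), flagged))

print(len(summaries), "conditional sub-calls issued out of", len(entries), "entries")
```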
"does anyone have any favorite blogs / papers that goes into detail about successfully FT / RLing a model on a domain-specific task like specifically just a small task nothing super fancy that requires an exorbitant amount of data"
X Link 2026-01-09T21:46Z 23.3K followers, 16.6K engagements
"guys ok weve really lost the plot here LOL looks like Ive left DeepMind and returned to MIT but now Ive changed the name to Recursive Meta Cognition R.I.P. basic prompting. MIT just dropped a technique that makes ChatGPT reason like a team of experts instead of one overconfident intern. Its called Recursive Meta-Cognition and it outperforms standard prompts by 110%. Heres the prompt (and why this changes everything) π https://t.co/caDTB52TS9 R.I.P. basic prompting. MIT just dropped a technique that makes ChatGPT reason like a team of experts instead of one overconfident intern. Its called"
X Link 2026-01-16T01:21Z 29.2K followers, 112.4K engagements
"Fundamentally what really is the difference between an RLM and S=context folding Codex Claude Code Terminus agents etc. This is the last and most important RLM post I'll make for a while to finally answer all the "this is trivially obvious" from HackerNews Reddit X etc. I know there's a lot of noise rn but this is the one thread I'd rly ask you not to skip For a while I didn't have a super clear answer to this. and no it's NOT that: [--]. CC sub-agents are user-defined while the LM defines the sub-agent in RLMs. this is a minor difference that I suspect Anthropic will phase out at some point 2."
X Link 2026-01-22T14:00Z 23.5K followers, 49.4K engagements
"This will be my second time coming on alphaXiv for RLMs (first time being after the blogpost) so like last time I'll try to answer everything / have good unfiltered conversation What if LLMs could reason over arbitrarily long inputs Join us tomorrow for our AI4Science talk with @a1zhang on Recursive Language Models (RLMs) a new inference time strategy that scales far beyond traditional context windows Register in the link below π https://t.co/gpI56SrDus What if LLMs could reason over arbitrarily long inputs Join us tomorrow for our AI4Science talk with @a1zhang on Recursive Language Models"
X Link 2026-01-23T00:22Z 23.3K followers, 17.7K engagements
"I've been super excited to get Daytona sandboxes support integrated into the RLM repository (which now works in the official repo). Check out this awesome guide on RLMs with depth1 with code examples from @daytonaio https://www.daytona.io/docs/en/recursive-language-models/ We just dropped our guide to Recursive Language Models. Where every agent and sub-agent gets its own @daytonaio sandbox - at UNLIMITED recursion depth π€― https://t.co/vFGNS9gzkc https://www.daytona.io/docs/en/recursive-language-models/ We just dropped our guide to Recursive Language Models. Where every agent and sub-agent"
X Link 2026-01-26T16:13Z 23.4K followers, 13.7K engagements
"We have lots of plans in [----] for GPU MODE please read if you're interested in joining and/or helping us out :) GPU MODE 2026: were post-training Kernel LLMs in public and are building all the infra we need to make GPU programming more accessible to all. We're doing this in close collaboration with some of my favorite communities @PrimeIntellect @modal and @LambdaAPI [----] recap: 26K GPU MODE 2026: were post-training Kernel LLMs in public and are building all the infra we need to make GPU programming more accessible to all. We're doing this in close collaboration with some of my favorite"
X Link 2026-01-26T16:52Z 23.4K followers, [----] engagements
"Also most people won't read this but Mark has a much nicer and expanded version of his tweet linked (also can go to GPU MODE website and look in news tab) https://t.co/2jC070VX9U https://t.co/2jC070VX9U"
X Link 2026-01-26T20:03Z 23.4K followers, [----] engagements
"@raw_works Oh cool I can play around with this on some evals of my own Did you put up a PR for this"
X Link 2026-01-28T17:40Z 23.5K followers, [----] engagements
"PSA I've left MIT for Harvard now π¨ Your AI is lying to you with complete confidence. Harvard & MIT just proved ChatGPT hallucinates 110% less when you force it to argue with itself. The technique is called "Recursive Meta-Cognition" and it's embarrassingly simple. Here's how to make AI actually think: https://t.co/oXKqvcEpbw π¨ Your AI is lying to you with complete confidence. Harvard & MIT just proved ChatGPT hallucinates 110% less when you force it to argue with itself. The technique is called "Recursive Meta-Cognition" and it's embarrassingly simple. Here's how to make AI actually think:"
X Link 2026-01-28T17:44Z 23.3K followers, 34.3K engagements
"@irl_danB @lateinteraction Ah yes the visualizer can be found in the codebase. I am in the process of figuring out the best way to upload these trajectories / uniformly format them but I have a sneak peak up: https://alexzhang13.github.io/rlm-examples/ https://alexzhang13.github.io/rlm-examples/"
X Link 2026-01-29T20:04Z 23.3K followers, [---] engagements