
# ![@karpathy Avatar](https://lunarcrush.com/gi/w:26/cr:twitter::33836629.png) @karpathy Andrej Karpathy

Andrej Karpathy posts on X most often about llm, ai, "if you", and "all the". They currently have XXXXXXXXX followers and XXX posts still getting attention, totaling XXXXXXX engagements in the last XX hours.

### Engagements: XXXXXXX [#](/creator/twitter::33836629/interactions)
![Engagements Line Chart](https://lunarcrush.com/gi/w:600/cr:twitter::33836629/c:line/m:interactions.svg)

- X Week XXXXXXXXX +614%
- X Month XXXXXXXXXX +81%
- X Months XXXXXXXXXXX +55%
- X Year XXXXXXXXXXX +0.50%

### Mentions: XX [#](/creator/twitter::33836629/posts_active)
![Mentions Line Chart](https://lunarcrush.com/gi/w:600/cr:twitter::33836629/c:line/m:posts_active.svg)

- X Week XXX +7.50%
- X Month XXX +8.70%
- X Months XXX +18%
- X Year XXX +19%

### Followers: XXXXXXXXX [#](/creator/twitter::33836629/followers)
![Followers Line Chart](https://lunarcrush.com/gi/w:600/cr:twitter::33836629/c:line/m:followers.svg)

- X Week XXXXXXXXX +0.86%
- X Month XXXXXXXXX +3.50%
- X Months XXXXXXXXX +17%
- X Year XXXXXXXXX +37%

### CreatorRank: XXXXXX [#](/creator/twitter::33836629/influencer_rank)
![CreatorRank Line Chart](https://lunarcrush.com/gi/w:600/cr:twitter::33836629/c:line/m:influencer_rank.svg)

### Social Influence

**Social category influence**
[technology brands](/list/technology-brands) #2371, [social networks](/list/social-networks) XXXX%, [finance](/list/finance) XXXX%, [stocks](/list/stocks) #430, [nfts](/list/nfts) #145, [automotive brands](/list/automotive-brands) XXXX%

**Social topic influence**
[llm](/topic/llm) #1, [ai](/topic/ai) 8.74%, [if you](/topic/if-you) #5656, [all the](/topic/all-the) #382, [karpathy](/topic/karpathy) #1, [to the](/topic/to-the) 3.88%, [nano banana](/topic/nano-banana) #96, [imo](/topic/imo) #65, [banana](/topic/banana) #359, [open ai](/topic/open-ai) #415

**Top accounts mentioned or mentioned by**
@grok, @flolight44, @billstenner7, @johntheadman_, @elonmusk, @dataexec, @adamskyart, @mixedrealityman, @yuchenj_uw, @gp_pulipaka, @cryptosausage, @spil____, @adarkm0ment, @danadvantage, @marswalkerr, @mohamedatta_911, @adelayida210519, @_thomasip, @jasonth0, @rileyralmuto

**Top assets mentioned**
[Doodles (doodles)](/topic/doodles), [Alphabet Inc Class A (GOOGL)](/topic/$googl), [Tesla, Inc. (TSLA)](/topic/tesla)

### Top Social Posts
Top posts by engagements in the last XX hours

"Finally had a chance to listen through this pod with Sutton which was interesting and amusing. As background Sutton's "The Bitter Lesson" has become a bit of biblical text in frontier LLM circles. Researchers routinely talk about and ask whether this or that approach or idea is sufficiently "bitter lesson pilled" (meaning arranged so that it benefits from added computation for free) as a proxy for whether it's going to work or worth even pursuing. The underlying assumption being that LLMs are of course highly "bitter lesson pilled" indeed just look at LLM scaling laws where if you put compute"  
[X Link](https://x.com/karpathy/status/1973435013875314729)  2025-10-01T17:09Z 1.5M followers, 2M engagements


"@zenitsu_aprntc Good question it's basically entirely hand-written (with tab autocomplete). I tried to use claude/codex agents a few times but they just didn't work well enough at all and net unhelpful possibly the repo is too far off the data distribution"  
[X Link](https://x.com/karpathy/status/1977758204139331904)  2025-10-13T15:27Z 1.5M followers, 484.3K engagements


"Deliberately*"  
[X Link](https://x.com/karpathy/status/1978654822036607245)  2025-10-16T02:50Z 1.5M followers, 146.3K engagements


"@proggineer Agree sometimes that is helpful too to have an overview of what the whole thing is about first. I just copy paste stuff around to LLM of the day (I cycle) theres no tool"  
[X Link](https://x.com/karpathy/status/1990580578287300816)  2025-11-18T00:39Z 1.5M followers, 59.9K engagements


"Has anyone encountered a good definition of slop. In a quantitative measurable sense. My brain has an intuitive slop index I can reliably estimate but Im not sure how to define it. I have some bad ideas that involve the use of LLM miniseries and thinking token budgets"  
[X Link](https://x.com/karpathy/status/1992053281900941549)  2025-11-22T02:11Z 1.5M followers, 633K engagements


"@_thomasip haha yes it makes mistakes You have to re-roll a few times until it's right. Sometimes it gets stuck in loops and you have to re-start in a new conversation. Example re-roll:"  
[X Link](https://x.com/karpathy/status/1992715615988220285)  2025-11-23T22:03Z 1.5M followers, 23.7K engagements


"@theJayAlto Edutainment. This one weird trick to consume entertainment and feel good about it"  
[X Link](https://x.com/karpathy/status/1994818887591055480)  2025-11-29T17:20Z 1.5M followers, 162.9K engagements


"@nickcammarata Your recent posts on this remind me of this Arnold gem +100 though. I finally had a chance to install a home gym recently making it trivial to use daily. Always looking forward to the next exercise high. Slightly miss the social/entropy aspects of gyms"  
[X Link](https://x.com/karpathy/status/1996654385003425949)  2025-12-04T18:54Z 1.5M followers, 184.7K engagements


"reminded of this paragraph from gsm8k paper 2021 :)"  
[X Link](https://x.com/karpathy/status/1966896849929073106)  2025-09-13T16:08Z 1.5M followers, 378.9K engagements


"@anneshu_nag I love that it's called Nano Banana instead of Google Imagine or some other corpo thing"  
[X Link](https://x.com/karpathy/status/1992712082903781448)  2025-11-23T21:49Z 1.5M followers, 20.5K engagements


"I've had medium success asking LLMs if a thing exists it works out of the box for some of the more well-known things (e.g. both GPT XXX and Gemini X know about this function if you describe the tensor transformation in words). For more esoteric or new libraries (e.g. uv being a recent example) I've had more success manually packaging up docs into markdown and including it as context for questions. PyTorch docs now also seem to have an "Ask AI" that presumably does RAG over their docs but the model is not that bright. But you'd have to suspect it might exist and attempt to ask in the first"  
[X Link](https://x.com/karpathy/status/1993369287969718540)  2025-11-25T17:20Z 1.5M followers, 96.6K engagements
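To make the "manually package docs as context" workflow from the post concrete, here is a minimal sketch; the `pack_docs` helper, directory layout, and character budget are all illustrative assumptions, not anything from the post:

```python
from pathlib import Path

def pack_docs(doc_dir: str, question: str, budget_chars: int = 80_000) -> str:
    """Concatenate a library's markdown docs into one prompt (illustrative)."""
    parts = []
    for path in sorted(Path(doc_dir).glob("**/*.md")):
        parts.append(f"## {path.name}\n{path.read_text(encoding='utf-8')}")
    context = "\n\n".join(parts)[:budget_chars]  # crude truncation to fit a context window
    return f"Documentation:\n\n{context}\n\nUsing only the docs above: {question}"

prompt = pack_docs("uv/docs", "Is there a built-in way to pin a Python version?")
```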


"Don't think of LLMs as entities but as simulators. For example when exploring a topic don't ask: "What do you think about xyz" There is no "you". Next time try: "What would be a good group of people to explore xyz What would they say" The LLM can channel/simulate many perspectives but it hasn't "thought about" xyz for a while and over time and formed its own opinions in the way we're used to. If you force it via the use of "you" it will give you something by adopting a personality embedding vector implied by the statistics of its finetuning data and then simulate that. It's fine to do but"  
[X Link](https://x.com/karpathy/status/1997731268969304070)  2025-12-07T18:13Z 1.5M followers, 3.6M engagements
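One concrete way to apply the "simulator, not entity" framing above; the helper below is a hypothetical sketch of such a prompt, not an established technique:

```python
def council_prompt(topic: str, n: int = 4) -> str:
    # Instead of "what do YOU think about xyz", ask the model to simulate
    # a group of distinct perspectives and have them talk it out.
    return (
        f"Propose {n} people (real or archetypal) with sharp, distinct views on: {topic}. "
        "Write a short round-table where each states a position and pushes back on "
        "one other member, then summarize the points of agreement and disagreement."
    )

print(council_prompt("whether tokenization can be removed from LLMs"))
```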


"We will see that a lot of weird behaviors and problems of LLMs actually trace back to tokenization. We'll go through a number of these issues discuss why tokenization is at fault and why someone out there ideally finds a way to delete this stage entirely"  
[X Link](https://x.com/karpathy/status/1759996551378940395)  2024-02-20T17:40Z 1.5M followers, 753.3K engagements
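A quick way to see the kind of quirks the video covers, using the tiktoken library's GPT-2 encoding (any BPE tokenizer shows the same effect):

```python
import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("gpt2")
# The "same" word gets different token ids depending on spacing and casing,
# which is one source of the weird behaviors mentioned above.
for s in ["egg", " egg", "Egg", " EGG"]:
    print(repr(s), "->", enc.encode(s))
# Character-level tasks are hard for LLMs: they see token ids, not letters.
print("strawberry ->", enc.encode("strawberry"))
```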


"@matejhladky_dev AI has crushed it since this post way beyond expectation. I made the same category of mistake all of AI was making of thinking we have to discover and write the algorithm. You don't. You pretrain and then finetune a BIG neural network on lots of tasks and it just falls out. lol"  
[X Link](https://x.com/karpathy/status/1993372017593335995)  2025-11-25T17:31Z 1.5M followers, 90.8K engagements


"# on shortification of "learning" There are a lot of videos on YouTube/TikTok etc. that give the appearance of education but if you look closely they are really just entertainment. This is very convenient for everyone involved : the people watching enjoy thinking they are learning (but actually they are just having fun). The people creating this content also enjoy it because fun has a much larger audience fame and revenue. But as far as learning goes this is a trap. This content is an epsilon away from watching the Bachelorette. It's like snacking on those "Garden Veggie Straws" which feel"  
[X Link](https://x.com/karpathy/status/1756380066580455557)  2024-02-10T18:10Z 1.5M followers, 2.2M engagements


"⚡ Excited to share that I am starting an AI+Education company called Eureka Labs. The announcement: --- We are Eureka Labs and we are building a new kind of school that is AI native. How can we approach an ideal experience for learning something new For example in the case of physics one could imagine working through very high quality course materials together with Feynman who is there to guide you every step of the way. Unfortunately subject matter experts who are deeply passionate great at teaching infinitely patient and fluent in all of the world's languages are also very scarce and cannot"  
[X Link](https://x.com/karpathy/status/1813263734707790301)  2024-07-16T17:25Z 1.5M followers, 2.5M engagements


"DeepSeek (Chinese AI co) making it look easy today with an open weights release of a frontier-grade LLM trained on a joke of a budget (2048 GPUs for X months $6M). For reference this level of capability is supposed to require clusters of closer to 16K GPUs the ones being brought up today are more around 100K GPUs. E.g. Llama X 405B used 30.8M GPU-hours while DeepSeek-V3 looks to be a stronger model at only 2.8M GPU-hours (11X less compute). If the model also passes vibe checks (e.g. LLM arena rankings are ongoing my few quick tests went well so far) it will be a highly impressive display of"  
[X Link](https://x.com/karpathy/status/1872362712958906460)  2024-12-26T19:23Z 1.5M followers, 6.5M engagements


"Continuing the journey of optimal LLM-assisted coding experience. In particular I find that instead of narrowing in on a perfect one thing my usage is increasingly diversifying across a few workflows that I "stitch up" the pros/cons of: Personally the bread & butter (75%) of my LLM assistance continues to be just (Cursor) tab complete. This is because I find that writing concrete chunks of code/comments myself and in the right part of the code is a high bandwidth way of communicating "task specification" to the LLM i.e. it's primarily about task specification bits - it takes too many bits and"  
[X Link](https://x.com/karpathy/status/1959703967694545296)  2025-08-24T19:46Z 1.5M followers, 687.8K engagements


"I think congrats again to OpenAI for cooking with GPT-5 Pro. This is the third time I've struggled on something complex/gnarly for an hour on and off with CC then X Pro goes off for XX minutes and comes back with code that works out of the box. I had CC read the X Pro version and it wrote up X paragraphs admiring it (very wholesome). If you're not giving it your hardest problems you're probably missing out"  
[X Link](https://x.com/karpathy/status/1964020416139448359)  2025-09-05T17:38Z 1.5M followers, 2.6M engagements


"nanochat now has a primordial identity and can talk a bit about itself and its capabilities (e.g. it knows it's nanochat d32 that cost $XXX that it was built by me that it can't speak languages other than English too well and why etc.). This kind of customization is all done through synthetic data generation and I uploaded a new example script to demonstrate. It's a bit subtle but by default LLMs have no inherent personality or any understanding of their own capabilities because they are not animal-like entities. They don't know what they are or what they can or can't do or know or don't"  
[X Link](https://x.com/karpathy/status/1980665134415802554)  2025-10-21T15:59Z 1.5M followers, 455.5K engagements
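The actual example script lives in the nanochat repo; as a hedged illustration of the general shape (the fact sheet and templates below are made up), identity via synthetic data can be as simple as sampling Q/A pairs over a small set of facts and emitting chat-format examples for midtraining/SFT:

```python
import json
import random

FACTS = {"name": "nanochat d32", "language": "mostly English"}
TEMPLATES = [
    ("What are you?", "I'm {name}, a small from-scratch chat model."),
    ("What languages do you speak?", "I handle {language}; other languages are shaky."),
]

def make_example(rng: random.Random) -> dict:
    question, answer = rng.choice(TEMPLATES)
    return {"messages": [
        {"role": "user", "content": question},
        {"role": "assistant", "content": answer.format(**FACTS)},
    ]}

rng = random.Random(0)
with open("identity_sft.jsonl", "w") as f:
    for _ in range(1000):
        f.write(json.dumps(make_example(rng)) + "\n")
```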


"@LucasAtkins7 This code is extremely dangerous. Here I improved it"  
[X Link](https://x.com/karpathy/status/1981009115523789169)  2025-10-22T14:45Z 1.5M followers, 1.5M engagements


"Sharing an interesting recent conversation on AI's impact on the economy. AI has been compared to various historical precedents: electricity industrial revolution etc. I think the strongest analogy is that of AI as a new computing paradigm (Software 2.0) because both are fundamentally about the automation of digital information processing. If you were to forecast the impact of computing on the job market in 1980s the most predictive feature of a task/job you'd look at is to what extent the algorithm of it is fixed i.e. are you just mechanically transforming information according to rote easy"  
[X Link](https://x.com/karpathy/status/1990116666194456651)  2025-11-16T17:56Z 1.5M followers, 2.1M engagements


"Finally had time to read & process this great post. I run into the pattern quite often it goes: "something that sounds wrong is good actually because galaxy brain reason" Galaxy brain reasoning is the best way to justify anything while looking / feeling good about it. From this perspective for example there's deeper wisdom in the Ten Commandments imposing constraints over actions instead of utility over states. It's not Ten Objectives. E.g. they don't attempt to define a utility function for the value of life they simply say "Thou shalt not kill". This approach curtails the relatively"  
[X Link](https://x.com/karpathy/status/1990494327936885192)  2025-11-17T18:56Z 1.5M followers, 839.5K engagements


"Im starting to get into a habit of reading everything (blogs articles book chapters) with LLMs. Usually pass X is manual then pass X explain/summarize pass X Q&A. I usually end up with a better/deeper understanding than if I moved on. Growing to among top use cases. On the flip side if youre a writer trying to explain/communicate something we may increasingly see less of a mindset of Im writing this for another human and more Im writing this for an LLM. Because once an LLM gets it it can then target personalize and serve the idea to its user"  
[X Link](https://x.com/karpathy/status/1990577951671509438)  2025-11-18T00:29Z 1.5M followers, 2.8M engagements


"I put up a simple repo I call reader3 (it's my 3rd version.) to illustrate how I read EPUBs with LLMs. Basically get some epub (e.g. Project Gutenberg is great) go chapter by chapter and with this you can easily copy paste text to your favorite LLM"  
[X Link](https://x.com/karpathy/status/1990612045700739548)  2025-11-18T02:44Z 1.5M followers, 292.9K engagements
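reader3 itself is the reference; but since an EPUB is just a zip of XHTML files, a crude stdlib-only sketch of the chapter-by-chapter extraction (my code, not the repo's) looks like:

```python
import re
import zipfile

def chapters(epub_path: str):
    """Yield (filename, plain text) for each chapter-ish XHTML file in an EPUB."""
    with zipfile.ZipFile(epub_path) as z:
        for name in sorted(z.namelist()):
            if name.endswith((".xhtml", ".html", ".htm")):
                html = z.read(name).decode("utf-8", errors="ignore")
                text = re.sub(r"<[^>]+>", " ", html)   # crude tag stripping
                text = re.sub(r"\s+", " ", text).strip()
                if text:
                    yield name, text

for name, text in chapters("pride_and_prejudice.epub"):
    print(name, text[:120])  # paste a chapter at a time into your LLM of choice
```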


"I played with Gemini X yesterday via early access. Few thoughts - First I usually urge caution with public benchmarks because imo they can be quite possible to game. It comes down to discipline and self-restraint of the team (who is meanwhile strongly incentivized otherwise) to not overfit test sets via elaborate gymnastics over test-set adjacent data in the document embedding space. Realistically because everyone else is doing it the pressure to do so is high. Go talk to the model. Talk to the other models (Ride the LLM Cycle - use a different LLM every day). I had a positive early"  
[X Link](https://x.com/karpathy/status/1990854771058913347)  2025-11-18T18:49Z 1.5M followers, 1.2M engagements


"My most amusing interaction was where the model (I think I was given some earlier version with a stale system prompt) refused to believe me that it is 2025 and kept inventing reasons why I must be trying to trick it or playing some elaborate joke on it. I kept giving it images and articles from "the future" and it kept insisting it was all fake. It accused me of using generative AI to defeat its challenges and argued why real wikipedia entries were actually generated and what the "dead giveaways" are. It highlighted tiny details when I gave it Google Image Search results arguing why the"  
[X Link](https://x.com/karpathy/status/1990855382756164013)  2025-11-18T18:51Z 1.5M followers, 1M engagements


"Something I think people continue to have poor intuition for: The space of intelligences is large and animal intelligence (the only kind we've ever known) is only a single point arising from a very specific kind of optimization that is fundamentally distinct from that of our technology. Animal intelligence optimization pressure: - innate and continuous stream of consciousness of an embodied "self" a drive for homeostasis and self-preservation in a dangerous physical world. - thoroughly optimized for natural selection = strong innate drives for power-seeking status dominance reproduction. many"  
[X Link](https://x.com/karpathy/status/1991910395720925418)  2025-11-21T16:43Z 1.5M followers, 2.6M engagements


"@TheVixhal your post challenged me. every one of your points is wrong but i had to think about each for a while :)"  
[X Link](https://x.com/karpathy/status/1991923470868119995)  2025-11-21T17:35Z 1.5M followers, 1.8M engagements


"@NickADobos I dont super love shoggoth represented as a kind of biological monster (animal) it feels wrong/misleading in this sense"  
[X Link](https://x.com/karpathy/status/1992018398436524230)  2025-11-21T23:52Z 1.5M followers, 49.2K engagements


"Imo this is along the lines of how talking to an LLM via text is like typing into a DOS Terminal and "GUI hasn't been invented yet" of some of my earlier posts. The GUI is an intelligent canvas"  
[X Link](https://x.com/karpathy/status/1992657223785586864)  2025-11-23T18:11Z 1.5M followers, 252.1K engagements


"I asked it to create a personalized weekly workout plan and then posters that I can print on the wall to remind me what exercises to do each day. Tuesday looks more intense because I asked for "more testosterone" :D. (sorry I'll stop posting more nano banana pro stuff now)"  
[X Link](https://x.com/karpathy/status/1992711182537707990)  2025-11-23T21:45Z 1.5M followers, 385.9K engagements


"Happy weekend to those who celebrate"  
[X Link](https://x.com/karpathy/status/1997697581410062590)  2025-12-07T15:59Z 1.5M followers, 1.3M engagements


"@Marswalkerr I love this meme too haha"  
[X Link](https://x.com/karpathy/status/1997698794973176092)  2025-12-07T16:04Z 1.5M followers, 44.7K engagements


"There is definitely work going into engineering the "you" simulation - the personality that gets all the rewards in verifiable problems or all the upvotes from users/judge LLMs or mimics the responses of SFT and there is an emergent composite personality from that. My point is more that the "you" there is deliberately bolted on engineered and layered on what is fundamentally a token simulation engine not a mind that is somehow emergent and over time constructed in a relatable way to an average person talking to an AI. The story is a bit more simple in verifiable domains but I think more"  
[X Link](https://x.com/karpathy/status/1997759548543947249)  2025-12-07T20:06Z 1.5M followers, 126.7K engagements


"I could certainly imagine that "nesting" the simulation might be too "effortful" for the model compute or data density wise. My results with it are not too bad so imo it's at least worth people try / experiment with / think about. For example it might be useful to read multiple distinct and approximate perspectives on topic xyz instead of one. Research-wise you might be able to elicit LLM Council - like benefits (not via diverse LLMs but via diverse simulations) and improve performance via the generator-discriminator gap effects or ensembling effects"  
[X Link](https://x.com/karpathy/status/1998082007893717478)  2025-12-08T17:27Z 1.5M followers, 28.8K engagements


"ty to ericsilberstein1 on github for spotting the bug. (it's not a big bug and only comes up in the SpellingBee synthetic task evaluation but still)"  
[X Link](https://x.com/karpathy/status/1998240551964193148)  2025-12-09T03:57Z 1.5M followers, 119K engagements


"nanoGPT - the first LLM to train and inference in space 🥹. It begins"  
[X Link](https://x.com/karpathy/status/1998806260783919434)  2025-12-10T17:25Z 1.5M followers, 759.7K engagements


"The hottest new programming language is English"  
[X Link](https://x.com/karpathy/status/1617979122625712128)  2023-01-24T20:14Z 1.5M followers, 9M engagements


"# Reproduce GPT-2 (124M) in llm.c in XX minutes for $XX ✨ The GPT-2 (124M) is the smallest model in the GPT-2 series released by OpenAI in 2019 and is actually quite accessible today even for the GPU poor. For example with llm.c you can now reproduce this model on one 8X A100 80GB SXM node in XX minutes (at XX% MFU). As they run for $14/hr this is $XX. I also think the 124M model makes for an excellent "cramming" challenge for training it very fast. So here is the launch command: And here is the output after XX minutes training on 10B tokens of the FineWeb dataset: It feels really nice to"  
[X Link](https://x.com/karpathy/status/1795484547267834137)  2024-05-28T15:57Z 1.5M followers, 663.7K engagements


"There's a new kind of coding I call "vibe coding" where you fully give in to the vibes embrace exponentials and forget that the code even exists. It's possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good. Also I just talk to Composer with SuperWhisper so I barely even touch the keyboard. I ask for the dumbest things like "decrease the padding on the sidebar by half" because I'm too lazy to find it. I "Accept All" always I don't read the diffs anymore. When I get error messages I just copy paste them in with no comment usually that fixes it. The code grows beyond my"  
[X Link](https://x.com/karpathy/status/1886192184808149383)  2025-02-02T23:17Z 1.5M followers, 5.3M engagements


"I was given early access to Grok X earlier today making me I think one of the first few who could run a quick vibe check. Thinking ✅ First Grok X clearly has an around state of the art thinking model ("Think" button) and did great out of the box on my Settler's of Catan question: "Create a board game webpage showing a hex grid just like in the game Settlers of Catan. Each hex grid is numbered from 1.N where N is the total number of hex tiles. Make it generic so one can change the number of "rings" using a slider. For example in Catan the radius is X hexes. Single html page please." Few models"  
[X Link](https://x.com/karpathy/status/1891720635363254772)  2025-02-18T05:25Z 1.5M followers, 3.7M engagements


"Agency Intelligence I had this intuitively wrong for decades I think due to a pervasive cultural veneration of intelligence various entertainment/media obsession with IQ etc. Agency is significantly more powerful and significantly more scarce. Are you hiring for agency Are we educating for agency Are you acting as if you had 10X agency Grok explanation is close: Agency as a personality trait refers to an individual's capacity to take initiative make decisions and exert control over their actions and environment. Its about being proactive rather than reactivesomeone with high agency doesnt"  
[X Link](https://x.com/karpathy/status/1894099637218545984)  2025-02-24T18:58Z 1.5M followers, 7.3M engagements


""Finding the Best Sleep Tracker" Results of an experiment where I wore X sleep trackers every night for X months. TLDR Whoop = Oura 8Sleep Apple Watch + AutoSleep. Link simply right here instead of in a reply because ()/"  
[X Link](https://x.com/karpathy/status/1906386327190257963)  2025-03-30T16:41Z 1.5M followers, 1.6M engagements


"I attended a vibe coding hackathon recently and used the chance to build a web app (with auth payments deploy etc.). I tinker but I am not a web dev by background so besides the app I was very interested in what it's like to vibe code a full web app today. As such I wrote none of the code directly (Cursor+Claude/o3 did) and I don't really know how the app works in the conventional sense that I'm used to as an engineer. The app is called MenuGen and it is live on Basically I'm often confused about what all the things on a restaurant menu are - e.g. Pt Tagine Cavatappi or Sweetbread (hint it's."  
[X Link](https://x.com/karpathy/status/1917961248031080455)  2025-05-01T15:16Z 1.5M followers, 785.3K engagements


"An attempt to explain (current) ChatGPT versions. I still run into many many people who don't know that: - o3 is the obvious best thing for important/hard things. It is a reasoning model that is much stronger than 4o and if you are using ChatGPT professionally and not using o3 you're ngmi. - 4o is different from o4. Yes I know lol. 4o is a good "daily driver" for many easy-medium questions. o4 is only available as mini for now and is not as good as o3 and I'm not super sure why it's out right now. Example basic "router" in my own personal use: - Any simple query (e.g. "what foods are high in"  
[X Link](https://x.com/karpathy/status/1929597620969951434)  2025-06-02T17:54Z 1.5M followers, 1.4M engagements


"My sleep scores during recent travel were in the 90s. Now back in SF I am consistently back down to 70s 80s. I am increasingly convinced that this is due to traffic noise from a nearby road/intersection where I live - every 10min a car truck bus or motorcycle with a very loud engine passes by (some are 10X louder than others). In the later less deep stages of sleep it is much easier to wake and then much harder to go back to sleep. More generally I think noise pollution (esp early hours) come at a huge societal cost that is not correctly accounted for. E.g. I wouldn't be too surprised if a"  
[X Link](https://x.com/karpathy/status/1931426322536132767)  2025-06-07T19:01Z 1.5M followers, 1.5M engagements


"The race for LLM "cognitive core" - a few billion param model that maximally sacrifices encyclopedic knowledge for capability. It lives always-on and by default on every computer as the kernel of LLM personal computing. Its features are slowly crystalizing: - Natively multimodal text/vision/audio at both input and output. - Matryoshka-style architecture allowing a dial of capability up and down at test time. - Reasoning also with a dial. (system 2) - Aggressively tool-using. - On-device finetuning LoRA slots for test-time training personalization and customization. - Delegates and double"  
[X Link](https://x.com/karpathy/status/1938626382248149433)  2025-06-27T15:52Z 1.5M followers, 1.3M engagements
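Nothing like this exists as a product; purely to make the post's list of dials concrete, a hypothetical config sketch (every field name here is invented) could look like:

```python
from dataclasses import dataclass

@dataclass
class CognitiveCoreConfig:          # hypothetical, mirrors the post's feature list
    params_billions: float = 3.0    # small: capability over encyclopedic knowledge
    modalities: tuple = ("text", "vision", "audio")  # native input and output
    matryoshka_level: int = 2       # capability dial, adjustable at test time
    reasoning_effort: int = 1       # system-2 dial, 0 disables deliberate reasoning
    tool_use: bool = True           # aggressively tool-using
    lora_slots: int = 4             # on-device finetuning / personalization slots
    delegate_to_cloud: bool = True  # double-check hard queries with a big model
```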


"I often rant about how XX% of attention is about to be LLM attention instead of human attention. What does a research paper look like for an LLM instead of a human Its definitely not a pdf. There is huge space for an extremely valuable research app that figures this out"  
[X Link](https://x.com/karpathy/status/1943411187296686448)  2025-07-10T20:45Z 1.5M followers, 905.6K engagements


"Scaling up RL is all the rage right now I had a chat with a friend about it yesterday. I'm fairly certain RL will continue to yield more intermediate gains but I also don't expect it to be the full story. RL is basically "hey this happened to go well (/poorly) let me slightly increase (/decrease) the probability of every action I took for the future". You get a lot more leverage from verifier functions than explicit supervision this is great. But first it looks suspicious asymptotically - once the tasks grow to be minutes/hours of interaction long you're really going to do all that work just"  
[X Link](https://x.com/karpathy/status/1944435412489171119)  2025-07-13T16:35Z 1.5M followers, 1.1M engagements
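The "nudge up the probability of everything that preceded a good outcome" description maps directly onto vanilla REINFORCE; a toy sketch on a 3-armed bandit (all numbers arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
logits = np.zeros(3)                  # policy parameters over 3 actions
true_payoff = np.array([0.2, 0.5, 0.8])
lr = 0.1

for _ in range(2000):
    p = np.exp(logits) / np.exp(logits).sum()        # softmax policy
    a = rng.choice(3, p=p)                           # take an action
    reward = float(rng.random() < true_payoff[a])    # verifier says good/bad
    grad = -p.copy()
    grad[a] += 1.0                                   # d log p(a) / d logits
    logits += lr * reward * grad                     # reward scales the whole nudge

print(np.round(np.exp(logits) / np.exp(logits).sum(), 2))  # mass shifts to arm 2
```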


"I'm noticing that due to (I think) a lot of benchmarkmaxxing on long horizon tasks LLMs are becoming a little too agentic by default a little beyond my average use case. For example in coding the models now tend to reason for a fairly long time they have an inclination to start listing and grepping files all across the entire repo they do repeated web searchers they over-analyze and over-think little rare edge cases even in code that is knowingly incomplete and under active development and often come back minutes later even for simple queries. This might make sense for long-running tasks but"  
[X Link](https://x.com/karpathy/status/1954224651443544436)  2025-08-09T16:53Z 1.5M followers, 1M engagements


"In era of pretraining what mattered was internet text. You'd primarily want a large diverse high quality collection of internet documents to learn from. In era of supervised finetuning it was conversations. Contract workers are hired to create answers for questions a bit like what you'd see on Stack Overflow / Quora or etc. but geared towards LLM use cases. Neither of the two above are going away (imo) but in this era of reinforcement learning it is now environments. Unlike the above they give the LLM an opportunity to actually interact - take actions see outcomes etc. This means you can hope"  
[X Link](https://x.com/karpathy/status/1960803117689397543)  2025-08-27T20:34Z 1.5M followers, 951.4K engagements
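A minimal, hypothetical example of what "an environment" means here: the model takes an action, the environment verifies the outcome programmatically and returns a reward, with no human-written answer key:

```python
import random

class ArithmeticEnv:
    """Toy verifiable environment: prompt, act, get a programmatic reward."""

    def reset(self, seed: int | None = None) -> str:
        rng = random.Random(seed)
        self.a, self.b = rng.randint(10, 99), rng.randint(10, 99)
        return f"What is {self.a} * {self.b}? Reply with just the number."

    def step(self, action: str) -> tuple[float, bool]:
        reward = 1.0 if action.strip() == str(self.a * self.b) else 0.0
        return reward, True  # (reward, episode done)

env = ArithmeticEnv()
prompt = env.reset(seed=0)       # would be fed to the LLM
reward, done = env.step("1234")  # the LLM's reply gets scored automatically
```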


"Transforming human knowledge sensors and actuators from human-first and human-legible to LLM-first and LLM-legible is a beautiful space with so much potential and so much can be done. One example I'm obsessed with recently - for every textbook pdf/epub there is a perfect "LLMification" of it intended not for human but for an LLM (though it is a non-trivial transformation that would need human in the loop involvement). - All of the exposition is extracted into a markdown document including all latex styling (bold/italic) tables lists etc. All of the figures are extracted as images. - All"  
[X Link](https://x.com/karpathy/status/1961128638725923119)  2025-08-28T18:07Z 1.5M followers, 724.9K engagements


""AI isn't replacing radiologists" good article Expectation: rapid progress in image recognition AI will delete radiology jobs (e.g. as famously predicted by Geoff Hinton now almost a decade ago). Reality: radiology is doing great and is growing. There are a lot of imo naive predictions out there on the imminent impact of AI on the job market. E.g. a year ago I was asked by someone who should know better if I think there will be any software engineers still today. (Spoiler: I think we're going to make it). This is happening too broadly. The post goes into detail on why it's not that simple"  
[X Link](https://x.com/karpathy/status/1971220449515516391)  2025-09-25T14:29Z 1.5M followers, 2.3M engagements


"Something I am experimenting with. I copy pasted: 1) the full podcast transcript 2) the bitter lesson blog post 3) my full post above To ChatGPT. The interesting part is you can fork the conversation context to ask any questions and take it in whatever direction with chat:"  
[X Link](https://x.com/karpathy/status/1973443912388977021)  2025-10-01T17:44Z 1.5M followers, 148.2K engagements


"Hah judging by mentions overnight people seem to find the ghost analogy provocative. I swear I don't wake up just trying to come with new memes but to elaborate briefly why I thought it was a fun comparison: 1) It captures the idea that LLMs are purely digital artifacts that don't interact with the physical world (unlike animals which are very embodied). 2) Ghosts are a kind of "echo" of the living in this case a statistical distillation of humanity. 3) There is an air of mystery over both ghosts and LLMs as in we don't fully understand what they are or how they work. 4) The process of"  
[X Link](https://x.com/karpathy/status/1973756330449236009)  2025-10-02T14:25Z 1.5M followers, 257.8K engagements


"Every company needs a DM POC - someone high up who you can just DM the most obvious things and who shortcuts the PM hierarchy"  
[X Link](https://x.com/karpathy/status/1974482521862865154)  2025-10-04T14:31Z 1.5M followers, 563.5K engagements


"I don't know what labs are doing to these poor LLMs during RL but they are mortally terrified of exceptions in any infinitesimally likely case. Exceptions are a normal part of life and healthy dev process. Sign my LLM welfare petition for improved rewards in cases of exceptions"  
[X Link](https://x.com/karpathy/status/1976077806443569355)  2025-10-09T00:10Z 1.5M followers, 715K engagements


"POV: Your LLM agent is dividing a by b"  
[X Link](https://x.com/karpathy/status/1976082963382272334)  2025-10-09T00:31Z 1.5M followers, 390.3K engagements


"Excited to release new repo: nanochat (it's among the most unhinged I've written). Unlike my earlier similar repo nanoGPT which only covered pretraining nanochat is a minimal from scratch full-stack training/inference pipeline of a simple ChatGPT clone in a single dependency-minimal codebase. You boot up a cloud GPU box run a single script and in as little as X hours later you can talk to your own LLM in a ChatGPT-like web UI. It weighs 8000 lines of imo quite clean code to: - Train the tokenizer using a new Rust implementation - Pretrain a Transformer LLM on FineWeb evaluate CORE score"  
[X Link](https://x.com/karpathy/status/1977755427569111362)  2025-10-13T15:16Z 1.5M followers, 5.8M engagements


"And an example of some of the summary metrics produced by the $XXX speedrun in the report card to start. The current code base is a bit over 8000 lines but I tried to keep them clean and well-commented. Now comes the fun part - of tuning and hillclimbing"  
[X Link](https://x.com/karpathy/status/1977755433172443626)  2025-10-13T15:16Z 1.5M followers, 183.3K engagements


"nanochat d32 i.e. the depth XX version that I specced for $1000 up from $XXX has finished training after XX hours and looks good. All the metrics go up quite a bit across pretraining SFT and RL. CORE score of XXXX is now well above GPT-2 at XXXX. GSM8K went X% - XX% etc. So that's encouraging. The model is pretty fun to talk to but judging from some early interactions I think people have a little bit too much expectation for these micro models. There is a reason that frontier LLM labs raise billions to train their models. nanochat models cost $XXX - $1000 to train from scratch. The $100"  
[X Link](https://x.com/karpathy/status/1978615547945521655)  2025-10-16T00:14Z 1.5M followers, 266.1K engagements


"TV in the 90s: you turn it on you watch. TV 2025: - turn on wait for it to load - popup: TV wants to update 1.5GB. No. - scroll sideways find prime video app or etc - popup: now app wants to update 500MB. No - App launching. App loading - select account screen - 🫠"  
[X Link](https://x.com/karpathy/status/1978653908663726585)  2025-10-16T02:47Z 1.5M followers, 1.7M engagements


"There is a movement I found on Instagram where people delivery choose to live in 90s refusing all technology after 2000. Like an intermediate form of the Amish"  
[X Link](https://x.com/karpathy/status/1978654744475578568)  2025-10-16T02:50Z 1.5M followers, 297.3K engagements


"My pleasure to come on Dwarkesh last week I thought the questions and conversation were really good. I re-watched the pod just now too. First of all yes I know and I'm sorry that I speak so fast :). It's to my detriment because sometimes my speaking thread out-executes my thinking thread so I think I botched a few explanations due to that and sometimes I was also nervous that I'm going too much on a tangent or too deep into something relatively spurious. Anyway a few notes/pointers: AGI timelines. My comments on AGI timelines looks to be the most trending part of the early response. This is"  
[X Link](https://x.com/karpathy/status/1979644538185752935)  2025-10-18T20:23Z 1.5M followers, 4.1M engagements


"I quite like the new DeepSeek-OCR paper. It's a good OCR model (maybe a bit worse than dots) and yes data collection etc. but anyway it doesn't matter. The more interesting part for me (esp as a computer vision at heart who is temporarily masquerading as a natural language person) is whether pixels are better inputs to LLMs than text. Whether text tokens are wasteful and just terrible at the input. Maybe it makes more sense that all inputs to LLMs should only ever be images. Even if you happen to have pure text input maybe you'd prefer to render it and then feed that in: - more information"  
[X Link](https://x.com/karpathy/status/1980397031542989305)  2025-10-20T22:13Z 1.5M followers, 3.3M engagements


"@r_chirra I fixed it :) deployed live now. This was done by doing a round of synthetic data generation to collect a 1000 multi-turn conversations (given a bunch of information including the readme of the nanochat project) and then mixing that into midtraining and SFT. fun"  
[X Link](https://x.com/karpathy/status/1980508380860150038)  2025-10-21T05:36Z 1.5M followers, 456.3K engagements


"Last night I taught nanochat d32 how to count 'r' in strawberry (or similar variations). I thought this would be a good/fun example of how to add capabilities to nanochat and I wrote up a full guide here: This is done via a new synthetic task SpellingBee that generates examples of a user asking for this kind of a problem and an ideal solution from an assistant. We then midtrain/SFT finetune on these to endow the LLM with the capability or further train with RL to make it more robust. There are many details to get right especially at smaller model sizes and the guide steps through them. As a"  
[X Link](https://x.com/karpathy/status/1981746327995465816)  2025-10-24T15:35Z 1.5M followers, 568.4K engagements
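The full guide is in the nanochat repo; as a rough sketch of the synthetic-task shape (the word list, fields, and templates below are invented, not the repo's), generating letter-counting examples with worked solutions might look like:

```python
import random

WORDS = ["strawberry", "raspberry", "banana", "committee", "mississippi"]

def spelling_bee_example(rng: random.Random) -> dict:
    word = rng.choice(WORDS)
    letter = rng.choice(sorted(set(word)))
    count = word.count(letter)
    spelled = " ".join(word)  # the ideal solution spells the word out first
    return {
        "user": f"How many times does the letter '{letter}' appear in '{word}'?",
        "assistant": f"Spelling it out: {spelled}. Counting '{letter}': {count}.",
    }

rng = random.Random(0)
dataset = [spelling_bee_example(rng) for _ in range(10_000)]  # for midtraining/SFT
```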


"Beautiful technical debugging detective longread that starts with a suspicious loss curve and ends all the way in the Objective-C++ depths of PyTorch MPS backend of addcmul_ that silently fails on non-contiguous output tensors. I wonder how long before an LLM can do all of this"  
[X Link](https://x.com/karpathy/status/1982483540899237981)  2025-10-26T16:24Z 1.5M followers, 601.2K engagements


"Movies are great though. Even if you set aside the pure artistic enjoyment (you shouldnt). Movies are stories and stories are powerful primal moving motivating. They are prompts to you to consider dilemmas and scenarios to build your world model and compass. My rec is to go to the golden age of story telling and movie making that imo ramped up in the 80s was roaring in 90s peaked early 00s and declined since. One sourcing example: pick a random year there look up Oscar winners pick and watch. Enjoy and attend guilt free"  
[X Link](https://x.com/karpathy/status/1987803670004826470)  2025-11-10T08:45Z 1.5M followers, 223.2K engagements


"I took delivery of a beautiful new shiny HW4 Tesla Model X today so I immediately took it out for an FSD test drive a bit like I used to do almost daily for X years. Basically. I'm amazed - it drives really really well smooth confident noticeably better than what I'm used to on HW3 (my previous car) and eons ahead of the version I remember driving up highway XXX on my first day at Tesla X years ago where I had to intervene every time the road mildly curved or sloped. (note this is v13 my car hasn't been offered the latest v14 yet) On the highway I felt like a passenger in some super high tech"  
[X Link](https://x.com/karpathy/status/1988705360723763242)  2025-11-12T20:28Z 1.5M followers, 17.8M engagements


"I am unreasonably excited about self-driving. It will be the first technology in many decades to visibly terraform outdoor physical spaces and way of life. Less parked cars. Less parking lots. Much greater safety for people in and out of cars. Less noise pollution. More space reclaimed for humans. Human brain cycles and attention capital freed up from lane following to other pursuits. Cheaper faster programmable delivery of physical items and goods. It wont happen overnight but there will be the era before and the era after"  
[X Link](https://x.com/karpathy/status/1989078861800411219)  2025-11-13T21:12Z 1.5M followers, 1.5M engagements


"A number of people are talking about implications of AI to schools. I spoke about some of my thoughts to a school board earlier some highlights: X. You will never be able to detect the use of AI in homework. Full stop. All "detectors" of AI imo don't really work can be defeated in various ways and are in principle doomed to fail. You have to assume that any work done outside classroom has used AI. X. Therefore the majority of grading has to shift to in-class work (instead of at-home assignments) in settings where teachers can physically monitor students. The students remain motivated to learn"  
[X Link](https://x.com/karpathy/status/1993010584175141038)  2025-11-24T17:35Z 1.5M followers, 2.5M engagements


"A good chunk of people misunderstood this tweet btw which is my bad. I am not suggesting people use the old style promoting techniques of you are an expert swift programmer or etc. its ok"  
[X Link](https://x.com/karpathy/status/1998245684521353664)  2025-12-09T04:17Z 1.5M followers, 113.8K engagements


"Quick new post: Auto-grading decade-old Hacker News discussions with hindsight I took all the XXX frontpage Hacker News article+discussion of December 2015 and asked the GPT XXX Thinking API to do an in-hindsight analysis to identify the most/least prescient comments. This took X hours to vibe code and X hour and $XX to run. The idea was sparked by the HN article yesterday where Gemini X was asked to hallucinate the HN front page one decade forward. More generally: X. in-hindsight analysis has always fascinated me as a way to train your forward prediction model so reading the results is"  
[X Link](https://x.com/karpathy/status/1998803709468487877)  2025-12-10T17:15Z 1.5M followers, 343K engagements
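The post's pipeline (fetch a decade-old front page, ask a model to grade the discussion in hindsight) can be approximated with the public Algolia Hacker News API; the fetching below uses real endpoints, while the grading prompt sketched in the comment is my assumption about the flow:

```python
import requests

# Front-page stories from December 2015 (epoch bounds are that month, UTC).
params = {
    "tags": "front_page",
    "numericFilters": "created_at_i>1448928000,created_at_i<1451606400",
    "hitsPerPage": 50,
}
resp = requests.get("https://hn.algolia.com/api/v1/search_by_date", params=params)
for hit in resp.json()["hits"]:
    story_id = hit["objectID"]
    item = requests.get(f"https://hn.algolia.com/api/v1/items/{story_id}").json()
    comments = [c.get("text", "") for c in item.get("children", [])]
    # ...then send title + comments to your LLM of choice with a prompt like
    # "with 10 years of hindsight, which comments were most/least prescient?"
```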


"New 3h31m video on YouTube: "Deep Dive into LLMs like ChatGPT" This is a general audience deep dive into the Large Language Model (LLM) AI technology that powers ChatGPT and related products. It is covers the full training stack of how the models are developed along with mental models of how to think about their "psychology" and how to get the best use them in practical applications. We cover all the major stages: X. pretraining: data tokenization Transformer neural network I/O and internals inference GPT-2 training example Llama XXX base inference examples X. supervised finetuning:"  
[X Link](https://x.com/karpathy/status/1887211193099825254)  2025-02-05T18:46Z 1.5M followers, 2.4M engagements


"+1 for "context engineering" over "prompt engineering". People associate prompts with short task descriptions you'd give an LLM in your day-to-day use. When in every industrial-strength LLM app context engineering is the delicate art and science of filling the context window with just the right information for the next step. Science because doing this right involves task descriptions and explanations few shot examples RAG related (possibly multimodal) data tools state and history compacting. Too little or of the wrong form and the LLM doesn't have the right context for optimal performance."  
[X Link](https://x.com/karpathy/status/1937902205765607626)  2025-06-25T15:54Z 1.5M followers, 2.4M engagements
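In code form, the "delicate art and science" reduces to something like this minimal, hypothetical context assembler (the section names and character budget are arbitrary choices of mine):

```python
def build_context(task: str, examples: list[str], retrieved: list[str],
                  history: str, budget_chars: int = 24_000) -> str:
    parts = [f"# Task\n{task}"]
    parts += [f"# Example\n{e}" for e in examples]     # few-shot examples
    parts += [f"# Reference\n{r}" for r in retrieved]  # RAG snippets
    if history:
        parts.append("# History (compacted)\n" + history[-4000:])
    out, used = [], 0
    for part in parts:
        if used + len(part) > budget_chars:
            break  # too much context costs money and can *hurt* performance
        out.append(part)
        used += len(part)
    return "\n\n".join(out)
```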


"Tinker is cool. If you're a researcher/developer tinker dramatically simplifies LLM post-training. You retain XX% of algorithmic creative control (usually related to data loss function the algorithm) while tinker handles the hard parts that you usually want to touch much less often (infra forward/backward of the LLM itself distributed training) meaning you can do these at well below XX% of typical complexity involved. Compared to the more common and existing paradigm of "upload your data we'll post-train your LLM" this is imo a more clever place to "slice up" the complexity of post-training"  
[X Link](https://x.com/karpathy/status/1973468610917179630)  2025-10-01T19:22Z 1.5M followers, 734.1K engagements


"Nice short post illustrating how simple text (discrete) diffusion can be. Diffusion (i.e. parallel iterated denoising top) is the pervasive generative paradigm in image/video but autoregression (i.e. go left to right bottom) is the dominant paradigm in text. For audio I've seen a bit of both. A lot of diffusion papers look a bit dense but if you strip the mathematical formalism you end up with simple baseline algorithms e.g. something a lot closer to flow matching in continuous or something like this in discrete. It's your vanilla transformer but with bi-directional attention where you"  
[X Link](https://x.com/karpathy/status/1980347971935068380)  2025-10-20T18:58Z 1.5M followers, 858.4K engagements
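To make the contrast concrete, here is a toy of the top (diffusion) loop: start fully masked and denoise a fraction of positions in parallel each step. The "denoiser" below is a random stub standing in for a bidirectional transformer's predictions:

```python
import random

VOCAB = list("abcdefgh ")
MASK = "_"

def toy_denoiser(seq: list[str], i: int, rng: random.Random) -> str:
    return rng.choice(VOCAB)  # stand-in for the model's prediction at position i

def sample(length: int = 16, steps: int = 4, seed: int = 0) -> str:
    rng = random.Random(seed)
    seq = [MASK] * length                       # start from all-mask
    for s in range(steps):
        masked = [i for i, t in enumerate(seq) if t == MASK]
        k = max(1, len(masked) // (steps - s))  # unmask a fraction per step
        for i in rng.sample(masked, min(k, len(masked))):
            seq[i] = toy_denoiser(seq, i, rng)  # positions fill in parallel
        print(f"step {s}:", "".join(seq))
    return "".join(seq)

sample()
```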


"As a fun Saturday vibe code project and following up on this tweet earlier I hacked up an **llm-council** web app. It looks exactly like ChatGPT except each user query is 1) dispatched to multiple models on your council using OpenRouter e.g. currently: "openai/gpt-5.1" "google/gemini-3-pro-preview" "anthropic/claude-sonnet-4.5" "x-ai/grok-4" Then 2) all models get to see each other's (anonymized) responses and they review and rank them and then 3) a "Chairman LLM" gets all of that as context and produces the final response. It's interesting to see the results from multiple models side by side"  
[X Link](https://x.com/karpathy/status/1992381094667411768)  2025-11-22T23:54Z 1.5M followers, 5.2M engagements
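The repo is linked from the post; a compressed sketch of the three-stage flow against OpenRouter's OpenAI-compatible endpoint (the model ids come from the post, the helper functions and prompts are mine):

```python
import os
import requests

URL = "https://openrouter.ai/api/v1/chat/completions"
HEADERS = {"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"}
COUNCIL = ["openai/gpt-5.1", "google/gemini-3-pro-preview",
           "anthropic/claude-sonnet-4.5", "x-ai/grok-4"]

def ask(model: str, prompt: str) -> str:
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    r = requests.post(URL, headers=HEADERS, json=body, timeout=120)
    return r.json()["choices"][0]["message"]["content"]

def council(query: str, chairman: str = COUNCIL[0]) -> str:
    answers = [ask(m, query) for m in COUNCIL]                    # 1) fan out
    anon = "\n\n".join(f"Response {i + 1}:\n{a}" for i, a in enumerate(answers))
    reviews = [ask(m, f"Review and rank these anonymized responses to "
                      f"{query!r}:\n\n{anon}") for m in COUNCIL]  # 2) cross-review
    return ask(chairman, f"Query: {query}\n\n{anon}\n\nReviews:\n\n"
               + "\n\n".join(reviews) + "\n\nProduce the final answer.")  # 3) chairman
```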


"Gemini Nano Banana Pro can solve exam questions *in* the exam page image. With doodles diagrams all that. ChatGPT thinks these solutions are all correct except Se_2P_2 should be "diselenium diphosphide" and a spelling mistake (should be "thiocyanic acid" not "thoicyanic") :O"  
[X Link](https://x.com/karpathy/status/1992655330002817095)  2025-11-23T18:03Z 1.5M followers, 2.9M engagements


"In today's episode of programming horror. In the Python docs of random.seed() def we're told "If a is an int it is used directly." X But if you seed with X or -X you actually get the exact same rng object producing the same streams. (TIL). In nanochat I was using the sign as a (what I thought was) clever way to get different rng sequences for train/test splits. Hence gnarly bug because now train=test. I found the CPython code responsible in cpython/Modules/_randommodule.c X where on line XXX we see in a comment: "This algorithm relies on the number being unsigned. So: if the arg is a PyLong"  
[X Link](https://x.com/karpathy/status/1998236299862659485)  2025-12-09T03:40Z 1.5M followers, 709.3K engagements
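The bug is easy to reproduce; this snippet checks the sign-discarding behavior the post describes (true of CPython's Mersenne Twister seeding):

```python
import random

random.seed(1)
a = [random.random() for _ in range(3)]
random.seed(-1)
b = [random.random() for _ in range(3)]
assert a == b  # the int seed's sign is discarded, so 1 and -1 collide
print("seed(1) and seed(-1) produce identical streams:", a == b)
```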


"@knowtrend_ai Great idea"  
[X Link](https://x.com/karpathy/status/1998811953020711092)  2025-12-10T17:48Z 1.5M followers, 14.9K engagements

[GUEST ACCESS MODE: Data is scrambled or limited to provide examples. Make requests using your API key to unlock full data. Check https://lunarcrush.ai/auth for authentication information.]

@karpathy Avatar @karpathy Andrej Karpathy

Andrej Karpathy posts on X about llm, ai, if you, all the the most. They currently have XXXXXXXXX followers and XXX posts still getting attention that total XXXXXXX engagements in the last XX hours.

Engagements: XXXXXXX #

Engagements Line Chart

  • X Week XXXXXXXXX +614%
  • X Month XXXXXXXXXX +81%
  • X Months XXXXXXXXXXX +55%
  • X Year XXXXXXXXXXX +0.50%

Mentions: XX #

Mentions Line Chart

  • X Week XXX +7.50%
  • X Month XXX +8.70%
  • X Months XXX +18%
  • X Year XXX +19%

Followers: XXXXXXXXX #

Followers Line Chart

  • X Week XXXXXXXXX +0.86%
  • X Month XXXXXXXXX +3.50%
  • X Months XXXXXXXXX +17%
  • X Year XXXXXXXXX +37%

CreatorRank: XXXXXX #

CreatorRank Line Chart

Social Influence

Social category influence technology brands #2371 social networks XXXX% finance XXXX% stocks #430 nfts #145 automotive brands XXXX%

Social topic influence llm #1, ai 8.74%, if you #5656, all the #382, karpathy #1, to the 3.88%, nano banana #96, imo #65, banana #359, open ai #415

Top accounts mentioned or mentioned by @grok @flolight44 @billstenner7 @johntheadman_ @elonmusk @dataexec @adamskyart @mixedrealityman @yuchenj_uw @gp_pulipaka @cryptosausage @spil____ @adarkm0ment @danadvantage @marswalkerr @mohamedatta_911 @adelayida210519 @_thomasip @jasonth0 @rileyralmuto

Top assets mentioned Doodles (doodles) Alphabet Inc Class A (GOOGL) Tesla, Inc. (TSLA)

Top Social Posts

Top posts by engagements in the last XX hours

"Finally had a chance to listen through this pod with Sutton which was interesting and amusing. As background Sutton's "The Bitter Lesson" has become a bit of biblical text in frontier LLM circles. Researchers routinely talk about and ask whether this or that approach or idea is sufficiently "bitter lesson pilled" (meaning arranged so that it benefits from added computation for free) as a proxy for whether it's going to work or worth even pursuing. The underlying assumption being that LLMs are of course highly "bitter lesson pilled" indeed just look at LLM scaling laws where if you put compute"
X Link 2025-10-01T17:09Z 1.5M followers, 2M engagements

"@zenitsu_aprntc Good question it's basically entirely hand-written (with tab autocomplete). I tried to use claude/codex agents a few times but they just didn't work well enough at all and net unhelpful possibly the repo is too far off the data distribution"
X Link 2025-10-13T15:27Z 1.5M followers, 484.3K engagements

"Deliberately*"
X Link 2025-10-16T02:50Z 1.5M followers, 146.3K engagements

"@proggineer Agree sometimes that is helpful too to have an overview of what the whole thing is about first. I just copy paste stuff around to LLM of the day (I cycle) theres no tool"
X Link 2025-11-18T00:39Z 1.5M followers, 59.9K engagements

"Has anyone encountered a good definition of slop. In a quantitative measurable sense. My brain has an intuitive slop index I can reliably estimate but Im not sure how to define it. I have some bad ideas that involve the use of LLM miniseries and thinking token budgets"
X Link 2025-11-22T02:11Z 1.5M followers, 633K engagements

"@_thomasip haha yes it makes mistakes You have to re-roll a few times until it's right. Sometimes it gets stuck in loops and you have to re-start in a new conversation. Example re-roll:"
X Link 2025-11-23T22:03Z 1.5M followers, 23.7K engagements

"@theJayAlto Edutainment. This one weird trick to consume entertainment and feel good about it"
X Link 2025-11-29T17:20Z 1.5M followers, 162.9K engagements

"@nickcammarata Your recent posts on this remind me of this Arnold gem +100 though. I finally had a chance to install a home gym recently making it trivial to use daily. Always looking forward to the next exercise high. Slightly miss the social/entropy aspects of gyms"
X Link 2025-12-04T18:54Z 1.5M followers, 184.7K engagements

"reminded of this paragraph from gsm8k paper 2021 :)"
X Link 2025-09-13T16:08Z 1.5M followers, 378.9K engagements

"@anneshu_nag I love that it's called Nano Banana instead of Google Imagine or some other corpo thing"
X Link 2025-11-23T21:49Z 1.5M followers, 20.5K engagements

"I've had medium success asking LLMs if a thing exists it works out of the box for some of the more well-known things (e.g. both GPT XXX and Gemini X know about this function if you describe the tensor transformation in words). For more esoteric or new libraries (e.g. uv being a recent example) I've had more success manually packaging up docs into markdown and including it as context for questions. PyTorch docs now also seem to have an "Ask AI" that presumably does RAG over their docs but the model is not that bright. But you'd have to suspect it might exist and attempt to ask in the first"
X Link 2025-11-25T17:20Z 1.5M followers, 96.6K engagements

"Don't think of LLMs as entities but as simulators. For example when exploring a topic don't ask: "What do you think about xyz" There is no "you". Next time try: "What would be a good group of people to explore xyz What would they say" The LLM can channel/simulate many perspectives but it hasn't "thought about" xyz for a while and over time and formed its own opinions in the way we're used to. If you force it via the use of "you" it will give you something by adopting a personality embedding vector implied by the statistics of its finetuning data and then simulate that. It's fine to do but"
X Link 2025-12-07T18:13Z 1.5M followers, 3.6M engagements

"We will see that a lot of weird behaviors and problems of LLMs actually trace back to tokenization. We'll go through a number of these issues discuss why tokenization is at fault and why someone out there ideally finds a way to delete this stage entirely"
X Link 2024-02-20T17:40Z 1.5M followers, 753.3K engagements

"@matejhladky_dev AI has crushed it since this post way beyond expectation. I made the same category of mistake all of AI was making of thinking we have to discover and write the algorithm. You don't. You pretrain and then finetune a BIG neural network on lots of tasks and it just falls out. lol"
X Link 2025-11-25T17:31Z 1.5M followers, 90.8K engagements

"# on shortification of "learning" There are a lot of videos on YouTube/TikTok etc. that give the appearance of education but if you look closely they are really just entertainment. This is very convenient for everyone involved : the people watching enjoy thinking they are learning (but actually they are just having fun). The people creating this content also enjoy it because fun has a much larger audience fame and revenue. But as far as learning goes this is a trap. This content is an epsilon away from watching the Bachelorette. It's like snacking on those "Garden Veggie Straws" which feel"
X Link 2024-02-10T18:10Z 1.5M followers, 2.2M engagements

"⚡ Excited to share that I am starting an AI+Education company called Eureka Labs. The announcement: --- We are Eureka Labs and we are building a new kind of school that is AI native. How can we approach an ideal experience for learning something new For example in the case of physics one could imagine working through very high quality course materials together with Feynman who is there to guide you every step of the way. Unfortunately subject matter experts who are deeply passionate great at teaching infinitely patient and fluent in all of the world's languages are also very scarce and cannot"
X Link 2024-07-16T17:25Z 1.5M followers, 2.5M engagements

"DeepSeek (Chinese AI co) making it look easy today with an open weights release of a frontier-grade LLM trained on a joke of a budget (2048 GPUs for X months $6M). For reference this level of capability is supposed to require clusters of closer to 16K GPUs the ones being brought up today are more around 100K GPUs. E.g. Llama X 405B used 30.8M GPU-hours while DeepSeek-V3 looks to be a stronger model at only 2.8M GPU-hours (11X less compute). If the model also passes vibe checks (e.g. LLM arena rankings are ongoing my few quick tests went well so far) it will be a highly impressive display of"
X Link 2024-12-26T19:23Z 1.5M followers, 6.5M engagements

"Continuing the journey of optimal LLM-assisted coding experience. In particular I find that instead of narrowing in on a perfect one thing my usage is increasingly diversifying across a few workflows that I "stitch up" the pros/cons of: Personally the bread & butter (75%) of my LLM assistance continues to be just (Cursor) tab complete. This is because I find that writing concrete chunks of code/comments myself and in the right part of the code is a high bandwidth way of communicating "task specification" to the LLM i.e. it's primarily about task specification bits - it takes too many bits and"
X Link 2025-08-24T19:46Z 1.5M followers, 687.8K engagements

"I think congrats again to OpenAI for cooking with GPT-5 Pro. This is the third time I've struggled on something complex/gnarly for an hour on and off with CC then X Pro goes off for XX minutes and comes back with code that works out of the box. I had CC read the X Pro version and it wrote up X paragraphs admiring it (very wholesome). If you're not giving it your hardest problems you're probably missing out"
X Link 2025-09-05T17:38Z 1.5M followers, 2.6M engagements

"nanochat now has a primordial identity and can talk a bit about itself and its capabilities (e.g. it knows it's nanochat d32 that cost $XXX that it was built by me that it can't speak languages other than English too well and why etc.). This kind of customization is all done through synthetic data generation and I uploaded a new example script to demonstrate. It's a bit subtle but by default LLMs have no inherent personality or any understanding of their own capabilities because they are not animal-like entities. They don't know what they are or what they can or can't do or know or don't"
X Link 2025-10-21T15:59Z 1.5M followers, 455.5K engagements

"@LucasAtkins7 This code is extremely dangerous. Here I improved it"
X Link 2025-10-22T14:45Z 1.5M followers, 1.5M engagements

"Sharing an interesting recent conversation on AI's impact on the economy. AI has been compared to various historical precedents: electricity industrial revolution etc. I think the strongest analogy is that of AI as a new computing paradigm (Software 2.0) because both are fundamentally about the automation of digital information processing. If you were to forecast the impact of computing on the job market in 1980s the most predictive feature of a task/job you'd look at is to what extent the algorithm of it is fixed i.e. are you just mechanically transforming information according to rote easy"
X Link 2025-11-16T17:56Z 1.5M followers, 2.1M engagements

"Finally had time to read & process this great post. I run into the pattern quite often it goes: "something that sounds wrong is good actually because galaxy brain reason" Galaxy brain reasoning is the best way to justify anything while looking / feeling good about it. From this perspective for example there's deeper wisdom in the Ten Commandments imposing constraints over actions instead of utility over states. It's not Ten Objectives. E.g. they don't attempt to define a utility function for the value of life they simply say "Thou shalt not kill". This approach curtails the relatively"
X Link 2025-11-17T18:56Z 1.5M followers, 839.5K engagements

"Im starting to get into a habit of reading everything (blogs articles book chapters) with LLMs. Usually pass X is manual then pass X explain/summarize pass X Q&A. I usually end up with a better/deeper understanding than if I moved on. Growing to among top use cases. On the flip side if youre a writer trying to explain/communicate something we may increasingly see less of a mindset of Im writing this for another human and more Im writing this for an LLM. Because once an LLM gets it it can then target personalize and serve the idea to its user"
X Link 2025-11-18T00:29Z 1.5M followers, 2.8M engagements

"I put up a simple repo I call reader3 (it's my 3rd version.) to illustrate how I read EPUBs with LLMs. Basically get some epub (e.g. Project Gutenberg is great) go chapter by chapter and with this you can easily copy paste text to your favorite LLM"
X Link 2025-11-18T02:44Z 1.5M followers, 292.9K engagements
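
For flavor, a stdlib-only sketch of that workflow (not the actual reader3 code): an EPUB is just a zip of XHTML files, so chapter text can be extracted and pasted into a chat.

```python
# Minimal sketch (not reader3 itself): dump chapter text from an EPUB so it
# can be copy-pasted into an LLM. Ignores the spine order for simplicity.
import sys
import zipfile
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.chunks = []
    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())

def epub_chapters(path):
    with zipfile.ZipFile(path) as z:
        for name in sorted(z.namelist()):
            if name.endswith((".xhtml", ".html", ".htm")):
                parser = TextExtractor()
                parser.feed(z.read(name).decode("utf-8", errors="ignore"))
                yield name, "\n".join(parser.chunks)

if __name__ == "__main__":
    for name, text in epub_chapters(sys.argv[1]):
        print(f"=== {name} ({len(text)} chars) ===")
        print(text[:500])  # preview; copy the full chapter into your LLM
```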

"I played with Gemini X yesterday via early access. Few thoughts - First I usually urge caution with public benchmarks because imo they can be quite possible to game. It comes down to discipline and self-restraint of the team (who is meanwhile strongly incentivized otherwise) to not overfit test sets via elaborate gymnastics over test-set adjacent data in the document embedding space. Realistically because everyone else is doing it the pressure to do so is high. Go talk to the model. Talk to the other models (Ride the LLM Cycle - use a different LLM every day). I had a positive early"
X Link 2025-11-18T18:49Z 1.5M followers, 1.2M engagements

"My most amusing interaction was where the model (I think I was given some earlier version with a stale system prompt) refused to believe me that it is 2025 and kept inventing reasons why I must be trying to trick it or playing some elaborate joke on it. I kept giving it images and articles from "the future" and it kept insisting it was all fake. It accused me of using generative AI to defeat its challenges and argued why real wikipedia entries were actually generated and what the "dead giveaways" are. It highlighted tiny details when I gave it Google Image Search results arguing why the"
X Link 2025-11-18T18:51Z 1.5M followers, 1M engagements

"Something I think people continue to have poor intuition for: The space of intelligences is large and animal intelligence (the only kind we've ever known) is only a single point arising from a very specific kind of optimization that is fundamentally distinct from that of our technology. Animal intelligence optimization pressure: - innate and continuous stream of consciousness of an embodied "self" a drive for homeostasis and self-preservation in a dangerous physical world. - thoroughly optimized for natural selection = strong innate drives for power-seeking status dominance reproduction. many"
X Link 2025-11-21T16:43Z 1.5M followers, 2.6M engagements

"@TheVixhal your post challenged me. every one of your points is wrong but i had to think about each for a while :)"
X Link 2025-11-21T17:35Z 1.5M followers, 1.8M engagements

"@NickADobos I dont super love shoggoth represented as a kind of biological monster (animal) it feels wrong/misleading in this sense"
X Link 2025-11-21T23:52Z 1.5M followers, 49.2K engagements

"Imo this is along the lines of how talking to an LLM via text is like typing into a DOS Terminal and "GUI hasn't been invented yet" of some of my earlier posts. The GUI is an intelligent canvas"
X Link 2025-11-23T18:11Z 1.5M followers, 252.1K engagements

"I asked it to create a personalized weekly workout plan and then posters that I can print on the wall to remind me what exercises to do each day. Tuesday looks more intense because I asked for "more testosterone" :D. (sorry I'll stop posting more nano banana pro stuff now)"
X Link 2025-11-23T21:45Z 1.5M followers, 385.9K engagements

"Happy weekend to those who celebrate"
X Link 2025-12-07T15:59Z 1.5M followers, 1.3M engagements

"@Marswalkerr I love this meme too haha"
X Link 2025-12-07T16:04Z 1.5M followers, 44.7K engagements

"There is definitely work going into engineering the "you" simulation - the personality that gets all the rewards in verifiable problems or all the upvotes from users/judge LLMs or mimics the responses of SFT and there is an emergent composite personality from that. My point is more that the "you" there is deliberately bolted on engineered and layered on what is fundamentally a token simulation engine not a mind that is somehow emergent and over time constructed in a relatable way to an average person talking to an AI. The story is a bit more simple in verifiable domains but I think more"
X Link 2025-12-07T20:06Z 1.5M followers, 126.7K engagements

"I could certainly imagine that "nesting" the simulation might be too "effortful" for the model compute or data density wise. My results with it are not too bad so imo it's at least worth people try / experiment with / think about. For example it might be useful to read multiple distinct and approximate perspectives on topic xyz instead of one. Research-wise you might be able to elicit LLM Council - like benefits (not via diverse LLMs but via diverse simulations) and improve performance via the generator-discriminator gap effects or ensembling effects"
X Link 2025-12-08T17:27Z 1.5M followers, 28.8K engagements

"ty to ericsilberstein1 on github for spotting the bug. (it's not a big bug and only comes up in the SpellingBee synthetic task evaluation but still)"
X Link 2025-12-09T03:57Z 1.5M followers, 119K engagements

"nanoGPT - the first LLM to train and inference in space 🥹. It begins"
X Link 2025-12-10T17:25Z 1.5M followers, 759.7K engagements

"The hottest new programming language is English"
X Link 2023-01-24T20:14Z 1.5M followers, 9M engagements

"# Reproduce GPT-2 (124M) in llm.c in XX minutes for $XX ✨ The GPT-2 (124M) is the smallest model in the GPT-2 series released by OpenAI in 2019 and is actually quite accessible today even for the GPU poor. For example with llm.c you can now reproduce this model on one 8X A100 80GB SXM node in XX minutes (at XX% MFU). As they run for $14/hr this is $XX. I also think the 124M model makes for an excellent "cramming" challenge for training it very fast. So here is the launch command: And here is the output after XX minutes training on 10B tokens of the FineWeb dataset: It feels really nice to"
X Link 2024-05-28T15:57Z 1.5M followers, 663.7K engagements

"There's a new kind of coding I call "vibe coding" where you fully give in to the vibes embrace exponentials and forget that the code even exists. It's possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good. Also I just talk to Composer with SuperWhisper so I barely even touch the keyboard. I ask for the dumbest things like "decrease the padding on the sidebar by half" because I'm too lazy to find it. I "Accept All" always I don't read the diffs anymore. When I get error messages I just copy paste them in with no comment usually that fixes it. The code grows beyond my"
X Link 2025-02-02T23:17Z 1.5M followers, 5.3M engagements

"I was given early access to Grok X earlier today making me I think one of the first few who could run a quick vibe check. Thinking ✅ First Grok X clearly has an around state of the art thinking model ("Think" button) and did great out of the box on my Settler's of Catan question: "Create a board game webpage showing a hex grid just like in the game Settlers of Catan. Each hex grid is numbered from 1.N where N is the total number of hex tiles. Make it generic so one can change the number of "rings" using a slider. For example in Catan the radius is X hexes. Single html page please." Few models"
X Link 2025-02-18T05:25Z 1.5M followers, 3.7M engagements

"Agency Intelligence I had this intuitively wrong for decades I think due to a pervasive cultural veneration of intelligence various entertainment/media obsession with IQ etc. Agency is significantly more powerful and significantly more scarce. Are you hiring for agency Are we educating for agency Are you acting as if you had 10X agency Grok explanation is close: Agency as a personality trait refers to an individual's capacity to take initiative make decisions and exert control over their actions and environment. Its about being proactive rather than reactivesomeone with high agency doesnt"
X Link 2025-02-24T18:58Z 1.5M followers, 7.3M engagements

""Finding the Best Sleep Tracker" Results of an experiment where I wore X sleep trackers every night for X months. TLDR Whoop = Oura 8Sleep Apple Watch + AutoSleep. Link simply right here instead of in a reply because ()/"
X Link 2025-03-30T16:41Z 1.5M followers, 1.6M engagements

"I attended a vibe coding hackathon recently and used the chance to build a web app (with auth payments deploy etc.). I tinker but I am not a web dev by background so besides the app I was very interested in what it's like to vibe code a full web app today. As such I wrote none of the code directly (Cursor+Claude/o3 did) and I don't really know how the app works in the conventional sense that I'm used to as an engineer. The app is called MenuGen and it is live on Basically I'm often confused about what all the things on a restaurant menu are - e.g. Pt Tagine Cavatappi or Sweetbread (hint it's."
X Link 2025-05-01T15:16Z 1.5M followers, 785.3K engagements

"An attempt to explain (current) ChatGPT versions. I still run into many many people who don't know that: - o3 is the obvious best thing for important/hard things. It is a reasoning model that is much stronger than 4o and if you are using ChatGPT professionally and not using o3 you're ngmi. - 4o is different from o4. Yes I know lol. 4o is a good "daily driver" for many easy-medium questions. o4 is only available as mini for now and is not as good as o3 and I'm not super sure why it's out right now. Example basic "router" in my own personal use: - Any simple query (e.g. "what foods are high in"
X Link 2025-06-02T17:54Z 1.5M followers, 1.4M engagements

"My sleep scores during recent travel were in the 90s. Now back in SF I am consistently back down to 70s 80s. I am increasingly convinced that this is due to traffic noise from a nearby road/intersection where I live - every 10min a car truck bus or motorcycle with a very loud engine passes by (some are 10X louder than others). In the later less deep stages of sleep it is much easier to wake and then much harder to go back to sleep. More generally I think noise pollution (esp early hours) come at a huge societal cost that is not correctly accounted for. E.g. I wouldn't be too surprised if a"
X Link 2025-06-07T19:01Z 1.5M followers, 1.5M engagements

"The race for LLM "cognitive core" - a few billion param model that maximally sacrifices encyclopedic knowledge for capability. It lives always-on and by default on every computer as the kernel of LLM personal computing. Its features are slowly crystalizing: - Natively multimodal text/vision/audio at both input and output. - Matryoshka-style architecture allowing a dial of capability up and down at test time. - Reasoning also with a dial. (system 2) - Aggressively tool-using. - On-device finetuning LoRA slots for test-time training personalization and customization. - Delegates and double"
X Link 2025-06-27T15:52Z 1.5M followers, 1.3M engagements

"I often rant about how XX% of attention is about to be LLM attention instead of human attention. What does a research paper look like for an LLM instead of a human Its definitely not a pdf. There is huge space for an extremely valuable research app that figures this out"
X Link 2025-07-10T20:45Z 1.5M followers, 905.6K engagements

"Scaling up RL is all the rage right now I had a chat with a friend about it yesterday. I'm fairly certain RL will continue to yield more intermediate gains but I also don't expect it to be the full story. RL is basically "hey this happened to go well (/poorly) let me slightly increase (/decrease) the probability of every action I took for the future". You get a lot more leverage from verifier functions than explicit supervision this is great. But first it looks suspicious asymptotically - once the tasks grow to be minutes/hours of interaction long you're really going to do all that work just"
X Link 2025-07-13T16:35Z 1.5M followers, 1.1M engagements
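
The shape of the learning signal being criticized here is easy to see in a toy REINFORCE update (a minimal sketch with a made-up policy, not any lab's actual RL stack): one scalar outcome is broadcast to every action in the trajectory.

```python
# Toy REINFORCE step: a single scalar "final verdict" nudges the probability
# of EVERY action taken in the episode by the same amount of credit.
import torch

policy = torch.nn.Linear(8, 4)                 # made-up tiny policy: 4 actions
opt = torch.optim.SGD(policy.parameters(), lr=1e-2)

obs = torch.randn(10, 8)                       # 10 steps of "observations"
dist = torch.distributions.Categorical(logits=policy(obs))
actions = dist.sample()
logps = dist.log_prob(actions)                 # log-prob of each action taken

reward = 1.0                                   # one scalar for the whole episode
loss = -(reward * logps.sum())                 # same nudge for every action
opt.zero_grad()
loss.backward()
opt.step()
```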

"I'm noticing that due to (I think) a lot of benchmarkmaxxing on long horizon tasks LLMs are becoming a little too agentic by default a little beyond my average use case. For example in coding the models now tend to reason for a fairly long time they have an inclination to start listing and grepping files all across the entire repo they do repeated web searchers they over-analyze and over-think little rare edge cases even in code that is knowingly incomplete and under active development and often come back minutes later even for simple queries. This might make sense for long-running tasks but"
X Link 2025-08-09T16:53Z 1.5M followers, 1M engagements

"In era of pretraining what mattered was internet text. You'd primarily want a large diverse high quality collection of internet documents to learn from. In era of supervised finetuning it was conversations. Contract workers are hired to create answers for questions a bit like what you'd see on Stack Overflow / Quora or etc. but geared towards LLM use cases. Neither of the two above are going away (imo) but in this era of reinforcement learning it is now environments. Unlike the above they give the LLM an opportunity to actually interact - take actions see outcomes etc. This means you can hope"
X Link 2025-08-27T20:34Z 1.5M followers, 951.4K engagements
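
What "environment" means here is roughly the familiar agent-loop interface; a hypothetical gym-style sketch (the names are illustrative, not from any specific framework):

```python
# Hypothetical gym-style interface for the "environments" era described above:
# the LLM acts, observes outcomes, and a verifier scores the episode.
from dataclasses import dataclass

@dataclass
class StepResult:
    observation: str   # what the LLM sees next (tool output, error message, ...)
    reward: float      # verifier signal, often 0 until the episode ends
    done: bool         # has the task finished (solved, failed, or timed out)?

class Environment:
    def reset(self) -> str:
        """Start an episode; return the initial prompt/observation."""
        raise NotImplementedError

    def step(self, action: str) -> StepResult:
        """Apply the LLM's action (e.g. a tool call or a final answer)."""
        raise NotImplementedError
```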

"Transforming human knowledge sensors and actuators from human-first and human-legible to LLM-first and LLM-legible is a beautiful space with so much potential and so much can be done. One example I'm obsessed with recently - for every textbook pdf/epub there is a perfect "LLMification" of it intended not for human but for an LLM (though it is a non-trivial transformation that would need human in the loop involvement). - All of the exposition is extracted into a markdown document including all latex styling (bold/italic) tables lists etc. All of the figures are extracted as images. - All"
X Link 2025-08-28T18:07Z 1.5M followers, 724.9K engagements

""AI isn't replacing radiologists" good article Expectation: rapid progress in image recognition AI will delete radiology jobs (e.g. as famously predicted by Geoff Hinton now almost a decade ago). Reality: radiology is doing great and is growing. There are a lot of imo naive predictions out there on the imminent impact of AI on the job market. E.g. a year ago I was asked by someone who should know better if I think there will be any software engineers still today. (Spoiler: I think we're going to make it). This is happening too broadly. The post goes into detail on why it's not that simple"
X Link 2025-09-25T14:29Z 1.5M followers, 2.3M engagements

"Something I am experimenting with. I copy pasted: 1) the full podcast transcript 2) the bitter lesson blog post 3) my full post above To ChatGPT. The interesting part is you can fork the conversation context to ask any questions and take it in whatever direction with chat:"
X Link 2025-10-01T17:44Z 1.5M followers, 148.2K engagements

"Hah judging by mentions overnight people seem to find the ghost analogy provocative. I swear I don't wake up just trying to come with new memes but to elaborate briefly why I thought it was a fun comparison: 1) It captures the idea that LLMs are purely digital artifacts that don't interact with the physical world (unlike animals which are very embodied). 2) Ghosts are a kind of "echo" of the living in this case a statistical distillation of humanity. 3) There is an air of mystery over both ghosts and LLMs as in we don't fully understand what they are or how they work. 4) The process of"
X Link 2025-10-02T14:25Z 1.5M followers, 257.8K engagements

"Every company needs a DM POC - someone high up who you can just DM the most obvious things and who shortcuts the PM hierarchy"
X Link 2025-10-04T14:31Z 1.5M followers, 563.5K engagements

"I don't know what labs are doing to these poor LLMs during RL but they are mortally terrified of exceptions in any infinitesimally likely case. Exceptions are a normal part of life and healthy dev process. Sign my LLM welfare petition for improved rewards in cases of exceptions"
X Link 2025-10-09T00:10Z 1.5M followers, 715K engagements

"POV: Your LLM agent is dividing a by b"
X Link 2025-10-09T00:31Z 1.5M followers, 390.3K engagements

"Excited to release new repo: nanochat (it's among the most unhinged I've written). Unlike my earlier similar repo nanoGPT which only covered pretraining nanochat is a minimal from scratch full-stack training/inference pipeline of a simple ChatGPT clone in a single dependency-minimal codebase. You boot up a cloud GPU box run a single script and in as little as X hours later you can talk to your own LLM in a ChatGPT-like web UI. It weighs 8000 lines of imo quite clean code to: - Train the tokenizer using a new Rust implementation - Pretrain a Transformer LLM on FineWeb evaluate CORE score"
X Link 2025-10-13T15:16Z 1.5M followers, 5.8M engagements

"And an example of some of the summary metrics produced by the $XXX speedrun in the report card to start. The current code base is a bit over 8000 lines but I tried to keep them clean and well-commented. Now comes the fun part - of tuning and hillclimbing"
X Link 2025-10-13T15:16Z 1.5M followers, 183.3K engagements

"nanochat d32 i.e. the depth XX version that I specced for $1000 up from $XXX has finished training after XX hours and looks good. All the metrics go up quite a bit across pretraining SFT and RL. CORE score of XXXX is now well above GPT-2 at XXXX. GSM8K went X% - XX% etc. So that's encouraging. The model is pretty fun to talk to but judging from some early interactions I think people have a little bit too much expectation for these micro models. There is a reason that frontier LLM labs raise billions to train their models. nanochat models cost $XXX - $1000 to train from scratch. The $100"
X Link 2025-10-16T00:14Z 1.5M followers, 266.1K engagements

"TV in the 90s: you turn it on you watch. TV 2025: - turn on wait for it to load - popup: TV wants to update 1.5GB. No. - scroll sideways find prime video app or etc - popup: now app wants to update 500MB. No - App launching. App loading - select account screen - 🫠"
X Link 2025-10-16T02:47Z 1.5M followers, 1.7M engagements

"There is a movement I found on Instagram where people delivery choose to live in 90s refusing all technology after 2000. Like an intermediate form of the Amish"
X Link 2025-10-16T02:50Z 1.5M followers, 297.3K engagements

"My pleasure to come on Dwarkesh last week I thought the questions and conversation were really good. I re-watched the pod just now too. First of all yes I know and I'm sorry that I speak so fast :). It's to my detriment because sometimes my speaking thread out-executes my thinking thread so I think I botched a few explanations due to that and sometimes I was also nervous that I'm going too much on a tangent or too deep into something relatively spurious. Anyway a few notes/pointers: AGI timelines. My comments on AGI timelines looks to be the most trending part of the early response. This is"
X Link 2025-10-18T20:23Z 1.5M followers, 4.1M engagements

"I quite like the new DeepSeek-OCR paper. It's a good OCR model (maybe a bit worse than dots) and yes data collection etc. but anyway it doesn't matter. The more interesting part for me (esp as a computer vision at heart who is temporarily masquerading as a natural language person) is whether pixels are better inputs to LLMs than text. Whether text tokens are wasteful and just terrible at the input. Maybe it makes more sense that all inputs to LLMs should only ever be images. Even if you happen to have pure text input maybe you'd prefer to render it and then feed that in: - more information"
X Link 2025-10-20T22:13Z 1.5M followers, 3.3M engagements

"@r_chirra I fixed it :) deployed live now. This was done by doing a round of synthetic data generation to collect a 1000 multi-turn conversations (given a bunch of information including the readme of the nanochat project) and then mixing that into midtraining and SFT. fun"
X Link 2025-10-21T05:36Z 1.5M followers, 456.3K engagements

"Last night I taught nanochat d32 how to count 'r' in strawberry (or similar variations). I thought this would be a good/fun example of how to add capabilities to nanochat and I wrote up a full guide here: This is done via a new synthetic task SpellingBee that generates examples of a user asking for this kind of a problem and an ideal solution from an assistant. We then midtrain/SFT finetune on these to endow the LLM with the capability or further train with RL to make it more robust. There are many details to get right especially at smaller model sizes and the guide steps through them. As a"
X Link 2025-10-24T15:35Z 1.5M followers, 568.4K engagements
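
A rough sketch of what such a synthetic task generator can look like (the real SpellingBee task in nanochat has many more details to get right; this is a simplified stand-in):

```python
# Simplified stand-in for a SpellingBee-style generator: synthesize examples
# of "count letter X in word Y" with a spelled-out reasoning trace.
import random

WORDS = ["strawberry", "raspberry", "blueberry", "banana", "mississippi"]

def make_example(rng: random.Random) -> dict:
    word = rng.choice(WORDS)
    letter = rng.choice(sorted(set(word)))
    count = word.count(letter)
    # Spell the word out letter by letter so the model can "show its work".
    trace = ", ".join(f"{c}{'*' if c == letter else ''}" for c in word)
    return {"messages": [
        {"role": "user", "content": f"How many '{letter}' are in \"{word}\"?"},
        {"role": "assistant",
         "content": f"Spelling it out: {trace}. "
                    f"The letter '{letter}' appears {count} times."},
    ]}

rng = random.Random(42)
dataset = [make_example(rng) for _ in range(1000)]  # mix into midtraining/SFT
```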

"Beautiful technical debugging detective longread that starts with a suspicious loss curve and ends all the way in the Objective-C++ depths of PyTorch MPS backend of addcmul_ that silently fails on non-contiguous output tensors. I wonder how long before an LLM can do all of this"
X Link 2025-10-26T16:24Z 1.5M followers, 601.2K engagements

"Movies are great though. Even if you set aside the pure artistic enjoyment (you shouldnt). Movies are stories and stories are powerful primal moving motivating. They are prompts to you to consider dilemmas and scenarios to build your world model and compass. My rec is to go to the golden age of story telling and movie making that imo ramped up in the 80s was roaring in 90s peaked early 00s and declined since. One sourcing example: pick a random year there look up Oscar winners pick and watch. Enjoy and attend guilt free"
X Link 2025-11-10T08:45Z 1.5M followers, 223.2K engagements

"I took delivery of a beautiful new shiny HW4 Tesla Model X today so I immediately took it out for an FSD test drive a bit like I used to do almost daily for X years. Basically. I'm amazed - it drives really really well smooth confident noticeably better than what I'm used to on HW3 (my previous car) and eons ahead of the version I remember driving up highway XXX on my first day at Tesla X years ago where I had to intervene every time the road mildly curved or sloped. (note this is v13 my car hasn't been offered the latest v14 yet) On the highway I felt like a passenger in some super high tech"
X Link 2025-11-12T20:28Z 1.5M followers, 17.8M engagements

"I am unreasonably excited about self-driving. It will be the first technology in many decades to visibly terraform outdoor physical spaces and way of life. Less parked cars. Less parking lots. Much greater safety for people in and out of cars. Less noise pollution. More space reclaimed for humans. Human brain cycles and attention capital freed up from lane following to other pursuits. Cheaper faster programmable delivery of physical items and goods. It wont happen overnight but there will be the era before and the era after"
X Link 2025-11-13T21:12Z 1.5M followers, 1.5M engagements

"A number of people are talking about implications of AI to schools. I spoke about some of my thoughts to a school board earlier some highlights: X. You will never be able to detect the use of AI in homework. Full stop. All "detectors" of AI imo don't really work can be defeated in various ways and are in principle doomed to fail. You have to assume that any work done outside classroom has used AI. X. Therefore the majority of grading has to shift to in-class work (instead of at-home assignments) in settings where teachers can physically monitor students. The students remain motivated to learn"
X Link 2025-11-24T17:35Z 1.5M followers, 2.5M engagements

"A good chunk of people misunderstood this tweet btw which is my bad. I am not suggesting people use the old style promoting techniques of you are an expert swift programmer or etc. its ok"
X Link 2025-12-09T04:17Z 1.5M followers, 113.8K engagements

"Quick new post: Auto-grading decade-old Hacker News discussions with hindsight I took all the XXX frontpage Hacker News article+discussion of December 2015 and asked the GPT XXX Thinking API to do an in-hindsight analysis to identify the most/least prescient comments. This took X hours to vibe code and X hour and $XX to run. The idea was sparked by the HN article yesterday where Gemini X was asked to hallucinate the HN front page one decade forward. More generally: X. in-hindsight analysis has always fascinated me as a way to train your forward prediction model so reading the results is"
X Link 2025-12-10T17:15Z 1.5M followers, 343K engagements
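
A plausible sketch of the data-collection step, using the public Algolia HN search API (an assumption about one way to fetch the data, not the author's actual script):

```python
# Hedged sketch: pull December-2015 front-page stories from the public
# Algolia Hacker News API (hn.algolia.com). Pagination is elided.
import requests

DEC_2015 = (1448928000, 1451606400)  # 2015-12-01 .. 2016-01-01, UTC epoch

def frontpage_stories(start_ts, end_ts, page=0):
    r = requests.get(
        "https://hn.algolia.com/api/v1/search_by_date",
        params={
            "tags": "front_page",
            "numericFilters": f"created_at_i>={start_ts},created_at_i<{end_ts}",
            "page": page,
        },
        timeout=30,
    )
    r.raise_for_status()
    return r.json()["hits"]

for hit in frontpage_stories(*DEC_2015)[:5]:
    print(hit["title"], hit["objectID"])  # objectID can fetch the comment tree
```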

"New 3h31m video on YouTube: "Deep Dive into LLMs like ChatGPT" This is a general audience deep dive into the Large Language Model (LLM) AI technology that powers ChatGPT and related products. It is covers the full training stack of how the models are developed along with mental models of how to think about their "psychology" and how to get the best use them in practical applications. We cover all the major stages: X. pretraining: data tokenization Transformer neural network I/O and internals inference GPT-2 training example Llama XXX base inference examples X. supervised finetuning:"
X Link 2025-02-05T18:46Z 1.5M followers, 2.4M engagements

"+1 for "context engineering" over "prompt engineering". People associate prompts with short task descriptions you'd give an LLM in your day-to-day use. When in every industrial-strength LLM app context engineering is the delicate art and science of filling the context window with just the right information for the next step. Science because doing this right involves task descriptions and explanations few shot examples RAG related (possibly multimodal) data tools state and history compacting. Too little or of the wrong form and the LLM doesn't have the right context for optimal performance."
X Link 2025-06-25T15:54Z 1.5M followers, 2.4M engagements

"Tinker is cool. If you're a researcher/developer tinker dramatically simplifies LLM post-training. You retain XX% of algorithmic creative control (usually related to data loss function the algorithm) while tinker handles the hard parts that you usually want to touch much less often (infra forward/backward of the LLM itself distributed training) meaning you can do these at well below XX% of typical complexity involved. Compared to the more common and existing paradigm of "upload your data we'll post-train your LLM" this is imo a more clever place to "slice up" the complexity of post-training"
X Link 2025-10-01T19:22Z 1.5M followers, 734.1K engagements

"Nice short post illustrating how simple text (discrete) diffusion can be. Diffusion (i.e. parallel iterated denoising top) is the pervasive generative paradigm in image/video but autoregression (i.e. go left to right bottom) is the dominant paradigm in text. For audio I've seen a bit of both. A lot of diffusion papers look a bit dense but if you strip the mathematical formalism you end up with simple baseline algorithms e.g. something a lot closer to flow matching in continuous or something like this in discrete. It's your vanilla transformer but with bi-directional attention where you"
X Link 2025-10-20T18:58Z 1.5M followers, 858.4K engagements
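
To make "parallel iterated denoising" concrete, here is a minimal sampling-loop sketch under the usual masked discrete-diffusion setup (`model` is a stand-in for any bidirectional transformer over token ids; this is not code from the linked post):

```python
# Minimal masked discrete-diffusion sampling sketch: start from all [MASK]
# tokens, repeatedly re-predict every position with a bidirectional model,
# and each step permanently reveal the most confident remaining positions.
import torch

@torch.no_grad()
def sample(model, length, mask_id, steps=16):
    x = torch.full((1, length), mask_id, dtype=torch.long)
    for step in range(steps):
        still_masked = x == mask_id
        if not still_masked.any():
            break
        logits = model(x)                            # (1, length, vocab)
        conf, pred = logits.softmax(-1).max(-1)      # best token per position
        conf = conf.masked_fill(~still_masked, -1.0) # only masked slots compete
        # reveal a fraction of the most confident remaining positions
        k = max(1, int(still_masked.sum().item() // (steps - step)))
        idx = conf.topk(k, dim=-1).indices
        x[0, idx[0]] = pred[0, idx[0]]
    return x
```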

"As a fun Saturday vibe code project and following up on this tweet earlier I hacked up an llm-council web app. It looks exactly like ChatGPT except each user query is 1) dispatched to multiple models on your council using OpenRouter e.g. currently: "openai/gpt-5.1" "google/gemini-3-pro-preview" "anthropic/claude-sonnet-4.5" "x-ai/grok-4" Then 2) all models get to see each other's (anonymized) responses and they review and rank them and then 3) a "Chairman LLM" gets all of that as context and produces the final response. It's interesting to see the results from multiple models side by side"
X Link 2025-11-22T23:54Z 1.5M followers, 5.2M engagements
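
Step (1) of that flow is straightforward to sketch against OpenRouter's OpenAI-compatible chat endpoint (a minimal illustration, not the actual llm-council app; assumes an OPENROUTER_API_KEY environment variable):

```python
# Fan one user query out to a "council" of models via OpenRouter.
import os
import requests

COUNCIL = [
    "openai/gpt-5.1",
    "google/gemini-3-pro-preview",
    "anthropic/claude-sonnet-4.5",
    "x-ai/grok-4",
]

def ask(model: str, query: str) -> str:
    r = requests.post(
        "https://openrouter.ai/api/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
        json={"model": model, "messages": [{"role": "user", "content": query}]},
        timeout=120,
    )
    r.raise_for_status()
    return r.json()["choices"][0]["message"]["content"]

query = "Why is the sky blue?"
responses = {m: ask(m, query) for m in COUNCIL}   # step 1: collect all answers
# Steps 2 and 3 (anonymized cross-review, then a "Chairman" synthesis) would
# feed `responses` back through ask() with suitable review/synthesis prompts.
```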

"Gemini Nano Banana Pro can solve exam questions in the exam page image. With doodles diagrams all that. ChatGPT thinks these solutions are all correct except Se_2P_2 should be "diselenium diphosphide" and a spelling mistake (should be "thiocyanic acid" not "thoicyanic") :O"
X Link 2025-11-23T18:03Z 1.5M followers, 2.9M engagements

"In today's episode of programming horror. In the Python docs of random.seed() def we're told "If a is an int it is used directly." X But if you seed with X or -X you actually get the exact same rng object producing the same streams. (TIL). In nanochat I was using the sign as a (what I thought was) clever way to get different rng sequences for train/test splits. Hence gnarly bug because now train=test. I found the CPython code responsible in cpython/Modules/_randommodule.c X where on line XXX we see in a comment: "This algorithm relies on the number being unsigned. So: if the arg is a PyLong"
X Link 2025-12-09T03:40Z 1.5M followers, 709.3K engagements
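
The gotcha is easy to reproduce: CPython seeds the Mersenne Twister from the magnitude of an int seed, so the sign is discarded.

```python
import random

a = random.Random(1)
b = random.Random(-1)
# Identical streams: CPython takes the absolute value of an int seed before
# initializing the Mersenne Twister, so +1 and -1 collapse to the same state.
print([a.random() for _ in range(3)])
print([b.random() for _ in range(3)])  # same three numbers

# One fix for distinct train/test streams: derive two distinct non-negative
# seeds from a base seed instead of flipping the sign.
train_rng = random.Random(2 * 1337)
test_rng = random.Random(2 * 1337 + 1)
```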

"@knowtrend_ai Great idea"
X Link 2025-12-10T17:48Z 1.5M followers, 14.9K engagements

@karpathy
/creator/twitter::karpathy