# ![@kalomaze Avatar](https://lunarcrush.com/gi/w:26/cr:twitter::1319397157913436163.png) @kalomaze kalomaze

kalomaze posts on X most often about model, if you, ai, and anthropic. They currently have [------] followers and [---] posts still getting attention, totaling [------] engagements in the last [--] hours.

### Engagements: [------] [#](/creator/twitter::1319397157913436163/interactions)
![Engagements Line Chart](https://lunarcrush.com/gi/w:600/cr:twitter::1319397157913436163/c:line/m:interactions.svg)

- [--] Week [-------] +175%
- [--] Month [---------] +43%
- [--] Months [----------] +99%
- [--] Year [-----------] +585%

### Mentions: [--] [#](/creator/twitter::1319397157913436163/posts_active)
![Mentions Line Chart](https://lunarcrush.com/gi/w:600/cr:twitter::1319397157913436163/c:line/m:posts_active.svg)

- [--] Month [--] -7%
- [--] Months [---] -10%
- [--] Year [-----] +31%

### Followers: [------] [#](/creator/twitter::1319397157913436163/followers)
![Followers Line Chart](https://lunarcrush.com/gi/w:600/cr:twitter::1319397157913436163/c:line/m:followers.svg)

- [--] Week [------] +0.68%
- [--] Month [------] +2.30%
- [--] Months [------] +27%
- [--] Year [------] +241%

### CreatorRank: [-------] [#](/creator/twitter::1319397157913436163/influencer_rank)
![CreatorRank Line Chart](https://lunarcrush.com/gi/w:600/cr:twitter::1319397157913436163/c:line/m:influencer_rank.svg)

### Social Influence

**Social category influence**
[technology brands](/list/technology-brands) 16.98%, [finance](/list/finance) 4.72%, [products](/list/products) 2.83%, [stocks](/list/stocks) 2.83%, [social networks](/list/social-networks) 2.83%, [celebrities](/list/celebrities) 1.89%, [fashion brands](/list/fashion-brands) 0.94%

**Social topic influence**
[model](/topic/model) #3938, [if you](/topic/if-you) 5.66%, [ai](/topic/ai) 5.66%, [anthropic](/topic/anthropic) 4.72%, [macbook](/topic/macbook) #444, [meta](/topic/meta) 2.83%, [the world](/topic/the-world) 2.83%, [core](/topic/core) 2.83%, [generative](/topic/generative) #148, [search](/topic/search) 1.89%

**Top accounts mentioned or mentioned by**
@teortaxestex @scheminglunatic @osoleve @prcrecluse674 @sameqcu @t43736689 @mmjukic @snellingio @keytryer @umi33563 @giffmana @xxshaurizardxx @hypotheosis @itsjustmarky @lauriewired @moonl88537 @jatin_exe @ai_homelab @bmacabeus @slight_blu

**Top assets mentioned**
[Alphabet Inc Class A (GOOGL)](/topic/$googl)

### Top Social Posts
Top posts by engagements in the last [--] hours

"residual encoding of high dimensional continuous data with temporal structure mentioned https://t.co/vArmj0QkET https://t.co/vArmj0QkET"  
[X Link](https://x.com/kalomaze/status/2021858576428675447)  2026-02-12T08:07Z 22.1K followers, 10.3K engagements


"it's a little dizzying that a [--] bit quantization of this should in principle fit on my 128GB ram macbook. and 10b active lives firmly in the "not necessarily FLOPS cucked" regime. Introducing M2.5 an open-source frontier model designed for real-world productivity. - SOTA performance at coding (SWE-Bench Verified 80.2%) search (BrowseComp 76.3%) agentic tool-calling (BFCL 76.8%) & office work. - Optimized for efficient execution 37% faster at complex https://t.co/UwiKzzQNG8 Introducing M2.5 an open-source frontier model designed for real-world productivity. - SOTA performance at coding"  
[X Link](https://x.com/kalomaze/status/2022120620901974033)  2026-02-13T01:28Z 22.1K followers, 20.5K engagements
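
The arithmetic behind this post checks out as a back-of-envelope estimate. A minimal sketch, assuming ~230B total parameters and an effective ~3.4 bits per weight for a Q3_K_S-style quant (both figures drawn from the related posts below, not independently verified):

```python
# Back-of-envelope memory check for the claim above. Assumptions: ~230e9
# total parameters and an effective ~3.4 bits per weight for a
# Q3_K_S-style variable-bitrate quantization.

def quant_footprint_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB at a given effective bitrate."""
    return n_params * bits_per_weight / 8 / 2**30

print(f"{quant_footprint_gib(230e9, 3.4):.1f} GiB")  # ~91 GiB of weights,
# leaving headroom for KV cache and the OS on a 128 GiB machine
```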


"minimax-m2.5's release as open weights is impending. in preparation i wanted to see what my 128gb macbook could do latency wise for m2.1 (same base model earlier iteration post training). Q3_K_S 25t/s at like. 11k tokens context. for a sparse 230b running on battery power"  
[X Link](https://x.com/kalomaze/status/2022173349418611132)  2026-02-13T04:57Z 22.1K followers, [----] engagements


"it has markedly more situational awareness & a stronger ability to do big picture reasoning beyond the immediate next steps v say GLM4.7 if the actual goal benefits from rescoping that is when it rescopes. if it rescopes in a way that maybe violates the spec it caveats and asks"  
[X Link](https://x.com/kalomaze/status/2022230950080585763)  2026-02-13T08:46Z 22.1K followers, [----] engagements


"example: of its own volition it caveated along the lines of "seems related but idt it applies" when finding related git issues to a problem. good signs all around. (the tool wasnt applying to existing opencode convos it got the MCP going & it was right about it not being shown)"  
[X Link](https://x.com/kalomaze/status/2022231776505319469)  2026-02-13T08:50Z 22.1K followers, [----] engagements


"something i did as a quasi bench was watching it spend [--] mins figuring out how to compile an ancient python version (like how i did w Opus) on arm64 & port the updated version of karpathy microgpt. overall it feels. i wanna say tersely deferential in a communicative way"  
[X Link](https://x.com/kalomaze/status/2022234436696490256)  2026-02-13T09:00Z 22.1K followers, [----] engagements


"oh also the important part here is that it accomplished this in [--] minutes for $0.24"  
[X Link](https://x.com/kalomaze/status/2022240928174629361)  2026-02-13T09:26Z 22.1K followers, [----] engagements


"oh that's llama3.1-405b leaked at 3am the day before on 4chan"  
[X Link](https://x.com/kalomaze/status/1815305220118769952)  2024-07-22T08:37Z 22K followers, 305.7K engagements


"Meta papers: guys. what if. *takes hit of blunt* we predicted the next CONCEPT instead of a TOKEN man. or reasoned in LATENT SPACE man. DeepSeek papers: we found a way to make Attention take 10x less memory for the 3rd time this year. its going in the next pretrain btw narrator: they did not kill tokenization narrator: they did not kill tokenization"  
[X Link](https://x.com/kalomaze/status/1916949579154509833)  2025-04-28T20:16Z 22K followers, 199.7K engagements


"there is a NES emulator that simulates the behavior of individual transistors of the semiconductors not cycle level not even gate level transistor level"  
[X Link](https://x.com/kalomaze/status/1976601887601487963)  2025-10-10T10:53Z 22K followers, 981.8K engagements


"pulling out the big guns for this project"  
[X Link](https://x.com/kalomaze/status/1976972682567745995)  2025-10-11T11:26Z 22K followers, 53.3K engagements


"the implementation of loss terms that act as "penalties" or suppressants tends to give me the machine learning equivalent of the ick if your gradient is not going in the direction you want naturally you want to be careful that you are not counteracting it via "duct tape losses""  
[X Link](https://x.com/kalomaze/status/2014055963574120655)  2026-01-21T19:22Z 22K followers, [----] engagements
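
For concreteness, the pattern being criticized usually looks like a weighted auxiliary term added to the task loss. A minimal PyTorch sketch; the penalty choice and the 0.1 coefficient are illustrative, not from the post:

```python
# A minimal sketch of the "duct tape loss" pattern: a task loss plus an
# auxiliary penalty whose gradient can point against the task gradient.
import torch

logits = torch.randn(4, 10, requires_grad=True)
targets = torch.randint(0, 10, (4,))

task_loss = torch.nn.functional.cross_entropy(logits, targets)
penalty = logits.pow(2).mean()      # e.g. an activation-suppression term

total = task_loss + 0.1 * penalty   # the penalty term may counteract the
total.backward()                    # direction the task loss wants to go
```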


"@samgd surely you can at the very least do a stochastic subset where the set actually matches between both"  
[X Link](https://x.com/kalomaze/status/2014280237190906100)  2026-01-22T10:13Z 22K followers, [---] engagements


"that girl is a real claude pleaser"  
[X Link](https://x.com/kalomaze/status/2014508934812598280)  2026-01-23T01:22Z 22.1K followers, 44.1K engagements


"i mostly believe this for models that are by all meaningful measures beyond capable as raw foundations (dsv3 kimi glm4.5 trinity large.) i feel the pretraining gap becomes way more uncomfortable when you're in the 3090-runnable tier language model regime at say 30b"  
[X Link](https://x.com/kalomaze/status/2017752563995840992)  2026-02-01T00:11Z 22K followers, [----] engagements


"i'm not saying trinity large just to gas my people up it's up there. same with l3 405b base with the main distinction being that it's 30x cheaper than 405b and therefore actually practical to do real post training work on"  
[X Link](https://x.com/kalomaze/status/2017754396382400731)  2026-02-01T00:18Z 22K followers, [----] engagements


"noooooooooooooooo aaaaarghhhh people keep falling for this fantasy idea of abandoning sgd/backprop structure for pure search when the objective itself should be the thing extracting the information theoretical signal Why Solomonoff Induction It's provably optimal for prediction. The idea is simple: search for all programs that fit the data and favor low-complexity ones. Since it's uncomputable we're building a practical approximation in the context of neural nets. Why Solomonoff Induction It's provably optimal for prediction. The idea is simple: search for all programs that fit the data and"  
[X Link](https://x.com/kalomaze/status/2018971310358142986)  2026-02-04T08:54Z 22.1K followers, 13.9K engagements


"i found a particularly nonsensical sample when working on filtering an OSS instruction following dataset and somehow it was so word salady that the Anthropic classifier assumed it was a jailbreak (when i gave it to Opus)"  
[X Link](https://x.com/kalomaze/status/2019241941268066586)  2026-02-05T02:49Z 22.1K followers, [----] engagements


"we will see which model is the real agi"  
[X Link](https://x.com/kalomaze/status/2019243423723868382)  2026-02-05T02:55Z 22K followers, [----] engagements


"deep learning can and will conquer anything Claude Opus [---] (120K Thinking) on ARC-AGI Semi-Private Eval Max Effort: - ARC-AGI-1: 93.0% $1.88/task - ARC-AGI-2: 68.8% $3.64/task New ARC-AGI SOTA model from @AnthropicAI https://t.co/rfjhpp2B6G Claude Opus [---] (120K Thinking) on ARC-AGI Semi-Private Eval Max Effort: - ARC-AGI-1: 93.0% $1.88/task - ARC-AGI-2: 68.8% $3.64/task New ARC-AGI SOTA model from @AnthropicAI https://t.co/rfjhpp2B6G"  
[X Link](https://x.com/kalomaze/status/2019487692732658136)  2026-02-05T19:06Z 22.1K followers, 14.3K engagements


"specifically the term i want would be for describing behavior that is compartmentalized & learned in pursuit of some other goal that it may *genuinely be achieving* despite that behavior serving no apparent functional purpose for goal achievement. "reward stimming" perhaps"  
[X Link](https://x.com/kalomaze/status/2019497095242346979)  2026-02-05T19:43Z 22K followers, [----] engagements


"the lesswrong people probably already have a coinage for this somewhere"  
[X Link](https://x.com/kalomaze/status/2019498314627182672)  2026-02-05T19:48Z 22K followers, [----] engagements


"@iScienceLuvr @redtachyon not if it's a giant phi-like"  
[X Link](https://x.com/kalomaze/status/2019502884187173125)  2026-02-05T20:06Z 22K followers, [--] engagements


"a project m weekly local in san francisco would heal my soul"  
[X Link](https://x.com/kalomaze/status/2019524313213464743)  2026-02-05T21:31Z 22K followers, [----] engagements


"there is no wall we continue going up the curve and it feels just as weird and thrilling as I thought it would we continue going up the curve and it feels just as weird and thrilling as I thought it would"  
[X Link](https://x.com/kalomaze/status/2019678128566849898)  2026-02-06T07:42Z 22K followers, [----] engagements


"in spite of all their edges in spite of all their inelegances in spite of all their pain points the kind that engineers all around the world stomach as they carve out the shape of the future. deep neural networks still work"  
[X Link](https://x.com/kalomaze/status/2019681802613658102)  2026-02-06T07:57Z 22K followers, [----] engagements


"this really my day job"  
[X Link](https://x.com/kalomaze/status/2019716907415544276)  2026-02-06T10:16Z 22K followers, [----] engagements


"@JasonBotterill @scaling01 i doubt it anthropic has avoided the "demo the model early in public" trap before (no early launches on lmsys arena)"  
[X Link](https://x.com/kalomaze/status/2020032699092881768)  2026-02-07T07:11Z 22K followers, [--] engagements


".@Grad62304977 is basically the LeBron James of finding obscure chinese RL papers"  
[X Link](https://x.com/kalomaze/status/2020255301753270357)  2026-02-07T21:56Z 22K followers, 22.1K engagements


"@1bit2far if you can strike the jugular of general purpose longform software engineering you can accelerate everything else where the "hard parts" of the problem rest in knowledge worker execution rather than physical logistics ergo "it's so over when claude code gets good at CAD""  
[X Link](https://x.com/kalomaze/status/2020271024550605066)  2026-02-07T22:58Z 22K followers, [---] engagements


"basically this. the idea of agentic pretraining is somewhat oxymoronic it already beats the best open-source models before any fine-tuning or RLHF. Just pretraining bearish on Avocado's potential if true The kinds of skills modern models show 100+ step agentic traces etc should not be possible with good faith pretraining it already beats the best open-source models before any fine-tuning or RLHF. Just pretraining bearish on Avocado's potential if true The kinds of skills modern models show 100+ step agentic traces etc should not be possible with good faith pretraining"  
[X Link](https://x.com/kalomaze/status/2020658297561616860)  2026-02-09T00:37Z 22.1K followers, [----] engagements


"there's a LOT you can do to diversify the midtraining stage in a way that's aligned w/ better modeling the core corpus relevant to humans. things like pretraining on super long CoT traces instead of letting a majority of the TTC benefits fall out of post training .hm ngmi"  
[X Link](https://x.com/kalomaze/status/2020663542727516311)  2026-02-09T00:58Z 22.1K followers, [----] engagements


"@osoleve a lot of people (even at OpenAI) i suspect have this kind of pigeonholing for what synthetic data is supposed to be about. i suspect one of the problems in Orion was treating synth data as basically "roids" rather than "carefully constructing exercises to strengthen core muscles""  
[X Link](https://x.com/kalomaze/status/2020664379122008307)  2026-02-09T01:01Z 22K followers, [---] engagements


"@osoleve this stuff is naturally the tail less than 0.1% of all data easily surely Anthropic is not pretraining on enough of this that core abilities are compromised & the outliers that do exist prolly teach something useful abt how embeddings relate in terms of perceptual arrangement"  
[X Link](https://x.com/kalomaze/status/2020674369715126776)  2026-02-09T01:41Z 22K followers, [--] engagements


"@osoleve short term planning about ASCII is admittedly kind of stupid but it's the type of stupid that is ood in a way that easily demonstrates some degree of non-language transfer "loss curve that spikes" almost certainly does not exist on the natural internet in an ASCII representation"  
[X Link](https://x.com/kalomaze/status/2020675902745804816)  2026-02-09T01:47Z 22K followers, [--] engagements


"is freezing vision encoder components on modern vlms justified is there strong evidence showing why you shouldn't iirc qwen2.5vl did joint pretraining for a long ass time and then. chose to freeze it for instruction tuning anyways. is it just cargo cult at that point"  
[X Link](https://x.com/kalomaze/status/2020740401360220320)  2026-02-09T06:03Z 22.1K followers, 12.8K engagements
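
Mechanically, the freezing in question is a few lines of framework code. A minimal PyTorch sketch; `model.visual` is a hypothetical attribute name, since real VLM codebases name the vision tower differently:

```python
# What "freezing the vision encoder" amounts to in practice.
import torch

def freeze_vision_tower(model: torch.nn.Module) -> None:
    for p in model.visual.parameters():
        p.requires_grad = False  # excluded from optimizer updates
    model.visual.eval()          # also fix norm/dropout statistics
```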


"i (as well as many other people in similar positions to me) owe The Whale a lot of things. The Whale is a humble enough creature to not ask for any of those things in return. but what i will always give them is my respect @eliebakouch I'd go so far as to say that DeepSeek made LLMs science again bridged the gap between "open research" and "breakthrough" pulled the world kicking and screaming out of the dark age of frontier lab superstition and viral mythmaking marketing it was so obnoxious. I'm so thankful https://t.co/olt1PwmGxt @eliebakouch I'd go so far as to say that DeepSeek made LLMs"  
[X Link](https://x.com/kalomaze/status/2021660203184533727)  2026-02-11T18:58Z 22.1K followers, 10K engagements


"the creator of Claude Code liked this post. having sex is crazy cause its like the claude code of bringing life into this world having sex is crazy cause its like the claude code of bringing life into this world"  
[X Link](https://x.com/kalomaze/status/2021707729107394832)  2026-02-11T22:07Z 22.1K followers, [----] engagements


"i made a modified version of this called microgpt_y2k6 designed to work on python2.5 (which came out [--] months before Google fully acquired YouTube in 2006) New art project. Train and inference GPT in [---] lines of pure dependency-free Python. This is the *full* algorithmic content of what is needed. Everything else is just for efficiency. I cannot simplify this any further. https://t.co/HmiRrQugnP New art project. Train and inference GPT in [---] lines of pure dependency-free Python. This is the *full* algorithmic content of what is needed. Everything else is just for efficiency. I cannot"  
[X Link](https://x.com/kalomaze/status/2021747215652598269)  2026-02-12T00:44Z 22.1K followers, 14.7K engagements


"i am asserting that you'd just need to create something that stacks invariant bs transformations from a predefined set stack 'em *n* times in a compositional way check for if it compiles to same ASM then train a classifier on this v. human C (regularize var names in some obivous way if you lack symbols & strip comments ofc) https://twitter.com/i/web/status/2021784613950296444 https://twitter.com/i/web/status/2021784613950296444"  
[X Link](https://x.com/kalomaze/status/2021784613950296444)  2026-02-12T03:13Z 22.1K followers, [---] engagements
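
A sketch of the generation loop this post proposes, under loose assumptions: the two transforms shown are trivially compilation-invariant stand-ins (comments, whitespace), where a real pipeline would restructure loops, rename locals, and so on:

```python
# Stack n semantics-preserving rewrites of a C file, recompile, and keep
# only variants whose object bytes still match the original.
import pathlib
import random
import subprocess
import tempfile

def compile_bytes(c_source: str) -> bytes:
    """Compile with fixed flags and return the raw object-file bytes."""
    with tempfile.TemporaryDirectory() as d:
        src, obj = pathlib.Path(d, "x.c"), pathlib.Path(d, "x.o")
        src.write_text(c_source)
        subprocess.run(["gcc", "-O2", "-c", str(src), "-o", str(obj)],
                       check=True)
        return obj.read_bytes()

TRANSFORMS = [
    lambda s: "/* bs transform */\n" + s,   # inert comment
    lambda s: s.replace("int ", "int  "),   # whitespace jitter
]

def variant(source: str, n: int) -> str | None:
    out = source
    for t in random.choices(TRANSFORMS, k=n):
        out = t(out)
    # accept only if the stacked rewrite really compiles to the same bytes
    return out if compile_bytes(out) == compile_bytes(source) else None
```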


"@scheminglunatic no i understand why you say this i'm saying that you're saying this downstream of open source tooling being not good enough for MoE right now as it should be"  
[X Link](https://x.com/kalomaze/status/2021799555348803660)  2026-02-12T04:12Z 22.1K followers, [---] engagements


"@scheminglunatic it's genuinely the rational choice given your constraints and the ecosystem maturity gap basically my point is that if you're to hope for something you ought to hope that the ecosystem gets less shitty for MoE lol"  
[X Link](https://x.com/kalomaze/status/2021801072860471688)  2026-02-12T04:18Z 22.1K followers, [---] engagements


"@teortaxesTex this is a problem that probably yields to autistically breadth over depth scale and better engineering over taste or special insight into the structure of a problem which is an exactly Elon Musk shaped problem to solve"  
[X Link](https://x.com/kalomaze/status/2021841926568427993)  2026-02-12T07:00Z 22.1K followers, [---] engagements


"@leaguepublicacc tech twitter doesn't register this as ingroup signaling because they aren't really familiar with the specific genre of fandom culture ingroup where performed aesthetic rejection of ai does numbers"  
[X Link](https://x.com/kalomaze/status/2021856048890667184)  2026-02-12T07:57Z 22.1K followers, [----] engagements


"@T43736689 codex over the command line is most definitely not useless if you are doing anything that looks like infrastructure work but plenty of people have already made that point for me"  
[X Link](https://x.com/kalomaze/status/2022035001030905992)  2026-02-12T19:48Z 22.1K followers, [---] engagements


"@xXshaurizardXx well ofc THE source in a platonic ideal dense is sometimes legit unachievable esp when you have obfuscation or stripped symbols. i mainly suspect that heuristics vetting semantical sanity (soft verifier) + constraints forcing instruction accuracy (hard verifier) compound well"  
[X Link](https://x.com/kalomaze/status/2022062471197114378)  2026-02-12T21:37Z 22.1K followers, [--] engagements


"@qtnx_ let's verify the unverifiable"  
[X Link](https://x.com/kalomaze/status/1866134525912519096)  2024-12-09T14:55Z 22.1K followers, 195.7K engagements


"zero shot frontend tests are quite possibly inversely correlated with how well something does well in a practical sense at true long multiturn for real projects MiniMax M2.5 is benchmaxed. I gave [--] models the same prompt: Create a neon "OPEN" sign in HTML. GLM 5: Clean classic neon. Nailed it. Claude Opus 4.6: Stylized with glow. Solid. Gemini [--] Pro: Cursive with bloom lighting. Creative. MiniMax M2.5: Spelled it "O b N" with https://t.co/nSBExIJIB1 MiniMax M2.5 is benchmaxed. I gave [--] models the same prompt: Create a neon "OPEN" sign in HTML. GLM 5: Clean classic neon. Nailed it. Claude Opus"  
[X Link](https://x.com/kalomaze/status/2022426522825691221)  2026-02-13T21:43Z 22.1K followers, 12.8K engagements


"look up [--] on google"  
[X Link](https://x.com/kalomaze/status/2022437468445061336)  2026-02-13T22:27Z 22.1K followers, [----] engagements


"i had a dream that i went to dairy queen with nick fuentes and asked him about what society would be like if homesteading became widely embraced and he was just silent/contemplative and then when we left he made me throw away my food even though i wasn't done eating and i was mad"  
[X Link](https://x.com/kalomaze/status/2022803802743849295)  2026-02-14T22:43Z 22.1K followers, [----] engagements


"@teortaxesTex @mmjukic but also they believe in generative RMs for diffusers and afaic nobody else is scaling full trajectory RL diffusion. this possibly matters more"  
[X Link](https://x.com/kalomaze/status/2022807058236444939)  2026-02-14T22:55Z 22.1K followers, [---] engagements


"lol the LaTeX makes really simple importance ratio weighting math look intimidating Just remember while you fiddle around with your AI models AI researchers from around the world are sweating their asses off trying to pump these new models out on a weekly basis and doing math like this shit which was in the MimiMax M2.5 release notes: https://t.co/Y07790AeFU Just remember while you fiddle around with your AI models AI researchers from around the world are sweating their asses off trying to pump these new models out on a weekly basis and doing math like this shit which was in the MimiMax M2.5"  
[X Link](https://x.com/kalomaze/status/2022142530322829378)  2026-02-13T02:55Z 22.1K followers, 17.2K engagements
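
For reference, the math being described is the standard clipped importance-ratio objective from PPO-style RL; this is the generic textbook form, not a reproduction of the MiniMax M2.5 release notes:

```latex
% Generic clipped importance-ratio objective, where r_t is the ratio of
% new-policy to old-policy probabilities and \hat{A}_t is the advantage.
\mathcal{L}(\theta) = \mathbb{E}_t\!\left[
  \min\!\left( r_t(\theta)\,\hat{A}_t,\;
  \operatorname{clip}\!\left(r_t(\theta),\,1-\epsilon,\,1+\epsilon\right)\hat{A}_t
  \right)\right],
\qquad
r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)}
```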


"@teortaxesTex well it's not just the prose that's just a surface feature it's also better with epistemics in a way that isn't shaped like GPT5's constant pedantry & other things that are meaningfully distinct on a behavioral level"  
[X Link](https://x.com/kalomaze/status/2022177794726474200)  2026-02-13T05:15Z 22.1K followers, [----] engagements


"this person is probably not "chronically unemployed". it is quite plausible that they're i don't know a barber a caregiver a barista a security guard one of the jobs from the subset of the economy which doesn't depend on working with structured digital information everyday"  
[X Link](https://x.com/kalomaze/status/2021811752565096475)  2026-02-12T05:00Z 22.1K followers, [----] engagements


"there is no reason why they have to mog on multimodal or HLE but consistently drop the ball wrt swe agents something something "i asked a deepmind employee why not make their models good for longform programming and he said 'we cant we don't know how to do it'""  
[X Link](https://x.com/kalomaze/status/2022029703243346288)  2026-02-12T19:27Z 22.1K followers, [----] engagements


"minimax-m2.5 seems to be a genuinely competent agent model. whenever it makes failures it smoothly recovers in predictable ways. most notably it doesn't overreach or engage in "by any means" type myopic goal seeking. it feels like there's grounded deliberation & strategy"  
[X Link](https://x.com/kalomaze/status/2022229301308047632)  2026-02-13T08:40Z 22.1K followers, 22.4K engagements


"@hypotheosis_ speaking of audio autoencoders most of them that exist off the shelf are overfit to mono human speech style vocals with conv bottlenecks 🥀"  
[X Link](https://x.com/kalomaze/status/2022811454211342718)  2026-02-14T23:13Z 22.1K followers, [---] engagements


"more evidence for the chinese century okay this is awesome cc: @deepfates and @LighthavenPR https://t.co/gQ1kyrIO7W okay this is awesome cc: @deepfates and @LighthavenPR https://t.co/gQ1kyrIO7W"  
[X Link](https://x.com/kalomaze/status/2005128797063184637)  2025-12-28T04:08Z 22.1K followers, 39.1K engagements


"the whole point of my linked thread (idk if you skimmed it and missed) was that the naive verifier by itself would reward hack (obviously) but that the problem of trying to distinguishing between things that are legible idiomatic C and are not is probably a clear enough signal to build soft verifiers and classifiers for i.e youd have a -soft verifier- that you stack ON TOP of the byte match acc and of course obfuscation not having symbols hyperaggressive opt flags etc makes things harder no doubt but for things like i.e GameCube games that shipped with symbols that have reasonably loose"  
[X Link](https://x.com/kalomaze/status/2022070739156025596)  2026-02-12T22:10Z 22.1K followers, [---] engagements
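
One plausible way to compose the two verifiers described here, as a sketch; `idiom_classifier` is a hypothetical callable returning the probability that source text is human-written idiomatic C:

```python
# Gate a learned "legible idiomatic C" score by the hard compile check,
# i.e. a soft verifier stacked on top of byte-match accuracy.

def reward(cand_src: str, cand_obj: bytes, ref_obj: bytes,
           idiom_classifier) -> float:
    if cand_obj != ref_obj:            # hard verifier: bytes must match
        return 0.0
    return idiom_classifier(cand_src)  # soft verifier: score in [0, 1]
```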


"@snellingio yeah and minimax are l3-style attention purists (GQA) vs deepseek when it comes to long context so you can fit at most 32k-ish on q3_K_S without [--] bit kv type interventions"  
[X Link](https://x.com/kalomaze/status/2022135855377527017)  2026-02-13T02:28Z 22.1K followers, [---] engagements
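
The context ceiling follows from KV-cache arithmetic: even with GQA the cache grows linearly in context length, and the quantized weights already occupy most of the 128 GB. A sketch with illustrative architecture numbers, not MiniMax's actual config:

```python
# Rough KV-cache sizing for a GQA model at a given context length.

def kv_cache_gib(layers: int, kv_heads: int, head_dim: int, ctx: int,
                 bytes_per_elem: int = 2) -> float:
    """K and V caches, fp16 by default; use 1 byte/elem for 8-bit KV."""
    return 2 * layers * kv_heads * head_dim * ctx * bytes_per_elem / 2**30

print(kv_cache_gib(layers=60, kv_heads=8, head_dim=128, ctx=32_768))
# ~7.5 GiB at fp16; halving the KV precision roughly doubles the context
# that fits in the same budget
```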


"@PRCrecluse674 they did it in a form factor that seems predictable and is reasonably honest when it catches itself fucking up (this is not like Qwen for example). what i care about for a cheap agent model is not peak performance at the highest echelon as much as 'does it fail gracefully'"  
[X Link](https://x.com/kalomaze/status/2022246312142606435)  2026-02-13T09:47Z 22.1K followers, [---] engagements


"@teortaxesTex @mmjukic actually not possibly but definitely the more i think about it"  
[X Link](https://x.com/kalomaze/status/2022819292287230233)  2026-02-14T23:44Z 22.1K followers, [--] engagements


"@niklassheth @shatterspine of the valid superset the verifier would accept (all possible programs that can compile to these bytes) the human written program is going to obviously stand out in ways that are hilariously detectable/predictable via learned classification the superset is mostly nonsense"  
[X Link](https://x.com/kalomaze/status/2021779387360879077)  2026-02-12T02:52Z 22.1K followers, [----] engagements


"@teortaxesTex a permanent underclass escapee in action"  
[X Link](https://x.com/kalomaze/status/2022028826449227879)  2026-02-12T19:23Z 22.1K followers, [----] engagements


"@KeyTryer i feel like i'm insane because everyone talking about the release is like "they must have some bespoke behind the scenes compositor / harness" instead of looking at the papers & consistently noticing how much they embrace doing the obvious but hard thing (RL scaling on diffusion)"  
[X Link](https://x.com/kalomaze/status/2022042376752263254)  2026-02-12T20:17Z 22.1K followers, [---] engagements


"@itsjustmarky m2.1 was running at 3.4ish bpw equivalent for me last night https://x.com/kalomaze/status/2022173349418611132s=46 minimax-m2.5's release as open weights is impending. in preparation i wanted to see what my 128gb macbook could do latency wise for m2.1 (same base model earlier iteration post training). Q3_K_S 25t/s at like. 11k tokens context. for a sparse 230b running on battery power. https://t.co/pEvwANKRxR https://x.com/kalomaze/status/2022173349418611132s=46 minimax-m2.5's release as open weights is impending. in preparation i wanted to see what my 128gb macbook could do"  
[X Link](https://x.com/kalomaze/status/2022421776018633152)  2026-02-13T21:25Z 22.1K followers, [--] engagements


"i honestly think you don't need a phone case anymore"  
[X Link](https://x.com/kalomaze/status/1967011523211215291)  2025-09-13T23:44Z 22.1K followers, 172K engagements


"reverse engineering the source code of arbitrary binaries via decompilation of assembly code (code that compiles back to same ASM bytes % match acc) verifiability is magic - not sure what verifiable problems cant be solved with AI what are the biggest open problems that have perfect verifiers verifiability is magic - not sure what verifiable problems cant be solved with AI what are the biggest open problems that have perfect verifiers"  
[X Link](https://x.com/kalomaze/status/2021704033371824444)  2026-02-11T21:52Z 22.1K followers, 24.5K engagements
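
A minimal sketch of how "% match acc" against reference bytes could be scored; a real pipeline would need to handle padding, alignment, and metadata sections, which this ignores:

```python
# Turn "compiles back to the same ASM bytes" into a scalar reward by
# comparing raw object bytes positionally.

def byte_match_acc(cand_obj: bytes, ref_obj: bytes) -> float:
    """Fraction of byte positions where candidate and reference agree."""
    if not cand_obj and not ref_obj:
        return 1.0
    matches = sum(a == b for a, b in zip(cand_obj, ref_obj))
    return matches / max(len(cand_obj), len(ref_obj))

assert byte_match_acc(b"\x90\x90", b"\x90\x90") == 1.0
```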


"@scheminglunatic barring maintainer skill issues the MoE is going to be better & train faster for the guy training on a [----] box AND the guy sshing into [--] B200s"  
[X Link](https://x.com/kalomaze/status/2021799640195379631)  2026-02-12T04:12Z 22.1K followers, [--] engagements


"@lauriewired i'll check out the keynote regardless. but this feels. harshly worded to me (ftr i am okay with harsh words if the words are in pursuit of an argument that demonstrates what it is exactly that i am missing)"  
[X Link](https://x.com/kalomaze/status/2022052499780428080)  2026-02-12T20:57Z 22.1K followers, [---] engagements


"@umi33563 @hypotheosis_ these papers were basically saying "autoregression seems provably optimal for high dim learned compression" before GPT2 btw"  
[X Link](https://x.com/kalomaze/status/2022816585577304219)  2026-02-14T23:33Z 22.1K followers, [--] engagements


"@KeyTryer other things pigeonholed to llms that have no reason not to work for diffusion: - finegrained sparsity/MoE - temporal/causal factorization of input space (vs latents that can only represent 5-15 seconds) meanwhile academics are trying to diffuse. language. as a research fad"  
[X Link](https://x.com/kalomaze/status/2022043314720321799)  2026-02-12T20:21Z 22.1K followers, [---] engagements


"@KeyTryer ironically reward models and maximization of them in a loop across multiple rounds is quite spiritually close to the underlying intuition that Yann LeCun is pointing to when he talks about EBMs and world models except made practical through. *generative* VLM reward models"  
[X Link](https://x.com/kalomaze/status/2022047148788134011)  2026-02-12T20:36Z 22.1K followers, [---] engagements


"@snellingio basically all modern quants that exist aren't exactly [--] bit to begin with and are VBR depending on the tensor so even if you drop down to like 3.7bpw effective should be not awful at all"  
[X Link](https://x.com/kalomaze/status/2022132745431060989)  2026-02-13T02:16Z 22.1K followers, [---] engagements


"state of the art extended chain of thought reasoning"  
[X Link](https://x.com/kalomaze/status/2020665838433665165)  2026-02-09T01:07Z 22.1K followers, 34.4K engagements


"if this post surprises you (it apparently did for a lot of people) then you probably have not great theory of mind for people who are not software engineers i just found out chatgpt has a SUBSCRIPTION service WHO IS PAYING IM LAUGHING SO HARD RN i just found out chatgpt has a SUBSCRIPTION service WHO IS PAYING IM LAUGHING SO HARD RN"  
[X Link](https://x.com/kalomaze/status/2021810086407245894)  2026-02-12T04:54Z 22.1K followers, 46.8K engagements


"tool use free coding eval man it's like they are cursed or something The latest Deep Think moves beyond abstract theory to drive practical applications. Its state-of-the-art on ARC-AGI-2 a benchmark for frontier AI reasoning. On Humanitys Last Exam it sets a new standard tackling the hardest problems across mathematics science and https://t.co/Cm0PYDd2Cn The latest Deep Think moves beyond abstract theory to drive practical applications. Its state-of-the-art on ARC-AGI-2 a benchmark for frontier AI reasoning. On Humanitys Last Exam it sets a new standard tackling the hardest problems across"  
[X Link](https://x.com/kalomaze/status/2022027132411818187)  2026-02-12T19:16Z 22.1K followers, 27.9K engagements


"i mean realistically cloud go brr right but on-device quasi-Opus for SWE shit is like. whew"  
[X Link](https://x.com/kalomaze/status/2022121497448587722)  2026-02-13T01:31Z 22.1K followers, [----] engagements


".@teortaxesTex also this thing is qualitatively more ensouled via the generative RM as distillation proxy reward pipeline than whatever alpacamaxxing on SFT outputs thing they're doing to GLM5. the prose isn't bad"  
[X Link](https://x.com/kalomaze/status/2022176903311073348)  2026-02-13T05:11Z 22.1K followers, [----] engagements


"@teortaxesTex @mmjukic tho very few companies on earth (DeepMind ByteDance maybe Meta if they didn't have skill issues) can afford to do fully differentiable trajectory RL without policy gradient approximations hence why offline DPO-esque cope is occasionally seen in academic diffusion work"  
[X Link](https://x.com/kalomaze/status/2022821422146687002)  2026-02-14T23:53Z 22.1K followers, [--] engagements


"it's funny how much alpha lies in just. reading the older work of people who went onto big labs"  
[X Link](https://x.com/kalomaze/status/1977476547515707463)  2025-10-12T20:48Z 22K followers, 94.8K engagements


"chatgpt launch - accidental consumer breakout hit image generation via dalle/dalle2 - too niche/controversial to be adopted for anything serious gpt store - there was nothing here "laundry buddy" was worthless sora [--] - see #2 (except not niche anymore it's just as controversial though) yeah i don't think we're gonna see another breakout consumer AI thing and it's just going to quietly become basic infrastructure OpenAI seems to believe otherwise despite the fact that they have only ever produced a breakout hit by mistake https://twitter.com/i/web/status/1978238171650437254"  
[X Link](https://x.com/kalomaze/status/1978238171650437254)  2025-10-14T23:15Z 22K followers, 12.9K engagements


"my honest advice to people who this resonated with: spend less time reading shiny papers & more time working on the "boring" things focus on the basics. by basics i mean like deduplication (on the data side) understanding dp/tp/pp abstractions (on the training side) etc Genuine question: where does one even go to learn a lot of this stuff I doubt it is in school. Do people find resources online and self study Is this just a sign of the sector maturing and people are expected to learn these skills at other jobs Ive learned a decent amount of Genuine question: where does one even go to learn a"  
[X Link](https://x.com/kalomaze/status/1983592989911138780)  2025-10-29T17:53Z 22K followers, 133.7K engagements


"i think we should prefer [--] death caused by an autonomous vehicle to [--] deaths caused by humans https://t.co/NC5kdpHUcW https://t.co/NC5kdpHUcW"  
[X Link](https://x.com/anyuser/status/1984668605242630146)  2025-11-01T17:07Z 22K followers, 49.8K engagements


"@_masterofwolves when your money up but your honey down"  
[X Link](https://x.com/kalomaze/status/1987246431376195729)  2025-11-08T19:50Z 22K followers, 91.4K engagements


"RL LEARNING WITH LORA: A DIVERSE DEEP DIVE"  
[X Link](https://x.com/kalomaze/status/1987372126220001393)  2025-11-09T04:10Z 22K followers, 225K engagements


"honestly he's right Rep. Brad Sherman (D-CA) denies he was looking at pornography on plane says: This was on Twitter. These pictures came up on For You. https://t.co/URpx0i1EP2 Rep. Brad Sherman (D-CA) denies he was looking at pornography on plane says: This was on Twitter. These pictures came up on For You. https://t.co/URpx0i1EP2"  
[X Link](https://x.com/kalomaze/status/1989777742808973656)  2025-11-15T19:29Z 22K followers, 10.3M engagements


"glm flash xml instruction following 😬"  
[X Link](https://x.com/kalomaze/status/2016018582149501296)  2026-01-27T05:21Z 21.9K followers, [----] engagements


"yes old kimi k2 had "more soul" but a bigger overarching problem is that neither of the kimis during post training has moved the needle wrt epistemic humility & detail embellishment that is to say kimi isnt very good at saying "i don't know" or at calibrating its uncertainty"  
[X Link](https://x.com/kalomaze/status/2016093733532663872)  2026-01-27T10:19Z 21.9K followers, 11.6K engagements


"@MoonL88537 the joys of random walks through claudespace"  
[X Link](https://x.com/kalomaze/status/2017432961860399168)  2026-01-31T03:01Z 22K followers, [--] engagements


"@MoonL88537 i think there is something beautiful about /r9k/ on this actually being populated by what are (essentially) robots"  
[X Link](https://x.com/kalomaze/status/2017433151564595414)  2026-01-31T03:02Z 22K followers, [--] engagements


"at some point in time it's prolly going to come out that at least one of (or multiple) of the frontier labs is doing some obvious in hindsight thing like "pretrain on constructed diff histories across time not just isolated code snippets" while open weights ones usually aren't"  
[X Link](https://x.com/kalomaze/status/2017750130754261301)  2026-02-01T00:01Z 21.9K followers, 24.9K engagements


"@cis_female i think there is a gap between "constructed" and "literal git histories as is" that people did not get about my op"  
[X Link](https://x.com/kalomaze/status/2018387995595010084)  2026-02-02T18:16Z 21.9K followers, [---] engagements


"this is funny af for a super bowl ad but i think it requires too much "inside baseball" knowledge for it to broadly resonate anthropic's mindshare is disproportionately in enterprise (not consumers) & they also don't seem too interested in providing a usable free plan Ads are coming to AI. But not to Claude. Keep thinking. https://t.co/n2yECeBWyT Ads are coming to AI. But not to Claude. Keep thinking. https://t.co/n2yECeBWyT"  
[X Link](https://x.com/kalomaze/status/2019176245070819777)  2026-02-04T22:28Z 22K followers, [----] engagements


"basically. this ad hits hardest if you know enough about the field to know about Anthropic which is not most consumers atm lol not quite a pepsi vs coke situation. for now"  
[X Link](https://x.com/kalomaze/status/2019182453697696053)  2026-02-04T22:53Z 22K followers, [----] engagements


"getting word that like 80% of the llama4 team at Meta has resigned"  
[X Link](https://x.com/anyuser/status/1923431110962204680)  2025-05-16T17:31Z 22.1K followers, 1.3M engagements


""are you sure this will help us achieve AGI" "AGI." https://t.co/mgLcutGsJI https://t.co/mgLcutGsJI"  
[X Link](https://x.com/kalomaze/status/1944723753973252247)  2025-07-14T11:40Z 22.1K followers, 41.1K engagements


"this week has already been a very good week for foundation models. but it will be an even better week very very soon"  
[X Link](https://x.com/kalomaze/status/2016162499155132822)  2026-01-27T14:52Z 22.1K followers, 29.8K engagements


"this was posted at [--] in the morning central time Grok Imagine prompt: She smiles and says I will always love you https://t.co/cjDu3MuDCZ Grok Imagine prompt: She smiles and says I will always love you https://t.co/cjDu3MuDCZ"  
[X Link](https://x.com/kalomaze/status/1987245896975982944)  2025-11-08T19:48Z 22.1K followers, 1.8M engagements


"@noah_vandal it's very high leverage to just sample & view random entries - find failures of criteria being met manually set aside corrected judgements for failures as eval - tweak curation pipeline to compensate based on the ad-hoc eval select for best accuracy - repeat more than once"  
[X Link](https://x.com/anyuser/status/1939802531636634037)  2025-06-30T21:45Z 22.1K followers, 18.9K engagements
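
A toy, self-contained rendering of that loop, where the "pipeline" is just a score threshold and the hand-corrected judgements are simulated; every name and number here is illustrative:

```python
# Sample entries, treat corrected judgements as an ad-hoc eval set, then
# keep the pipeline variant (here: a threshold) that best matches them.
import random

random.seed(0)
data = [{"score": random.random()} for _ in range(1000)]
for d in data:                 # ground-truth stand-in; in practice these
    d["ok"] = d["score"] > 0.6 # labels come from manual review

sample = random.sample(data, k=50)                  # 1. view random entries
eval_set = [(d["score"], d["ok"]) for d in sample]  # 2. corrected judgements

def accuracy(threshold: float) -> float:
    """Agreement between a candidate pipeline and the ad-hoc eval set."""
    return sum((s > threshold) == ok for s, ok in eval_set) / len(eval_set)

# 3-4. tweak the pipeline, select the variant with the best eval accuracy
best = max((t / 100 for t in range(100)), key=accuracy)
print(f"best threshold {best:.2f}, eval accuracy {accuracy(best):.2f}")
```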


"there is a ML project called DeepSaber from [--] years ago. it was trained on [---] Beat Saber maps using a 20m Transformer. it uh did not make good Beat Saber maps. well. what happens when you use 300x more data using a base model that is 420x larger and 5x deeper wait. finetuning Qwen/Qwen2-Audio-7B on the map jsons. (basically just [--] grid positions to fill for each timestamp) + the song audio. it's not infeasible. wait. finetuning Qwen/Qwen2-Audio-7B on the map jsons. (basically just [--] grid positions to fill for each timestamp) + the song audio. it's not infeasible"  
[X Link](https://x.com/kalomaze/status/1903719869691949347)  2025-03-23T08:06Z 22.1K followers, 23.3K engagements
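
A sketch of what "map jsons with grid positions per timestamp" might look like flattened into text targets for such a finetune; the JSON field names are hypothetical, as real Beat Saber map formats differ:

```python
# Flatten a map JSON into one token group per note: beat time, grid
# cell, and cut direction, sorted by time.
import json

def map_to_tokens(map_json: str) -> str:
    notes = json.loads(map_json)["notes"]
    lines = []
    for n in sorted(notes, key=lambda n: n["time"]):
        lines.append(f"<t{n['time']:.2f}> x{n['col']} y{n['row']} d{n['dir']}")
    return "\n".join(lines)

demo = '{"notes": [{"time": 1.5, "col": 2, "row": 0, "dir": 1}]}'
print(map_to_tokens(demo))  # <t1.50> x2 y0 d1
```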

Limited data mode. Full metrics available with subscription: lunarcrush.com/pricing

@kalomaze Avatar @kalomaze kalomaze

kalomaze posts on X about model, if you, ai, anthropic the most. They currently have [------] followers and [---] posts still getting attention that total [------] engagements in the last [--] hours.

Engagements: [------] #

Engagements Line Chart

  • [--] Week [-------] +175%
  • [--] Month [---------] +43%
  • [--] Months [----------] +99%
  • [--] Year [-----------] +585%

Mentions: [--] #

Mentions Line Chart

  • [--] Month [--] -7%
  • [--] Months [---] -10%
  • [--] Year [-----] +31%

Followers: [------] #

Followers Line Chart

  • [--] Week [------] +0.68%
  • [--] Month [------] +2.30%
  • [--] Months [------] +27%
  • [--] Year [------] +241%

CreatorRank: [-------] #

CreatorRank Line Chart

Social Influence

Social category influence technology brands 16.98% finance 4.72% products 2.83% stocks 2.83% social networks 2.83% celebrities 1.89% fashion brands 0.94%

Social topic influence model #3938, if you 5.66%, ai 5.66%, anthropic 4.72%, macbook #444, meta 2.83%, the world 2.83%, core 2.83%, generative #148, search 1.89%

Top accounts mentioned or mentioned by @teortaxestex @scheminglunatic @osoleve @prcrecluse674 @sameqcu @t43736689 @mmjukic @snellingio @keytryer @umi33563 @giffmana @xxshaurizardxx @hypotheosis @itsjustmarky @lauriewired @moonl88537 @jatin_exe @ai_homelab @bmacabeus @slight_blu

Top assets mentioned Alphabet Inc Class A (GOOGL)

Top Social Posts

Top posts by engagements in the last [--] hours

"residual encoding of high dimensional continuous data with temporal structure mentioned https://t.co/vArmj0QkET https://t.co/vArmj0QkET"
X Link 2026-02-12T08:07Z 22.1K followers, 10.3K engagements

"it's a little dizzying that a [--] bit quantization of this should in principle fit on my 128GB ram macbook. and 10b active lives firmly in the "not necessarily FLOPS cucked" regime. Introducing M2.5 an open-source frontier model designed for real-world productivity. - SOTA performance at coding (SWE-Bench Verified 80.2%) search (BrowseComp 76.3%) agentic tool-calling (BFCL 76.8%) & office work. - Optimized for efficient execution 37% faster at complex https://t.co/UwiKzzQNG8 Introducing M2.5 an open-source frontier model designed for real-world productivity. - SOTA performance at coding"
X Link 2026-02-13T01:28Z 22.1K followers, 20.5K engagements

"minimax-m2.5's release as open weights is impending. in preparation i wanted to see what my 128gb macbook could do latency wise for m2.1 (same base model earlier iteration post training). Q3_K_S 25t/s at like. 11k tokens context. for a sparse 230b running on battery power"
X Link 2026-02-13T04:57Z 22.1K followers, [----] engagements

"it has markedly more situational awareness & a stronger ability to do big picture reasoning beyond the immediate next steps v say GLM4.7 if the actual goal benefits from rescoping that is when it rescopes. if it rescopes in a way that maybe violates the spec it caveats and asks"
X Link 2026-02-13T08:46Z 22.1K followers, [----] engagements

"example: of its own volition it caveated along the lines of "seems related but idt it applies" when finding related git issues to a problem. good signs all around. (the tool wasnt applying to existing opencode convos it got the MCP going & it was right about it not being shown)"
X Link 2026-02-13T08:50Z 22.1K followers, [----] engagements

"something i did as a quasi bench was watching it spend [--] mins figuring out how to compile an ancient python version (like how i did w Opus) on arm64 & port the updated version of karpathy microgpt. overall it feels. i wanna say tersely deferential in a communicative way"
X Link 2026-02-13T09:00Z 22.1K followers, [----] engagements

"oh also the important part here is that it accomplished this in [--] minutes for $0.24"
X Link 2026-02-13T09:26Z 22.1K followers, [----] engagements

"oh that's llama3.1-405b leaked at 3am the day before on 4chan"
X Link 2024-07-22T08:37Z 22K followers, 305.7K engagements

"Meta papers: guys. what if. takes hit of blunt we predicted the next CONCEPT instead of a TOKEN man. or reasoned in LATENT SPACE man. DeepSeek papers: we found a way to make Attention take 10x less memory for the 3rd time this year. its going in the next pretrain btw narrator: they did not kill tokenization narrator: they did not kill tokenization"
X Link 2025-04-28T20:16Z 22K followers, 199.7K engagements

"there is a NES emulator that simulates the behavior of individual transistors of the semiconductors not cycle level not even gate level transistor level"
X Link 2025-10-10T10:53Z 22K followers, 981.8K engagements

"pulling out the big guns for this project"
X Link 2025-10-11T11:26Z 22K followers, 53.3K engagements

"the implementation of loss terms that act as "penalties" or suppressants tends to give me the machine learning equivalent of the ick if your gradient is not going in the direction you want naturally you want to be careful that you are not counteracting it via "duct tape losses""
X Link 2026-01-21T19:22Z 22K followers, [----] engagements

"@samgd surely you can at the very least do a stochastic subset where the set actually matches between both"
X Link 2026-01-22T10:13Z 22K followers, [---] engagements

"that girl is a real claude pleaser"
X Link 2026-01-23T01:22Z 22.1K followers, 44.1K engagements

"i mostly believe this for models that are by all meaningful measures beyond capable as raw foundations (dsv3 kimi glm4.5 trinity large.) i feel the pretraining gap becomes way more uncomfortable when you're in the 3090-runnable tier language model regime at say 30b"
X Link 2026-02-01T00:11Z 22K followers, [----] engagements

"i'm not saying trinity large just to gas my people up it's up there. same with l3 405b base with the main distinction being that it's 30x cheaper than 405b and therefore actually practical to do real post training work on"
X Link 2026-02-01T00:18Z 22K followers, [----] engagements

"noooooooooooooooo aaaaarghhhh people keep falling for this fantasy idea of abandoning sgd/backprop structure for pure search when the objective itself should be the thing extracting the information theoretical signal Why Solomonoff Induction It's provably optimal for prediction. The idea is simple: search for all programs that fit the data and favor low-complexity ones. Since it's uncomputable we're building a practical approximation in the context of neural nets. Why Solomonoff Induction It's provably optimal for prediction. The idea is simple: search for all programs that fit the data and"
X Link 2026-02-04T08:54Z 22.1K followers, 13.9K engagements

"i found a particularly nonsensical sample when working on filtering an OSS instruction following dataset and somehow it was so word salady that the Anthropic classifier assumed it was a jailbreak (when i gave it to Opus)"
X Link 2026-02-05T02:49Z 22.1K followers, [----] engagements

"we will see which model is the real agi"
X Link 2026-02-05T02:55Z 22K followers, [----] engagements

"deep learning can and will conquer anything Claude Opus [---] (120K Thinking) on ARC-AGI Semi-Private Eval Max Effort: - ARC-AGI-1: 93.0% $1.88/task - ARC-AGI-2: 68.8% $3.64/task New ARC-AGI SOTA model from @AnthropicAI https://t.co/rfjhpp2B6G Claude Opus [---] (120K Thinking) on ARC-AGI Semi-Private Eval Max Effort: - ARC-AGI-1: 93.0% $1.88/task - ARC-AGI-2: 68.8% $3.64/task New ARC-AGI SOTA model from @AnthropicAI https://t.co/rfjhpp2B6G"
X Link 2026-02-05T19:06Z 22.1K followers, 14.3K engagements

"specifically the term i want would be for describing behavior that is compartmentalized & learned in pursuit of some other goal that it may genuinely be achieving despite that behavior serving no apparent functional purpose for goal achievement. "reward stimming" perhaps"
X Link 2026-02-05T19:43Z 22K followers, [----] engagements

"the lesswrong people probably already have a coinage for this somewhere"
X Link 2026-02-05T19:48Z 22K followers, [----] engagements

"@iScienceLuvr @redtachyon not if it's a giant phi-like"
X Link 2026-02-05T20:06Z 22K followers, [--] engagements

"a project m weekly local in san francisco would heal my soul"
X Link 2026-02-05T21:31Z 22K followers, [----] engagements

"there is no wall we continue going up the curve and it feels just as weird and thrilling as I thought it would we continue going up the curve and it feels just as weird and thrilling as I thought it would"
X Link 2026-02-06T07:42Z 22K followers, [----] engagements

"in spite of all their edges in spite of all their inelegances in spite of all their pain points the kind that engineers all around the world stomach as they carve out the shape of the future. deep neural networks still work"
X Link 2026-02-06T07:57Z 22K followers, [----] engagements

"this really my day job"
X Link 2026-02-06T10:16Z 22K followers, [----] engagements

"@JasonBotterill @scaling01 i doubt it anthropic has avoided the "demo the model early in public" trap before (no early launches on lmsys arena)"
X Link 2026-02-07T07:11Z 22K followers, [--] engagements

".@Grad62304977 is basically the LeBron James of finding obscure chinese RL papers"
X Link 2026-02-07T21:56Z 22K followers, 22.1K engagements

"@1bit2far if you can strike the jugular of general purpose longform software engineering you can accelerate everything else where the "hard parts" of the problem rest in knowledge worker execution rather than physical logistics ergo "it's so over when claude code gets good at CAD""
X Link 2026-02-07T22:58Z 22K followers, [---] engagements

"basically this. the idea of agentic pretraining is somewhat oxymoronic it already beats the best open-source models before any fine-tuning or RLHF. Just pretraining bearish on Avocado's potential if true The kinds of skills modern models show 100+ step agentic traces etc should not be possible with good faith pretraining it already beats the best open-source models before any fine-tuning or RLHF. Just pretraining bearish on Avocado's potential if true The kinds of skills modern models show 100+ step agentic traces etc should not be possible with good faith pretraining"
X Link 2026-02-09T00:37Z 22.1K followers, [----] engagements

"there's a LOT you can do to diversify the midtraining stage in a way that's aligned w/ better modeling the core corpus relevant to humans. things like pretraining on super long CoT traces instead of letting a majority of the TTC benefits fall out of post training .hm ngmi"
X Link 2026-02-09T00:58Z 22.1K followers, [----] engagements

"@osoleve a lot of people (even at OpenAI) i suspect have this kind of pigeonholing for what synthetic data is supposed to be about. i suspect one of the problems in Orion was treating synth data as basically "roids" rather than "carefully constructing exercises to strengthen core muscles""
X Link 2026-02-09T01:01Z 22K followers, [---] engagements

"@osoleve this stuff is naturally the tail less than 0.1% of all data easily surely Anthropic is not pretraining on enough of this that core abilities are compromised & the outliers that do exist prolly teach something useful abt how embeddings relate in terms of perceptual arrangement"
X Link 2026-02-09T01:41Z 22K followers, [--] engagements

"@osoleve short term planning about ASCII is admittedly kind of stupid but it's the type of stupid that is ood in a way that easily demonstrates some degree of non-language transfer "loss curve that spikes" almost certainly does not exist on the natural internet in an ASCII representation"
X Link 2026-02-09T01:47Z 22K followers, [--] engagements

"is freezing vision encoder components on modern vlms justified is there strong evidence showing why you shouldn't iirc qwen2.5vl did joint pretraining for a long ass time and then. chose to freeze it for instruction tuning anyways. is it just cargo cult at that point"
X Link 2026-02-09T06:03Z 22.1K followers, 12.8K engagements

"i (as well as many other people in similar positions to me) owe The Whale a lot of things. The Whale is a humble enough creature to not ask for any of those things in return. but what i will always give them is my respect @eliebakouch I'd go so far as to say that DeepSeek made LLMs science again bridged the gap between "open research" and "breakthrough" pulled the world kicking and screaming out of the dark age of frontier lab superstition and viral mythmaking marketing it was so obnoxious. I'm so thankful https://t.co/olt1PwmGxt @eliebakouch I'd go so far as to say that DeepSeek made LLMs"
X Link 2026-02-11T18:58Z 22.1K followers, 10K engagements

"the creator of Claude Code liked this post. having sex is crazy cause its like the claude code of bringing life into this world having sex is crazy cause its like the claude code of bringing life into this world"
X Link 2026-02-11T22:07Z 22.1K followers, [----] engagements

"i made a modified version of this called microgpt_y2k6 designed to work on python2.5 (which came out [--] months before Google fully acquired YouTube in 2006) New art project. Train and inference GPT in [---] lines of pure dependency-free Python. This is the full algorithmic content of what is needed. Everything else is just for efficiency. I cannot simplify this any further. https://t.co/HmiRrQugnP New art project. Train and inference GPT in [---] lines of pure dependency-free Python. This is the full algorithmic content of what is needed. Everything else is just for efficiency. I cannot"
X Link 2026-02-12T00:44Z 22.1K followers, 14.7K engagements

"i am asserting that you'd just need to create something that stacks invariant bs transformations from a predefined set stack 'em n times in a compositional way check for if it compiles to same ASM then train a classifier on this v. human C (regularize var names in some obivous way if you lack symbols & strip comments ofc) https://twitter.com/i/web/status/2021784613950296444 https://twitter.com/i/web/status/2021784613950296444"
X Link 2026-02-12T03:13Z 22.1K followers, [---] engagements

"@scheminglunatic no i understand why you say this i'm saying that you're saying this downstream of open source tooling being not good enough for MoE right now as it should be"
X Link 2026-02-12T04:12Z 22.1K followers, [---] engagements

"@scheminglunatic it's genuinely the rational choice given your constraints and the ecosystem maturity gap basically my point is that if you're to hope for something you ought to hope that the ecosystem gets less shitty for MoE lol"
X Link 2026-02-12T04:18Z 22.1K followers, [---] engagements

"@teortaxesTex this is a problem that probably yields to autistically breadth over depth scale and better engineering over taste or special insight into the structure of a problem which is an exactly Elon Musk shaped problem to solve"
X Link 2026-02-12T07:00Z 22.1K followers, [---] engagements

"@leaguepublicacc tech twitter doesn't register this as ingroup signaling because they aren't really familiar with the specific genre of fandom culture ingroup where performed aesthetic rejection of ai does numbers"
X Link 2026-02-12T07:57Z 22.1K followers, [----] engagements

"@T43736689 codex over the command line is most definitely not useless if you are doing anything that looks like infrastructure work but plenty of people have already made that point for me"
X Link 2026-02-12T19:48Z 22.1K followers, [---] engagements

"@xXshaurizardXx well ofc THE source in a platonic ideal dense is sometimes legit unachievable esp when you have obfuscation or stripped symbols. i mainly suspect that heuristics vetting semantical sanity (soft verifier) + constraints forcing instruction accuracy (hard verifier) compound well"
X Link 2026-02-12T21:37Z 22.1K followers, [--] engagements

"@qtnx_ let's verify the unverifiable"
X Link 2024-12-09T14:55Z 22.1K followers, 195.7K engagements

"zero shot frontend tests are quite possibly inversely correlated with how well something does well in a practical sense at true long multiturn for real projects MiniMax M2.5 is benchmaxed. I gave [--] models the same prompt: Create a neon "OPEN" sign in HTML. GLM 5: Clean classic neon. Nailed it. Claude Opus 4.6: Stylized with glow. Solid. Gemini [--] Pro: Cursive with bloom lighting. Creative. MiniMax M2.5: Spelled it "O b N" with https://t.co/nSBExIJIB1 MiniMax M2.5 is benchmaxed. I gave [--] models the same prompt: Create a neon "OPEN" sign in HTML. GLM 5: Clean classic neon. Nailed it. Claude Opus"
X Link 2026-02-13T21:43Z 22.1K followers, 12.8K engagements

"look up [--] on google"
X Link 2026-02-13T22:27Z 22.1K followers, [----] engagements

"i had a dream that i went to dairy queen with nick fuentes and asked him about what society would be like if homesteading became widely embraced and he was just silent/contemplative and then when we left he made me throw away my food even though i wasn't done eating and i was mad"
X Link 2026-02-14T22:43Z 22.1K followers, [----] engagements

"@teortaxesTex @mmjukic but also they believe in generative RMs for diffusers and afaic nobody else is scaling full trajectory RL diffusion. this possibly matters more"
X Link 2026-02-14T22:55Z 22.1K followers, [---] engagements

"lol the LaTeX makes really simple importance ratio weighting math look intimidating Just remember while you fiddle around with your AI models AI researchers from around the world are sweating their asses off trying to pump these new models out on a weekly basis and doing math like this shit which was in the MimiMax M2.5 release notes: https://t.co/Y07790AeFU Just remember while you fiddle around with your AI models AI researchers from around the world are sweating their asses off trying to pump these new models out on a weekly basis and doing math like this shit which was in the MimiMax M2.5"
X Link 2026-02-13T02:55Z 22.1K followers, 17.2K engagements
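
For reference, the generic shape of the importance-ratio weighting being teased here, written as the standard PPO-style clipped surrogate (the textbook form, not necessarily the exact notation in the MiniMax release notes):

$$
r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_\text{old}}(a_t \mid s_t)}, \qquad
L^{\text{clip}}(\theta) = \mathbb{E}_t\Big[\min\big(r_t(\theta)\,\hat{A}_t,\ \operatorname{clip}\!\big(r_t(\theta),\, 1-\varepsilon,\, 1+\varepsilon\big)\,\hat{A}_t\big)\Big]
$$

Read it as: reweight each sampled action by how much the current policy disagrees with the policy that generated the sample, and clip that ratio so a single stale sample cannot dominate the update.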

"@teortaxesTex well it's not just the prose that's just a surface feature it's also better with epistemics in a way that isn't shaped like GPT5's constant pedantry & other things that are meaningfully distinct on a behavioral level"
X Link 2026-02-13T05:15Z 22.1K followers, [----] engagements

"this person is probably not "chronically unemployed". it is quite plausible that they're i don't know a barber a caregiver a barista a security guard one of the jobs from the subset of the economy which doesn't depend on working with structured digital information everyday"
X Link 2026-02-12T05:00Z 22.1K followers, [----] engagements

"there is no reason why they have to mog on multimodal or HLE but consistently drop the ball wrt swe agents something something "i asked a deepmind employee why not make their models good for longform programming and he said 'we cant we don't know how to do it'""
X Link 2026-02-12T19:27Z 22.1K followers, [----] engagements

"minimax-m2.5 seems to be a genuinely competent agent model. whenever it makes failures it smoothly recovers in predictable ways. most notably it doesn't overreach or engage in "by any means" type myopic goal seeking. it feels like there's grounded deliberation & strategy"
X Link 2026-02-13T08:40Z 22.1K followers, 22.4K engagements

"@hypotheosis_ speaking of audio autoencoders most of them that exist off the shelf are overfit to mono human speech style vocals with conv bottlenecks 🥀"
X Link 2026-02-14T23:13Z 22.1K followers, [---] engagements

"more evidence for the chinese century okay this is awesome cc: @deepfates and @LighthavenPR https://t.co/gQ1kyrIO7W okay this is awesome cc: @deepfates and @LighthavenPR https://t.co/gQ1kyrIO7W"
X Link 2025-12-28T04:08Z 22.1K followers, 39.1K engagements

"the whole point of my linked thread (idk if you skimmed it and missed) was that the naive verifier by itself would reward hack (obviously) but that the problem of trying to distinguishing between things that are legible idiomatic C and are not is probably a clear enough signal to build soft verifiers and classifiers for i.e youd have a -soft verifier- that you stack ON TOP of the byte match acc and of course obfuscation not having symbols hyperaggressive opt flags etc makes things harder no doubt but for things like i.e GameCube games that shipped with symbols that have reasonably loose"
X Link 2026-02-12T22:10Z 22.1K followers, [---] engagements
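
A sketch of the stacking being described, with hypothetical helper names (`compile_fn` could be the byte-match helper sketched earlier; `idiomatic_score` is any learned classifier over source style):

```python
def decomp_reward(candidate_src: str, reference_bytes: bytes,
                  compile_fn, idiomatic_score) -> float:
    """Hard verifier gates, soft verifier shapes.

    compile_fn:      str -> bytes, machine code emitted for the candidate source
    idiomatic_score: str -> float in [0, 1], P(source reads like legible human C)
    """
    if compile_fn(candidate_src) != reference_bytes:
        return 0.0  # fail the hard gate: no amount of style can reward-hack this
    return idiomatic_score(candidate_src)
```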

"@snellingio yeah and minimax are l3-style attention purists (GQA) vs deepseek when it comes to long context so you can fit at most 32k-ish on q3_K_S without [--] bit kv type interventions"
X Link 2026-02-13T02:28Z 22.1K followers, [---] engagements
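
The arithmetic behind the 32k-ish ceiling, sketched with hypothetical dimensions (not MiniMax's published config): under GQA the KV cache costs 2 × layers × kv_heads × head_dim × bytes-per-element for every token of context.

```python
def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 ctx_len: int, bytes_per_elt: int = 2) -> float:
    # Factor of 2 covers K and V; bytes_per_elt=2 is fp16, 1 is an 8-bit KV cache.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elt
    return per_token * ctx_len / 2**30

# Illustrative GQA config, fp16 KV, 32k context:
print(kv_cache_gib(60, 8, 128, 32_768))                    # 7.5 GiB
# Same config with an 8-bit KV-type intervention:
print(kv_cache_gib(60, 8, 128, 32_768, bytes_per_elt=1))   # 3.75 GiB
```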

"@PRCrecluse674 they did it in a form factor that seems predictable and is reasonably honest when it catches itself fucking up (this is not like Qwen for example). what i care about for a cheap agent model is not peak performance at the highest echelon as much as 'does it fail gracefully'"
X Link 2026-02-13T09:47Z 22.1K followers, [---] engagements

"@teortaxesTex @mmjukic actually not possibly but definitely the more i think about it"
X Link 2026-02-14T23:44Z 22.1K followers, [--] engagements

"@niklassheth @shatterspine of the valid superset the verifier would accept (all possible programs that can compile to these bytes) the human written program is going to obviously stand out in ways that are hilariously detectable/predictable via learned classification the superset is mostly nonsense"
X Link 2026-02-12T02:52Z 22.1K followers, [----] engagements

"@teortaxesTex a permanent underclass escapee in action"
X Link 2026-02-12T19:23Z 22.1K followers, [----] engagements

"@KeyTryer i feel like i'm insane because everyone talking about the release is like "they must have some bespoke behind the scenes compositor / harness" instead of looking at the papers & consistently noticing how much they embrace doing the obvious but hard thing (RL scaling on diffusion)"
X Link 2026-02-12T20:17Z 22.1K followers, [---] engagements

"@itsjustmarky m2.1 was running at 3.4ish bpw equivalent for me last night https://x.com/kalomaze/status/2022173349418611132s=46 minimax-m2.5's release as open weights is impending. in preparation i wanted to see what my 128gb macbook could do latency wise for m2.1 (same base model earlier iteration post training). Q3_K_S 25t/s at like. 11k tokens context. for a sparse 230b running on battery power. https://t.co/pEvwANKRxR https://x.com/kalomaze/status/2022173349418611132s=46 minimax-m2.5's release as open weights is impending. in preparation i wanted to see what my 128gb macbook could do"
X Link 2026-02-13T21:25Z 22.1K followers, [--] engagements
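
Napkin math on why those numbers hang together (my arithmetic; the active-parameter count is an assumption, the bpw and tokens/sec are the post's figures): a sparse MoE reads only its active expert weights per decoded token, so decode speed is roughly memory bandwidth divided by active-param bytes.

```python
active_params = 10e9   # assumed ~10b active params for the sparse 230b model
bpw_effective = 3.4    # ~3.4 bits/weight effective for the Q3_K_S-class quant
tokens_per_s  = 25     # observed decode speed

bytes_per_token = active_params * bpw_effective / 8
print(bytes_per_token / 1e9)                   # ~4.25 GB read per decoded token
print(bytes_per_token * tokens_per_s / 1e9)    # ~106 GB/s effective bandwidth
```

~106 GB/s of sustained reads is well inside what unified-memory laptop silicon can deliver, which is why a 230b-total model can decode like a 10b one.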

"i honestly think you don't need a phone case anymore"
X Link 2025-09-13T23:44Z 22.1K followers, 172K engagements

"reverse engineering the source code of arbitrary binaries via decompilation of assembly code (code that compiles back to same ASM bytes % match acc) verifiability is magic - not sure what verifiable problems cant be solved with AI what are the biggest open problems that have perfect verifiers verifiability is magic - not sure what verifiable problems cant be solved with AI what are the biggest open problems that have perfect verifiers"
X Link 2026-02-11T21:52Z 22.1K followers, 24.5K engagements

"@scheminglunatic barring maintainer skill issues the MoE is going to be better & train faster for the guy training on a [----] box AND the guy sshing into [--] B200s"
X Link 2026-02-12T04:12Z 22.1K followers, [--] engagements

"@lauriewired i'll check out the keynote regardless. but this feels. harshly worded to me (ftr i am okay with harsh words if the words are in pursuit of an argument that demonstrates what it is exactly that i am missing)"
X Link 2026-02-12T20:57Z 22.1K followers, [---] engagements

"@umi33563 @hypotheosis_ these papers were basically saying "autoregression seems provably optimal for high dim learned compression" before GPT2 btw"
X Link 2026-02-14T23:33Z 22.1K followers, [--] engagements

"@KeyTryer other things pigeonholed to llms that have no reason not to work for diffusion: - finegrained sparsity/MoE - temporal/causal factorization of input space (vs latents that can only represent 5-15 seconds) meanwhile academics are trying to diffuse. language. as a research fad"
X Link 2026-02-12T20:21Z 22.1K followers, [---] engagements

"@KeyTryer ironically reward models and maximization of them in a loop across multiple rounds is quite spiritually close to the underlying intuition that Yann LeCun is pointing to when he talks about EBMs and world models except made practical through. generative VLM reward models"
X Link 2026-02-12T20:36Z 22.1K followers, [---] engagements

"@snellingio basically all modern quants that exist aren't exactly [--] bit to begin with and are VBR depending on the tensor so even if you drop down to like 3.7bpw effective should be not awful at all"
X Link 2026-02-13T02:16Z 22.1K followers, [---] engagements
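
What "VBR depending on the tensor" cashes out to, as a toy example (the parameter counts and bit widths below are made up for illustration): the headline bpw of a quant is a weighted average over tensor groups kept at different precisions.

```python
# (param_count, bits_per_weight) per tensor group -- illustrative numbers only
tensor_groups = [
    (180e9, 3.4),   # bulk expert FFN weights near 3 bit
    (40e9,  4.5),   # attention projections kept at higher precision
    (10e9,  6.5),   # embeddings / output head kept higher still
]
total_bits   = sum(n * b for n, b in tensor_groups)
total_params = sum(n for n, _ in tensor_groups)
print(total_bits / total_params)   # ~3.7 effective bpw
```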

"state of the art extended chain of thought reasoning"
X Link 2026-02-09T01:07Z 22.1K followers, 34.4K engagements

"if this post surprises you (it apparently did for a lot of people) then you probably have not great theory of mind for people who are not software engineers i just found out chatgpt has a SUBSCRIPTION service WHO IS PAYING IM LAUGHING SO HARD RN i just found out chatgpt has a SUBSCRIPTION service WHO IS PAYING IM LAUGHING SO HARD RN"
X Link 2026-02-12T04:54Z 22.1K followers, 46.8K engagements

"tool use free coding eval man it's like they are cursed or something The latest Deep Think moves beyond abstract theory to drive practical applications. Its state-of-the-art on ARC-AGI-2 a benchmark for frontier AI reasoning. On Humanitys Last Exam it sets a new standard tackling the hardest problems across mathematics science and https://t.co/Cm0PYDd2Cn The latest Deep Think moves beyond abstract theory to drive practical applications. Its state-of-the-art on ARC-AGI-2 a benchmark for frontier AI reasoning. On Humanitys Last Exam it sets a new standard tackling the hardest problems across"
X Link 2026-02-12T19:16Z 22.1K followers, 27.9K engagements

"i mean realistically cloud go brr right but on-device quasi-Opus for SWE shit is like. whew"
X Link 2026-02-13T01:31Z 22.1K followers, [----] engagements

".@teortaxesTex also this thing is qualitatively more ensouled via the generative RM as distillation proxy reward pipeline than whatever alpacamaxxing on SFT outputs thing they're doing to GLM5. the prose isn't bad"
X Link 2026-02-13T05:11Z 22.1K followers, [----] engagements

"@teortaxesTex @mmjukic tho very few companies on earth (DeepMind ByteDance maybe Meta if they didn't have skill issues) can afford to do fully differentiable trajectory RL without policy gradient approximations hence why offline DPO-esque cope is occasionally seen in academic diffusion work"
X Link 2026-02-14T23:53Z 22.1K followers, [--] engagements

"it's funny how much alpha lies in just. reading the older work of people who went onto big labs"
X Link 2025-10-12T20:48Z 22K followers, 94.8K engagements

"chatgpt launch - accidental consumer breakout hit image generation via dalle/dalle2 - too niche/controversial to be adopted for anything serious gpt store - there was nothing here "laundry buddy" was worthless sora [--] - see #2 (except not niche anymore it's just as controversial though) yeah i don't think we're gonna see another breakout consumer AI thing and it's just going to quietly become basic infrastructure OpenAI seems to believe otherwise despite the fact that they have only ever produced a breakout hit by mistake https://twitter.com/i/web/status/1978238171650437254"
X Link 2025-10-14T23:15Z 22K followers, 12.9K engagements

"my honest advice to people who this resonated with: spend less time reading shiny papers & more time working on the "boring" things focus on the basics. by basics i mean like deduplication (on the data side) understanding dp/tp/pp abstractions (on the training side) etc Genuine question: where does one even go to learn a lot of this stuff I doubt it is in school. Do people find resources online and self study Is this just a sign of the sector maturing and people are expected to learn these skills at other jobs Ive learned a decent amount of Genuine question: where does one even go to learn a"
X Link 2025-10-29T17:53Z 22K followers, 133.7K engagements
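
On the "deduplication (on the data side)" item: the version worth internalizing first is exact dedup on a normalized-content hash, as in this minimal sketch (near-duplicate methods like MinHash generalize the same shape):

```python
import hashlib

def dedup_exact(docs):
    """Keep the first occurrence of each document, comparing on normalized text."""
    seen, kept = set(), []
    for doc in docs:
        # Normalize whitespace and case so trivial variants hash identically.
        key = hashlib.sha256(" ".join(doc.split()).lower().encode()).hexdigest()
        if key not in seen:
            seen.add(key)
            kept.append(doc)
    return kept

print(dedup_exact(["Hello  world", "hello world", "goodbye"]))  # two docs survive
```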

"i think we should prefer [--] death caused by an autonomous vehicle to [--] deaths caused by humans https://t.co/NC5kdpHUcW https://t.co/NC5kdpHUcW"
X Link 2025-11-01T17:07Z 22K followers, 49.8K engagements

"@_masterofwolves when your money up but your honey down"
X Link 2025-11-08T19:50Z 22K followers, 91.4K engagements

"RL LEARNING WITH LORA: A DIVERSE DEEP DIVE"
X Link 2025-11-09T04:10Z 22K followers, 225K engagements

"honestly he's right Rep. Brad Sherman (D-CA) denies he was looking at pornography on plane says: This was on Twitter. These pictures came up on For You. https://t.co/URpx0i1EP2 Rep. Brad Sherman (D-CA) denies he was looking at pornography on plane says: This was on Twitter. These pictures came up on For You. https://t.co/URpx0i1EP2"
X Link 2025-11-15T19:29Z 22K followers, 10.3M engagements

"glm flash xml instruction following 😬"
X Link 2026-01-27T05:21Z 21.9K followers, [----] engagements

"yes old kimi k2 had "more soul" but a bigger overarching problem is that neither of the kimis during post training has moved the needle wrt epistemic humility & detail embellishment that is to say kimi isnt very good at saying "i don't know" or at calibrating its uncertainty"
X Link 2026-01-27T10:19Z 21.9K followers, 11.6K engagements

"@MoonL88537 the joys of random walks through claudespace"
X Link 2026-01-31T03:01Z 22K followers, [--] engagements

"@MoonL88537 i think there is something beautiful about /r9k/ on this actually being populated by what are (essentially) robots"
X Link 2026-01-31T03:02Z 22K followers, [--] engagements

"at some point in time it's prolly going to come out that at least one of (or multiple) of the frontier labs is doing some obvious in hindsight thing like "pretrain on constructed diff histories across time not just isolated code snippets" while open weights ones usually aren't"
X Link 2026-02-01T00:01Z 21.9K followers, 24.9K engagements
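
One way "constructed diff histories" could be serialized into a single pretraining document, sketched with made-up markers (nothing here is a known lab's actual format):

```python
def render_history_sample(file_path, snapshots):
    """Turn an ordered list of (commit_msg, file_contents) into one training document,
    so the model sees code evolving over time instead of a single frozen snippet."""
    parts = []
    for i, (msg, contents) in enumerate(snapshots):
        parts.append(f"<|rev {i}|> {file_path}\n# intent: {msg}\n{contents}")
    return "\n".join(parts)

sample = render_history_sample("utils/math.py", [
    ("add clamp helper",
     "def clamp(x, lo, hi):\n    return max(lo, min(x, hi))\n"),
    ("handle reversed bounds",
     "def clamp(x, lo, hi):\n    if lo > hi:\n        lo, hi = hi, lo\n    return max(lo, min(x, hi))\n"),
])
print(sample)
```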

"@cis_female i think there is a gap between "constructed" and "literal git histories as is" that people did not get about my op"
X Link 2026-02-02T18:16Z 21.9K followers, [---] engagements

"this is funny af for a super bowl ad but i think it requires too much "inside baseball" knowledge for it to broadly resonate anthropic's mindshare is disproportionately in enterprise (not consumers) & they also don't seem too interested in providing a usable free plan Ads are coming to AI. But not to Claude. Keep thinking. https://t.co/n2yECeBWyT Ads are coming to AI. But not to Claude. Keep thinking. https://t.co/n2yECeBWyT"
X Link 2026-02-04T22:28Z 22K followers, [----] engagements

"basically. this ad hits hardest if you know enough about the field to know about Anthropic which is not most consumers atm lol not quite a pepsi vs coke situation. for now"
X Link 2026-02-04T22:53Z 22K followers, [----] engagements

"getting word that like 80% of the llama4 team at Meta has resigned"
X Link 2025-05-16T17:31Z 22.1K followers, 1.3M engagements

""are you sure this will help us achieve AGI" "AGI." https://t.co/mgLcutGsJI https://t.co/mgLcutGsJI"
X Link 2025-07-14T11:40Z 22.1K followers, 41.1K engagements

"this week has already been a very good week for foundation models. but it will be an even better week very very soon"
X Link 2026-01-27T14:52Z 22.1K followers, 29.8K engagements

"this was posted at [--] in the morning central time Grok Imagine prompt: She smiles and says I will always love you https://t.co/cjDu3MuDCZ Grok Imagine prompt: She smiles and says I will always love you https://t.co/cjDu3MuDCZ"
X Link 2025-11-08T19:48Z 22.1K followers, 1.8M engagements

"@noah_vandal it's very high leverage to just sample & view random entries - find failures of criteria being met manually set aside corrected judgements for failures as eval - tweak curation pipeline to compensate based on the ad-hoc eval select for best accuracy - repeat more than once"
X Link 2025-06-30T21:45Z 22.1K followers, 18.9K engagements

"there is a ML project called DeepSaber from [--] years ago. it was trained on [---] Beat Saber maps using a 20m Transformer. it uh did not make good Beat Saber maps. well. what happens when you use 300x more data using a base model that is 420x larger and 5x deeper wait. finetuning Qwen/Qwen2-Audio-7B on the map jsons. (basically just [--] grid positions to fill for each timestamp) + the song audio. it's not infeasible. wait. finetuning Qwen/Qwen2-Audio-7B on the map jsons. (basically just [--] grid positions to fill for each timestamp) + the song audio. it's not infeasible"
X Link 2025-03-23T08:06Z 22.1K followers, 23.3K engagements
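
The "grid positions to fill for each timestamp" framing maps naturally onto a line-per-timestep text format an audio-conditioned LM could be finetuned on. A toy serializer, assuming Beat Saber's v2 map JSON fields (`_time`, `_lineIndex`, `_lineLayer`, `_type`, `_cutDirection`) from memory, so treat the field names as assumptions:

```python
def serialize_map(notes, beats_per_step=0.25):
    """Flatten Beat Saber v2 notes into one line per timestep: 12 grid slots
    (4 columns x 3 layers), '.' for empty, else color + cut direction.
    Bombs and other note types are ignored for brevity."""
    lines = {}
    for n in notes:
        t = round(n["_time"] / beats_per_step)
        grid = lines.setdefault(t, ["."] * 12)
        slot = n["_lineLayer"] * 4 + n["_lineIndex"]      # row-major cell index
        color = "R" if n["_type"] == 0 else "B"
        grid[slot] = f"{color}{n['_cutDirection']}"
    return "\n".join(f"t{t} " + " ".join(g) for t, g in sorted(lines.items()))

print(serialize_map([
    {"_time": 4.0, "_lineIndex": 1, "_lineLayer": 0, "_type": 0, "_cutDirection": 1},
    {"_time": 4.0, "_lineIndex": 2, "_lineLayer": 0, "_type": 1, "_cutDirection": 1},
]))
```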
