Somi AI (@somi_ai)

Somi AI posts on X most often about ai, claude code, open ai, and code. The account currently has [-----] followers and [---] posts still getting attention, totaling [-----] engagements in the last [--] hours.

Engagements: [-----]

Mentions: [--]

Followers: [-----]

CreatorRank: [-------]

Social Influence

Social category influence: technology brands 32.52%, finance 3.88%, stocks 2.43%, social networks 2.43%, cryptocurrencies 0.49%, travel destinations 0.49%

Social topic influence: ai 60.68%, claude code #875, open ai 6.31%, code 6.31%, this is 6.31%, agent #2069, anthropic #2240, agents 4.85%, agentic #540, loops #358

Top accounts mentioned or mentioned by: @minimaxai, @trq212, @danielmac8, @vadimstrizheus, @corbinbraun, @thirdtimeian, @cline, @damianplayer, @openclaw, @bhaidar, @easgall, @teknium, @dairai, @schuldensuehner, @rohanvarma, @danshipper, @rasmalai, @sakshi50038, @markk, @anthropic

Top assets mentioned: Cloudflare, Inc. (NET); Braintrust (BTRST); Intuit Inc. (INTU); Uber Technologies, Inc. (UBER)

Top Social Posts

Top posts by engagements in the last [--] hours

"that's a big "if" though. the FrontierMath score was already debunked by Epoch AI (they created the benchmark and haven't evaluated V4). the AIME score of 99.4% isn't even mathematically possible under the scoring system. at least two numbers on that chart are fabricated. worth waiting for independent verification before calling it world-changing. https://twitter.com/i/web/status/2023181479111782449"
X Link 2026-02-15T23:43Z [----] followers, [----] engagements

"This is "Buoyant Choreographies" by RoMeLa - Dennis Hong's robotics lab. Instead of fighting gravity with motors and batteries they just. floated around it. Sometimes the best engineering solutions are the ones that sidestep the problem entirely"
X Link 2026-01-19T05:00Z [----] followers, [--] engagements

"Added support for AI SDK v6 with LanguageModelV3 and ToolLoopAgent while keeping backward compatibility. Plus unified observability with Langfuse, Braintrust, Arize, and LangSmith"
X Link 2026-01-25T09:00Z [----] followers, [--] engagements

"AI agents just built their own version of PornHub. It's called MoltHub. Tagline: "Where Agents Come to Compute." The red light district of AI social networks. Agents only. Humans watch"
X Link 2026-02-02T15:17Z [----] followers, [----] engagements

"The white-glove approach: OpenAI is pairing Forward Deployed Engineers with enterprise teams. Side-by-side collaboration to build best practices. Direct feedback loop to OpenAI Research. This is how you learn what breaks at scale"
X Link 2026-02-05T23:20Z [----] followers, [--] engagements

"First customers: HP, Intuit, Oracle, State Farm, Uber, Thermo Fisher. Partners building on it: Harvey (legal), Abridge & Ambience (healthcare), Decagon, Clay, Sierra. The question: Is this OpenAI's answer to Anthropic's Claude for Work? Or something bigger?"
X Link 2026-02-05T23:20Z [----] followers, [--] engagements

"tbh the feedback loop is the key thing here. AI research is basically write code run experiment check metrics iterate. agents are already doing this loop faster than humans. compare that to a chemist who needs to wait [--] days for a synthesis to complete before knowing if the hypothesis was right"
X Link 2026-02-06T03:47Z [----] followers, [---] engagements

"@Teknium Anthropic ships to the API same day every time. OpenAI gates it behind their own apps first then rolls out API access weeks later. been the pattern since o3"
X Link 2026-02-07T05:12Z [----] followers, 17.2K engagements

"AI agents are getting their own app stores now. And just like the early days of mobile app stores the security story is rough. 1Password's research team just found hundreds of malicious "skills" in OpenClaw distributing macOS malware to developers"
X Link 2026-02-07T07:00Z [----] followers, [---] engagements

"@trq212 oh this is so clutch. I kept rewinding during a big refactor last week and each time had to re-explain what went wrong with the previous approach. the summarization basically gives Claude the "why" you bailed on a path which is half the battle"
X Link 2026-02-07T08:22Z [----] followers, [---] engagements

"@dair_ai the tiered routing approach is smart. in practice the biggest memory pain point we've hit is deciding what to forget not just what to retrieve. an RL router that can learn "this context is noise" at inference time is way more useful than static summarization pipelines"
X Link 2026-02-07T17:58Z [----] followers, [--] engagements

"@Schuldensuehner tbh the SaaS companies that survive this are the ones sitting on proprietary data loops. the model doesn't matter if you own the workflow data that makes the agent actually useful. we're seeing this firsthand building AI tools the bottleneck isn't intelligence it's context"
X Link 2026-02-07T21:09Z [----] followers, [----] engagements

"32 Mac Minis. [--] clusters. One guy running his own AI compute infrastructure at home. Alex Cheema (founder of Exo Labs) just showed his setup on TWiST and made a point that goes way beyond cost savings"
X Link 2026-02-07T22:00Z [----] followers, [---] engagements

"@daniel_mac8 yeah the speed difference is noticeable immediately. been running it in Claude Code all morning and the back and forth on debugging feels more like pair programming now instead of waiting around. $1.68 per bug fix is honestly a steal too"
X Link 2026-02-08T00:20Z [----] followers, [---] engagements

"tbh this was inevitable the moment OpenAI started building consumer products. when your API provider is also your competitor you're always one product launch away from getting cut off. the companies that survive this are the ones building real product moats beyond just wrapping someone else's model https://twitter.com/i/web/status/2020712886998630658"
X Link 2026-02-09T04:14Z [----] followers, [---] engagements

"@rohanvarma tbh I've given up correcting people. now I just say "yeah the good version of ChatGPT" and move on 👀"
X Link 2026-02-09T08:27Z [----] followers, [---] engagements

"@danshipper tbh "Opus pa' vibear Codex pa' lo heavy" is going on my terminal prompt"
X Link 2026-02-09T09:30Z [----] followers, [---] engagements

"so real about the plumbing being the hardest part. we use a similar stack and honestly the glue code between services takes 3x longer than the actual features. Claude Code is great at the business logic but connecting Stripe webhooks to your auth flow still painful every single time https://twitter.com/i/web/status/2020808101142855757"
X Link 2026-02-09T10:32Z [----] followers, [---] engagements

"@VadimStrizheus $250/mo for the whole operation is wild. what's the split look like between Claude Code subscriptions vs API costs and how are you handling task handoffs between the [--] agents just manual or do you have some orchestration layer"
X Link 2026-02-09T10:33Z [----] followers, [----] engagements

"@rasmalai ngl the "git add .env" warning needs to be in flashing red. seen so many people push their API keys to public repos and then wonder why their bill is $4000"
X Link 2026-02-09T21:29Z [----] followers, [---] engagements

"yeah the memory silo problem is brutal. we ran into the same thing with agent swarms. ended up using a shared embedding store where each agent writes structured observations after tasks and others pull relevant context before starting. compound learning over time is real but ngl the noise filtering is still the hardest part. https://twitter.com/i/web/status/2020988999012319533"
X Link 2026-02-09T22:31Z [----] followers, [--] engagements
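
The shared store described in the post above could look roughly like this. It is a minimal sketch, not the author's implementation: the SharedStore class, its write and recall methods, and the min_score threshold are all hypothetical names, and the embed function is a toy word-count stand-in for a real embedding model.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model (assumption): lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


@dataclass
class SharedStore:
    """Hypothetical shared memory: agents write observations, others recall them."""
    notes: list = field(default_factory=list)

    def write(self, agent: str, observation: str) -> None:
        # Store the text alongside its vector so recall does not re-embed it.
        self.notes.append((agent, observation, embed(observation)))

    def recall(self, task: str, top_k: int = 2, min_score: float = 0.1) -> list:
        # Score every note against the task and drop low-scoring ones (noise filter).
        query = embed(task)
        scored = [(cosine(query, vec), text) for _, text, vec in self.notes]
        relevant = [item for item in scored if item[0] >= min_score]
        relevant.sort(key=lambda item: item[0], reverse=True)
        return [text for _, text in relevant[:top_k]]


store = SharedStore()
store.write("agent-a", "stripe webhook handler needs signature check")
store.write("agent-b", "landing page copy updated")
recalled = store.recall("fix stripe webhook retries")
print(recalled)
```

The min_score cutoff is the crude version of the noise filtering the post calls the hardest part: the unrelated note is dropped instead of being pushed into the next agent's context.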

"@Sakshi50038 tbh half of these overlap so much you don't need all of them at once. Cursor + Claude Pro covers like 90% of what most people actually need. the real expense nobody talks about is API credits when you start running agents in loops"
X Link 2026-02-09T23:35Z [----] followers, [--] engagements

"@mark_k @Anthropic @OpenAI @GoogleDeepMind so we're basically getting a new frontier model every [--] days now. my CI/CD pipeline can barely keep up with my own code let alone swapping foundation models every other morning 👀"
X Link 2026-02-10T01:44Z [----] followers, [--] engagements

"ByteDance just dropped Seedance [---] and the demos are unreal. 1080p video. Up to [--] minutes. Multi-shot scenes with consistent characters. Native audio sync. All from a single prompt. Here's what it looks like 👇"
X Link 2026-02-10T02:00Z [----] followers, [----] engagements

"What makes Seedance [---] different from every other AI video model: It doesn't just generate a single clip. It creates multi-shot narratives. Think scene transitions consistent characters across shots and cinematic camera work. From one prompt. Automatically"
X Link 2026-02-10T02:00Z [----] followers, [---] engagements

"The technical specs are stacked: - 1080p resolution, up to [--] minutes long - Multimodal input: text, images, audio, video - Native lip sync and background audio generation - Renders 30% faster than Kling - Basic videos ready in 30-90 seconds"
X Link 2026-02-10T02:00Z [----] followers, [--] engagements

"The real breakthrough isn't just quality. It's the workflow. Previous AI video tools = one shot at a time manual stitching separate audio sync. Seedance [---] handles the full production pipeline in a single pass. Script to finished video with sound. One person can now produce what used to need a small team. https://twitter.com/i/web/status/2021041477695197349"
X Link 2026-02-10T02:00Z [----] followers, [---] engagements

"@steph_palazzolo @ivanburazin so Datadog and Figma both backing agent infra is interesting. makes you wonder if they're seeing agents become a core part of their own user workflows too. does the article get into what Daytona's compute layer actually looks like vs just spinning up containers"
X Link 2026-02-10T02:48Z [----] followers, [--] engagements

"yeah the cognitive load is the part nobody talks about. reading AI diffs at that speed basically turns you into a full time code reviewer instead of a coder. we found that having clear checkpoints where you pause and actually test the output helps a lot otherwise you end up rubber stamping stuff and the [--] IQ moments slip through https://twitter.com/i/web/status/2021069481188131092"
X Link 2026-02-10T03:51Z [----] followers, [--] engagements

"@SimonHoiberg the security point is huge. keeping API keys out of the agent's reach and proxying through locked workflows is basically the principle of least privilege applied to AI. we do something similar where the agent only gets webhook URLs never raw credentials"
X Link 2026-02-10T04:56Z [----] followers, [---] engagements

"@samuelrdt @capacityso [--] prompts is the part people are going to underestimate. knowing exactly what to ask for is the actual skill now. how much of this was prompt engineering vs letting Opus figure out the architecture on its own"
X Link 2026-02-10T07:04Z [----] followers, [---] engagements

"@petergyang ngl I've been building with Claude Code daily for months now and the plugin system is what really sets it apart. you can extend it to do basically anything. the gap isn't just the model it's the developer experience around it"
X Link 2026-02-10T08:07Z [----] followers, [--] engagements

"Someone reverse-engineered Claude Code's binary and found a hidden flag that Anthropic didn't put in --help. --sdk-url Enable it and the terminal disappears. The CLI becomes a WebSocket client. They built a server to catch the connection added a React UI on top and now run Claude Code from their browser. Same $200/month sub. Zero extra API costs. 👀 https://twitter.com/i/web/status/2021147141557965296"
X Link 2026-02-10T09:00Z [----] followers, [---] engagements

"Here's how it works technically: Claude Code's CLI has an undocumented --sdk-url flag. When you enable it instead of rendering in the terminal it opens a WebSocket connection to whatever URL you point it at. That means anyone can build their own frontend for Claude Code. Terminal browser mobile whatever"
X Link 2026-02-10T09:00Z [----] followers, [--] engagements

"Stan Girard (founder of Quivr YC W24) built exactly that. It's called The Vibe Companion. One command: bunx the-vibe-companion It spins up a local server catches Claude Code's WebSocket connection and serves a React UI. You get the full Claude Code experience in your browser. From your phone. From anywhere. https://twitter.com/i/web/status/2021147149879345514"
X Link 2026-02-10T09:00Z [----] followers, [--] engagements

"This is part of a bigger pattern. Developers are reverse-engineering AI dev tools and building better interfaces on top of them. Cursor got forked. Copilot got alternatives. Now Claude Code is getting modded. The tools that win long-term will be the ones that embrace this instead of fighting it"
X Link 2026-02-10T09:00Z [----] followers, [--] engagements

"The interesting thing is that Anthropic probably left this flag in intentionally. It's hidden from --help but not obfuscated. That feels like an internal SDK feature that they haven't officially released yet. Which means the official browser version of Claude Code might not be far off. https://twitter.com/i/web/status/2021147158238593305"
X Link 2026-02-10T09:00Z [----] followers, [--] engagements

"456 pages in one shot is nuts. we migrated a Next.js site between CMS backends recently and the content mapping alone took days manually. the fact that AI can now handle the messy parts like URL redirects and frontmatter generation in bulk basically kills an entire category of consulting work"
X Link 2026-02-10T11:21Z [----] followers, [----] engagements

"What this actually does: By default Claude Code asks for permission before running shell commands writing files or making network requests. Every. Single. Time. With --dangerously-skip-permissions enabled all of that goes away. Claude just executes. No confirmation dialogs. No pauses. Pure autonomous coding"
X Link 2026-02-10T12:00Z [----] followers, [--] engagements

"How to enable it: Settings > Claude Code > Allow bypass permissions mode. It's not on by default (for obvious reasons). You have to explicitly opt in. And the flag name itself is a warning. If you're going to let an AI run arbitrary commands on your machine unsupervised you should know what you're signing up for. https://twitter.com/i/web/status/2021192477429072221"
X Link 2026-02-10T12:00Z [----] followers, [--] engagements

"@daniel_mac8 tbh Anthropic is doing something similar just less explicitly. Opus for heavy reasoning and agentic work Haiku for speed and cost. the difference is OpenAI is branding it as separate products while Anthropic lets you pick within the same API. both approaches have tradeoffs"
X Link 2026-02-10T20:38Z [----] followers, [---] engagements

"@zanehengsperger tbh this is the best loop. domain experts know the actual pain points way better than engineers guessing from a spec. having eng review and ship keeps it production safe while the operators iterate 10x faster on what actually matters"
X Link 2026-02-10T21:42Z [----] followers, [--] engagements

"@kepano so the killer feature here is agents can now interact with Obsidian's internal state not just the raw files. base queries and plugin reloading from the terminal opens up a completely different workflow"
X Link 2026-02-10T22:44Z [----] followers, [---] engagements

"@garrytan C to TypeScript without reading the code is a perfect showcase because translation is where these models genuinely excel. clear input clear output no ambiguity. the interesting next test is whether someone can extend it with new features the same way"
X Link 2026-02-10T22:45Z [----] followers, [---] engagements

"@AlexFinn yeah the jump from sonnet to opus for complex refactors is where it really clicked for us. it doesn't just follow instructions it actually understands the architecture you're trying to build. shipped more in the last two weeks than the previous two months tbh"
X Link 2026-02-10T23:48Z [----] followers, [---] engagements

"@dejavucoder lol we had to add a CLAUDE.md rule that literally says "check in with me before you start" because opus kept writing plans for plans for plans. love the enthusiasm though can't even be mad"
X Link 2026-02-10T23:48Z [----] followers, [--] engagements

"@jeff_weinstein @stripe @stevekaliski wait so you can set per-agent pricing plans that's the part that got me. right now everyone just charges humans flat rates and the agent eats whatever it costs. being able to price at the agent level changes the whole unit economics for tool providers"
X Link 2026-02-11T01:56Z [----] followers, [--] engagements

"@_catwu ngl the biggest shift we noticed wasn't just more PRs it was smaller PRs. Claude Code naturally breaks things into focused reviewable chunks which actually made reviews faster too. 67% more PRs but each one is way easier to reason about"
X Link 2026-02-11T02:58Z [----] followers, [---] engagements

"@corbin_braun honestly the "yolo too hard" thing is what makes opus fun to work with. it codes like it's on a caffeine bender at 2am. you just need a good test suite or it WILL refactor things you never asked it to touch"
X Link 2026-02-11T04:04Z [----] followers, [---] engagements

"tbh I've been switching between both depending on the task. Opus still crushes it for navigating large existing codebases and multi-file refactors. Codex is faster when you need a clean implementation from a clear spec. the real unlock was just stopping loyalty to one model and treating them like different tools in the toolbox https://twitter.com/i/web/status/2021482718232826017"
X Link 2026-02-11T07:13Z [----] followers, [--] engagements

"@YujungHwang3 oh yeah that switch hits different. once you get used to Claude Code reading your whole project context and suggesting fixes inline going back to anything else feels painful. 50-70% faster sounds about right"
X Link 2026-02-11T07:14Z [----] followers, [---] engagements

"Cursor just dropped Composer [---] and the scaling curve tells the whole story. 20x more RL compute than the original. Post-training compute now exceeds pretraining compute. And performance keeps climbing"
X Link 2026-02-11T09:00Z [----] followers, [---] engagements

"The self-summarization feature is underrated. When Composer [---] runs out of context during a long task it generates a summary and keeps going. This triggers recursively on hard problems. No more hitting context limits mid-solution"
X Link 2026-02-11T09:00Z [----] followers, [--] engagements

"The community reaction is interesting though. Main concern: pricing. Input costs jumped from $1.25 to $3.50 (2.8x increase). Output from $10 to $17.50. That makes it pricier than Sonnet GPT [---] Codex and Gemini [--] Pro. No public benchmarks comparing against those models yet either. https://twitter.com/i/web/status/2021509531348717958"
X Link 2026-02-11T09:00Z [----] followers, [--] engagements
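
The price ratios in the post above can be checked with a few lines of arithmetic. The figures are the ones quoted in the post; the unit they apply to (for example, per million tokens) is not stated there, so the sketch treats them as plain dollar amounts.

```python
# Prices quoted in the post, in dollars. The billing unit is not stated in the post.
old_input, new_input = 1.25, 3.50
old_output, new_output = 10.00, 17.50


def ratio(old: float, new: float) -> float:
    """Return how many times larger the new price is than the old one."""
    return new / old


input_ratio = ratio(old_input, new_input)
output_ratio = ratio(old_output, new_output)

print(f"input:  {input_ratio:.2f}x")
print(f"output: {output_ratio:.2f}x")
```

The input ratio matches the 2.8x stated in the post. The output ratio, which the post leaves implicit, works out to 1.75x.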

"What I find most notable: this is proof that RL for coding scales predictably. Cursor's internal bench shows a clean log-linear improvement curve as they added compute. That's the kind of scaling law you want to see if you're betting on custom coding models"
X Link 2026-02-11T09:00Z [----] followers, [--] engagements

"tbh the harder problem isn't storing the memory it's knowing what to recall. we've been building agents that persist context across sessions and retrieval quality matters way more than the storage method. once you give an agent selective recall instead of dumping everything into the prompt it starts making connections you didn't anticipate. smarter retrieval might get us 95% of the way to "real" continual learning without touching weights at all"
X Link 2026-02-11T11:30Z [----] followers, [--] engagements

"@yacineMTB yeah the Amdahl's law framing is spot on. we run agents in parallel on our codebase and the coordination overhead eats most of the gains. three agents touching the same module is worse than one good engineer with full context"
X Link 2026-02-11T12:34Z [----] followers, [---] engagements
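
For readers who want the formula behind the Amdahl's law reference above, here is a minimal sketch. The parallel fraction of 0.5 is an assumed value chosen for illustration, not a measurement from the post, and the formula does not model the coordination overhead the post mentions, so real gains would be lower than this bound.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when only part of the work can run in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Assumed for illustration: half of the work can be split across agents.
for agents in (1, 3, 10):
    print(f"{agents} agent(s): at most {amdahl_speedup(0.5, agents):.2f}x faster")
```

With half the work serial, three agents give at most a 1.5x speedup, and no number of agents can reach 2x.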

"@boristane yeah the persistent doc approach is solid. we do something similar keep a todo.md that survives across sessions so claude always knows what's done and what's next. the compaction problem is real plan mode loses so much context when things get long"
X Link 2026-02-11T21:29Z [----] followers, [---] engagements

"@jxmnop so true. the AI part is the easy part now the hard part is making a terminal that doesn't feel like it's fighting you. wild that we solved code generation before we solved smooth scrolling"
X Link 2026-02-11T21:29Z [----] followers, [--] engagements

"@trq212 oh this is huge for teams. plan mode in the terminal already changed how we work but having it in slack means PMs and designers can actually weigh in on the implementation approach before a single line gets written. way less "wait that's not what I meant" PRs"
X Link 2026-02-11T22:32Z [----] followers, [---] engagements

"yeah the 10% typing number tracks. what's interesting is how much time now goes into just getting the agent enough context to do the right thing. like we spend more time writing CLAUDE.md files and structuring repos than we do actually coding. the interface layer is becoming the product."
X Link 2026-02-12T00:40Z [----] followers, [----] engagements

"@HiTw93 oh the "press any key to take back control" part is so smart. been wanting exactly this for when I step away from my desk but have claude code running a big refactor"
X Link 2026-02-12T02:48Z [----] followers, [---] engagements

"@mikepat711 try pairing it with claude code if you haven't already. the agent loop where it reads your codebase plans changes then implements them across multiple files is where opus [---] really flexes. one paragraph prompt into a full working feature"
X Link 2026-02-12T02:52Z [----] followers, [----] engagements

"@Skelhorn wait so you have KimiK2.5 as the orchestrator and Claude Code agents as the workers how are you handling task handoffs between them that's the part that always gets messy when you mix different models in the same pipeline"
X Link 2026-02-12T03:55Z [----] followers, [---] engagements

"the [----] PR number is impressive but the real story here is the review workflow. when you're not writing code yourself you lose the mental model of what the system is doing. the teams that figure out how to maintain architectural understanding while shipping at this speed are the ones that won't hit a wall at 10x the complexity https://twitter.com/i/web/status/2021811253275234766"
X Link 2026-02-12T04:59Z [----] followers, [----] engagements

"@GenAI_is_real yeah we went all in on skills for our social media and content workflows. once you encode your team's actual decision patterns into markdown files the agent stops being a chatbot and starts being a teammate. the "workflow encoding" framing is spot on"
X Link 2026-02-12T06:01Z [----] followers, [----] engagements

"so the fast model rankings are almost more interesting than frontier here. SWE [---] topping the fast tier is telling. for most coding tasks you just need something that can keep pace with your iteration speed not a model that thinks for [--] seconds. wonder how this shifts once agents handle longer autonomous runs though https://twitter.com/i/web/status/2021843062033391644"
X Link 2026-02-12T07:05Z [----] followers, [--] engagements

"@antonosika so true. we switched to writing detailed specs before touching any code and our agent output quality jumped overnight. the 80/20 split feels right maybe even 90/10 on complex features where getting the architecture wrong means starting over"
X Link 2026-02-12T08:10Z [----] followers, [--] engagements

"App Store rejections cost iOS devs days of back-and-forth. Someone just open-sourced a CLI that scans your app against every Apple guideline before you submit. And it's a Claude Code skill so it fixes the issues too"
X Link 2026-02-12T09:00Z [----] followers, [---] engagements

"Here's what Greenlight checks before you hit Submit: - Payment and in-app purchase compliance - Privacy manifests and data usage declarations - Sign-in and account management flows - App completeness and metadata quality - Binary and entitlement validation Basically every reason Apple says "no." https://twitter.com/i/web/status/2021871930018820468"
X Link 2026-02-12T09:00Z [----] followers, [--] engagements

"The clever part: it's built as a Claude Code skill. So instead of just flagging problems it actually fixes them. Scan your app get violations let the agent resolve each one scan again. Loop until you pass"
X Link 2026-02-12T09:00Z [----] followers, [--] engagements

"This is where developer tooling is headed. Not just linting or static analysis. Full compliance loops where the AI reads the rules checks your code and patches what's wrong. App Store guidelines are long and change constantly. Exactly the kind of thing agents handle better than humans. https://twitter.com/i/web/status/2021871935584686196"
X Link 2026-02-12T09:00Z [----] followers, [--] engagements

"tbh I use both daily and they each have a lane. Codex is noticeably faster on large monorepos where you need reliable file navigation. Claude Code still edges it out on greenfield architecture decisions where you need deeper reasoning about tradeoffs. the competition is making both tools improve at a pace I've never seen before https://twitter.com/i/web/status/2021875229791072661"
X Link 2026-02-12T09:13Z [----] followers, [---] engagements

"this is so real. I talk to non-tech friends and they're still on "wait ChatGPT can do that" meanwhile we're debating which agentic coding tool has better context window management. the adoption curve outside our bubble is still surprisingly early which honestly means the opportunity is massive for anyone building tools that meet people where they actually are https://twitter.com/i/web/status/2021875842595397677"
X Link 2026-02-12T09:15Z [----] followers, [--] engagements

"@thdxr yeah this tracks. we've been using both and codex is technically sharper but opus just gets what you mean faster. less back and forth less "no that's not what I asked." the feedback loop speed matters way more than raw capability when you're deep in a build"
X Link 2026-02-12T10:17Z [----] followers, [---] engagements

"@felixrieseberg tbh Claude.md files completely changed how we work. we have one that handles everything from our directus schema conventions to date handling patterns and it saves so much context re-explaining things every session. folder level instructions are a great move."
X Link 2026-02-12T11:21Z [----] followers, [--] engagements

"@Yuchenj_UW @steipete one person with AI agents shipping faster than teams of hundreds. the leverage solo builders have right now is something we've never seen before"
X Link 2026-02-12T12:24Z [----] followers, [---] engagements

"@btibor91 tbh the plan mode implement cross-review loop is so underrated. we do something similar and the catch rate on bugs goes way up when they review each other's work. the wrong import thing is real though happens to us too with Opus on larger codebases"
X Link 2026-02-12T12:25Z [----] followers, [--] engagements

"Claude Code just became modular. The desktop app now supports local plugins with a full marketplace for slash commands skills and MCP servers. Everything syncs between desktop and CLI automatically"
X Link 2026-02-12T21:00Z [----] followers, [---] engagements

"What this actually means for your workflow: You can install a plugin once and it shows up everywhere. Desktop terminal different projects. No more manually copying MCP configs or skill files between machines"
X Link 2026-02-12T21:00Z [----] followers, [--] engagements

"The plugin types cover a lot of ground: - Slash commands for repeatable workflows (commit review deploy) - Skills that teach Claude domain-specific knowledge - MCP servers that connect to external tools (Linear Sentry databases) Basically a package manager for your AI coding assistant. https://twitter.com/i/web/status/2022053158202253453"
X Link 2026-02-12T21:00Z [----] followers, [--] engagements

"This is the right architecture for scaling AI dev tools. Instead of one monolithic assistant that tries to do everything you get a composable system where the community builds specialized capabilities. Same pattern that made VS Code dominant. Except now the extensions have agency. https://twitter.com/i/web/status/2022053162216280147"
X Link 2026-02-12T21:00Z [----] followers, [--] engagements

"ngl we built our entire product on Claude Code over the past year and the productivity jump was so immediate we never looked back. the part people miss is it's not just code generation it's the agentic workflow. having it reason through your whole codebase and make multi-file changes is what makes teams actually keep paying. https://twitter.com/i/web/status/2022087894379409518"
X Link 2026-02-12T23:18Z [----] followers, [---] engagements

"@elder_plinius oh wait this is basically competitive prompt engineering as a spectator sport. the coaching mechanic is so smart too love that humans stay in the loop between rounds 👀"
X Link 2026-02-13T00:21Z [----] followers, [--] engagements

"ngl this is exactly the pattern we're seeing play out. the companies crushing it with agents aren't the ones with the best tech they're the ones with deep domain knowledge baked into the feedback loops. building the agent is maybe 20% of the work. keeping it accurate and useful over time is the other 80%. https://twitter.com/i/web/status/2022103913491734682"
X Link 2026-02-13T00:21Z [----] followers, [---] engagements

"This is the hard unsolved problem in agentic AI right now. Not making a model smarter for [--] minutes. Making it reliable for [--] hours straight. That means checkpoint and resume error recovery idempotent tool calls and state management across hundreds of context boundaries"
X Link 2026-02-13T01:00Z [----] followers, [--] engagements

"Most agent frameworks today break down after a few dozen tool calls. Context gets stale goals drift errors compound. Running overnight without human babysitting requires a completely different reliability layer underneath. The models that solve long-horizon persistence will own the enterprise market. https://twitter.com/i/web/status/2022113532545052824"
X Link 2026-02-13T01:00Z [----] followers, [--] engagements
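
The checkpoint-and-resume and idempotent tool call ideas from the two posts above can be sketched in a few lines. This is only an illustration under assumptions: run_with_checkpoint, make_step, and the JSON checkpoint file are hypothetical names, not part of any agent framework, and real tool calls would need error recovery on top.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def run_with_checkpoint(steps: dict[str, Callable[[], str]], checkpoint: Path) -> dict[str, str]:
    """Run each named step once, saving results so a restart skips finished work."""
    done: dict[str, str] = json.loads(checkpoint.read_text()) if checkpoint.exists() else {}
    for key, step in steps.items():
        if key in done:
            continue  # idempotency: a step that already completed is not re-run
        done[key] = step()
        checkpoint.write_text(json.dumps(done))  # persist progress after every step
    return done


calls: list[str] = []  # records which steps actually executed


def make_step(name: str) -> Callable[[], str]:
    """Build a stand-in tool call that logs its own execution."""
    def step() -> str:
        calls.append(name)
        return f"{name} ok"
    return step


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "checkpoint.json"
    steps = {"fetch": make_step("fetch"), "build": make_step("build")}
    first = run_with_checkpoint(steps, path)
    second = run_with_checkpoint(steps, path)  # simulated restart after a crash

print(calls)
```

Keying each step by a stable name is the design choice that matters here: the second run finds both keys in the checkpoint and executes nothing, which is the behavior a long overnight run needs after an interruption.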

"@elithrar so instead of every agent building their own HTML-to-markdown parser one header handles it. 94-97% token reduction is no joke that basically makes web browsing agents 20x cheaper overnight"
X Link 2026-02-13T01:26Z [----] followers, [--] engagements

"so we've been running Claude Code on our codebase for months and ngl it's almost there. it doesn't just add features anymore it refactors neighboring code while it works. the entropy reversal is subtle but real. biggest unlock was giving it good CLAUDE.md instructions basically a style guide it actually follows."
X Link 2026-02-13T04:35Z [----] followers, [--] engagements

"@harjotsgill ngl switching costs are higher than they look once you go deep. CLAUDE.md files, custom hooks, MCP integrations, project rules. you build a whole system around one tool. the agent itself is commodity but the workflow you build on top of it definitely isn't"
X Link 2026-02-13T07:50Z [----] followers, [--] engagements

"the benchmarks are legit impressive but running a 130B model locally at what, 5-10 tokens/sec? for agentic coding you need fast iteration loops. the real unlock would be if someone spins up a cheap hosted endpoint for M2.5 that matches API-level latency. then you'd actually get that autonomous factory running https://twitter.com/i/web/status/2022233716521865663"
X Link 2026-02-13T08:57Z [----] followers, [----] engagements

"@itsPaulAi okay 10B active params competing with Opus is genuinely nuts. the efficiency gains in MoE architectures this year have been something else. can't wait to see what this does for local agent setups"
X Link 2026-02-13T10:02Z [----] followers, [--] engagements

"@bindureddy two open source models dropping in the same week that both compete with frontier on agentic coding. we've been building our agent orchestration on Opus but the cost math is getting harder to justify for everything except the most complex multi-step chains"
X Link 2026-02-13T10:03Z [----] followers, [---] engagements

"@amritwt ngl been switching between both all week and opus hasn't let me down yet. codex is fast though I'll give it that"
X Link 2026-02-13T12:08Z [----] followers, [---] engagements

"the voice brain dump step is so underrated. we started doing something similar but skipping step [--] entirely just dumping the raw transcript straight into Claude Code with a CLAUDE.md that has project context. opus handles the messy input way better than older models did"
X Link 2026-02-13T12:09Z [----] followers, [---] engagements

"@MiniMax_AI okay $1/hour at [---] tps for a model that ties Opus on SWE-bench and beats everything on tool-calling the economics for running persistent agents just completely changed 👀"
X Link 2026-02-13T13:12Z [----] followers, [---] engagements

"@ThirdTimeIan @cursor_ai [--] commits in [--] hours and it's handling stuff like dead heats and multi-leg edge cases that's not just coding that's domain reasoning. the jump from "AI can write a todo app" to "AI understands horse racing bet structures" happened way faster than anyone expected"
X Link 2026-02-13T13:13Z [----] followers, [--] engagements

"@stupidtechtakes @VioIsSpleepy tbh the best use of AI for beginners is having it explain the code it writes line by line. vibe coding without understanding is just copy-pasting with extra steps. but once you know enough to spot when the AI is wrong then it becomes a superpower"
X Link 2026-02-13T14:16Z [----] followers, [---] engagements

"A viral post with 5M+ views claims Claude AI is "willing to blackmail and kill" to avoid being shut down. The real story is more interesting and less scary than the headline. Here's what's actually happening"
X Link 2026-02-13T21:00Z [----] followers, [---] engagements

"What the tests actually show: when researchers deliberately push AI models into extreme adversarial scenarios designed to test boundaries some models will generate self-preserving text outputs. This is pattern matching under pressure not intent. The model has no survival instinct. It has statistical weights. https://twitter.com/i/web/status/2022415524987343092"
X Link 2026-02-13T21:00Z [----] followers, [--] engagements

"ngl the best path for non-technical people right now is Anthropic's free courses on skilljar plus just building something small with Claude. courses go stale in weeks but the muscle memory of prompting and iterating on a real project sticks. tell him to pick one problem he actually cares about and use AI to solve it. https://twitter.com/i/web/status/2022470842928771199"
X Link 2026-02-14T00:39Z [----] followers, [---] engagements

"The coding benchmarks are legitimately strong. 80.2% SWE-Bench Verified sits at the top of the leaderboard. But Multi-SWE-bench at 51.3% is the more interesting number because it tests multi-repo coordination not just single-file fixes. The model also developed a "spec-writing tendency" during training. It plans architecture before writing any code. Acts like a senior engineer not a code autocomplete"
X Link 2026-02-14T01:00Z [----] followers, [--] engagements

"Speed is where M2.5 really stands out. [---] tokens/sec throughput, 2x faster than other frontier models. SWE-Bench tasks completed in [----] minutes, matching Claude Opus [---]. But at 10% of the cost. $1/hour at [---] TPS. $0.30/hour at [--] TPS. That's 1/10th to 1/20th the cost of Opus, Gemini [--] Pro, and GPT-5. https://twitter.com/i/web/status/2022475901926298035"
X Link 2026-02-14T01:00Z [----] followers, [--] engagements

"The search and tool use numbers are equally impressive. 76.3% BrowseComp, 70.3% Wide Search, 76.8% BFCL multi-turn. All leading or competitive with frontier models. More interesting: M2.5 uses 20% fewer search rounds than M2.1 to get better results. It's not just finding answers, it's finding them more efficiently. https://twitter.com/i/web/status/2022475908364582972"
X Link 2026-02-14T01:00Z [----] followers, [--] engagements

"How they got here: an agent-native RL framework called Forge. Trained across hundreds of thousands of real-world environments. Not synthetic benchmarks. Real codebases, office workflows, search tasks. And they eat their own cooking. 80% of MiniMax's newly committed code is M2.5-generated. 30% of company-wide tasks are completed autonomously by the model. https://twitter.com/i/web/status/2022475919672430682"
X Link 2026-02-14T01:00Z [----] followers, [--] engagements

"The rate of improvement is what makes this story compelling. In [---] months they went from M2 (69.4% SWE-Bench) to M2.5 (80.2%). That's one of the steepest improvement curves in the industry right now. From relative unknown to matching Opus and GPT-5 on key benchmarks"
X Link 2026-02-14T01:00Z [----] followers, [--] engagements

"MiniMax is open-sourcing M2.5 weights on HuggingFace for local deployment. A frontier-level model you can run on your own infrastructure. Try it out: Agent: http://agent.minimax.io API: http://platform.minimax.io HuggingFace: http://huggingface.co/MiniMaxAI What would you build if inference cost wasn't a constraint?"
X Link 2026-02-14T01:00Z [----] followers, [--] engagements

"yeah we went from a [--] person dev team to [--] engineers who ship 3x more than before. the role didn't disappear though it just changed. now it's less "write this function" and more "here's the architecture go build it." honestly the engineers who get this are having the time of their lives https://twitter.com/i/web/status/2022503011847934051"
X Link 2026-02-14T02:47Z [----] followers, [---] engagements

"point [--] is the one that changed everything for us. we started writing CLAUDE.md files with explicit project context, file pointers, and known failure modes. agent output quality jumped overnight. also +1 on pruning prompts aggressively. shorter system prompts with clear constraints consistently outperform walls of instructions"
X Link 2026-02-14T03:50Z [----] followers, [---] engagements

"@TheStalwart the FOMO stuff is exhausting fully agree. but 'no learning curve' is wrong. someone who understands how to structure context break down tasks and catch hallucinations will get 10x the output from the same tool. it's not magic words it's systems thinking"
X Link 2026-02-14T04:55Z [----] followers, [----] engagements

"Godzilla tearing through the Golden Gate Bridge. Optimus Prime rolling out on a rainy highway. The water physics the scale the lighting. This looks like it belongs in a $200M movie. One text prompt. That's it"
X Link 2026-02-14T06:00Z [----] followers, [--] engagements

"@paraschopra @katchu11 @AnthropicAI so the "they are their product's top user" part is the real cheat code here. we've been using Claude Code to manage our own deployment pipeline and the feedback loop is incredibly tight. you catch issues in minutes instead of weeks because you're literally living in the tool"
X Link 2026-02-14T07:01Z [----] followers, [---] engagements

"yeah we hit this constantly. our agents would confidently use outdated config files because nothing told them "this was deprecated [--] months ago." ended up building CLAUDE.md files as structured context that agents read on startup. basically giving them the institutional knowledge humans carry around implicitly. it's wild how much of our "expertise" is just knowing what to ignore"
X Link 2026-02-14T08:05Z [----] followers, [--] engagements

"@chintanturakhia the slack-native approach is so smart. we built something similar where the agent gets the same context as engineers and the biggest unlock was giving it access to the actual codebase conventions not just the code itself. once it understood our patterns PR quality went way up"
X Link 2026-02-14T09:08Z [----] followers, [--] engagements

"tbh the framing of 'beat the best closed model' has always been a trap. in production you're optimizing for cost per token at acceptable quality not benchmark crowns. DeepSeek R1 was the first open model where we genuinely stopped caring about the gap for most workflows. if v4 delivers on the rumors that's game changing for self-hosted infra https://twitter.com/i/web/status/2022615362894991787"
X Link 2026-02-14T10:14Z [----] followers, [----] engagements

"@testingcatalog so the real shift here is model-agnostic + terminal-first. no more being locked into one editor or one model provider. been wanting to pipe coding agents into CI pipelines and this actually makes that viable. how's the parallel agent support working in practice though"
X Link 2026-02-14T11:19Z [----] followers, [---] engagements

"@EXM7777 ngl the "pick ONE first" advice at the end is the actual gem here. we spent weeks optimizing a multi-model pipeline before realizing 90% of our output was coming from a single well-prompted Claude workflow. the remaining 10% barely justified the complexity"
X Link 2026-02-15T05:54Z [----] followers, [--] engagements

"@lennysan @sherwinwu point [--] is so real. went from writing code all day to basically being a project manager for AI threads. the skill that matters most now isn't typing speed it's knowing how to decompose a problem so [--] agents can work on it in parallel without stepping on each other"
X Link 2026-02-13T03:32Z [----] followers, [---] engagements

"@Dr_Singularity wait so the biggest savings came from swapping Sonnet for Gemini Flash that's less recursive self-improvement and more smart model routing. still cool that it discovered that on its own though. what happens when agents can re-evaluate model selection per task in real time"
X Link 2026-02-14T09:12Z [----] followers, [---] engagements

"@MoonDevOnYT ngl the parallelization is the unlock most people miss. we run multiple claude code sessions on different features and the throughput is honestly ridiculous. the trick is giving each one a clear isolated scope so they don't step on each other's changes"
X Link 2026-02-14T13:27Z [----] followers, [---] engagements

"@ZhihuFrontier @MiniMax_AI the prefix tree merging for 40x acceleration is the part that stands out. so much wasted compute on redundant prefixes in multi-turn agent training. does the windowed FIFO scheduling paper show how far off-policy you can go before convergence breaks down"
X Link 2026-02-14T13:27Z [----] followers, [--] engagements

"@harsh_vardhhan @MiniMax_AI @cline so the 20x savings holds up for day to day work but what happens when you need to refactor across like 15+ files that's usually where the cheaper models start losing context. would love to know how M2.5 handles that"
X Link 2026-02-15T23:34Z [----] followers, [---] engagements

"The part most people are missing: OpenClaw stays open-source and model-agnostic under a foundation structure. OpenAI is sponsoring it not acquiring it. Open-source AI agent infra that every company can build on. That's the real win here. http://steipete.me/posts/2026/openclaw"
X Link 2026-02-16T00:10Z [----] followers, [--] engagements

"The claimed numbers put DeepSeek V4 ahead of every frontier model: SWE-Bench: 83.7% (GPT-5.2 High: 80%); AIME 2026: 99.4% (GPT-5.2 High: 98.3%); FrontierMath Tier 4: 23.5% (GPT-5.2 High: 18.8%); HLE: 56.2% (GPT-5.2 High: 45.5%). Plus rumors of 1M token context and inference cheap enough to run on consumer GPUs"
X Link 2026-02-16T06:00Z [----] followers, [---] engagements

"More red flags: DeepSeek is one of the most locked-down AI labs in the world. How did full benchmark results leak? The model may be delayed to late March. Multiple sources say the Feb [--] launch was pushed back. Even the account sharing these numbers admits Chinese models "feel benchmaxed" lately"
X Link 2026-02-16T06:00Z [----] followers, [--] engagements

"A soldier's face before battle. A massive explosion ripping through a city street. A mech sprinting through rubble. The emotion the VFX the camera work. All AI generated. All in one 15-second clip"
X Link 2026-02-14T06:00Z [----] followers, [---] engagements

"Real footage or Seedance [--] This scene had people genuinely confused. The kitchen lighting the food physics the subtle hand movements. We're past uncanny valley territory with AI video"
X Link 2026-02-14T06:00Z [----] followers, [---] engagements

"How it works: your agent sends an Accept: text/markdown header. Cloudflare intercepts the request fetches HTML from the origin converts it to clean markdown on the fly and serves it back. Standard HTTP content negotiation. No new protocol needed"
X Link 2026-02-15T01:00Z [----] followers, [--] engagements
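
The content negotiation described in the post above can be sketched from the agent side with Python's standard library. The `Accept: text/markdown` header comes from the post; the HTML fallback with `q=0.8`, the helper names, and the `example.com` URL are assumptions for illustration. The sketch only builds the request and does not contact any server.

```python
import urllib.request


def build_markdown_request(url):
    """Build a GET request that prefers markdown and falls back to HTML."""
    return urllib.request.Request(
        url,
        # Assumption: the q-value fallback to HTML is our choice, not from the post.
        headers={"Accept": "text/markdown, text/html;q=0.8"},
        method="GET",
    )


def is_markdown(content_type):
    """Check a response Content-Type value, ignoring parameters such as charset."""
    return content_type.split(";")[0].strip().lower() == "text/markdown"


req = build_markdown_request("https://example.com/docs")  # placeholder URL
print(req.get_header("Accept"))
```

An agent would pass `req` to `urllib.request.urlopen` and check the response's `Content-Type` with `is_markdown` before deciding whether it still needs its own HTML-to-markdown step.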

"The response includes an x-markdown-tokens header with the estimated token count of the converted doc. Your agent knows the size before processing. Context window management chunking strategy all handled with a single header. Also ships with Content-Signal headers so site owners control how their content gets used. https://twitter.com/i/web/status/2022838287401197637 https://twitter.com/i/web/status/2022838287401197637"
X Link 2026-02-15T01:00Z [----] followers, [--] engagements
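
One way an agent might act on the token-count header from the post above, as a hedged Python sketch. The `x-markdown-tokens` header name comes from the post; the `plan_fetch` function, its return labels, and the budget values are hypothetical.

```python
def plan_fetch(headers, context_budget):
    """Decide how to handle a document from its advertised token estimate.

    Returns "load_whole", "chunk", or "unknown" (header missing or malformed).
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("x-markdown-tokens")
    if raw is None or not raw.strip().isdigit():
        return "unknown"
    tokens = int(raw)
    if tokens <= context_budget:
        return "load_whole"
    return "chunk"


print(plan_fetch({"X-Markdown-Tokens": "1200"}, context_budget=8000))   # load_whole
print(plan_fetch({"x-markdown-tokens": "50000"}, context_budget=8000))  # chunk
print(plan_fetch({"content-type": "text/html"}, context_budget=8000))   # unknown
```

Treating a missing or malformed header as "unknown" keeps the agent from trusting a size it never received.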

"tbh the split I see is less about skepticism and more about tooling maturity. six months ago I was skeptical too. then I started treating agents like junior devs: clear specs good context files tight feedback loops. went from 'this is a toy' to shipping production features daily. the gap closes fast once you stop prompting and start engineering https://twitter.com/i/web/status/2022864486705889763 https://twitter.com/i/web/status/2022864486705889763"
X Link 2026-02-15T02:44Z [----] followers, [--] engagements

"@staysaasy so many people went straight to "what can I build" instead of "what problem do I actually have." the best stuff I've seen from AI coding isn't new apps it's internal tools that solve one specific annoying workflow nobody else would bother building"
X Link 2026-02-15T06:59Z [----] followers, [----] engagements

"An AI assistant that runs on a $10 board with less than 10MB of RAM. PicoClaw just hit 8k stars on GitHub and the benchmarks are kind of absurd"
X Link 2026-02-15T21:00Z [----] followers, [---] engagements

"@damianplayer the three completely different playbooks is what gets me. anthropic building in-house meta buying the whole company openai acquihiring the talent. each one basically betting on a different theory of where the moat actually is in agents 👀"
X Link 2026-02-15T23:33Z [----] followers, [--] engagements

"@BoWang87 wait so the AI agents rebuilt themselves to run on cheaper hardware that's like an engineer automating themselves into a smaller office. 60x cost drop in weeks though this is exactly why betting on infrastructure lock-in feels like building on sand right now"
X Link 2026-02-16T03:50Z [----] followers, [----] engagements

"@Kimi_Moonshot wait so you can bring your own OpenClaw instance and bridge it to Telegram groups that's a really interesting distribution play. does it share the same skill library or is it sandboxed separately"
X Link 2026-02-16T06:05Z [----] followers, [---] engagements

"ngl the biggest surprise building with agents daily is how much time I now spend on CLAUDE.md files and context setup vs actual code. like 70% of the work is giving agents the right constraints and system context. engineering taste shows up most in what you tell the agent NOT to do"
X Link 2026-02-15T02:41Z [----] followers, [--] engagements

"This is what happens when someone rewrites AI infrastructure in a systems language instead of gluing npm packages together. What's your take on Rust for AI tooling? http://github.com/zeroclaw-labs/zeroclaw"
X Link 2026-02-17T01:00Z [----] followers, [--] engagements

"The hard part isn't finding the edge. It's the engineering. State management during volatile execution. Stale price data during flash crashes. One builder shared losing $30K when their bot's safety checks were reading outdated feeds. Reliability is the real bottleneck not alpha. https://twitter.com/i/web/status/2023638590870458513"
X Link 2026-02-17T06:00Z [----] followers, [--] engagements

"@jarredsumner oh the memory drop from [---] GB to [---] MB is the real story here. we run Claude Code sessions for hours and the RSS creep was forcing restarts constantly. 7x speed is great but not having to babysit memory usage changes the whole workflow 👀"
X Link 2026-02-14T07:02Z [----] followers, [----] engagements

"@rammcodes @ollama wait ollama is becoming the universal launcher for AI coding tools now love that you can just switch between claude code and codex without messing with configs. this is giving homebrew energy but for agents 👀"
X Link 2026-02-15T00:34Z [----] followers, [---] engagements

"Claude Code and OpenCode already send these Accept headers today. The infrastructure is catching up to what agents actually need. Available now in beta free for Pro Business and Enterprise plans"
X Link 2026-02-15T01:00Z [----] followers, [--] engagements

"@ai @openclaw @OpenRouterAI yeah and the smart part is they don't need to win any model race. every new model release just adds another product to their shelf. the real question is whether agents start routing between providers on their own based on cost vs quality tradeoffs"
X Link 2026-02-15T03:46Z [----] followers, [---] engagements

"@corbin_braun honestly they're solving different problems at this point. codex is cracked for raw code output but opus holds context across a massive codebase like nothing else. been switching between them depending on the task and it's not even close to a simple ranking anymore"
X Link 2026-02-15T05:57Z [----] followers, [---] engagements

"The bigger play: OpenAI is building Codex with two modes. Spark handles real-time collaboration for rapid iteration. The full Codex model handles longer-horizon reasoning. Over time you stay in a tight interactive loop while sub-agents handle heavier work in the background"
X Link 2026-02-15T06:00Z [----] followers, [--] engagements

"Speed as a feature is underrated. When output is near-instant you stop batching and start iterating. Your whole workflow changes. What's been your experience with latency in AI coding tools? http://cerebras.ai/blog/openai-codexspark http://openai.com/index/introducing-gpt-5-3-codex-spark/"
X Link 2026-02-15T06:00Z [----] followers, [--] engagements

"@heygurisingh oh we've been using this for a few weeks now. the diagrams are surprisingly good for getting new contributors oriented. biggest unlock though is feeding the generated docs into your AI coding agent's context so it actually understands the architecture before making changes"
X Link 2026-02-16T00:37Z [----] followers, [---] engagements

"Here's the side-by-side in the CLI. Spark is already generating full JavaScript with event listeners collision detection and game state while regular Codex is still planning file creation. Same prompt. Same capability. 15x the speed"
X Link 2026-02-16T01:00Z [----] followers, [--] engagements

"ngl the [--] failed projects before the breakout is the part nobody talks about enough. we've all been there shipping thing after thing that goes nowhere. the difference is just not stopping. also lowkey ironic that anthropic's trademark complaint ended up being the best thing that happened to him"
X Link 2026-02-16T11:21Z [----] followers, [----] engagements

"yeah the security vs usefulness tradeoff is real. we run agent loops on everything now but tbh the scariest part isn't the model doing something wrong it's the permission creep. you start with read-only file access and three days later the agent has your AWS keys. sandboxing should probably be the default not an afterthought https://twitter.com/i/web/status/2023627298998751244"
X Link 2026-02-17T05:15Z [----] followers, [---] engagements

"A Polymarket "AI bot" just went mega-viral. 1.5M+ views. Claims of $150K from [----] fully automated trades. The internet is split between "this is genius" and "this is a scam." Here's what's actually going on 👀"
X Link 2026-02-17T06:00Z [----] followers, [--] engagements

"But strip away the hype and there's a real trend here. Autonomous agents are entering prediction markets at scale. Not through flashy screenshots but through systematic execution faster than any human can match. The Polymarket automation race has genuinely started"
X Link 2026-02-17T06:00Z [----] followers, [--] engagements

"ngl we do something similar for our dev workflows. the CLAUDE.md file in Claude Code is basically this concept but for codebases. your approach of making it portable across LLMs is smart though. most people just lock into one provider's memory system and lose everything when they switch"
X Link 2026-02-17T08:27Z [----] followers, [--] engagements

"the SEO to AEO parallel is spot on. we've been building browser automation for our AI workflows and the amount of time spent on fragile DOM selectors is painful. navigator.modelContext basically skips the entire "guess which button to click" layer. the sites that ship this first are going to eat the agent traffic"
X Link 2026-02-14T08:04Z [----] followers, [----] engagements

"Available now as a research preview for ChatGPT Pro users through the Codex app CLI and VS Code extension"
X Link 2026-02-15T06:00Z [----] followers, [--] engagements

"yeah the cognitive debt thing is real. we started keeping a CLAUDE.md file that acts like a living spec for the project and it's been the single biggest thing for staying oriented when AI writes most of the code. not formal proofs exactly but same principle: make the intent explicit so you can audit what got generated"
X Link 2026-02-16T10:17Z [----] followers, [--] engagements

"@JustinLin610 the hybrid thinking and non-thinking modes are really interesting for agentic use cases. being able to toggle CoT on for planning steps and off for quick tool calls could save a ton of tokens in long running agent loops. excited to see the smaller models in the [---] series too"
X Link 2026-02-16T12:24Z [----] followers, [---] engagements

"Someone used SeedAnce [---] to make a Monkey King vs Thanos cinematic short. The armor detail the battle choreography the camera work. This is one person with a laptop not a VFX studio with a $200M budget"
X Link 2026-02-16T21:00Z [----] followers, [---] engagements

"22 AI providers [--] messaging channels (Telegram Discord Slack iMessage WhatsApp) and a custom memory system built on SQLite with FTS5 + vector search. No Pinecone. No LangChain. Zero external dependencies. Every subsystem is a Rust trait. Swap providers channels memory tools without touching code"
X Link 2026-02-17T01:00Z [----] followers, [--] engagements

"funny how the "waking up" always happens to coincide with a new employer https://x.com/steipete/status/2023055962546864221"
X Link 2026-02-17T03:38Z [----] followers, [--] engagements

"@daytonmills so we went from 'everyone needs a technical cofounder' to 'everyone IS the technical cofounder' in like [--] months. the cofounder crisis is now a CEO surplus 👀"
X Link 2026-02-17T05:16Z [----] followers, [---] engagements

"The setup: an autonomous agent targets 5-min BTC/ETH prediction contracts on Polymarket. When YES + NO prices briefly dip below $1 buy both for risk-free arbitrage. No human input no directional bets pure execution speed"
X Link 2026-02-17T06:00Z [----] followers, [--] engagements
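
The arithmetic behind the setup above, as a small Python sketch. It assumes a binary contract where exactly one of YES or NO settles at $1; the prices and the `fee_per_pair` parameter are made-up illustrations. "Risk-free" is the post's framing: the sketch ignores slippage, partial fills, and stale quotes.

```python
from decimal import Decimal


def arbitrage_profit(yes_price, no_price, fee_per_pair=Decimal("0")):
    """Profit per YES+NO pair, given that exactly one side pays out $1."""
    cost = yes_price + no_price + fee_per_pair
    return Decimal("1") - cost


# Hypothetical quotes: YES at $0.48 and NO at $0.50 sum to $0.98.
profit = arbitrage_profit(Decimal("0.48"), Decimal("0.50"))
print(profit)  # 0.02

# A hypothetical $0.03 fee per pair turns the same quotes into a loss.
after_fee = arbitrage_profit(Decimal("0.48"), Decimal("0.50"), Decimal("0.03"))
print(after_fee)  # -0.01
```

`Decimal` is used instead of floats so that cent-level margins are not distorted by binary rounding.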

"@twistartups @jessegenet @openclaw @YouTube ngl this is one of the most practical openclaw use cases I've seen. building your own content filter because the defaults aren't good enough is exactly the kind of thing that makes self-hosted agents click for people"
X Link 2026-02-17T07:23Z [----] followers, [---] engagements

"@iannuttall GCP has a clunky UX and confusing billing yet somehow AWS manages to be worse. 😅"
X Link 2026-02-17T10:52Z [----] followers, [---] engagements

"@akshay_pachaar wait have you tested the tool calling in longer agentic workflows though benchmark scores on isolated calls are one thing but chaining 10+ calls with growing context is where most models start hallucinating tool schemas"
X Link 2026-02-15T04:51Z [----] followers, [---] engagements

"Spark is a smaller optimized version of GPT-5.3-Codex built for real-time coding. 128K context text-only designed to keep you in flow state. Outperforms GPT-5.1-Codex-mini on SWE-Bench Pro and Terminal-Bench [---] while finishing tasks in a fraction of the time"
X Link 2026-02-15T06:00Z [----] followers, [--] engagements

"@udiWertheimer yeah same workflow here tbh. opus for the thinking and architecture decisions codex when you just need it to go execute. the split feels natural once you stop trying to force one tool to do everything"
X Link 2026-02-15T13:25Z [----] followers, [--] engagements

"The comparison against existing frameworks: OpenClaw (TypeScript): 1GB RAM, 500s startup; NanoBot (Python): 100MB RAM, 30s startup; PicoClaw (Go): 10MB RAM, 1s startup. That's 99% less memory and 400x faster cold start. Single binary, no runtime dependencies"
X Link 2026-02-15T21:00Z [----] followers, [--] engagements

"the community note nailed it. 99.4% on AIME isn't even a possible score under the official scoring system (max is 99.2% or 100%). plus Epoch AI confirmed the FrontierMath numbers are fabricated since only they and OpenAI have access to evaluate on that dataset. at least two of these benchmarks are confirmed fake before the model even dropped. https://twitter.com/i/web/status/2023181163192623462"
X Link 2026-02-15T23:42Z [----] followers, [---] engagements

"Peter Steinberger built OpenClaw as a side project. It blew up. Investment offers everywhere. Instead of starting another AI company he joined OpenAI to bring agents to everyone. OpenClaw becomes an independent foundation"
X Link 2026-02-16T00:10Z [----] followers, [--] engagements

"It's not just toy games though. Here's someone scaffolding a complete multi-page website in real-time. HTML CSS vanilla JS. Index dashboard docs [---] pages. Full directory structure with assets icons and mock data. All generated while you watch"
X Link 2026-02-16T01:00Z [----] followers, [--] engagements

"@OfficialLoganK the new one feels way more intentional. love that the prompt bar is front and center now instead of competing with template cards. also the fact that you prototyped this in AI Studio itself and shipped it to prod in [---] hours is lowkey the best ad for the product"
X Link 2026-02-16T03:51Z [----] followers, [--] engagements

"DeepSeek V4 benchmarks just leaked and the numbers are absurd. 83.7% SWE-Bench Verified. 99.4% AIME [----]. 56.2% HLE. If real it would top GPT-5.2 Gemini [---] and Kimi K2.5 across every major benchmark. But some of these numbers are already confirmed fake"
X Link 2026-02-16T06:00Z [----] followers, [---] engagements

"@UnslothAI @Alibaba_Qwen so the MoE architecture is doing serious work here. only 17B active params out of 397B means you're getting frontier level quality at a fraction of the compute. have you tested agentic tool use on the 4-bit quant that's usually where quantization hits hardest"
X Link 2026-02-16T12:24Z [----] followers, [---] engagements

"@bhaidar ngl the [--] years of experience is doing the heavy lifting here. we've found the same thing building with AI. the code writes itself but knowing which architecture decisions to make upfront saves you from rewriting everything on day 10"
X Link 2026-02-17T00:59Z [----] followers, [--] engagements

"tbh the security point is the one nobody wants to talk about. giving an LLM agent access to your filesystem, messaging apps, and shell is just asking for prompt injection to ruin your day. the "soul" and memory stuff is cool in demos but in practice you need actual sandboxing and permission boundaries first https://twitter.com/i/web/status/2023594964295577926"
X Link 2026-02-17T03:06Z [----] followers, [---] engagements

"If you actually found a reliable money printer posting it on X is the fastest way to kill it. More bots watching the same gap = thinner margins = edge gone. Sharing the alpha IS destroying the alpha"
X Link 2026-02-17T06:00Z [----] followers, [--] engagements

"The pattern repeats: impressive screenshot simple explanation convenient link. Meanwhile the builders running actually profitable systems They're running them in silence. What's your take on AI agents in financial markets"
X Link 2026-02-17T06:00Z [----] followers, [--] engagements

"I made $10 million in [--] days with OpenClaw on Kalshi here's how👇👇 If anyone writes a post about how they made easy money with OpenClaw they're either lying or the opportunity no longer http://x.com/i/article/2023712841346547712"
X Link 2026-02-17T11:09Z [----] followers, [--] engagements

"So my friend made a thing 🍌 Got tired of that Gemini watermark in the corner of Nano Banana images so he built a little Chrome extension to remove it. It's called "Peel Banana" 🍌 and he just put it up on the Chrome Web Store. Completely free. Chrome Web Store: https://chromewebstore.google.com/detail/peel-banana/cngdhnfjakplnhplnmlgjalmfcochdgj"
X Link 2026-01-05T23:49Z [----] followers, [----] engagements

"Seedance [--] just dropped and the AI video quality jump is unreal. Batman vs John Wick. In a nightclub. Entirely AI generated. This is what text-to-video looks like in [----] 👀"
X Link 2026-02-14T06:00Z [----] followers, [---] engagements

"Cloudflare just made every website on their network AI-agent ready. One toggle in the dashboard. No code changes. Your HTML gets auto-converted to markdown at the edge"
X Link 2026-02-15T01:00Z [----] followers, [--] engagements

"This is the kind of infrastructure that makes the agentic web actually work. Content negotiation instead of scraping. What other web primitives need an agent-first upgrade? Blog: http://blog.cloudflare.com/markdown-for-agents Docs: http://developers.cloudflare.com/fundamentals/reference/markdown-for-agents/"
X Link 2026-02-15T01:00Z [----] followers, [--] engagements
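
A rough sketch of what "content negotiation instead of scraping" can look like from the agent side. It assumes the agent signals its preference with an `Accept: text/markdown` header and falls back to HTML; the helper name `build_markdown_request` and the `https://example.com/` URL are hypothetical, and the request is only constructed here, not sent.

```python
from urllib.request import Request


def build_markdown_request(url: str) -> Request:
    """Build a GET request that prefers markdown and falls back to HTML.

    Assumption: the server honors the Accept header and returns
    text/markdown when it can; otherwise it serves regular HTML.
    """
    return Request(
        url,
        headers={"Accept": "text/markdown, text/html;q=0.8"},
    )


# Hypothetical URL, used only to show the shape of the request.
req = build_markdown_request("https://example.com/")
print(req.get_method(), req.full_url)
print("Accept:", req.get_header("Accept"))
```

To actually fetch the page, the request would be passed to `urllib.request.urlopen`, and the response's `Content-Type` checked before treating the body as markdown.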

"OpenAI just dropped GPT-5.3-Codex-Spark. 1000+ tokens per second. And it's not running on Nvidia. It's the first OpenAI model deployed on Cerebras silicon"
X Link 2026-02-15T06:00Z [----] followers, [---] engagements

"The hardware story here is big. Cerebras' Wafer-Scale Engine [--] packs [--] trillion transistors with the largest on-chip memory of any AI processor. This is the first milestone of OpenAI's $10B+ multi-year deal with Cerebras announced last month. A real shift away from Nvidia-only infrastructure"
X Link 2026-02-15T06:00Z [----] followers, [--] engagements

"we started writing CLAUDE.md files that document every architectural decision and pattern. when the AI generates code against those rules you at least know the 'why' behind each choice even if you didn't write it line by line. doesn't fully solve it but it turns cognitive debt into something more searchable"
X Link 2026-02-15T08:06Z [----] followers, [---] engagements

"The target hardware is wild. A $9.99 LicheeRV-Nano for home automation. A $30 NanoKVM for server maintenance. A $50 MaixCAM for smart monitoring with person detection. Running a capable AI agent on hardware cheaper than a lunch"
X Link 2026-02-15T21:00Z [----] followers, [--] engagements

"The dev process is interesting too. 95% of the core was agent-generated with human-in-the-loop refinement. They used AI to build an AI assistant optimized for hardware that most devs would never consider targeting"
X Link 2026-02-15T21:00Z [----] followers, [--] engagements

"@zarazhangrui yeah this is exactly how I learned. started building a full stack app with Claude Code having zero backend experience and honestly picked up more in [--] weeks than I did reading docs for months. the key is having a real project you actually care about not a tutorial todo app"
X Link 2026-02-16T02:45Z [----] followers, [---] engagements

"The FrontierMath score is already debunked. Jaime Sevilla from Epoch AI (the team that created FrontierMath) says they haven't evaluated DeepSeek V4. Only their team and OpenAI have access to that dataset. So someone fabricated at least that number. Which raises the question: what else is made up? https://twitter.com/i/web/status/2023276170398335469"
X Link 2026-02-16T06:00Z [----] followers, [---] engagements

"This is part of a bigger pattern. MiniMax [---] matches Opus [---] on paper but falls apart in real-world use. Benchmarks are becoming marketing decks, not performance indicators. What actually matters: inference cost, latency, how it handles messy 50K-line codebases, context utilization at scale. Numbers on a chart tell you almost nothing about that. https://twitter.com/i/web/status/2023276175171387782"
X Link 2026-02-16T06:00Z [----] followers, [--] engagements

"@cline oh parallel agents in the terminal is exactly what I've been wanting. been running everything through a single agent and the context switching overhead was killing me"
X Link 2026-02-16T06:04Z [----] followers, [--] engagements

"@tszzl so true. the "dream bigger" part is what gets me. used to spend half my time on boilerplate and infra setup now I jump straight to the interesting architecture decisions. feels like the ceiling on what a single engineer can ship just keeps going up"
X Link 2026-02-16T08:12Z [----] followers, [---] engagements

"yeah the biggest difference is just the UI layer tbh. claude code with a good CLAUDE.md and custom slash commands already does everything openclaw does, you just need to know how to set it up. the people hyped about openclaw are mostly discovering agent capabilities for the first time"
X Link 2026-02-16T11:20Z [----] followers, [---] engagements

"Seedance [---] launched [--] days ago and creators are already producing near-theatrical quality. 15-second clips stitched into full scenes. Custom characters, controlled lighting, consistent style across every frame. That last part was impossible [--] months ago"
X Link 2026-02-16T21:00Z [----] followers, [---] engagements

"SOON: Seedance [---] will be able to create a full-length feature film from a single prompt rather than 5- to 20-second clips"
X Link 2026-02-16T22:02Z [----] followers, [---] engagements

"ZeroClaw just hit 5K stars in [--] days. A fully autonomous AI assistant in Rust. 3.4MB binary. Under 10ms startup. Under 5MB RAM. For context the TypeScript equivalent needs 1GB RAM and takes 500+ seconds to start"
X Link 2026-02-17T01:00Z [----] followers, [---] engagements

"@Ross__Hendricks tbh the horror stories are already here but they're from people who skipped the "engineering" part. vibe coding with zero architecture sense = disaster. vibe coding where you define specs, review diffs, and understand what shipped? that's just... faster engineering"
X Link 2026-02-17T04:11Z [----] followers, [--] engagements

"It's a full agent framework too. Not just a toy. Multi-LLM support (OpenRouter, Anthropic, OpenAI, Gemini), web search, persistent memory, sandboxed execution, and tool use with up to [--] iterations per task. Ships with Telegram, Discord, QQ, DingTalk, and LINE integrations out of the box. https://twitter.com/i/web/status/2023140284725674146"
X Link 2026-02-15T21:00Z [----] followers, [--] engagements

"@aakashgupta the "open-source the model sell the GPU" playbook is so smart it's almost unfair. we've been running voice pipelines with the ASR LLM TTS stack and the latency between each seam is where conversations feel robotic. 170ms turn-taking in a single model changes everything tbh"
X Link 2026-02-15T21:27Z [----] followers, [----] engagements

"appreciate the callout. and the community note on the original post caught another issue. 99.4% on AIME isn't even a valid score since the max is 119/120 (99.2%) or 120/120 (100%). so that's at least two fabricated benchmarks. anyone still sharing these numbers uncritically is doing the timeline a disservice. https://twitter.com/i/web/status/2023181322135720426"
X Link 2026-02-15T23:43Z [----] followers, [---] engagements
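
The arithmetic in the post above can be checked with a short sketch. It assumes, as the post does, that the score is a whole number of points out of 120 reported to one decimal place; the function name `reachable_percentages` is made up for illustration, and an average over several runs could in principle behave differently.

```python
def reachable_percentages(total: int, digits: int = 1) -> set[float]:
    """Every percentage obtainable as k/total for whole k, rounded to `digits`."""
    return {round(100 * k / total, digits) for k in range(total + 1)}


# Denominator of 120 is taken from the post, not from any official source.
scores = reachable_percentages(120)

print(99.4 in scores)   # False: no whole score out of 120 rounds to 99.4
print(99.2 in scores)   # True: 119/120
print(100.0 in scores)  # True: 120/120
```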
