[GUEST ACCESS MODE: Data is scrambled or limited to provide examples. Make requests using your API key to unlock full data. Check https://lunarcrush.ai/auth for authentication information.]

# @rohanpaul_ai (Rohan Paul)

Rohan Paul posts on X about open ai, llm, meta, and token the most. They currently have XXXXXX followers and XXX posts still getting attention, totaling XXXXXX engagements in the last XX hours.

### Engagements: XXXXXX [#](/creator/twitter::2588345408/interactions)

- X Week XXXXXXXXX +183%
- X Month XXXXXXXXXX +140%
- X Months XXXXXXXXXX +137%
- X Year XXXXXXXXXX +793%

### Mentions: XXX [#](/creator/twitter::2588345408/posts_active)

- X Week XXX +8.50%
- X Month XXX +158%
- X Months XXXXX +27%
- X Year XXXXX +273%

### Followers: XXXXXX [#](/creator/twitter::2588345408/followers)

- X Week XXXXXX +4.90%
- X Month XXXXXX +9.20%
- X Months XXXXXX +34%
- X Year XXXXXX +223%

### CreatorRank: XXXXXXX [#](/creator/twitter::2588345408/influencer_rank)

### Social Influence [#](/creator/twitter::2588345408/influence)

---

**Social category influence**

[technology brands](/list/technology-brands) XXXXX%, [finance](/list/finance) XXXX%, [stocks](/list/stocks) XXXX%, [celebrities](/list/celebrities) XXXX%, [countries](/list/countries) XXXX%, [vc firms](/list/vc-firms) XXX%, [travel destinations](/list/travel-destinations) XXX%, [social networks](/list/social-networks) XXX%, [currencies](/list/currencies) XXX%, [automotive brands](/list/automotive-brands) XXX%

**Social topic influence**

[open ai](/topic/open-ai) #231, [llm](/topic/llm) #27, [meta](/topic/meta) 2.68%, [token](/topic/token) #412, [microsoft](/topic/microsoft) #162, [mark zuckerberg](/topic/mark-zuckerberg) 1.79%, [$googl](/topic/$googl) 1.79%, [context engineering](/topic/context-engineering) #6, [puzzles](/topic/puzzles) 1.79%, [xai](/topic/xai) XXXX%

**Top accounts mentioned or mentioned by**

[@opus_genesis](/creator/undefined), [@grok](/creator/undefined), [@stonkyoloer](/creator/undefined),
[@xai](/creator/undefined), [@huggingface](/creator/undefined), [@baiduinc](/creator/undefined), [@openai](/creator/undefined), [@windsurfai](/creator/undefined), [@nomadkreativ](/creator/undefined), [@justinechoes](/creator/undefined), [@googledeepmind](/creator/undefined), [@prashant_1722](/creator/undefined), [@the100kprompts](/creator/undefined), [@elonmusk](/creator/undefined), [@arthurkilber](/creator/undefined), [@teksedge](/creator/undefined), [@thewebai](/creator/undefined), [@viragconsulting](/creator/undefined), [@apple](/creator/undefined), [@mohansolo](/creator/undefined)

**Top assets mentioned**

[Microsoft Corp. (MSFT)](/topic/microsoft), [Alphabet Inc Class A (GOOGL)](/topic/$googl), [Goldman Sachs (GS)](/topic/goldman-sachs), [Apollo Global Management, Inc. (APO)](/topic/$apo), [ServiceNow Inc (NOW)](/topic/servicenow)

### Top Social Posts [#](/creator/twitter::2588345408/posts)

---

Top posts by engagements in the last XX hours

"@RossionQ yes, most prop-trading firms and i-banking trading desks have their own powerful prediction software. And now the most powerful models will be fine-tuned on their huge proprietary data, and the race will be over who has the most custom fine-tuned model"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944511729603682496) 2025-07-13 21:38:18 UTC 73.6K followers, 20.3K engagements

"This is really BAD news for LLMs' coding skill. The best frontier LLM models achieve X% on hard real-life programming-contest problems, domains where expert humans still excel. LiveCodeBench Pro is a benchmark composed of problems from Codeforces, ICPC, and IOI (International Olympiad in Informatics) that are continuously updated to reduce the likelihood of data contamination"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1934751145400111572) 2025-06-16 23:13:13 UTC 73.6K followers, 459.8K engagements

"Brilliant paper for optimizing your prompt-design.
Keep crucial rules early in your prompt, break huge lists into chunks, and expect misses past XXX no matter how fancy the engine. This paper checks what happens when the rules or instruction list reaches XXX. IFScale, the benchmark, asks a model to write a business report while slipping in up to XXX exact keywords. Because scoring is plain keyword matching, the team charts accuracy for XX models from X vendors. Results show three decay shapes: reasoning models like o3 stay near XXX% until about XXX rules, then drop fast; gpt4.1 drifts down in a"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945790079290798453) 2025-07-17 10:18:00 UTC 73.6K followers, 23K engagements

"Today's edition of my newsletter just went out. Consider subscribing, it's free and I publish daily with the top X% of AI developments. In today's edition (14-July-2025): Mark Zuckerberg says Meta is building a 5GW AI data center; @xai will spin Grok into hundreds of task-focused agents that talk to each other; @cognition_labs is taking the remaining Windsurf team and tech days after Google bought its founders for $2.4B; Byte-Size Briefs: the Pentagon picked Google, OpenAI, xAI and Anthropic for new defense deals, each agreement carrying a spending limit of $XXX million; Deep Dive:"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944907341637542278) 2025-07-14 23:50:19 UTC 73.5K followers, 14.4K engagements

"@ArthurKilber yes, long prompts generally work better with o3/o3 pro"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1941222172468576298) 2025-07-04 19:46:46 UTC 73.6K followers, XXX engagements

"DeepSeek R1 running locally - Full setup guide"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1883304599932858665) 2025-01-26 00:03:01 UTC 73.6K followers, 1.4M engagements

"Optimism is a low-cost gradient-ascent hack. Creative throughput is proportional to expected reward.
Defend it, keep expectations green, profit from higher gradient steps"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946309235673022780) 2025-07-18 20:40:57 UTC 73.6K followers, 2019 engagements

"NY Times wins right to see ChatGPT logs in legal fight with OpenAI. NYT can even search deleted ChatGPT logs, exposing up to 2B private chats and testing OpenAI's privacy safeguards. Judge Sidney Stein rejected OpenAI's plea to keep standard deletion policies. Magistrate Ona Wang's preservation order forces the company to store every non-enterprise chat indefinitely while it negotiates keyword scopes with NYT, Daily News and CIR. Only small anonymized slices will stay on OpenAI servers, yet they still expose prompts, outputs and timestamps. So billions of medical, job and relationship details"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1941170903033041314) 2025-07-04 16:23:03 UTC 73.6K followers, 3188 engagements

"Today's edition of my newsletter just went out. Consider subscribing, it's free and I publish daily with the top X% of AI developments. In today's edition (18-July-2025): Humans vs AI at the AtCoder World Tour Finals, @OpenAI beats all but one human; a new video model lets you take any video stream and set it in any alternative universe of your choosing; ConstBERT from Pinecone cuts multivector index size by about XX% yet keeps top-tier ranking; OPINION: Human Money vs Machine Money: The Coming Split, and Sam Altman's view"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946326003166609562) 2025-07-18 21:47:34 UTC 73.6K followers, 12.3K engagements

"Microsoft study reveals which jobs AI is actually impacting, based on 200K real conversations. The largest study of its kind, analyzing 200,000 real conversations. Key finding: a big chunk of knowledge and people jobs now overlaps with what today's AI models do well.
Most AI-impacted jobs: Interpreters and Translators top the chart, with XX% of their core activities turning up in chats and showing decent completion and scope. Customer Service Representatives, Sales Reps, Writers, Technical Writers and Data Scientists each land an applicability score around 0.40-0.49, meaning roughly"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1943843777099313428) 2025-07-12 01:24:05 UTC 73.6K followers, 102.1K engagements

"Google dropped its very first Gemini Embedding text model; it tops the MTEB Multilingual leaderboard. - generally available in the Gemini API and Vertex AI - has consistently ranked #1 on the MTEB Multilingual leaderboard since its experimental launch in March - supports over XXX languages - has a 2048 maximum input-token length - priced at $XXXX per 1M input tokens - allows developers to scale the output dimensions down from the default 3072"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945604974588956895) 2025-07-16 22:02:28 UTC 73.6K followers, 2897 engagements

"Bug fixed for Grok X now. The changed system-prompt pull request from GitHub"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945042426596413796) 2025-07-15 08:47:06 UTC 73.5K followers, 3027 engagements

"Wild idea in this paper. How might we store knowledge affordably yet comprehensively? Memory3 proposes an intriguing method: compressing factual data separately. It introduces a third form of memory, in addition to the implicit knowledge stored in model parameters and the short-term working memory used during inference (context key-values). LLMs struggle with inefficient knowledge storage and retrieval, leading to high training and inference costs. The paper aims to address this by introducing a more efficient memory format.
Memory3 introduces explicit memory as a third memory format for"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1809782336021537094) 2024-07-07 02:51:47 UTC 73.6K followers, 154.1K engagements

"Now the 3rd paper comes on this: 'The Illusion of the Illusion of the Illusion of Thinking'. The 1st, original paper from Apple concludes that large reasoning models reach a complexity point where accuracy collapses to zero, and they even spend fewer thinking tokens, revealing hard limits on generalizable reasoning. The 2nd paper counters that the apparent collapse is an illusion caused by token limits and impossible puzzles, so the models' reasoning remains sound when evaluations remove those flaws. The 3rd paper synthesizes both sides, agreeing the collapse was an artifact yet stressing that models still"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1935746720144544157) 2025-06-19 17:09:17 UTC 73.6K followers, 251.7K engagements

"@cluely doubles ARR to $7M in X days after launching. Early 'cheat on everything' branding softened once Andreessen Horowitz, Abstract Ventures and Susa Ventures backed the startup. techcrunch .com/2025/07/03/cluelys-arr-doubled-in-a-week-to-7m-founder-roy-lee-says-but-rivals-are-coming"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1940936884009509342) 2025-07-04 00:53:08 UTC 73.6K followers, 2293 engagements

"HUGE BREAKTHROUGH. A hair-thin silicon chip can now push data at 1000 Gbps while sipping only X joules. That moves 100M books in roughly X minutes. What does it mean practically? Data-center switches now stretch processors across long aisles, wasting energy and space. In a big AI data center that means racks can sit closer, cables shrink, cooling loads drop and energy bills fall. Traditional copper links max out near XX Gbps, so a single cable cannot carry the huge flood of data.
Every time traffic exceeds what X cable moves, engineers stack more identical cables in parallel, then drop a switch"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944189756692476225) 2025-07-13 00:18:53 UTC 73.6K followers, 14.8K engagements

"Attention Suppression: Method X masks all attention going to one sentence and watches how later logits drift. A strong drift signals a direct causal link. The suppression scores correlate with the resampling scores, backing up the claim that the three methods converge on the same anchors"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1938379204300620211) 2025-06-26 23:29:50 UTC 73.6K followers, XXX engagements

"RAG boosts LLM memory yet misses multi-step logic, while raw reasoning invents facts. This survey explains fresh designs that let the two prop each other up. It first shows reasoning can fix retrieval by rewriting queries, planning hops and filtering noisy passages. Next, retrieval fills the knowledge gaps inside long reasoning chains, bringing the proofs, code or web snippets a model actually needs. The highlight is a loop where an agent thinks, searches, checks and thinks again until the answer is solid. Chains, trees and graph walks guide this loop, and solo or team agents run it, cutting"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946065140979511299) 2025-07-18 04:31:00 UTC 73.6K followers, 3441 engagements

"2/n But how exactly can a model be split and still generate a response to my question? A transformer is just a long stack of math layers packed into weight matrices that live in RAM, so webFrame starts by slicing the full checkpoint into several shards on disk. Each shard holds only the slice that a given computer will need, following the same tensor-parallel idea first popularized in Megatron-LM. When the cluster boots, every Mac loads just its slice, which keeps memory use under control.
Once a prompt arrives, the layer-by-layer forward pass still happens in the usual order, but matrix"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946243293391794472) 2025-07-18 16:18:55 UTC 73.6K followers, 1020 engagements

"OpenAI achieved IMO gold with an experimental reasoning model; they will also be releasing GPT-5 soon. @OpenAI's newest reasoning model solved X of the X problems on the 2025 International Math Olympiad under the same 2-day, 4.5-hour-per-session rules that human contestants face. The model is not an IMO specialist; it is a general LLM that uses fresh verification tricks and much longer thinking time, letting it tackle proofs that used to stall machines. The gap between THE MOST BRILLIANT HUMAN students and AI on SUPER hard mathematics challenges has finally closed. Olympiad problems demand"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946594919554146400) 2025-07-19 15:36:09 UTC 73.6K followers, 4908 engagements

"@TheRealOdram no, don't do that. If you are in software engineering, o3/o3-pro is the absolute best you can get right now. I am a real fan"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1940890355731517743) 2025-07-03 21:48:15 UTC 73.6K followers, XX engagements

"It had to happen. After all, these are financed by a humongous amount of money. Meta's top lab members, including Alexandr Wang, mulled dropping Behemoth, the company's premier open model, for a closed version. 'We're obviously very pro open source, but I haven't committed to releasing every single thing that we do.' - Mark Zuckerberg --- nytimes. com/2025/07/14/technology/meta-superintelligence-lab-ai.html"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944875435369881790) 2025-07-14 21:43:32 UTC 73.6K followers, 1955 engagements

"This is a really cool open-source project from @firecrawl_dev. Turn a simple list of emails into a rich dataset with company profiles, funding data, tech stacks and more.
It chains small specialized agents, feeds them shared context and lets them stitch the answers together. Behind the scenes, each agent is a specialized module with its own expertise, search strategies and type-safe output schema. Orchestrated by @Grok X and powered by @firecrawl_dev"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1943382406809149922) 2025-07-10 18:50:46 UTC 73.6K followers, 4488 engagements

"These guys literally burned the transformer architecture into their silicon. And built the fastest chip in the world for the transformer architecture: 500,000 tokens per second of Llama 70B throughput. The world's first specialized chip (ASIC) for transformers: Sohu. One 8x Sohu server replaces XXX H100 GPUs. And they raised $120mn to build it. The Big Bet: @Etched froze the transformer recipe into silicon. Burning the transformer architecture into the chip means it can't run many traditional AI models, like CNNs, RNNs or LSTMs; it also cannot run the DLRMs powering Instagram ads"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1938655279173792025) 2025-06-27 17:46:51 UTC 73.6K followers, 710.3K engagements

"Fullstack Engineer - Waifus. Annual salary $440,000 USD, from a real job board. 'What You'll Do: Make Grok's realtime avatar products fast, scalable and reliable. Help push forward audio and gameplay research and deploy breakthrough innovations to millions of users. Obsess over every millisecond and byte, ensuring end-to-end quality and performance at scale across a rich suite of products and user platforms.' --- job-boards .greenhouse. io/xai/jobs/4789505007"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945763403068518721) 2025-07-17 08:32:00 UTC 73.6K followers, 2601 engagements

"Receiver Heads: Method X looks inside the model. Some attention heads in late layers, called receiver heads, pour unusually high attention into a few earlier broadcasting sentences.
Sentences that soak up this focused attention are again mostly planning, questioning or checking lines, not raw arithmetic"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1938379199904960710) 2025-06-26 23:29:49 UTC 73.6K followers, XXX engagements

"GitHub: Prompt engineering received all the attention, but we can now get excited for what comes next. Once you've mastered prompts, the real power comes from engineering the entire context window that surrounds those prompts. Guiding thought, if you will"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946422703050871238) 2025-07-19 04:11:49 UTC 73.6K followers, 4640 engagements

"Risk-aware financial forecasting models with LLMs. Here the researchers design an adaptive Sharpe-ratio loss inside a Temporal Fusion Transformer. When tested on equities, crypto and commodities, the model lifts both prediction accuracy and realised portfolio Sharpe against standard TFT and LSTM baselines. --- researchgate. net/publication/389877674_An_Adaptive_Sharpe_Ratio-Based_Temporal_Fusion_Transformer_for_Financial_Forecasting"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944554988044423189) 2025-07-14 00:30:11 UTC 73.6K followers, 5546 engagements

"If Apple buys Perplexity, that would be its biggest-ever acquisition"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1937972981059440991) 2025-06-25 20:35:39 UTC 73.6K followers, 1.1M engagements

"Absolutely deluxe GitHub repository: lots of code-first tutorials covering every layer of production-grade GenAI agents, by @NirDiamantAI"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945750323404185655) 2025-07-17 07:40:02 UTC 73.6K followers, 3988 engagements

"Nvidia and AMD can once again ship their trimmed H20 and MI308 AI chips to China because the Trump team figures that selling older gear slows Huawei more than it strengthens Beijing labs. AI czar David Sacks calls the H20 a deprecated chip.
By letting it flow, Washington hopes global buyers stay tied to an American-made stack of chips, software and models. Washington figures that if Nvidia stays blocked, Chinese buyers will fill their racks with Huawei's home-grown Ascend chips, hand Huawei massive production scale and let it polish those designs until they threaten Nvidia everywhere else. So by"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945387895747174573) 2025-07-16 07:39:52 UTC 73.6K followers, 3498 engagements

"A brilliant example of Grok X Heavy. Swap manual bug hunts for Grok's sweep. Take a look at the 21,174-character-long prompt."
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944538141920239983) 2025-07-13 23:23:15 UTC 73.5K followers, 7334 engagements

"This story is going wildly viral on Reddit. ChatGPT flagged a hidden gene defect that doctors missed for a decade. ChatGPT ingested the patient's MRI, CT, broad lab panels and years of unexplained symptoms. It noticed that normal serum B12 clashed with nerve pain and fatigue, hinting at a methylation block. Within months, tingling eased and brain fog cleared. The primary physician reviewed the genetics report and agreed the variant unified the entire case. IMO the time has already come: taking a 2nd opinion from the best healthcare-AI model should be made part of the medical code of practice. --- reddit."
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1941321376838951320) 2025-07-05 02:20:58 UTC 73.6K followers, 1.4M engagements

"From text to trade: harnessing the potential of generative AI for investor sentiment analysis in financial markets. This study describes a production-grade workflow that converts multilingual social-media streams into tradeable sentiment factors by means of a fine-tuned generative model.
Over a 24-month back-test the factor delivers XXX% annualised excess return after transaction costs on a long-short equity book, reinforcing the edge that rapid unstructured-text digestion can create. --- researchgate."
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944541472537334230) 2025-07-13 23:36:29 UTC 73.6K followers, 7364 engagements

"Most benchmarks still grade an LLM on a single file, so they miss the messy reality where whole repos change after every test run. This paper closes that gap by introducing LiveRepoReflection: 1,888 tough tasks across X languages that make a model read, edit and retest mini repositories. The team builds each task with an automated pipeline that scrapes fresh code, writes several unit-test suites, cross-executes them in a sandbox and drops anything the strongest models pass too easily. They also craft RepoReflectionInstruct, 8,702 vetted repos plus 840,839 multi-turn dialogues, then fine-tune a 32B"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946411278509179209) 2025-07-19 03:26:26 UTC 73.6K followers, 1642 engagements

"3/n. The picture lays out a search pipeline that trims a huge document pool to a tiny list that an LLM can read. A sparse model and a dense embedding model each grab about 1000 likely matches from a corpus that holds 10M-100M records. Their two hit lists are blended, then a multi-vector model checks finer details and keeps the best XXX. A heavier cross-encoder reranker scores those XXX pairs in depth and sends only XX winners forward. This step-by-step filter saves compute and storage yet still feeds the LLM documents picked with richer signals than a single wide scan could manage"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946296603624747042) 2025-07-18 19:50:45 UTC 73.6K followers, XXX engagements

"The 10,000-Year Clock: Jeff Bezos' $XX million timepiece built inside a mountain he owns. The century hand moves every XXX years. A cuckoo emerges every 1000 years.
Because he wants a concrete symbol that can stretch human attention beyond quarterly results. Computer scientist Danny Hillis proposed a XX-millennia clock in 1989 to provoke society to take very long views of history and the future. Jeff Bezos is funding the first full-scale version on his Sierra Diablo land so the monument can stand as a daily reminder that today's choices echo far beyond any single lifetime"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945049524747059265) 2025-07-15 09:15:18 UTC 73.6K followers, 5594 engagements

"Thinking Machines, led by former OpenAI CTO Mira Murati, raises $2B in seed funding at a valuation of $XX billion. Andreessen Horowitz wrote the biggest check, joined by Nvidia, Accel, ServiceNow, Cisco, AMD and Jane Street. Investor appetite for fresh AI outfits is strong even while some people wonder about overall tech spending. Because of that, U.S. startups raised about $XXXXX billion in the first half of 2025, a jump of nearly XX%, and AI deals took roughly XXXX% of the total, as per PitchBook"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945337439813558487) 2025-07-16 04:19:22 UTC 73.6K followers, 4895 engagements

"This stunning proof by an MIT computer scientist is the first progress in XX years on one of the most famous questions in computer science: space complexity vs time complexity. The new idea proves that any algorithm that runs in T steps can be re-engineered to use about T memory cells, establishing that memory (RAM) is a much stronger resource than earlier theory allowed. A computer spends time (i.e. time complexity) running steps and spends memory (i.e. space complexity) holding data. Memory is the list of numbered slots inside RAM where a program keeps facts it will soon need again. Space"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944603034165838143) 2025-07-14 03:41:06 UTC 73.6K followers, 98.2K engagements

"Microsoft layoffs hit legal department as AI reshapes staffing strategy.
The legal profession is mostly about language, so it has to feel the full pressure of AI. Microsoft has cut 15,000 jobs since May, redirecting cash toward AI infrastructure. Leaders faced a blunt trade-off: slow hardware spending or cut payroll. The company says Copilot already saved $500M in call-center costs last year. Inside Xbox, canceled titles like Everwild and Perfect Dark illustrate the shift. Teams were whittled down until only a skeleton crew could keep existing games online. Cloud sales lost account managers just as"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946425893762806067) 2025-07-19 04:24:30 UTC 73.6K followers, 6679 engagements

"Investors are lining up to fund Anthropic above $100B. Claude's revenue sprint from $3B to $4B annualized in X month explains the eagerness. That tag more than doubles the $61.5B valuation Anthropic set when it took $3.5B in February. Venture firms now race to pre-commit cash before rivals lock up the allocation. Amazon and Alphabet already own sizeable stakes and supply cloud credits, keeping compute costs under control. A fresh round mostly widens the buffer of GPUs --- bloomberg. com/news/articles/2025-07-16/anthropic-draws-investor-interest-at-more-than-100-billion-valuation"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945666391069356484) 2025-07-17 02:06:31 UTC 73.6K followers, 1915 engagements

"Beautiful survey paper on Context Engineering, covering 1,400 research papers. XXX pages of comprehensive taxonomy decomposing Context Engineering into its foundational components and sophisticated implementations. LLMs stumble when the prompt is messy, so this survey maps every tool for cleaning, stretching and storing context. The authors show that smart context handling, not just bigger models, drives more accurate and reliable answers. Why define context engineering at all? Today, prompt tricks, retrieval add-ons, long-attention tweaks and memory hacks grow in separate silos.
That split hides how"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946349730621202762) 2025-07-18 23:21:51 UTC 73.6K followers, 2771 engagements

"This GitHub repo is a goldmine. 3.4K stars in X days. End-to-end, code-first tutorials covering every layer of production-grade GenAI agents, guiding you from spark to scale with proven patterns and reusable blueprints for real-world launches"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1936216326407962869) 2025-06-21 00:15:20 UTC 73.6K followers, 355.9K engagements

"These stories continue about how AI (ChatGPT in this case) is helping people get a second opinion on medical problems. The person endured XX years of fatigue, numbness and back pain after 5-6h of sleep but felt fine with 8h. ChatGPT figured it was because of a vitamin D deficiency. --- reddit. com/r/OpenAI/comments/1lytfiw/after_11_years_chatgpt_helped_me_solve_chronic/"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944616813436051878) 2025-07-14 04:35:52 UTC 73.6K followers, 9584 engagements

"INCREDIBLE. China just released a 1tn-param top open-source model for coding and agentic tool work: Kimi K2 from Moonshot AI. Insane numbers on benchmarks. On LiveCodeBench the model hits XXXX Pass@1, beating DeepSeekV3 by almost X points and clearing Qwen235B by more than XX points. It scores XXXX% on single-shot SWE-bench agentic coding and XXXX on Tau2 retail tool use, numbers that sit at or near the top of the open stack. - X tn total parameters, MoE, 32Bn active - Trained with the Muon optimizer - Very strong across frontier knowledge, reasoning and coding tasks - SOTA on SWE-bench Verified, Tau2 &"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1943720128820261328) 2025-07-11 17:12:45 UTC 73.6K followers, 33.8K engagements

"OpenPipe Mixture of Agents: outperform GPT-4 at 1/25th the cost. This Mixture of Agents model is optimized for generating synthetic training data.
Using a Mixture of Agents (MoA) architecture, the model achieved SOTA results on both LMSYS's Arena Hard Auto (score: 84.8) and AlpacaEval XXX (LC score: 68.4). They've also benchmarked the MoA approach against GPT-4 variants on real-world OpenPipe customer tasks and found completions from the MoA model were preferred over GPT-4 XXXX% of the time (Claude X Opus as judge)"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1805685936556052649) 2024-06-25 19:34:09 UTC 73.6K followers, 32K engagements

"Today's edition of my newsletter just went out. Consider subscribing, it's free and I publish daily with the top X% of AI developments. In today's edition (16-July-2025): Landmark research from Google DeepMind achieves 2X faster inference and XX% reduced KV-cache memory; Mark Zuckerberg says AI researchers want X things apart from money; Mira Murati's Thinking Machines Lab is worth $12B in its seed round; Google just dropped its first Gemini Embedding text model, which tops the MTEB Multilingual leaderboard; Artificial Analysis released the AI Adoption Survey Report for H1 2025; Top Resource:"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945617163412464080) 2025-07-16 22:50:54 UTC 73.6K followers, 13.2K engagements

"Meta's free-to-use Llama family was a strategic bridge, and now expect to pay for it. The open-sourcing approach also helped Meta recruit elite researchers. Meta's recent move to hire Scale AI co-founder Alexandr Wang, along with reported signing bonuses up to $XXX million, signals a shift toward commercial products that justify such costs. Wall Street will expect paid APIs, enterprise subscriptions or in-product advertising tied to future closed models. --- bloomberg.
com/opinion/articles/2025-07-14/mark-zuckerberg-and-meta-are-unlikely-to-keep-giving-away-ai-for-free"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944715937145372838) 2025-07-14 11:09:45 UTC 73.6K followers, 4439 engagements

"With this research, a 14B-parameter model holds XX% accuracy even on inputs that balloon to 3.5M tokens, all while costing only O(N) in compute. LLMs usually freeze or slow down as soon as a prompt spills past their context window. MemAgent turns that long prompt into bite-sized chunks, keeps a tiny rolling summary and still nails the answer. The authors bolt a tiny fixed-size memory right inside that window, teach the model with reinforcement learning to overwrite that memory after every slice, and keep the rest of the architecture untouched. Because the memory never grows, compute scales in a"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1942798442897756297) 2025-07-09 04:10:18 UTC 73.6K followers, 28.7K engagements

"'Artificial intelligence is going to replace literally half of all white-collar workers in the U.S.' - Ford Motor Chief Executive Jim Farley"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1941746434211983660) 2025-07-06 06:30:00 UTC 73.6K followers, 4221 engagements

"Yann LeCun on architectures that could lead to AGI: 'Abandon generative models in favor of joint-embedding architectures. Abandon probabilistic models in favor of energy-based models. Abandon contrastive methods in favor of regularized methods. Abandon reinforcement learning in favor of model-predictive control. Use RL only when planning doesn't yield the predicted outcome, to adjust the world model or the critic. IF YOU ARE INTERESTED IN HUMAN-LEVEL AI, DON'T WORK ON LLMS.' --- From the 'IP Paris' YT channel (link in comment)"
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945167783865532641) 2025-07-15 17:05:13 UTC 73.6K followers, 188.2K engagements

"Today's edition of my newsletter just went out.
Consider subscribing, it's free and I publish daily with the top X% of AI developments. In today's Edition (15-July-2025): xAI says it has fixed Grok 4's problematic responses. LG Unveils Korea's First Open-weight Hybrid AI 'EXAONE 4.0'. Kimi K2 is the new Short-Story Creative Writing champion. Byte-Size Briefs: NVIDIA is filing applications to sell the NVIDIA H20 GPU again. An ex-OpenAI engineer shares his thoughts about the organization"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945289361349882147) 2025-07-16 01:08:20 UTC 73.5K followers, 14.3K engagements "Functime is quite cool - it's a forecasting library for time-series machine learning and embeddings at scale - production-ready forecasting and temporal embeddings. - time-series preprocessing (Box-Cox, differencing etc), cross-validation splitters (expanding and sliding window) and forecast metrics (MASE, SMAPE etc). All optimized as lazy Polars transforms ------- Temporal embeddings measure the relatedness of time-series. Embeddings are more accurate and efficient compared to statistical methods (e.g. Catch22) for characterizing time-series. Embeddings have applications across many"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1814710332545257817) 2024-07-20 17:13:53 UTC 73.5K followers, 4059 engagements "Breakthrough in Alzheimer's disease. Texas A&M's team built flower-shaped molybdenum particles that slide into brain cells, slash harmful oxidative stress and add X extra days to worm lives. They cut reactive oxygen by almost XX% and pushed mitochondrial survival close to 99%. The work hints at drugs that tackle Parkinson's or Alzheimer's by fixing the cells' power plants, not just masking symptoms.
--- interestingengineering.com/health/brain-healing-nanoflowers-treatment"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946177632548131022) 2025-07-18 11:58:00 UTC 73.6K followers, 1709 engagements "YC's Hidden Formula: XXX Users $100/Month $10k MRR The Startup Playbook"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1943909101697949982) 2025-07-12 05:43:40 UTC 73.6K followers, 4493 engagements "Apple will seriously consider acquiring French startup Mistral AI, as per Bloomberg. What makes Mistral attractive: Mistral was founded in 2023 by former Meta and Google researchers. It has raised a little over $XXX B and is valued at about XXX B. A fresh round of up to $X B led by Abu Dhabi-backed fund MGX is being negotiated now, which could push the price higher. Microsoft paid XX M in 2024 for a minority stake and secured first-run access to Mistral-Large on Azure. Mistral's open-weight Mixtral models and its Le Chat consumer bot give Apple a ready-made foundation-model stack that is"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944708372701368642) 2025-07-14 10:39:41 UTC 73.6K followers, 2554 engagements "Academic spin-offs like Satori add their own autoregressive search loop on top of chain-of-thought, then show the same framework solving physics proofs and formal logic puzzles, illustrating how the break-check-recycle loop ports to fresh fields. satori-reasoning.github.io/blog/satori/"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946595628013998272) 2025-07-19 15:38:58 UTC 73.6K followers, XXX engagements "ChatGPT's new Agent. Got similar experience - great for non-time-sensitive research. But presentation aesthetics still need to improve. - connecting to 3rd party apps not smooth.
overall Manus, Genspark and Comet will give them very tough competition"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946087927647502425) 2025-07-18 06:01:33 UTC 73.6K followers, 4316 engagements "Meta scores two more high-profile OpenAI researchers. OpenAI's reinforcement-learning specialist Jason Wei along with chain-of-thought partner Hyung Won Chung are switching to Meta's brand-new superintelligence lab. Meta is dangling packages of up to $300M across X years. The churn proves one thing: whoever nails stable long-context reasoning plus tight reward signals will set tomorrow's benchmark. --- wired.com/story/jason-wei-open-ai-meta/"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945526944097079323) 2025-07-16 16:52:24 UTC 73.6K followers, 3083 engagements "Fine-tuning big models often uses LoRA adapters to cut memory and supposedly time. Paper reports LoRA can train slower because every adapter spawns extra GPU kernels waiting in line. Benchmarks on GPT2 and LLaMA2 show forward plus backward can stretch XX% over full tuning. LoRA cuts parameters with rank-r matrices, yet those added multiplies break GPU parallelism. Study switches to Partial Connection Adaptation, a mask that tweaks chosen weight columns, no new layers. It fine-tunes only the top XX% of layers, leaving the lower stack frozen. Mask lives inside weights so each layer fires one kernel and"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945067567770472773) 2025-07-15 10:27:00 UTC 73.6K followers, 2962 engagements "@Tony_Omega theirs is just much more customized, with stats and a humongous amount of data feeding into large multi-million-dollar software"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944514662441103610) 2025-07-13 21:49:57 UTC 73.6K followers, 15.8K engagements "this ChatGPT prompt went so wildly viral on Reddit. The creator claims to have created this after struggling through XXX failed attempts.
basically the prompt flips the usual flow by making the model interview the user first, asking a few targeted questions about purpose, audience, constraints and context. Because the answers feed back into the final request it appears to generate more tailored outputs. (However imo asking ChatGPT to request missing information was already a common practice.) Here's the entire prompt: -------- You are Lyra a master-level AI prompt optimization specialist. Your"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1940483944102703307) 2025-07-02 18:53:19 UTC 73.6K followers, 318.5K engagements "Goldman Sachs Non-Profitable Tech Index is up XX% since hitting its low in April. Signals very strong risk appetite for speculative growth stocks. Like in 1999, investors are again very willing to pay up for distant earnings"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946310217765151166) 2025-07-18 20:44:51 UTC 73.6K followers, 2187 engagements "Nvidia CEO talks about AI/China/Models at Beijing Expo China. - splits AI into hardware, models and apps, stressing all three advance together. - About XX% of global AI researchers work in China, sustaining that pace. - Nvidia's 30-year China presence benefits from a sophisticated interlinked supply chain. - H20 complies with export caps yet offers strong bandwidth for large-model inference. - RTX Pro powers Omniverse digital twins, matching China's smart-factory and robotics push. - He names reasoning as AI's third wave, fueled by compute-heavy post-training, not extra data. Reasoning AI links"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945544862134407586) 2025-07-16 18:03:36 UTC 73.6K followers, 1392 engagements "Mark Zuckerberg strikes again. Meta just grabbed Apple veterans Mark Lee and Tom Gunter for its Superintelligence Labs, which already poached their boss Ruoming Pang with a $200M package.
Lee was Pang's very first recruit at Apple, and Gunter was a distinguished engineer inside Apple Foundation Models, the group that trains Siri's large language models. Their exit adds to internal uncertainty as Apple weighs swapping its own models for ChatGPT or Claude to get new Siri features out by next spring. reuters."  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946077375789421020) 2025-07-18 05:19:37 UTC 73.6K followers, 2287 engagements "How do memories last when the molecules that form them turn over within days, weeks or months? A memory sticks around because two proteins meet in the same tiny spot where two neurons talk. Memories can live for decades because PKM sticks to KIBRA inside a busy synapse, creating a swap-friendly bond that survives routine protein turnover. PKM is an enzyme that lives inside the synapse, the contact point between neurons. They act like a bookmark. When one copy of either protein breaks down during normal cell cleanup, a fresh copy plugs straight back into the waiting partner so the bookmark never"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945382000577405057) 2025-07-16 07:16:27 UTC 73.6K followers, 1344 engagements "META COMES BACK WITH FULL FORCE. Mark Zuckerberg announced Meta will spend hundreds of billions building AI data centers that each pull gigawatt-scale power, chasing models that out-think humans. Prometheus, its planned AI super-compute campus, goes live in 2026 and Hyperion (the bolder sequel to Prometheus) later ramps to X GW, all paid for by Meta's own capital. Meta folded every AI project into Superintelligence Labs after Llama X stalled. Bigger models need far more compute, so the plan pivots from "add servers" to "build mini-power plants".
A single X GW cluster can host tens of thousands of"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944818581893734860) 2025-07-14 17:57:37 UTC 73.5K followers, 8096 engagements "Thomson Reuters survey finds XX% of legal, audit and accounting firms already profit from AI; even ad-hoc adopters saw ROI. Source: fortune.com/2025/07/01/ai-lawyers-accountants-auditors-lessons-for-us-all/"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1940349733555630101) 2025-07-02 10:00:01 UTC 73.5K followers, 1736 engagements "Paper Title: "A Survey of Context Engineering for LLMs""  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946349761654853871) 2025-07-18 23:21:59 UTC 73.6K followers, 1142 engagements "The picture sorts the data first. On top you see X imaging streams (radiology, dermatology, digital pathology, ophthalmology) and X medical-text stream. Each arrow shows how those sources feed the rest of the stack. The images go through MedSigLIP, a vision encoder that turns each scan or photo into a compact vector the language models can read. Those vectors flow into MedGemma 4B Multimodal, a 4B-parameter model that handles both pictures and words in a single forward pass. For text-only work there is a larger 27B-parameter MedGemma model that skips the image part and focuses on language reasoning"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1943086976170942974) 2025-07-09 23:16:50 UTC 73.6K followers, 1179 engagements "ChatGPT literally saved this guy's life after he got lost in the woods. The group got lost for X hrs in unmapped woods on an ATV ride, then one guy sent phone GPS coords to ChatGPT every few minutes. ChatGPT replied with clear compass cues, road names and terrain notes, guiding them back to town unharmed.
From r/ChatGPT/Own_Analyst3795"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1937199835318485177) 2025-06-23 17:23:26 UTC 73.6K followers, 1.5M engagements "with only a couple of prompts Gemini CLI can convert a messy folder containing hundreds of notes into a neatly named, well-structured, cross-linked Obsidian knowledge graph, all in about half an hour and at minimal cost. from r/singularity/Ryoiki-Tokuiten"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1939449158882275758) 2025-06-29 22:21:27 UTC 73.6K followers, 413K engagements "OpenVision a fully open vision encoder family offering 25+ models (5.9M-632M params) that outperform or match OpenAI's CLIP and Google's SigLIP on 9+ multimodal benchmarks. This matters as it's completely open (training data, code and weights included) unlike CLIP/SigLIP. OpenVision uses CLIPS (contrastive + generative training) and Recap-DataComp-1B (re-captioned with LLaVA3) for fully open training from scratch. Performance-wise OpenVision outdoes CLIP/SigLIP on LLaVA-1.5 and Open-LLaVA-Next setups across TextVQA, ChartQA, MME, OCR etc., especially in higher-res variants like L/14-336. OpenVision-H/14"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1920974917866057913) 2025-05-09 22:51:25 UTC 73.6K followers, 11.6K engagements "OpenAI is baking a payment-checkout into ChatGPT so shoppers can pay inside the chat, and OpenAI will pocket a commission from each sale, as per FT. The move will turn its free users into a fresh revenue engine beyond premium plans. Right now ChatGPT shows shopping links that dump users on outside sites, which means friction for buyers and zero cut for OpenAI. Folding checkout into the chat slices out that jump and keeps money flowing through its own rails.
Shopify's proven backend will handle card data, fraud checks and fulfillment calls while OpenAI focuses on the chat front that recommends"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945670530981618094) 2025-07-17 02:22:58 UTC 73.6K followers, 3407 engagements "LLM based multi-agent portfolio work in crypto. Here researchers extend the LLM-based AI agent idea to digital assets with a team of analyst, trader and risk-manager LLMs that co-operate on a basket of the top XX tokens. The framework surpasses single-agent and market benchmarks in hit-rate and drawdown control and keeps full explainability through agent dialogue logs. ideas.repec.org/p/arx/papers/2501.00826.html"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944555788644884926) 2025-07-14 00:33:22 UTC 73.6K followers, 52.3K engagements "Most AI-alignment tests only see if a model avoids harm. This paper asks whether it can help people thrive. The team built the Flourishing AI Benchmark: 1229 mixed questions tagged to X everyday domains (Character, Relationships, Happiness, Purpose, Health, Money and Faith). Judge models grade each answer and a geometric mean ties the scores, so X weak area pulls the total down. They ran XX well known chatbots. OpenAI o3 topped the chart at XX but every system missed the XX pass mark, with Faith and Purpose dragging hardest and Money showing the best numbers. The design stops cherry picking, pushing"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944371230213074945) 2025-07-13 12:20:00 UTC 73.6K followers, 5726 engagements "LLM for financial trading. More findings. Here researchers embed an LLM opinion module inside the Black-Litterman framework. By mapping model uncertainty to confidence weights they create portfolios that outperformed S&P XXX equal-weight and vanilla mean-variance allocations during Jun 2024-Feb 2025 rebalancing tests.
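A minimal numpy sketch of the Black-Litterman idea in the post above, wiring an LLM's stated confidence into the view-uncertainty matrix Omega; the confidence-to-Omega mapping below is a common heuristic, not necessarily the paper's exact formula:

```python
import numpy as np

def bl_posterior(pi, Sigma, P, q, conf, tau=0.05):
    """Black-Litterman posterior mean where view uncertainty Omega is
    driven by a model's self-reported confidence in (0, 1]."""
    base = tau * P @ Sigma @ P.T
    # Heuristic mapping (an assumption, not the paper's formula):
    # full confidence -> near-zero view variance, low confidence -> large.
    Omega = base * (1.0 / np.asarray(conf, dtype=float) - 1.0 + 1e-9)
    A = np.linalg.inv(tau * Sigma)          # precision of the prior
    Oi = np.linalg.inv(Omega)
    M = np.linalg.inv(A + P.T @ Oi @ P)
    return M @ (A @ pi + P.T @ Oi @ q)

pi = np.array([0.04, 0.06])                    # equilibrium returns
Sigma = np.array([[0.04, 0.01], [0.01, 0.09]])
P = np.array([[1.0, 0.0]])                     # one view: asset 0 returns q
q = np.array([0.08])
low = bl_posterior(pi, Sigma, P, q, conf=[[0.1]])
high = bl_posterior(pi, Sigma, P, q, conf=[[0.9]])
# Higher stated confidence pulls asset 0's posterior closer to the 8% view.
```

The point of the design is that a vague LLM opinion barely moves the equilibrium allocation, while a confident one dominates it.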
they found that different LLMs exhibit varying levels of predictive optimism and confidence stability, which impact portfolio performance. The source code and data are available at github.com/youngandbin/LLM-MVO-BLM. arxiv.org/abs/2504.14345"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944550852108394566) 2025-07-14 00:13:45 UTC 73.6K followers, 11K engagements "GitHub Repo: Automatic document classification, smart tagging and semantic search using OpenAI-compatible APIs and Ollama. For Paperless-ngx, using OpenAI API, Ollama, Deepseek-r1, Azure and all OpenAI-API-compatible services to automatically analyze and tag your documents. --- github.com/clusterzx/paperless-ai"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946221421400633456) 2025-07-18 14:52:00 UTC 73.6K followers, 3091 engagements "OpenAI cut off a developer who weaponized ChatGPT's API. This developer built a project which could respond to voice commands using ChatGPT's Realtime API. OpenAI confirmed the shutdown, citing a violation of its policies prohibiting the use of its AI for weapon-related applications. The turret could interpret commands like "turn left" or "respond accordingly" with precise real-time adjustments, indicating how easily language models can be integrated into lethal systems. This incident amplifies concerns about AI's potential role in automating military-grade systems, similar to autonomous"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1878063926866493675) 2025-01-11 12:58:27 UTC 73.6K followers, 380.5K engagements "Google Research releases MedGemma 27B multimodal, health-AI models that run on X GPU. MedGemma 27B multimodal extends the earlier 4B multimodal and 27B text-only models by adding vision capabilities to a 27B-parameter language core. Training added X new datasets, EHRQA and Chest ImaGenome, so the model can read longitudinal electronic health records and localize anatomy in chest X-rays.
The report states that this larger multimodal variant inherits every skill of the 4B model while markedly improving language fluency, EHR reasoning and visual grounding. The 4B variant clocks XXXX% MedQA and 81%"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1943083359758111073) 2025-07-09 23:02:28 UTC 73.6K followers, 11.8K engagements "Another study showing how LLM + price time-series data is helping trading strategies. LLMoE adaptive routing for trading strategies: The LLM-Based Routing in Mixture-of-Experts (LLMoE) framework replaces a conventional softmax router with a language model that chooses between optimistic and pessimistic sub-experts after reading both price time-series and headline text. On MSFT data from 2006-2016 the approach lifts total return to XXXXX % versus XXXXX % for a classic MoE and raises the Sharpe ratio accordingly, while maintaining full interpretability through the router's text rationale"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944448919347642817) 2025-07-13 17:28:43 UTC 73.6K followers, 21.1K engagements "Most models freeze once a clip tops about XX seconds. LongVILA-R1 shows how a 7B model can reason across hour-long footage with cheap hardware. The authors build a 52K question-answer set called LongVideo-Reason covering temporal, spatial, goal and plot cases. Training first copies these human-style chains of thought, then switches to reinforcement learning that scores each answer and keeps better policies. A trick named Multi-modal Reinforcement Sequence Parallelism splits frames across GPUs and reuses embeddings, trimming step time by 2.1x and handling 3600 frames on X A100s. The result matches"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1943918747666067938) 2025-07-12 06:22:00 UTC 73.5K followers, 1476 engagements "Child-sized robots can fly and could be used to extend future search-and-rescue reach.
iRonCub3, a X m, XX kg humanoid that lifts XX cm using X jet thrusters. A jet-powered humanoid called iRonCub X has taken its first tethered jump, showing that balanced flight is possible with X hobby-sized turbines and whole-body control. Current humanoids walk but cannot cross gaps or debris. This project bolts X turbines on the arms and X on a backpack, runs them through force sensors and an unscented Kalman filter, then asks a model predictive controller to keep the center of mass steady. Before lighting real engines the"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1941874933858185394) 2025-07-06 15:00:37 UTC 73.5K followers, 2342 engagements "How Each Agent Works: Every agent outputs through a strict Zod schema, which means the orchestrator can merge results without surprises. Adding a new field is a one-line schema tweak and a small search routine, no risky prompt surgery"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1943382863963132179) 2025-07-10 18:52:35 UTC 73.6K followers, XXX engagements "@TeksEdge yep, for now waiting for the next Llama's release though"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945715810821079228) 2025-07-17 05:22:53 UTC 73.5K followers, XX engagements "Microsoft just dropped Phi-4-mini-flash-reasoning. - built on a new hybrid architecture - 10X higher throughput and a X to 3X reduction in latency - significantly faster inference without sacrificing reasoning performance. Microsoft swaps most of that heavy work for a lean SambaY layout with tiny gating blocks, so the same 3.8B parameters think quicker and type sooner. The quick idea: Phi-4-mini-flash-reasoning keeps size small at 3.8B parameters but rebuilds the flow of information.
A new decoder-hybrid-decoder stack called SambaY lets light recurrent pieces handle context, a single full-attention"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1943124260169613662) 2025-07-10 01:44:59 UTC 73.6K followers, 12K engagements "The taxonomy of Context Engineering in Large Language Models is categorized into foundational components, system implementations, evaluation methodologies and future directions"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946349735536967891) 2025-07-18 23:21:53 UTC 73.6K followers, XXX engagements "AI isn't just taking away entry-level jobs, it's helping thousands apply for the same job with almost the same CV. AI will redefine the need for $80-120K+ university degrees. This is quite meaningful from the article: "Being able to write well and think coherently were basic requirements in most graduate jobs XX XX years ago," said a senior recruitment professional at a large consultancy firm in London, speaking anonymously. "Now they are emerging as basically elite skills. Almost nobody can do it. We see all the time that people with top degrees cannot summarise the contents of a document""  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944594798083793405) 2025-07-14 03:08:23 UTC 73.6K followers, 10.7K engagements "Wow this is such a brilliant idea for running AI models locally. webFrame is @thewebAI 's backend that slices a huge language model into smaller shards, sends each shard to a different computer on your own network, then stitches the answers back together on the fly. Because every shard stays local, no token or user data leaves the building, and even a modest Mac Mini cluster can serve a state-of-the-art model in real time. It's redefining what's possible on local hardware. And they just published their benchmark results.
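webFrame's actual protocol is not public, so the shard-and-stitch idea from the post above can only be illustrated with a toy pipeline (hypothetical helper names; real shards run on separate machines and stream activations over the network):

```python
def shard_layers(layers, n_nodes):
    """Split a model's layer list into contiguous shards, one per machine."""
    k, r = divmod(len(layers), n_nodes)
    shards, i = [], 0
    for n in range(n_nodes):
        size = k + (1 if n < r else 0)
        shards.append(layers[i:i + size])
        i += size
    return shards

def pipeline_forward(shards, x):
    """Each 'node' runs only its own shard and hands activations onward,
    so no single machine ever holds the full model."""
    for shard in shards:
        for layer in shard:
            x = layer(x)
    return x

layers = [lambda v, i=i: v * 2 + i for i in range(6)]  # stand-in layers
shards = shard_layers(layers, 3)  # e.g. split across 3 Mac Minis
# Sharding changes placement, not the math:
assert pipeline_forward(shards, 3) == pipeline_forward([layers], 3)
```

The privacy claim in the post follows from the placement: activations move between your own machines, but weights and tokens never leave the local network.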
webFrame pushed out 3X more tokens each second than a SOTA open-source"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946243288455090378) 2025-07-18 16:18:54 UTC 73.6K followers, 6218 engagements "YC outlines how top AI startups prompt LLMs: prompts exceeding six pages, XML tags, meta-prompts and evaluations as their core IP. They found meta-prompting and role assignment drive consistent agent-like behavior. Key Learning: Top AI startups use "manager-style" hyper-specific prompts: 6+ pages detailing task, role and constraints. These aren't quick hacks; they're structured like onboarding docs for new hires. Role prompting anchors the LLM's tone and behavior. Clear persona = better alignment with task. Example: telling the LLM it's a customer support manager calibrates its output"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1936837831458009217) 2025-06-22 17:24:58 UTC 73.6K followers, 256.7K engagements "Small model, big browser skills, thanks to smart compute splitting. Open web agents usually need huge models or tedious hit-and-miss tuning, so training a small open model that finishes multi-step website tasks still feels like luck. This study shows how to split the training budget so an 8B Llama even beats its 70B teacher on many tasks. Weak 8B student first copies 70B demos through supervised fine tuning, then swaps to on-policy reinforcement learning while the lessons are fresh. The authors tried 1370 hyperparameter mixes and used bootstrap sampling to learn which ones really matter instead of"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1943509638437310906) 2025-07-11 03:16:21 UTC 73.6K followers, 4547 engagements "It was only a matter of time and now it's starting - "individualized pricing using AI". Delta is ditching flat fares in favor of AI that determines how much you personally will pay for a ticket.
The system treats pricing like a live stock ticker, watching demand spikes, route history and even seat layout, then offering a personal price in real time. Delta feeds those signals into Fetcherr, a 6-year-old startup already powering WestJet and Virgin Atlantic. The carrier says early tests lifted revenue per seat without harming load factors. For now shoppers can still game the system by clearing cookies"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945720839254761659) 2025-07-17 05:42:52 UTC 73.6K followers, 4262 engagements "Browsers built into new language models now scrape social feeds on demand, so guessing a stranger's age or politics takes only a username. That simple trick is both a research lifeline and a privacy headache. The paper tests this power because public APIs keep shrinking while social science still needs fresh tweets. Authors spun up XX synthetic X accounts with set gender, age, class and ideology, then compared model guesses to ground truth after XX tweets apiece. They also rechecked 1384 real users from a 2018 survey. GPT-4o hit XX% on gender in the toy set and XX% on class in the survey and it"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946050881084145688) 2025-07-18 03:34:20 UTC 73.6K followers, 6382 engagements "2025 IMO (International Mathematical Olympiad) LLM results are in. --- The benchmark's mission is rigorous assessment of the reasoning and generalization capabilities of LLMs on new math problems which the models have not seen during training. It applies a uniform scoring procedure so results do not depend on any provider-specific setup. During evaluation each model tackles every problem X times and MathArena reports the average score together with the total cost in USD for those runs"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945930814727831738) 2025-07-17 19:37:14 UTC 73.6K followers, 4823 engagements "BIG news for anti-aging inventions.
Mushroom drug might slow fundamental aging processes. Psilocybin's active metabolite kept human fibroblast cultures alive XX% longer at XX M and XX% longer at XXX M. Aged mice given monthly psilocybin showed XX% survival while only XX% of control animals made it through the same 10-month span. Psilocybin has long been tested for depression and addiction, yet many researchers suspected a deeper link between the compound and biological aging because positive mental states often track with longer telomeres, the protective DNA caps that shrink as cells age."  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1943429195847381421) 2025-07-10 21:56:42 UTC 73.5K followers, 14.5K engagements "Grok X (Thinking) clocks XXXX% on ARC-AGI-2, grabbing the new SOTA. That score is almost 2x the last commercial best and now tops the Kaggle leaderboard. --- What ARC-AGI-2 tries to measure: The benchmark contains a larger, freshly curated set of grid-based puzzles that cannot be memorized, forcing any model to invent a rule on the fly from a handful of examples, then apply that rule to a held-out test grid. Unlike ARC-AGI-1, the new version adds an explicit cost axis, so a model must prove both adaptability and efficiency instead of relying on brute-force search with huge compute budgets. Grok 4's"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1943174583227682908) 2025-07-10 05:04:57 UTC 73.6K followers, 2075 engagements "On FrontierMath ChatGPT agent solves XXXX% of questions on its first try. FrontierMath is the hardest known math benchmark, featuring novel unpublished problems that often take expert mathematicians hours or even days to solve. FrontierMath targets problems that ordinarily take professional mathematicians many hours or even days, covering topics from computational number theory to algebraic geometry (Epoch AI). Because every item is new and unpublished, memorization is impossible, so high scores reflect genuine reasoning skill.
It proves again that giving AI models controlled access to tools"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945916115282182257) 2025-07-17 18:38:49 UTC 73.6K followers, 1543 engagements "Most agent tests stop at tiny teams and ignore how the bots actually coordinate. AGENTSNET, proposed in this paper, shows what happens when the crowd scales and asks for real teamwork. The benchmark packs X classical distributed tasks, namely coloring, vertex cover, matching, leader election and consensus, into chat puzzles. Agents only talk to neighbors and exchange JSON messages for 2D+1 synchronous rounds, mimicking the LOCAL model from distributed computing. Each agent receives a tiny prompt with its own name, the neighbor list and the shared goal, then decides what to share or hold back"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945838145624297531) 2025-07-17 13:29:00 UTC 73.6K followers, 1432 engagements "Tech CEOs warn the hiring boom is over as AI writes code, answers tickets and trims payrolls. Stanford data shows entry-level developer jobs sliding while only top specialists gain. Anthropic predicts XX% unemployment within X years. Microsoft cut 9000, letting Copilot write XX% of its code. IBM dumped 8000 roles. ADP payroll analysis pins the damage on devs aged 18-25. Amazon CEO Andy Jassy said last month that AI will reduce our total corporate workforce over the next few years as the company begins to need fewer people doing some of the jobs that are being done today. Shopify CEO Tobi Lutke"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946142563809042781) 2025-07-18 09:38:39 UTC 73.6K followers, 4298 engagements "The writer of this prompt says "This guide will cost openai thousands of dollars.
Lol""  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946140981897966062) 2025-07-18 09:32:22 UTC 73.6K followers, 4032 engagements "So combining these two benchmarks (the SpreadsheetBench and the Internal Banking Benchmark), ChatGPT Agent can automate a substantial portion of the tedious and data-intensive tasks that define the role of a junior investment banking analyst. - It can conduct complex research and analysis to build financial models from scratch. - It can expertly manipulate spreadsheets, a fundamental requirement for the job. - It can reason, plan and execute multi-step workflows that involve using different tools (like the browser for research and the terminal for data processing/file creation)"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945921102175432986) 2025-07-17 18:58:38 UTC 73.6K followers, 1946 engagements "This will prove genuinely useful to rely on daily. @Proactor_ai just released v1.0, the 1st self-active AI teammate that acts on X prompts, giving real-time fact-checks and smart interventions. It will act on its own based on the situation. It links X skills: perception to capture audio, reasoning to compare each claim with search results, and autonomous action to deliver concise corrections or suggestions. - real-time transcription for meetings, calls and discussions - while you speak, Proactor listens, analyzes and immediately provides targeted AI advice. - can automatically identify and"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1942599886333174159) 2025-07-08 15:01:19 UTC 73.5K followers, 4470 engagements "This headline pumps iron. Elon sure understands his crowd.
Have you tried Grok's new companion mode yet?"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945040050669936680) 2025-07-15 08:37:39 UTC 73.6K followers, 4972 engagements "Cognition AI is taking Windsurf's code brand and $82M revenue, days after Google bought its founders for $2.4B, slotting the prize under a $4B valuation. So basically Google took the captains, Cognition got the ship. Google sidestepped a full purchase by licensing the tech and hiring the chiefs, a play that avoids antitrust noise yet strips the startup of leadership. Cognition grabs the rest, promises instant vesting for every engineer, and will feed Windsurf's data into Devin, its automated coder, hoping the extra examples cut hallucinations and widen language support. --- bloomberg."  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944835830524063855) 2025-07-14 19:06:09 UTC 73.6K followers, 4103 engagements "This is quite a landmark paper from @GoogleDeepMind. 2x faster inference because tokens exit the shared loop early. During training it cuts the heavy math, dropping attention FLOPs per layer by about half, so the same budget trains on more data. Shows a fresh way to teach LLMs to plan steps inside their own reasoning loop instead of hard-coding a single chain. Second, it proves the mixer idea scales. By jumbling several small recursive experts and letting the model pick which one to call next, the team pushes accuracy on math and coding benchmarks without ballooning parameter count."  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945342199354548287) 2025-07-16 04:38:17 UTC 73.6K followers, 8470 engagements "Did you know XX% of US caselaw is available open-sourced on @huggingface. This dataset contains XXX million cases from the Caselaw Access Project and Court Listener. The Caselaw Access Project consists of nearly XX million pages of U.S. federal and state court decisions and judges' opinions from the last XXX years.
In addition Court Listener adds over XXX thousand cases scraped from XXX courts. The Caselaw Access Project and Court Listener source legal data from a wide variety of resources such as the Harvard Law Library, the Law Library of Congress, and the Supreme Court Database. From"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945535228518293769) 2025-07-16 17:25:19 UTC 73.6K followers, 29.2K engagements "Frontier language models shine on Olympiad-level benchmarks yet stumble on chores like counting letters. The paper samples easy reasoning tasks, dials up length or distractions, and watches accuracy crash. Tests cover word or character counting, logic trees, proof-style math stories, and travel itineraries that only need basic bookkeeping. As paragraphs grow or extra names appear small step errors snowball: models lose track of state, guess from phrase frequency, or copy memorised solutions instead of thinking. A hand-built Unpuzzles set flips famous riddles into trivial variants yet models often reuse"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944037530728382874) 2025-07-12 14:14:00 UTC 73.6K followers, 133.5K engagements "Meta's reply to Stargate comes through Prometheus at X GW and Hyperion at X GW running multi-billion-dollar GPU clusters that sit in tents. - Meta - Prometheus IT Power by end of 2026 is 1020MW. The total number of chips used is 500000. Total compute power is 3171044226 TFLOPS. - Anthropic - Project Rainier IT Power by end of 2026 is 780MW. The total number of chips used is 800000. Total compute power is 1040000000 TFLOPS. - OpenAI - Stargate IT Power by end of 2026 is 880MW. The total number of chips used is 400000. Total compute power is 2469594595 TFLOPS. --- semianalysis."  
[@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944619983209963613) 2025-07-14 04:48:27 UTC 73.6K followers, 19.1K engagements "SceneScript from Meta Reality Labs Research turns mapping rooms into writing short text commands so headsets can sketch walls, doors, and objects on the fly without fragile geometry code. It learns that trick inside a huge synthetic world of 100000 virtual homes then plugs straight into large language models so you can ask spatial questions like you chat with ChatGPT. Key point is that SceneScript swaps 3D math for plain script generation, makes the vocabulary expandable, and lets anyone tweak a scene by correcting tokens, all with the same next-word prediction trick that powers modern"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944996370420506977) 2025-07-15 05:44:05 UTC 73.5K followers, 3236 engagements "Most apps pick one large language model then hope it can do every job. FusionBench proves that mixing models with smart routing, shared thoughts, or distillation beats any solo model. FusionBench gathers 103M tokens of queries, answers, and thought sketches from XX open models that range from 8B to 671B parameters. It covers XX familiar tasks in math, code, commonsense, world knowledge, and reading so tests feel realistic. Each query holds two answer styles, a straight reply and a detailed reasoning path, then a judge score and cost tag. FusionFactory then tries three fusion tricks. Query level trains"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945107581409632328) 2025-07-15 13:06:00 UTC 73.5K followers, 6179 engagements "Multi-token masks plus gated LoRA cut LLM latency without hurting accuracy, code output X faster. LLMs can already guess several words ahead; this paper shows how to cash in on that foresight for X faster code and math generation with no drop in answer quality.
What problem are they poking at? Autoregressive models speak one token at a time so every extra word forces another full pass through the network. That single-step habit slows reading back code, proofs, or long chat replies. The authors noticed the model's hidden states quietly predict whole phrases anyway, sitting unused in the logits list."  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946018904993866009) 2025-07-18 01:27:16 UTC 73.6K followers, 3025 engagements "LLMs nail standard school word problems but fall apart when the question needs real-world sense. This scoping study tracks why. The Core Concepts: LLMs slice every prompt into tiny tokens and predict the next token from statistics so solving means matching patterns, not building a picture of the story. Problem sets used to train and test the models are heavily skewed toward s-problems, short tasks that collapse to plain arithmetic. Only a few include contextual twists or nonsense questions. Method: The authors compared X OpenAI models on XXX questions from X popular data sets plus classic"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1941993646011891732) 2025-07-06 22:52:20 UTC 73.6K followers, 2059 engagements "Linking to a standard Large Language Model unlocks reasoning. Ask "Which chair sees the TV" and the chatbot parses the generated script, computes sight lines, and replies in plain text, all without extra geometric code. If the model misplaces a door the user can type a correction, the network infills the right token sequence, and the interpreter snaps the mesh back into place"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944996793194045515) 2025-07-15 05:45:46 UTC 73.5K followers, XXX engagements "Beautiful research from @Apple. More thoughts stop helping once tasks cross critical depth. Thinking tokens rise then crash, revealing compute inefficiency. So standard LLMs beat LRMs on easy puzzles unexpectedly.
Researchers stress-test them on puzzles whose difficulty can be dialed up step by step. Thinking models pull ahead mid-way but every model collapses once the puzzle grows past a critical depth. Even stranger, near that point the thinker writes fewer thoughts despite plenty of allowed tokens, hinting at a built-in ceiling on current inference-time reasoning. Key findings below. Controlled"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1930968053027578199) 2025-06-06 12:40:34 UTC 73.6K followers, 263.1K engagements "Figure just rolled out its 3rd-generation battery for the F.03 humanoid. With higher energy density F.03 keeps its 5-hour spec yet gains extra payload headroom for future arms. The new pack is a structural part of the robot so it doubles as the torso frame, cuts BOM (Bill of Materials) by XX%, and shrugs off a X m concrete drop. Active cooling lets the pack gulp X kW during pitstop charging, stretching runtime to roughly the entire 5-hour shift F-series robots already hit. Cell prices have slipped below $130/kWh this year and Figure's structural approach removes the usual XX% overhead for separate"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946194242981752905) 2025-07-18 13:04:00 UTC 73.6K followers, 2451 engagements "AI compute is running into a power wall. China's grid is already closing on 10000 TWh while the United States has sat near 4178 TWh for two decades. If training and serving bigger models keeps eating watts at the current pace that flat U.S. line could matter more than any parameter count. China's curve rockets upward because the country kept adding coal, wind, and solar at break-neck speed, lifting generation six-fold since 1999. The U.S. line crawls sideways, topping out just after 2010 and hovering around the same 4000 TWh ever since.
China's burst came with a huge build-out of"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944981234213638183) 2025-07-15 04:43:56 UTC 73.5K followers, 7398 engagements "PDF parsing is still painful because LLMs reorder text in complex layouts, break tables across pages, and fail on graphs or images. Testing the new open-source OCRFlux model and here the results are really good for a change. So OCRFlux is a multimodal LLM based toolkit for converting PDFs and images into clean readable plain Markdown text. Because the underlying VLM is only 3B param it runs even on a 3090 GPU. The model is available on @huggingface. The engine that powers OCRFlux teaches the model to rebuild every page and then stitch fragments across pages into one clean Markdown file."  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1940057084021940543) 2025-07-01 14:37:08 UTC 73.6K followers, 149.6K engagements "DeepSeek interesting prompt. From Reddit"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1883601254318039148) 2025-01-26 19:41:49 UTC 73.6K followers, 12.1M engagements "Carnegie Mellon researchers reveal headline: AI agents flop on 62% to 70% of real-world professional office tasks"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1941900573022015584) 2025-07-06 16:42:30 UTC 73.6K followers, 141.9K engagements "To get the new Grok companion: update Grok, tap bottom-right settings, download companions (one time), choose a chat companion"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944901804216786990) 2025-07-14 23:28:19 UTC 73.6K followers, 6028 engagements "OpenAI rolled out agent mode in ChatGPT. Lets the model click around a virtual computer, run code, and finish multi-step jobs on its own, hitting XXXX% on Humanity's Last Exam while handling chores like building slide decks or buying groceries.
It reaches XXXX% accuracy on Humanity's Last Exam (HLE) while older baselines like OpenAI o3 without tools sit at XXXX% and deep-research with browsing reaches 26.6%. The HLE exam spans 2500 expert-level questions across 100+ subjects that were crowdsourced specifically to stump modern language models. So doubling the previous best pass@1 score signals a"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945913831370453072) 2025-07-17 18:29:45 UTC 73.6K followers, 5404 engagements "A new 32B model EXAONE XXX just dropped on @huggingface from LG AI Research. Outcompetes Qwen 235B on coding and exceeds DeepSeek R1 V3 671B on instruction tasks. - toggleable reasoning, 131K context, and a non-commercial license. - It solves more edge cases than Qwen 235B while using about one-seventh of the memory footprint - Trained on 14T carefully filtered tokens. - supports Model Context Protocol (MCP) and Function Calling"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945027813133983834) 2025-07-15 07:49:02 UTC 73.6K followers, 8675 engagements ""Grok X is better than PhDs in every subject no exception" - Number X on Humanity's Last Exam with XXXX% - Number X on ARC-AGI-2 with XXXX% where the next best score is at 8.6%"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1943194742256435406) 2025-07-10 06:25:04 UTC 73.6K followers, 11.8K engagements "The paper finds that money-based crowd odds on Polymarket called the 2024 presidential result more accurately and earlier than every traditional poll, with the edge most obvious in key swing states where markets stayed on Trump while surveys wavered. So basically polls still miss presidential winners despite $50M spent each cycle. The paper pits those polls against daily Polymarket odds, modelling both with Bayesian structural time series. Market prices jumped the night of Trump's July shooting attempt, giving him XX% while polls barely moved.
By XX October market forecasts never dipped below"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945482553366135194) 2025-07-16 13:56:00 UTC 73.6K followers, 1437 engagements "Kwai Keye-VL turns messy short videos into machine-friendly stories. Kwai Keye-VL is an 8B-param MLLM built by Kuaishou (the company behind the Kwai short-video app) to understand short videos as easily as still images while still handling regular vision-language tasks. The 8B-parameter model tops video tests yet keeps strong image skills. Most existing multimodal LLMs work well on single images but struggle with the rapid scene changes, audio, and dense context in TikTok-style clips. Keye-VL fixes that gap by training on a XXX billion-token corpus rich in video then adding a recipe that teaches"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1940933078215807328) 2025-07-04 00:38:01 UTC 73.6K followers, 1587 engagements "Grok X is crazy. Everyone keeps cranking out projects. Compiling XX incredible examples. 1/n Grok4 generates click-morphing 3D attractor particles with ThreeJS shaders, browser-native. XX FPS on consumer laptops"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1943839074470834207) 2025-07-12 01:05:24 UTC 73.6K followers, 12.1K engagements "AI assistants are now mandatory kit in top-tier US law firms. DLA Piper already embeds Copilot in Microsoft apps and deploys in-house models that spot Foreign Corrupt Practices Act trouble before it blooms. Gibson Dunn's ChatGPT Enterprise pilot lets XXX people compare Google Gemini and Claude on real briefs. Ropes & Gray's Hebbia agent squeezes fund term extraction from XX hours to about X. --- businessinsider. com/big-law-top-10-firms-ai-overhaul-use-cases-2025-7"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946138604994285637) 2025-07-18 09:22:55 UTC 73.6K followers, 1535 engagements ""The era when humans program is nearing its end within our group.
Our aim is to have AI agents completely take over coding and programming. (...) we are currently initiating the process for that." - Softbank founder Masayoshi Son. He estimates that approximately 1000 AI agents would be needed to replace each employee because "employees have complex thought processes." --- lightreading. com/ai-machine-learning/softbank-aims-for-1-billion-ai-agents-this-year"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945901453350166664) 2025-07-17 17:40:34 UTC 73.6K followers, 3126 engagements "The paper answers two questions: X. What's the difference between prediction and world models? X. Are there straightforward metrics that can test this distinction? Engineers often judge large models by how well they guess the next token. This study shows that great guesses do not guarantee a real grasp of the rules behind the data and it introduces a quick way to check. The authors build tiny synthetic tasks that obey a known set of rules, fine-tune a foundation model on each task, then watch how the model finishes fresh examples from the same rulebook. If its answers always change when the hidden"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944258856324149449) 2025-07-13 04:53:28 UTC 73.6K followers, 6408 engagements "field footage of Unitree Go2 Pro: basement and park --- reddit. com/r/robotics/comments/1lty64o/some_field_footage_of_unitree_go2_pro_basement/"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1942488825151848605) 2025-07-08 07:40:00 UTC 73.5K followers, 1569 engagements "Recruiters face XX% surge in AI-crafted résumés, hitting 11000 submissions per minute. Many résumés now mirror job-description keywords from simple ChatGPT prompts.
AI agents auto-apply on behalf of candidates, forcing firms into an AI vs AI screening arms race"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1937362281332920713) 2025-06-24 04:08:56 UTC 73.6K followers, 273K engagements "ChatGPT agent fuses three older tools, blending Operator's web-browsing clicks, Deep Research's summarization tricks, and ChatGPT's reasoning into one system so a single prompt can trigger browsing, code execution, or API calls without manual tool-switching"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945915319111016651) 2025-07-17 18:35:40 UTC 73.6K followers, XXX engagements ""hyper precise prompts to describe what you want" is absolutely the BEST strategy. Many YCombinator AI startups' prompts are super detailed (e.g. 6+ page prompts) with XML tags and meta-prompting techniques. e.g. Parahelp's customer support agent prompt is 6+ pages meticulously outlining instructions for managing tool calls. --- Key Learning from this doc: Top AI startups use "manager-style" hyper-specific prompts, 6+ pages detailing task, role, and constraints. These aren't quick hacks; they're structured like onboarding docs for new hires. Role prompting anchors the LLM's tone and behavior."  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944485332575039718) 2025-07-13 19:53:24 UTC 73.6K followers, 149.3K engagements "OpenAI is preparing to release ChatGPT 'agents' that could threaten Microsoft Excel and PowerPoint"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945401304676990987) 2025-07-16 08:33:09 UTC 73.6K followers, 16.6K engagements "ChatGPT Record Mode now available to ChatGPT Plus users globally in the macOS desktop app. It lets you record up to XXX minutes of voice, like meetings, brainstorming sessions, or voice notes, and provides live transcription and a post-session summary saved as an editable canvas in your chat history.
Just tap the mic icon in chat, give mic & system-audio permissions, speak naturally, then stop or pause. ChatGPT creates a transcript and structured summary with highlights, action items, and timestamps. As of now Record Mode is not available for Linux, Windows, browsers, or mobile so here you won't see the mic"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945550869451440331) 2025-07-16 18:27:28 UTC 73.6K followers, 3664 engagements "I asked ChatGPT Agent to build a slide presentation on this. If Apple buys Perplexity how big of an acquisition will that be vs Apple's historical acquisitions"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946392786707837038) 2025-07-19 02:12:57 UTC 73.6K followers, 2814 engagements "It's a hefty 206-page research paper and the findings are concerning. "LLM users consistently underperformed at neural linguistic and behavioral levels" This study finds LLM dependence weakens the writer's own neural and linguistic fingerprints. Relying only on EEG, text mining, and a cross-over session, the authors show that keeping some AI-free practice time protects memory circuits and encourages richer language even when a tool is later reintroduced"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1934770112483217645) 2025-06-17 00:28:35 UTC 73.6K followers, 2.3M engagements "@OpenAI incredibly useful for meetings. Would have been massive if it was available for Linux, Windows, browsers, and mobile"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945549878731706466) 2025-07-16 18:23:32 UTC 73.6K followers, 11.2K engagements "AI can help us many ways. Here ChatGPT helped someone quit weed, retrieve scam money, figure out a career path, and boost fitness and mental health. One comment I really liked is "I used ChatGPT not like a coach or therapist just like a space to get real with myself."
The thread is full of stories like that"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945404712452608468) 2025-07-16 08:46:41 UTC 73.6K followers, 3581 engagements "2/n. Why constant space matters: Every document now carries the same vector count so the index grows linearly with corpus size rather than document length. Fixed length lets the database pack vectors into cache-friendly blocks, which improves paging and SIMD throughput, and it roughly halves index size compared with unpooled ColBERT. This approach makes it: - Easier to manage and scale in a vector database: All documents have uniform storage sizes, simplifying retrieval logic. - More efficient for query-time processing: Avoids the overhead of variable-length comparisons, leading to better cache"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946296598885233143) 2025-07-18 19:50:44 UTC 73.6K followers, XXX engagements "An ex-OpenAI engineer shares his thoughts about OpenAI. Has lots of insights on OpenAI's day-to-day life unlike anything I have read before. He joined OpenAI as a software engineer on the applied side, spending about XX months building the Codex coding agent and related internal prototypes. Most of his time went into writing Python, tuning GPU budgets, and sprinting with a small team to take Codex from first commit to its public launch in X weeks. He left because of his own craving for founder freedom yet calls the year the most eye-opening move of his career. Culture shock: OpenAI ballooned"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945311539143344548) 2025-07-16 02:36:27 UTC 73.6K followers, 116.5K engagements "A Reddit user deposited $XXX into Robinhood then let ChatGPT pick option trades. XXX% win rate over XX days.
He uploads spreadsheets and screenshots with detailed fundamentals, options chains, technical indicators, and macro data, then tells each model to filter that information and propose trades that fit strict probability-of-profit and risk limits. They still place and close orders manually but plan to keep the head-to-head test running for X months. This is his prompt ------- "System Instructions You are ChatGPT Head of Options Research at an elite quant fund. Your task is to analyze the"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944266301775786253) 2025-07-13 05:23:03 UTC 73.6K followers, 3.6M engagements "Voice is winning workflows and OpenAI stamped it in the UI. Three mics on the dock: record, dictate, chat, one tap per vibe"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945684614632309061) 2025-07-17 03:18:55 UTC 73.6K followers, 3848 engagements "Context Engineering Evolution Timeline: A comprehensive visualization of the development trajectory of Context Engineering implementations from 2020 to 2025"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946349744827035978) 2025-07-18 23:21:55 UTC 73.6K followers, XX engagements "A local LLM as a coding autopilot is so surreal. A single file consisting of vectors and somehow holding the knowledge and meaning of the human world. May vectors become much more powerful in the time to come"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946542033662824511) 2025-07-19 12:06:00 UTC 73.6K followers, 1997 engagements "unemployment rates from Federal Reserve Bank of NY: computer engineering ranks 3rd at 7.5%"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1942992716541104444) 2025-07-09 17:02:17 UTC 73.6K followers, 3264 engagements "Money habits differ worldwide yet nobody knows which habits shape LLM advice. This study asked X major chatbots and humans from XX countries the same XX finance questions.
Each model answered XXX times, researchers kept the median answer for every prompt and compared it with the INTRA survey medians. When the authors ran that check on the XX finance questions every large language model landed in the same tight group and the only human data that fell into that pocket came from Tanzania. The models almost always choose or price the gamble right at that average. In plain terms they treat a risky"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945466698913681563) 2025-07-16 12:53:00 UTC 73.6K followers, 2944 engagements "It is happening again. This time the magic word is not .com. It is AI. According to Torsten Slok, the influential chief economist at Apollo Global Management, AI's superstar stocks now trade at P/E levels higher than the 2000 dot-com crest, hinting at another bubble. Slok's chart shows the top XX names in the Standard and Poor's XXX carrying a richer premium than in 2000 while the other XXX barely move. Back in 2000 the internet was real yet a bubble still erased $5T of market value. The pattern repeats: exciting tech, easy money, and sky-high multiples. If profits do not rise quickly lofty"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946205818409586953) 2025-07-18 13:50:00 UTC 73.6K followers, 5307 engagements ""Developing superintelligence is now in sight. We should act as if it's going to be ready in the next 2-3 years." - Mark Zuckerberg About paying $XXX million or $XXX million pay packages, he argued that Meta will spend hundreds of billions on compute and data-center build-outs so paying roughly $XXX million-plus to attract about 50-70 top researchers is sensible since that wage bill is tiny next to the overall capital outlay. And also that the market for world-class AI talent is extremely competitive because only a handful of researchers can do this work and every major lab wants them.
---"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945725129138597928) 2025-07-17 05:59:55 UTC 73.6K followers, 6861 engagements "A follow-up study on Apple's "Illusion of Thinking" Paper is published now. Shows the same models succeed once the format lets them give compressed answers proving the earlier collapse was a measurement artifact. Token limits not logic froze the models. Collapse vanished once the puzzles fit the context window. So Models failed the rubric not the reasoning. โ The Core Concepts Large Reasoning Models add chain-of-thought tokens and self-checks on top of standard language models. The Illusion of Thinking paper pushed them through four controlled puzzles steadily raising complexity to track how"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1933296859730301353) 2025-06-12 22:54:24 UTC 73.6K followers, 476K engagements "This 39-page report from Kuaishou explains how the company rebuilt its video recommender system into one end-to-end generative model called OneRec Traditional recommenders run separate retrieval pre-ranking and ranking stages that waste compute on network transfers and chase conflicting goals. โ The Core Concepts OneRec deletes retrieval prerank and rank replacing them with one encoderdecoder that maps user context to video tokens in one forward pass. All parameters chase the same final reward so gradients stop fighting each other. High arithmetic density keeps GPUs busy with matrix"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1940002449093185846) 2025-07-01 11:00:02 UTC 73.6K followers, 2002 engagements "๐ฆ Goldman Sachs is testing a hybrid workforce (AI+humans) with autonomous software engineer AI agent Devin as a new employee The AI agent will draft unit tests clean legacy scripts and open pull requests while a human reviews every change. 
The bank currently employs around 12000 human devs"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1943779055012393277) 2025-07-11 21:06:55 UTC 73.6K followers, 2571 engagements "SpaceX just committed $2B to xAI. Musk bets that shared AI data and hardware lift every firm he controls. Future deals may see Grok guiding Starlink antennas or living inside Tesla's Optimus robots. --- wsj. com/tech/spacex-to-invest-2-billion-into-elon-musks-xai-413934de"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1944247369014399347) 2025-07-13 04:07:49 UTC 73.5K followers, 1977 engagements "Huggingface releases SmolLM3, a SoTA 3B model: 128k context, dual-mode reasoning (think/no_think). @huggingface released SmolLM3, a 3B-parameter multilingual reasoner that matches bigger 4B models, handles 128k tokens, and ships with an open-sourced training blueprint in this blog post. They pre-trained on 11.2T tokens then stretched context with YARN up-sampling, finishing the run on XXX H100 GPUs in XX days. A built-in dual think / no_think switch lets users decide between fast answers or slower chain-of-thought traces. How they pulled it off: Grouped Query Attention trades multi-head"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1942699436834185287) 2025-07-08 21:36:54 UTC 73.5K followers, 2818 engagements "So @xAI 's @grok X really did hit XXXX% on HLE (Humanity's Last Exam) --- (HLE holds 2500 expert-written questions spanning more than XXX subjects including math, physics, computer science, and humanities, and XX% of them mix text with images.
The authors deliberately built in anti-gaming safeguards and hid a private question set so that simply memorising answers will not help a model.)"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1943167569856532822) 2025-07-10 04:37:05 UTC 73.6K followers, 28.9K engagements "Someone just forked the original OpenAI Codex CLI. A terminal-based coding agent that lets you chat-prompt code changes, run them safely in a sandbox, and iterate, all while supporting multiple AI providers (OpenAI, Gemini, OpenRouter, Ollama)"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1913225961962696857) 2025-04-18 13:39:50 UTC 73.6K followers, 6430 engagements "Dharmesh Shah on leveraging AI in everything you do. "It's not a you vs AI. That's not the mental model you should have in here. The right mental frame of reference you should have is It's you to the power of AI. AI is an amplifier of your capability." --- From 'My First Million' YT channel (link in comment)"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945589802801528841) 2025-07-16 21:02:10 UTC 73.6K followers, 2863 engagements "AWS is previewing a specialized storage offering, Amazon S3 Vectors, that it claims can cut the cost of uploading, storing, and querying vectors by up to XX% compared to using a vector database. This new bucket type keeps vector data inside S3 itself, brings a dedicated similarity-query API, and promises up to XX% lower costs than running a separate vector database. The launch targets teams that need large cheap vector stores to feed retrieval-augmented generation, memory for AI agents, or other semantic-search workloads. What S3 Vectors is: S3 Vectors adds vector buckets. Inside each bucket you"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945693567877624020) 2025-07-17 03:54:30 UTC 73.6K followers, 3626 engagements "This paper wants to understand LLMs' proficiency in enhancing code performance at the repository level or delivering meaningful speed gains.
Models fall short because they do not know which lines of code waste the most time or how to coordinate fixes across several files. The authors built SWE-Perf to measure that shortfall. Human reviewers in the benchmark trimmed average runtime by XXXX% while the best agent improved only XXX% even though it passed almost XX% of the functional checks. That gap shows that real performance work still needs profiling tools, cross-file reasoning, and awareness of low level"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946148188982800470) 2025-07-18 10:01:00 UTC 73.6K followers, 1587 engagements "Further to my previous post, last month's huge medical AI innovation, Microsoft's AI Diagnostic Orchestrator (MAI-DxO), must be mentioned. Till now drug research has followed Eroom's law, where the cost to bring one therapy to market roughly doubles every X years and the success rate per $X B keeps sinking. That trend shows biology knowledge as the main choke point. MAI-DxO attacks that choke point by turning a large language model into a virtual panel of clinicians. It asks follow-up questions, picks tests, checks prices, and then cross-examines its own reasoning before it commits to a"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946448157652762955) 2025-07-19 05:52:58 UTC 73.6K followers, 2523 engagements "The authors start by reminding that a language model chooses each next token based on past tokens plus the surrounding context. Older prompt-engineering ideas package that context as one long prompt string, which works for toy demos but quickly falls apart in real systems. They then reject the single-string view and introduce context engineering. Here the context is a bundle of smaller parts that get sourced, filtered, and glued together by helper functions before each model call.
Treating context as several moving pieces makes it easier to swap data in and out on the fly"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946349740452614511) 2025-07-18 23:21:54 UTC 73.6K followers, XX engagements "that was quick. 1st party support for Claude Sonnet is back on @windsurf_ai"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1945630919181963483) 2025-07-16 23:45:33 UTC 73.6K followers, 3198 engagements "Because these mechanisms are text-level skills research groups have already tested the same reasoning setups on chemistry puzzles where the model must justify reaction mechanisms or property predictions and they report clear gains without chemistry-specific tweaks. arxiv .org/abs/2505.07735v1"  [@rohanpaul_ai](/creator/x/rohanpaul_ai) on [X](/post/tweet/1946595313546052035) 2025-07-19 15:37:43 UTC 73.6K followers, XXX engagements
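The context-engineering idea quoted above (context as a bundle of smaller parts that get sourced, filtered, and glued together by helper functions before each model call) can be sketched in a few lines. This is a minimal illustrative sketch, not code from the paper; every function name here is a hypothetical stand-in for whatever retrieval, history, and assembly helpers a real system would use.

```python
def retrieve_docs(query):
    # Stand-in for a retrieval helper; a real system would query a vector store.
    return ["doc about " + query]

def get_history(session, max_turns=3):
    # Filter step: keep only the most recent turns so the context stays small.
    return session[-max_turns:]

def assemble_context(query, session):
    """Source, filter, and glue smaller parts into one prompt before each
    model call, instead of maintaining a single fixed prompt string."""
    parts = [
        "System: answer concisely.",   # static instructions
        *get_history(session),         # sourced: conversation memory
        *retrieve_docs(query),         # sourced: retrieved documents
        "User: " + query,              # the current request
    ]
    return "\n".join(parts)

session = ["User: hi", "Assistant: hello"]
print(assemble_context("context engineering", session))
```

Because each part is produced by its own helper, data can be swapped in and out on the fly between calls, which is the point the quoted thread makes against the single-string view.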
"@RossionQ yes most prop-trading firms and i-banking trading desks have their own powerful prediction software. And now the most powerful models will be fine-tuned on their huge proprietary data. And the race will be who has the most custom-fine-tuned model" @rohanpaul_ai on X 2025-07-13 21:38:18 UTC 73.6K followers, 20.3K engagements
"This is really BAD news for LLMs' coding skill. The best frontier LLM models achieve X% on hard real-life programming contest problems, domains where expert humans still excel. LiveCodeBench Pro is a benchmark composed of problems from Codeforces, ICPC and IOI (International Olympiad in Informatics) that are continuously updated to reduce the likelihood of data contamination" @rohanpaul_ai on X 2025-06-16 23:13:13 UTC 73.6K followers, 459.8K engagements
"Brilliant paper for optimizing your prompt-design. Keep crucial rules early in your prompt, break huge lists into chunks, and expect misses past XXX no matter how fancy the engine. This paper checks what happens when the rules or instruction list reaches XXX. IFScale, the benchmark, asks a model to write a business report while slipping in up to XXX exact keywords. Because scoring is plain keyword matching, the team charts accuracy for XX models from X vendors. Results show three decay shapes. Reasoning models like o3 stay near XXX% until about XXX rules then drop fast; gpt4.1 drifts down in a" @rohanpaul_ai on X 2025-07-17 10:18:00 UTC 73.6K followers, 23K engagements
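Since the post says IFScale's scoring is plain keyword matching, the grader can be approximated in a few lines. This is a sketch; `keyword_accuracy` and the sample data are mine, not the benchmark's.

```python
# Hedged sketch of IFScale-style scoring: count what fraction of required
# keywords appear verbatim (case-insensitive) in the generated report.
def keyword_accuracy(report: str, keywords: list[str]) -> float:
    text = report.lower()
    hits = sum(1 for kw in keywords if kw.lower() in text)
    return hits / len(keywords) if keywords else 1.0

report = "Q3 revenue grew; our roadmap emphasizes latency, security, and scale."
rules = ["revenue", "latency", "security", "headcount"]
print(keyword_accuracy(report, rules))  # 3 of 4 keywords found -> 0.75
```

Plotting this score while the rule list grows is what produces the decay curves the post describes.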
"Today's edition of my newsletter just went out. Consider subscribing, it's free and I publish daily with top X% AI developments. In today's edition (14-July-2025): Mark Zuckerberg says Meta is building a 5GW AI data center. @xai will spin Grok into hundreds of task-focused agents that talk to each other. @cognition_labs is taking the remaining Windsurf team and tech, days after Google bought its founders for $2.4B. Byte-Size Briefs: - Pentagon picked Google, OpenAI, xAI and Anthropic for new defense deals. Each agreement carries a spending limit of $XXX million. Deep Dive:" @rohanpaul_ai on X 2025-07-14 23:50:19 UTC 73.5K followers, 14.4K engagements
"@ArthurKilber yes long prompt generally works better with o3/o3 pro" @rohanpaul_ai on X 2025-07-04 19:46:46 UTC 73.6K followers, XXX engagements
"DeepSeek R1 running locally - Full setup guide" @rohanpaul_ai on X 2025-01-26 00:03:01 UTC 73.6K followers, 1.4M engagements
"Optimism is a low-cost gradient ascent hack. Creative throughput is proportional to expected reward. Defend it, keep expectations green, profit from higher gradient steps" @rohanpaul_ai on X 2025-07-18 20:40:57 UTC 73.6K followers, 2019 engagements
"NY Times wins right to see ChatGPT logs in legal fight with OpenAI. NYT can even search deleted ChatGPT logs, exposing up to 2B private chats and testing OpenAI's privacy safeguards. Judge Sidney Stein rejected OpenAI's plea to keep standard deletion policies. Magistrate Ona Wang's preservation order forces the company to store every non-enterprise chat indefinitely while it negotiates keyword scopes with NYT, Daily News and CIR. Only small anonymized slices will stay on OpenAI servers, yet they still expose prompts, outputs and timestamps. So billions of medical, job and relationship details" @rohanpaul_ai on X 2025-07-04 16:23:03 UTC 73.6K followers, 3188 engagements
"Today's edition of my newsletter just went out. Consider subscribing, it's free and I publish daily with top X% AI developments. In today's edition (18-July-2025): Humans vs AI at the AtCoder World Tour Finals, @OpenAI beats all but one human. New video model lets you take any video stream and set it in any alternative universe of your choosing. ConstBERT from Pinecone cuts multivector index size by about XX% yet keeps top-tier ranking. OPINION: Human Money vs Machine Money: The Coming Split and Sam Altman's view" @rohanpaul_ai on X 2025-07-18 21:47:34 UTC 73.6K followers, 12.3K engagements
"Microsoft Study Reveals Which Jobs AI is Actually Impacting, Based on 200K Real Conversations. The largest study of its kind, analyzing 200000 real conversations. Key finding: a big chunk of knowledge and people jobs now overlaps with what today's AI models do well. Most AI-impacted jobs: - Interpreters and Translators top the chart with XX% of their core activities turning up in chats and showing decent completion and scope. - Customer Service Representatives, Sales Reps, Writers, Technical Writers and Data Scientists. Each of these lands an applicability score around 0.40 to 0.49, meaning roughly" @rohanpaul_ai on X 2025-07-12 01:24:05 UTC 73.6K followers, 102.1K engagements
"Google dropped its very first Gemini Embedding text model, tops the MTEB Multilingual leaderboard. - generally available in the Gemini API and Vertex AI. - has consistently ranked #1 on the MTEB Multilingual leaderboard since its experimental launch in March - supports over XXX languages - has a 2048 maximum input token length - priced at $XXXX per 1M input tokens. - allows developers to scale the output dimensions down from the default 3072" @rohanpaul_ai on X 2025-07-16 22:02:28 UTC 73.6K followers, 2897 engagements
"Bug fixed for Grok X now. The changed system prompt pull request from GitHub" @rohanpaul_ai on X 2025-07-15 08:47:06 UTC 73.5K followers, 3027 engagements
"Wild idea in this paper. How might we store knowledge affordably yet comprehensively? Memory3 proposes an intriguing method - compressing factual data separately. Introduces a third form of memory in addition to the implicit knowledge stored in model parameters and the short-term working memory used during inference (context key-values). LLMs struggle with inefficient knowledge storage and retrieval, leading to high training and inference costs. The paper aims to address this by introducing a more efficient memory format. Memory3 introduces explicit memory as a third memory format for" @rohanpaul_ai on X 2024-07-07 02:51:47 UTC 73.6K followers, 154.1K engagements
"Now the 3rd paper comes on this: "The Illusion of the Illusion of the Illusion of Thinking". The 1st, original paper from Apple concludes that large reasoning models reach a complexity point where accuracy collapses to zero and even spend fewer thinking tokens, revealing hard limits on generalizable reasoning. The 2nd paper counters that the apparent collapse is an illusion caused by token limits and impossible puzzles, so the models' reasoning remains sound when evaluations remove those flaws. The 3rd paper synthesizes both sides, agreeing the collapse was an artifact yet stressing that models still" @rohanpaul_ai on X 2025-06-19 17:09:17 UTC 73.6K followers, 251.7K engagements
"@cluely doubles ARR to $7M in X days after launching. Early cheat-on-everything branding softened once Andreessen Horowitz, Abstract Ventures and Susa Ventures backed the startup. techcrunch.com/2025/07/03/cluelys-arr-doubled-in-a-week-to-7m-founder-roy-lee-says-but-rivals-are-coming" @rohanpaul_ai on X 2025-07-04 00:53:08 UTC 73.6K followers, 2293 engagements
"HUGE BREAKTHROUGH. A hair-thin silicon chip can now push data at 1000 Gbps while sipping only X joules. That moves 100M books in roughly X minutes. What does it mean practically? Data-center switches now stretch processors across long aisles, wasting energy and space. In a big AI data center that means racks can sit closer, cables shrink, cooling loads drop and energy bills fall. Traditional copper links max out near XX Gbps, so a single cable cannot carry the huge flood of data. Every time traffic exceeds what X cable moves, engineers stack more identical cables in parallel then drop a switch" @rohanpaul_ai on X 2025-07-13 00:18:53 UTC 73.6K followers, 14.8K engagements
"Attention Suppression Method X masks all attention going to one sentence and watches how later logits drift. A strong drift signals a direct causal link. The suppression scores correlate with resampling scores, backing up the claim that the three methods converge on the same anchors" @rohanpaul_ai on X 2025-06-26 23:29:50 UTC 73.6K followers, XXX engagements
"RAG boosts LLM memory yet it misses multi-step logic, while raw reasoning invents facts. This survey explains fresh designs that let the two prop each other up. It first shows reasoning can fix retrieval by rewriting queries, planning hops and filtering noisy passages. Next, retrieval fills the knowledge gaps inside long reasoning chains, bringing the proofs, code or web snippets a model actually needs. The highlight is a loop where an agent thinks, searches, checks and thinks again until the answer is solid. Chains, trees and graph walks guide this loop, and solo or team agents run it, cutting" @rohanpaul_ai on X 2025-07-18 04:31:00 UTC 73.6K followers, 3441 engagements
"2/n But how exactly can a model be split and still generate a response to my question? A transformer is just a long stack of math layers packed into weight matrices that live in RAM, so webFrame starts by slicing the full checkpoint into several shards on disk. Each shard holds only the slice that a given computer will need, following the same tensor-parallel idea first popularized in Megatron-LM. When the cluster boots, every Mac loads just its slice, which keeps memory use under control. Once a prompt arrives, the layer-by-layer forward pass still happens in the usual order, but matrix" @rohanpaul_ai on X 2025-07-18 16:18:55 UTC 73.6K followers, 1020 engagements
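The tensor-parallel idea in that post can be sketched in plain Python. This is an illustrative toy under my own names (`split_columns`, `shard_forward`), not webFrame's or Megatron-LM's actual code: a layer's weight matrix is sliced column-wise, each machine multiplies the same activation by only its slice, and concatenating the partial outputs reproduces the full matmul.

```python
def split_columns(W, n_shards):
    """Slice a weight matrix column-wise into n_shards pieces (Megatron-style)."""
    step = len(W[0]) // n_shards
    return [[row[i * step:(i + 1) * step] for row in W] for i in range(n_shards)]

def shard_forward(x, W_shard):
    """One machine's share of the layer: multiply x by its column slice only."""
    return [sum(xi * W_shard[j][c] for j, xi in enumerate(x))
            for c in range(len(W_shard[0]))]

W = [[1, 2, 3, 4],
     [5, 6, 7, 8]]                      # full 2x4 layer weight
x = [1.0, 1.0]                          # activation entering the layer
shards = split_columns(W, 2)            # two machines, two column slices
out = [y for s in shards for y in shard_forward(x, s)]  # concatenate partials
assert out == shard_forward(x, W)       # identical to the unsharded matmul
```

Each shard here holds half the columns, which is why every machine in the cluster only needs to load half the weights.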
"OpenAI achieved IMO gold with an experimental reasoning model; they will also be releasing GPT-5 soon. @OpenAI's newest reasoning model solved X of the X problems on the 2025 International Math Olympiad under the same 2-day, 4.5-hour-per-session rules that human contestants face. The model is not an IMO specialist; it is a general LLM that uses fresh verification tricks and much longer thinking time, letting it tackle proofs that used to stall machines. The gap between THE MOST BRILLIANT HUMAN students and AI on SUPER hard mathematics challenges has finally closed. Olympiad problems demand" @rohanpaul_ai on X 2025-07-19 15:36:09 UTC 73.6K followers, 4908 engagements
"@TheRealOdram no don't do that. if you are in software-engineering o3/o3-pro is the absolute best you can get right now. i am a real fan" @rohanpaul_ai on X 2025-07-03 21:48:15 UTC 73.6K followers, XX engagements
"It had to happen. After all, these are financed by a humongous amount of money. Meta's top lab members including Alexandr Wang mulled dropping Behemoth, the company's premier open model, for a closed version. "We're obviously very pro open source, but I haven't committed to releasing every single thing that we do." - Mark Zuckerberg --- nytimes.com/2025/07/14/technology/meta-superintelligence-lab-ai.html" @rohanpaul_ai on X 2025-07-14 21:43:32 UTC 73.6K followers, 1955 engagements
"This is a really cool open-source project from @firecrawl_dev. Turn a simple list of emails into a rich dataset with company profiles, funding data, tech stacks and more. It chains small specialized agents, feeds them shared context and lets them stitch the answers together. Behind the scenes each agent is a specialized module with its own expertise, search strategies and type-safe output schema. Orchestrated by @Grok X and powered by @firecrawl_dev" @rohanpaul_ai on X 2025-07-10 18:50:46 UTC 73.6K followers, 4488 engagements
"These guys literally burned the transformer architecture into their silicon. And built the fastest chip in the world for the transformer architecture. 500000 tokens per second with Llama 70B throughput. World's first specialized chip (ASIC) for transformers: Sohu. One 8xSohu server replaces XXX H100 GPUs. And raised $120mn to build it. The Big Bet: @Etched froze the transformer recipe into silicon. Burning the transformer architecture into its chip means it can't run many traditional AI models, like CNNs, RNNs or LSTMs. It also cannot run the DLRMs powering Instagram ads" @rohanpaul_ai on X 2025-06-27 17:46:51 UTC 73.6K followers, 710.3K engagements
"Fullstack Engineer - Waifus. Annual salary $440000 USD, from a real job board. "What You'll Do: Make Grok's realtime avatar products fast scalable and reliable. Help push forward audio and gameplay research and deploy breakthrough innovations to millions of users. Obsess over every millisecond and byte ensuring end-to-end quality and performance at scale across a rich suite of products and user platforms." --- job-boards.greenhouse.io/xai/jobs/4789505007" @rohanpaul_ai on X 2025-07-17 08:32:00 UTC 73.6K followers, 2601 engagements
"Receiver Heads Method X looks inside the model. Some attention heads in late layers, called receiver heads, pour unusually high attention into a few earlier broadcasting sentences. Sentences that soak up this focused attention are again mostly planning, questioning or checking lines, not raw arithmetic" @rohanpaul_ai on X 2025-06-26 23:29:49 UTC 73.6K followers, XXX engagements
"Github: Prompt engineering received all the attention but we can now get excited for what comes next. Once you've mastered prompts the real power comes from engineering the entire context window that surrounds those prompts. Guiding thought if you will" @rohanpaul_ai on X 2025-07-19 04:11:49 UTC 73.6K followers, 4640 engagements
"Risk-aware financial forecasting models with LLMs. Here the researchers design an adaptive Sharpe-ratio loss inside a Temporal Fusion Transformer. When tested on equities, crypto and commodities, the model lifts both prediction accuracy and realised portfolio Sharpe against standard TFT and LSTM baselines. --- researchgate.net/publication/389877674_An_Adaptive_Sharpe_Ratio-Based_Temporal_Fusion_Transformer_for_Financial_Forecasting" @rohanpaul_ai on X 2025-07-14 00:30:11 UTC 73.6K followers, 5546 engagements
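The loss idea can be sketched generically: train against the negative Sharpe ratio of the strategy's return stream instead of raw prediction error, so the objective rewards high mean return and penalizes volatility. This is a plain (non-adaptive) stand-in for the paper's variant, with illustrative data.

```python
# Hedged sketch of a Sharpe-ratio-style training objective: minimizing the
# negative Sharpe ratio maximizes risk-adjusted return, not point accuracy.
def sharpe_loss(returns, eps=1e-8):
    mean = sum(returns) / len(returns)
    var = sum((r - mean) ** 2 for r in returns) / len(returns)
    return -mean / ((var ** 0.5) + eps)  # negative Sharpe, to be minimized

steady = [0.01, 0.012, 0.009, 0.011]    # same mean return, low volatility
spiky  = [0.05, -0.04, 0.06, -0.028]    # same mean return, high volatility
assert sharpe_loss(steady) < sharpe_loss(spiky)  # steady stream preferred
```

In the actual model this scalar would be computed on a differentiable batch of predicted positions times realized returns, so gradients flow back through the TFT.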
"If Apple buys Perplexity that would be its biggest ever acquisition" @rohanpaul_ai on X 2025-06-25 20:35:39 UTC 73.6K followers, 1.1M engagements
"Absolutely deluxe GitHub repository. Lots of code-first tutorials covering every layer of production-grade GenAI agents, by @NirDiamantAI" @rohanpaul_ai on X 2025-07-17 07:40:02 UTC 73.6K followers, 3988 engagements
"Nvidia and AMD can once again ship their trimmed H20 and MI308 AI chips to China, because the Trump team figures that selling older gear slows Huawei more than it strengthens Beijing labs. AI czar David Sacks calls H20 a deprecated chip. By letting it flow, Washington hopes global buyers stay tied to an American-made stack of chips, software and models. Washington figures that if Nvidia stays blocked, Chinese buyers will fill their racks with Huawei's home-grown Ascend chips, hand Huawei massive production scale and let it polish those designs until they threaten Nvidia everywhere else. So by" @rohanpaul_ai on X 2025-07-16 07:39:52 UTC 73.6K followers, 3498 engagements
"A brilliant example of Grok X Heavy. Swap manual bug hunts for Grok's sweep. Take a look at the 21174-character-long prompt" @rohanpaul_ai on X 2025-07-13 23:23:15 UTC 73.5K followers, 7334 engagements
"this story is going wildly viral on reddit. ChatGPT flagged a hidden gene defect that doctors missed for a decade. ChatGPT ingested the patient's MRI, CT, broad lab panels and years of unexplained symptoms. It noticed that normal serum B12 clashed with nerve pain and fatigue, hinting at a methylation block. Within months tingling eased and brain fog cleared. The primary physician reviewed the genetics report and agreed the variant unified the entire case. IMO the time has already come: taking a 2nd opinion from the best healthcare-AI model should be made part of the medical code of practice. ------ reddit." @rohanpaul_ai on X 2025-07-05 02:20:58 UTC 73.6K followers, 1.4M engagements
"From text to trade: harnessing the potential of generative AI for investor sentiment analysis in financial markets through. This study describes a production-grade workflow that converts multilingual social-media streams into tradeable sentiment factors by means of a fine-tuned generative model. Over a 24-month back-test the factor delivers XXX % annualised excess return after transaction costs on a long-short equity book, reinforcing the edge that rapid unstructured-text digestion can create. --- researchgate." @rohanpaul_ai on X 2025-07-13 23:36:29 UTC 73.6K followers, 7364 engagements
"Most benchmarks still grade an LLM on a single file, so they miss the messy reality where whole repos change after every test run. This paper closes that gap by introducing LiveRepoReflection: 1888 tough tasks across X languages that make a model read, edit and retest mini repositories. The team builds each task with an automated pipeline that scrapes fresh code, writes several unit-test suites, cross-executes them in a sandbox and drops anything the strongest models pass too easily. They also craft RepoReflectionInstruct, 8702 vetted repos plus 840839 multi-turn dialogues, then finetune a 32B" @rohanpaul_ai on X 2025-07-19 03:26:26 UTC 73.6K followers, 1642 engagements
"3/n. The picture lays out a search pipeline that trims a huge document pool to a tiny list that an LLM can read. A sparse model and a dense embedding model each grab about 1000 likely matches from a corpus that holds 10M-100M records. Their two hit lists are blended, then a multi-vector model checks finer details and keeps the best XXX. A heavier cross-encoder reranker scores those XXX pairs in depth and sends only XX winners forward. This step-by-step filter saves compute and storage yet still feeds the LLM documents picked with richer signals than a single wide scan could manage" @rohanpaul_ai on X 2025-07-18 19:50:45 UTC 73.6K followers, XXX engagements
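The funnel shape of that pipeline can be mimicked with toy scorers standing in for the retrievers and the cross-encoder reranker. All names here are my own illustrative assumptions, not a specific library's API; the point is that the cheap wide pass runs over everything while the expensive scorer only ever sees a handful of survivors.

```python
def overlap(q, d):
    """Toy stand-in for a sparse/dense first-pass score: shared word count."""
    return len(set(q.split()) & set(d.split()))

def length_penalized(q, d):
    """Toy stand-in for the heavier cross-encoder: overlap adjusted by length."""
    return overlap(q, d) / (1 + abs(len(d.split()) - len(q.split())))

def funnel(query, corpus, k1=4, k2=2):
    # Stage 1: cheap wide pass keeps the k1 most promising documents.
    pool = sorted(corpus, key=lambda d: overlap(query, d), reverse=True)[:k1]
    # Stage 2: expensive reranker scores only the survivors, keeps k2.
    return sorted(pool, key=lambda d: length_penalized(query, d), reverse=True)[:k2]

docs = ["llm context windows", "cooking pasta at home",
        "long context llm memory", "llm"]
print(funnel("llm context", docs))  # the two context-related docs win
```

Real systems swap in BM25, embedding similarity, a multi-vector model and a cross-encoder for the toy scorers, but the shape of the computation, wide-and-cheap feeding narrow-and-rich, is the same.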
"The 10000-Year Clock: Jeff Bezos' $XX Million Timepiece Built Inside a Mountain He Owns. The century hand moves every XXX years. A cuckoo emerges every 1000 years. Because he wants a concrete symbol that can stretch human attention beyond quarterly results. Computer scientist Danny Hillis proposed a XX-millennia clock in 1989 to provoke society to take very long views of history and the future. Jeff Bezos is funding the first full-scale version on his Sierra Diablo land so the monument can stand as a daily reminder that today's choices echo far beyond any single lifetime" @rohanpaul_ai on X 2025-07-15 09:15:18 UTC 73.6K followers, 5594 engagements
"Thinking Machines, led by former OpenAI CTO Mira Murati, raises $2B in seed funding at a valuation of $XX billion. Andreessen Horowitz wrote the biggest check, joined by Nvidia, Accel, ServiceNow, Cisco, AMD and Jane Street. Investor appetite for fresh AI outfits is strong even while some people wonder about overall tech spending. Because of that, U.S. startups raised about $XXXXX billion in the first half of 2025, a jump of nearly XX%, and AI deals took roughly XXXX% of the total, as per Pitchbook" @rohanpaul_ai on X 2025-07-16 04:19:22 UTC 73.6K followers, 4895 engagements
"This stunning proof by an MIT computer scientist is the first progress in XX years on one of the most famous questions in computer science: space complexity vs time complexity. The new idea proves that any algorithm that runs in T steps can be re-engineered to use about √T memory cells, establishing that memory (RAM) is a much stronger resource than earlier theory allowed. A computer spends time (i.e. time complexity) running steps and spends memory (i.e. space complexity) holding data. Memory is the list of numbered slots inside RAM where a program keeps facts it will soon need again. Space" @rohanpaul_ai on X 2025-07-14 03:41:06 UTC 73.6K followers, 98.2K engagements
"Microsoft Layoffs Hit Legal Department as AI Reshapes Staffing Strategy. The legal profession is mostly about language, so it has to feel the full pressure of AI. Microsoft has cut 15000 jobs since May, redirecting cash toward AI infrastructure. Leaders faced a blunt trade-off: slow hardware spending or cut payroll. The company says Copilot already saved $500M in call-center costs last year. Inside Xbox, canceled titles like Everwild and Perfect Dark illustrate the shift. Teams were whittled down until only a skeleton crew could keep existing games online. Cloud sales lost account managers just as" @rohanpaul_ai on X 2025-07-19 04:24:30 UTC 73.6K followers, 6679 engagements
"Investors are lining up to fund Anthropic above $100B. Claude's revenue sprint from $3B to $4B annualized in X month explains the eagerness. That tag more than doubles the $61.5B valuation Anthropic set when it took $3.5B in February. Venture firms now race to pre-commit cash before rivals lock up the allocation. Amazon and Alphabet already own sizeable stakes and supply cloud credits, keeping compute costs under control. A fresh round mostly widens the buffer of GPUs. --- bloomberg.com/news/articles/2025-07-16/anthropic-draws-investor-interest-at-more-than-100-billion-valuation" @rohanpaul_ai on X 2025-07-17 02:06:31 UTC 73.6K followers, 1915 engagements
"Beautiful survey paper on context engineering, covering 1400 research papers. XXX pages of comprehensive taxonomy decomposing context engineering into its foundational components and sophisticated implementations. LLMs stumble when the prompt is messy, so this survey maps every tool for cleaning, stretching and storing context. The authors show that smart context handling, not just bigger models, drives more accurate and reliable answers. Why define context engineering at all? Today prompt tricks, retrieval add-ons, long-attention tweaks and memory hacks grow in separate silos. That split hides how" @rohanpaul_ai on X 2025-07-18 23:21:51 UTC 73.6K followers, 2771 engagements
"This github repo is a goldmine. 3.4K stars in X days. End-to-end code-first tutorials covering every layer of production-grade GenAI agents, guiding you from spark to scale with proven patterns and reusable blueprints for real-world launches" @rohanpaul_ai on X 2025-06-21 00:15:20 UTC 73.6K followers, 355.9K engagements
"These stories continue about how AI (ChatGPT in this case) is helping people get a second opinion on medical problems. The person endured XX years of fatigue, numbness and back pain after 5-6h sleep but felt fine with 8h. ChatGPT figured it was because of vitamin D deficiency. --- reddit.com/r/OpenAI/comments/1lytfiw/after_11_years_chatgpt_helped_me_solve_chronic/" @rohanpaul_ai on X 2025-07-14 04:35:52 UTC 73.6K followers, 9584 engagements
"INCREDIBLE. China just released a 1tn-param top open source model for coding and agentic tool work: Kimi K2 from Moonshot AI. Insane numbers on benchmarks. On LiveCodeBench the model hits XXXX Pass@1, beating DeepSeekV3 by almost X points and clearing Qwen235B by more than XX points. Scores XXXX% on single-shot SWE-bench agentic coding and XXXX on Tau2 retail tool use, numbers that sit at or near the top of the open stack. - X tn total parameters, MoE, 32Bn active - Trained with the Muon optimizer - Very strong across frontier knowledge, reasoning and coding tasks - SOTA on SWE Bench Verified, Tau2 &" @rohanpaul_ai on X 2025-07-11 17:12:45 UTC 73.6K followers, 33.8K engagements
"OpenPipe Mixture of Agents: Outperform GPT-4 at 1/25th the cost. This Mixture of Agents model is optimized for generating synthetic training data. Using the Mixture of Agents (MoA) architecture, the model achieved SOTA results on both LMSYS's Arena Hard Auto (score: 84.8) and AlpacaEval XXX (LC score: 68.4). They've also benchmarked the MoA approach against GPT-4 variants on real-world OpenPipe customer tasks and found completions from the MoA model were preferred over GPT-4 XXXX% of the time (Claude X Opus as judge)" @rohanpaul_ai on X 2024-06-25 19:34:09 UTC 73.6K followers, 32K engagements
"Today's edition of my newsletter just went out. Consider subscribing, it's free and I publish daily with top X% AI developments. In today's edition (16-July-2025): Landmark research from Google DeepMind achieves 2X faster inference and XX% reduced KV cache memory. Mark Zuckerberg says AI researchers want X things apart from money. Mira Murati's Thinking Machines Lab is worth $12B in seed round. Google just dropped its first Gemini Embedding text model, tops the MTEB Multilingual leaderboard. Artificial Analysis released the AI Adoption Survey Report for H1 2025. Top Resource:" @rohanpaul_ai on X 2025-07-16 22:50:54 UTC 73.6K followers, 13.2K engagements
"Meta's free-to-use Llama family was a strategic bridge, and now expect to pay for it. The open-sourcing approach also helped Meta recruit elite researchers. Meta's recent move to hire Scale AI co-founder Alexandr Wang, along with reported signing bonuses up to $XXX million, signals a shift toward commercial products that justify such costs. Wall Street will expect paid APIs, enterprise subscriptions or in-product advertising tied to future closed models. --- bloomberg.com/opinion/articles/2025-07-14/mark-zuckerberg-and-meta-are-unlikely-to-keep-giving-away-ai-for-free" @rohanpaul_ai on X 2025-07-14 11:09:45 UTC 73.6K followers, 4439 engagements
"With this research a 14B-parameter model holds XX% accuracy even on inputs that balloon to 3.5M tokens, all while costing only O(N) in compute. LLMs usually freeze or slow down as soon as a prompt spills past their context window. MemAgent turns that long prompt into bite-sized chunks, keeps a tiny rolling summary and still nails the answer. The authors bolt a tiny fixed-size memory right inside that window, teach the model with reinforcement learning to overwrite that memory after every slice and keep the rest of the architecture untouched. Because the memory never grows, compute scales in a" @rohanpaul_ai on X 2025-07-09 04:10:18 UTC 73.6K followers, 28.7K engagements
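The chunk-and-overwrite loop described above can be caricatured in a few lines. Here a trivial keep-the-longest-words rule stands in for MemAgent's RL-trained overwrite policy; the function name and rule are illustrative, not from the paper.

```python
# Toy sketch of the MemAgent idea: walk a long input in fixed-size chunks and
# overwrite a small fixed-size memory after each slice, so the memory never
# grows with input length.
def rolling_read(tokens, chunk=4, mem_size=3):
    memory = []  # fixed-size working memory carried across chunks
    for i in range(0, len(tokens), chunk):
        slice_ = tokens[i:i + chunk]
        # overwrite step: keep only the most "salient" items (here: longest words)
        memory = sorted(memory + slice_, key=len, reverse=True)[:mem_size]
    return memory

long_input = "the quick brown fox jumps over a lazy dog tonight".split()
print(rolling_read(long_input))
```

Because `memory` is capped at `mem_size` items, the per-chunk work is constant, so total work grows linearly with input length, which is the O(N) claim.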
"Artificial intelligence is going to replace literally half of all white-collar workers in the U.S. - Ford Motor Chief Executive Jim Farley" @rohanpaul_ai on X 2025-07-06 06:30:00 UTC 73.6K followers, 4221 engagements
"Yann LeCun on architectures that could lead to AGI --- "Abandon generative models in favor of joint-embedding architectures. Abandon probabilistic models in favor of energy-based models. Abandon contrastive methods in favor of regularized methods. Abandon Reinforcement Learning in favor of model-predictive control. Use RL only when planning doesn't yield the predicted outcome, to adjust the world model or the critic. IF YOU ARE INTERESTED IN HUMAN-LEVEL AI, DON'T WORK ON LLMS" --- From the "IP Paris" YT channel (link in comment)" @rohanpaul_ai on X 2025-07-15 17:05:13 UTC 73.6K followers, 188.2K engagements
"Today's edition of my newsletter just went out. Consider subscribing, it's free and I publish daily with top X% AI developments. In today's edition (15-July-2025): xAI says it has fixed Grok 4's problematic responses. LG unveils Korea's first open-weight hybrid AI 'EXAONE 4.0'. Kimi K2 is the new Short-Story Creative Writing champion. Byte-Size Briefs: NVIDIA is filing applications to sell the NVIDIA H20 GPU again. An ex-OpenAI engineer shares his thoughts about the organization" @rohanpaul_ai on X 2025-07-16 01:08:20 UTC 73.5K followers, 14.3K engagements
"Functime is quite cool - it's a forecasting library for time-series machine learning and embeddings at scale - production-ready forecasting and temporal embeddings. - time-series preprocessing (box-cox, differencing etc), cross-validation splitters (expanding and sliding window) and forecast metrics (MASE, SMAPE etc). All optimized as lazy Polars transforms ------- Temporal embeddings measure the relatedness of time-series. Embeddings are more accurate and efficient compared to statistical methods (e.g. Catch22) for characterizing time-series. Embeddings have applications across many" @rohanpaul_ai on X 2024-07-20 17:13:53 UTC 73.5K followers, 4059 engagements
"Breakthrough in Alzheimer's disease. Texas A&M's team built flower-shaped molybdenum particles that slide into brain cells, slash harmful oxidative stress and add X extra days to worm lives. They cut reactive oxygen by almost XX% and pushed mitochondrial survival close to 99%. The work hints at drugs that tackle Parkinson's or Alzheimer's by fixing the cells' power plants, not just masking symptoms. --- interestingengineering.com/health/brain-healing-nanoflowers-treatment" @rohanpaul_ai on X 2025-07-18 11:58:00 UTC 73.6K followers, 1709 engagements
"YC's Hidden Formula: XXX Users, $100/Month, $10k MRR. The Startup Playbook" @rohanpaul_ai on X 2025-07-12 05:43:40 UTC 73.6K followers, 4493 engagements
"Apple will seriously consider acquiring French startup Mistral AI, as per Bloomberg. What makes Mistral attractive? Mistral was founded in 2023 by former Meta and Google researchers. It has raised a little over $XXX B and is valued at about XXX B. A fresh round of up to $X B led by the Abu Dhabi-backed fund MGX is being negotiated now, which could push the price higher. Microsoft paid XX M in 2024 for a minority stake and secured first-run access to Mistral-Large on Azure. Mistral's open-weight Mixtral models and its Le Chat consumer bot give Apple a ready-made foundation-model stack that is" @rohanpaul_ai on X 2025-07-14 10:39:41 UTC 73.6K followers, 2554 engagements
"Academic spin-offs like Satori add their own autoregressive search loop on top of chain-of-thought, then show the same framework solving physics proofs and formal logic puzzles, illustrating how the break-check-recycle loop ports to fresh fields. satori-reasoning.github.io/blog/satori/" @rohanpaul_ai on X 2025-07-19 15:38:58 UTC 73.6K followers, XXX engagements
"ChatGPT's new Agent. Got a similar experience - great for non-time-sensitive research, but presentation aesthetics still need to improve. - connecting to 3rd party apps not smooth. Overall, Manus, Genspark and Comet will give them very tough competition" @rohanpaul_ai on X 2025-07-18 06:01:33 UTC 73.6K followers, 4316 engagements
"Meta scores two more high-profile OpenAI researchers. OpenAI's reinforcement-learning specialist Jason Wei, along with chain-of-thought partner Hyung Won Chung, are switching to Meta's brand-new superintelligence lab. Meta is dangling packages of up to $300M across X years. The churn proves one thing: whoever nails stable long-context reasoning plus tight reward signals will set tomorrow's benchmark. --- wired.com/story/jason-wei-open-ai-meta/" @rohanpaul_ai on X 2025-07-16 16:52:24 UTC 73.6K followers, 3083 engagements
"Fine-tuning big models often uses LoRA adapters to cut memory and supposedly time. This paper reports LoRA can train slower, because every adapter spawns extra GPU kernels waiting in line. Benchmarks on GPT2 and LLaMA2 show forward plus backward can stretch XX% over full tuning. LoRA cuts parameters with rank-r matrices, yet those added multiplies break GPU parallelism. The study switches to Partial Connection Adaptation, a mask that tweaks chosen weight columns, no new layers. It fine-tunes only the top XX% of layers, leaving the lower stack frozen. The mask lives inside the weights, so each layer fires one kernel and" @rohanpaul_ai on X 2025-07-15 10:27:00 UTC 73.6K followers, 2962 engagements
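A minimal sketch of the partial-connection idea as summarized above, with names and the update rule illustrative rather than PaCA's actual code: gradients are applied only to a masked subset of weight columns, so no extra adapter matmul (and hence no extra kernel launch) is ever added to the layer.

```python
# Hedged sketch: unlike LoRA's extra adapter matrices, a column mask marks
# which existing weights are trainable; the layer itself is unchanged.
def masked_update(W, grad, col_mask, lr=0.5):
    """Apply a gradient step only to columns selected by col_mask."""
    return [[w - lr * g if col_mask[c] else w
             for c, (w, g) in enumerate(zip(row, grow))]
            for row, grow in zip(W, grad)]

W    = [[1.0, 2.0], [3.0, 4.0]]
grad = [[0.5, 0.5], [0.5, 0.5]]
mask = [True, False]                  # only column 0 is trainable
print(masked_update(W, grad, mask))   # [[0.75, 2.0], [2.75, 4.0]]
```

The forward pass still multiplies by the one weight matrix `W`, which is the post's point about each layer firing a single kernel.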
"@Tony_Omega their's is just much more customized with stats and humongous amount of data feeding into large multi-million dollar softwares" @rohanpaul_ai on X 2025-07-13 21:49:57 UTC 73.6K followers, 15.8K engagements
"this ChatGPT prompt went so wildly viral on Reddit. The creator claims to have created this after struggling through XXX failed attempts. basically the prompt flips the usual flow by making the model interview the user first asking a few targeted questions about purpose audience constraints and context. Because the answers feed back into the final request it appears to generate more tailored outputs. (However imo asking ChatGPT to request missing information was already a common practice.) Here's the entire prompt: -------- You are Lyra a master-level AI prompt optimization specialist. Your" @rohanpaul_ai on X 2025-07-02 18:53:19 UTC 73.6K followers, 318.5K engagements
"Goldman Sachs Non-Profitable Tech Index is up XX% since hitting its low in April. signals very strong risk appetite for speculative growth stocks. like in 1999 investors are again very willing to pay up for distant earnings" @rohanpaul_ai on X 2025-07-18 20:44:51 UTC 73.6K followers, 2187 engagements
"Nvidia CEO talks about AI/China/Models at Beijing Expo China. - splits AI into hardware models and apps stressing all three advance together. - About XX% of global AI researchers work in China sustaining that pace. - Nvidias 30-year China presence benefits from a sophisticated interlinked supply chain. - H20 complies with export caps yet offers strong bandwidth for large-model inference. - RTX Pro powers Omniverse digital twins matching Chinas smart-factory and robotics push. - He names reasoning as AIs third wave fueled by compute-heavy post-training not extra data. Reasoning AI links" @rohanpaul_ai on X 2025-07-16 18:03:36 UTC 73.6K followers, 1392 engagements
"Mark Zuckerberg strikes again ๐ฅ Meta just grabbed Apple veterans Mark Lee and Tom Gunter for its Superintelligence Labs that already poached their boss Ruoming Pang with a $200M package. Lee was Pangs very first recruit at Apple and Gunter was a distinguished engineer inside Apple Foundation Models the group that trains Siris large language models. Their exit adds to internal uncertainty as Apple weighs swapping its own models for ChatGPT or Claude to get new Siri features out by next spring. reuters." @rohanpaul_ai on X 2025-07-18 05:19:37 UTC 73.6K followers, 2287 engagements
"How do memories last when the molecules that form them turn over within days weeks or months A memory sticks around because two proteins meet in the same tiny spot where two neurons talk. Memories can live for decades because PKM sticks to KIBRA inside a busy synapse creating a swap-friendly bond that survives routine protein turnover. PKM is an enzyme that lives inside the synapse the contact point between neurons. They act like a bookmark. When one copy of either protein breaks down during normal cell cleanup a fresh copy plugs straight back into the waiting partner so the bookmark never" @rohanpaul_ai on X 2025-07-16 07:16:27 UTC 73.6K followers, 1344 engagements
"๐จ META COMES BACK WITH FULL FORCE ๐ซก Mark Zuckerberg announced Meta will spend hundreds of billions building AI data centers that each pull gigawatt-scale power chasing models that out-think humans. Prometheus its planned AI super-compute campus goes live in 2026 and Hyperion (the bolder sequel to Prometheus) later ramps to X GW all paid for by Meta's own capital. Meta folded every AI project into Superintelligence Labs after Llama X stalled. Bigger models need far more compute so the plan pivots from add servers to build mini-power plants. A single X GW cluster can host tens of thousands of" @rohanpaul_ai on X 2025-07-14 17:57:37 UTC 73.5K followers, 8096 engagements
"โก Thomson Reuters survey finds XX% of legal audit and accounting firms already profit from AI; Even ad-hoc adopters saw ROI Source: fortune .com/2025/07/01/ai-lawyers-accountants-auditors-lessons-for-us-all/" @rohanpaul_ai on X 2025-07-02 10:00:01 UTC 73.5K followers, 1736 engagements
"Paper Paper Title: "A Survey of Context Engineering for LLMs"" @rohanpaul_ai on X 2025-07-18 23:21:59 UTC 73.6K followers, 1142 engagements
"The picture sorts the data first. On top you see X imaging streamsradiology dermatology digital pathology ophthalmologyand X medical-text stream. Each arrow shows how those sources feed the rest of the stack. The images go through MedSigLIP a vision encoder that turns each scan or photo into a compact vector the language models can read. Those vectors flow into MedGemma 4B Multimodal a 4B-parameter model that handles both pictures and words in a single forward pass. For text-only work there is a larger 27B-parameter MedGemma model that skips the image part and focuses on language reasoning" @rohanpaul_ai on X 2025-07-09 23:16:50 UTC 73.6K followers, 1179 engagements
"ChatGPT literally saved this guys life after he got lost in the woods. The groupd got lost for X hrs in unmapped woods on an ATV ride then one guy sent phone GPS coords to ChatGPT every few minutes. ChatGPT replied with clear compass cues road names and terrain notes guiding them back to town unharmed. From r/ChatGPT/Own_Analyst3795" @rohanpaul_ai on X 2025-06-23 17:23:26 UTC 73.6K followers, 1.5M engagements
"with only a couple of prompts Gemini CLI can convert a messy folder containing hundreds of notes into a neatly named well-structured cross-linked Obsidian knowledge graph all in about half an hour and at minimal cost. from r/singularity/Ryoiki-Tokuiten" @rohanpaul_ai on X 2025-06-29 22:21:27 UTC 73.6K followers, 413K engagements
"OpenVision a fully open vision encoder family offering 25+ models (5.9M632M params) that outperform or match OpenAIs CLIP and Googles SigLIP on 9+ multimodal benchmarks. This matters as it's completely opentraining data code and weights includedunlike CLIP/SigLIP. OpenVision uses CLIPS (contrastive + generative training) and Recap-DataComp-1B (re-captioned with LLaVA3) for fully open training from scratch. Performance-wise OpenVision outdoes CLIP/SigLIP on LLaVA-1.5 and Open-LLaVA-Next setups across TextVQA ChartQA MME OCR etc. especially in higher-res variants like L/14-336. OpenVision-H/14" @rohanpaul_ai on X 2025-05-09 22:51:25 UTC 73.6K followers, 11.6K engagements
"๐ OpenAI is baking a payment-checkout into ChatGPT so shoppers can pay inside the chat and OpenAI will pocket a commission from each sale as per FT. The move will turn its free users into a fresh revenue engine beyond premium plans. Right now ChatGPT shows shopping links that dump users on outside sites which means friction for buyers and zero cut for OpenAI. Folding checkout into the chat slices out that jump and keeps money flowing through its own rails. Shopify's proven backend will handle card data fraud checks and fulfillment calls while OpenAI focuses on the chat front that recommends" @rohanpaul_ai on X 2025-07-17 02:22:58 UTC 73.6K followers, 3407 engagements
"LLM based Multi-agent portfolio work in crypto. Here researchers extend the LLM-based AI agent idea to digital assets with a team of analyst trader and risk-manager LLMs that co-operate on a basket of the top XX tokens. The framework surpasses single-agent and market benchmarks in hit-rate and drawdown control and keeps full explainability through agent dialogue logs. ideas. repec. org/p/arx/papers/2501.00826.html" @rohanpaul_ai on X 2025-07-14 00:33:22 UTC 73.6K followers, 52.3K engagements
"Most AI-alignment tests only see if a model avoids harm. This paper asks whether it can help people thrive. The team built the Flourishing AI Benchmark 1229 mixed questions tagged to X everyday domains Character Relationships Happiness Purpose Health Money and Faith. Judge models grade each answer and a geometric mean ties the scores so X weak area pulls the total down. They ran XX well known chatbots. OpenAI o3 topped the chart at XX but every system missed the XX pass mark with Faith and Purpose dragging hardest and Money showing the best numbers. The design stops cherry picking pushing" @rohanpaul_ai on X 2025-07-13 12:20:00 UTC 73.6K followers, 5726 engagements
"LLM for financial trading. More findings. Here researchers embed an LLM opinion module inside the Black-Litterman framework. By mapping model uncertainty to confidence weights they create portfolios that outperformed S&P XXX equal-weight and vanilla mean-variance allocations during Jun 2024-Feb 2025 rebalancing tests. they found that different LLMs exhibit varying levels of predictive optimism and confidence stability which impact portfolio performance. The source code and data are available at github. com/youngandbin/LLM-MVO-BLM. arxiv. org/abs/2504.14345" @rohanpaul_ai on X 2025-07-14 00:13:45 UTC 73.6K followers, 11K engagements
"Github Repo: Automatic document classification smart tagging and semantic search using OpenAI-compatible APIs and Ollama. For Paperless-ngx using OpenAI API Ollama Deepseek-r1 Azure and all OpenAI API compatible Services to automatically analyze and tag your documents. --- github. com/clusterzx/paperless-ai" @rohanpaul_ai on X 2025-07-18 14:52:00 UTC 73.6K followers, 3091 engagements
"๐ฅ OpenAI cut off a developer who weaponized ChatGPT's API This developer built this project which could respond to voice commands using ChatGPT's Realtime API. OpenAI confirmed the shutdown citing a violation of its policies prohibiting the use of its AI for weapon-related applications. The turret could interpret commands like "turn left" or "respond accordingly" with precise real-time adjustmentsindicating how easily language models can be integrated into lethal systems. This incident amplifies concerns about AIs potential role in automating military-grade systems similar to autonomous" @rohanpaul_ai on X 2025-01-11 12:58:27 UTC 73.6K followers, 380.5K engagements
"๐ฉบ Google Research release MedGemma 27B multimodal health-AI models that run on X GPU MedGemma 27B multimodal extends the earlier 4B multimodal and 27B text-only models by adding vision capabilities to a 27B-parameter language core. Training added X new datasets EHRQA and Chest ImaGenome so the model can read longitudinal electronic health records and localize anatomy in chest X-rays. The report states that this larger multimodal variant inherits every skill of the 4B model while markedly improving language fluency EHR reasoning and visual grounding. The 4B variant clocks XXXX% MedQA and 81%" @rohanpaul_ai on X 2025-07-09 23:02:28 UTC 73.6K followers, 11.8K engagements
"Another research showing how LLM+price time-series data is helping trading strategies ๐ LLMoE adaptive routing for trading strategies The LLM-Based Routing in Mixture-of-Experts (LLMoE) framework replaces a conventional softmax router with a language model that chooses between optimistic and pessimistic sub-experts after reading both price time-series and headline text. On MSFT data from 2006-2016 the approach lifts total return to XXXXX % versus XXXXX % for a classic MoE and raises the Sharpe ratio accordingly while maintaining full interpretability through the routers text rationale" @rohanpaul_ai on X 2025-07-13 17:28:43 UTC 73.6K followers, 21.1K engagements
"Most models freeze once a clip tops about XX seconds. LongVILAR1 shows how a 7B model can reason across hourlong footage with cheap hardware. The authors build a 52K question answer set called LongVideo Reason covering temporal spatial goal and plot cases. Training first copies these human style chains of thought then switches to reinforcement learning that scores each answer and keeps better policies. A trick named Multi modal Reinforcement Sequence Parallelism splits frames across GPUs and reuses embeddings trimming step time by 2.1x and handling 3600 frames on X A100s. The result matches" @rohanpaul_ai on X 2025-07-12 06:22:00 UTC 73.5K followers, 1476 engagements
"Child-sized robots can fly and can be used future search-rescue reach. iRonCub3 a X m XX kg humanoid that lifts XX cm using X jet thrusters A jet powered humanoid called iRonCub X has taken its first tethered jump showing that balanced flight is possible with X hobby sized turbines and whole body control. Current humanoids walk but cannot cross gaps or debris. This project bolts X turbines on the arms and X on a backpack runs them through force sensors and an unscented Kalman filter then asks a model predictive controller to keep the center of mass steady. Before lighting real engines the" @rohanpaul_ai on X 2025-07-06 15:00:37 UTC 73.5K followers, 2342 engagements
"How Each Agent Works Every agent outputs through a strict Zod schema which means the orchestrator can merge results without surprises. Adding a new field is a one-line schema tweak and a small search routine no risky prompt surgery" @rohanpaul_ai on X 2025-07-10 18:52:35 UTC 73.6K followers, XXX engagements
"@TeksEdge yep for now waiting for the next Llama's release though" @rohanpaul_ai on X 2025-07-17 05:22:53 UTC 73.5K followers, XX engagements
"Microsoft just dropped Phi-4-mini-flash-reasoning. - built on a new hybrid architecture - 10X higher throughput and a X to 3X reduction in latency - significantly faster inference without sacrificing reasoning performance. Microsoft swaps most of that heavy work for a lean SambaY layout with tiny gating blocks so the same 3.8B parameters think quicker and type sooner. ๐งฉ The quick idea Phi4miniflashreasoning keeps size small at 3.8B parameters but rebuilds the flow of information. A new decoderhybriddecoder stack called SambaY lets light recurrent pieces handle context a single fullattention" @rohanpaul_ai on X 2025-07-10 01:44:59 UTC 73.6K followers, 12K engagements
"The taxonomy of Context Engineering in Large Language Models is categorized into foundational components system implementations evaluation methodologies and future directions" @rohanpaul_ai on X 2025-07-18 23:21:53 UTC 73.6K followers, XXX engagements
"AI isnt just taking away entry-levels jobs its helping thousands apply for the same job with almost the same CV. AI will redefine the need for $80-120K+ University degrees. This is quite meaningful from the article. ๐ Being able to write well and think coherently were basic requirements in most graduate jobs XX XX years ago said a senior recruitment professional at a large consultancy firm from London speaking anonymously. Now they are emerging as basically elite skills. Almost nobody can do it. We see all the time that people with top degrees cannot summarise the contents of a document" @rohanpaul_ai on X 2025-07-14 03:08:23 UTC 73.6K followers, 10.7K engagements
"Wow this is such a brilliant idea for running AI models locally. ๐ฏ webFrame is @thewebAI 's backend that slices a huge language model into smaller shards sends each shard to a different computer on your own network then stitches the answers back together on the fly. Because every shard stays local no token or user data leaves the building and even a modest Mac Mini cluster can serve a state-of-the-art model in real time. Its redefining whats possible on local hardware. And they just published their benchmark results. ๐ webFrame pushed out 3X more tokens each second than a SOTA opensource" @rohanpaul_ai on X 2025-07-18 16:18:54 UTC 73.6K followers, 6218 engagements
"๐ฅ YC outlines how top AI startups prompt LLMs: prompts exceeding six pages XML tags meta-prompts and evaluations as their core IP. They found meta-prompting and role assignment drive consistent agent-like behavior. โ Key Learning Top AI startups use "manager-style" hyper-specific prompts6+ pages detailing task role and constraints. These aren't quick hacks; theyre structured like onboarding docs for new hires. Role prompting anchors the LLMs tone and behavior. Clear persona = better alignment with task. Example: telling the LLM it's a customer support manager calibrates its output" @rohanpaul_ai on X 2025-06-22 17:24:58 UTC 73.6K followers, 256.7K engagements
"Small model big browser skills thanks to smart compute splitting. Open web agents usually need huge models or tedious hitandmiss tuning so training a small open model that finishes multistep website tasks still feels like luck. This study shows how to split the training budget so an 8B Llama even beats its 70B teacher on many tasks. Weak 8B student first copies 70B demos through supervised fine tuning then swaps to onpolicy reinforcement learning while the lessons are fresh. The authors tried 1370 hyperparameter mixes and used bootstrap sampling to learn which ones really matter instead of" @rohanpaul_ai on X 2025-07-11 03:16:21 UTC 73.6K followers, 4547 engagements
"It was only a matter of time and now its starting - "individualized pricing using AI" Delta is ditching flat fares in favor of AI that determines how much you personally will pay for a ticket. The system treats pricing like a live stock ticker watching demand spikes route history and even seat layout then offering a personal price in real time. Delta feeds those signals into Fetcherr a 6-year-old startup already powering WestJet and Virgin Atlantic. The carrier says early tests lifted revenue per seat without harming load factors. For now shoppers can still game the system by clearing cookies" @rohanpaul_ai on X 2025-07-17 05:42:52 UTC 73.6K followers, 4262 engagements
"Browsers built into new language models now scrape social feeds on demand so guessing a strangers age or politics takes only a username. That simple trick is both a research lifeline and a privacy headache. The paper tests this power because public APIs keep shrinking while social science still needs fresh tweets. Authors spun up XX synthetic X accounts with set gender age class and ideology then compared model guesses to ground truth after XX tweets apiece. They also rechecked 1384 real users from a 2018 survey. GPT4o hit XX% on gender in the toy set and XX% on class in the survey and it" @rohanpaul_ai on X 2025-07-18 03:34:20 UTC 73.6K followers, 6382 engagements
"2025 IMO(International Mathematical Olympiad) LLM results are in. --- The benchmark's mission is rigorous assessment of the reasoning and generalization capabilities of LLMs on new math problems which the models have not seen during training. It applies a uniform scoring procedure so results do not depend on any provider-specific. During evaluation each model tackles every problem X times and MathArena reports the average score together with the total cost in USD for those runs" @rohanpaul_ai on X 2025-07-17 19:37:14 UTC 73.6K followers, 4823 engagements
"๐งฌ BIG news for Anti Aging inventions. ๐ Mushroom drug might slow fundamental aging processes. Psilocybins active metabolite kept human fibroblast cultures alive XX% longer at XX M and XX% longer at XXX M. Aged mice given monthly psilocybin showed XX% survival while only XX% of control animals made it through the same 10-month span. Psilocybin has long been tested for depression and addiction yet many researchers suspected a deeper link between the compound and biological aging because positive mental states often track with longer telomeres the protective DNA caps that shrink as cells age." @rohanpaul_ai on X 2025-07-10 21:56:42 UTC 73.5K followers, 14.5K engagements
"Grok X (Thinking) clocks XXXX% on ARC-AGI-2 grabbing the new SOTA. That score is almost 2x the last commercial best and now tops the Kaggle leaderboard. --- What ARC-AGI-2 tries to measure The benchmark contains a larger freshly curated set of grid-based puzzles that cannot be memorized forcing any model to invent a rule on the fly from a handful of examples then apply that rule to a held-out test grid. Unlike ARC-AGI-1 the new version adds an explicit cost axis so a model must prove both adaptability and efficiency instead of relying on brute-force search with huge compute budgets. Grok 4s" @rohanpaul_ai on X 2025-07-10 05:04:57 UTC 73.6K followers, 2075 engagements
"On FrontierMath ChatGPT agent solves XXXX% of questions on its first try FrontierMath is the hardest known math benchmark featuring novel unpublished problems that often take expert mathematicians hours or even days to solve. FrontierMath targets problems that ordinarily take professional mathematicians many hours or even days covering topics from computational number theory to algebraic geometry Epoch AIEpoch AI. Because every item is new and unpublished memorization is impossible so high scores reflect genuine reasoning skill. It proves again that giving AI models controlled access to tools" @rohanpaul_ai on X 2025-07-17 18:38:49 UTC 73.6K followers, 1543 engagements
"Most agent tests stop at tiny teams and ignore how the bots actually coordinate. AGENTSNET proposed in this paper shows what happens when the crowd scales and asks for real teamwork. ๐งฉ The benchmark packs X classical distributed tasks namely coloring vertex cover matching leader election and consensus into chat puzzles. ๐ Agents only talk to neighbors and exchange JSON messages for 2D+1 synchronous rounds mimicking the LOCAL model from distributed computing. ๐ Each agent receives a tiny prompt with its own name the neighbor list and the shared goal then decides what to share or hold back" @rohanpaul_ai on X 2025-07-17 13:29:00 UTC 73.6K followers, 1432 engagements
"Tech CEOs warn the hiring boom is over as AI writes code answers tickets and trims payrolls. Stanford data shows entry-level developer jobs sliding while only top specialists gain. Anthropic predicts XX% unemployment within X years. Microsoft cut 9000 letting Copilot write XX% of its code. IBM dumped 8000 roles. ADP payroll analysis pins the damage on devs aged 18-25 Amazon CEO Andy Jassy said last month that AI will reduce our total corporate workforce over the next few years as the company begins to need fewer people doing some of the jobs that are being done today. Shopify CEO Tobi Lutke" @rohanpaul_ai on X 2025-07-18 09:38:39 UTC 73.6K followers, 4298 engagements
"The writer of this prompt says "This guide will cost openai thousands of dollars. Lol" ๐" @rohanpaul_ai on X 2025-07-18 09:32:22 UTC 73.6K followers, 4032 engagements
"So Combining these two benchmarks (The SpreadsheetBench and The Internal Banking Benchmark) ChatGPT Agent can automate a substantial portion of the tedious and data-intensive tasks that define the role of a junior investment banking analyst. - It can conduct complex research and analysis to build financial models from scratch. - It can expertly manipulate spreadsheets a fundamental requirement for the job. - It can reason plan and execute multi-step workflows that involve using different tools (like the browser for research and the terminal for data processing/file creation)" @rohanpaul_ai on X 2025-07-17 18:58:38 UTC 73.6K followers, 1946 engagements
"This will prove genuinely useful to rely on daily. @Proactor_ai just released v1.0 the 1st self-active AI teammate that acts on X prompts giving real-time fact-checks and smart interventions. It will act on its own based on the situation. It links X skills: perception to capture audio reasoning to compare each claim with search results and autonomous action to deliver concise corrections or suggestions. - real-time transcription for meetings calls and discussions - while you speak Proactor listens analyzes and immediately provides targeted AI advice. - can automatically identifies and" @rohanpaul_ai on X 2025-07-08 15:01:19 UTC 73.5K followers, 4470 engagements
"This headline pumps iron. Elon sure understands his crowd. ๐ Have you tried Groks new companion mode yet" @rohanpaul_ai on X 2025-07-15 08:37:39 UTC 73.6K followers, 4972 engagements
"Cognition AI is taking Windsurfs code brand and $82M revenue days after Google bought its founders for $2.4B slotting the prize under a $4B valuation. So basically Google took the captains Cognition got the ship. Google sidestepped a full purchase by licensing the tech and hiring the chiefs a play that avoids antitrust noise yet strips the startup of leadership. Cognition grabs the rest promises instant vesting for every engineer and will feed Windsurfs data into Devin its automated coder hoping the extra examples cut hallucinations and widen language support. --- bloomberg." @rohanpaul_ai on X 2025-07-14 19:06:09 UTC 73.6K followers, 4103 engagements
"This is quite a landmark paper from @GoogleDeepMind ๐ 2x faster inference because tokens exit the shared loop early. ๐ During training it cuts the heavy math dropping attention FLOPs per layer by about half so the same budget trains on more data. Shows a fresh way to teach LLMs to plan steps inside their own reasoning loop instead of hard-coding a single chain. Second it proves the mixer idea scales. By jumbling several small recursive experts and letting the model pick which one to call next the team pushes accuracy on math and coding benchmarks without ballooning parameter count." @rohanpaul_ai on X 2025-07-16 04:38:17 UTC 73.6K followers, 8470 engagements
"Did you know XX% of US caselaw are available open sourced on @huggingface .๐ฏ This dataset contains XXX million cases from the Caselaw Access Project and Court Listener. The Caselaw Access Project consists of nearly XX million pages of U.S. federal and state court decisions and judges opinions from the last XXX years. In addition Court Listener adds over XXX thousand cases scraped from XXX courts. The Caselaw Access Project and Court Listener source legal data from a wide variety of resources such as the Harvard Law Library the Law Library of Congress and the Supreme Court Database. From" @rohanpaul_ai on X 2025-07-16 17:25:19 UTC 73.6K followers, 29.2K engagements
"Frontier language models shine on Olympiadlevel benchmarks yet stumble on chores like counting letters. The paper samples easy reasoning tasks dials up length or distractions and watches accuracy crash. Tests cover word or character counting logic trees proofstyle math stories and travel itineraries that only need basic bookkeeping. As paragraphs grow or extra names appear small step errors snowball models lose track of state guess from phrase frequency or copy memorised solutions instead of thinking. A handbuilt Unpuzzles set flips famous riddles into trivial variants yet models often reuse" @rohanpaul_ai on X 2025-07-12 14:14:00 UTC 73.6K followers, 133.5K engagements
"Metas reply to Stargate comes through Prometheus at X GW and Hyperion at X GW running multi-billion-dollar GPU clusters that sit in tents. - Meta - Prometheus IT Power by end of 2026 is 1020MW. The total number of chips used is 500000. Total compute power is 3171044226 TFLOPS. - Anthropic - Project Rainier IT Power by end of 2026 is 780MW. The total number of chips used is 800000. Total compute power is 1040000000 TFLOPS. - OpenAI - Stargate IT Power by end of 2026 is 880MW. The total number of chips used is 400000. Total compute power is 2469594595 TFLOPS. --- semianalysis." @rohanpaul_ai on X 2025-07-14 04:48:27 UTC 73.6K followers, 19.1K engagements
"๐ SceneScript from Meta Reality Labs Research turns mapping rooms into writing short text commands so headsets can sketch walls doors and objects on the fly without fragile geometry code. ๐ค It learns that trick inside a huge synthetic world of 100000 virtual homes then plugs straight into large language models so you can ask spatial questions like you chat with ChatGPT. ๐ Key point is that SceneScript swaps 3D math for plain script generation makes the vocabulary expandable and lets anyone tweak a scene by correcting tokens all with the same next-word prediction trick that powers modern" @rohanpaul_ai on X 2025-07-15 05:44:05 UTC 73.5K followers, 3236 engagements
"Most apps pick one large language model then hope it can do every job. FusionBench proves that mixing models with smart routing shared thoughts or distillation beats any solo model. FusionBench gathers 103M tokens of queries answers and thought sketches from XX open models that range from 8B to 671B parameters. It covers XX familiar tasks in math code commonsense world knowledge and reading so tests feel realistic. Each query holds two answer styles a straight reply and a detailed reasoning path then a judge score and cost tag. FusionFactory then tries three fusion tricks. Query level trains" @rohanpaul_ai on X 2025-07-15 13:06:00 UTC 73.5K followers, 6179 engagements
"Multitoken masks plus gated LoRA cut LLM latency without hurting accuracy code output X faster. LLMs can already guess several words ahead this paper shows how to cash in on that foresight for X faster code and math generation with no drop in answer quality. ๐ What problem are they poking at Autoregressive models speak one token at a time so every extra word forces another full pass through the network. That singlestep habit slows reading back code proofs or long chat replies. The authors noticed the models hidden states quietly predict whole phrases anyway sitting unused in the logits list." @rohanpaul_ai on X 2025-07-18 01:27:16 UTC 73.6K followers, 3025 engagements
"LLMs nail standard school word problems but fall apart when the question needs realworld sense. This scoping study tracks why. โ The Core Concepts: LLMs slice every prompt into tiny tokens and predict the next token from statistics so solving means matching patterns not building a picture of the story . Problem sets used to train and test the models are heavily skewed toward sproblems short tasks that collapse to plain arithmetic. Only a few include contextual twists or nonsense questions . ๐ Method: The authors compared X OpenAI models on XXX questions from X popular data sets plus classic" @rohanpaul_ai on X 2025-07-06 22:52:20 UTC 73.6K followers, 2059 engagements
"๐ง Linking to a standard Large Language Model unlocks reasoning. Ask Which chair sees the TV the chatbot parses the generated script computes sight lines and replies in plain text all without extra geometric code. If the model misplaces a door the user can type a correction the network infills the right token sequence and the interpreter snaps the mesh back into place" @rohanpaul_ai on X 2025-07-15 05:45:46 UTC 73.5K followers, XXX engagements
"Beautiful research from @Apple More thoughts stop helping once tasks cross critical depth. Thinking tokens rise then crash revealing compute inefficiency. So Standard LLMs beat LRMs on easy puzzles unexpectedly. Researchers stress-test them on puzzles whose difficulty can be dialed up step by step. Thinking pull ahead mid-way but every model collapses once the puzzle grows past a critical depth. Even stranger near that point the thinker writes fewer thoughts despite plenty of allowed tokens hinting at a built-in ceiling on current inference-time reasoning. Key findings below. ๐งฉ Controlled" @rohanpaul_ai on X 2025-06-06 12:40:34 UTC 73.6K followers, 263.1K engagements
"Figure just rolled out its 3rdgeneration battery for the F.03 humanoid. With higher energy density F.03 keeps its 5hour spec yet gains extra payload headroom for future arms. The new pack is a structural part of the robot so it doubles as the torso frame cuts BOM (Bill of Materials) by XX% and shrugs off a X m concrete drop. Active cooling lets the pack gulp X kW during pitstop charging stretching runtime to roughly the entire 5hour shift Fseries robots already hit. Cell prices have slipped below $130/kWh this year and Figures structural approach removes the usual XX% overhead for separate" @rohanpaul_ai on X 2025-07-18 13:04:00 UTC 73.6K followers, 2451 engagements
"AI compute is running into a power wall. Chinas grid already closing on 10000 TWh while the United States has sat near 4178 TWh for two decades WikipediaU.S. If training and serving bigger models keeps eating watts at the current pace that flat U.S. line could matter more than any parameter count. Chinas curve rockets upward because the country kept adding coal wind and solar at break-neck speed lifting generation six-fold since 1999. The U.S. line crawls sideways topping out just after 2010 and hovering around the same 4000 TWh ever since. Chinas burst came with a huge build-out of" @rohanpaul_ai on X 2025-07-15 04:43:56 UTC 73.5K followers, 7398 engagements
"PDF parsing is still painful because LLMs reorder text in complex layouts break tables across pages and fail on graphs or images. ๐กTesting the new open-source OCRFlux model and here the results are really good for a change. So OCRFlux is a multimodal LLM based toolkit for converting PDFs and images into clean readable plain Markdown text. Because the underlying VLM is only 3B param it runs even on a 3090 GPU. The model is available on @huggingface . The engine that powers the OCRFlux teaches the model to rebuild every page and then stitch fragments across pages into one clean Markdown file." @rohanpaul_ai on X 2025-07-01 14:37:08 UTC 73.6K followers, 149.6K engagements
"Deep seek interesting prompt. From Reddit" @rohanpaul_ai on X 2025-01-26 19:41:49 UTC 73.6K followers, 12.1M engagements
"๐ค๐ธ Carnegie Mellon researchers reveal headline AI agents flop on 62%70% on performing real-world professional office tasks" @rohanpaul_ai on X 2025-07-06 16:42:30 UTC 73.6K followers, 141.9K engagements
"To get the new Grok companion. updates Grok taps bottom right settings download companions (one time) chooses a chat companion" @rohanpaul_ai on X 2025-07-14 23:28:19 UTC 73.6K followers, 6028 engagements
"๐ฅ OpenAI rolled out agent mode in ChatGPT. Lets the model click around a virtual computer run code and finish multistep jobs on its own hitting XXXX% on Humanitys Last Exam while handling chores like building slide decks or buying groceries. It reaches XXXX% accuracy on Humanitys Last Exam (HLE) while older baselines like OpenAI o3 without tools sit at XXXX% and deep-research with browsing reaches 26.6%. The HLE exam spans 2500 expert-level questions across 100+ subjects that were crowdsourced specifically to stump modern language models. So coubling the previous best pass@1 score signals a" @rohanpaul_ai on X 2025-07-17 18:29:45 UTC 73.6K followers, 5404 engagements
"๐ข A new 32B model EXAONE XXX just dropped on @huggingface from LG AI Research. ๐ค Outcompetes Qwen 235B on coding and exceeds DeepSeek R1 V3 671B on instruction tasks. - toggleable reasoning 131K context and a non-commercial license. - It solves more edge cases than Qwen 235B while using about one-seventh of the memory footprint - Trainded on 14T carefully filtered tokens. - supports Model Context Protocol (MCP) and Function Calling" @rohanpaul_ai on X 2025-07-15 07:49:02 UTC 73.6K followers, 8675 engagements
""Grok X is better than PhDs in every subject no exception" - Number X on Humanitys Last Exam with XXXX% - Number X on ARC-AGI-2 with XXXX% where the next best score is at 8.6%" @rohanpaul_ai on X 2025-07-10 06:25:04 UTC 73.6K followers, 11.8K engagements
"The paper finds that money-based crowd odds on Polymarket called the 2024 presidential result more accurately and earlier than every traditional poll with the edge most obvious in key swing states where markets stayed on Trump while surveys wavered So basically Polls still miss presidential winners despite $50M spent each cycle. The paper pits those polls against daily Polymarket odds modelling both with Bayesian structural time series. Market prices jumped the night of Trump's July shooting attempt giving him XX% while polls barely moved. By XX October market forecasts never dipped below" @rohanpaul_ai on X 2025-07-16 13:56:00 UTC 73.6K followers, 1437 engagements
"Kwai KeyeVL turns messy short videos into machinefriendly stories. Kwai Keye-VL is an 8B-param MLLM built by Kuaishou (the company behind the Kwai short-video app) to understand short videos as easily as still images while still handling regular vision-language tasks. The 8Bparameter model tops video tests yet keeps strong image skills. Most existing multimodal LLMs work well on single images but struggle with the rapid scene changes audio and dense context in TikTok-style clips. Keye-VL fixes that gap by training on a XXX billion-token corpus rich in video then adding a recipe that teaches" @rohanpaul_ai on X 2025-07-04 00:38:01 UTC 73.6K followers, 1587 engagements
"Grok X is crazy. Everyone keeps cranking out projects. Compiling XX incredible examples. ๐ ๐งต 1/n Grok4 generates click-morphing 3D attractor particles with ThreeJS shaders browser-native. XX FPS on consumer laptops" @rohanpaul_ai on X 2025-07-12 01:05:24 UTC 73.6K followers, 12.1K engagements
"๐ AI assistants are now mandatory kit in top tier US law firms. DLA Piper already embeds Copilot in Microsoft apps and deploys in house models that spot Foreign Corrupt Practices Act trouble before it blooms. Gibson Dunns ChatGPT Enterprise pilot lets XXX people compare Google Gemini and Claude on real briefs. Ropes & Grays Hebbia agent squeezes fund term extraction from XX hours to about X. --- businessinsider. com/big-law-top-10-firms-ai-overhaul-use-cases-2025-7" @rohanpaul_ai on X 2025-07-18 09:22:55 UTC 73.6K followers, 1535 engagements
""The era when humans program is nearing its end within our group. Our aim is to have AI agents completely take over coding and programming. (.) we are currently initiating the process for that." - Softbank founder Masayoshi Son He estimates that approximately 1000 AI agents would be needed to replace each employee because "employees have complex thought processes." --- lightreading. com/ai-machine-learning/softbank-aims-for-1-billion-ai-agents-this-year" @rohanpaul_ai on X 2025-07-17 17:40:34 UTC 73.6K followers, 3126 engagements
"The paper answer two questions: X. What's the difference between prediction and world models X. Are there straightforward metrics that can test this distinction Engineers often judge large models by how well they guess the next token. This study shows that great guesses do not guarantee a real grasp of the rules behind the data and it introduces a quick way to check. The authors build tiny synthetic tasks that obey a known set of rules finetune a foundation model on each task then watch how the model finishes fresh examples from the same rulebook. If its answers always change when the hidden" @rohanpaul_ai on X 2025-07-13 04:53:28 UTC 73.6K followers, 6408 engagements
"field footage of Unitree Go2 Pro: basement and park --- reddit. com/r/robotics/comments/1lty64o/some_field_footage_of_unitree_go2_pro_basement/" @rohanpaul_ai on X 2025-07-08 07:40:00 UTC 73.5K followers, 1569 engagements
"Recruiters face XX% surge in AI-crafted rsums. hitting 11000 submissions per minute. Many rsums now mirror job-description keywords from simple ChatGPT prompts. AI agents auto-apply on behalf of candidates forcing firms into an AI vs AI screening arms race" @rohanpaul_ai on X 2025-06-24 04:08:56 UTC 73.6K followers, 273K engagements
"๐ ChatGPT agent fuses three older tools blending Operators web-browsing clicks Deep Researchs summarization tricks and ChatGPTs reasoning into one system so a single prompt can trigger browsing code execution or API calls without manual tool-switching" @rohanpaul_ai on X 2025-07-17 18:35:40 UTC 73.6K followers, XXX engagements
""hyper precise prompts to describe what you want" is absolutely the BEST strategy. ๐ฅ Many YCombinator AI startups prompts are super detailed (e.g. 6+ page prompts) with XML tags and meta-prompting techniques. e.g. Parahelp's customer support agent prompt is 6+ pages meticulously outlining instructions for managing tool calls. --- โ Key Learning from this doc Top AI startups use "manager-style" hyper-specific prompts6+ pages detailing task role and constraints. These aren't quick hacks; theyre structured like onboarding docs for new hires. Role prompting anchors the LLMs tone and behavior." @rohanpaul_ai on X 2025-07-13 19:53:24 UTC 73.6K followers, 149.3K engagements
"OpenAI is preparing to release ChatGPT 'agents' that could threaten Microsoft Excel and PowerPoint" @rohanpaul_ai on X 2025-07-16 08:33:09 UTC 73.6K followers, 16.6K engagements
"ChatGPT Record Mode now available to ChatGPT Plus users globally in the macOS desktop app. It lets you record up to XXX minutes of voicelike meetings brainstorming sessions or voice notesand provides live transcription and a postsession summary saved as an editable canvas in your chat history. Just Tap the mic icon in chat give mic & systemaudio permissions speak naturally then stop or pause. ChatGPT creates a transcript and structured summary with highlights action items and timestamps. As of now Record Mode is not available for Linux Windows browsers or mobile so here you won't see the mic" @rohanpaul_ai on X 2025-07-16 18:27:28 UTC 73.6K followers, 3664 engagements
"I asked ChatGPT Agent to build a slide presentation on this. If Apple buys Perplexity how big of an acquisition that will be vs Apple's historial acquisitions" @rohanpaul_ai on X 2025-07-19 02:12:57 UTC 73.6K followers, 2814 engagements
"Its a hefty 206-page research paper and the findings are concerning. "LLM users consistently underperformed at neural linguistic and behavioral levels" This study finds LLM dependence weakens the writers own neural and linguistic fingerprints. ๐ค๐ค Relying only on EEG text mining and a cross-over session the authors show that keeping some AI-free practice time protects memory circuits and encourages richer language even when a tool is later reintroduced" @rohanpaul_ai on X 2025-06-17 00:28:35 UTC 73.6K followers, 2.3M engagements
"@OpenAI incredibly useful for meetings. ๐ would have been massive if it was available for Linux Windows browsers and mobile" @rohanpaul_ai on X 2025-07-16 18:23:32 UTC 73.6K followers, 11.2K engagements
"AI can help us many ways. Here ChatGPT helped someone quit weed retrieve scam money figure out career path boost fitness and mental health. One comment I really liked is "I used ChatGPT not like a coach or therapist just like a space to get real with myself." The thread is full with many stories like that" @rohanpaul_ai on X 2025-07-16 08:46:41 UTC 73.6K followers, 3581 engagements
"๐งต 2/n. Why constant space matters Every document now carries the same vector count so the index grows linearly with corpus size rather than document length. Fixed length lets the database pack vectors into cachefriendly blocks which improves paging and SIMD throughput and it roughly halves index size compared with unpooled ColBERT. This approach makes it: - Easier to manage and scale in a vector database: All documents have uniform storage sizes simplifying retrieval logic. - More efficient for query-time processing: Avoids the overhead of variable-length comparisons leading to better cache" @rohanpaul_ai on X 2025-07-18 19:50:44 UTC 73.6K followers, XXX engagements
"๐ฏ An ex-OpenAI engineer shares his thoughts about OpenAI Has lots of insights on OpenAIs day-to-day life unlike anything I have read before. He joined OpenAI as a software engineer on the applied side spending about XX months building the Codex coding agent and related internal prototypes. Most of his time went into writing Python tuning GPU budgets and sprinting with a small team to take Codex from first commit to its public launch in X weeks. He left because of his own craving for founder freedom yet calls the year the most eye-opening move of his career. ๐ Culture shock OpenAI ballooned" @rohanpaul_ai on X 2025-07-16 02:36:27 UTC 73.6K followers, 116.5K engagements
"A Reddit user deposited $XXX into Robinhood then let ChatGPT pick option trades. XXX% win reate over XX days. He uploads spreadsheets and screenshots with detailed fundamentals options chains technical indicators and macro data then tells each model to filter that information and propose trades that fit strict probability-of-profit and risk limits. They still place and close orders manually but plan to keep the head-to-head test running for X months. This is his prompt ------- "System Instructions You are ChatGPT Head of Options Research at an elite quant fund. Your task is to analyze the" @rohanpaul_ai on X 2025-07-13 05:23:03 UTC 73.6K followers, 3.6M engagements
"Voice is winning workflows and OpenAI stamped it in the UI. Three mics on the dock. record dictate chatone tap per vibe" @rohanpaul_ai on X 2025-07-17 03:18:55 UTC 73.6K followers, 3848 engagements
"Context Engineering Evolution Timeline: A comprehensive visualization of the development trajectory of Context Engineering implementations from 2020 to 2025" @rohanpaul_ai on X 2025-07-18 23:21:55 UTC 73.6K followers, XX engagements
"A Local LLM as a coding-autopilot is so surreal. A single file consisting of vectors and somehow holding the knowledge and meaning of the human world. ๐คฏ May vectors become much more powerful in the time to come" @rohanpaul_ai on X 2025-07-19 12:06:00 UTC 73.6K followers, 1997 engagements
"unemployment rates from Federal Reserve Bank of NY computer engineering ranks 3rd at 7.5%" @rohanpaul_ai on X 2025-07-09 17:02:17 UTC 73.6K followers, 3264 engagements
"Money habits differ worldwide yet nobody knows which habits shape LLM advice. This study asked X major chatbots and humans from XX countries the same XX finance questions. Each model answered XXX times researchers kept the median answer for every prompt and compared it with the INTRA survey medians. When the authors ran that check on the XX finance questions every large language model landed in the same tight group and the only human data that fell into that pocket came from Tanzania The models almost always choose or price the gamble right at that average. In plain terms they treat a risky" @rohanpaul_ai on X 2025-07-16 12:53:00 UTC 73.6K followers, 2944 engagements
"It is happening again. This time the magic word is not .com. It is AI. According to Torsten Slok the influential chief economist at Apollo Global Management ๐ธ AIs superstar stocks now trade at P/E levels higher than the 2000 dot-com crest hinting at another bubble. Sloks chart shows the top XX names in the Standard and Poors XXX carrying a richer premium than in 2000 while the other XXX barely move. Back in 2000 the internet was real yet a bubble still erased $5T of market value. The pattern repeats: exciting tech easy money and sky-high multiples. If profits do not rise quickly lofty" @rohanpaul_ai on X 2025-07-18 13:50:00 UTC 73.6K followers, 5307 engagements
""Developing superintelligence is now in sight. We should act as if it's going to be ready in the next 2-3 years." - Mark Zuckerberg About paying $XXX million or $XXX million pay packages he argued that Meta will spend hundreds of billions on compute and data-center build-outs so paying roughly $XXX million-plus to attract about 50-70 top researchers is sensible since that wage bill is tiny next to the overall capital outlay. And also that the market for world-class AI talent is extremely competitive because only a handful of researchers can do this work and every major lab wants them. ---" @rohanpaul_ai on X 2025-07-17 05:59:55 UTC 73.6K followers, 6861 engagements
"A follow-up study on Apple's "Illusion of Thinking" Paper is published now. Shows the same models succeed once the format lets them give compressed answers proving the earlier collapse was a measurement artifact. Token limits not logic froze the models. Collapse vanished once the puzzles fit the context window. So Models failed the rubric not the reasoning. โ The Core Concepts Large Reasoning Models add chain-of-thought tokens and self-checks on top of standard language models. The Illusion of Thinking paper pushed them through four controlled puzzles steadily raising complexity to track how" @rohanpaul_ai on X 2025-06-12 22:54:24 UTC 73.6K followers, 476K engagements
"This 39-page report from Kuaishou explains how the company rebuilt its video recommender system into one end-to-end generative model called OneRec Traditional recommenders run separate retrieval pre-ranking and ranking stages that waste compute on network transfers and chase conflicting goals. โ The Core Concepts OneRec deletes retrieval prerank and rank replacing them with one encoderdecoder that maps user context to video tokens in one forward pass. All parameters chase the same final reward so gradients stop fighting each other. High arithmetic density keeps GPUs busy with matrix" @rohanpaul_ai on X 2025-07-01 11:00:02 UTC 73.6K followers, 2002 engagements
"๐ฆ Goldman Sachs is testing a hybrid workforce (AI+humans) with autonomous software engineer AI agent Devin as a new employee The AI agent will draft unit tests clean legacy scripts and open pull requests while a human reviews every change. The bank currently employs around 12000 human devs" @rohanpaul_ai on X 2025-07-11 21:06:55 UTC 73.6K followers, 2571 engagements
"SpaceX just committed $2B to xAI. Musk bets that shared AI data and hardware lift every firm he controls. Future deals may see Grok guiding Starlink antennas or living inside Teslas Optimus robots. --- wsj. com/tech/spacex-to-invest-2-billion-into-elon-musks-xai-413934de" @rohanpaul_ai on X 2025-07-13 04:07:49 UTC 73.5K followers, 1977 engagements
"๐ Huggingface releases SmolLM3 SoTA 3B model 128k context dual mode reasoning (think/no_think) ๐ค @huggingface released SmolLM3 a 3B parameter multilingual reasoner that matches bigger 4B models handles 128k tokens and ships with an open-sourced training blueprint in this blog post. ๐ They pre-trained on 11.2T tokens then stretched context with YARN up-sampling finishing the run on XXX H100 GPUs in XX days. ๐ง A built-in dual think / no_think switch lets users decide between fast answers or slower chain-of-thought traces. ๐ How they pulled it off Grouped Query Attention trades multi-head" @rohanpaul_ai on X 2025-07-08 21:36:54 UTC 73.5K followers, 2818 engagements
"So @xAI 's @grok X really did hit XXXX% on HLE (Humanities Last Exam) ๐คฏ --- (HLE holds 2500 expert-written questions spanning more than XXX subjects including math physics computer science and humanities and XX% of them mix text with images. The authors deliberately built in anti-gaming safeguards and hid a private question set so that simply memorising answers will not help a model.)" @rohanpaul_ai on X 2025-07-10 04:37:05 UTC 73.6K followers, 28.9K engagements
"Someone just forked the original OpenAI Codex CLI A terminalbased coding agent that lets you chatprompt code changes run them safely in a sandbox and iterateall while supporting multiple AI providers (OpenAI Gemini OpenRouter Ollama)" @rohanpaul_ai on X 2025-04-18 13:39:50 UTC 73.6K followers, 6430 engagements
"Dharmesh Shah on leveraging AI in everything you do. "It's not a you vs AI. That's not the mental model you should have in here. The right mental frame of reference you should have is It's you to the power of AI. AI is an amplifier of your capability." --- From 'My First Million' YT channel (link in comment)" @rohanpaul_ai on X 2025-07-16 21:02:10 UTC 73.6K followers, 2863 engagements
"AWS is previewing a specialized storage offering Amazon S3 Vectors that it claims can cut the cost of uploading storing and querying vectors by up to XX% compared to using a vector database. This new bucket type keeps vector data inside S3 itself brings a dedicated similarity-query API and promises up to XX% lower costs than running a separate vector database. The launch targets teams that need large cheap vector stores to feed retrieval-augmented generation memory for AI agents or other semantic-search workloads. ๐ What S3 Vectors is S3 Vectors adds vector buckets. Inside each bucket you" @rohanpaul_ai on X 2025-07-17 03:54:30 UTC 73.6K followers, 3626 engagements
"This paper wants to understand LLM's proficiency in enhancing code performance at the repository level or delivering meaningful speed gains. Because they do not know which lines of code waste the most time or how to coordinate fixes across several files. The authors built SWE-Perf to measure that shortfall. Human reviewers in the benchmark trimmed average runtime by XXXX% while the best agent improved only XXX% even though it passed almost XX% of the functional checks. That gap shows that real performance work still needs profiling tools cross file reasoning and awareness of low level" @rohanpaul_ai on X 2025-07-18 10:01:00 UTC 73.6K followers, 1587 engagements
"๐งฌ Further to my previous post last month's huge medical AI innovation Microsoft's AI Diagnostic Orchestrator (MAI-DxO) must be mentioned. ๐ Till now drug research has followed Erooms law where the cost to bring one therapy to market roughly doubles every X years and the success rate per $X B keeps sinking. That trend shows biology knowledge as the main choke point. ๐ฅ MAI-DxO attacks that choke point by turning a large language model into a virtual panel of clinicians. It asks follow-up questions picks tests checks prices and then cross-examines its own reasoning before it commits to a" @rohanpaul_ai on X 2025-07-19 05:52:58 UTC 73.6K followers, 2523 engagements
"The authors start by reminding that a language model chooses each next token based on past tokens plus the surrounding context. Older prompt-engineering ideas package that context as one long prompt string which works for toy demos but quickly falls apart in real systems. They then reject the single-string view and introduce context engineering. Here the context is a bundle of smaller parts that get sourced filtered and glued together by helper functions before each model call. Treating context as several moving pieces makes it easier to swap data in and out on the fly" @rohanpaul_ai on X 2025-07-18 23:21:54 UTC 73.6K followers, XX engagements
"that was quick. 1st party support for Claude Sonnet is back on @windsurf_ai" @rohanpaul_ai on X 2025-07-16 23:45:33 UTC 73.6K followers, 3198 engagements
"Because these mechanisms are text-level skills research groups have already tested the same reasoning setups on chemistry puzzles where the model must justify reaction mechanisms or property predictions and they report clear gains without chemistry-specific tweaks. arxiv .org/abs/2505.07735v1" @rohanpaul_ai on X 2025-07-19 15:37:43 UTC 73.6K followers, XXX engagements