LunarCrush LLM | creator/twitter::1163801450968997889/posts

[GUEST ACCESS MODE: Data is scrambled or limited to provide examples. Make requests using your API key to unlock full data. Check https://lunarcrush.ai/auth for authentication information.]

[@casper_hansen_](/creator/twitter/casper_hansen_)
"qwen is so loved in the research community for one simple thing: it just works. it's insane how much tinkering you have to do for some models to work. the iteration time compounds especially for 100B+ models. guaranteed annoyance"  
[X Link](https://x.com/casper_hansen_/status/1978162195281183023) [@casper_hansen_](/creator/x/casper_hansen_) 2025-10-14T18:13Z 10.5K followers, 2115 engagements


"For those that think this is hype: no it's not. I invented a little training/eval set last night. Model score is about XX% with my own optimized prompt. GEPA takes that to XX% with auto="light" optimization. I imagine this can go to XX% with heavy optimization. Insane potential in this to become an automated optimizer for your model in production. Why train if you can just auto-adapt the model as prompts come through"  
[X Link](https://x.com/casper_hansen_/status/1978169521794859437) [@casper_hansen_](/creator/x/casper_hansen_) 2025-10-14T18:42Z 10.5K followers, 23.8K engagements


"Thanks Claude knew I could count on you my yes man"  
[X Link](https://x.com/casper_hansen_/status/1977637930043928825) [@casper_hansen_](/creator/x/casper_hansen_) 2025-10-13T07:29Z 10.5K followers, 2022 engagements


"o3 competitor: GLM XXX by Zhipu AI - hybrid reasoning model (on by default) - trained on 15T tokens - 128k context 96k output tokens - $XXXX / 1M tokens - MoE: 355B A32B and 106B A12B Benchmark details: - tool calling: XXXX% success rate vs Sonnets XXXX% vs Kimi K2 XXXX% - coding: XXXX% win rate vs Sonnet XXXX% vs Kimi K2 XXXX% vs Qwen3 Coder"  
[X Link](https://x.com/casper_hansen_/status/1949819017134301231) [@casper_hansen_](/creator/x/casper_hansen_) 2025-07-28T13:07Z 10.5K followers, 51.5K engagements


"I found X new methods that all outperform GEPA; except none of them are open-source. I stopped believing such papers a long time ago. The unfortunate reality of today is that papers are nothing more than a few words. If your code is not public how can I verify that it works Papers: - Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models: - C-Evolve: Consensus-based Evolution for Prompt Groups: - Maestro: Joint Graph & Config Optimization for Reliable AI Agents:"  
[X Link](https://x.com/casper_hansen_/status/1979535725222584658) [@casper_hansen_](/creator/x/casper_hansen_) 2025-10-18T13:11Z 10.5K followers, 18K engagements


"largely agree with karpathy in some aspects but also think the models of today are already what we would consider AGI just XX years ago"  
[X Link](https://x.com/casper_hansen_/status/1979563620477661445) [@casper_hansen_](/creator/x/casper_hansen_) 2025-10-18T15:02Z 10.5K followers, 2267 engagements


"When you work in distribution of Claude Kimi and GLM: EVERYTHING feels great. You can vibe code. You won't solve novel problems but at least you produce code"  
[X Link](https://x.com/casper_hansen_/status/1980303495967244648) [@casper_hansen_](/creator/x/casper_hansen_) 2025-10-20T16:02Z 10.5K followers, 2910 engagements


"Calling it now: Every inference engine should have a built-in GEPA system that auto-evolves your prompt over time"  
[X Link](https://x.com/casper_hansen_/status/1978851428354379807) [@casper_hansen_](/creator/x/casper_hansen_) 2025-10-16T15:52Z 10.5K followers, 11.2K engagements


"@Grad62304977 You found this happens on Qwen3 models too"  
[X Link](https://x.com/casper_hansen_/status/1979929959717060872) [@casper_hansen_](/creator/x/casper_hansen_) 2025-10-19T15:17Z 10.5K followers, XXX engagements


"@ai_for_success it would have to be dario for the sole reason that he wants regulatory capture. agi for me but not for thee"  
[X Link](https://x.com/casper_hansen_/status/1979946543173226738) [@casper_hansen_](/creator/x/casper_hansen_) 2025-10-19T16:23Z 10.5K followers, 3118 engagements


"NEW DeepSeek OCR model that outperforms dots ocr while prefilling 3x less tokens"  
[X Link](https://x.com/casper_hansen_/status/1980166248878203093) [@casper_hansen_](/creator/x/casper_hansen_) 2025-10-20T06:56Z 10.5K followers, 37.1K engagements


"if your code review does not look like this are you even code reviewing"  
[X Link](https://x.com/casper_hansen_/status/1980290157849788459) [@casper_hansen_](/creator/x/casper_hansen_) 2025-10-20T15:09Z 10.5K followers, 1324 engagements


"@mgoin_ Excellent work from all contributors and reviewers"  
[X Link](https://x.com/casper_hansen_/status/1980511516169498872) [@casper_hansen_](/creator/x/casper_hansen_) 2025-10-21T05:48Z 10.5K followers, XX engagements


"LoRA finetuned experts in MoE now runs properly in vLLM"  
[X Link](https://x.com/casper_hansen_/status/1980525929026973904) [@casper_hansen_](/creator/x/casper_hansen_) 2025-10-21T06:45Z 10.5K followers, XXX engagements
