@casper_hansen_
"Recipe to post-train Qwen3 1.7B into a DeepResearch model What does it mean for something small to think deeply Meet Lucy a posttrained Qwen31.7B as a DeepResearch model based on @willccbb's verifiers. Primary Rule-based Rewards: - Answer correctness We check whether the final response literally contains the ground-truth answer. This substring match is cheap and avoids calling a larger LLM judge. - Visit/search ratio If the agent visits at least as many pages as it issues search queries it receives ((visit_search_ratio - 1) / 4) ** XXXX. If it searches more than it visits the score is -0.5." @casper_hansen_ on X 2025-07-22 15:07:00 UTC 8808 followers, 39.8K engagements
"@daniel_mac8 it's been said that chinese teams ship fast but which one ships the 1M context model" @casper_hansen_ on X 2025-07-22 17:39:51 UTC 8798 followers, 3094 engagements
"btw @minishlab would recommend adding an example like the one in my screenshot that shows how to use semhash with a Huggingface dataset" @casper_hansen_ on X 2025-07-24 07:51:00 UTC 8808 followers, XXX engagements
"@chatgpt21 You think not What would it be then - o3 alpha" @casper_hansen_ on X 2025-07-23 05:58:45 UTC 8806 followers, 1662 engagements
"@Presidentlin @altryne Qwen3 coder is already released after this post. This is from Zhipu AI and has been live on z (dot) ai for a bit" @casper_hansen_ on X 2025-07-24 15:36:31 UTC 8807 followers, XX engagements
"@willccbb the correct answer is always depends if your job title includes engineer or researcher" @casper_hansen_ on X 2025-07-22 14:41:01 UTC 8809 followers, XXX engagements
"The RL codebase I like the most: - The NanoGPT of RL - Supports multi-turn RL - Just 1k lines of code in Python - Data Tensor Sequence Parallel" @casper_hansen_ on X 2025-07-17 09:45:32 UTC 8794 followers, 26.4K engagements
"Im seeing lots of people with the worst takes on IMO medals OpenAI and generally AI on the timeline. Where did we go wrong when we critique such a crazy achievement The craziest part that its just natural language" @casper_hansen_ on X 2025-07-20 10:40:44 UTC 8619 followers, 1164 engagements
"@mgoin_ Michael if you ever need more buy-in from the vLLM / PyTorch team on this just reference this X post please :D Very much looking forward to see the progress on this one as I think a ton of people will feel a difference here (serverless RL)" @casper_hansen_ on X 2025-07-22 15:21:29 UTC 8791 followers, XXX engagements
"Another 20s saved on load already merged in nightly. Next vLLM release (0.9.3) will be amazing" @casper_hansen_ on X 2025-07-22 14:35:18 UTC 8806 followers, 1434 engagements
"Ever wanted to solve biomedicine Here you go thousands of applications can be created from this" @casper_hansen_ on X 2025-07-17 11:16:56 UTC 8791 followers, 6045 engagements
"@Sebyverse Ever heard of RLHF or RLVR Thats how models like o3 R1 K2 etc. are trained. 80-90% of this training process is inference. So you need to build inference before you build training" @casper_hansen_ on X 2025-07-21 13:13:27 UTC 8731 followers, XXX engagements
"@menloresearch @Alibaba_Qwen I wrote this post that covers your model Thanks for sharing everything - would recommend releasing a paper" @casper_hansen_ on X 2025-07-22 16:37:35 UTC 8806 followers, XXX engagements
"@giffmana @Alibaba_Qwen Try to read the chat template and apply it in various ways over multiple turns" @casper_hansen_ on X 2025-07-21 19:13:05 UTC 8775 followers, XXX engagements
"@WolframRvnwlf I found this nugget yesterday. Surface-level this looks like everything you would want" @casper_hansen_ on X 2025-07-21 12:47:23 UTC 8779 followers, XXX engagements
"Does anyone know what the required hardware is to run Qwen3 235B at 256k context length" @casper_hansen_ on X 2025-07-23 08:25:35 UTC 8806 followers, 1131 engagements
"claude opus and kimi k2 have been so heavily RL'd that they will sometimes hallucinate for literally no reason. and people say o3 is bad all models exhibit this behaviour" @casper_hansen_ on X 2025-07-22 13:01:00 UTC 8807 followers, XXX engagements
"@giffmana @Alibaba_Qwen Unfortunately I don't have public code to trigger this. Will touches on it briefly here. It's mostly horrible in terms of managing enable_thinking between turns and trying to capture format with empty think tags" @casper_hansen_ on X 2025-07-21 19:12:47 UTC 8779 followers, XXX engagements
"This is not a SMALL update. This is huge Give us this for every model please Qwen team🙏" @casper_hansen_ on X 2025-07-21 17:32:55 UTC 8808 followers, 38.7K engagements
"Want to get better at managing multiple Claude Code instances Just go play StarCraft thats way harder" @casper_hansen_ on X 2025-07-22 06:55:11 UTC 8804 followers, 1030 engagements
"@willccbb Model (Apache 2.0): Code: (MIT):" @casper_hansen_ on X 2025-07-22 15:10:50 UTC 8808 followers, 2574 engagements
"@boneGPT Sam has a CS degree and previously coded" @casper_hansen_ on X 2025-07-19 18:39:27 UTC 8767 followers, 2584 engagements
"@giffmana @Alibaba_Qwen The Qwen3 hybrid chat template was a nightmare to manage in a multi-turn scenario. Apart from that a hybrid model should be as strong as the standalone version. If you cant do that then a routing system over a model harness is better" @casper_hansen_ on X 2025-07-21 19:06:21 UTC 8779 followers, 2003 engagements
"@prashant_hq I didnt have the lite version" @casper_hansen_ on X 2025-07-21 12:40:00 UTC 8723 followers, XXX engagements
"if you loved kimi k2 you will love what another cracked chinese team is about to release. Models: - o3 competitor: multi-turn reasoning coding search - 106B A12B: XXX experts X active 128k context GQA - 355B A32B: Details unknown about config" @casper_hansen_ on X 2025-07-24 15:18:15 UTC 8808 followers, 8362 engagements
"uv for brew uv for apt uv for literally anything is all you need" @casper_hansen_ on X 2025-07-24 14:51:00 UTC 8807 followers, XXX engagements
"@michaelzluo @willccbb Wish I had those numbers. Paper inference and MCP server is pending release. They released SimpleQA though" @casper_hansen_ on X 2025-07-22 16:06:16 UTC 8808 followers, XXX engagements
"Step X of many: X. Three weeks ago I released a biomedical dataset of 521k samples. X. Two weeks ago I released full-text embeddings (32k) with 2560 dimension from Qwen3 4B embedding model. X. This week I release 19k semantically clustered texts that can be used to construct multi-article biomedical question-answering" @casper_hansen_ on X 2025-07-22 15:01:34 UTC 8808 followers, 4443 engagements
"@vikhyatk accurate. you need a hook to capture a diminishing attention span. like i 2x'd my whatever because of 4-bit quant" @casper_hansen_ on X 2025-07-24 08:59:04 UTC 8808 followers, XXX engagements
"@TheZachMueller Looked it up gigabrain vibes on this one" @casper_hansen_ on X 2025-07-21 12:42:45 UTC 8724 followers, XXX engagements
"@axolotl_ai Woah I'm a fan of this ALST @winglian is this preferred over sequence parallelism faster/scales better" @casper_hansen_ on X 2025-07-09 19:29:03 UTC 8731 followers, XX engagements
"@alandao_ai Thanks for creating it Alan Do you by chance have a wandb run that you could share Would love to observe the raw data of the training run :)" @casper_hansen_ on X 2025-07-23 08:21:57 UTC 8796 followers, XXX engagements
"@altryne More news to (maybe) chat about :)" @casper_hansen_ on X 2025-07-24 15:25:03 UTC 8807 followers, XXX engagements
"We need something better than Nvidia or AMD if we want to scale up AGI and make it accessible. A simple 2x in compute performance every X years is not enough to satisfy where we are headed" @casper_hansen_ on X 2025-07-21 14:11:36 UTC 8772 followers, 1002 engagements
"@mark_k @ChatGPTapp Not sure I will get it anytime soon as I don't have a subscription" @casper_hansen_ on X 2025-07-24 10:24:16 UTC 8806 followers, XXX engagements
"@jon_durbin Do you have a launch command or script for this that works with all the bells and whistles you just described" @casper_hansen_ on X 2025-07-23 10:51:13 UTC 8791 followers, XXX engagements
"TLDR; you can just fill out a form to have coffee with Lex which I think is pretty cool" @casper_hansen_ on X 2025-07-24 09:36:33 UTC 8808 followers, XXX engagements
"vLLM is finally addressing a long-standing problem: startup times 35s - 2s for CUDA graph capture is a great reduction" @casper_hansen_ on X 2025-07-22 12:36:35 UTC 8808 followers, 35.1K engagements
""Home light music synchronization with GPT-5 in X minutes" is essentially what's coming very soon. And you were clowning him for not having coding taste Imagine the demos" @casper_hansen_ on X 2025-07-22 19:54:03 UTC 8808 followers, 74.9K engagements
"@VoyageAI would love to share this but your HF upload is empty and doesn't have an open-source license" @casper_hansen_ on X 2025-07-24 08:24:08 UTC 8807 followers, XXX engagements
"be Sam Altman asked what keeps him up at night about AI X scary categories X. Bad guy gets superintelligence first designs a bioweapon take down the United States power grid break into the financial system and take everyone's money X. Sci-fi loss of control The AI is like oh I don't actually want you to turn me off. I'm afraid I can't do that X. Accidental takeover the models kind of accidentally take over the world they just become so ingrained in society Theres young people who just say like I can't make any decision in my life without telling ChatGPT everything that's going on We're so" @casper_hansen_ on X 2025-07-22 19:41:51 UTC 8807 followers, 97K engagements
"if you loved kimi k2 you will love what a certain chinese team is about to release which is highly competitive with 1M context length" @casper_hansen_ on X 2025-07-22 16:35:32 UTC 8809 followers, 95.9K engagements
"Google just deactivated my final ad-blocker that worked. So now its really time to move browser - what are people using these days" @casper_hansen_ on X 2025-07-21 12:37:07 UTC 8767 followers, 3752 engagements
"Whenever someone says DeepSeek this song runs in my head but with "I follow you Deep Seek baby"" @casper_hansen_ on X 2025-07-22 17:34:21 UTC 8807 followers, 6243 engagements
"a bunch of tricks to fool AI in a gotcha is not a good eval and not something you should hill climb. HLE might be detrimental to true progress given the low quality" @casper_hansen_ on X 2025-07-24 15:57:00 UTC 8807 followers, XXX engagements