[GUEST ACCESS MODE: Data is scrambled or limited to provide examples. Make requests using your API key to unlock full data. Check https://lunarcrush.ai/auth for authentication information.]
Lucas Beyer (bl16) posts on X about llm, xai, spacex, and psa the most. They currently have XXXXXXX followers, and XXX of their posts are still getting attention, totaling XXXXXX engagements in the last XX hours.
Social category influence: technology, brands, stocks, finance, celebrities, currencies
Social topic influence: llm (#9), xai, spacex, psa, momentum, #ai, open ai, infra, token, mentioning
Top assets mentioned: Alphabet Inc Class A (GOOGL)
Top posts by engagements in the last XX hours
"ARCHITECTURE They are vague but mention "sparse moe" and having made (specifically)architecture improvements in general and in long-context and image input specifically" @giffmana on X 2025-06-17 20:58:49 UTC 103K followers, 20.3K engagements
"@ivanfioravanti @casper_hansen_ @Alibaba_Qwen I doubt it. If you fine tune on no thinking it will quickly adapt not to think" @giffmana on X 2025-07-21 19:37:14 UTC 103K followers, XXX engagements
"@tenderizzation @Norapom04 I have a maybe naive question: why go through all this pain (I see it's reverted) and massive amount of code instead of just using torch.compile of is about the same speed" @giffmana on X 2025-07-21 18:43:11 UTC 103K followers, 1777 engagements
"@agihippo You know about which is from some ex-colleagues right Old name NoseBrain" @giffmana on X 2025-06-15 21:19:45 UTC 102.9K followers, 2085 engagements
"TL;DR: Qwen series finetuned on 5M reasoning traces from DeepSeek R1 0528 671B i.e. hard distillation" @giffmana on X 2025-07-21 18:25:49 UTC 103K followers, 59.7K engagements
"AKA data augmentation. The numbers actually match my experience exactly. This is something i think LLM people will slowly rediscover from vision people. Not sure how they can write up the whole paper and not even once think of running the AR with augmentation or dropout" @giffmana on X 2025-07-22 18:46:32 UTC 103K followers, 72.9K engagements
"PSA: I'm getting these phishing emails almost daily now. Don't fall for it guys why do so many fall for it Just ignore it" @giffmana on X 2025-07-20 09:30:49 UTC 103K followers, 15.4K engagements
"@corbtt But since you go back to using llm as judge you're back to having to worry about reward hacking eventually. Though i guess that's always the case for non verifiable tasks" @giffmana on X 2025-07-11 19:50:38 UTC 103K followers, 4093 engagements
"@casper_hansen_ @Alibaba_Qwen Do you have a pointer or concrete example regarding that nightmare by chance I don't see it because i never used this" @giffmana on X 2025-07-21 19:09:10 UTC 103K followers, 1449 engagements
"This paper is pretty cool; through careful tuning they show: - you can train LLMs with batch-size as small as X just need smaller lr. - even plain SGD works at small batch. - Fancy optims mainly help at larger batch. (This reconciles discrepancy with past ResNet research.) - At small batch optim hparams are very insensitive I find this cool for two reasons: 1) When we did ScalingViT I also surprisingly found (but never published) that pure SGD works much better than expected. However a small gap always remained so we dropped it in favour of (our variant of) AdaFactor. The results here confirm" @giffmana on X 2025-07-10 19:00:01 UTC 103K followers, 105.1K engagements
"Definitely has nontrivial hint that differs per problem. Although they are still broad enough that you could imagine having a bench full of them and then if the verifier is good enough it's fine" @giffmana on X 2025-07-22 19:24:39 UTC 103K followers, 9988 engagements
"@ivanfioravanti @casper_hansen_ @Alibaba_Qwen Why Just use the template of the move you fine tune Or maybe even no template in my experience "mode switches" are trivially rewired during fine-tuning" @giffmana on X 2025-07-21 19:24:38 UTC 103K followers, XXX engagements
"@_xjdr How OpenAI likely got IMO gold and XX lessons this teaches us about b2b saas sales a 🧵" @giffmana on X 2025-07-19 19:45:58 UTC 103K followers, 9703 engagements
"@ivanfioravanti @casper_hansen_ @Alibaba_Qwen Did your fine tuning examples contain tuning blocks or not" @giffmana on X 2025-07-21 19:29:34 UTC 103K followers, XXX engagements
"@GoldMagikarp42 Why are you not Solid though Missed opportunity" @giffmana on X 2025-07-10 21:12:01 UTC 103K followers, XXX engagements
"After talking with the community and thinking it through we decided to stop using hybrid thinking mode. Is there a write up about this decision somewhere @Alibaba_Qwen But also curious about people's thoughts in general" @giffmana on X 2025-07-21 18:49:39 UTC 103K followers, 42.8K engagements
"No this argument is wrong in programming because there is no death. What people fail to do is compare the time saved debugging thanks to typechecks revealing something vs the time wasted meta-programming the type-checker. My claim is the savings are nowhere near. And it's not a skill issue i grew up in typed languages and meta-programming" @giffmana on X 2025-06-29 07:29:10 UTC 103K followers, XX engagements
"@shaneguML Why There have been two very good open source models recently and supposedly another one coming soon. And you seem to have forgotten about gemma too" @giffmana on X 2025-07-14 21:55:16 UTC 102.8K followers, 11.2K engagements
"OPTIMIZATION specifically mention stability signal propagation and optimization as three things that improved. And distillation for the smaller models mentioning storing teacher logits as only "k" logits per token. I think this implies offline distillation and hence teacher-forcing (suboptimal but easier infra)" @giffmana on X 2025-06-17 20:58:50 UTC 103K followers, 12.2K engagements
"@shawshank_v @abursuc @y_m_asano @v_pariza @MrzSalehi @SpyrosGidaris @LukasKnobel1 @EliasRamzi27714 @valeoai @FunAILab I believe these statements are contradicting each other do you mind clarifying I cannot make the math work out also not when using 613M as the number of examples. I must be missing something" @giffmana on X 2025-07-21 18:08:01 UTC 103K followers, 2396 engagements
"HAHAHAHA yeah sure. Unrelated but Satya knows that I invented ConvNets right" @giffmana on X 2025-07-17 05:29:23 UTC 103K followers, 117.5K engagements
"@vikhyatk @stochasticchasm You have turned into a grayscale bro cc @y0b1byte new best friend" @giffmana on X 2025-07-20 09:39:32 UTC 102.9K followers, XXX engagements