[GUEST ACCESS MODE: Data is scrambled or limited to provide examples. Make requests using your API key to unlock full data. Check https://lunarcrush.ai/auth for authentication information.]
@rasbt Sebastian Raschka posts on X most often about the topics "the first", "scaling", "build a", and "see the". They currently have XXXXXXX followers and XXX posts still getting attention, totaling XXXXXXX engagements in the last XX hours.
Social category influence: fashion brands, stocks, technology brands
Social topic influence: the first, scaling #46, build a, see the, hm, solve, the book, if you, trade, instead of
Top assets mentioned: IBM (IBM)
Top posts by engagements in the last XX hours
"This interesting week started with DeepSeek V3.2 I just wrote up a technical tour of the predecessors and components that led up to this: 🔗 - Multi-Head Latent Attention - RLVR - Sparse Attention - Self-Verification - GRPO Updates"
X Link 2025-12-03T14:49Z 373K followers, 89.1K engagements
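As a quick illustration of one item on that list, here is a minimal sketch of the group-relative advantage at the heart of GRPO: rewards for a group of sampled responses to the same prompt are standardized against the group mean and standard deviation, replacing a learned value function. This is a simplified, assumption-laden sketch, not code from the linked article.

```python
# Minimal sketch of GRPO's group-relative advantage (simplified; not the article's code).
# Each sampled response to the same prompt gets a scalar reward; the advantage is the
# reward standardized within that group, so no separate value network is needed.
import torch

def grpo_advantages(rewards: torch.Tensor, eps: float = 1e-4) -> torch.Tensor:
    # rewards: shape (group_size,), one reward per sampled response
    return (rewards - rewards.mean()) / (rewards.std() + eps)

print(grpo_advantages(torch.tensor([1.0, 0.0, 0.0, 1.0, 1.0])))
# responses with reward 1 get positive advantages, the others negative
```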
"@soul_surfer78 You want to build a ChatGPT-like interface just for yourself or for serving customers If the former maybe check out my example here:"
X Link 2025-12-10T15:15Z 372.4K followers, XX engagements
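The example linked above isn't reproduced here, but for context, a personal ChatGPT-like interface can be only a few lines. The sketch below uses Gradio's ChatInterface with a stub reply function standing in for whatever local model or API client you would actually call; it is not the example from the post.

```python
# Minimal personal chat UI sketch (not the example linked in the post above).
# `reply` is a stub; replace it with a call to your local LLM or an API client.
import gradio as gr

def reply(message, history):
    # history holds prior (user, assistant) turns; a real implementation would pass it to the model
    return f"(stub) you said: {message}"

gr.ChatInterface(reply).launch()
```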
"@soul_surfer78 I see. That's a bigger topic since you also want to cost controls security scaling etc. @LightningAI may be a good option here. E.g. see the beginner tutorials via and"
X Link 2025-12-10T15:36Z 372.5K followers, XX engagements
"Hm I dont see the conflict. There are two things going on the reasoning behavior (the intermediate step-by-step explanations in LLMs to solve more complex problems) and then reasoning models which are LLMs specifically trained to emit such answers with intermediate steps. Does that help clarify"
X Link 2025-12-11T14:33Z 372.7K followers, XX engagements
"DeepSeek finally released a new model and paper. And because this DeepSeek-OCR release is a bit different from what everyone expected and DeepSeek releases are generally a big deal I wanted to do a brief explainer of what it is all about. In short they explore how vision encoders can improve the efficiency of LLMs in processing and compressing textual information. And the takeaway is that rendering text as images and feeding that to the model results in more efficient compression than working with text directly. My first intuition was that this sounds very inefficient and shouldn't work as"
X Link 2025-10-21T14:27Z 373K followers, 159.9K engagements
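To make the token accounting concrete, here is a hedged back-of-the-envelope sketch of the idea described above; the characters-per-token and compression numbers are illustrative assumptions, not figures taken from the DeepSeek-OCR paper.

```python
# Back-of-the-envelope sketch of the "text as image" compression idea.
# All numbers are illustrative assumptions, not values from the DeepSeek-OCR paper.
chars_on_page = 4000                 # a dense page of text
chars_per_text_token = 4             # rough BPE average
text_tokens = chars_on_page / chars_per_text_token   # ~1000 tokens via the usual text path

# Optical path: render the page as an image and let a vision encoder compress it
# into a much smaller number of vision tokens that the LLM consumes instead.
vision_tokens = 100                  # illustrative; the paper explores compression in this ballpark

print(f"text path: ~{text_tokens:.0f} tokens, optical path: ~{vision_tokens} tokens "
      f"(~{text_tokens / vision_tokens:.0f}x compression)")
```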
"Inference-scaling lets us trade extra compute for better modeling accuracy. Next to reinforcement learning it has become one of the most important concepts in today's LLMs so the book will cover it in two chapters instead of just one. I just finished the first one. It is a 35-page introduction to inference-time scaling through self-consistency sampling. This chapter was a lot of fun to write because it takes the base model on MATH-500 all the way from XXXX% percent to XXXX% accuracy. Seeing that jump without additional training is incredibly satisfying. Submitted the chapter yesterday and it"
X Link 2025-11-15T14:44Z 373K followers, 178.8K engagements
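Self-consistency sampling itself is simple to sketch: sample several completions at a nonzero temperature, parse out each final answer, and keep the majority answer. The `generate` and `extract_final_answer` callables below are hypothetical placeholders for the model call and answer parsing; this is not the chapter's code.

```python
# Sketch of self-consistency sampling (majority vote over sampled answers).
# `generate` and `extract_final_answer` are hypothetical placeholders.
from collections import Counter

def self_consistency(prompt, generate, extract_final_answer, n_samples=16, temperature=0.8):
    answers = []
    for _ in range(n_samples):
        completion = generate(prompt, temperature=temperature)   # stochastic decoding
        answers.append(extract_final_answer(completion))          # e.g., the boxed MATH-500 answer
    return Counter(answers).most_common(1)[0][0]                  # most frequent final answer wins
```

The accuracy gain comes purely from spending more inference compute: independent samples that agree on the same final answer are more often correct than a single greedy decode, which is the compute-for-accuracy trade the post describes.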
"If you are looking for something to read this upcoming weekend chapter X on inference-time scaling is available now 🔗"
X Link 2025-11-20T14:42Z 373K followers, 116.9K engagements
"My biennial update to the "Hello World"s of ML & AI: 2013: RandomForestClassifier on Iris 2015: XGBoost on Titanic 2017: MLPs on MNIST 2019: AlexNet on CIFAR-10 2021: DistilBERT on IMDb movie reviews 2023: Llama X with LoRA on Alpaca 50k 2025: Qwen3 with RLVR on MATH-500"
X Link 2025-12-08T18:56Z 373K followers, 129.5K engagements
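For reference, the 2013 entry from that list is still a one-screen program. A minimal scikit-learn sketch follows; the train/test split and hyperparameters are arbitrary illustrative choices, not from the post.

```python
# The 2013 "hello world": a RandomForestClassifier on Iris (scikit-learn).
# Split and hyperparameters are arbitrary illustrative choices.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

clf = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_train, y_train)
print(f"test accuracy: {clf.score(X_test, y_test):.2f}")
```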
"Hold on a sec Mistral X Large uses the DeepSeek V3 architecture including MLA Just went through the config files; the only difference I could see is that Mistral X Large used 2x fewer experts but made each expert 2x large"
X Link 2025-12-12T19:11Z 373K followers, 199.6K engagements
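The "2x fewer experts, each 2x larger" observation is easy to sanity-check with rough parameter arithmetic. The configuration numbers below are illustrative (loosely DeepSeek-V3-like), not taken from either model's actual config files.

```python
# Rough sanity check: halving the expert count while doubling each expert's width
# leaves the total expert parameter budget unchanged. Numbers are illustrative,
# not the real DeepSeek V3 or Mistral config values.
def moe_expert_params(num_experts, d_model, d_ff):
    # a SwiGLU-style expert has three weight matrices: gate, up, and down projections
    return num_experts * 3 * d_model * d_ff

a = moe_expert_params(num_experts=256, d_model=7168, d_ff=2048)   # many smaller experts
b = moe_expert_params(num_experts=128, d_model=7168, d_ff=4096)   # 2x fewer, 2x larger

print(a == b, f"~{a / 1e9:.1f}B expert parameters in both cases")
```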
"@_thomasip Yes but the larger labs usually always had some distinct difference or tweak not a straight up reuse"
X Link 2025-12-12T20:19Z 373K followers, 7215 engagements
"@_thomasip Ok fair but now Qwen3 IBM Granite XXX Olmo X Phi-4 all have unique tweaks"
X Link 2025-12-12T20:52Z 373K followers, 1015 engagements
"Just updated the Big LLM Architecture Comparison article. .it grew quite a bit since the initial version in July 2025 more than doubled"
X Link 2025-12-13T14:21Z 373K followers, 72.9K engagements