DeepSeek [@deepseek_ai](/creator/twitter/deepseek_ai) on X · 971.4K followers
Created: 2025-02-26 01:00:48 UTC
🚀 Day X of #OpenSourceWeek: DeepGEMM
Introducing DeepGEMM - an FP8 GEMM library that supports both dense and MoE GEMMs, powering V3/R1 training and inference.
⚡ Up to 1350+ FP8 TFLOPS on Hopper GPUs
✅ No heavy dependency, as clean as a tutorial
✅ Fully Just-In-Time compiled
✅ Core logic at ~300 lines - yet outperforms expert-tuned kernels across most matrix sizes
✅ Supports dense layout and two MoE layouts
🔗 GitHub:
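The features above describe an FP8 GEMM with fine-grained scaling: inputs are quantized block-by-block so the narrow FP8 dynamic range is used well, multiplied in low precision, and rescaled back during accumulation. The snippet below is a minimal conceptual sketch of that general idea in Python/NumPy, not DeepGEMM's actual API; the names `quantize_blockwise` and `scaled_gemm`, and the 128-wide block size, are illustrative assumptions, and the "FP8 cast" here only models the range clipping and per-block scales, not mantissa rounding.

```python
# Conceptual sketch (NOT DeepGEMM's API): what a per-block-scaled FP8-style GEMM does.
# Each 128-wide block along the K dimension gets its own scale so values fit the
# narrow FP8 range; partial products are rescaled by those factors when accumulated.
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3


def quantize_blockwise(x: np.ndarray, block: int = 128):
    """Quantize columns of x in blocks of `block`, one scale per row per block."""
    cols = x.shape[1]
    q = np.empty_like(x, dtype=np.float32)
    scales = np.empty((x.shape[0], (cols + block - 1) // block), dtype=np.float32)
    for j in range(0, cols, block):
        blk = x[:, j:j + block]
        s = np.abs(blk).max(axis=1, keepdims=True) / FP8_E4M3_MAX
        s = np.maximum(s, 1e-12)  # avoid division by zero for all-zero blocks
        # Simulated FP8 cast: clip to the representable range after scaling.
        # (A real FP8 kernel would also round the mantissa; this sketch does not.)
        q[:, j:j + block] = np.clip(blk / s, -FP8_E4M3_MAX, FP8_E4M3_MAX)
        scales[:, j // block] = s[:, 0]
    return q, scales


def scaled_gemm(a: np.ndarray, b: np.ndarray, block: int = 128) -> np.ndarray:
    """C = A @ B computed block-by-block on quantized values with per-block rescaling."""
    qa, sa = quantize_blockwise(a, block)
    qb, sb = quantize_blockwise(b.T, block)  # quantize B along its K dimension
    m, k = a.shape
    n = b.shape[1]
    out = np.zeros((m, n), dtype=np.float32)
    for j in range(0, k, block):
        blk = slice(j, j + block)
        partial = qa[:, blk] @ qb[:, blk].T                        # quantized partial GEMM
        out += partial * sa[:, [j // block]] * sb[:, j // block]   # rescale and accumulate
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((64, 256)).astype(np.float32)
    B = rng.standard_normal((256, 32)).astype(np.float32)
    err = np.abs(scaled_gemm(A, B) - A @ B).max()
    print(f"max abs error vs. FP32 reference: {err:.3e}")
```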
XXXXXXX engagements
[Post Link](https://x.com/deepseek_ai/status/1894553164235640933)