
[@zexigh](/creator/twitter/zexigh)
"Im building my own inference engine in Rust and AVX512 (huge thanks to @rasbt & @karpathy 🙏). At first I thought it was just my bad config poor top-k/top-p/temperature choices But nope. Ive hit the same loop on @llamacpp @ollama (qwen) and even Grok X in thinking mode 😅"  
![@zexigh Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::580146594.png) [@zexigh](/creator/x/zexigh) on [X](/post/tweet/1947730274315337968) 2025-07-22 18:47:39 UTC XXX followers, XX engagements
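The "loop" here is the classic repetition failure mode, and the first suspect in the post is the standard temperature / top-k / top-p sampling chain. Below is a minimal Rust sketch of that chain for readers unfamiliar with it; the function name, argument order, and cutoff handling are illustrative assumptions, not the author's engine code.

```rust
// Minimal temperature / top-k / top-p sampling sketch (illustrative, not @zexigh's code).
// `rng_uniform` is a uniform draw in [0, 1) supplied by the caller so the
// example stays dependency-free.
fn sample(logits: &[f32], temperature: f32, top_k: usize, top_p: f32, rng_uniform: f32) -> usize {
    // 1. Temperature: scale logits before softmax.
    let scaled: Vec<f32> = logits.iter().map(|&l| l / temperature).collect();

    // 2. Softmax over the scaled logits (max-subtracted for stability).
    let max = scaled.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = scaled.iter().map(|&l| (l - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    let probs: Vec<f32> = exps.iter().map(|&e| e / sum).collect();

    // 3. Sort token ids by probability, descending.
    let mut idx: Vec<usize> = (0..probs.len()).collect();
    idx.sort_unstable_by(|&a, &b| probs[b].partial_cmp(&probs[a]).unwrap());

    // 4. Top-k: keep only the k most likely tokens.
    idx.truncate(top_k.max(1));

    // 5. Top-p (nucleus): keep the smallest prefix whose mass reaches top_p.
    let mut kept = Vec::new();
    let mut cum = 0.0f32;
    for &i in &idx {
        kept.push(i);
        cum += probs[i];
        if cum >= top_p {
            break;
        }
    }

    // 6. Renormalize the kept tokens and sample one of them.
    let mass: f32 = kept.iter().map(|&i| probs[i]).sum();
    let mut acc = 0.0f32;
    for &i in &kept {
        acc += probs[i] / mass;
        if rng_uniform < acc {
            return i;
        }
    }
    *kept.last().unwrap()
}
```

The post's point is that reasonable settings in this chain did not prevent the loop, and the same behavior reproduced on llama.cpp, Ollama (qwen), and Grok, which is what pushes the investigation further upstream.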


"Update: Someone pointed out penalties should catch this loop. But oops Ollama/Qwen forces repeat_penalty to XXX Why Speed Doubt itwindows only XX tokens negligible vs model time. Logits incoming #AI #LLM #Debugging #72HourDays Still I had same issue with Grok3"  
![@zexigh Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::580146594.png) [@zexigh](/creator/x/zexigh) on [X](/post/tweet/1947923491828994189) 2025-07-23 07:35:25 UTC XXX followers, XX engagements
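For context, the repeat_penalty the post refers to is the llama.cpp-style pass that rescales the logits of every token seen in a recent window before sampling. A hedged Rust sketch follows; the function name and signature are illustrative, not Ollama's code, and the forced default value is elided in the post. It also makes the cost argument concrete: the pass touches only `window` logits per step, which is indeed negligible next to the model's forward pass.

```rust
// Sketch of a llama.cpp-style repeat_penalty pass (illustrative names/signature).
// Every token that appears in the last `window` tokens gets its logit pushed
// down, so the sampler is less likely to pick it again.
fn apply_repeat_penalty(logits: &mut [f32], recent_tokens: &[u32], window: usize, penalty: f32) {
    let start = recent_tokens.len().saturating_sub(window);
    for &tok in &recent_tokens[start..] {
        let l = &mut logits[tok as usize];
        // Positive logits are divided, negative ones multiplied, so the
        // penalty always lowers the token's probability (for penalty > 1.0).
        if *l > 0.0 {
            *l /= penalty;
        } else {
            *l *= penalty;
        }
    }
}
```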


"🧠 Did you know GPT Tokenizer doesn't treat "space" as a normal char: ' ' (space)  (U+0120) 'n'  (U+010A) 't'  (U+010B) 'r'  (U+0108) Because it uses byte-level BPE and maps raw bytes to uncommon Unicode chars for readability & reversibility"  
![@zexigh Avatar](https://lunarcrush.com/gi/w:16/cr:twitter::580146594.png) [@zexigh](/creator/x/zexigh) on [X](/post/tweet/1946460242084036695) 2025-07-19 06:40:59 UTC XXX followers, XX engagements
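The mapping in question is GPT-2's byte-level BPE `bytes_to_unicode` table: bytes outside the "printable" ranges are shifted into the U+0100.. block so every raw byte has a visible, reversible character. A small Rust reconstruction of that table, as an illustrative re-implementation rather than the tokenizer's original Python:

```rust
// Reconstruction of GPT-2's bytes_to_unicode table (illustrative Rust port).
fn bytes_to_unicode() -> [char; 256] {
    let mut table = ['\0'; 256];
    let mut n = 0u32;
    for b in 0u32..256 {
        // "Printable" bytes keep their own code point...
        let printable = (0x21..=0x7E).contains(&b)
            || (0xA1..=0xAC).contains(&b)
            || (0xAE..=0xFF).contains(&b);
        table[b as usize] = if printable {
            char::from_u32(b).unwrap()
        } else {
            // ...everything else (space, \n, \t, control bytes) is shifted into
            // the U+0100.. range, counted in byte order.
            let c = char::from_u32(0x100 + n).unwrap();
            n += 1;
            c
        };
    }
    table
}

fn main() {
    let table = bytes_to_unicode();
    // Space (0x20) comes out as 'Ġ' (U+0120), newline (0x0A) as 'Ċ' (U+010A).
    println!("space -> {:?} (U+{:04X})", table[0x20], table[0x20] as u32);
    println!("\\n    -> {:?} (U+{:04X})", table[0x0A], table[0x0A] as u32);
}
```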
