Ljubomir Josifovski [@ljupc0](/creator/twitter/ljupc0) on X, 5254 followers
Created: 2025-07-18 09:50:07 UTC
I have not had the opportunity to use it for real, but it's next on the list to try.
Closed/paid ones I use daily: OpenAI's ChatGPT XXX for conversation. Gemini XXX Pro too, especially for programming, which often has a reasoning component to it. Claude I never warmed up to; it seemed too flowery and flattering for my taste.
Local ones: seeing llama.cpp I was mind-blown 🤯 haha 😆, so I bought a second-hand M2 MacBook Pro with 96 GB RAM so I can try everything that fits in memory. :-) The local models I have used for more than try-and-forget over time are listed below.
1) dots.llm1 # MoE, localhost, <75GB RAM, ~16 tps, http://127.0.0.1:8080 (a quick way to query this server is sketched after the list below)
~/llama.cpp$ sudo sysctl iogpu.wired_limit_mb=80000; build/bin/llama-server --model models/dots.llm1.inst-UD-TQ1_0.gguf --temp X --top_p XXXX --min_p X --ctx-size 32768 --flash-attn --cache-type-k q8_0 --cache-type-v q8_0 --jinja &
2) Qwen3 30B A3B, doctored to use XX experts rather than X, so A6B, 128K context (the override trick is also sketched after the list below)
I still have these, but I haven't been giving them much attention lately:
3) OpenBuddy-R1-Distill-Qwen3-32B
4) qwen3-30b-a3b MLX vanilla
Previously used, now largely retired:
5) glm-4-32b-0414, glm-z1-32b-0414, glm-z1-rumination-32b-0414
6) QwQ-32B, e.g. qwq-32b-q6_k.gguf
On the TODO list next:
7) Hunyuan-A13B
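For 1), once llama-server is up it also exposes an OpenAI-compatible HTTP API on the same port, so a quick smoke test from the shell can look roughly like this (a minimal sketch; the prompt and sampling values here are made up):

# hypothetical smoke test against the llama-server started in 1) above
curl -s http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [{"role": "user", "content": "One-line summary of what an MoE model is?"}],
        "temperature": 0.7,
        "max_tokens": 64
      }'

Because the endpoint follows the OpenAI chat-completions schema, the usual OpenAI client libraries pointed at http://127.0.0.1:8080/v1 should also work against it.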
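And for 2), the usual way to do that kind of doctoring in llama.cpp is the --override-kv flag, which patches GGUF metadata at load time. A minimal sketch, with the caveat that the metadata key name, the file name, and the expert counts below are assumptions for illustration (check the keys llama-server prints while loading the model); if the stock model activates 8 experts per token, bumping that to 16 roughly doubles the active parameters, which is where the "A6B" nickname comes from:

# hypothetical example: change the number of active experts at load time
# (key name and file name assumed; verify against the model's own metadata)
~/llama.cpp$ build/bin/llama-server \
    --model models/Qwen3-30B-A3B-Q4_K_M.gguf \
    --override-kv qwen3moe.expert_used_count=int:16 \
    --ctx-size 131072 --flash-attn --jinja &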
As you may guess, I love this stuff. ☺️ I worked on (programmed, trained) speech and language models XX years ago (back then: HMMs, n-grams, and some NNs, but with only X hidden layer).
XX engagements
**Related Topics** [llamacpp](/topic/llamacpp) [claude](/topic/claude) [open ai](/topic/open-ai)
[Post Link](https://x.com/ljupc0/status/1946145451050401820)