[GUEST ACCESS MODE: Data is scrambled or limited to provide examples. Make requests using your API key to unlock full data. Check https://lunarcrush.ai/auth for authentication information.]

![ljupc0 Avatar](https://lunarcrush.com/gi/w:24/cr:twitter::1223118409.png) Ljubomir Josifovski [@ljupc0](/creator/twitter/ljupc0) on x 5254 followers
Created: 2025-07-18 09:50:07 UTC

I have not had the opportunity to use it for real, but it's next on the list to try.

Closed/paying ones I use daily - OpenAI's ChatGPT XXX for conversation. Gemini XXX Pro too, especially for programming, which often has a reasoning component to it. Claude I never warmed up to - it seemed too flowery and flattering for my taste.

Local ones - seeing llama.cpp I was mind-blown 🤯 haha 😆 so I bought a 2nd-hand M2 MacBook Pro with 96GB RAM so I can try everything that fits in memory. :-) The local models I have used for more than a try&forget, over time, are listed below.

1) dots.llm1

```shell
# MoE, localhost, <75GB RAM, ~16 tps, http://127.0.0.1:8080
~/llama.cpp$ sudo sysctl iogpu.wired_limit_mb=80000; build/bin/llama-server --model models/dots.llm1.inst-UD-TQ1_0.gguf --temp X --top_p XXXX --min_p X --ctx-size 32768 --flash-attn --cache-type-k q8_0 --cache-type-v q8_0 --jinja &
```
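The `--cache-type-k/v q8_0` flags in that command quantize the KV cache to stretch memory at long context. A back-of-envelope sketch of why that matters (the layer/head numbers below are made up for illustration, not dots.llm1's real config, and q8_0 is treated as roughly half of f16, ignoring its small per-block scales):

```shell
# Rough KV-cache footprint for a 32K context window.
# Assumed (hypothetical) model dims -- substitute the real GGUF metadata.
layers=48; kv_heads=8; head_dim=128; ctx=32768
# 2 tensors (K and V) x layers x kv_heads x head_dim x context x 2 bytes (f16)
f16_mib=$(( 2 * layers * kv_heads * head_dim * ctx * 2 / 1024 / 1024 ))
# q8_0 stores ~1 byte per value, so roughly half the f16 footprint
q8_mib=$(( f16_mib / 2 ))
echo "f16 KV cache: ${f16_mib} MiB, q8_0: ~${q8_mib} MiB"
```

At these made-up dims that's ~6GB of f16 KV cache cut to ~3GB - real savings when squeezing a <75GB model into 96GB of unified memory.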

2) Qwen3 30B A3B, doctored to use XX experts rather than X (so A6B), 128K context
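This kind of "doctoring" can be done at load time in llama.cpp by overriding GGUF metadata rather than editing the model file; the key name and numbers below are assumptions for illustration (check the actual keys in your GGUF with a metadata dump):

```shell
# Sketch: raise the active-expert count at load time (numbers illustrative,
# metadata key name an assumption -- verify against your GGUF's metadata)
build/bin/llama-server \
  --model models/qwen3-30b-a3b.gguf \
  --override-kv qwen3moe.expert_used_count=int:16 \
  --ctx-size 131072 &
```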



Still have these, but not giving them much attention lately:

3) OpenBuddy-R1-Distill-Qwen3-32B


4) qwen3-30b-a3b MLX vanilla

Previously used, now largely retired:

5) glm-4-32b-0414, glm-z1-32b-0414, glm-z1-rumination-32b-0414

6) QwQ-32B e.g. qwq-32b-q6_k.gguf

On the TODO list next:

7) Hunyuan-A13B




As you may guess - I love this stuff. ☺️ I worked on (programmed, trained) speech and language models XX years ago - back then it was HMMs, N-grams, and some NNs, but with only X hidden layer.


XX engagements

![Engagements Line Chart](https://lunarcrush.com/gi/w:600/p:tweet::1946145451050401820/c:line.svg)

**Related Topics**
[llamacpp](/topic/llamacpp)
[claude](/topic/claude)
[open ai](/topic/open-ai)

[Post Link](https://x.com/ljupc0/status/1946145451050401820)
