LunarCrush LLM | post/tweet::1949356861809213522

[GUEST ACCESS MODE: Data is scrambled or limited to provide examples. Make requests using your API key to unlock full data. Check https://lunarcrush.ai/auth for authentication information.]

Avi Chawla [@_avichawla](/creator/twitter/_avichawla) on X, 42.3K followers
Created: 2025-07-27 06:31:07 UTC

That said, KV cache also takes a lot of memory.

Llama3-70B has:
- total layers = XX
- hidden size = 8k
- max output size = 4k

Here:
- Every token takes up ~2.5 MB in KV cache.
- 4k tokens will take up XXXX GB.

More users → more memory.
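
The arithmetic behind these figures can be sketched as follows. This is a rough estimate and not the author's own calculation. It assumes 80 layers (the layer count is masked in this copy of the post; 80 is the publicly documented Llama 3 70B value), a hidden size of 8192, 16-bit (2-byte) cache entries, and one full-width key vector plus one full-width value vector per layer. The function names and the `num_users` parameter are hypothetical and used only for illustration.

```python
def kv_cache_bytes_per_token(num_layers: int, hidden_size: int,
                             bytes_per_value: int = 2) -> int:
    """Bytes of KV cache that a single token occupies.

    Each layer stores one key vector and one value vector, each of width
    hidden_size, hence the leading factor of 2.
    """
    return 2 * num_layers * hidden_size * bytes_per_value


def kv_cache_bytes(num_tokens: int, num_layers: int, hidden_size: int,
                   bytes_per_value: int = 2, num_users: int = 1) -> int:
    """Total KV cache for num_users concurrent sequences of num_tokens each."""
    per_token = kv_cache_bytes_per_token(num_layers, hidden_size, bytes_per_value)
    return per_token * num_tokens * num_users


# Assumed configuration (see the note above; the layer count is not from the post).
NUM_LAYERS = 80
HIDDEN_SIZE = 8192
MAX_TOKENS = 4096

per_token = kv_cache_bytes_per_token(NUM_LAYERS, HIDDEN_SIZE)
print(f"{per_token / 2**20:.2f} MiB per token")

# Memory grows linearly with the number of concurrent users.
for users in (1, 4, 16):
    total = kv_cache_bytes(MAX_TOKENS, NUM_LAYERS, HIDDEN_SIZE, num_users=users)
    print(f"{users:>2} user(s): {total / 2**30:.2f} GiB")
```

Under these assumptions the per-token figure comes out at 2.5 MiB, in line with the "~2.5 MB" quoted in the post. Models that use grouped-query attention keep fewer key/value heads than query heads, so their real cache footprint would be smaller than this full-width estimate.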

I'll cover KV optimization soon.

![](https://pbs.twimg.com/amplify_video_thumb/1949356808419844096/img/qW27RmFiJnW4PrBU.jpg)

XXXXX engagements

![Engagements Line Chart](https://lunarcrush.com/gi/w:600/p:tweet::1949356861809213522/c:line.svg)

**Related Topics**
[token](/topic/token)
[$avijo](/topic/$avijo)

[Post Link](https://x.com/_avichawla/status/1949356861809213522)
