[Guest access mode: data is scrambled or limited to provide examples.]

@exolabs EXO Labs

EXO Labs posts on X about $2413t and "over the" the most. They currently have XXXXXX followers and XX posts still getting attention, totaling XXXXXXX engagements in the last XX hours.

Engagements: XXXXXXX
Mentions: X
Followers: XXXXXX
CreatorRank: XXXXXXX

Social Influence

Social category influence: stocks, technology, brands

Social topic influence: $2413t (#14), "over the"

Top Social Posts


Top posts by engagements in the last XX hours

"We can run these two stages on different devices: Prefill: DGX Spark (high compute device 4x compute) Decode: M3 Ultra (high memory-bandwidth device 3x memory-bandwidth) However now we need to transfer the KV cache over the network (10GbE). This introduces a delay"
X Link @exolabs 2025-10-15T18:18Z 37.6K followers, 4822 engagements
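To make the transfer delay concrete, here is a back-of-the-envelope sketch. The post does not name the model, so the model shape (a Llama-3-8B-style layout: 32 layers, 8 KV heads, head dimension 128, fp16 cache) and the 4096-token prompt are assumptions:

```python
# Back-of-the-envelope estimate of the KV-cache transfer delay when
# prefill and decode run on different machines linked by 10GbE.
# The model shape below is an assumption (Llama-3-8B-like); the post
# does not specify which model was benchmarked.

N_LAYERS = 32        # transformer layers (assumed)
N_KV_HEADS = 8       # KV heads with GQA (assumed)
HEAD_DIM = 128       # dimension per head (assumed)
BYTES_PER_ELEM = 2   # fp16 cache entries

def kv_cache_bytes(n_tokens: int) -> int:
    """Bytes of K and V cached across all layers for n_tokens."""
    return 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * BYTES_PER_ELEM * n_tokens

LINK_BYTES_PER_S = 10e9 / 8  # 10GbE ~ 1.25 GB/s, ignoring protocol overhead

prompt_tokens = 4096
size = kv_cache_bytes(prompt_tokens)
print(f"KV cache: {size / 2**20:.0f} MiB")
print(f"Transfer over 10GbE: {size / LINK_BYTES_PER_S:.2f} s")
```

At roughly 128 KiB of cache per token under these assumptions, a 4096-token prompt yields about 512 MiB, or around 0.4 s on a saturated 10GbE link.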

"LLM inference consists of a prefill and decode stage. Prefill processes the prompt building a KV cache. Its compute-bound so gets faster with more FLOPS. Decode reads the KV cache and generates tokens one by one. Its memory-bound so gets faster with more memory bandwidth"
X Link @exolabs 2025-10-15T18:18Z 37.6K followers, 5649 engagements
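A roofline-style sketch makes the two regimes concrete. The memory bandwidths below are from the posts, but the fp16 TFLOPS figures are scrambled in guest mode, so the values used here are rough assumptions, as is the 8B-parameter fp16 model:

```python
# Roofline-style sketch of why prefill scales with FLOPS and decode with
# memory bandwidth. The 8B-parameter fp16 model is an assumption, and the
# per-device TFLOPS values are assumed (scrambled in guest mode); only the
# memory bandwidths come from the posts.

PARAMS = 8e9                  # assumed model size
BYTES_PER_PARAM = 2           # fp16 weights
FLOPS_PER_TOKEN = 2 * PARAMS  # ~2 FLOPs per parameter per token

def prefill_seconds(prompt_tokens: int, flops: float) -> float:
    """Compute-bound: prompt tokens are processed in parallel, so
    time scales with total FLOPs over device FLOPS."""
    return prompt_tokens * FLOPS_PER_TOKEN / flops

def decode_tokens_per_second(mem_bw: float) -> float:
    """Memory-bound: each generated token re-reads every weight once,
    so throughput scales with memory bandwidth."""
    return mem_bw / (PARAMS * BYTES_PER_PARAM)

devices = {
    "DGX Spark": {"flops": 100e12, "mem_bw": 273e9},  # TFLOPS assumed
    "M3 Ultra":  {"flops": 27e12,  "mem_bw": 819e9},  # TFLOPS assumed
}

for name, d in devices.items():
    print(f"{name}: prefill(4096 tok) = "
          f"{prefill_seconds(4096, d['flops']):.2f} s, "
          f"decode = {decode_tokens_per_second(d['mem_bw']):.1f} tok/s")
```

With these assumed numbers, the DGX Spark finishes prefill several times faster while the M3 Ultra decodes roughly 3x more tokens per second, which is exactly the asymmetry the post below exploits.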

"Clustering NVIDIA DGX Spark + M3 Ultra Mac Studio for 4x faster LLM inference. DGX Spark: 128GB @ 273GB/s XXX TFLOPS (fp16) $3999 M3 Ultra: 256GB @ 819GB/s XX TFLOPS (fp16) $5599 The DGX Spark has 3x less memory bandwidth than the M3 Ultra but 4x more FLOPS. By running compute-bound prefill on the DGX Spark memory-bound decode on the M3 Ultra and streaming the KV cache over 10GbE we are able to get the best of both hardware with massive speedups. Short explanation in this thread & link to full blog post below"
X Link @exolabs 2025-10-15T18:17Z 37.6K followers, 137.5K engagements
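Putting the pieces together, here is a sketch comparing each device alone against the split pipeline, under the same assumptions as above (8B fp16 model, assumed TFLOPS figures, assumed token counts). It models the KV transfer as a serial step; EXO streams the cache during prefill, which hides most of this delay:

```python
# End-to-end sketch: each device alone vs. the split pipeline (prefill on
# the DGX Spark, decode on the M3 Ultra, KV cache shipped over 10GbE).
# Model size, token counts, and both FLOPS figures are assumptions; the
# memory bandwidths and the 10GbE link come from the posts.

PARAMS, BYTES_PER_PARAM = 8e9, 2          # assumed 8B fp16 model
PROMPT_TOKENS, OUTPUT_TOKENS = 4096, 512  # assumed workload
KV_BYTES = 128 * 1024 * PROMPT_TOKENS     # ~128 KiB/token for this model
LINK_BYTES_PER_S = 10e9 / 8               # 10GbE ~ 1.25 GB/s

def prefill_s(flops: float) -> float:
    return PROMPT_TOKENS * 2 * PARAMS / flops

def decode_s(mem_bw: float) -> float:
    return OUTPUT_TOKENS * PARAMS * BYTES_PER_PARAM / mem_bw

SPARK_FLOPS, SPARK_BW = 100e12, 273e9     # FLOPS assumed
ULTRA_FLOPS, ULTRA_BW = 27e12, 819e9      # FLOPS assumed

timings = {
    "DGX Spark alone": prefill_s(SPARK_FLOPS) + decode_s(SPARK_BW),
    "M3 Ultra alone":  prefill_s(ULTRA_FLOPS) + decode_s(ULTRA_BW),
    "Split pipeline":  prefill_s(SPARK_FLOPS)
                       + KV_BYTES / LINK_BYTES_PER_S
                       + decode_s(ULTRA_BW),
}
for name, t in timings.items():
    print(f"{name:16s}: {t:6.2f} s")
```

The realized speedup depends heavily on prompt and output lengths and on how well the streamed transfer overlaps prefill; the 4x headline figure is EXO's own benchmark result, detailed in the linked blog post.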