Jukan (@Jukanlosreve) on X · 22.5K followers
Created: 2025-06-05 05:16:19 UTC
BofA Global Technology Conference – NVIDIA (June 4, 2025)
Speaker: Ian Buck, VP of Accelerated Computing at NVIDIA
⸻
1. The Inference Boom and DeepSeek
• DeepSeek-R1 is the first open-source reasoning-specialized model, with both its training and optimization processes fully disclosed.
• It sets a new standard for "reasoning models": it thinks through a question and double-checks its answer before producing output.
• Result: a 13x increase in token generation, triggering an explosion in inference demand → a massive surge in GPU demand and AI infrastructure profitability (see the cost sketch below).
• DeepSeek achieves higher accuracy than many commercial LLMs (AIME math benchmark: 89%), maximizing value relative to inference cost.
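A minimal sketch of why that multiplier matters economically: reasoning models emit "thinking" tokens before the final answer, so tokens served, and therefore serving cost, scale with the multiplier. The token count and per-token price below are hypothetical, for illustration only; the ~13x figure is the one cited in the talk.

```python
# Illustrative only: how a 13x token multiplier scales serving cost.
direct_tokens = 300          # assumed tokens for a plain one-shot answer
reasoning_multiplier = 13    # token-generation increase cited in the talk
price_per_1k_tokens = 0.002  # hypothetical serving price (USD)

for label, tokens in [("direct answer", direct_tokens),
                      ("reasoning model", direct_tokens * reasoning_multiplier)]:
    print(f"{label}: {tokens} tokens -> ${tokens / 1000 * price_per_1k_tokens:.4f}")
```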
⸻
2. Rise of AI Factories and Sovereign AI
• Over XXX AI factories (data centers) are currently under construction worldwide.
• Taiwan is building an AI factory equipped with XXXXXX Blackwell GPUs for domestic industrial use.
• Japan, Germany, and the UK are also expanding investment in sovereign AI infrastructure built on their own national datasets.
• This is not just overseas capital spending but a genuine rise in global demand, especially for national HPC + AI projects.
⸻
3. Model Size vs. Inference Optimization
• 10B–1T parameter models are now the norm; 10T+ scale models are expected in the future.
• As models grow larger, execution efficiency becomes critical:
  • Mixture of Experts (MoE) activates only the parameters relevant to each token, for maximum efficiency (see the sketch after this list).
  • Knowledge distillation shrinks large models into smaller, personalized ones to serve a broader range of needs.
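To make the MoE idea concrete, here is a minimal top-k routing layer in PyTorch. It is a sketch of the general technique, not DeepSeek's or NVIDIA's implementation; all names and sizes are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Minimal mixture-of-experts layer: a learned gate routes each token to
    k of n experts, so only a fraction of the parameters runs per token."""
    def __init__(self, dim: int, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(dim, n_experts)  # router: token -> expert scores
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, dim)
        weights, idx = self.gate(x).topk(self.k, dim=-1)  # top-k experts per token
        weights = F.softmax(weights, dim=-1)              # normalize the k gate weights
        out = torch.zeros_like(x)
        for slot in range(self.k):                        # simple loops for clarity, not speed
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                  # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

moe = TopKMoE(dim=64)
print(moe(torch.randn(16, 64)).shape)  # torch.Size([16, 64]); only 2 of 8 experts ran per token
```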
⸻
4. NVIDIA's Competitive Edge in Inference
• Inference demands far more complex optimization than training:
  • Numerical precision formats (FP32, FP16, FP8, FP4, etc.); each step down in precision cuts memory and bandwidth per parameter (see the sketch after this list).
  • Multi-GPU scaling, inter-node networking, and diverse workload support.
• GPU-powered AI factories beat ASICs on flexibility and long-term ROI:
  • AI factories operate for 5+ years and must run a variety of models.
  • ASICs are optimized for fixed workloads but lack generality.
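A back-of-envelope sketch of why precision formats matter so much for inference: the same weights occupy half the memory at each precision step down, which directly changes how many GPUs a deployment needs. Weights only; KV cache and activations are ignored, and the 70B size is just an example.

```python
# Rough weight-memory footprint of a model at different numeric precisions.
BYTES_PER_PARAM = {"FP32": 4.0, "FP16": 2.0, "FP8": 1.0, "FP4": 0.5}

def weight_gb(params_billions: float, fmt: str) -> float:
    """Gigabytes needed to hold the weights alone at the given precision."""
    return params_billions * BYTES_PER_PARAM[fmt]  # 1e9 params * bytes / 1e9 bytes-per-GB

for fmt in BYTES_PER_PARAM:
    print(f"70B model @ {fmt}: {weight_gb(70, fmt):6.1f} GB")
# FP32: 280.0 GB ... FP4: 35.0 GB — each precision step halves the footprint.
```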
⸻
5. ASIC Growth Potential and Limitations
• Chips themselves are a small portion of total AI infrastructure cost; end-to-end optimization (interconnects, cooling, integration) is what matters.
• For ASICs to grow, they need strongly differentiated efficiency and ecosystem traction in niche markets.
• NVIDIA concedes that "not all AI models need to run on GPUs," but stays focused on large-scale training and inference clusters.
⸻
6. Growth Constraints
• Power supply shortages: AI factory power requirements are reaching gigawatt levels (see the sketch after this list).
• Enterprise adoption speed: the key is how quickly Fortune XXX companies can integrate AI into actual business workflows.
• The pace of standardization in AI software and infrastructure is another variable.
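To give the gigawatt figure some scale, a back-of-envelope count of GPU racks per site. Every number here is an assumption for illustration (dense liquid-cooled racks are commonly cited in the roughly 100–130 kW range), not a figure from the talk.

```python
# Back-of-envelope: GPU racks a 1 GW AI factory could power.
SITE_POWER_KW = 1_000_000  # 1 gigawatt facility (assumed)
RACK_POWER_KW = 120        # assumed draw of one dense GPU rack, compute + networking
PUE = 1.3                  # assumed overhead multiplier for cooling and power delivery

it_power_kw = SITE_POWER_KW / PUE  # power left for IT load after overhead
racks = it_power_kw / RACK_POWER_KW
print(f"~{racks:,.0f} racks")      # ≈ 6,400 racks per gigawatt
```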
⸻
7. Software Monetization Strategy
• NVIDIA maintains an open platform approach with CUDA, PyTorch, and Hugging Face model execution (see the sketch after this list).
• However, it is increasingly monetizing software through:
  • Direct enterprise collaboration
  • Datacenter software like Lepton
  • Support for Nemotron models
• Enterprise customers want direct support from NVIDIA, and demand for paid services is rising.
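As a concrete illustration of the open-platform point, an open Nemotron checkpoint can be served through the stock Hugging Face pipeline API. The model id below is an assumption (a published NVIDIA checkpoint on the Hub); a 70B model needs several GPUs, so substitute a smaller variant to experiment.

```python
# Hedged sketch: running an open NVIDIA Nemotron model via Hugging Face.
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="nvidia/Llama-3.1-Nemotron-70B-Instruct-HF",  # assumed Hub model id
    device_map="auto",  # shard across available GPUs
)
out = pipe("Why is AI inference demand growing?", max_new_tokens=128)
print(out[0]["generated_text"])
```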
⸻
Summary:
• DeepSeek marks a turning point in the explosion of AI inference demand, fueling a global boom in sovereign AI factory construction.
• NVIDIA is reinforcing its leadership with a flexible, high-performance inference platform.
• Key challenges ahead: power constraints, enterprise adoption speed, and software standardization.
Upcoming Events:
• ISC High Performance 2025 (June 10–13, Berlin)
• GTC Developer Summit (June 11–12, Paris)
Expect further announcements on AI factories and sovereign AI at these events.
Post Link: https://x.com/Jukanlosreve/status/1930493868795134400