Jukan (@Jukanlosreve) on X · 22.5K followers
Created: 2025-06-05 05:16:19 UTC
BofA Global Technology Conference – NVIDIA (June 4, 2025)
Speaker: Ian Buck, VP of Accelerated Computing at NVIDIA
⸻
1. The Inference Boom and DeepSeek
• DeepSeek-R1 is the first open-source reasoning-specialized model, with both its training and optimization processes fully disclosed.
• It sets a new standard for "reasoning models": it thinks through a question and double-checks its answer before producing output.
• Result: a 13x increase in token generation, triggering an explosion in inference demand → a massive surge in GPU demand and AI infrastructure profitability (see the cost sketch below).
• DeepSeek achieves higher accuracy than many commercial LLMs (AIME math benchmark: 89%), maximizing value relative to inference cost.
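A minimal sketch of why that multiplier matters economically: reasoning models emit "thinking" tokens before the final answer, so tokens served, and therefore serving cost, scale with the multiplier. The token count and per-token price below are hypothetical, for illustration only; the ~13x figure is the one cited in the talk.

```python
# Illustrative only: how a 13x token multiplier scales serving cost.
direct_tokens = 300          # assumed tokens for a plain one-shot answer
reasoning_multiplier = 13    # token-generation increase cited in the talk
price_per_1k_tokens = 0.002  # hypothetical serving price (USD)

for label, tokens in [("direct answer", direct_tokens),
                      ("reasoning model", direct_tokens * reasoning_multiplier)]:
    print(f"{label}: {tokens} tokens -> ${tokens / 1000 * price_per_1k_tokens:.4f}")
```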
⸻
2. Rise of AI Factories and Sovereign AI
• Over XXX AI factories (data centers) are currently under construction worldwide.
• Taiwan is building an AI factory equipped with XXXXXX Blackwell GPUs for domestic industrial use.
• Japan, Germany, and the UK are also expanding investment in sovereign AI infrastructure built on their own national datasets.
• This is not just overseas capital spending but a genuine rise in global demand, especially for national HPC + AI projects.
⸻
3. Model Size vs. Inference Optimization
• 10B–1T parameter models are now the norm; 10T+ scale models are expected in the future.
• As models grow larger, execution efficiency becomes critical:
  • Mixture of Experts (MoE) activates only the parameters relevant to each token, for maximum efficiency (see the sketch after this list).
  • Knowledge distillation shrinks large models into smaller, personalized ones to serve a broader range of needs.
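To make the MoE idea concrete, here is a minimal top-k routing layer in PyTorch. It is a sketch of the general technique, not DeepSeek's or NVIDIA's implementation; all names and sizes are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Minimal mixture-of-experts layer: a learned gate routes each token to
    k of n experts, so only a fraction of the parameters runs per token."""
    def __init__(self, dim: int, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(dim, n_experts)  # router: token -> expert scores
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, dim)
        weights, idx = self.gate(x).topk(self.k, dim=-1)  # top-k experts per token
        weights = F.softmax(weights, dim=-1)              # normalize the k gate weights
        out = torch.zeros_like(x)
        for slot in range(self.k):                        # simple loops for clarity, not speed
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                  # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

moe = TopKMoE(dim=64)
print(moe(torch.randn(16, 64)).shape)  # torch.Size([16, 64]); only 2 of 8 experts ran per token
```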
⸻
4. NVIDIA's Competitive Edge in Inference
• Inference demands far more complex optimization than training:
  • Numerical precision formats (FP32, FP16, FP8, FP4, etc.); each step down in precision cuts memory and bandwidth per parameter (see the sketch after this list).
  • Multi-GPU scaling, inter-node networking, and diverse workload support.
• GPU-powered AI factories beat ASICs on flexibility and long-term ROI:
  • AI factories operate for 5+ years and must run a variety of models.
  • ASICs are optimized for fixed workloads but lack generality.
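A back-of-envelope sketch of why precision formats matter so much for inference: the same weights occupy half the memory at each precision step down, which directly changes how many GPUs a deployment needs. Weights only; KV cache and activations are ignored, and the 70B size is just an example.

```python
# Rough weight-memory footprint of a model at different numeric precisions.
BYTES_PER_PARAM = {"FP32": 4.0, "FP16": 2.0, "FP8": 1.0, "FP4": 0.5}

def weight_gb(params_billions: float, fmt: str) -> float:
    """Gigabytes needed to hold the weights alone at the given precision."""
    return params_billions * BYTES_PER_PARAM[fmt]  # 1e9 params * bytes / 1e9 bytes-per-GB

for fmt in BYTES_PER_PARAM:
    print(f"70B model @ {fmt}: {weight_gb(70, fmt):6.1f} GB")
# FP32: 280.0 GB ... FP4: 35.0 GB — each precision step halves the footprint.
```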
⸻
5. ASIC Growth Potential and Limitations
• Chips themselves are a small portion of total AI infrastructure cost; end-to-end optimization (interconnects, cooling, integration) is what matters.
• For ASICs to grow, they need strongly differentiated efficiency and ecosystem traction in niche markets.
• NVIDIA concedes that "not all AI models need to run on GPUs," but stays focused on large-scale training and inference clusters.
⸻
6. Growth Constraints
• Power supply shortages: AI factory power requirements are reaching gigawatt levels (see the sketch after this list).
• Enterprise adoption speed: the key is how quickly Fortune XXX companies can integrate AI into actual business workflows.
• The pace of standardization in AI software and infrastructure is another variable.
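To give the gigawatt figure some scale, a back-of-envelope count of GPU racks per site. Every number here is an assumption for illustration (dense liquid-cooled racks are commonly cited in the roughly 100–130 kW range), not a figure from the talk.

```python
# Back-of-envelope: GPU racks a 1 GW AI factory could power.
SITE_POWER_KW = 1_000_000  # 1 gigawatt facility (assumed)
RACK_POWER_KW = 120        # assumed draw of one dense GPU rack, compute + networking
PUE = 1.3                  # assumed overhead multiplier for cooling and power delivery

it_power_kw = SITE_POWER_KW / PUE  # power left for IT load after overhead
racks = it_power_kw / RACK_POWER_KW
print(f"~{racks:,.0f} racks")      # ≈ 6,400 racks per gigawatt
```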
⸻
7. Software Monetization Strategy
• NVIDIA maintains an open platform approach with CUDA, PyTorch, and Hugging Face model execution (see the sketch after this list).
• However, it is increasingly monetizing software through:
  • Direct enterprise collaboration
  • Datacenter software like Lepton
  • Support for Nemotron models
• Enterprise customers want direct support from NVIDIA, and demand for paid services is rising.
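As a concrete illustration of the open-platform point, an open Nemotron checkpoint can be served through the stock Hugging Face pipeline API. The model id below is an assumption (a published NVIDIA checkpoint on the Hub); a 70B model needs several GPUs, so substitute a smaller variant to experiment.

```python
# Hedged sketch: running an open NVIDIA Nemotron model via Hugging Face.
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="nvidia/Llama-3.1-Nemotron-70B-Instruct-HF",  # assumed Hub model id
    device_map="auto",  # shard across available GPUs
)
out = pipe("Why is AI inference demand growing?", max_new_tokens=128)
print(out[0]["generated_text"])
```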
⸻
Summary:
• DeepSeek marks a turning point in the explosion of AI inference demand, fueling a global boom in sovereign AI factory construction.
• NVIDIA is reinforcing its leadership with a flexible, high-performance inference platform.
• Key challenges ahead: power constraints, enterprise adoption speed, and software standardization.
Upcoming Events:
• ISC High Performance 2025 (June 10–13, Berlin)
• GTC Developer Summit (June 11–12, Paris)
Expect further announcements on AI factories and sovereign AI at these events.
Post Link: https://x.com/Jukanlosreve/status/1930493868795134400