NVIDIA CEO Jensen Huang Keynote at COMPUTEX 2025
NVIDIA's Vision for the Future of Computing
Introduction to NVIDIA's Journey
- Jensen Huang, CEO of NVIDIA, introduces himself and acknowledges his parents in the audience. He highlights NVIDIA's long-standing relationship with Taiwan, emphasizing its importance as a hub for partners and friends.
Industry Overview and Product Announcements
- Huang discusses the current state of the industry and hints at exciting new product announcements that will create new markets and growth opportunities.
The Evolution of NVIDIA
- He reflects on NVIDIA’s transformation from a chip company to a leader in AI infrastructure, noting significant milestones like the introduction of CUDA in 2006 which revolutionized computing.
Reinventing Technology Stacks
- Huang explains how these advancements necessitated a complete reinvention of the technology stack, including processors and software systems. This led to the creation of DGX-1, the system that kicked off the AI revolution.
Data Center Architecture Shift
- The need for many processors working together is emphasized as essential for modern applications. Huang describes how data centers must be architected differently to support this shift towards distributed processing.
Networking Innovations
- He discusses acquiring Mellanox to enhance east-west networking capabilities crucial for high-performance computing, transforming entire data centers into unified computing units.
Roadmap Transparency
- Huang notes that unlike typical tech companies, NVIDIA has openly shared its five-year roadmap to help industries plan their infrastructures effectively around AI technologies.
Infrastructure as Intelligence
- He draws parallels between historical infrastructure developments (like electricity and information networks) and today's emerging "intelligence infrastructure," predicting its significance over the next decade.
The Evolution of AI Infrastructure
The Integration of AI into Industries
- AI has become an essential part of infrastructure across all sectors, similar to electricity and the internet.
- Current data centers are evolving; they will transform from traditional models into specialized AI factories that produce valuable outputs.
Understanding AI Factories
- These new AI data centers should be referred to as "AI factories," where energy input results in the production of tokens, a measure of output.
- The industry is shifting focus from traditional metrics to real-time token production, akin to manufacturing processes in other industries.
Nvidia's Growth and Technological Foundations
- Nvidia's journey has evolved from a $300 million chip opportunity in 1993 to a trillion-dollar potential within the AI factory sector.
- Key technologies driving this evolution include accelerated computing and proprietary algorithms, particularly the CUDA X libraries.
The Role of Libraries in Technology Development
- Libraries are central to Nvidia’s operations; they facilitate development and innovation by providing foundational tools for developers.
- A larger installed base encourages more developers to create libraries, leading to better applications and increased user engagement with technology.
Accelerated Computing vs. General Purpose Computing
- Accelerated computing differs fundamentally from general-purpose computing; it requires distinct methodologies for software development.
Accelerating Applications Through Advanced Architectures
Understanding Application Acceleration
- The logic behind application acceleration is understanding the application deeply: roughly 5% of the code is typically responsible for 99% of the runtime, so purpose-built architectures that accelerate that hot path can speed up nearly the entire workload.
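The 5%-of-code/99%-of-runtime observation above is an instance of Amdahl's law. A minimal sketch of the arithmetic (the 50x acceleration factor is an illustrative assumption, not a figure from the keynote):

```python
def amdahl_speedup(accelerated_fraction: float, factor: float) -> float:
    """Overall speedup when a fraction of runtime is sped up by `factor` (Amdahl's law)."""
    return 1.0 / ((1.0 - accelerated_fraction) + accelerated_fraction / factor)

# If the hot 5% of code accounts for 99% of runtime and an accelerator makes
# that portion 50x faster, the whole application speeds up by roughly:
print(round(amdahl_speedup(0.99, 50.0), 1))  # -> 33.6
```

The remaining 1% of unaccelerated runtime caps the overall gain, which is why the hot path must dominate runtime for acceleration to pay off.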
Domain-Specific Libraries and Their Impact
- Observations reveal that small parts of code consume the majority of runtime; this insight led NVIDIA to target domains such as computer graphics and numerical computing, the latter served by libraries like CuPy, which is widely used for GPU-accelerated numerical computation.
Innovations in Signal Processing and AI Integration
- Aerial is highlighted as the first GPU-accelerated radio signal processing tool for 5G and 6G. This software-defined approach allows for AI integration into telecommunications, enhancing capabilities across multiple fields including genomics and medical imaging.
Advancements in Deep Learning Libraries
- The development of deep learning libraries such as Megatron and TensorRT has revolutionized computing by providing essential tools for training and inference, significantly impacting AI research and applications.
Computational Efficiency in Various Industries
- Tools like cuLitho have drastically reduced computation times in computational lithography from months to mere days, showcasing significant advancements for industries reliant on complex simulations such as CAE (computer-aided engineering).
Transforming Telecommunications with Software-Defined Networks
Shifting Paradigms in Computing
- The transition from general-purpose computing to specialized architectures opens new markets. Telecommunications must evolve similarly to cloud data centers by adopting software-defined models.
Performance Optimization Over Six Years
- After six years of refinement, a fully accelerated radio access network (RAN) stack achieves state-of-the-art performance metrics, enabling efficient data rates per watt—critical for modern telecommunications infrastructure.
Collaborations with Industry Leaders
- Partnerships with companies like SoftBank, T-Mobile, Nokia, Samsung, Fujitsu, and Cisco are pivotal in trials aimed at integrating AI into next-generation networks (5G/6G), enhancing overall system capabilities.
The Future of Quantum Computing Integration
Developing Quantum-Classical Platforms
- The CUDA-Q platform represents a hybrid quantum-classical computing model in which GPUs assist quantum processors (QPUs), pointing toward a future where supercomputers integrate both technologies seamlessly.
Evolution of AI Capabilities
- Initial developments focused on perception AI models capable of recognizing patterns. Recent advancements emphasize generative AI's ability to create content across various formats—from text generation to image synthesis.
Generative AI: A New Frontier
Breakthrough Developments
- Generative AI has evolved into a universal function approximator capable of translating information between diverse formats through tokenization techniques. This capability marks a significant leap forward in machine learning applications.
One-Shot Learning Revolution
Understanding Agentic AI and Physical AI
The Nature of Intelligence
- Intelligence encompasses reasoning, problem-solving, and the ability to apply learned rules to unfamiliar situations.
- Technologies like "chain of thought" and "tree of thought" enhance AI's reasoning capabilities by simulating multiple options and evaluating their benefits.
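The "tree of thought" idea above can be sketched as a beam search over candidate reasoning steps. Everything here is illustrative: the `expand` and `score` callbacks stand in for a model proposing and evaluating thoughts, and the toy usage works on binary strings rather than real reasoning traces.

```python
from typing import Callable

def tree_of_thought(
    root: str,
    expand: Callable[[str], list[str]],   # propose candidate next thoughts
    score: Callable[[str], float],        # evaluate how promising a thought is
    depth: int = 3,
    beam: int = 2,
) -> str:
    """Expand several options per state, score them, keep the best few."""
    frontier = [root]
    for _ in range(depth):
        candidates = [t for state in frontier for t in expand(state)]
        if not candidates:
            break
        frontier = sorted(candidates, key=score, reverse=True)[:beam]
    return frontier[0]

# Toy usage: "thoughts" are binary strings, and higher-valued strings score better.
best = tree_of_thought("1", lambda s: [s + "0", s + "1"], lambda s: int(s, 2))
print(best)  # -> "1111"
```

The key contrast with plain chain of thought is that multiple branches are kept alive and compared at each step, rather than committing to a single linear path.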
Agentic AI: Understanding, Thinking, Acting
- Agentic AI mimics human cognitive processes by breaking down goals into actionable steps while considering consequences before execution.
- This form of AI operates in a digital realm akin to robotics, emphasizing understanding, thinking, and acting as core functions.
Advancements in Physical AI
- Physical AI is characterized by its understanding of real-world physics concepts such as inertia and object permanence.
- The application of physical AI includes generating training videos for self-driving cars to navigate various scenarios effectively.
The Future of Computing with Grace Blackwell
Revolutionary Computer Systems
- The Hopper system revolutionized AI three years ago; the new Grace Blackwell system targets inference-time scaling, where models spend more computation reasoning before they answer.
- Grace Blackwell enables both scaling up (creating larger computers beyond semiconductor limits) and scaling out (connecting multiple computers).
Performance Enhancements
- Full production of Grace Blackwell systems has commenced, with ongoing partnerships contributing to its development.
- Upcoming upgrades include the GB300 model which promises 1.5 times more inference performance and enhanced memory capacity compared to previous models.
Technical Specifications
Grace Blackwell GB300 System Overview
Performance Enhancements
- The Grace Blackwell GB300 system offers 1.5 times more inference performance compared to previous models, while training performance remains consistent.
- This system achieves approximately 40 petaflops, equating to the performance of the Sierra supercomputer from 2018, which utilized 18,000 Volta GPUs.
Moore's Law and Chip Development
- The advancement represents a roughly 4,000x increase in performance over six years, a pace far beyond the classical Moore's Law cadence of doubling every two years.
- Nvidia emphasizes that scaling computing involves more than just faster chips; it requires innovative designs like connecting multiple chips together.
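To put the 4,000x figure in context, classical Moore's Law (doubling every two years) would compound to only about 8x over six years; a quick check of the arithmetic:

```python
# Six years of transistor doubling every two years vs. the cited 4,000x system gain.
moores_law_gain = 2 ** (6 / 2)          # 2 doublings per ... -> 8.0 over six years
cited_gain = 4000
annual_factor = cited_gain ** (1 / 6)   # equivalent year-over-year multiplier
print(moores_law_gain)                  # -> 8.0
print(round(annual_factor, 2))          # -> ~3.98x per year
```

The gap between 8x and 4,000x is exactly why the next bullet stresses that the gain comes from system-level design (connecting many chips), not from faster transistors alone.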
NVLink Technology
- Introduction of the NVLink switch, described as the world's fastest, with a bandwidth of 7.2 terabytes per second; nine switches are integrated into each rack.
- The NVLink spine connects all GPUs with a total bandwidth of 130 terabytes per second, exceeding peak internet traffic.
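The quoted spine total is consistent with NVIDIA's published per-GPU NVLink bandwidth; a quick sanity check, assuming a 72-GPU rack at 1.8 TB/s per GPU:

```python
# 72 Blackwell GPUs, each with ~1.8 TB/s of NVLink bandwidth (published spec),
# roughly matches the ~130 TB/s spine total quoted in the keynote.
gpus = 72
per_gpu_tb_s = 1.8
spine_tb_s = gpus * per_gpu_tb_s
print(round(spine_tb_s, 1))  # -> 129.6
```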
Innovations in GPU Architecture
Disaggregation and Cooling Solutions
- The architecture allows for disaggregated GPUs across an entire rack, effectively treating the whole rack as one motherboard.
- Due to high power density (120 kilowatts), liquid cooling solutions are essential for maintaining optimal temperatures within the system.
AI Factories Concept
- Nvidia is not merely constructing data centers but rather "AI factories," emphasizing large-scale production capabilities.
- Example: xAI's Colossus factory spans four million square feet and operates at one-gigawatt capacity, representing a significant investment in technology infrastructure.
Manufacturing Process Insights
Investment and Complexity
- Building these advanced systems requires substantial financial investments (estimated between $60-$80 billion), with $40-$50 billion allocated specifically for electronics and computing components.
Production Workflow
- A detailed overview of manufacturing processes highlights collaboration with partners like TSMC for chip fabrication involving hundreds of steps to create each Blackwell die.
Integration and Assembly Techniques
Component Assembly
- Each Blackwell chip undergoes rigorous testing before assembly on PCBs at Foxconn facilities where robots manage component placement efficiently.
Cooling Innovations
- Custom liquid cooling solutions are developed by various manufacturers to ensure optimal thermal management during operation.
NVLink Breakthrough Details
High-Speed Connectivity
- NVLink technology enables high-speed connections among multiple GPUs, allowing them to scale into massive virtual GPU configurations.
AI Supercomputing and Infrastructure Development in Taiwan
Overview of Blackwell GPU Assembly
- The assembly of a Blackwell system involves 72 Blackwell packages, or 144 GPU dies, with components sourced globally through partners like Foxconn, Wistron, Quanta, Dell, ASUS, Gigabyte, HPE, and Supermicro.
- The entire process includes 1.2 million components and showcases the dedication and precision of the Taiwanese technology ecosystem.
- Acknowledgment of Taiwan's role not just in building supercomputers but also in developing AI infrastructure for local researchers and companies.
Importance of AI Infrastructure
- Emphasis on the need for a world-class AI infrastructure in Taiwan to support various sectors including research institutions and startups.
- Introduction of large chip systems made possible by NVLink technology, which connects multiple components into a cohesive architecture.
Announcement of NVLink Fusion
- Launch of NVIDIA NVLink Fusion to enable semi-custom AI infrastructure tailored to different organizations' needs.
- NVLink allows semi-custom systems to scale up with diverse configurations of CPUs and NVIDIA GPUs.
Integration Capabilities
- Description of how NVLink Fusion integrates NVIDIA's CPU/GPU platforms with custom accelerators from partners.
- Flexibility for users wishing to incorporate their own CPUs into NVIDIA's ecosystem using NVLink chiplets.
Future Prospects
- The open nature of this new infrastructure allows for extensive customization while maintaining compatibility with existing technologies.
Nvidia's New Product Launch and Ecosystem
Nvidia Partnerships and Ecosystem
- NVIDIA expresses excitement about partnerships with companies like Alchip, Astera Labs, Marvell, and MediaTek to support ASIC and semi-custom customers.
- Collaborations with Fujitsu and Qualcomm are highlighted for building CPUs that integrate with NVLink into the NVIDIA ecosystem.
- The NVLink Fusion ecosystem allows partners to scale up into AI supercomputers seamlessly.
Introduction of DGX Spark
- Announcement of the DGX Spark computer, which is in full production and will be available soon.
- Notable partners such as Dell, HP, ASUS, MSI, Gigabyte, and Lenovo are involved in creating various versions of DGX Spark aimed at AI developers.
- DGX Spark is designed for developers who prefer a personal AI cloud setup rather than relying on external cloud services.
Performance Comparison: DGX-1 vs. DGX Spark
- The new DGX Spark delivers performance comparable to the original DGX-1 while fitting on a desk, representing roughly a decade of advancement.
- Emphasis on accessibility; everyone can potentially own a personal AI supercomputer by Christmas.
Personal Supercomputing: The New DGX Station
- Introduction of another desk-side option, the DGX Station, from major brands like Dell and HP, offering high performance from a standard wall outlet.
- This new workstation can run complex models like a one trillion parameter AI model effectively.
Redefining Enterprise IT with AI
- Discussion on how these systems are built specifically for modern AI applications without needing traditional IT software compatibility.
- Acknowledgment that enterprise computing must evolve to integrate compute, storage, and networking layers influenced by AI advancements.
Agentic AI Capabilities
- Introduction of "agentic AI," which serves as digital workers capable of performing tasks traditionally done by humans across various sectors including marketing and engineering.
The Future of Digital Employees and AI Integration
The Labor Shortage and Rise of Digital Agents
- There is a projected global labor shortage of 30 to 50 million workers by 2030, which could hinder economic growth.
- NVIDIA has integrated digital agents into its workforce, allowing software engineers to enhance productivity and code quality through AI assistance.
Transforming Enterprise IT with AI
- The concept of HR for digital workers is emerging, necessitating tools for managing AI agents within companies.
- A reinvention of computing infrastructure is required to support both traditional applications and new capabilities like agent-based AI.
Introducing the RTX Pro Enterprise Server
- The RTX Pro Enterprise server supports x86 architecture and can run classical hypervisors alongside modern workloads, including Omniverse applications.
- This server accommodates various forms of AI agents—textual, graphical, or video—ensuring compatibility across different modalities.
Advanced Networking Capabilities
- The new Blackwell RTX Pro 6000 features a switched network design with high bandwidth (800 gigabits per second), enhancing GPU communication.
- Each GPU in the system has its own networking interface, facilitating efficient east-west traffic among GPUs for improved performance.
Performance Metrics in AI Factories
- Throughput is measured in tokens per second; balancing high throughput with low latency remains a challenge in optimizing AI models.
- Different configurations (pipeline parallelism, expert parallelism, etc.) are necessary depending on workload requirements to maximize factory capability.
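The throughput-versus-latency tension described above can be illustrated with a toy batching model. All numbers here are invented for illustration, and real AI factories tune parallelism strategies (pipeline, expert, tensor), not just batch size:

```python
# Toy model: batching more requests per decode step raises aggregate tokens/sec
# for the factory, but each individual user sees fewer tokens/sec.
def factory_metrics(batch_size: int, base_step_ms: float = 10.0, per_req_ms: float = 0.5):
    step_ms = base_step_ms + per_req_ms * batch_size  # one decode step for the batch
    per_user_tps = 1000.0 / step_ms                   # tokens/sec seen by one user
    aggregate_tps = per_user_tps * batch_size         # tokens/sec for the whole factory
    return per_user_tps, aggregate_tps

for b in (1, 8, 64):
    user, total = factory_metrics(b)
    print(f"batch={b:3d}  per-user={user:6.1f} tok/s  aggregate={total:8.1f} tok/s")
```

Running this shows per-user rate falling as batch size grows while aggregate rate climbs, which is the balancing act the keynote describes: interactive users want low latency, while the factory's economics favor high total throughput.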
Breakthrough Contributions from DeepSeek R1
NVIDIA's AI Innovations and Enterprise Solutions
Overview of NVIDIA's New Server Capabilities
- The new server performance is four times that of the state-of-the-art H100, highlighting its superior capabilities for enterprise AI applications.
- The RTX Pro server is in volume production with industry partners, marking one of the largest market launches for NVIDIA systems.
Transition to AI Data Platforms
- Traditional structured databases queried with SQL are not sufficient for AI; AI systems need to query unstructured data and extract semantic meaning from it.
- A new storage platform is being developed, featuring a complex software layer that integrates with existing storage solutions.
Future Storage Systems
- Future storage architectures will utilize GPUs instead of CPUs to process unstructured data efficiently, enabling better indexing and searching capabilities.
- The integration of GPU computing nodes in storage servers will enhance the ability to find meaning within raw data.
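"Finding meaning within raw data" typically means embedding documents and ranking them by similarity to a query. A minimal sketch using a toy bag-of-words embedding; a real platform would use learned embeddings computed on GPUs, and the documents here are invented examples:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: word counts (stand-in for a learned vector embedding)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search(query: str, docs: list[str]) -> list[str]:
    """Rank documents by semantic similarity to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)

docs = ["liquid cooling keeps the rack stable",
        "tokens per second measure factory output",
        "the spine connects every GPU in the rack"]
print(search("rack cooling", docs)[0])  # -> "liquid cooling keeps the rack stable"
```

Unlike a SQL `WHERE` clause, nothing here requires an exact match on a structured field; relevance emerges from similarity in the embedding space, which is the capability the new storage platforms aim to provide at scale.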
Advancements in AI Model Training
- NVIDIA focuses on post-training open AI models using safe and transparent data, ensuring high performance while maintaining security.
- The Llama Nemotron reasoning model is highlighted as a leading open model available for download, showcasing significant improvements in query speed and results.
Collaboration with Industry Partners
- VAST Data has successfully implemented an accelerated AI data platform using NVIDIA's AI-Q blueprint, demonstrating rapid development cycles for sales-planning tools.
- Major companies like Dell, Hitachi, IBM, and NetApp are collaborating with NVIDIA to build robust AI platforms tailored for enterprise needs.
Introduction of AI Operations (AI Ops)
- A new software layer called "AI ops" will manage data curation and model evaluation within enterprises, enhancing operational efficiency.
- Partnerships with companies such as CrowdStrike and Dataiku aim to integrate model fine-tuning into the broader ecosystem of agentic AI solutions.
Enhancing Enterprise IT with AI Integration
- Enterprises can incorporate AI without overhauling existing IT systems; this approach allows businesses to run smoothly while integrating advanced technologies.
Robots and AI Agents: The Future of Robotics
Understanding AI Agents
- AI agents, often referred to as digital robots, are designed to perceive, understand, and plan actions similar to physical robots.
- To effectively learn how to function as a robot, it is essential for these agents to train in a virtual environment that adheres to the laws of physics.
Advancements in Physics Simulation
- Collaboration with Google DeepMind and Disney Research led to the development of Newton, an advanced physics engine set for open-source release in July.
- Newton is GPU-accelerated and differentiable, allowing robots to learn from experience in high-fidelity simulations integrated into platforms like MuJoCo and NVIDIA's Isaac Sim.
Realistic Robot Training
- The simulation technology enables realistic scenarios where robots can learn through trial and error rather than mere animation.
- The self-driving car system comprises three computers: one for AI model training (GB200/GB300), one for simulation (Omniverse), and one deployed in the vehicle itself.
Open Technology Integration
- Nvidia offers flexibility in technology integration; partners can choose various components based on their engineering needs.
- The Isaac GR00T platform mirrors Omniverse's simulation capabilities while introducing Jetson Thor, a new robotics processor designed for both self-driving cars and robotic systems.
Data Strategy Challenges
- A significant challenge in robotics is developing an effective data strategy; human demonstrations are limited by time constraints.
- Utilizing human demonstrations allows robots to generalize tasks through AI amplification techniques that enhance data collection during training sessions.
Innovations in Robotics Data Generation
- The introduction of large-scale synthetic trajectory data generation addresses the need for extensive training datasets amidst labor shortages affecting industrial growth.
Cosmos and the Future of Robotics
Workflow of Cosmos in Robotics
- Developers fine-tune Cosmos using human demonstrations to operate a single task within a specific environment, allowing for effective training.
- The model generates 3D action trajectories from 2D "dream" videos through the GR00T Dreams blueprint, enabling robots to learn diverse actions with minimal manual input.
- The process emphasizes the need for synthetic data generation and skill learning (fine-tuning), which relies heavily on reinforcement learning and substantial computational resources.
Importance of Humanoid Robotics
- Humanoid robotics is crucial due to its ability to be deployed in various environments (brownfield), fitting into existing infrastructures designed by humans.
- This technology is expected to become a multi-trillion dollar industry, driven by rapid technological innovation and high demand for computing power.
Digital Twins in Manufacturing
- The integration of digital twins allows for advanced simulations where robots can learn how to operate effectively within complex factory environments.
- Companies like Delta are already implementing digital twins for their manufacturing lines, preparing them for a future filled with robotic collaboration.
Global Manufacturing Trends
- Major companies such as TSMC, Foxconn, and Pegatron are developing digital twins on Nvidia Omniverse, enhancing every step of the manufacturing process.
- A projected $5 trillion investment in new plants worldwide highlights the importance of building efficient factories equipped with digital twin technology.
Applications of Digital Twins
- TSMC uses AI tools such as NVIDIA cuOpt to optimize intricate systems within its facilities, significantly reducing construction time and costs.
AI and Robotics: Taiwan's Pioneering Role
The Integration of AI in Industry
- Humanoids and vision AI agents are being utilized to enhance task performance, creating a diverse fleet that operates effectively.
- Linker Vision collaborates with the city of Kaohsiung to use digital twins for simulating unpredictable scenarios, enabling real-time monitoring through city camera streams.
- Taiwan is positioned as the epicenter for advanced industries, particularly in AI and robotics, presenting extraordinary opportunities for innovation.
Transformative Impact of AI
- The speaker emphasizes how the work done by Taiwanese technology leaders has revolutionized various industries, suggesting a reciprocal transformation where AI will now enhance their own operations.
- A new product announcement hints at significant advancements from NVIDIA, indicating ongoing growth and development within the company.
Expansion Plans: Nvidia Constellation
- NVIDIA plans to establish a new office named "Nvidia Constellation" in Taiwan due to increasing partnerships and workforce expansion.
- The site selected for NVIDIA Constellation is in Beitou Shilin, with negotiations underway regarding lease transfers.
Community Engagement and Future Prospects
- The mayor's approval is sought for building Nvidia Constellation; community support is encouraged through direct outreach to local officials.
- The speaker expresses gratitude towards partners while highlighting an unprecedented opportunity to create a new industry alongside existing IT advancements.
Collaborative Future in Technology
- Emphasizing collaboration, the speaker looks forward to working with partners on developing AI factories and enterprise solutions within a unified architecture.