GTC March 2024 Keynote with NVIDIA CEO Jensen Huang

Visionary AI Applications

The speaker introduces various roles of AI, portraying it as a visionary technology with transformative potential across different domains.

AI's Diverse Roles

  • AI is depicted as a visionary technology illuminating galaxies, guiding the blind, and assisting those without a voice.
  • It plays the role of a trainer teaching robots to assist and save lives, emphasizing clean energy and advanced patient care.
  • The multifaceted nature of AI is highlighted through its navigation capabilities for virtual scenarios and scriptwriting assistance.

Nvidia's Impactful Journey

Nvidia's CEO discusses the company's evolution since its inception in 1993, emphasizing key milestones that have shaped its trajectory in the realm of computing.

Evolution of Nvidia

  • Nvidia's founder emphasizes the scientific focus of the conference, highlighting diverse research fields leveraging AI for innovation.
  • The significance of accelerated computing across various industries is underscored, showcasing a broad spectrum of applications beyond IT.
  • The presenter acknowledges the transformative impact of accelerated computing on industries ranging from healthcare to logistics.

Revolutionizing Computing with Nvidia

The speaker delves into pivotal moments in Nvidia's history, outlining breakthrough innovations that have revolutionized computing paradigms.

Transformative Innovations

  • Key milestones such as CUDA's revolutionary computing model in 2006 and the inception of DGX-1 supercomputer in 2016 are highlighted.

A Developers' Conference, Not a Concert

In this section, the speaker quips that GTC is a developers' conference rather than a concert, then turns to the importance of accelerated computing in various industries.

Accelerated Computing for Sustainable Growth

  • Accelerated computing is crucial for driving down computing costs while increasing consumption sustainably.
  • The focus in certain industries, like simulation tools for product creation, is on scaling up computing rather than cost reduction.
  • The goal is to digitally design, build, simulate, and operate products entirely through digital twins, necessitating industry-wide acceleration.

Partnerships for Acceleration

  • Announcement of partnerships with key companies like Ansys and Cadence to accelerate their ecosystems towards accelerated computing.
  • Transitioning infrastructure to GPUs enables generative AI applications alongside accelerated computing benefits.

Nvidia's Software Partnerships

This segment highlights Nvidia's collaborations with software partners to advance computational lithography and semiconductor manufacturing using generative AI.

Transforming Semiconductor Manufacturing

  • Nvidia collaborates with Synopsys to accelerate computational lithography, a step critical for chip production.
  • By accelerating software-defined processes with TSMC, Nvidia aims to apply generative AI in semiconductor manufacturing for enhanced geometric capabilities.

Building Supercomputers and Digital Twins

  • Collaboration with Cadence involves CUDA acceleration and supercomputer development using Nvidia GPUs for fluid dynamics simulations at scale.
  • Envisioning an interconnected ecosystem where AI co-pilots aid in chip design across platforms like Cadence's digital twin platform linked to Omniverse.

Computational Scale Challenges

Addressing the exponential growth in computational requirements due to scaling large language models and the need for increased GPU capacity.

Scaling Language Models

  • Large language models are scaling rapidly, with computational demands roughly doubling every six months, posing challenges in data processing and training-token counts.

Giant Systems and Supercomputers

The speaker discusses the development of giant systems and supercomputers, highlighting the evolution from DGX-1 to EOS in 2023.

Evolution of Supercomputers

  • GPUs connected with Mellanox InfiniBand formed giant systems like DGX-1.
  • Emphasis on building chips, systems, networking, and software for supercomputers.
  • Distribution of computation across thousands of GPUs for energy efficiency and optimal performance.
  • Future plans include training models with multimodal data beyond text to enhance common sense understanding.

Innovations and Challenges

The speaker reflects on the need for larger models, synthetic data generation, reinforcement learning, and AI collaboration to advance technology.

Advancements and Hurdles

  • Importance of using synthetic data generation and reinforcement learning for model training.
  • Collaboration between AI entities to enhance model size and data volume.

Introduction of Blackwell GPU

Introducing a new GPU named after David Blackwell with significant capabilities.

Unveiling Blackwell

  • Introduction of a high-performance GPU named Blackwell after mathematician David Blackwell.

Blackwell Chip Architecture

Detailed insights into the architecture and features of the Blackwell chip.

Architecture Details

  • Description of Blackwell as a platform rather than just a chip.

Hopper - Advanced GPU

Discussion of Hopper, the prior-generation GPU, as the baseline that Blackwell builds on.

Features of Hopper

  • Hopper is contrasted with Blackwell, which packs 208 billion transistors.

Integration with Grace CPU

Integration details between Blackwell chips, dies, and Grace CPU for efficient computation.

Integration Process

Grace Blackwell System Advancements

In this section, the speaker discusses advancements in the Grace Blackwell system and the need for new features to push beyond the limits of physics.

Advancements in Grace Blackwell System

  • The introduction of a second-generation Transformer engine that can dynamically rescale and recast numerical formats to lower precision, crucial for artificial intelligence's probabilistic nature.
  • The importance of designing a smaller ALU and utilizing fifth-generation NVLink for faster computation within a network of GPUs.
  • Explanation of synchronization collectives such as all-reduce and all-gather, which let GPUs in a network share information efficiently.
  • Implementing a reliability engine for self-testing every gate and memory component on the Blackwell chip to maintain high supercomputer utilization.
  • Introduction of encryption measures for data security during rest, transit, and computation phases.
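
The all-reduce and all-gather collectives mentioned above can be sketched in plain Python. This is a toy illustration of the collective semantics only (lists stand in for device buffers), not NVIDIA's implementation:

```python
def all_reduce(buffers):
    """Every simulated GPU ends up with the elementwise sum of all buffers."""
    total = [sum(vals) for vals in zip(*buffers)]
    return [list(total) for _ in buffers]

def all_gather(buffers):
    """Every simulated GPU ends up with the concatenation of all buffers."""
    gathered = [x for buf in buffers for x in buf]
    return [list(gathered) for _ in buffers]

# Four simulated GPUs, each holding a partial gradient.
grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
reduced = all_reduce(grads)   # every GPU now holds [16.0, 20.0]
gathered = all_gather(grads)  # every GPU now holds all eight values
```

After all-reduce each GPU holds the same summed gradient, which is why these collectives dominate communication during distributed training.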

Secure AI and Performance Gains

This section delves into the significance of secure AI practices, compression engines, and performance enhancements in the Blackwell system compared to Hopper.

Secure AI Practices and Performance Enhancements

  • Implementation of 100% self-testing on every gate and memory component connected to the Blackwell chip for reliability assurance.
  • Introduction of data encryption measures during rest, transit, and computation stages to safeguard AI parameters from loss or contamination.
  • Integration of a high-speed compression engine to facilitate rapid data movement in and out of computers, optimizing utilization efficiency.
  • Comparison between Blackwell and Hopper highlighting improved FP8 training performance per chip, with new FP6 and FP4 formats enhancing throughput significantly.
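
The precision trade-off can be made concrete with a toy round-trip: scale a tensor into a small signed-integer grid and back. This is plain integer quantization for illustration, not the actual FP8/FP6/FP4 encodings, but it shows why fewer bits trade accuracy for throughput:

```python
def quantize_dequantize(values, bits):
    """Round-trip values through a signed integer grid with `bits` bits."""
    qmax = 2 ** (bits - 1) - 1                # e.g. 7 levels each side for 4-bit
    scale = (max(abs(v) for v in values) / qmax) or 1.0
    quantized = [round(v / scale) for v in values]
    return [q * scale for q in quantized]

weights = [0.9, -0.35, 0.12, -0.7]
w4 = quantize_dequantize(weights, bits=4)     # coarser grid, larger error
w8 = quantize_dequantize(weights, bits=8)     # finer grid, smaller error
err4 = sum(abs(a - b) for a, b in zip(weights, w4))
err8 = sum(abs(a - b) for a, b in zip(weights, w8))
```

The real Transformer engine additionally rescales dynamically per layer as training proceeds; here a single per-tensor scale suffices to show the effect.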

The Generative AI Era

This section explores generative AI concepts, emphasizing its transformative impact on content creation processes.

Generative AI Concepts

  • Shift towards generative AI era where content generation surpasses retrieval methods leading to energy savings, bandwidth optimization, and time efficiency gains.
  • Discussion on how generative AI understands context better than pre-recorded content retrieval methods, revolutionizing information delivery processes.

Key Innovations in Computing

In this section, the speaker discusses the advancements and innovations in computing technology, focusing on content token generation, scaling of computational power over the years, and the development of a new chip called the NVLink Switch.

Content Token Generation and Computational Advancements

  • The format for content token generation is referred to as FP4.
  • Computational power has increased significantly, with five times the token-generation and inference capability of Hopper.
  • Despite substantial advancements, there is a need for even greater computational power beyond current capabilities.

Scaling of Computational Power

  • Over eight years, computation has increased by 1,000 times.
  • This rate of advancement surpasses historical trends like Moore's Law (doubling every two years).
  • The introduction of a new chip, the NVLink Switch, with remarkable specifications.
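
The comparison with Moore's Law in the bullets above is quick arithmetic:

```python
# Moore's-Law-style doubling every two years, over eight years:
moore_factor = 2 ** (8 / 2)          # 16x
# Per-year growth implied by 1,000x over the same eight years:
observed_rate = 1000 ** (1 / 8)      # roughly 2.37x per year
```

So the claimed trajectory compounds at well over 2x per year, against Moore's Law's roughly 1.4x.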

NVLink Switch Chip Development

This part delves into the features and functionalities of the NVLink Switch chip, highlighting its design elements and potential impact on GPU connectivity.

NVLink Switch Chip Features

  • The NVLink Switch chip boasts an impressive 50 billion transistors.
  • It includes four NVLinks with speeds reaching 1.8 terabytes per second each.
  • It enables direct communication between GPUs at full speed simultaneously for enhanced performance.

Revolutionary DGX System

Here, the speaker introduces the DGX system, showcasing its evolution over time and emphasizing its significance in modern computing infrastructure.

Introduction to DGX System

  • The DGX system represents a significant advancement in computing technology.
  • Notable progress from previous models like DGX-1 to current systems offering higher performance levels.
  • The latest DGX model delivers exceptional processing power for training AI models efficiently.

Innovative Cooling Solutions

This segment focuses on the cooling mechanisms employed in high-performance computing systems like DGX, emphasizing efficiency and cost-effectiveness.

Liquid Cooling System Efficiency

  • The liquid-cooled system maintains optimal temperatures during operation.
  • Utilizes innovative cooling techniques to manage heat dissipation effectively.

Generative AI and Inference Challenges

The discussion delves into the challenges faced in computing inference for large language models, particularly focusing on generative AI and the complexities associated with interactive rates and token generation.

Generative AI Complexity

  • Large language models fall under the category of computing known as inference, which is particularly challenging due to their size.
  • Interactive applications like chatbots require supercomputers for inference, especially when dealing with trillions of tokens and parameters.

Token Generation and Parallelization

  • Effective token generation is crucial for user interaction speed and overall service cost.
  • Parallelizing model work across multiple GPUs enhances throughput, reducing cost per token while maintaining quality of service.

GPU Programmability and Exploration

  • NVIDIA's GPUs' programmability enables exploring vast search spaces to optimize model distribution across GPUs efficiently.
  • Different parallel configurations impact performance metrics like tokens per second, necessitating careful exploration to find optimal solutions.
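
The configuration search the bullets describe can be sketched with a made-up cost model. Everything numeric here is hypothetical (the penalty constants are invented for illustration); the point is the shape of the search over parallelism splits, not the values:

```python
def throughput(tp, pp):
    """Toy throughput model for a (tensor-parallel, pipeline-parallel) split."""
    comm_penalty = 1.0 + 0.15 * (tp - 1)    # TP adds communication overhead
    bubble_penalty = 1.0 + 0.10 * (pp - 1)  # PP adds pipeline-bubble overhead
    return (tp * pp) / (comm_penalty * bubble_penalty)

num_gpus = 8
# Candidate splits of 8 GPUs between tensor and pipeline parallelism.
configs = [(tp, num_gpus // tp) for tp in (1, 2, 4, 8)]
best = max(configs, key=lambda c: throughput(*c))   # a middle-ground split wins
```

Even in this toy model, neither extreme (all tensor-parallel or all pipeline-parallel) is optimal, which is why the real search space is worth exploring per model and per GPU count.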

Inference Capabilities Comparison: Blackwell vs. Hopper

A comparison between Blackwell and Hopper highlights the significant advancements in generative AI capabilities, emphasizing Blackwell's superior performance in handling large language models.

Blackwell's Superiority

  • Blackwell excels in handling trillion-parameter generative AI systems compared to Hopper, showcasing a substantial performance improvement.
  • Enhancements such as the FP4 Tensor Core, the new Transformer engine, and the NVLink Switch contribute to Blackwell's exceptional inference capabilities.

Detailed Overview of Key Points

In this section, the speaker discusses various configurations and advancements in technology related to AI companies and cloud services.

Configurations and Advancements

  • Various configurations are highlighted, including those that slide into the Hopper form factor for easy upgrades.
  • Mention of extreme versions like liquid-cooled systems and an entire rack connected by NVLink 72 for advanced performance.
  • Collaboration with AWS for building secure AI GPUs and a high-performance system, emphasizing joint efforts in accelerating AI across different sectors.
  • Google's preparations for Blackwell with existing GPU fleets and optimization initiatives across various services like data processing, AI, and robotics.
  • Microsoft's acceleration efforts in partnership with Nvidia, focusing on CUDA acceleration and integration of Nvidia technologies into Azure services.

Innovations in Digital Twin Technology

The discussion shifts towards the concept of digital twins, their significance in complex projects like building computers, and collaborations with manufacturing partners.

Digital Twin Technology

  • Emphasis on the importance of digital twins in project planning to ensure precision and efficiency.
  • Wistron's utilization of digital twins to meet demands for Nvidia accelerated computing through virtual integration and layout optimizations.
  • Benefits observed at Wistron include increased worker efficiency during construction, plus reduced cycle times and defect rates, using Omniverse digital twin technology.

Revolutionizing Manufacturing Through Digitalization

The focus is on how digitalization is transforming manufacturing processes by enabling virtual testing, layout optimization, and real-time monitoring using IoT data.

Manufacturing Transformation

  • Adoption of digital-first manufacturing approaches leading to enhanced operational efficiencies through rapid testing of layouts and real-time monitoring.

How AI is Revolutionizing Weather Prediction

In this section, the speaker discusses the advancements in AI technology and its impact on weather prediction, emphasizing the potential for generative AI to revolutionize forecasting accuracy.

The Miracle of Generating Meaningful Data

  • The speaker marvels at how three letters can generate a million pixels that make sense, highlighting the transformative power of AI.
  • Reflecting on advancements over ten years, the discussion shifts to recognizing and understanding text, images, videos, and sounds through AI technologies.

Digitization Beyond Text and Images

  • Explores the concept of digitization beyond text and images to proteins, genes, brain waves, and any structured data for pattern recognition and understanding.
  • Introduces Earth 2 as a digital twin for predicting weather using generative AI models like Civ with high resolution capabilities.

Enhancing Weather Forecasting with Generative AI

  • Highlights the importance of accurate weather predictions for mitigating extreme weather impacts globally.
  • Introduces CorrDiff, a generative AI model enhancing weather forecasting by super-resolving storm tracks with high efficiency.

Global Impact and Future Prospects

  • Discusses expanding CorrDiff's capabilities globally to provide detailed regional weather forecasts for minimizing damages.

Optimizing Drug Discovery with Nemo Microservices

The discussion revolves around how BioNeMo is revolutionizing drug discovery by providing on-demand microservices that enhance computational drug design workflows.

Optimizing Drug Discovery

  • BioNeMo optimizes molecules to bind to target proteins efficiently while considering other molecular properties, leading to high-quality, synthesizable drugs and faster medicine creation.
  • NIMs let researchers reinvent computational drug design, offering a range of models, from computer vision to robotics, for diverse applications.
  • Nvidia introduces the NVIDIA Inference Microservice (NIM): pre-trained models packaged with their dependencies, optimized for various GPU configurations, and exposed through simple APIs for ease of use.

Innovative Software Development with Nims

This segment delves into the innovative software development approach facilitated by Nvidia's Nim containers, streamlining AI model deployment and usage.

Innovative Software Development

  • A NIM is a pre-trained model packaged in a container optimized for Nvidia GPUs, simplifying AI model deployment across different hardware setups.
  • These NIM containers come with pre-trained open-source models and the necessary dependencies tailored for single or multiple GPUs, enhancing accessibility through straightforward APIs.
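
For the LLM containers, the "straightforward API" is an OpenAI-style chat endpoint served by the container. A minimal sketch of building the request payload (the model name and the localhost URL in the comment are placeholders, not a guaranteed deployment):

```python
import json

def build_chat_request(model, user_message):
    """Assemble an OpenAI-style chat-completion request body."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": 64,
    })

payload = build_chat_request("meta/llama3-8b-instruct",
                             "Summarize NVLink in one sentence.")
# In practice this would be POSTed to the running container, e.g.
# http://localhost:8000/v1/chat/completions
```

Because the schema matches what many existing clients already speak, swapping a hosted API for a local container is mostly a change of base URL.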

Future of Software Building with AI Co-Pilots

The conversation shifts towards envisioning the future of software development where AI co-pilots collaborate seamlessly in executing complex tasks efficiently.

Future Software Building

  • Future software development may involve assembling teams of AI co-pilots specialized in various tasks, working collaboratively to achieve optimal outcomes without extensive manual coding.

NVIDIA AI Foundry Overview

In this section, Jensen Huang discusses the three pillars of NVIDIA's AI Foundry, focusing on technology invention, tool creation for modification, and infrastructure provision for fine-tuning and deployment.

NVIDIA's Three Pillars

  • NVIDIA's AI Foundry comprises three key elements:
  • Inventing technology for AI models.
  • Creating tools to modify AI models.
  • Providing infrastructure for fine-tuning and deployment.
  • The concept of NVIDIA as an "AI Foundry" is likened to TSMC in building chips. NVIDIA aims to support industries with their AI needs by offering technology, tools, and infrastructure.
  • NVIDIA emphasizes the importance of understanding proprietary information within companies. They aim to extract meaning from internal data and reindex it into a vector database for enhanced insights.
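
The "reindex into a vector database" step can be sketched end to end with a toy embedding. This shows the retrieval pattern only, not the NeMo Retriever API; the bag-of-words "embedding" stands in for a learned neural model:

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: bag-of-words counts (a real system uses a neural model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b.get(k, 0) for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical internal documents, embedded and indexed once.
docs = [
    "quarterly revenue grew on data center demand",
    "the cafeteria menu changes on fridays",
]
index = [(doc, embed(doc)) for doc in docs]

def retrieve(query):
    """Return the indexed document most similar to the query."""
    q = embed(query)
    return max(index, key=lambda pair: cosine(q, pair[1]))[0]
```

The insight-extraction described above is exactly this loop at scale: embed proprietary data once, then answer questions by retrieving the nearest vectors before generation.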

NVIDIA NeMo Retriever and Digital Humans

This part delves into the NeMo Retriever service that lets users interact with a smart database to retrieve information swiftly. Additionally, digital human NIMs like Diana are introduced as AI care managers.

NeMo Retriever and Digital Humans

  • The NeMo Retriever service enables users to interact with a smart database efficiently by requesting specific information-retrieval tasks.
  • Introducing Diana, a digital human NIM designed for healthcare interactions. These digital humans represent advanced applications of AI in sectors such as healthcare.

Partnerships in Building Co-Pilots

Jensen Huang discusses collaborations between NVIDIA's AI Foundry and prominent companies like SAP, ServiceNow, Cohesity, Snowflake, NetApp, and Dell in developing co-pilots using technologies like NeMo and DGX Cloud services.

Collaborative Partnerships

  • Collaboration examples include:
  • SAP: Building co-pilots using NVIDIA technologies.
  • ServiceNow: Utilizing NVIDIA AI Foundry for virtual assistance.
  • Cohesity: Working on Gaia generative AI agent development.
  • Other partnerships involve Snowflake storing vast amounts of data in the cloud while NetApp focuses on on-premises storage; both collaborate with NVIDIA on chatbot and co-pilot projects leveraging NeMo technology.

Role of Dell in Building AI Factories

Jensen Huang highlights Dell's role in constructing end-to-end systems at scale for enterprises aiming to establish AI factories. The significance of these factories in deploying chatbots and generative AIs is emphasized.

Dell's Role in Establishing AI Factories

Training AI on Human Examples

In this section, the speaker discusses the process of training AI models to imitate human behavior by studying patterns and examples.

Training AI Models to Imitate Human Behavior

  • The AI learns to predict the next words by studying patterns and previous examples. It needs to understand context for accurate imitation.
  • Data is compressed into a large language model with trillions of parameters, which become the basis for the AI.
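
Next-token prediction can be shown at toy scale with a bigram table: count which token follows which in a training text, then predict the most frequent continuation. Real models learn this with trillions of parameters rather than counts, but the objective is the same:

```python
from collections import defaultdict, Counter

def train_bigrams(text):
    """Count, for every token, which token follows it in the training text."""
    tokens = text.split()
    table = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        table[prev][nxt] += 1
    return table

def predict_next(table, token):
    """Predict the most frequent continuation seen during training."""
    return table[token].most_common(1)[0][0]

table = train_bigrams("the gpu trains the model and the gpu serves the gpu")
prediction = predict_next(table, "the")   # "gpu" follows "the" most often
```

Context is what this toy lacks entirely: a bigram looks one token back, whereas a large language model conditions on the whole preceding sequence, which is why it can imitate rather than just tabulate.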

Three Computers for Physical AI

This part focuses on advancing AI to understand the physical world through three essential computers.

Advancing AI Understanding of Physical World

  • Three computers are crucial for AI understanding: an AI computer watching videos, a system for synthetic data generation, and a processor for robotics tasks.
  • The AGX system is designed for low power consumption yet high-speed sensor processing and AI operations in physical systems like cars or other moving machines.

Reinforcement Learning with Physical Feedback

Here, the discussion shifts towards reinforcement learning in robotics and the need for physical feedback.

Reinforcement Learning in Robotics

  • Unlike language models that use human feedback, robots require physical feedback for reinforcement learning to align with physics laws properly.
  • A simulation engine called Omniverse provides a digital representation of the world where robots can learn articulation capabilities within real-world constraints.
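
The physical-feedback loop can be caricatured in a few lines: a simulator (standing in here for Omniverse) applies trivial physics, and reward comes from how close the "robot" lands to its target. A heavily simplified sketch, not a real RL algorithm:

```python
import random

def simulate(position, torque):
    """Stand-in physics engine: one trivial integration step."""
    return position + torque

def train(target=5.0, episodes=200, seed=0):
    rng = random.Random(seed)
    actions = [-1.0, 0.0, 1.0]                 # candidate motor torques
    values = {a: 0.0 for a in actions}         # running value estimate per action
    for _ in range(episodes):
        a = rng.choice(actions)
        reward = -abs(target - simulate(0.0, a))   # physical feedback signal
        values[a] += 0.1 * (reward - values[a])    # incremental value update
    return max(values, key=values.get)

best_action = train()   # the torque that moves toward the target scores best
```

The key contrast with language-model RLHF is visible even here: the reward is computed from simulated physics, not from a human preference label.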

AI and Omniverse in Collaboration

The speaker introduces an example showcasing how AI and Omniverse collaborate in a robotics building scenario.

Collaboration Between AI and Omniverse

  • A warehouse demonstration shows autonomous systems interacting: humans and forklifts coordinate their activities, overseen by an air-traffic-controller-like system, within virtual simulations hosted on Omniverse Cloud.

Omniverse Cloud and Robotics Integration

In this section, the discussion revolves around the integration of software for robotic systems using digital twins in Omniverse Cloud, making it more accessible through APIs.

Integrating Software for Robotic Systems

  • Digital twins are crucial for integrating software in future CI/CD processes for robotic systems.
  • Omniverse Cloud is made more accessible with simple APIs, enabling easy connection of applications.
  • The APIs in Omniverse Cloud provide magical digital twin capabilities for users.

Universal Scene Description and Semantic Encoding

This part focuses on the language used in Omniverse, Universal Scene Description (USD), and how semantic encoding is shifting towards scene-based semantics rather than traditional language-based semantics.

Language in Omniverse

  • Omniverse uses Universal Scene Description (USD) as its language.
  • Users can communicate with Omniverse in English, generating USD directly and receiving responses in USD.
  • Semantic encoding is now scene-based rather than language-based, allowing users to search semantically within scenes.
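
The "English in, USD out" idea can be sketched with a hypothetical helper that emits a minimal text-form USD (USDA) prim. Real Omniverse tooling works through the OpenUSD libraries; this only shows the shape of what gets generated, and the prim names are invented:

```python
def english_to_usd(prim_name, prim_type="Cube"):
    """Emit a minimal USDA document defining one prim (illustrative only)."""
    return (
        "#usda 1.0\n"
        f'def {prim_type} "{prim_name}"\n'
        "{\n"
        "    double size = 1\n"
        "}\n"
    )

scene = english_to_usd("WarehouseCrate")
```

Because USD is plain, structured text like this, a model can both generate it from an English request and answer questions about an existing scene expressed in it.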

Siemens Partnership and Industrial Metaverse

The partnership between Nvidia and Siemens is highlighted, emphasizing the integration of technologies to build an industrial metaverse.

Siemens Collaboration

  • Siemens is connecting its accelerator platform to Nvidia Omniverse.
  • Teamcenter X from Siemens integrates AI and Omniverse technologies for data interoperability and rendering at an industrial scale.
  • This collaboration aims to create a physics-based digital twin for sustainable ship manufacturing projects.

Collaboration and Productivity Enhancement

The focus here is on how integrating Omniverse into workflows enhances collaboration across departments, leading to increased productivity by establishing a common ground truth.

Enhanced Collaboration

  • Connecting Omniverse throughout the workflow boosts productivity significantly.
  • Departments like design, engineering, manufacturing planning, art, architecture, and marketing benefit from operating on a shared ground truth.

Automotive Industry Transformation with Robotics

Discussion centers around the transformation of the automotive industry through robotics integration from top to bottom stack development by Nvidia.

Automotive Robotics Integration

  • Nvidia develops a complete robotic stack including self-driving applications for automotive companies like Mercedes and JLR.
  • The software-defined autonomous robotic systems encompass computer vision, AI control, and planning technology.

Jetson Robotics Computer and Isaac Perceptor

In this section, the speaker discusses the Jetson robotics computer and introduces the Isaac Perceptor, highlighting its advanced capabilities for future robotics applications.

Jetson Robotics Computer

  • The Jetson robotics computer is 100% CUDA compatible, enabling a rich ecosystem of software development.
  • NVIDIA focuses on maintaining compatibility with various tools to enhance developers' capabilities.

Isaac Perceptor

  • Introducing the Isaac Perceptor, a revolutionary SDK for robots with perception abilities.
  • Unlike pre-programmed bots, the Isaac Perceptor allows adaptive programming based on waypoints for dynamic navigation.
  • Features state-of-the-art vision odometry and 3D reconstruction for enhanced depth perception in robotic operations.
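
The waypoint-based navigation the bullets describe reduces, at its simplest, to stepping a bounded distance toward each waypoint in turn. An illustrative follower, not the Isaac Perceptor API:

```python
import math

def follow_waypoints(start, waypoints, step=0.5):
    """Walk toward each waypoint in order, moving at most `step` per tick."""
    x, y = start
    path = [(x, y)]
    for wx, wy in waypoints:
        while math.hypot(wx - x, wy - y) > 1e-9:
            dx, dy = wx - x, wy - y
            dist = math.hypot(dx, dy)
            move = min(step, dist)           # bounded step toward the goal
            x, y = x + dx / dist * move, y + dy / dist * move
            path.append((x, y))
    return path

path = follow_waypoints((0.0, 0.0), [(1.0, 0.0), (1.0, 1.0)])
```

What perception adds on top of this skeleton is re-planning: with vision odometry and 3D reconstruction, the robot can update its pose and its waypoints as the environment changes instead of replaying a fixed path.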

BDX Robots and Five Closing Points

In this section, the speaker introduces the BDX robots of Disney Research and discusses five key points related to a new Industrial Revolution and generative AI.

Introduction of BDX Robots

  • The speaker presents the BDX robots of Disney Research, highlighting their significance in the technological landscape.

Five Key Points

  • A new Industrial Revolution is emphasized, with a focus on accelerating every data center for modernization.
  • Generative AI is introduced as a result of enhanced computational capabilities, leading to the creation of valuable software.
  • Discussion of how new computers and NIMs will revolutionize software distribution and application development.
  • NIMs are highlighted as tools to aid in creating proprietary applications and chatbots for future technologies.
  • The necessity of robotic systems in various industries is discussed, emphasizing the need for a digital twin platform like Omniverse.

Nvidia's Vision and Blackwell GPU

This section delves into Nvidia's vision regarding GPUs and introduces Blackwell as a significant advancement in processor technology.

Nvidia's Vision

  • The speaker shares Nvidia's perspective on GPUs, focusing on software stacks and introducing Blackwell as an innovative processor design.

Video description

Watch NVIDIA CEO Jensen Huang’s GTC keynote to catch all the announcements on AI advances that are shaping our future. Dive into the announcements and discover more content at https://www.nvidia.com/gtc. Follow NVIDIA on X (formerly Twitter): https://twitter.com/NVIDIAGTC https://twitter.com/NVIDIA