NVIDIA CEO Jensen Huang GTC 2026 Full Keynote
How Intelligence is Made: The Role of Tokens
Introduction to Tokens
- Tokens are described as the building blocks of AI, produced by a new kind of factory that generates intelligence.
- They represent a new frontier: transforming data into knowledge and harnessing clean energy to unlock potential across many domains.
Impact of Tokens
- Tokens facilitate advancements in both virtual and physical worlds, aiding robots and enhancing human experiences.
- Their continuous operation allows for improvements where human intervention is limited, contributing to overall well-being.
Future Prospects
- The use of tokens aims to empower global initiatives and reach unprecedented heights in technology, symbolized by "Starcloud-1."
Nvidia's Vision at GTC Conference
Welcome Address by Jensen Huang
- Jensen Huang introduces the conference, emphasizing its focus on technology and platforms while acknowledging early attendees.
Nvidia's Platforms Overview
- Nvidia operates three main platforms: CUDA-X, systems architecture, and a new initiative called AI factories.
Ecosystem Engagement
- Huang expresses gratitude towards pregame show hosts and highlights the extensive participation from 450 companies at the event.
The Evolution of CUDA
Celebrating 20 Years of CUDA
- This year marks the 20th anniversary of CUDA, showcasing its evolution as a revolutionary architecture for programming multi-threaded applications.
Integration Across Industries
- CUDA has become integral to numerous ecosystems with thousands of tools available for developers; it supports diverse applications in AI.
The Flywheel Effect in Computing
Building an Installed Base
- The installed base of CUDA has grown significantly over two decades, facilitating breakthroughs such as deep learning through developer engagement.
Accelerating Innovation
- The flywheel effect describes how increased downloads lead to more algorithms being developed, creating new markets and expanding ecosystems around them.
Sustaining Applications Through Infrastructure
Long-term Viability
- Nvidia GPUs support every phase of the AI lifecycle; their high utility ensures long-lasting value across various applications.
Continuous Improvement
- Ongoing software updates enhance performance while reducing costs over time due to architectural compatibility across all GPUs.
NVIDIA's Journey and Innovations in Graphics Technology
The Evolution of NVIDIA Architecture
- NVIDIA's new optimization benefits millions globally, expanding its reach while reducing computing costs, which fosters further growth.
- GeForce is highlighted as NVIDIA's most successful marketing strategy, turning young gamers into future customers through their parents' purchases.
- The introduction of the programmable shader 25 years ago marked the beginning of CUDA’s journey, leading to significant advancements in graphics technology.
- Despite initial hardships, NVIDIA dedicated itself to developing CUDA, believing strongly in its potential for revolutionizing computing.
- The evolution from pixel shaders to RTX architecture ten years ago represents a major redesign for modern computer graphics.
AI and Deep Learning Integration
- GeForce enabled Alex Krizhevsky and Geoffrey Hinton to leverage GPUs to accelerate deep learning, sparking the AI revolution.
- Ten years ago, NVIDIA introduced hardware ray tracing alongside programmable shading, anticipating AI's transformative impact on computer graphics.
- The next generation of graphics technology is presented as "neural rendering," merging 3D graphics with artificial intelligence.
Fusion of Structured Data and Generative AI
- The integration of structured data from virtual worlds with generative AI leads to highly realistic content creation that remains controllable.
- This fusion concept is expected to permeate various industries, establishing structured data as the foundation for trustworthy AI applications.
Understanding Structured vs. Unstructured Data
- A detailed schematic illustrates the importance of structured data platforms (e.g., SQL, Spark), which serve as the ground truth for enterprise computing.
- Future developments will see AI utilizing structured databases at unprecedented speeds compared to human capabilities.
- Unstructured data constitutes about 90% of global information but has been largely underutilized due to indexing challenges; AI aims to address this issue effectively.
NVIDIA's Innovations in Data Processing
Foundational Libraries for Data Management
- NVIDIA has developed two key libraries: cuDF for structured data (data frames) and cuVS for unstructured data (vector search), much as RTX was created for 3D graphics.
- These platforms are anticipated to play a crucial role in the future of data processing, which is complex due to the multitude of existing systems and services.
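To make the data-frame side concrete, here is a minimal sketch using the open-source RAPIDS cuDF library (the library the summary appears to refer to); the file name and column names are invented for illustration:

```python
# Minimal RAPIDS cuDF sketch: a pandas-style groupby that runs on the GPU.
# "orders.csv" and its columns ("region", "amount") are assumptions.
import cudf

df = cudf.read_csv("orders.csv")                 # load directly into GPU memory
totals = df.groupby("region")["amount"].sum()    # GPU-accelerated aggregation
print(totals.sort_values(ascending=False).head())
```

cuDF also ships a pandas-accelerator mode (`cudf.pandas`) that speeds up existing pandas code without rewrites, which is how drop-in acceleration of engines like those mentioned below typically works.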
Collaboration with IBM
- IBM, known for inventing SQL, is enhancing its watsonx.data platform with NVIDIA's cuDF, marking a significant evolution in data processing.
- The collaboration aims to reinvent data processing for AI by integrating NVIDIA GPU computing libraries into SQL engines.
Accelerated Computing Benefits
- Rapid access to large datasets is essential for AI; traditional CPU systems struggle to keep pace with this demand.
- For instance, Nestle can now process supply chain decisions five times faster at 83% lower costs using accelerated Watson X Data on NVIDIA GPUs.
Partnerships and Cloud Integration
- Dell has partnered with NVIDIA to create an AI data platform that combines cuDF and cuVS, optimized for the AI era.
- Collaborations extend to Google Cloud where NVIDIA accelerates Vertex AI and BigQuery, significantly reducing computing costs—e.g., nearly 80% cost reduction for Snapchat.
Moore's Law and Future Directions
- Traditional performance improvements as per Moore’s Law have plateaued; thus, a new approach through accelerated computing is necessary.
- By continuously optimizing algorithms, NVIDIA aims to reduce computing costs while increasing speed and scale across various applications.
Expanding Cloud Services Reach
- NVIDIA's accelerated computing platform includes libraries like RTX, cuDF, and cuVS that enhance cloud service capabilities.
- The integration of these technologies allows customers from various sectors (e.g., Salesforce, Puma) to leverage enhanced computational power via cloud services.
Future Prospects with AWS
- Ongoing collaborations with AWS aim to integrate OpenAI into their ecosystem, driving substantial growth in cloud computing consumption.
- This partnership will enhance AWS offerings such as EMR and SageMaker by leveraging NVIDIA’s acceleration technologies.
NVIDIA's Strategic Partnerships and Innovations in AI
Deep Integration with Cloud Services
- NVIDIA has established a strong partnership with AWS, being their first cloud partner, and has also collaborated extensively with Microsoft Azure.
- The company accelerates various services on Azure, including Bing search and AI foundry capabilities, which are crucial for expanding AI globally.
Confidential Computing Capabilities
- NVIDIA's GPUs are the first to support confidential computing, ensuring that even operators cannot access sensitive data or models.
- This technology is vital for deploying valuable AI models securely across different clouds and regions.
Historical Context of Partnerships
- NVIDIA was Oracle's first AI customer, showcasing its pioneering role in introducing AI cloud concepts to the company.
- CoreWeave is highlighted as the world's first AI-native cloud provider, focused solely on hosting GPUs for accelerated computing.
Innovative Platforms and Deployment Flexibility
- A collaboration between NVIDIA, Palantir, and Dell has led to the creation of a new type of AI platform that can be deployed anywhere—on-premises or in air-gapped environments.
- The ability to deploy these platforms without compromising security is attributed to NVIDIA’s confidential computing capabilities.
Vertical Integration Strategy
- NVIDIA positions itself as a vertically integrated but horizontally open company, emphasizing that accelerated computing transcends just hardware issues.
- The focus on application acceleration is critical; understanding specific applications allows for tailored solutions that enhance performance significantly.
Comprehensive Understanding of Applications
- To effectively accelerate applications across various domains (data centers, edge devices), NVIDIA must grasp algorithms and deployment scenarios thoroughly.
- This comprehensive approach enables integration into diverse systems while maintaining flexibility in software offerings.
Ecosystem Engagement at GTC
- The GTC event showcases how NVIDIA collaborates with upstream and downstream partners within its supply chain to drive innovation forward.
- Notably, financial services represent a significant portion of attendees at GTC, indicating strong interest from this sector in accelerated computing technologies.
NVIDIA's Computing Platforms and AI Innovations
Overview of NVIDIA's Computing Platforms
- NVIDIA emphasizes the need for domain-specific libraries to activate their computing platforms across various industries, including autonomous vehicles and financial services.
- The transition from classical machine learning to supercomputers utilizing deep learning is highlighted, showcasing advancements in algorithmic trading and healthcare.
Key Areas of Focus
- A keynote track led by Kimberly Powell discusses applications of AI in healthcare, such as drug discovery and customer-service support through AI agents.
- NVIDIA is involved in a significant industrial buildout, focusing on creating AI factories and chip plants that will transform global industries.
Media, Entertainment, and Retail Applications
- In media and entertainment, real-time AI platforms are being developed for translation, broadcasting support, and live gaming experiences.
- The retail sector is leveraging NVIDIA technology for supply chain optimization and developing intelligent shopping systems with AI agents.
Robotics and Telecommunications Integration
- NVIDIA has been building essential computers for robotic systems over the past decade, collaborating with numerous companies in robotics.
- The telecommunications industry is undergoing a transformation where base stations will evolve into AI infrastructure platforms capable of running at the edge.
CUDA X Libraries: The Core of Innovation
- At the heart of NVIDIA’s offerings are CUDA X libraries which serve as algorithms enabling solutions across various sectors.
- During the event, NVIDIA announces updates to its extensive library collection aimed at solving complex problems across different fields.
Impact of CUDA Deep Neural Networks
- CUDA deep neural networks have revolutionized artificial intelligence by facilitating breakthroughs in modern AI development.
- A variety of specialized libraries (e.g., cuOpt for decision optimization; cuLitho for computational lithography; Aerial for AI-RAN; Warp for differentiable physics; Parabricks for genomics) showcase the breadth of applications supported by these algorithms; a small usage sketch follows.
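As a taste of how these libraries feel to use, here is a minimal NVIDIA Warp kernel (one of the libraries named above) that scales an array on the GPU; it assumes a CUDA-capable device and `pip install warp-lang`:

```python
# Minimal NVIDIA Warp sketch: define a kernel in Python and launch it.
import warp as wp

wp.init()

@wp.kernel
def scale(a: wp.array(dtype=float), s: float):
    i = wp.tid()              # this thread's index
    a[i] = a[i] * s

arr = wp.array([1.0, 2.0, 3.0], dtype=float)
wp.launch(scale, dim=arr.shape[0], inputs=[arr, 2.0])
print(arr.numpy())            # [2. 4. 6.]
```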
Simulation vs. Animation in Demonstrations
- All demonstrations presented were simulations rather than animations or articulated models, emphasizing Nvidia's focus on realistic modeling through advanced algorithms.
Collaboration with Major Companies
- Notable partnerships include collaborations with major corporations like Walmart and Toyota alongside emerging "AI native" companies that are shaping future innovations.
The Rise of AI Startups and Investment Trends
Overview of New Companies in AI
- The speaker discusses a list of emerging companies in the AI sector, including well-known names like OpenAI and Anthropic, highlighting their diverse applications across various verticals.
Surge in AI Investments
- In the past two years, particularly the last year, there has been an unprecedented surge in venture investments into AI startups, totaling $150 billion—marking the largest investment wave in human history.
Shift in Investment Scale
- The scale of investments has dramatically increased from millions to billions of dollars. This shift is attributed to the high demand for computational resources among new AI companies.
Demand for Computational Resources
- New AI companies require substantial computing power and tokens. They either create their own tokens or integrate existing ones from established firms like OpenAI and Anthropic.
Impact on Computing Standards
- The speaker draws parallels between current advancements in computing and historical shifts during the PC and internet revolutions, suggesting that we are at the beginning of a new platform shift with significant implications for future companies.
Key Developments Driving Generative AI
Emergence of Generative AI
- The introduction of ChatGPT marked the start of a generative AI era capable not only of understanding but also generating unique content, revolutionizing how computers operate.
Transition from Retrieval-Based to Generative Computing
- Generative computing represents a fundamental change from traditional retrieval-based methods. This evolution will influence computer architecture and overall definitions of computing moving forward.
Advancements in Reasoning Capabilities
- Recent developments have enabled generative models to reason effectively by breaking down complex problems into manageable parts, enhancing their reliability through grounded truthfulness.
Revolutionizing Software Engineering with Claude Code
Introduction to Agentic Models
- Claude Code represents a significant advance: it lets models read files, compile code, run tests, and iterate on solutions autonomously, transforming software engineering practice.
Changing Interaction with AI Tools
- Users now interact with AIs by asking them to create or build rather than just retrieve information. This shift emphasizes problem-solving capabilities over mere data retrieval.
Increased Productivity Through Reasoning
- The evolution from perception-based AIs to those capable of reasoning signifies a leap towards productive work. AIs can now perform tasks that require critical thinking and problem-solving skills effectively.
The Growing Demand for GPUs
Escalating GPU Demand
- There is an overwhelming demand for NVIDIA GPUs due to increased computational needs driven by advancements in AI technologies; prices are soaring as supply struggles to keep pace with demand.
The Rise of AI Inference and Computing Demand
The Surge in Token Generation and Compute Demand
- The transition from training to inference has led to a significant increase in the number of tokens generated, with compute requirements rising by approximately 10,000 times.
- Over the past two years, computing demand has surged dramatically, with estimates suggesting an increase of up to 1 million times. This sentiment is echoed across various startups and organizations like OpenAI and Anthropic.
Financial Projections for AI Infrastructure
- Last year, there was a projected demand of $500 billion for AI infrastructure through 2026; however, current insights suggest this could rise to at least $1 trillion through 2027.
- The speaker emphasizes that this substantial figure reflects not just potential but also confidence in the market's growth trajectory.
NVIDIA's Commitment to AI Inference
- NVIDIA focused on enhancing its capabilities in all phases of AI development during what they termed "the year of inference." This strategic move aims to ensure long-term utility and cost-effectiveness of their infrastructure investments.
- The company asserts that it offers the lowest-cost infrastructure available globally for AI applications, which is crucial as demand continues to escalate.
Broadening Applications Across Domains
- NVIDIA's architecture supports a wide range of AI models across various domains including language processing, biology, computer graphics, and robotics—making it a versatile platform for developers worldwide.
- The company's collaboration with major players like Anthropic signals its pivotal role in advancing open-source models that now approach frontier performance.
Market Dynamics and Future Outlook
- Approximately 60% of NVIDIA’s business comes from hyperscalers (large cloud service providers), indicating strong internal consumption trends towards deep learning and large language models within these companies.
- The remaining 40% encompasses diverse sectors such as regional clouds, enterprise solutions, industrial applications, and supercomputing systems—highlighting the resilience and broad applicability of AI technology today.
Revolutionizing AI Inference with NVLink 72
Architectural Innovations in NVLink 72
- The Hopper architecture was rethought, leading to NVLink 72, a complete re-architecture and disaggregation of the computing system.
- The Grace Blackwell NVLink 72 represents a major investment and engineering effort, introducing a new FP4 tensor core that raises performance without sacrificing accuracy.
- New algorithms and supercomputing capability delivered through DGX Cloud have optimized kernel performance for AI inference.
Importance of Inference in AI
- Jensen emphasizes that while inference may seem simple, it is crucial for driving revenue in AI applications.
- Tokens per watt is highlighted as a key metric for data centers, emphasizing the need for efficiency due to physical power constraints.
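To make the tokens-per-watt metric concrete, a back-of-envelope sketch; every number below is an illustrative assumption, not a keynote figure:

```python
# Tokens-per-watt back-of-envelope; all inputs are illustrative assumptions.
power_watts = 1_000_000                  # a 1 MW slice of a data center
tokens_per_second = 1_000_000            # aggregate token throughput (assumed)

tokens_per_joule = tokens_per_second / power_watts   # 1 W for 1 s = 1 J
price_per_million_tokens = 5.00                      # assumed $/1M tokens

revenue_per_hour = tokens_per_second * 3600 / 1e6 * price_per_million_tokens
print(f"{tokens_per_joule:.2f} tokens/J -> ${revenue_per_hour:,.0f}/hour")
```

At a fixed power budget, raising tokens per joule raises revenue proportionally, which is why the metric matters to operators.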
Performance Metrics and Insights
- The relationship between inference speed and model complexity is discussed; faster inference allows processing larger models with more context.
- A trade-off exists where smarter AI (higher intelligence) results in lower throughput due to increased processing time.
Cost Efficiency and Competitive Advantage
- Nvidia's performance metrics show an unprecedented 35x gain in efficiency, far exceeding what Moore's Law trends would predict.
- Jensen mentions that Nvidia’s cost per token is currently unmatched due to their advanced architecture design.
Strategic Positioning in the Market
- Emphasizing extreme co-design as a competitive edge, Jensen notes that even a free but poorly designed system remains costly to operate, because power and infrastructure dominate total cost.
- Nvidia integrates its software vertically while maintaining horizontal openness, allowing seamless integration into global inference service providers' ecosystems.
Impact on Data Center Operations
- The transformation of data centers from mere storage facilities to token generation factories highlights the evolving landscape of computational resources.
- An example shows how updating software led to a dramatic increase in token production speeds—from 700 tokens per second to nearly 5,000—demonstrating the power of optimization.
The Future of AI and Computing Architecture
The Importance of Token Optimization
- Inference is crucial as it represents the workload, while tokens are becoming a new commodity that drives revenue. Optimizing architecture for future demands is essential.
- Every company in the tech sector will focus on their token factory effectiveness, indicating a shift towards intelligence powered by tokens.
Evolution of AI Computing Systems
- The DGX-1, introduced in 2016, marked the beginning of deep-learning-focused computers, featuring eight Pascal GPUs connected by NVLink.
- The DGX A100 SuperPOD combined scale-up and scale-out architectures, leading to Hopper with its FP8 Transformer Engine, which kicked off the generative AI era.
Advancements in System Architecture
- Blackwell redefined AI supercomputing with NVLink 72, allowing for unprecedented bandwidth and integration of various computing components.
- With three scaling laws—pre-training, post-training, and inference—compute demand is growing exponentially alongside developments in Agentic systems.
Vera Rubin: A New Era for Agentic AI
- Vera Rubin architecture supports all phases of Agentic AI with significant compute capabilities (3.6 exaflops), enhancing orchestration and workflows.
- The Groq 3 LPX rack introduces a token accelerator that significantly boosts throughput efficiency over previous systems.
Innovations in CPU Design
- A new CPU designed for high single-threaded performance has been developed to support agentic processing needs efficiently.
- The Grace Blackwell system features complete liquid cooling solutions that drastically reduce installation time from two days to two hours.
Unique Cooling Solutions and Networking Technology
- The Vera Rubin system uses hot-water cooling at 45°C, cutting traditional cooling costs and optimizing energy use within data centers.
- NVLink's sixth-generation switching system is unique to this architecture, showcasing advanced manufacturing capabilities.
Nvidia's Revolutionary CPU and Data Center Innovations
Introduction to Nvidia's New Technologies
- Nvidia presents Spectrum-X together with a new CPU claimed to deliver twice the performance per watt of any existing CPU.
- The company is now selling standalone CPUs, which is projected to become a multi-billion dollar business, highlighting the success of their CPU architects.
Overview of New Hardware
- The presentation details NVLink rack technology, designed for efficient data-center cabling.
- Rubin Ultra compute nodes connect 144 GPUs in one NVLink domain through the new Kyber rack design.
Advanced Compute Node Design
- Each compute node slides into the Kyber rack vertically, enhancing connectivity and efficiency within the data center architecture.
- The midplane design replaces traditional cabling with an advanced system capable of connecting multiple GPUs more effectively.
The Future of AI Factories: Throughput and Token Speed
Importance of Throughput and Token Speed
- A critical chart presented outlines throughput versus token speed as essential metrics for future AI factories; every CEO will need to monitor these closely.
- The analysis indicates that throughput and token speed at iso-power (a fixed power budget) map directly to revenue for companies running AI factories.
Evolving Token Metrics
- As model sizes increase, both input and output token lengths are expanding significantly, affecting pricing strategies for tokens as they become commodities.
- Different tiers based on throughput and speed will emerge in the market; free tiers may offer lower speeds while premium services could command higher prices due to enhanced capabilities.
Pricing Strategies for Tokens
- Proposed pricing models suggest varying costs per million tokens based on service levels—ranging from free options up to $150 per million tokens for high-demand applications.
- This tiered approach aims to maximize revenue by aligning smarter AI models with increased pricing potential as capabilities improve.
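A toy model of such tiering is sketched below; the $150-per-million-token ceiling comes from the talk, while the tier split and other prices are invented:

```python
# Toy tiered-pricing model: allocate fixed factory throughput across tiers.
# Only the $150/M-token premium price is from the keynote; the rest is assumed.
tiers = {                      # tier: (tokens/sec allocated, $ per 1M tokens)
    "free":     (500_000,   0.00),
    "standard": (400_000,   5.00),
    "premium":  (100_000, 150.00),
}

total = 0.0
for name, (tps, price) in tiers.items():
    hourly = tps * 3600 / 1e6 * price    # tokens/hour -> millions -> dollars
    total += hourly
    print(f"{name:>8}: {tps:>7,} tok/s -> ${hourly:>9,.2f}/hour")
print(f"{'total':>8}: ${total:,.2f}/hour")
```

In this toy split the premium tier generates most of the revenue despite the smallest throughput share, which is the alignment between smarter models and pricing that the summary describes.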
Looking Ahead: Nvidia's Hopper Architecture
Anticipated Performance Improvements
- Comparisons against Nvidia's Hopper architecture show significant performance gains for the newer generations, indicating a promising trajectory for future development.
Grace Blackwell's Impact on Throughput and Revenue
Introduction to Grace Blackwell
- Grace Blackwell significantly enhances throughput at the free tier, increasing it by 35 times in monetized services.
- The introduction of a new tier reflects typical business models where higher tiers offer better quality and performance.
Customer Perspective and Revenue Generation
- Every service tier experiences increased throughput, with the highest tier seeing a 10x increase, showcasing the hard work behind these improvements.
- A hypothetical distribution of power across tiers illustrates how this model can attract more customers while maximizing revenue potential.
- This approach allows for a fivefold increase in revenues for both Blackwell and Vera Rubin, emphasizing the importance of optimizing service tiers.
Technical Challenges and Solutions
- High throughput requires substantial computational resources (flops), which conflicts with low latency needs; balancing these is crucial.
- The acquisition of Groq technology aims to address these challenges by integrating systems that raise performance without compromising bandwidth.
The Role of NVLink 72 and Groq Technology
Performance Insights
- The NVLink 72 architecture excels at high-throughput workloads but hits limits at very high per-user token speeds due to memory-bandwidth constraints.
- Groq technology is introduced to extend capability beyond what NVLink 72 alone can achieve, particularly for demanding token-generation tasks.
Strategic Recommendations
- For workloads focused on raw throughput, staying 100% Vera Rubin is advisable; incorporating Groq for specific tasks can optimize overall performance.
- The integration strategy uses Groq chips selectively within data centers to balance memory requirements effectively.
Groq's Unique Architecture and Future Potential
Architectural Advantages
- Groq operates as a deterministic dataflow processor with static compilation, allowing compute to be scheduled in lockstep with data arrival.
- This design caters specifically to AI factory workloads, positioning it well as demand for high-speed token generation increases.
Integration Strategy
- Disaggregating inference through software like Dynamo enables optimized workload distribution between Vera Rubin and Groq processors.
- This unified approach leverages strengths from both architectures while addressing memory limitations inherent in each system.
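A conceptual sketch of that disaggregation pattern follows: compute-heavy prompt processing on one pool, bandwidth-bound token generation on another. All class and function names are illustrative stand-ins, not Dynamo's actual API:

```python
# Disaggregated inference sketch: compute-bound prefill on one pool,
# bandwidth-bound decode on another. Names are illustrative, not Dynamo API.
class PrefillPool:                      # stand-in for the GPU (Vera Rubin) side
    def build_kv_cache(self, prompt):
        return {"context": prompt}      # a real pool would run attention here

class DecodePool:                       # stand-in for the bandwidth-optimized side
    def next_token(self, kv_cache):
        return "tok"                    # a real pool would sample from the model

def serve(prompt, n_tokens, prefill=PrefillPool(), decode=DecodePool()):
    kv = prefill.build_kv_cache(prompt)     # heavy FLOPs, done once per request
    return [decode.next_token(kv) for _ in range(n_tokens)]  # latency-critical loop

print(serve("summarize the keynote", 4))    # ['tok', 'tok', 'tok', 'tok']
```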
AI Processing and Hardware Integration
Overview of AI Systems and Hardware
- Serving a trillion-parameter model requires efficient storage on Groq chips, which work alongside Nvidia's Vera Rubin to manage the large KV cache essential for agentic AI systems.
- The decoding process, including attention mechanisms and token generation, is executed on the Vera Rubin system, utilizing advanced mathematical computations to enhance performance.
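A rough sizing sketch shows why the KV cache dominates memory; the model shape and context length below are assumptions, not the keynote's trillion-parameter model:

```python
# Rough KV-cache sizing; the model shape and context length are assumptions.
n_layers, n_kv_heads, head_dim = 96, 8, 128
bytes_per_value = 2                     # FP16/BF16 elements

# Each token stores one key and one value vector per layer per KV head.
per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
context_len = 128_000                   # long agentic context (assumed)

gib = per_token * context_len / 2**30
print(f"{per_token/1024:.0f} KiB/token -> {gib:.1f} GiB per request")
```

At roughly 47 GiB per long-context request under these assumptions, a handful of concurrent agent sessions already outgrows a single GPU's memory, which illustrates why the keynote pairs the GPUs with additional KV-cache capacity.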
Performance Enhancements
- The combination of Groq chips and Vera Rubin delivers a remarkable 35x increase in inference performance for token generation, an unprecedented advance in AI processing.
- Production of the Groq chip is underway, with shipments expected in Q3; early sampling has shown promising results, and Vera Rubin systems are already operational at Microsoft Azure.
Supply Chain and Manufacturing Capacity
- A robust supply chain has been established capable of producing thousands of AI factory systems weekly, indicating strong demand and production efficiency.
- The success of the Vera CPU is attributed to its design tailored for tool use in AI applications, highlighting its importance in next-generation data processing.
Future Developments in Storage Systems
- As AI systems increasingly utilize storage solutions previously managed by humans through SQL, there will be significant demands on storage infrastructure due to accelerated usage patterns driven by AI technologies.
- In two years, a one gigawatt factory could achieve a 350 times increase in token generation speed from 2 million to 700 million tokens per second through innovative architectural designs.
Roadmap for Upcoming Technologies
- The roadmap includes backward compatibility with existing architectures while introducing new systems like Oberon that can scale both copper and optical connections effectively.
- Future developments include the Rubin Ultra chip and LP35, which will incorporate Nvidia's NVFP4 number format to further raise processing speeds.
Scaling Strategies
- Forthcoming innovations such as Feynman will introduce new GPUs alongside LP40 CPUs designed jointly by the Nvidia and Groq teams.
- There is an emphasis on scaling up both copper and optical technologies to meet growing capacity needs across various ecosystems within the industry.
NVIDIA's Transition to AI Infrastructure
Evolution of NVIDIA
- NVIDIA has rapidly transformed from a chip manufacturer to an AI infrastructure company, focusing on building entire AI factories.
- The need for collaboration among technology vendors is emphasized, as many components often operate in isolation until they meet in data centers.
Omniverse and AI Factory Design
- Introduction of the Omniverse platform allows various stakeholders to design gigawatt-scale AI factories virtually, integrating multiple simulation systems.
- The integration with grid power enables dynamic adjustments for energy efficiency within data centers using Max Q technology.
Maximizing Efficiency and Throughput
- The NVIDIA DSX platform aims to maximize token throughput by minimizing wasted power, since every watt counts toward revenue.
- Delays in building AI factories can lead to significant financial losses; thus, maximizing operational efficiency is crucial.
Developer Connectivity and Tools
- Developers utilize several APIs (DSX SIM, DSX Exchange, DSX Flex, DSX Max Q) for simulations and dynamic power management between the grid and data center operations.
- Collaboration with partners like PTC and Jacobs enhances model-based systems engineering through advanced simulation tools.
Digital Twin Technology
- Once operational, digital twins manage infrastructure dynamically with AI agents optimizing cooling and electrical systems for maximum efficiency.
- NVIDIA collaborates globally to build resilient AI infrastructure that ensures high throughput while maintaining energy efficiency.
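A toy version of such a digital-twin control loop is sketched below; the plant model and gains are invented for illustration, and real systems use far richer thermal models:

```python
# Toy digital-twin control loop: a proportional controller nudges pump speed
# toward a temperature setpoint. Plant dynamics and gains are invented.
setpoint_c, temp_c, pump = 45.0, 48.0, 0.4   # target, current temp, pump duty

for step in range(6):
    error = temp_c - setpoint_c
    pump = min(1.0, max(0.0, pump + 0.15 * error))  # proportional adjustment
    temp_c += 0.8 - 1.6 * pump                      # toy plant: heat in, cooling out
    print(f"step {step}: pump={pump:.2f} temp={temp_c:.2f} C")
```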
Future Innovations: Space Computing
Expansion into Space
- NVIDIA is venturing into space computing with plans for satellite imaging and future data centers designed specifically for space environments.
Open Source Contributions
- Introduction of OpenClaw, a groundbreaking open-source project whose adoption quickly surpassed historical benchmarks set by Linux.
Practical Applications of OpenClaw
- OpenClaw lets users create AI agents through simple commands, showcasing its approachable interface and potential impact on development workflows.
What is OpenClaw?
Introduction to OpenClaw
- The speaker shares a personal anecdote about a 60-year-old dad who used OpenClaw to automate beer brewing and manage lobster orders through a connected website, highlighting its practical reach.
Understanding OpenClaw
- OpenClaw is introduced as a system that connects to large language models and manages resources, with access to tools, file systems, and scheduling.
Features of OpenClaw
- It can decompose prompts into step-by-step tasks, spawn sub-agents, and communicate in various modalities (e.g., text messages, emails).
- The speaker compares OpenClaw's functionality to an operating system, suggesting it enables personal agents much as Windows enabled personal computers.
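A minimal sketch of that decompose-and-delegate loop follows; none of this is OpenClaw's actual API, just the shape of the pattern the summary describes:

```python
# Minimal agent-loop sketch: decompose a goal, spawn sub-agents, report back.
# Entirely illustrative; not OpenClaw's real interfaces.
def plan(goal):
    # A real system would ask an LLM to decompose the goal into subtasks.
    return [f"subtask {i} of {goal!r}" for i in (1, 2, 3)]

def sub_agent(task, tools):
    # A real sub-agent would pick tools (files, shell, web) as needed.
    return tools["notify"](f"done: {task}")

def run_agent(goal, tools):
    return [sub_agent(task, tools) for task in plan(goal)]

tools = {"notify": print}   # stand-in for text/email channels
run_agent("brew beer and order lobster", tools)
```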
Implications for Companies
- Every technology company now needs an "OpenClaw strategy," akin to earlier shifts around Linux or Kubernetes; modern enterprise IT must adapt.
The Evolution of Enterprise IT
Traditional vs. Modern IT Landscape
- The traditional enterprise IT model involved data centers storing structured business data processed by software tools for human use.
Transitioning to Agentic Systems
- Post-OpenClaw, every SaaS company is expected to evolve into an "agentic-as-a-service" provider, mirroring past technology shifts that transformed industries.
Security Considerations with Agentic Systems
Risks Associated with Agentic Systems
- Agentic systems can access sensitive information, execute code, and communicate externally—raising significant security concerns regarding data privacy and integrity.
Solutions for Secure Implementation
- Collaboration with security experts led to the development of "Open NeMo Claw," designed for secure enterprise use while integrating agentic AI toolkits.
Open NeMo Claw: A Reference Design
Overview of Open NeMo Claw
- Open NeMo Claw serves as a reference design that lets users connect policy engines from various SaaS companies while ensuring compliance with governance standards.
Customization Capabilities
- Users can create custom models within their own OpenClaw implementations through Nvidia's open-model initiative, spanning AI domains such as robotics and digital biology.
Open Models and AI Ecosystems
The Diversity of Open Models
- The world is diverse, requiring many models for many industries. Open models form one of the largest AI ecosystems globally, with nearly 3 million models across domains such as language, vision, biology, physics, and autonomous systems.
NVIDIA's Contributions to Open-Source AI
- NVIDIA is a major contributor to open-source AI, releasing six families of frontier models along with training data and frameworks that assist developers in customizing new top-ranking models.
Core Model Families
- Key model families include:
- Nemotron: focused on reasoning for language and visual understanding.
- Cosmos: designed for physical AI world generation.
- Alpamayo: aimed at autonomous-vehicle intelligence.
- BioNeMo: targets biology and chemistry applications.
- Earth-2: weather forecasting with AI physics.
Commitment to Continuous Improvement
- NVIDIA emphasizes its commitment to ongoing model development: Nemotron will evolve from version 3 to version 4, ensuring continuous advancement in capabilities.
Enabling Sovereign AI Development
- The goal is to create foundational models that can be fine-tuned by users for specific needs. This includes fostering sovereign AI tailored for different countries and industries.
Nemotron Coalition Announcement
Formation of the Coalition
- NVIDIA announces the formation of the Nemotron coalition, aimed at enhancing Nemotron 4 through partnerships with companies across different sectors.
Notable Partners in the Coalition
- Key partners include:
- Black Forest Labs (imaging)
- Cursor (coding tools)
- LangChain (custom agents)
- Mistral (AI solutions)
Importance of Agentic Systems
- Every enterprise software company must adopt an agentic strategy, integrating NVIDIA's NeMo toolkit and open models into its operations.
The Future Landscape of Enterprise IT
Transformation into a Multi-Trillion Dollar Industry
- The enterprise IT sector is poised for transformation into a multi-trillion dollar industry focused on specialized agents rather than just tools.
Token Economy for Engineers
- Future engineers may receive tokens alongside their salaries as part of a productivity enhancement strategy. These tokens will be generated by collaborative AI factories developed by industry partners.
Emergence of Agentic Companies
- Companies will transition from traditional data centers to becoming agentic entities that produce tokens both for internal use and customer offerings.
Significance of the OpenClaw Event
Comparison with Major Technological Milestones
- The OpenClaw moment is likened to major technological advances like HTML and Linux for its potential to establish an open agentic framework accessible to all industry stakeholders.
The Future of Robotics and Physical AI
Overview of Digital and Physical Agents
- The speaker discusses the distinction between digital agents, which operate in the digital realm, and physically embodied agents, known as robots.
- Emphasis is placed on the development of physical AIs that are essential for robots to function effectively in real-world environments.
Announcements in Robotics Partnerships
- Nvidia announces collaborations with numerous companies involved in robotics, highlighting their extensive ecosystem that includes training computers and simulation systems.
- Introduction of four new partners for Nvidia's robo taxi platform: BYD, Hyundai, Nissan, and Ji, collectively producing 18 million cars annually.
Advancements in Autonomous Vehicles
- Partnership with Uber is revealed to integrate robo taxi-ready vehicles into multiple cities' networks.
- Discussion of how future radio towers will evolve into NVIDIA Aerial AI-RAN systems capable of optimizing network traffic through advanced reasoning.
The Role of Simulation in Robot Training
- The importance of synthetic data generated from AI simulations is highlighted as a necessity for training robots due to the unpredictable nature of real-world scenarios.
- Nvidia's open-source Isaac Lab is introduced as a tool for robot training and evaluation using massive amounts of synthetic data.
Real-world Applications and Collaborations
- Various companies like Paratas AI and Disney are utilizing Nvidia’s technologies (Isaac Lab & Cosmos World models) to enhance their robotic applications across different industries.
- A demonstration featuring a Disney character robot showcases advancements made possible by physics simulations developed collaboratively with Disney and DeepMind.
Conclusion Highlights
- The speaker summarizes key topics discussed during the presentation including inference, physical AI advancements, and collaborative efforts within the robotics sector.
AI Revolution: The Road Ahead
Overview of AI Advancements
- The keynote highlights the emergence of AI factories and autonomous agents, indicating a significant shift in technology and its applications.
- A dramatic increase in computational power is noted, with a multiplication factor of 40 million, showcasing the evolution from traditional training paradigms to more efficient models.
- Historical context is provided on how industrial processes have evolved; previously slow and cumbersome methods are now being replaced by agile systems like DSX and Dynamo that convert power into revenue.
Autonomous Agents and Safety Mechanisms
- Agents are now capable of acting autonomously rather than waiting for instructions, but safety protocols (referred to as "safe claws") ensure they remain on course.
- The narrative emphasizes the reality of advanced AI technologies such as self-driving cars and robots, contrasting them with fictional portrayals in movies.
Economic Implications of AI Development
- The discussion touches on the economic impact of AI advancements, suggesting that new architectures will lead to increased financial opportunities ("raining cash").
- There’s an emphasis on continuous innovation within AI stacks, driven by demand for more tokens and capabilities, reflecting a vibrant ecosystem poised for growth.