Yann LeCun on What Comes After LLMs

Name: Yann LeCun on What Comes After LLMs
Uploaded: 2026-05-15T13:00:38.000Z
Duration: 2 h 42 min 57 s

The Future of AI: Insights from a Godfather of AI

Overview of Progress in AI

The speaker predicts that within five years, there will be significant advancements in AI, potentially leading to "world domination."

Clarifies misconceptions about their role at Meta and the relationship with Alex regarding AI development.

Discusses the limitations of large language models (LLMs) and the decision to pursue a different architectural approach.

Diverging Views on LLMs

The speaker expresses skepticism towards LLMs, suggesting they are not a pathway to achieving human-like intelligence.

Introduces AMI (Advanced Machine Intelligence), emphasizing its focus on real-world applications rather than just language manipulation.

Challenges with Current AI Models

Acknowledges that while LLMs have utility, they do not lead to true understanding or intelligence akin to humans or animals.

Highlights the complexity of understanding the physical world compared to language processing, which is often oversimplified.

Transition from Research to Application

Describes how Meta's focus shifted away from exploratory research towards product-oriented development, impacting innovative projects.

Discusses the tension between pursuing diverse research directions versus focusing on commercially viable products.

World Models vs. Generative Approaches

Defines world models as systems that allow agents to predict outcomes based on their actions—essential for intelligent behavior.

Emphasizes that current LLM architectures lack predictive capabilities necessary for planning and optimization.
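
To make the idea of planning with a world model concrete, here is a minimal sketch under stated assumptions. The one-dimensional state, the functions `world_model`, `cost`, and `plan`, and the exhaustive search are all hypothetical illustrations, not the architecture discussed in the episode. The point is only that the agent predicts the outcome of candidate action sequences and picks the sequence whose predicted outcome best satisfies an objective.

```python
from itertools import product

def world_model(state, action):
    """Hypothetical dynamics: the state is a position on a line; an action shifts it."""
    return state + action

def cost(state, goal):
    """Objective to minimize: distance between the predicted state and the goal."""
    return abs(goal - state)

def plan(state, goal, actions=(-1, 0, 1), horizon=3):
    """Try every action sequence and keep the one with the lowest predicted final cost."""
    best_seq, best_cost = None, float("inf")
    for seq in product(actions, repeat=horizon):
        s = state
        for a in seq:
            s = world_model(s, a)  # predict the outcome without acting in the world
        c = cost(s, goal)
        if c < best_cost:
            best_seq, best_cost = seq, c
    return best_seq, best_cost

print(plan(0, 3))  # ((1, 1, 1), 0)
```

Exhaustive search only works for tiny action spaces; it is used here because it keeps the prediction-then-optimization loop easy to read.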

Learning from Cognitive Science

Mentions inspiration drawn from cognitive science regarding how humans plan actions based on predicted consequences.

Critiques generative models for learning representations less effectively than non-generative architectures such as the Joint Embedding Predictive Architecture (JEPA).

Future Directions and Applications

Envisions practical applications in industries requiring complex system modeling beyond traditional methods.

Identifies potential uses in various sectors such as manufacturing and healthcare where predicting system dynamics is crucial.

The Future of Intelligent Systems and AI

Goals and Aspirations in AI Development

Linus Torvalds humorously stated his goal for Linux as "total world domination," which reflects the ambition behind developing intelligent systems that could dominate various sectors.

Current designs focus on creating systems capable of thinking, with language interfaces being an additional layer rather than a primary function.

Methodology and Progress in AI Training

Within the next year, a general methodology for training hierarchical models across diverse modalities is expected to be established, particularly in video processing.

Demonstrations will include training world models for applications in robotics, industrial process control, and healthcare within 12 to 18 months.

Industry Realizations and Paradigm Shifts

There is a growing recognition within industries that traditional LLMs (Large Language Models) are inadequate for real-world data applications.

A paradigm shift is underway as industries seek effective solutions for robotics and other practical applications by early 2027.

The Role of Tapestry in AI Sovereignty

Conceptualizing Tapestry

Tapestry aims to create an open foundation model that can be fine-tuned by users globally, addressing cultural and linguistic diversity.

As AI assistants become more prevalent, there’s concern about their effectiveness across different languages and cultures not well represented by existing models.

Addressing Global Needs through Open Platforms

Many countries desire sovereignty over their AI technologies to prevent external biases from influencing local populations.

Tapestry proposes a federated learning approach where contributors maintain control over their data while collaborating on global model development.
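
The summary does not say how Tapestry would implement federated learning. As a rough illustration only, the sketch below uses federated averaging (FedAvg), a standard technique swapped in here as an assumption. The linear model, the synthetic data, and the function names `local_update` and `federated_round` are hypothetical. What it shows is the general pattern: each contributor trains on data that never leaves its hands, and only model weights are shared and averaged.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, steps=20):
    """Run a few gradient steps of linear regression on one contributor's private data."""
    w = weights.copy()
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def federated_round(global_w, clients):
    """One round: every client trains locally; only weights are sent back and averaged."""
    updates = [local_update(global_w, X, y) for X, y in clients]
    sizes = np.array([len(y) for _, y in clients], dtype=float)
    # Weight each client's update by the size of its local dataset.
    return np.average(updates, axis=0, weights=sizes)

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])  # synthetic ground truth shared by all clients
clients = []
for n in (50, 80, 30):
    X = rng.normal(size=(n, 2))
    clients.append((X, X @ true_w))  # each (X, y) pair stays with its client

w = np.zeros(2)
for _ in range(30):
    w = federated_round(w, clients)
```

In this toy setting every client's data comes from the same underlying model, so the averaged weights converge to it; real deployments with culturally and linguistically different data would not be this well behaved.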

The Evolution of Open Source vs. Proprietary Models

Historical Context of Open Source Success

The transition from proprietary systems like Solaris to open-source platforms like Linux illustrates how open models can dominate technology landscapes.

Current companies like OpenAI may face similar fates as past tech giants if they do not adapt to the demand for open-source solutions.

Limitations of Current LLM Technologies

LLM capabilities are limited: they excel at problem-solving but lack the creativity needed for higher-level tasks such as software architecture or conceptualization.

Understanding the Constraints of LLM Capabilities

Problem-Solving vs. Creative Thinking

LLMs' strengths lie in structured domains like mathematics, where language serves as the reasoning substrate; however, they struggle with the creative aspects inherent in these fields.

Predictive Limitations of LLM Systems

For an agentic system to solve new problems effectively, it must predict outcomes based on its actions—something current LLM architectures cannot reliably achieve.

Diverging Perspectives on AI Safety

Concerns Over Reliability

There is skepticism regarding whether current LLM architectures can ever be made reliable due to their intrinsic inability to predict consequences accurately.

Alternative Approaches Towards Safer AI

Proposed objective-driven architectures would allow systems to predict outcomes based on defined objectives while incorporating safety constraints into their operational framework.

Understanding the Limitations of LLMs in Healthcare

The Need for Advanced Models

Discussion of why large language models (LLMs) may not suffice in healthcare, particularly for chronic diseases. A deeper understanding of patient physiology is necessary to design effective treatment plans.

Challenges with Stem Cell Treatment

Inquiry into how to instruct stem cells to differentiate into specific cell types, such as pancreatic beta cells, which are crucial for insulin production in type 1 diabetes patients.

The Role of LLMs vs. Top Doctors

Exploration of the potential for LLMs to scale top-tier medical expertise globally while acknowledging that they primarily regurgitate existing knowledge rather than providing innovative solutions.

Practical Experience vs. Knowledge Accumulation

Emphasis on the necessity of hands-on experience in medicine beyond theoretical knowledge; practical skills are essential for accurate diagnosis and treatment.

Reflections on Leadership at Meta

Achievements at FAIR

Reflection on building a leading research lab (FAIR), contributing significantly to tools like PyTorch that have become industry standards.

Innovation Chain and Organizational Structure

Description of the innovation process from blue-sky research through practical application, highlighting where many organizations fail due to poor alignment between research and product development.

Isolation within Meta's Structure

Commentary on how FAIR became isolated within Meta, leading to missed opportunities as innovative ideas were not pursued effectively by the broader organization.

The Evolution and Challenges of AI Research

Short-Term Pressures in AI Development

Discussion about the increasing short-term pressures faced by organizations like Meta, impacting their ability to foster breakthrough research akin to earlier successful models like FAIR or Google’s approach.

Current Landscape for Researchers

Insight into how current researchers face challenges working under short-term priorities, potentially stifling long-term innovation and creativity in AI development.

Personal Journey and Vision Beyond Meta

Transitioning Roles at Facebook/Meta

Overview of personal career trajectory from director at FAIR to chief AI scientist, emphasizing a desire to focus more on scientific vision rather than management responsibilities.

Pursuing Self-Supervised Learning

Introduction of concepts around self-supervised learning as a pathway toward developing human-like AI systems based on sensory predictions rather than traditional supervised methods.

Future Directions in AI Research

Shifts in Understanding Self-Supervised Learning

Reflection on evolving perspectives regarding unsupervised versus self-supervised learning techniques over time, particularly their applications beyond video prediction towards language processing with LLM success stories.

Understanding Neural Network Training Techniques

Challenges in Backpropagation and Weight Sharing

The discussion begins with the practice of not backpropagating through one of two encoders and instead setting its weights to an exponential moving average of the other encoder's weights. This method draws on intuitions from reinforcement learning to prevent collapse, though the reasons it works remain unclear.

There are theoretical papers that attempt to explain why this approach might work in specific cases, but they do not provide satisfactory answers. The cost function being minimized does not accurately reflect what is happening during training.
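
The moving-average scheme described in this section can be sketched as follows. This is a minimal illustration under assumptions: the parameter dictionaries, the `decay` value, and the random stand-in for a gradient step are hypothetical, and no specific model from the episode is being reproduced. The target encoder never receives gradients; it only trails the online encoder.

```python
import numpy as np

def ema_update(target, online, decay=0.99):
    """Move each target parameter a small step toward the matching online parameter."""
    return {k: decay * target[k] + (1.0 - decay) * online[k] for k in target}

rng = np.random.default_rng(0)
online = {"W": rng.normal(size=(4, 3))}
target = {k: v.copy() for k, v in online.items()}  # target starts as a copy

for step in range(100):
    # Stand-in for a gradient step: in real training only the online encoder
    # is updated by backpropagation; the target is treated as a constant.
    online["W"] = online["W"] - 0.01 * rng.normal(size=(4, 3))
    target = ema_update(target, online)
```

A decay close to 1 makes the target change slowly, which is the property usually credited with stabilizing training, although, as noted above, the explanation is not settled.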

Transitioning Away from Traditional Methods

Recent developments include explicit regularizers aimed at preventing collapse by maximizing the information content of the encoder's output. This aligns with historical works by Becker and Hinton (1989) and Schmidhuber (1992), as well as contemporary contrastive techniques.

A significant challenge remains: measuring or establishing a lower bound for information content. Currently, only upper bounds can be determined, leading researchers to rely on these estimates without certainty.
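
One textbook example of such an upper bound, given here as an illustration and not necessarily the bound meant in the conversation: for a $d$-dimensional embedding $z$ with covariance matrix $\Sigma$ (both symbols are introduced here for the example), the differential entropy $h(z)$ is at most that of a Gaussian with the same covariance.

```latex
h(z) \;\le\; \tfrac{1}{2}\,\log\!\big((2\pi e)^{d}\,\det\Sigma\big),
\qquad \text{with equality if and only if } z \text{ is Gaussian.}
```

Pushing the right-hand side up does not by itself guarantee that $h(z)$ increases, which is the uncertainty described above.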

Innovations in Regularization Techniques

The latest technique discussed is SIGReg (Sketched Isotropic Gaussian Regularization), which aims to force the distribution of the encoder's output variables to resemble an isotropic Gaussian, thereby maximizing information content in a different way than previous methods.

Variations of SIGReg have been developed, including ones that produce sparse representations and ones that produce isotropic representations that are not strictly Gaussian. Collaborative research has led to promising results in small-scale world models using these techniques.
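
As a rough sketch of what pushing embeddings toward an isotropic Gaussian can look like, the code below projects embeddings onto random directions and compares the empirical characteristic function of each one-dimensional projection with that of a standard normal. This is a simplified stand-in written for illustration, not the published SIGReg implementation; the function name `gaussianity_penalty`, the number of directions, and the evaluation grid `t` are all assumptions.

```python
import numpy as np

def gaussianity_penalty(z, num_directions=64, seed=0):
    """Score how far embeddings z (n samples, d dims) are from an isotropic Gaussian."""
    rng = np.random.default_rng(seed)
    n, d = z.shape
    dirs = rng.normal(size=(d, num_directions))
    dirs /= np.linalg.norm(dirs, axis=0, keepdims=True)  # unit-length random directions
    proj = z @ dirs                                      # (n, num_directions) 1-D sketches
    t = np.linspace(-3.0, 3.0, 17)                       # points where the CFs are compared
    # Empirical characteristic function of every projection at every t.
    ecf = np.exp(1j * proj[:, :, None] * t).mean(axis=0)
    target = np.exp(-0.5 * t**2)                         # characteristic function of N(0, 1)
    return float(np.mean(np.abs(ecf - target) ** 2))

rng = np.random.default_rng(1)
gaussian_z = rng.normal(size=(1000, 8))   # already isotropic Gaussian
collapsed_z = np.zeros((1000, 8))         # fully collapsed representation
```

A collapsed representation scores far worse than a Gaussian one under this penalty, so adding such a term to a training loss would discourage collapse; used as a loss, it would need to be written in a framework with automatic differentiation.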

Conclusion and Future Directions

The conversation wraps up with thanks to Yann LeCun for his insights into neural network training methodologies. Jacob Efron emphasizes the importance of audience support for continuing such discussions on AI advancements.

Efron encourages listeners to subscribe and share the podcast as it helps facilitate future episodes featuring expert guests discussing AI's implications for business and society.