Yann LeCun on What Comes After LLMs

The Future of AI: Insights from a Godfather of AI

Overview of Progress in AI

  • The speaker predicts significant advances in AI within five years and jokingly describes the ambition as "world domination."
  • Clarifies misconceptions about their role at Meta and the relationship with Alex regarding AI development.
  • Discusses the limitations of large language models (LLMs) and the decision to pursue a different architectural approach.

Diverging Views on LLMs

  • The speaker expresses skepticism towards LLMs, suggesting they are not a pathway to achieving human-like intelligence.
  • Introduces AMI (Advanced Machine Intelligence), emphasizing its focus on real-world applications rather than language manipulation alone.

Challenges with Current AI Models

  • Acknowledges that while LLMs have utility, they do not lead to true understanding or intelligence akin to humans or animals.
  • Highlights the complexity of understanding the physical world compared to language processing, which is often oversimplified.

Transition from Research to Application

  • Describes how Meta's focus shifted away from exploratory research towards product-oriented development, impacting innovative projects.
  • Discusses the tension between pursuing diverse research directions versus focusing on commercially viable products.

World Models vs. Generative Approaches

  • Defines world models as systems that allow agents to predict outcomes based on their actions—essential for intelligent behavior.
  • Emphasizes that current LLM architectures lack predictive capabilities necessary for planning and optimization.
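
The role a world model plays in planning can be sketched in a few lines. This is a toy illustration, not code from the conversation: the one-dimensional dynamics, the `world_model`, `cost`, and `plan` names, and the exhaustive search are all assumptions chosen to keep the example small.

```python
import itertools

def world_model(state, action):
    """Hypothetical toy dynamics: a 1-D position shifted by the action."""
    return state + action

def cost(state, goal):
    """Distance from the goal; lower is better."""
    return abs(goal - state)

def plan(state, goal, actions=(-1, 0, 1), horizon=3):
    """Roll every action sequence through the model and keep the cheapest one."""
    best_seq, best_cost = None, float("inf")
    for seq in itertools.product(actions, repeat=horizon):
        s = state
        for a in seq:
            s = world_model(s, a)  # predict the consequence without acting
        c = cost(s, goal)
        if c < best_cost:
            best_seq, best_cost = seq, c
    return best_seq, best_cost

best_seq, best_cost = plan(state=0, goal=2)
print(best_seq, best_cost)
```

The planner never acts in the environment while searching; it only queries the model's predictions, which is the capability the bullets above describe as necessary for planning and optimization.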

Learning from Cognitive Science

  • Mentions inspiration drawn from cognitive science regarding how humans plan actions based on predicted consequences.
  • Critiques generative models for learning representations less effectively than non-generative architectures such as the Joint Embedding Predictive Architecture (JEPA).
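
The contrast between predicting in input space and predicting in representation space can be illustrated with a toy example. Everything here is an assumption made for illustration: the observation is a `(position, noise)` pair, and the hand-written `encoder` and `predictor` stand in for networks that a real JEPA would learn.

```python
def encoder(observation):
    """Hypothetical encoder: keeps the predictable part (position), drops the noise."""
    position, _noise = observation
    return position

def predictor(embedding, action):
    """Hypothetical predictor that works purely in embedding space."""
    return embedding + action

def input_space_loss(x, action, y):
    """Generative-style loss: every detail of y must be predicted, noise included."""
    predicted = (x[0] + action, x[1])  # best available guess: noise stays the same
    return sum((p - t) ** 2 for p, t in zip(predicted, y))

def embedding_space_loss(x, action, y):
    """JEPA-style loss: compare the predicted embedding with the embedding of y."""
    return (predictor(encoder(x), action) - encoder(y)) ** 2

x = (1.0, 0.25)    # current observation: position and an unpredictable detail
y = (3.0, -0.75)   # next observation after the action
print(input_space_loss(x, 2.0, y), embedding_space_loss(x, 2.0, y))
```

In this sketch the input-space loss is penalized for failing to predict a detail that cannot be predicted, while the embedding-space loss is zero because the encoder has discarded that detail.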

Future Directions and Applications

  • Envisions practical applications in industries requiring complex system modeling beyond traditional methods.
  • Identifies potential uses in various sectors such as manufacturing and healthcare where predicting system dynamics is crucial.

The Future of Intelligent Systems and AI

Goals and Aspirations in AI Development

  • Linus Torvalds humorously stated his goal for Linux as "total world domination"; the speaker borrows the line to convey the ambition behind developing intelligent systems that could be adopted across many sectors.
  • Current designs focus on creating systems capable of thinking, with language interfaces being an additional layer rather than a primary function.

Methodology and Progress in AI Training

  • Within the next year, a general methodology for training hierarchical models across diverse modalities is expected to be established, particularly in video processing.
  • Demonstrations will include training world models for applications in robotics, industrial process control, and healthcare within 12 to 18 months.

Industry Realizations and Paradigm Shifts

  • There is a growing recognition within industries that traditional LLMs (Large Language Models) are inadequate for real-world data applications.
  • A paradigm shift is underway as industries seek effective solutions for robotics and other practical applications by early 2027.

The Role of Tapestry in AI Sovereignty

Conceptualizing Tapestry

  • Tapestry aims to create an open foundation model that can be fine-tuned by users globally, addressing cultural and linguistic diversity.
  • As AI assistants become more prevalent, there’s concern about their effectiveness across different languages and cultures not well represented by existing models.

Addressing Global Needs through Open Platforms

  • Many countries desire sovereignty over their AI technologies to prevent external biases from influencing local populations.
  • Tapestry proposes a federated learning approach where contributors maintain control over their data while collaborating on global model development.
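
The summary does not say which federated algorithm Tapestry would use. As one common possibility, the sketch below shows federated averaging on a one-parameter model: each contributor trains on its own data and shares only a weight and a sample count. The `local_update` and `federated_round` names and the toy datasets are assumptions for illustration.

```python
def local_update(w, data, lr=0.1, steps=20):
    """Gradient descent on mean squared error for y ~ w * x, using local data only."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def federated_round(w_global, contributors):
    """Average locally trained weights, weighted by each contributor's sample count."""
    total = sum(len(data) for data in contributors)
    return sum(local_update(w_global, data) * len(data) for data in contributors) / total

# Two private datasets (both follow y = 2x); raw data never leaves its owner.
contributors = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(1.0, 2.0), (3.0, 6.0), (0.5, 1.0)],
]

w = 0.0
for _ in range(5):
    w = federated_round(w, contributors)
print(w)
```

Only the scalar weights cross the boundary between contributors and the coordinator, which is the property that lets each participant keep control of its data.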

The Evolution of Open Source vs. Proprietary Models

Historical Context of Open Source Success

  • The transition from proprietary systems like Solaris to open-source platforms like Linux illustrates how open models can dominate technology landscapes.
  • Current companies like OpenAI may face similar fates as past tech giants if they do not adapt to the demand for open-source solutions.

Limitations of Current LLM Technologies

  • LLM capabilities are limited; they excel at problem-solving but lack creativity necessary for higher-level tasks such as software architecture or conceptualization.

Understanding the Constraints of LLM Capabilities

Problem-Solving vs. Creative Thinking

  • LLMs' strengths lie in structured domains such as mathematics, where language serves as the substrate for reasoning; however, they struggle with the creative aspects of these fields.

Predictive Limitations of LLM Systems

  • For an agentic system to solve new problems effectively, it must predict outcomes based on its actions—something current LLM architectures cannot reliably achieve.

Diverging Perspectives on AI Safety

Concerns Over Reliability

  • There is skepticism regarding whether current LLM architectures can ever be made reliable due to their intrinsic inability to predict consequences accurately.

Alternative Approaches Towards Safer AI

  • Proposed objective-driven architectures would allow systems to predict outcomes based on defined objectives while incorporating safety constraints into their operational framework.
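
One way to picture an objective-driven system with safety constraints is as constrained action selection: predict the outcome of each candidate action, discard those that violate a guardrail, then minimize the task objective over what remains. The sketch below is a toy under stated assumptions; the speed example and all function names are hypothetical and not taken from the conversation.

```python
def predict(state, action):
    """Hypothetical world model: the action changes a 1-D speed."""
    return state + action

def task_cost(state, target):
    """Objective to minimize: distance from the target speed."""
    return abs(target - state)

def violates_guardrail(state, speed_limit):
    """Safety constraint: predicted speed must not exceed the limit."""
    return state > speed_limit

def choose_action(state, target, speed_limit, actions=(-2, -1, 0, 1, 2)):
    """Pick the action with the lowest task cost among those predicted to be safe."""
    safe = [a for a in actions
            if not violates_guardrail(predict(state, a), speed_limit)]
    if not safe:
        return None  # no action satisfies the guardrail
    return min(safe, key=lambda a: task_cost(predict(state, a), target))

print(choose_action(state=4, target=10, speed_limit=5))
```

The guardrail is applied to predicted outcomes before optimization, so an unsafe action is excluded from consideration instead of being discouraged after the fact.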

Understanding the Limitations of LLMs in Healthcare

The Need for Advanced Models

  • Discussion of why LLMs may not suffice in healthcare, particularly for chronic diseases. A deeper understanding of patient physiology is necessary to design effective treatment plans.

Challenges with Stem Cell Treatment

  • Inquiry into how to instruct stem cells to differentiate into specific cell types, such as pancreatic beta cells, which are crucial for insulin production in type 1 diabetes patients.

The Role of LLMs vs. Top Doctors

  • Exploration of the potential for LLMs to scale top-tier medical expertise globally while acknowledging that they primarily regurgitate existing knowledge rather than providing innovative solutions.

Practical Experience vs. Knowledge Accumulation

  • Emphasis on the necessity of hands-on experience in medicine beyond theoretical knowledge; practical skills are essential for accurate diagnosis and treatment.

Reflections on Leadership at Meta

Achievements at FAIR

  • Reflection on building a leading research lab (FAIR), contributing significantly to tools like PyTorch that have become industry standards.

Innovation Chain and Organizational Structure

  • Description of the innovation process from blue-sky research through practical application, highlighting where many organizations fail due to poor alignment between research and product development.

Isolation within Meta's Structure

  • Commentary on how FAIR became isolated within Meta, leading to missed opportunities as innovative ideas were not pursued effectively by the broader organization.

The Evolution and Challenges of AI Research

Short-Term Pressures in AI Development

  • Discussion about the increasing short-term pressures faced by organizations like Meta, impacting their ability to foster breakthrough research akin to earlier successful models like FAIR or Google’s approach.

Current Landscape for Researchers

  • Insight into how current researchers face challenges working under short-term priorities, potentially stifling long-term innovation and creativity in AI development.

Personal Journey and Vision Beyond Meta

Transitioning Roles at Facebook/Meta

  • Overview of personal career trajectory from director at FAIR to chief AI scientist, emphasizing a desire to focus more on scientific vision rather than management responsibilities.

Pursuing Self-Supervised Learning

  • Introduction of concepts around self-supervised learning as a pathway toward developing human-like AI systems based on sensory predictions rather than traditional supervised methods.

Future Directions in AI Research

Shifts in Understanding Self-Supervised Learning

  • Reflection on how perspectives on unsupervised versus self-supervised learning have evolved over time, including the shift in focus from video prediction to language processing that followed the success of LLMs.

Understanding Neural Network Training Techniques

Challenges in Backpropagation and Weight Sharing

  • The discussion begins with the concept of not performing backpropagation on one model while sharing weights with another, utilizing an exponential moving average. This method is derived from reinforcement learning intuitions to prevent collapse, though the underlying reasons remain unclear.
  • There are theoretical papers that attempt to explain why this approach might work in specific cases, but they do not provide satisfactory answers. The cost function being minimized does not accurately reflect what is happening during training.
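
The weight-sharing scheme described above can be sketched as two copies of the same parameters: an online copy updated by gradients and a target copy that receives no gradients and only tracks the online copy through an exponential moving average. The parameter values, placeholder gradients, and function names below are assumptions for illustration.

```python
def sgd_step(online, grads, lr=0.1):
    """Gradient step applied to the online weights only."""
    return [w - lr * g for w, g in zip(online, grads)]

def ema_update(target, online, decay=0.99):
    """Target weights drift slowly toward the online weights; no backpropagation."""
    return [decay * t + (1.0 - decay) * w for t, w in zip(target, online)]

online = [1.0, -2.0]
target = list(online)   # target starts as a copy of the online weights
grads = [0.5, -0.5]     # placeholder gradients, not computed from a real loss

online = sgd_step(online, grads)
target = ema_update(target, online)
print(online, target)
```

With a decay close to 1 the target changes much more slowly than the online copy, which is the mechanism the bullets describe as preventing collapse in practice even though the theoretical justification is incomplete.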

Transitioning Away from Traditional Methods

  • Recent developments include explicit regularizers aimed at preventing collapse by maximizing the information content of the encoder's output. This aligns with historical works by Becker and Hinton (1989) and Schmidhuber (1992), as well as contemporary contrastive techniques.
  • A significant challenge remains: measuring or establishing a lower bound for information content. Currently, only upper bounds can be determined, leading researchers to rely on these estimates without certainty.

Innovations in Regularization Techniques

  • The latest technique discussed is SIGReg (Sketched Isotropic Gaussian Regularization), which pushes the distribution of the encoder's output variables toward an isotropic Gaussian, maximizing information content in a different way than previous methods.
  • Variations of SIGReg have been developed, including those producing sparse representations and isotropic representations that are not strictly Gaussian. Collaborative research has led to promising results in small-scale world models using these techniques.
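
The general idea of pushing encoder outputs toward an isotropic Gaussian can be illustrated with a much simpler stand-in. The sketch below is not SIGReg itself: it projects embeddings onto random directions and penalizes each one-dimensional projection whose mean and variance differ from those of a standard normal. The function names and the moment-matching penalty are assumptions made for illustration.

```python
import math
import random

def random_unit_vector(dim, rng):
    """Draw a random direction on the unit sphere."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def projection_penalty(embeddings, num_directions=16, seed=0):
    """Average penalty over random 1-D projections for deviating from N(0, 1)."""
    rng = random.Random(seed)
    dim = len(embeddings[0])
    total = 0.0
    for _ in range(num_directions):
        u = random_unit_vector(dim, rng)
        proj = [sum(e_i * u_i for e_i, u_i in zip(e, u)) for e in embeddings]
        mean = sum(proj) / len(proj)
        var = sum((p - mean) ** 2 for p in proj) / len(proj)
        total += mean ** 2 + (var - 1.0) ** 2
    return total / num_directions

collapsed = [[0.0] * 4 for _ in range(10)]   # every input maps to the same point
rng = random.Random(1)
spread = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(2000)]
print(projection_penalty(collapsed), projection_penalty(spread))
```

A collapsed encoder, which maps every input to the same vector, receives a large penalty, while embeddings that already look like isotropic Gaussian samples receive a penalty near zero. This is why a regularizer of this kind discourages collapse.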

Conclusion and Future Directions

  • The conversation wraps up with gratitude expressed towards Yann for his insights into neural network training methodologies. Jacob Effron emphasizes the importance of audience support for continuing such discussions on AI advancements.
  • Effron encourages listeners to subscribe and share the podcast, as it helps facilitate future episodes featuring expert guests discussing AI's implications for business and society.

Video description

Yann LeCun, Turing Award winner and former Chief AI Scientist at Meta, joins Jacob Effron. The conversation centers on Yann's contrarian thesis that LLMs are a dead-end on the path to human-level intelligence, despite being useful products, because they can't predict the consequences of their actions, can't plan, and fundamentally can't model the messy, high-dimensional real world. He unpacks his alternative architecture, JEPA (Joint Embedding Predictive Architecture), which learns abstract representations rather than generating pixel-level predictions, and explains why this approach is essential for robotics, industrial applications, and any system that needs to operate beyond the substrate of language.

Yann also reveals the real story behind his departure from Meta (he had zero technical influence on Llama, contrary to public narrative), the genesis of his Tapestry project for sovereign open-source AI, why he believes LLMs are intrinsically unsafe, where he diverges from his fellow Turing laureates Hinton and Bengio, and why he predicts the industry will recognize the paradigm shift by early 2027. Throughout, he offers candid reflections on the tension between research and product at major labs, and why he intentionally headquartered AMI Labs in Paris with zero Silicon Valley VC money.

0:00 Intro
01:45 Why LLMs Aren't the Path to Intelligence
07:51 AMI and World Models
12:07 The JEPA Architecture Explained
15:55 Problems with Robotics Models Today
20:37 Silicon Valley Herd Behavior
28:18 Tapestry: Sovereign AI for the Rest of the World
35:49 OpenAI Is the Next Sun Microsystems
40:51 Why Yann's Views Diverged from Hinton & Bengio
44:32 LLMs Are Intrinsically Unsafe
58:00 Why Yann Left Meta
1:00:26 Reflections on FAIR
1:12:11 Advice for PhD Students

LeWorldModel Paper: https://arxiv.org/abs/2603.19312

With your host: @jacobeffron - Partner at Redpoint