"We Watched a Brain Emerge..." The AI That Might Kill Transformers (w/ Pathway's Zuzanna Stamirowska)

The Path to AGI: Insights from Zuzanna Stamirowska

Introduction to the Podcast

  • Hosts Corey Noles and Grant Harvey introduce the podcast, expressing excitement about their guest.
  • They welcome Zuzanna Stamirowska, CEO of Pathway, who challenges the transformer-based AI paradigm.

Zuzanna's Background and Journey

  • Zuzanna shares her transition from studying at a French school for politicians to exploring complexity science and AI.
  • She references the movie "A Beautiful Mind," highlighting its emotional impact on her father and her own inspiration from it.

Academic Influences

  • At Stockholm School of Economics, she took a game theory course that sparked her passion for complex systems.
  • Despite coming from a different background than her peers, she excelled in understanding game results intuitively without heavy math.

Meeting John Nash

  • Zuzanna recounts meeting John Nash at a conference in Lisbon, which was a significant moment in her academic journey.

Specialization in Game Theory

  • She specialized in game theory on graphs during her master's program, leading to an interest in complexity science.
  • Discusses how small particles interacting can lead to larger societal phenomena or intelligence.

Complexity Science Exploration

  • Emphasizes the challenge of applying game theory within infinitely changing structures and its implications for understanding complex systems.
  • Reflects on how mathematical abstractions simplify complex ideas into more manageable concepts.

Understanding the Role of Time in AI Development

The Importance of Time in Evolving Systems

  • Discussion on how global phenomena arise from local interactions, emphasizing the necessity of time for systems to evolve and emerge.
  • Mention of a team member, Adrian Kosowski, highlighting his impressive background as a quantum physicist and theoretical computer scientist who joined the project.

Challenges with Current AI Models

  • Identification of existing models built on transformer architecture, which lacks an inherent understanding of time and memory.
  • Introduction to Pathway's goal: creating a post-transformer model that addresses the memory deficit in current AI systems.

Memory and Problem Solving

  • Explanation that memory is crucial for problem-solving as it allows for coherence and consequence recognition over time.
  • Reference to METR, a lab that benchmarks LLM capabilities against human task performance, indicating limitations in current models.

Limitations of Current Language Models

  • Critique that current LLMs operate without true memory; they are trained once on vast datasets but do not retain information beyond their training phase.
  • Clarification that while LLMs can generate new outputs based on extensive data exposure, they lack internalized knowledge or evolving memory.

The Concept of Memory in AI

  • Distinction made between having a static library of knowledge versus possessing contextualized evolving memory that adapts to new situations.
  • Comparison of traditional models' operation to leaving sticky notes or tattoos as reminders, underscoring the difference between external prompts and genuine internalization.

Future Directions Beyond Transformers

  • Inquiry into whether transformer-based models have reached a plateau regarding consistent task performance over time.
  • Discussion about reasoning as an alternative pathway forward rather than solely relying on transformers, suggesting potential limits due to inherent memory constraints.

Designing AI: The Journey of Innovation

The Evolution of Orbit Design

  • Discussion on the initial cumbersome designs for orbits, which were necessary to explain observations in a more understandable way.
  • Emphasis on the importance of perspective shifts in understanding complex systems, leading to clearer and more elegant solutions.

Impact of Transformers on AI

  • Recognition of Transformers as a groundbreaking innovation that has significantly influenced both technology and market dynamics.
  • Noted that only 0.7% of GDP has been invested in AI technological advancements so far, indicating we are still in the early stages compared to past innovations like telecom.

Naming Conventions and Inspirations

  • Introduction to the name "Baby Dragon Hatchling" (BDH), with references to Terry Pratchett's "The Color of Magic" as an inspiration for the term 'dragon hatchling.'
  • Explanation that dragons appear more frequently with thought, paralleling reasoning models used in AI development.

Understanding BDH's Conceptual Framework

  • Clarification that while there is an architecture presented publicly, it represents just a part of the overall model.
  • Insight into why three-letter acronyms are favored in AI naming conventions; simplicity and ease of pronunciation play significant roles.

The Mythical Nature of Dragons in AI Development

  • Discussion about how dragons symbolize powerful yet controllable entities within the realm of continual learning in AI.
  • Description of BDH functioning similarly to a brain made from silicon, incorporating principles akin to Hebbian learning for adaptation over time.

Understanding Neural Networks and Their Efficiency

Basic Structure of Neurons

  • The brain consists of neurons (dots) connected by synapses, forming a complex network.
  • This model simplifies the intricate biological processes involved in neural communication, focusing on basic structures rather than chemical reactions.

Efficiency of Brain Functionality

  • The brain's structure is designed for efficiency due to spatial limitations and the need for effective learning capabilities.
  • Lifelong learning is facilitated by this efficient design, allowing for extensive contextual understanding.

Hardware Limitations and Technological Shifts

  • Current technological advancements must work within existing hardware constraints; significant breakthroughs often occur at inflection points where various factors align.
  • Research into transformer models aims to identify what elements are missing to better mimic brain functionality.

Local Interactions in Neural Models

  • The architecture being developed emphasizes local interactions among small neurons that activate based on incoming information (tokens). Only relevant neurons light up when new data is received.
  • This principle mirrors how real neurons operate: they only fire together if they are connected and interested in the same information.
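This sparse, local activation can be sketched as a toy model. This is purely illustrative and not Pathway's actual mechanism: the preference vectors, threshold, and dimensions are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

n_neurons, dim = 32, 8
# Each neuron has a preference vector: the kind of information it "cares" about.
prefs = rng.normal(size=(n_neurons, dim))

def sparse_activate(token_vec, threshold=1.5):
    """Return indices of neurons whose preference aligns with the input.

    Only these neurons 'light up'; the rest stay silent, so each token
    engages a small, local subset of the network.
    """
    scores = prefs @ token_vec
    return np.flatnonzero(scores > threshold)

token = rng.normal(size=dim)
active = sparse_activate(token)
print(f"{len(active)} of {n_neurons} neurons fired: {active}")
```

Raising the threshold makes activation sparser; in a real model the "preferences" are learned rather than fixed at random.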

Emergence of Complex Structures

  • A notable moment occurred when researchers observed a spontaneous emergence of a brain-like structure from simple rules governing local interactions among neurons, akin to social networks' dynamics.
  • This phenomenon illustrates how complexity can arise from fundamental principles, leading to organized structures without direct intervention or setup.

Memory Formation through Connections

  • When two interested neurons connect, their relationship strengthens over time, which is analogous to memory formation—connections that are frequently used become stronger while unused ones fade away.
  • This positive reinforcement mechanism underlines the efficiency of neural connections and their role in computational processes similar to those found in biological brains.

Understanding Scale-Free Graph Structures in Neural Networks

The Nature of Scale-Free Graphs

  • The discussion begins with the engineering perspective on scaling and distribution, emphasizing a scale-free graph structure that allows for predictable behavior beyond current data scales.
  • Unlike transformers, which lack extensive study regarding their emergent properties, the scale-free nature of this model provides a scientific basis for understanding its performance at larger scales.
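A scale-free graph is one whose degree distribution is heavy-tailed: a few hub nodes accumulate far more connections than the average node. A classic way such structure emerges is preferential attachment, sketched below (a generic illustration of scale-free growth, not the mechanism BDH uses):

```python
import random

random.seed(1)

def preferential_attachment(n_nodes, m=2):
    """Grow a graph where each new node links to m existing nodes.

    Every edge endpoint is appended to `stubs`, so well-connected nodes
    appear there more often and are more likely to attract new links,
    which is what produces the heavy-tailed, scale-free degree profile.
    """
    edges, stubs = [], list(range(m))
    for new in range(m, n_nodes):
        targets = set()
        while len(targets) < m:
            targets.add(random.choice(stubs))
        for t in targets:
            edges.append((new, t))
            stubs += [new, t]
    return edges

edges = preferential_attachment(2000)
deg = {}
for u, v in edges:
    deg[u] = deg.get(u, 0) + 1
    deg[v] = deg.get(v, 0) + 1

avg = sum(deg.values()) / len(deg)
print(f"average degree ~ {avg:.1f}, max degree = {max(deg.values())}")
```

The max degree comes out many times the average, the hub-dominated signature that distinguishes scale-free graphs from uniformly random ones, and the property that makes behavior at larger scales statistically predictable.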

Interpretability and Neural Activity

  • The model exhibits a level of interpretability; researchers can observe neural activity related to specific stimuli, indicating how neurons respond when they "care" about something.
  • An analogy is made comparing traditional methods (like MRI scans) to having a "CCTV inside the brain," allowing direct observation of neuron firing patterns associated with concepts like currency.

Learning Dynamics and Memory

  • The system demonstrates compression of information during learning, where certain concepts may activate multiple neurons but not always clearly represent large ideas.
  • Observations reveal that neurons exhibit decreased activity when exposed to repetitive stimuli, akin to how humans become less responsive to familiar experiences over time.

Long-Term Memory and Connection Fading

  • There is an exploration into whether unused connections within the model fade over time, similar to human memory dynamics. This raises questions about transferring knowledge into long-term memory.
  • Unlike databases designed for permanent storage, this model aims for efficient reasoning by maintaining relevant and compact structures rather than simply accumulating vast amounts of data.

Scaling Challenges and Future Directions

  • Current models have proven effective at 1 billion parameters (roughly GPT-2 scale), prompting inquiries about scaling up to 100 billion parameters while questioning the necessity of such growth.
  • Emphasis is placed on improving learning efficiency rather than merely increasing parameter counts; faster problem-solving capabilities are prioritized over brute-force scaling strategies.

Innovation Through Reasoning

  • True innovation stems from recognizing possibilities beyond existing frameworks. Effective reasoning involves identifying gaps in knowledge or potential developments rather than just processing known information.
  • Questions arise regarding cognitive limits in complex problem-solving; however, it is suggested that these limits do not conform strictly to conventional understandings of parameter capacity.

Understanding Neural Networks and Their Efficiency

The Structure of Neural Networks

  • Current models cap the number of neurons, suggesting that while growth is possible, reasoning power in transformers does not solely depend on size.
  • The brain's structure provides significant computational power due to its vast number of synaptic connections, estimated in the trillions.
  • This extensive network allows for efficient memory storage and processing, akin to having infinite context within a limited physical space.

Memory Efficiency in Neural Models

  • Human brains are compact yet capable of storing immense information efficiently without needing extensive lookups or additional compute resources.
  • Memory is kept close to the core processing unit, enhancing efficiency by minimizing energy expenditure during tasks.
  • Only relevant neural pathways are activated for specific tasks, rather than engaging the entire model at once.

Model Integration and Specialization

  • Unlike human brains that cannot easily merge two distinct cognitive processes, current models can be combined effectively.
  • Research shows that separately trained models can be integrated seamlessly to produce coherent outputs across different languages or domains.
  • This integration resembles assembling Lego blocks, allowing specialized knowledge from different fields (e.g., finance and legal) to create a more powerful unified model.

Real-world Applications and Collaborations

  • Early adopters like NATO and Formula 1 are exploring these advanced models but have not yet deployed them fully into operational environments.
  • These organizations utilize existing technology layers to prepare data efficiently for real-time applications with low latency requirements.

Challenges and Future Directions

  • Implementing live intelligence requires careful consideration of data connectivity and system readiness before deployment in critical scenarios.
  • The potential applications span various fields; however, challenges remain regarding how best to leverage this technology effectively.

Understanding Complex Systems and AGI Development

The Interconnectedness of Systems

  • Discussion begins with the analogy of boys and girls liking buses and ships, highlighting the complexity of interconnected systems in the world.
  • Emphasis on the need for technologies that can help predict patterns within chaotic systems, aiming for better control over these dynamics.

Roadmap to AGI

  • Announcement of a partnership with Nvidia and AWS, indicating readiness to make advancements available to customers via AWS infrastructure.
  • Plans for production are set for next year, focusing on a faster path toward Artificial General Intelligence (AGI).

Reasoning as Core Intelligence Function

  • Shift in focus towards reasoning as a primary function of intelligence rather than just language model applications like chatbots or summarization.
  • Acknowledgment from various labs that reasoning is essential; the goal is to develop an innovator capable of solving complex problems beyond mere recomposition.

Safety Measures in AI Development

  • Importance placed on understanding how models work scientifically, mapping interactions to ensure safety through known laws governing behavior.
  • Internal discussions about establishing provable risk levels for AI behavior, ensuring models do not act unpredictably or "hallucinate."

Controlling AI Objectives

  • Comparison made between hiring practices and AI development; expectations that AI should perform reliably without causing chaos.
  • Recognition that while we can understand how trained individuals behave, controlling AI objectives remains a challenge needing resolution as technology advances.

Learning Management Strategies

  • Discussion on rollback capabilities to checkpoints as a method for managing unwanted learning outcomes in AI systems.
  • Analogy drawn between information spread in epidemics and model training; small irrelevant data cascades may not impact overall model performance significantly.

AI and the Future of Civilization

The Concept of Reversibility in AI Models

  • Discusses the ability to reverse small changes in AI models, emphasizing that larger changes may lead to irreversible states.
  • Mentions the concept of "quarantining" data within models to manage unwanted information effectively.

Generalization Capabilities in AI

  • Expresses excitement about unlocking true generalization capabilities with new architectures beyond current transformer models.
  • Highlights the importance of achieving innovator-level generalization for future advancements.

Space Exploration and AI Integration

  • Talks about the potential for deploying TPUs (Tensor Processing Units) in space, indicating a significant technological shift.
  • Suggests that scientific breakthroughs, particularly in energy, are necessary for successful space travel and exploration.

The Role of AI in Advancing Civilization

  • Compares the transformative impact of AI on civilization to the advent of agriculture, suggesting it could lead to "civilization 2.0."
  • Encourages long-term thinking about humanity's trajectory over centuries rather than just focusing on immediate advancements.

Rapid Evolution of Technology

  • Reflects on how quickly technology evolves, contrasting it with historical infrastructure projects that took much longer to develop.
  • Shares personal experiences predicting trends in AI development and acknowledges how rapidly perceptions can change within months.

Closing Thoughts and Future Engagement

  • Concludes with gratitude towards Zuzanna for her insights and encourages viewers to explore more about Pathway through their website.
  • Invites viewers to follow research papers and upcoming blogs for deeper understanding beyond traditional academic formats.

Video description

Imagine an AI that doesn’t just output answers — it remembers, adapts, and reasons over time like a living system. In this episode of The Neuron, Corey Noles and Grant Harvey sit down with Zuzanna Stamirowska, CEO & Cofounder of Pathway, to break down what Pathway is building: the world’s first post-Transformer frontier model, called BDH — the Dragon Hatchling architecture. Zuzanna explains why current language models are stuck in a “Groundhog Day” loop — waking up with no memory — and how Pathway’s architecture introduces true temporal reasoning and continual learning.

We explore:

  • Why Transformers lack real memory and time awareness
  • How BDH uses brain-like neurons, synapses, and emergent structure
  • How models can “get bored,” adapt, and strengthen connections
  • Why Pathway sees reasoning — not language — as the core of intelligence
  • How BDH enables infinite context, live learning, and interpretability
  • Why gluing two trained models together actually works in BDH
  • The path to AGI through generalization, not scaling
  • Real-world early adopters (Formula 1, NATO, French Postal Service)
  • Safety, reversibility, checkpointing, and building predictable behavior
  • Why this architecture could power the next era of scientific innovation

From brain-inspired message passing to emergent neural structures that literally appear during training, this is one of the most ambitious rethinks of AI architecture since Transformers themselves. If you want a window into what comes after LLMs, this interview is essential.
Resources:
  • 📄 Read the BDH research paper: https://arxiv.org/abs/2509.26507
  • 🌐 Learn more about Pathway: https://pathway.com/

Subscribe to The Neuron newsletter for more interviews with the leaders shaping the future of work and AI: https://theneuron.ai

➤ CHAPTERS
0:00 - Intro to Zuzanna Stamirowska
01:12 - From Game Theory to Complexity Science
05:09 - How Intelligence Emerges from Simple Interactions
06:39 - The Transformer Breakthrough — and Its Limits
08:23 - AI’s Groundhog Day Problem
13:24 - Why Pathway Calls It Baby Dragon Hatchling
16:52 - Continual Learning and the Dragon Metaphor
17:20 - Learning Like a Brain: Neurons and Connections
21:27 - When a Brain Emerges Inside the Model
22:54 - Memory as Strengthened Connections
24:58 - Seeing Neural Activity Inside the Model
26:46 - Memory, Surprise, and Forgetting
27:47 - Scaling Without Brute Force
32:44 - Gluing Models Together Like Lego
34:16 - Real-World Use Cases: From Formula 1 to NATO
36:38 - Dragon Nests & Production Roadmap
38:18 - Reasoning as the Core of Intelligence
39:45 - Safety and Controllable Risk
43:13 - Unlocking True Generalization
45:54 - Long-Term Vision for AI and Humanity

Hosted by: Corey Noles and Grant Harvey
Guest: Zuzanna Stamirowska, CEO and Cofounder of Pathway AI
Published by: Manique Santos
Edited by: Adrian Vallinan