Don't Build Agents, Build Skills Instead – Barry Zhang & Mahesh Murag, Anthropic

What Are Agent Skills and Why Are They Important?

Introduction to Agents and Skills

  • The speakers, Barry and Mahesh, describe how agents have evolved since their previous talk: many people now use agents daily but still run into gaps in expertise.
  • They explain the shift from building agents to focusing on skills due to advancements like MCP becoming a standard for agent connectivity.

New Paradigm for Agents

  • A new paradigm is emerging where there is a tighter coupling between models and runtime environments; they argue that code serves as a universal interface to the digital world.
  • They realized that, while customization still matters, the underlying agent structure can be more universal than previously thought.

Domain Expertise vs. General Intelligence

  • The speakers emphasize the importance of domain expertise by comparing an intelligent individual (Mahesh) with an experienced professional (Barry), suggesting that expertise leads to better outcomes.
  • Current agents are likened to Mahesh: intelligent, but lacking the contextual knowledge they need and the ability to learn over time.

Concept of Agent Skills

  • Agent skills are defined as organized collections of files containing procedural knowledge designed for easy use by both humans and agents.
  • This simplicity allows anyone with a computer to create and utilize these skills, integrating seamlessly with existing tools like Git or Google Drive.
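
As a rough illustration of "a skill is an organized collection of files", the sketch below writes one to disk. The summary does not spell out a layout, so the `SKILL.md` file name, the `name` and `description` front-matter fields, the `scripts/` subfolder, and the `create_skill` helper are all assumptions made for this example.

```python
from pathlib import Path


def create_skill(root: Path, name: str, description: str, instructions: str) -> Path:
    """Write a skill folder: one instruction file plus an empty scripts/ directory."""
    skill_dir = root / name
    # Assumed layout: <root>/<name>/SKILL.md and <root>/<name>/scripts/
    (skill_dir / "scripts").mkdir(parents=True, exist_ok=True)
    front_matter = f"---\nname: {name}\ndescription: {description}\n---\n\n"
    (skill_dir / "SKILL.md").write_text(
        front_matter + instructions.strip() + "\n", encoding="utf-8"
    )
    return skill_dir


def list_skill_files(skill_dir: Path) -> list[str]:
    """Return every path inside the skill, relative to the skill folder."""
    return sorted(p.relative_to(skill_dir).as_posix() for p in skill_dir.rglob("*"))
```

Because the result is nothing more than a directory of plain files, it can be committed to Git or dropped into a shared drive like any other folder.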

Advantages of Using Code in Skills

  • Traditional tools often have ambiguous instructions; however, code is self-documenting, modifiable, and can reside in file systems until needed.
  • An example illustrates how repetitive tasks can be streamlined by saving scripts within skills for future use, enhancing consistency and efficiency.
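
The idea of saving a script inside a skill so it can be reused later might look something like the following. This is a minimal sketch: the `scripts/` folder and the `save_script` and `run_script` helpers are hypothetical names, not part of any API described in the talk.

```python
import subprocess
import sys
from pathlib import Path


def save_script(skill_dir: Path, filename: str, source: str) -> Path:
    """Store a script inside the skill so it does not have to be rewritten next time."""
    scripts = skill_dir / "scripts"  # assumed location for reusable scripts
    scripts.mkdir(parents=True, exist_ok=True)
    path = scripts / filename
    path.write_text(source, encoding="utf-8")
    return path


def run_script(path: Path, *args: str) -> str:
    """Run a saved script with the current Python interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, str(path), *args],
        capture_output=True,
        text=True,
        check=True,  # raise if the script exits with a non-zero status
    )
    return result.stdout
```

Running the same saved file each time, rather than regenerating the code, is what gives the consistency the bullet above describes.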

The Structure and Growth of Skills

Organization of Skills

  • Skills are progressively disclosed at runtime; only essential metadata is shown initially while detailed instructions remain accessible when needed.
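
Progressive disclosure can be sketched as two steps: build a small index from each skill's metadata, then read the full instructions only for the skill that is actually needed. The `SKILL.md` file name, the front-matter fields, and every function name below are assumptions for illustration; the front-matter parser is deliberately simplistic and only handles `key: value` lines.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SkillMeta:
    name: str
    description: str
    path: Path  # where the full instructions live


def parse_front_matter(text: str) -> dict[str, str]:
    """Read simple 'key: value' lines between the opening and closing '---'."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def index_skills(root: Path) -> list[SkillMeta]:
    """Collect only name and description for every skill under root."""
    index = []
    for skill_file in sorted(root.glob("*/SKILL.md")):
        meta = parse_front_matter(skill_file.read_text(encoding="utf-8"))
        name = meta.get("name", skill_file.parent.name)
        index.append(SkillMeta(name, meta.get("description", ""), skill_file))
    return index


def metadata_summary(index: list[SkillMeta]) -> str:
    """The short listing that would be shown up front, one line per skill."""
    return "\n".join(f"- {m.name}: {m.description}" for m in index)


def load_instructions(meta: SkillMeta) -> str:
    """Read the full body of one skill, without its front matter, on demand."""
    text = meta.path.read_text(encoding="utf-8")
    if text.startswith("---"):
        parts = text.split("---", 2)
        if len(parts) == 3:
            return parts[2].strip()
    return text.strip()
```

The point of the split is what reaches the model's context: `metadata_summary` stays at one line per skill, while `load_instructions` is called only when a skill is chosen.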

Types of Skills Developed

  • Since launching five weeks ago, thousands of skills have emerged across various categories: foundational skills, third-party partner skills, and enterprise-specific skills.

Foundational Skills

  • Foundational skills provide new general or domain-specific capabilities; examples include document creation/editing abilities added to Claude's functionality.

Third-party Partner Contributions

  • Partners have developed skills tailored to their own software products; for example, Browserbase created a browser automation skill that enhances Claude's web navigation capabilities.

Enterprise-Specific Applications

  • Large enterprises are leveraging these skills as educational tools within organizations; discussions with Fortune 100 companies reveal significant interest in this approach.

Understanding the Evolution of Skills in Agent Technology

The Role of Skills in Developer Productivity

  • The speakers discuss how organizations use bespoke internal software to enhance developer productivity, focusing on large teams that serve thousands of developers.
  • They emphasize that skills are created and consumed by many different people within an organization, and that anyone can create a skill to enhance an agent's capabilities.

Trends in Skill Complexity

  • Skills are becoming increasingly complex, evolving from simple markdown files into comprehensive packages that include software and executables.
  • Basic skills may take minutes to develop, whereas more advanced skills could require weeks or months to build and maintain.

Integration with MCP Servers

  • Developers are building skills that orchestrate workflows across multiple MCP tools, enhancing functionality with external data connectivity.
  • Non-technical people (for example, finance and legal professionals) are also creating skills, which supports the idea that coding is not a prerequisite for skill development.

Emerging Architecture of General Agents

  • The emerging architecture pairs an agent loop, which manages internal context and token flow, with a runtime environment that provides file system access.
  • Agents can be equipped with libraries of hundreds or thousands of skills, with only those relevant to the specific task pulled in at runtime.
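
One way to picture this architecture is a loop that starts with only skill descriptions in context and appends a skill's full instructions when the model asks for them. The sketch below is a toy under stated assumptions: the `LOAD <name>` reply convention, the in-memory `SkillLibrary`, and the `model` callable are all hypothetical stand-ins, and a real agent would call an actual model and read skills from the file system.

```python
from typing import Callable

# name -> (short description, full instructions); a stand-in for skills on disk
SkillLibrary = dict[str, tuple[str, str]]


def run_agent(
    task: str,
    skills: SkillLibrary,
    model: Callable[[list[str]], str],
    max_turns: int = 5,
) -> str:
    """Run a minimal agent loop that loads skill instructions only on request."""
    # Initial context: the task plus one line of metadata per skill.
    context = [f"Task: {task}", "Available skills:"]
    context += [f"- {name}: {desc}" for name, (desc, _) in skills.items()]
    for _ in range(max_turns):
        action = model(context)
        if action.startswith("LOAD "):  # hypothetical convention for requesting a skill
            name = action[len("LOAD "):].strip()
            if name in skills:
                context.append(f"[skill {name}]\n{skills[name][1]}")
            else:
                context.append(f"[error] unknown skill: {name}")
        else:
            return action  # anything else is treated as the final answer
    return "stopped: turn limit reached"
```

Under this design the cost of carrying a large library is one short line per skill, and the detailed instructions for unused skills never enter the context.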

Future Directions for Skills Development

  • Anthropic is deploying Claude into new verticals such as financial services and life sciences through tailored MCP servers and skill sets.
  • The speakers want to treat skills like software, exploring testing methodologies, versioning practices, and quality measurement tools as skills evolve over time.
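
Treating a skill like software could start with something as small as an automated check that runs before the skill is shared. The sketch below assumes a `SKILL.md` file with `key: value` front matter; the `version` field, the MAJOR.MINOR.PATCH rule, and the `validate_skill` helper are hypothetical and are not described in the talk.

```python
import re
from pathlib import Path

SEMVER = re.compile(r"^\d+\.\d+\.\d+$")
REQUIRED_FIELDS = ("name", "description", "version")  # "version" is a hypothetical field


def read_front_matter(text: str) -> dict[str, str]:
    """Read simple 'key: value' lines between the opening and closing '---'."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def validate_skill(skill_dir: Path) -> list[str]:
    """Return a list of problems found in a skill folder; an empty list means it passed."""
    skill_file = skill_dir / "SKILL.md"
    if not skill_file.is_file():
        return ["missing SKILL.md"]
    meta = read_front_matter(skill_file.read_text(encoding="utf-8"))
    problems = [f"missing field: {field}" for field in REQUIRED_FIELDS if not meta.get(field)]
    version = meta.get("version", "")
    if version and not SEMVER.match(version):
        problems.append(f"version is not MAJOR.MINOR.PATCH: {version}")
    return problems
```

A check like this fits naturally into the same review or CI step a team already uses for code, which is one reading of what "treating skills like software" implies.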

The Vision for Knowledge Sharing

  • Sharing and distributing skills within an organization can create value as a collective knowledge base evolves.
  • Agents improve through interaction and feedback, adapting to team-specific needs over time.

Understanding the Evolution of Skills in AI

The Growing Ecosystem of Skills

  • The development of skills within the AI ecosystem enhances the utility of agents, extending their value beyond individual organizations to the broader community.
  • As skills evolve, they contribute to a more reliable and useful knowledge base, facilitating continuous learning for AI systems.

Standardized Learning and Memory

  • The standardized skill format ensures that anything Claude records can be used efficiently by a future version of Claude, making learning transferable.
  • Skills allow Claude to acquire new capabilities quickly and discard obsolete ones, enhancing cost-effectiveness for frequently changing information.

Continuous Improvement Over Time

  • The goal is for Claude to significantly improve from day one to day thirty of interaction, showcasing its evolving capabilities through skill creation.

Comparing Agent Stack with Computing Models

  • The speakers compare models to processors, whose potential is realized when they are combined with an operating system that orchestrates resources effectively.
  • Agent runtime is likened to an operating system that maximizes processor value by managing data flow and processes around it.

Opening Up Creativity Through Skills

  • While few companies create processors or operating systems, millions develop applications that encode unique expertise; skills aim to democratize this layer.
  • The focus is shifting from rebuilding agents to building skills as a new paradigm for sharing capabilities, inviting collaboration in skill development.

Video description

In the past year, we've seen rapid advancement of model intelligence and convergence on agent scaffolding. But there's still a gap: agents often lack the domain expertise and specialized knowledge needed for real-world work. We think Skills are the solution: a minimal form factor for packaging procedural knowledge that agents can dynamically load. It's a portable, composable approach to giving one agent capabilities across domains. In this talk, we'll share how we built Skills at Anthropic, the network effects we're observing, and where we believe this leads: agents writing their own Skills from experience. Our thesis: equipping agents for real-world work means building reusable expertise.

Barry: https://twitter.com/barry_zyj
Mahesh: https://twitter.com/MaheshMurag