Don't Build Agents, Build Skills Instead – Barry Zhang & Mahesh Murag, Anthropic
What Are Agent Skills and Why Are They Important?
Introduction to Agents and Skills
- The speakers, Barry and Mahesh, revisit the evolution of agents since their previous talk: many people now use agents daily but still notice gaps in expertise.
- They explain the shift from building agents to focusing on skills due to advancements like MCP becoming a standard for agent connectivity.
New Paradigm for Agents
- A new paradigm is emerging where there is a tighter coupling between models and runtime environments; they argue that code serves as a universal interface to the digital world.
- They realized that, while customization is important, the underlying agent structure can be more universal than previously thought.
Domain Expertise vs. General Intelligence
- The speakers emphasize the importance of domain expertise by comparing an intelligent individual (Mahesh) with an experienced professional (Barry), suggesting that expertise leads to better outcomes.
- Current agents are likened to Mahesh: intelligent, but lacking the necessary contextual knowledge and the ability to learn over time.
Concept of Agent Skills
- Agent skills are defined as organized collections of files containing procedural knowledge designed for easy use by both humans and agents.
- This simplicity allows anyone with a computer to create and utilize these skills, integrating seamlessly with existing tools like Git or Google Drive.
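As a rough illustration of "a skill is just a folder of files", the sketch below writes a minimal skill to disk. The file name `SKILL.md`, the `name`/`description` header, the `scripts/` subfolder, and the helper names are assumptions made for this example, not details taken from these notes.

```python
import tempfile
from pathlib import Path


def create_skill(root: Path, name: str, description: str, instructions: str) -> Path:
    """Write a minimal skill folder: one instruction file plus a scripts/ folder."""
    skill_dir = root / name
    # Hypothetical layout: a place for helper scripts next to the instructions.
    (skill_dir / "scripts").mkdir(parents=True, exist_ok=True)
    # Hypothetical header: short metadata on top, procedural knowledge below.
    (skill_dir / "SKILL.md").write_text(
        f"---\nname: {name}\ndescription: {description}\n---\n\n{instructions}\n",
        encoding="utf-8",
    )
    return skill_dir


def list_files(skill_dir: Path) -> list[str]:
    """Return every path inside the skill, relative to the skill folder."""
    return sorted(p.relative_to(skill_dir).as_posix() for p in skill_dir.rglob("*"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        skill = create_skill(
            Path(tmp),
            "expense-report",
            "How our team files expense reports",
            "1. Collect receipts.\n2. Fill in the template.",
        )
        print(list_files(skill))
```

Because the result is ordinary files, it can be committed to Git or dropped into a shared drive like any other folder.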
Advantages of Using Code in Skills
- Traditional tools often have ambiguous instructions; however, code is self-documenting, modifiable, and can reside in file systems until needed.
- An example illustrates how repetitive tasks can be streamlined by saving scripts within skills for future use, enhancing consistency and efficiency.
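The idea of saving a script inside a skill so it can be reused later might look roughly like this. The `scripts/` folder, the example script, and the helper names `save_script` and `run_script` are hypothetical; the notes do not specify how scripts are stored or invoked.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# A small script the agent might otherwise rewrite every time it needs it.
TITLE_CASE_SCRIPT = """import sys
print(sys.argv[1].strip().title())
"""


def save_script(skill_dir: Path, filename: str, source: str) -> Path:
    """Store a script inside the skill so later runs reuse the same code."""
    scripts_dir = skill_dir / "scripts"
    scripts_dir.mkdir(parents=True, exist_ok=True)
    script_path = scripts_dir / filename
    script_path.write_text(source, encoding="utf-8")
    return script_path


def run_script(script_path: Path, *args: str) -> str:
    """Run a saved script with the current Python interpreter and return its output."""
    result = subprocess.run(
        [sys.executable, str(script_path), *args],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        script = save_script(Path(tmp) / "formatting", "title_case.py", TITLE_CASE_SCRIPT)
        print(run_script(script, "  quarterly report "))
```

The script sits on the file system until it is needed, and every later run executes the same code instead of a freshly improvised version.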
The Structure and Growth of Skills
Organization of Skills
- Skills are progressively disclosed at runtime; only essential metadata is shown initially while detailed instructions remain accessible when needed.
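Progressive disclosure can be sketched as two separate reads: a cheap one that returns only the metadata, and a second one that loads the full instructions once a skill is actually needed. The header layout (`name` and `description` between two `---` lines) and all function names here are assumptions for illustration only.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SkillSummary:
    name: str
    description: str
    path: Path


def read_metadata(skill_md: Path) -> SkillSummary:
    """Read only the header lines; stop before the detailed instructions."""
    fields: dict[str, str] = {}
    with skill_md.open(encoding="utf-8") as fh:
        if fh.readline().strip() != "---":
            raise ValueError(f"{skill_md} has no metadata header")
        for line in fh:
            if line.strip() == "---":
                break  # end of header; the body is never read here
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return SkillSummary(fields["name"], fields["description"], skill_md)


def read_instructions(summary: SkillSummary) -> str:
    """Load the full body, only when the skill is actually selected."""
    text = summary.path.read_text(encoding="utf-8")
    _header, _sep, body = text.partition("\n---\n")
    return body.strip()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "SKILL.md"
        path.write_text(
            "---\nname: pdf-forms\ndescription: Fill in PDF forms\n---\n\n"
            "Step 1: open the form.\n",
            encoding="utf-8",
        )
        summary = read_metadata(path)
        print(summary.name, "-", summary.description)
        print(read_instructions(summary))
```

Only the short summaries need to sit in the agent's context up front; the longer instructions cost nothing until a skill is used.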
Types of Skills Developed
- Since launching five weeks ago, thousands of skills have emerged across various categories: foundational skills, third-party partner skills, and enterprise-specific skills.
Foundational Skills
- Foundational skills provide new general or domain-specific capabilities; examples include document creation/editing abilities added to Claude's functionality.
Third-party Partner Contributions
- Partners have developed specific skills tailored for their software products; Browserbase created a skill for browser automation enhancing Claude's web navigation capabilities.
Enterprise-Specific Applications
- Large enterprises are leveraging these skills as educational tools within organizations; discussions with Fortune 100 companies reveal significant interest in this approach.
Understanding the Evolution of Skills in Agent Technology
The Role of Skills in Developer Productivity
- Discussion on how organizations utilize bespoke internal software to enhance developer productivity, focusing on large teams serving thousands of developers.
- Emphasis on the creation and consumption of skills by various individuals within an organization, highlighting that anyone can create these skills to enhance agent capabilities.
Trends in Skill Complexity
- Observation that skills are becoming increasingly complex, evolving from simple markdown files to comprehensive packages including software and executables.
- Noting that while basic skills may take minutes to develop, more advanced skills could require weeks or months for proper construction and maintenance.
Integration with MCP Servers
- Explanation of how developers are building skills that orchestrate workflows across multiple MCP tools, enhancing functionality with external data connectivity.
- Exciting trend where non-technical individuals (e.g., finance, legal professionals) are creating skills, validating the idea that coding is not a prerequisite for skill development.
Emerging Architecture of General Agents
- Introduction to the concept of an agent loop managing internal context and token flow, coupled with a runtime environment for file system access.
- Description of how agents can be equipped with libraries of hundreds or thousands of skills relevant at runtime based on specific tasks.
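One way to picture an agent drawing on a large skill library at runtime is a selection step that compares the task with each skill's short description. In the sketch below, plain keyword overlap stands in for whatever judgment the model itself would apply; the function name `select_skills` and the sample library are hypothetical.

```python
def select_skills(task: str, library: dict[str, str], limit: int = 2) -> list[str]:
    """Pick the skills whose descriptions share the most words with the task.

    `library` maps a skill name to its one-line description. Keyword overlap
    is a deliberately crude stand-in for the model's own relevance judgment.
    """
    task_words = set(task.lower().split())
    scored: list[tuple[int, str]] = []
    for name, description in library.items():
        overlap = len(task_words & set(description.lower().split()))
        if overlap:
            scored.append((overlap, name))
    # Highest overlap first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]


if __name__ == "__main__":
    library = {
        "pdf-forms": "fill in pdf forms",
        "slides": "create and style presentation slides",
        "sql": "query the data warehouse",
    }
    print(select_skills("fill this pdf tax form", library))
```

Whatever the selection mechanism, the library can hold hundreds or thousands of entries while only the few relevant ones are loaded for a given task.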
Future Directions for Skills Development
- Insights into Anthropic's deployment strategy: using Claude across new verticals like financial services and life sciences through tailored MCP servers and skill sets.
- Focus on treating skills similarly to software by exploring testing methodologies, versioning practices, and quality measurement tools as they evolve over time.
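Treating skills like software suggests automated checks similar to a linter or a unit test. The sketch below validates a skill's metadata, including a version string; the required fields, the `version` field itself, and the MAJOR.MINOR.PATCH convention are assumptions chosen for illustration, not a format described in the talk notes.

```python
import re

# Hypothetical convention: versions look like MAJOR.MINOR.PATCH, e.g. 1.2.0.
VERSION_PATTERN = re.compile(r"^\d+\.\d+\.\d+$")
REQUIRED_FIELDS = ("name", "description", "version")


def validate_skill(metadata: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the metadata passes."""
    problems = [
        f"missing field: {field}"
        for field in REQUIRED_FIELDS
        if not metadata.get(field)
    ]
    version = metadata.get("version", "")
    if version and not VERSION_PATTERN.match(version):
        problems.append(f"version is not MAJOR.MINOR.PATCH: {version}")
    return problems


if __name__ == "__main__":
    good = {"name": "pdf-forms", "description": "Fill in PDF forms", "version": "1.2.0"}
    bad = {"name": "pdf-forms", "version": "1.2"}
    print(validate_skill(good))
    print(validate_skill(bad))
```

A check like this could run in continuous integration whenever a skill changes, in the same way tests guard ordinary code.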
The Vision for Knowledge Sharing
- Discussion about the potential value derived from sharing and distributing skills within organizations as a collective knowledge base evolves.
- Highlighting how agents improve through interaction and feedback mechanisms, ensuring they adapt to team-specific needs over time.
Understanding the Evolution of Skills in AI
The Growing Ecosystem of Skills
- The development of skills within the AI ecosystem enhances the utility of agents, extending their value beyond individual organizations to the broader community.
- As skills evolve, they contribute to a more reliable and useful knowledge base, facilitating continuous learning for AI systems.
Standardized Learning and Memory
- The standardized skill format ensures that anything Claude records can be used efficiently by future versions, making learning transferable.
- Skills allow Claude to acquire new capabilities quickly and discard obsolete ones, enhancing cost-effectiveness for frequently changing information.
Continuous Improvement Over Time
- The goal is for Claude to significantly improve from day one to day thirty of interaction, showcasing its evolving capabilities through skill creation.
Comparing Agent Stack with Computing Models
- Models are compared to processors: powerful on their own, but far more useful when combined with an operating system that orchestrates resources effectively.
- Agent runtime is likened to an operating system that maximizes processor value by managing data flow and processes around it.
Opening Up Creativity Through Skills
- While few companies create processors or operating systems, millions develop applications that encode unique expertise; skills aim to democratize this layer.
- The focus is shifting from rebuilding agents to building skills as a new paradigm for sharing capabilities, inviting collaboration in skill development.