AGENT THREADS. How to SHIP like Boris Cherny. Ralph Wiggum in Claude Code.

How to Measure Improvement in Engineering with Agents

Introduction to the Challenge of Improvement

  • IndyDevDan asks how engineers can measure their improvement, from novice coder to senior engineer capable of shipping production-ready code.
  • He cites Andrej Karpathy, who has said he feels left behind as a programmer, as evidence that even top engineers struggle to keep up with advances in agent technology.

The Need for Continuous Improvement

  • IndyDevDan emphasizes that a single step forward is not enough; engineers need to improve daily because skills and tools keep evolving.
  • He notes that top engineers are self-aware enough to recognize that agentic engineering needs new frameworks for measuring progress.

Introducing Thread-Based Engineering

  • IndyDevDan presents "thread-based engineering" as a mental framework for operationalizing agents and measuring improvement effectively.
  • A thread consists of three components: prompting or planning (beginning), agent work (middle), and reviewing or validating (end).

Understanding Threads of Work

  • Each prompt initiated in an agent tool starts a new thread of work, which includes executing tool calls and ultimately requires review by the engineer.
  • The value created by agents can be measured through tool calls, which correlate with impact when prompted correctly.
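
As a rough illustration of the three-part structure described above, a thread can be modeled as a small record that counts tool calls between the prompt and the review. This is only a sketch: the `Thread` class, its field names, and the tool names are assumptions made for the example, not part of any agent tool's API.

```python
from dataclasses import dataclass, field


@dataclass
class Thread:
    """One unit of work: a prompt, the agent's tool calls, and a review."""
    prompt: str                                            # beginning: prompt or plan
    tool_calls: list[str] = field(default_factory=list)   # middle: agent work
    reviewed: bool = False                                 # end: review or validation

    def record(self, tool_name: str) -> None:
        """Log one tool call made by the agent during this thread."""
        self.tool_calls.append(tool_name)

    def review(self) -> int:
        """Close the thread and return its tool-call count as a rough proxy for work done."""
        self.reviewed = True
        return len(self.tool_calls)


# Hypothetical tool names, used only to exercise the record.
thread = Thread(prompt="Summarize the codebase")
for name in ["read_file", "grep", "read_file"]:
    thread.record(name)
calls = thread.review()
```

Counting tool calls is a coarse measure; as the notes say, it only tracks impact when the prompt was a good one.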

Scaling Threads for Efficiency

  • Once an initial thread is established, it can be scaled through parallel execution, allowing multiple threads of work to run simultaneously across different environments.
  • IndyDevDan illustrates this concept with Boris Cherny's Claude Code setup, showing how running multiple agents at once increases productivity.

Conclusion on Parallel Execution Benefits

  • Cherny's approach involves running several Claude Code instances simultaneously, showcasing the effectiveness of managing multiple threads efficiently.
  • This method allows engineers like Cherny to maximize output by running threads of work in parallel within their workflows.

Parallel Processing with Agents

Introduction to Parallel Processing

  • The speaker demonstrates a P thread (parallel thread), which runs multiple threads of work at once, using a review of the Ralph Wiggum implementation in apps/star as the task.
  • By forking a terminal, four instances of agents are created, allowing simultaneous execution of commands to analyze the codebase.

Benefits of Parallelism

  • Running four agents in parallel effectively quadruples computational capacity compared to traditional engineering methods, enhancing productivity through parallelization.
  • The approach allows for increased confidence in responses by running identical prompts across multiple agents, which is particularly useful for code reviews.
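
A P thread of this kind might be sketched as follows, assuming a hypothetical `run_agent` function that stands in for launching one agent session. The video does this by forking terminals; the thread pool here is only a convenient way to wait on several agents at once and is not related to the "threads of work" in the framework.

```python
from concurrent.futures import ThreadPoolExecutor


def run_agent(agent_id: int, prompt: str) -> str:
    """Hypothetical stand-in for launching one agent session with a prompt."""
    return f"agent-{agent_id} finished: {prompt}"


def p_thread(prompt: str, n_agents: int = 4) -> list[str]:
    """Send the same prompt to n_agents in parallel and collect every result."""
    with ThreadPoolExecutor(max_workers=n_agents) as pool:
        futures = [pool.submit(run_agent, i, prompt) for i in range(n_agents)]
        # Results are gathered in submission order, one per agent.
        return [f.result() for f in futures]


results = p_thread("Review the Ralph Wiggum implementation")
```

The engineer still reviews each of the four results at the end, so the review cost grows with the number of agents.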

Scaling Engineering Output

  • The speaker emphasizes that if one has to monitor a single agent closely, it may indicate a need to reduce complexity and focus on fewer threads.
  • A common theme throughout this discussion is increasing the number of tool calls made by agents on behalf of users to improve overall efficiency.

Types of Threads: Base and C Threads

Understanding Thread Types

  • The speaker introduces two primary types of threads: base threads and C threads, clarifying that these are not traditional CPU process threading but rather lines of work executed by agents.

Chained Work with C Threads

  • C threads allow for chaining tasks into phases. This method is beneficial when dealing with large projects or sensitive production work that requires careful step-by-step execution.
  • Reasons for using C threads include limitations in an agent's context window or the necessity for meticulousness in high-stakes environments like migrations.

Managing Complex Tasks

  • For extensive plans (e.g., 50-step processes), breaking down tasks into manageable chunks can prevent errors during critical operations.
  • Tools such as "ask user question" enable agents to pause workflows and seek input from users at crucial checkpoints.
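
The chaining idea can be sketched as a loop that runs one phase at a time and stops at the first checkpoint the reviewer rejects. The function names and the migration phases below are illustrative assumptions, not taken from the video.

```python
from typing import Callable


def c_thread(phases: list[str],
             run_phase: Callable[[str], str],
             approve: Callable[[str, str], bool]) -> list[str]:
    """Run phases in order; stop at the first phase the reviewer rejects."""
    completed = []
    for phase in phases:
        output = run_phase(phase)
        if not approve(phase, output):   # human checkpoint between phases
            break
        completed.append(phase)
    return completed


# Hypothetical migration plan; the reviewer rejects the final phase.
phases = ["back up database", "apply migration", "verify row counts"]
done = c_thread(
    phases,
    run_phase=lambda p: f"done: {p}",
    approve=lambda p, out: p != "verify row counts",
)
```

Each phase starts with only what it needs, which also helps when the whole plan would not fit in one agent's context window.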

Fusion Threads: Enhancing Collaboration

Introduction to Fusion Threads

  • Fusion threads (F threads), also known as fusion chains, involve sending similar prompts to multiple agents simultaneously. This technique enhances collaboration among agents.

Aggregating Results

  • After collecting results from various agents, reviewing and combining their outputs leads to more robust conclusions. This method aligns with previous discussions about selecting the best outcomes from multiple inputs.

Parallelizing Agent Work

Overview of Parallelization and Fusion Threads

  • The process involves spinning up multiple agents using more compute resources, allowing for the aggregation of results in various ways, not limited to just selecting the best outcome.
  • The speaker demonstrates a P thread with three different codecs to fire up multiple agents within a single workload, emphasizing rapid prototyping as a key application.
  • A fusion thread begins as a P thread; after the agents run in parallel, their results are combined or merged to improve the quality of the final output.

Benefits of Using Multiple Agents

  • Utilizing agent sandboxes allows for deferred trust and experimentation with various versions of solutions, which is crucial for rapid prototyping.
  • By deploying more compute resources, one can run numerous lines of work simultaneously (e.g., nine agents), increasing the likelihood of successful outcomes through synthesis and selection.

Confidence Through Aggregation

  • Having more agents work on a problem increases confidence in the answer: if several agents independently agree, the answer is more likely to be reliable.
  • Research applications often employ fusion threads by utilizing sub-agents to conduct multiple web searches or similar tasks, showcasing practical uses beyond basic implementations.
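
One simple way to fuse results is a majority vote over the agents' answers, with the share of agreeing agents as a crude confidence score. This is a minimal sketch under that assumption; the notes mention that results can be aggregated in other ways too, and the sample answers are invented.

```python
from collections import Counter


def fuse(answers: list[str]) -> tuple[str, float]:
    """Return the most common answer and the fraction of agents that gave it."""
    if not answers:
        raise ValueError("need at least one answer")
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / len(answers)


# Invented answers from three agents given the same prompt.
answers = ["bug in auth.py", "bug in auth.py", "bug in db.py"]
winner, agreement = fuse(answers)
```

Exact string matching only works when answers are short and normalized; free-form agent output would need a review or summarization step before voting.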

Advanced Agentic Engineering: B Threads

Introduction to B Threads

  • B threads represent a meta structure where prompts trigger other prompts (sub-agents), allowing for complex workflows without needing detailed oversight from engineers.
  • An example includes planning and building workflows where one agent plans while another executes tasks; this abstraction simplifies engineering processes.

Managing Complexity with B Threads

  • Engineers benefit from deploying thicker threads that allow for more operations within a specific timeframe by effectively managing subthreads.
  • The orchestration of primary and sub-agents enables streamlined project management; engineers initiate high-level commands while internal processes remain hidden.
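
The plan-then-build example can be sketched as one top-level call that triggers a planning prompt and then one build prompt per step. The `planner` and `builder` functions are hypothetical stand-ins for sub-agents and return canned text so the sketch stays runnable.

```python
from typing import Callable

Agent = Callable[[str], str]


def planner(task: str) -> str:
    """Hypothetical sub-agent that turns a task into newline-separated steps."""
    return f"1. inspect {task}\n2. change {task}\n3. test {task}"


def builder(step: str) -> str:
    """Hypothetical sub-agent that carries out one step."""
    return f"completed: {step}"


def b_thread(task: str, plan: Agent = planner, build: Agent = builder) -> list[str]:
    """One top-level prompt fans out into a plan prompt and one build prompt per step."""
    steps = plan(task).splitlines()
    return [build(step) for step in steps]


report = b_thread("login form")
```

From the engineer's side this is still a single prompt and a single review; the sub-agent calls are what make the thread "thicker".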

Conclusion on Future Trends

  • As engineering evolves towards greater complexity and efficiency through methods like B threads, future rapid prototyping will increasingly rely on these advanced structures.

Understanding Agentic Workflows and Thread-Based Engineering

The Role of B Threads in Agentic Workflows

  • B threads are crucial for orchestrating agents, allowing for the combination of code and agents to maximize efficiency.
  • A new pattern, termed the "Ralph Wiggum pattern," emerges where AI engineers discover that integrating agents with code yields better results than using agents alone.
  • This concept has been discussed in the Principled AI Coding and Tactical Agentic Coding courses, highlighting its significance in AI developer workflows (ADWs).
  • Building teams of specialized agents enhances workflow efficiency by teaching primary agents to utilize appropriate tools effectively.

L Threads: High Autonomy and Long Duration Work

  • L threads represent high autonomy tasks that can run for extended periods without human intervention, showcasing impressive long-duration capabilities.
  • Effective prompting is essential; clearer prompts enable longer-running workflows that call numerous tools, enhancing agent autonomy.
  • The structure of L threads mirrors base threads but involves more tool calls, leading to increased autonomy and improved context management.

Enhancing Agent Performance through Planning

  • Engineers are encouraged to plan effectively as it translates into better prompting strategies, which ultimately leads to longer chains of work being executed autonomously.
  • The "stop hook" mechanism allows for verification processes within long-running tasks, ensuring accuracy by chaining multiple agent calls together.
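
The verification idea can be sketched as a loop that re-runs the agent until its output passes a check or an attempt limit is reached. This is not the stop hook API itself, only an illustration of the loop it enables; `work` and `verify` are hypothetical callables, and the simulated agent below passes on its third attempt.

```python
from typing import Callable


def run_until_verified(work: Callable[[int], str],
                       verify: Callable[[str], bool],
                       max_attempts: int = 5) -> tuple[str, int]:
    """Re-run the agent until its output passes verification or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        output = work(attempt)
        if verify(output):
            return output, attempt
    raise RuntimeError(f"not verified after {max_attempts} attempts")


# Simulated agent whose output only passes on the third attempt.
output, attempts = run_until_verified(
    work=lambda n: "tests pass" if n >= 3 else "tests fail",
    verify=lambda out: out == "tests pass",
)
```

The attempt limit matters: without it, a check the agent can never satisfy would keep the thread running indefinitely.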

Importance of Thread-Based Engineering Techniques

  • Thread-based engineering provides a framework for operating with agents efficiently during lengthy task execution phases.
  • There are four key methods to improve performance: running more threads, extending thread length, increasing thread thickness, and reducing human checkpoints in workflows.
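
To make the four dimensions concrete, one could log a few numbers per thread and compare periods. The `ThreadLog` fields and the sample figures below are assumptions invented for illustration; the video does not prescribe a particular logging format.

```python
from dataclasses import dataclass


@dataclass
class ThreadLog:
    tool_calls: int    # length: how much work the agent did
    sub_agents: int    # thickness: nested agents the thread spawned
    checkpoints: int   # times a human had to step in


def summarize(logs: list[ThreadLog]) -> dict[str, float]:
    """Summarize one period of work along the four dimensions."""
    if not logs:
        raise ValueError("need at least one thread log")
    n = len(logs)
    return {
        "threads": n,
        "avg_tool_calls": sum(t.tool_calls for t in logs) / n,
        "avg_sub_agents": sum(t.sub_agents for t in logs) / n,
        "avg_checkpoints": sum(t.checkpoints for t in logs) / n,
    }


# Invented numbers for two periods of work.
last_week = summarize([ThreadLog(10, 0, 3), ThreadLog(20, 1, 2)])
this_week = summarize([ThreadLog(40, 2, 1), ThreadLog(30, 2, 1), ThreadLog(50, 4, 0)])
```

In this invented data, the second period shows more threads, longer and thicker threads, and fewer checkpoints per thread, which is the direction of improvement the framework describes.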

Improving Agentic Engineering

Key Strategies for Enhancing Agentic Engineering

  • The four ways to improve agentic engineering include running more threads, longer threads, thicker threads, and reducing the number of checkpoints in human oversight.
  • Boris Cherny demonstrates parallel execution with five Claude Code instances running simultaneously in his terminal, showing how multiple active threads scale.
  • He utilizes Opus 4.5 and maintains manageable sizes for cloud resources while employing both in-loop and out-loop agentic coding for efficiency.
  • A crucial tip is to establish a validation loop that allows agents to verify their own work, minimizing the need for human intervention during reviews.
  • Reducing the frequency of human checks can enhance productivity; however, there are scenarios where manual review is necessary to ensure accuracy.

Advanced Concepts in Thread-Based Engineering

  • Emphasizing thread-based engineering involves viewing all tasks as interconnected threads of work that agents execute over time.
  • Building a specialized layer around your codebase with tailored agents can significantly improve problem-solving capabilities within specific domains.
  • The concept of a "Z thread" or zero-touch thread represents maximum trust in agents where manual review steps are eliminated entirely.
  • This advanced level of trust challenges traditional views on coding practices and suggests a future where engineers rely heavily on autonomous systems.
  • It’s important to differentiate this high-level approach from basic coding practices; it requires deep understanding and confidence in agent performance.

Measuring Improvement in Agentic Workflows

  • Engineers like Andrej Karpathy adapt by conceptualizing their work as threads: units of engineering effort executed collaboratively with agents over time.
  • Improvement indicators include adding more threads, extending thread duration, combining results from various threads, and orchestrating additional tasks concurrently during agent operations.
  • Mastery of prompt engineering and context management is essential for executing longer and more complex threads effectively within workflows.
  • Ultimately, success hinges on four core elements: context, model, prompt, and tools—all critical for enhancing agentic engineering processes.
  • Trusting your systems leads to fewer checkpoints needed during execution due to improved models and better prompts guiding the agents' actions.

The Future of Agentic Engineering

Introduction to Agentic Threads

  • The speaker discusses the evolution of working with agents in software engineering, starting from simple tasks with one agent to managing multiple agents simultaneously.
  • The concept of "Z threads" is introduced as the pinnacle of tactical agentic coding, representing a goal for engineers aiming for efficiency and automation.
  • The mission is emphasized: to create living software that operates autonomously, allowing developers to focus on higher-level planning rather than micromanaging agent performance.

Embracing Change in Software Engineering

  • A call to action is made for engineers to embrace the ongoing changes in software engineering by thinking in terms of threads and scaling their computational capabilities.
  • The importance of increasing workload on agents is highlighted as a means to improve overall impact and effectiveness within projects.

Video description

How do you KNOW you're improving as an engineer in the age of AI? Thread Based Engineering is the answer. Even Andrej Karpathy feels left behind. If one of the greatest engineers of our generation struggles to keep up, what hope do the rest of us have? The answer is simpler than you think: MEASURE YOUR THREADS.

🔥 VIDEO REFERENCES

  • Tactical Agentic Coding: https://agenticengineer.com/tactical-agentic-coding?y=-WBHNFAB0OE_a
  • How Boris Cherny uses CC Tweet: https://x.com/bcherny/status/2007179832300581177
  • Andrej Karpathy Tweet: https://x.com/karpathy/status/2004607146781278521
  • Ralph Wiggum Original Post by Geoffrey Huntley: https://ghuntley.com/ralph/
  • Multi Process Terminal Tool (mprocs): https://github.com/pvolok/mprocs

🚀 In this video, we introduce thread based engineering, a powerful mental framework for understanding and improving your agentic coding abilities. A thread is a unit of engineering work over time driven by you and your agents. You show up at the beginning with the prompt and at the end with the review. Everything in between? That's your AI agents doing the heavy lifting through tool calls.

🛠️ We break down SIX essential thread types for mastering agentic engineering:

  • Base Thread: Your fundamental unit of work
  • P Thread: Parallel execution for scaling output
  • C Thread: Chained work for production-sensitive tasks
  • F Thread: Fusion threads for rapid prototyping and confidence
  • B Thread: Meta structures with agents prompting agents
  • L Thread: Long duration, high autonomy workflows

💡 Learn how Boris Cherny, creator of Claude Code, runs FIVE parallel Claude instances in his terminal and another 5-10 in the background. This is thread engineering in action. We also explore the Ralph Wiggum technique, showing how code plus agents outperforms agents alone.

🔥 Four concrete ways to know you're improving:

  • Run MORE threads of work
  • Run LONGER threads with increased autonomy
  • Run THICKER threads with nested sub-agents
  • Run FEWER human-in-the-loop checkpoints

🌟 Whether you're a senior engineer shipping to production with every prompt or just getting started with AI coding, understanding thread based engineering will transform your productivity. Master context engineering, prompt engineering, and principled AI coding to unlock the full potential of multi-agent workflows.

💡 The future belongs to engineers who can scale their compute through parallel threads, fusion chains, and agentic prompt engineering. Don't get left behind in the agentic horizon. Start thinking in threads today. Stay focused and Keep Building.

#aicoding #agenticcoding #claudecode