Why Vertical LLM Agents Are The New $1 Billion SaaS Opportunities

The Impact of AI on Legal Technology

Introduction to AI's Influence

  • The speaker describes the transformative experience of using advanced AI, which can accomplish tasks in minutes that previously took days. This shift created a sense of urgency and opportunity within their company.

Guest Introduction: Jake Heller

  • Garry introduces Jake Heller, founder of Casetext, likening him to one of the first people on the moon because of his pioneering work in legal technology over the past 11-12 years.

Growth and Valuation Insights

  • Casetext was valued at $100 million before GPT-4's release; within two months of the release, that figure had jumped to a $650 million acquisition by Thomson Reuters. This highlights the rapid value creation possible with large language models.

Transitioning Focus to AI

  • Jake discusses how many new companies are focusing on vertical-specific AI agents, noting that he leads one of the most successful examples in this space.

Early Adoption and Strategic Shift

  • Jake recounts how, after getting early access to GPT-4, his company pivoted entirely to building a new product called CoCounsel, a decision reached after just 48 hours of deliberation among all employees.

Challenges in Legal Research Before Technology

Initial Company Mission

  • The mission was centered around integrating technology into legal practices. Jake shares his frustrations as a lawyer dealing with outdated tech tools that hindered efficiency.

Inefficiencies in Traditional Legal Workflows

  • He contrasts simple tasks like finding movie times online with complex legal research processes that could take days or weeks, emphasizing the need for better solutions.

Historical Context of Legal Research Tools

  • Prior to digital advancements, lawyers had to sift through physical documents manually—a tedious process that lacked modern search capabilities like "Ctrl + F."

Personal Experience Driving Innovation

Understanding the Journey of a Startup in Legal Tech

Initial Challenges and Direction

  • The speaker discusses the uncertainty faced when starting a company, emphasizing that while there may be a general direction, finding the right solution can take considerable time.
  • They identified issues with technology in the legal field and aimed to improve it by acquiring user-generated content (UGC) from lawyers to enhance their offerings.

Struggles with User Engagement

  • Despite aspirations to create a UGC platform similar to Wikipedia or GitHub, they struggled to engage lawyers due to their limited availability and high-value hourly rates.
  • The speaker notes that unlike typical Wikipedia editors who have more free time, lawyers are often too busy billing hours to contribute voluntarily.

Pivoting Towards Technology

  • After realizing the challenges of obtaining UGC, they pivoted towards investing in natural language processing and machine learning technologies instead.
  • They began developing better user experiences using algorithms similar to those used by music recommendation services like Pandora and Spotify.

Incremental Improvements vs. Fundamental Changes

  • The speaker reflects on how many of their initial improvements were incremental rather than revolutionary, making it easy for clients to overlook them.
  • Many clients were reluctant to change because they were already financially successful, so any new technology posed a potential risk.

Impact of AI on Perceptions and Market Fit

  • The launch of ChatGPT marked a significant shift in perception among lawyers regarding technology's impact on their work; they recognized its potential for substantial change.
  • This newfound awareness led clients who had previously resisted change to actively seek out innovative solutions.

Navigating the Idea Maze

  • The concept of an "idea maze" is introduced, illustrating how startup founders navigate through uncertainties and dead ends before reaching product-market fit.
  • The discussion highlights how external factors like advancements in AI can dramatically alter paths toward success within this maze.

Reflections on Product-Market Fit

  • As the product went through various iterations, customer feedback and revenue generated before the CoCounsel launch made the team optimistic that it was nearing product-market fit.

Challenges and Breakthroughs in Launching CoCounsel

Initial Challenges Faced

  • The speaker reflects on the early challenges of launching CoCounsel, including server outages and difficulties in hiring support and sales personnel.
  • They humorously mention frequent visits to a diner known for its association with venture capitalists during this stressful period.
  • The launch led to significant media attention, marking a pivotal moment that demonstrated true product-market fit.

Rapid Growth and Acquisition

  • Within two months of launching, discussions began regarding an acquisition valued at $650 million, although the transaction closed six months later.
  • The concept behind CoCounsel was an AI legal assistant capable of performing complex legal tasks quickly, akin to having an additional team member.

Development Process

  • Early versions were developed under strict NDAs with OpenAI; initial testing involved select law firms unaware they were using GPT technology.
  • Users experienced remarkable efficiency gains as the AI could complete tasks that previously took lawyers hours in just minutes.

Company Dynamics During Transition

  • The intense work environment fostered rapid iteration and innovation among the 120 employees before the public launch of GPT-4.
  • By comparison, some other companies struggled to keep pace during this critical period because they lacked the same degree of focus.

Founder Leadership and Employee Buy-In

  • Transitioning into "founder mode," the speaker faced skepticism from long-term employees about shifting focus towards AI development despite previous successes.

How to Achieve Sales Targets and Leverage AI in Legal Research

Introduction to the Discussion

  • The conversation begins with a focus on achieving sales targets for the next quarter, but quickly shifts to a different topic regarding product development and customer engagement.
  • The speaker emphasizes the importance of early customer involvement in product development, showcasing real-time reactions during Zoom calls as pivotal moments that influenced perceptions.

Customer Reactions and Product Validation

  • Customers displayed significant emotional responses when exposed to new technology, indicating a transformative impact on their understanding of legal processes.
  • The release of GPT-4 played a crucial role in shaping these reactions, as earlier models were deemed inadequate for practical legal applications.

Evolution of AI Models

  • Initial versions of AI struggled with accuracy; however, improvements were noted with GPT-3.5 and later models which began producing more plausible outputs relevant to legal contexts.
  • A study showed that while GPT-3.5 performed poorly on the bar exam (around the 10th percentile), early tests with GPT-4 indicated it outscored 90% of test takers, marking significant progress.

Development Process for Legal Applications

  • The team conducted extensive testing using specific legal cases to refine the model's ability to generate accurate memos with proper citations.
  • There was initial skepticism about whether AI could meet legal standards; however, witnessing its capabilities shifted the mindset of both developers and customers.

Problem-Solving Approach

  • The process began by identifying user needs—specifically how users wanted research results presented—and working backward from that desired outcome.
  • Emphasis was placed on creating "skills" or capabilities within the AI that would allow it to effectively address complex queries similar to how top attorneys approach research tasks.

Methodology for Effective Legal Research

  • The methodology involved breaking down user requests into actionable search queries across various databases, mirroring the diligence expected from expert attorneys.

Research Process and Prompt Engineering

Steps in the Research Process

  • The research process involves extracting insights from readings, gathering citations, and compiling a research memo. Each step is now achievable with modern technology through structured prompts.
  • The final result of the research may require multiple individual prompts, each designed to think step-by-step to achieve clarity and precision in outcomes.
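The chain of prompts described above can be sketched roughly as follows. This is a minimal illustration, not Casetext's implementation: the function names, the prompt wording, and the `fake_complete` and `fake_search` stand-ins are all hypothetical, and a real system would replace them with an actual model call and a case-law database.

```python
from typing import Callable, List

# Hypothetical type: any function that maps a prompt string to a completion.
Complete = Callable[[str], str]
Search = Callable[[str], List[str]]


def plan_queries(request: str, complete: Complete) -> List[str]:
    """Step 1: ask the model for search queries, one per line."""
    raw = complete(f"List search queries, one per line, for: {request}")
    return [line.strip() for line in raw.splitlines() if line.strip()]


def extract_insight(request: str, document: str, complete: Complete) -> str:
    """Step 2: ask the model what one document says about the request."""
    return complete(
        f"Question: {request}\nDocument: {document}\n"
        "Think step by step, then summarize."
    )


def write_memo(request: str, insights: List[str], complete: Complete) -> str:
    """Step 3: compile the per-document insights into a single memo."""
    notes = "\n".join(f"- {i}" for i in insights)
    return complete(f"Write a research memo answering: {request}\nNotes:\n{notes}")


def research(request: str, search: Search, complete: Complete) -> str:
    """Run the three prompts in sequence; each one has a single narrow job."""
    queries = plan_queries(request, complete)
    documents = [doc for q in queries for doc in search(q)]
    insights = [extract_insight(request, d, complete) for d in documents]
    return write_memo(request, insights, complete)


def fake_complete(prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    if prompt.startswith("List search queries"):
        return "statute of limitations fraud\ndiscovery rule tolling"
    if prompt.startswith("Question:"):
        return "insight from: " + prompt.split("Document: ")[1].splitlines()[0]
    return "MEMO\n" + prompt.split("Notes:\n")[1]


def fake_search(query: str) -> List[str]:
    """Stand-in for a case-law database lookup."""
    return [f"case about {query}"]


memo = research(
    "When does the limitations period start for fraud claims?",
    fake_search,
    fake_complete,
)
```

Keeping each step as its own prompt means each one can be tested and tuned separately, which is what makes the testing approach in the next section practical.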

Testing and Development

  • A rigorous testing framework was established, evolving from dozens to thousands of tests for each prompt response system (PRS). This ensures that search queries are effective and meet predefined standards.
  • The approach mirrors test-driven development in software engineering, emphasizing the importance of writing tests first to ensure high accuracy in responses from language models.
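A test-first workflow for a single prompt might look something like the sketch below. Everything here is an assumption made for illustration: the `PromptTest` structure, the three example checks, and `query_prompt_v1`, which stands in for "prompt plus model" so the example runs without an API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class PromptTest:
    """One expectation about a prompt's output (fields are illustrative)."""
    name: str
    user_input: str
    check: Callable[[str], bool]  # returns True when the output is acceptable


def run_suite(prompt_fn: Callable[[str], str], tests: List[PromptTest]) -> Dict:
    """Run every test against the prompt under test and collect failures."""
    failures = [t.name for t in tests if not t.check(prompt_fn(t.user_input))]
    return {
        "total": len(tests),
        "passed": len(tests) - len(failures),
        "failures": failures,
    }


# The tests are written first, before the prompt is tuned.
TESTS = [
    PromptTest("uses_quotes_for_phrases", "breach of contract damages",
               lambda out: '"breach of contract"' in out),
    PromptTest("never_empty", "adverse possession",
               lambda out: bool(out.strip())),
    PromptTest("no_trailing_question_mark", "what is laches?",
               lambda out: not out.endswith("?")),
]


def query_prompt_v1(user_input: str) -> str:
    """Stand-in for a prompt that rewrites a request as a search query."""
    text = user_input.rstrip("?")
    return text.replace("breach of contract", '"breach of contract"')


report = run_suite(query_prompt_v1, TESTS)
```

In practice the suite would grow from a handful of cases like these to the thousands mentioned above, and `report["failures"]` becomes the work list for the next prompt revision.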

Challenges with Language Models

  • Unlike traditional coding practices, prompting requires careful consideration as language models can produce unexpected results. Adjustments made for one issue can inadvertently create new problems.
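One way to catch the "fix one thing, break another" problem is to compare per-test results before and after a prompt change. The sketch below assumes results are stored as a simple mapping from test name to pass/fail; the test names and outcomes are invented for illustration.

```python
from typing import Dict, List


def regressions(before: Dict[str, bool], after: Dict[str, bool]) -> Dict[str, List[str]]:
    """Compare pass/fail results for tests present in both prompt versions."""
    shared = before.keys() & after.keys()
    fixed = sorted(n for n in shared if after[n] and not before[n])
    broken = sorted(n for n in shared if before[n] and not after[n])
    return {"fixed": fixed, "broken": broken}


# Hypothetical results: the edit fixed the citation test but broke a date test.
v1 = {"cites_case": False, "parses_date": True, "handles_empty": True}
v2 = {"cites_case": True, "parses_date": False, "handles_empty": True}

diff = regressions(v1, v2)
```

A non-empty `diff["broken"]` list is the signal that a prompt adjustment introduced a new problem while solving the original one.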
  • There is skepticism about companies merely creating wrappers around existing technologies without adding intellectual property (IP), but significant finesse is required to solve real customer problems effectively.

Building Comprehensive Applications

Integrating Multiple Layers

  • Effective applications must integrate various components such as proprietary datasets, legal annotations, and connections to specific document management systems relevant to the legal field.
  • Attention to detail is crucial; for instance, how well OCR (Optical Character Recognition) functions impacts the ability to review large sets of documents accurately.

Addressing Edge Cases

  • Legal documents often present unique challenges like varied formatting or multi-page layouts condensed into single pages. These edge cases must be addressed before leveraging large language models effectively.

Value Beyond Basic Functionality

Custom Solutions vs. Generic Tools

  • Building a successful application involves more than just using a language model; it requires developing business logic tailored for specific domains that are difficult for competitors to replicate.

Transitioning from 70% to 100%

  • Many SaaS solutions initially function at about 70% effectiveness. Achieving full reliability demands extensive work on integrations and user experience enhancements that justify higher pricing tiers based on performance guarantees.

Addressing Limitations of Language Models

Hallucination Issues

Understanding the Importance of Accuracy in AI for Legal Applications

The Stakes of Providing Accurate Information

  • There is significant risk involved if an AI agent provides incorrect information to lawyers handling critical court cases, emphasizing the need for accuracy.
  • The process involves analyzing patterns of mistakes and refining instructions to improve clarity and context, which is crucial for effective communication with the AI.

Test-Driven Development Framework

  • Implementing a test-driven development framework helps identify why certain tests fail, allowing developers to make necessary adjustments to achieve accurate results.
  • Many founders are tempted to rely solely on intuition ("vibes") rather than structured testing; however, this approach can lead to unreliable outcomes.
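Looking for patterns in failed tests, rather than relying on intuition, can be as simple as tallying a cause for each failure. The sketch below assumes someone has already labeled each failing test with a likely cause; the test names and categories are made up for illustration.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_failure_patterns(
    failures: Iterable[Tuple[str, str]], n: int = 3
) -> List[Tuple[str, int]]:
    """Count failures per labeled cause and return the n most common."""
    return Counter(cause for _test, cause in failures).most_common(n)


# Hypothetical triage notes: (test name, reviewer-assigned cause).
FAILURES = [
    ("memo_017", "ambiguous instruction"),
    ("memo_042", "missing context"),
    ("memo_051", "ambiguous instruction"),
    ("memo_063", "ambiguous instruction"),
    ("memo_080", "missing context"),
    ("memo_091", "wrong jurisdiction"),
]

patterns = top_failure_patterns(FAILURES)
```

Here the most frequent cause points at the prompt's instructions, which suggests rewriting them for clarity before touching anything else.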

Learning from Experience

  • Having a legal background informs the understanding that even minor errors can have serious repercussions in legal contexts, reinforcing the necessity for precision.
  • A single negative experience with AI can lead users, especially busy professionals like lawyers, to lose faith in its utility quickly.

User Experience and Trust

  • Ensuring that initial interactions with AI are successful is vital for fostering trust among users who may not be technologically inclined.

The Evolution of Language Models: OpenAI's New Approach

System One vs. System Two Thinking

  • Previous models operated primarily on "System One" thinking—fast and intuitive decision-making—while newer models aim to incorporate "System Two" thinking, which involves more deliberate reasoning.

Performance Testing of New Models

  • The new model was subjected to rigorous testing against known failures; it demonstrated improved thoroughness and precision compared to earlier iterations.

Nuanced Understanding in Legal Context

  • In tests where legal briefs were slightly altered (e.g., changing key terms), previous models failed to recognize errors while the new model successfully identified them due to its enhanced analytical capabilities.
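The altered-brief test could be set up along these lines. This is a hedged sketch, not the actual evaluation used: the source sentence, the term swaps, and both reviewer functions are hypothetical stand-ins for real briefs and real model calls.

```python
from typing import Callable, List, Tuple

Case = Tuple[str, bool]  # (brief text, whether it matches the source)


def make_altered_cases(source: str, swaps: List[Tuple[str, str]]) -> List[Case]:
    """Return the original text plus one subtly altered copy per swap."""
    cases: List[Case] = [(source, True)]
    for old, new in swaps:
        if old not in source:
            raise ValueError(f"term not found in source: {old!r}")
        cases.append((source.replace(old, new, 1), False))
    return cases


def score_reviewer(
    reviewer: Callable[[str, str], bool], source: str, cases: List[Case]
) -> float:
    """Fraction of cases where reviewer(source, brief) matches the label."""
    correct = sum(reviewer(source, brief) == label for brief, label in cases)
    return correct / len(cases)


SOURCE = "The court held that the defendant shall pay damages and costs."
CASES = make_altered_cases(
    SOURCE,
    [("shall", "may"), ("damages and costs", "damages or costs")],
)


def exact_match_reviewer(source: str, brief: str) -> bool:
    """Stand-in for a careful model: accurate only if the texts are identical."""
    return source == brief


def lenient_reviewer(source: str, brief: str) -> bool:
    """Stand-in for a model that misses subtle edits: always says accurate."""
    return True


careful_score = score_reviewer(exact_match_reviewer, SOURCE, CASES)
lenient_score = score_reviewer(lenient_reviewer, SOURCE, CASES)
```

Swapping each stand-in for a call to a different model version would give a direct comparison of how often each one notices a changed key term.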

Implications for Future Developments

Exploring AI Problem-Solving Techniques

The Shift in AI Instruction Methods

  • Discussion on the transition from traditional input-output models to a more nuanced approach where AI is prompted to think critically about problem-solving.
  • Emphasis on leveraging expertise, such as insights from top lawyers, to guide the AI's thought process and improve its responses.
  • Acknowledgment that it is still early in the investigation of new prompting techniques, with no conclusive evidence yet on their effectiveness.

The Importance of Domain Expertise

  • Recognition of the need for sharing knowledge about advancements in AI technology, especially among companies unfamiliar with recent developments.
  • Highlighting the potential for significant improvements (from 70% to 100% accuracy) through better understanding and application of AI capabilities.

Transformative Potential of AI in Various Fields

  • Discussion on how industries like law could be revolutionized by reducing time and costs associated with document review processes.
  • Encouragement for companies to embrace innovative solutions rather than cling to outdated beliefs about AI limitations.

Future Outlook on Jobs and Technology

Channel: Y Combinator
Video description

As LLMs become exponentially better, it is clear that vertical AI agents are key to the next generation of billion-dollar SaaS companies. In this episode of the Lightcone, the hosts sit down with YC alum Jake Heller, the co-founder and CEO of Casetext (which sold to Thomson Reuters for $650 million in cash in 2023), to discuss what it takes to build a successful vertical AI company and overcome resistance from industry veterans and skeptics.

Chapters (Powered by https://bit.ly/chapterme-yc)

  • 00:00 Coming Up
  • 01:40 Building a successful vertical AI company
  • 06:05 The unique challenges of law and AI
  • 09:24 The turning point for lawyers with ChatGPT
  • 11:25 Finding product market fit in legal
  • 15:04 Entering deep founder mode
  • 20:40 Approaching prompt engineering step by step
  • 25:05 Going beyond GPT wrappers
  • 28:10 Aiming for 100% accuracy
  • 30:48 Thoughts on o1’s capabilities
  • 36:42 Outro