Wall Street Just Bet $285 Billion on AI Agents. The Best One Barely Works.

The Future of AI Agents: Trends and Challenges

Overview of Current Trends in AI Agents

  • The latest trend among agents is the promise of doing work autonomously, allowing users to relax while they handle tasks.
  • Products such as Co-work, Codec, Lindy, Sauna, and Google Opal are leading this movement, but they often fail to address significant challenges.

Understanding the Landscape of AI Tools

  • A practical review will cover which outcome-focused agents are effective and which are not, highlighting successes and failures among major companies and startups.
  • The discussion will also include a three-layer architecture for building personalized agent infrastructure.

Emergence of Co-work as a Game Changer

  • Anthropic's Co-work was launched in January as an autonomous AI agent that operates directly on user files without coding requirements.
  • Microsoft quickly responded with its own version, Copilot Co-work, built on Claude's agent technology despite Microsoft's earlier investments in OpenAI.

Market Impact and Investor Reactions

  • The introduction of these agents triggered a sell-off in SaaS stocks, with more than $285 billion in market value lost on fears that such tools could replace traditional software.
  • Despite the hype around these agents, many remain in research preview stages with notable limitations affecting their reliability. For instance, Co-work becomes inactive when a laptop is closed.

Evaluating Agent Effectiveness

  • To understand why some agents succeed where others fail, consider verifiability: code can be checked by running or testing it, while the output of non-coding tasks is much harder to validate.
  • Three critical questions help assess whether an agent is effective:
      • Does it have persistent memory across sessions?
      • Can it produce inspectable artifacts?
      • Does its architecture allow context accumulation over time?

Analyzing Co-work Against Key Criteria

  • In terms of persistent memory, Co-work has limited capabilities; it retains some context but requires careful input from users each session for optimal performance.

Understanding AI Agents: A Deep Dive

Overview of AI Agent Capabilities

  • The speaker describes the mental model they use with Claude: persistent memory helps, but the workflow should not depend on it entirely.
  • Co-work's ability to produce tangible work artifacts is highlighted as a significant strength, contributing to its popularity and causing concern among investors due to the visible outputs generated by AI.
  • Co-work's architecture does not let context compound over time; each run is one-shot and does not build on previous interactions.
  • Despite limitations in agent capabilities, there is high demand and excitement surrounding AI tools like Claude, indicating strong product-market fit even with imperfect answers.
  • The speaker plans to evaluate four lesser-known agents using the same criteria applied to Co-work.
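
The three questions can be turned into a small scoring rubric. The sketch below is illustrative only: the `Rating` scale and the `AgentScore` class are hypothetical names, and the Co-work ratings are one reading of the summary above (limited memory, strong artifacts, no compounding), not figures stated in the video.

```python
from dataclasses import dataclass
from enum import Enum


class Rating(Enum):
    """Hypothetical three-step scale for each question."""
    NO = 0
    PARTIAL = 1
    YES = 2


@dataclass(frozen=True)
class AgentScore:
    name: str
    persistent_memory: Rating      # Does it remember across sessions?
    inspectable_artifacts: Rating  # Can you open and edit what it produced?
    context_compounds: Rating      # Does each run build on earlier ones?

    def _criteria(self) -> dict[str, Rating]:
        return {
            "persistent_memory": self.persistent_memory,
            "inspectable_artifacts": self.inspectable_artifacts,
            "context_compounds": self.context_compounds,
        }

    def total(self) -> int:
        """Sum of the three ratings (0 to 6)."""
        return sum(r.value for r in self._criteria().values())

    def gaps(self) -> list[str]:
        """Criteria that are not fully met."""
        return [k for k, r in self._criteria().items() if r is not Rating.YES]


# Assumed ratings, read off the summary's description of Co-work.
cowork = AgentScore("Co-work", Rating.PARTIAL, Rating.YES, Rating.NO)
print(cowork.name, cowork.total(), cowork.gaps())
```

Listing the gaps rather than only a total keeps the focus on which question an agent fails, which is the comparison the rest of the review makes.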

Exploring Lindy: An Outcome-Focused Agent

  • Lindy is introduced as a prominent outcome-focused agent created by founder Flo Crivello, aimed primarily at busy executives seeking efficiency in their daily tasks.
  • Unlike traditional tools like Zapier that require detailed component arrangement, Lindy utilizes natural language processing to interpret user intent and automate task execution seamlessly.
  • Despite celebrity endorsements, user experiences with Lindy are mixed; Trustpilot ratings reflect dissatisfaction with credit usage transparency and unproductive outcomes.
  • Lindy's output lacks clear editable artifacts compared to Co-work; because the results are opaque, users find them hard to access or modify.
  • While Lindy has some form of persistent memory—remembering user queries—it struggles with context compounding and often leads to inefficient credit consumption during complex requests.

Automation and AI Agents: Insights on Emerging Tools

The Role of Automation in Executive Efficiency

  • The speaker discusses the potential niche for automation tools like Lindy, which aim to simplify tasks for executives by automating mundane activities.
  • While acknowledging Lindy's success, the speaker argues that it does not yet qualify as a deep outcomes-focused agent.

Wordware's Pivot to Sauna

  • Wordware raised $30 million to develop an IDE for AI agents but pivoted to become Sauna, an AI workspace for professionals.
  • The founder emphasizes memory as a foundational element in their product, allowing context to compound over time rather than being just a toggle feature.
  • Sauna retains its engineering infrastructure from Wordware, enhancing its orchestration capabilities beyond mere API functions.

Key Insights on Knowledge Work

  • A crucial insight is that knowledge workers will not need to become programmers; instead, they should be able to articulate clear specifications for their work.
  • Despite the promising features of Sauna, there are concerns about its early-stage development and whether it can deliver on its promises regarding persistent memory and artifact production.

Google Opal: An Underappreciated Tool

  • Google Opal is highlighted as a tool often overlooked in discussions about AI agents; it recently received upgrades with Gemini 3 technology.
  • Opal offers dynamic routing and self-correction capabilities while remembering user objectives across sessions.

Community Engagement and Building Public Workflows

  • Users are actively building practical applications like meeting prep agents using Opal, showcasing real-world use cases rather than just demos.
  • The ability to remix workflows fosters collaboration among users, leveraging Google's open-source ethos to enhance productivity.

Accessibility of Google Opal

  • As a free tool with no barriers to entry, Google Opal presents significant opportunities for users looking to explore automation without financial constraints.

Google Opal: A Double-Edged Sword?

Overview of Google Opal's Strengths and Weaknesses

  • The speaker appreciates Google Opal as a fantastic tool, emphasizing its affordability compared to other expensive tools that drain finances.
  • Concerns arise regarding Google's history of abandoning products after initial experiments, raising doubts about the long-term viability of Google Opal.
  • The memory feature is criticized for being simplistic and spreadsheet-like, which may not support complex, long-term tasks effectively.

Evaluating Agent Capabilities

  • Questions are posed about the agent's persistent memory; while it claims to have this feature, its simplicity limits effectiveness.
  • The importance of asking hard questions about AI tools is highlighted to avoid falling for marketing hype and ensure genuine utility in investments.

Exploring Obvious: An Ambitious AI Workspace

Features and Potential of Obvious

  • Obvious is introduced as an ambitious AI workspace with various features like SQL workbooks, live charts, presentations, and custom apps.
  • It allows cross-referencing between different artifacts (e.g., slide decks and spreadsheets), enhancing usability for outcomes-focused projects.

Challenges Facing New Tools

  • Despite its potential, Obvious faces challenges due to its newness in the market compared to competitors like Sauna that are more visible through demos.

Key Principles for Effective AI Agents

Essential Design Considerations

  • Three key principles are outlined for building effective AI agents:
      • Memory must be integrated into the architecture rather than being an add-on feature.
      • Outcomes should be produced on editable surfaces to enhance user experience and control over results.
      • Contextual knowledge needs to compound over time to improve task efficiency.
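
As a rough illustration of the three principles working together, the sketch below keeps memory in a JSON file that every session loads, writes each outcome as a plain Markdown file the user can edit, and appends what was learned so later sessions start with more context. The function name `run_session`, the file names, and the example facts are all hypothetical; none of the products reviewed here is claimed to work this way.

```python
import json
import tempfile
from pathlib import Path


def run_session(workdir: Path, task: str, new_fact: str) -> Path:
    """One session: load memory, record a new fact, write an editable artifact."""
    memory_file = workdir / "memory.json"

    # Principle 1: memory is part of the data flow and is loaded on every run.
    facts = json.loads(memory_file.read_text()) if memory_file.exists() else []

    # Principle 3: what this session learned is appended, so context compounds.
    facts.append(new_fact)
    memory_file.write_text(json.dumps(facts, indent=2))

    # Principle 2: the outcome is a plain file the user can open and edit.
    artifact = workdir / f"{task}.md"
    lines = [f"# {task}", "", "Known context:"] + [f"- {fact}" for fact in facts]
    artifact.write_text("\n".join(lines) + "\n")
    return artifact


with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    run_session(workdir, "meeting-prep", "Client prefers short agendas")
    second = run_session(workdir, "follow-up", "Budget review is due Friday")
    # The second artifact lists both facts, including the one from the first session.
    print(second.read_text())
```

A one-shot agent corresponds to skipping the load step: each run would then see only its own `new_fact`.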

Architectural Framework for Agents

  • A three-layer architecture is proposed:
      • Knowledge Store: Where memory resides (e.g., databases).
      • Agent Recipes: Pre-wired workflows tailored to specific tasks.

Understanding Workflow Automation

The Concept of Pre-Wired Workflows

  • The speaker describes workflows as pre-wired systems akin to recipe cards or punch cards, emphasizing their utility in producing editable artifacts.
  • These workflows can encompass various formats such as calendars, meetings, documents, and presentations.

Importance of Continuous Improvement

  • There is a focus on the necessity for workflows to improve over time through iterative scheduling and learning processes.
  • The speaker acknowledges previous discussions on these principles but aims to set the stage for new developments.

Open Brain Project Developments

  • The speaker introduces the Open Brain project, which includes a knowledge store and scheduling loop, along with new agent recipes or workflow templates.
  • Users who prefer cost-effective solutions can utilize Open Brain's offerings at a lower price point compared to other services.
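
The components named above (a knowledge store, agent recipes, and a scheduling loop) could be wired together roughly as follows. This is a minimal sketch under stated assumptions, not Open Brain's actual code: the class `KnowledgeStore`, the recipe `meeting_prep`, and the function `run_schedule` are hypothetical, the store is an in-memory SQLite table, and the loop runs a fixed number of ticks instead of waiting on a real clock.

```python
import sqlite3
from typing import Callable


class KnowledgeStore:
    """Knowledge store: where memory lives (here, an in-memory SQLite table)."""

    def __init__(self) -> None:
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE notes (topic TEXT, body TEXT)")

    def add(self, topic: str, body: str) -> None:
        self.db.execute("INSERT INTO notes VALUES (?, ?)", (topic, body))

    def recall(self, topic: str) -> list[str]:
        rows = self.db.execute(
            "SELECT body FROM notes WHERE topic = ? ORDER BY rowid", (topic,)
        )
        return [body for (body,) in rows]


Recipe = Callable[[KnowledgeStore], str]


def meeting_prep(store: KnowledgeStore) -> str:
    """Agent recipe: a pre-wired workflow that turns stored notes into a briefing."""
    notes = store.recall("meetings")
    briefing = "Meeting briefing\n" + "\n".join(f"- {n}" for n in notes)
    # Write back a record of this run so the next run has more context.
    store.add("meetings", f"briefing generated from {len(notes)} notes")
    return briefing


def run_schedule(store: KnowledgeStore, recipes: dict[str, Recipe], ticks: int) -> list[str]:
    """Scheduling loop: runs every recipe once per tick and collects the outputs."""
    outputs = []
    for _ in range(ticks):
        for recipe in recipes.values():
            outputs.append(recipe(store))
    return outputs


store = KnowledgeStore()
store.add("meetings", "Q3 roadmap review with design team")
outputs = run_schedule(store, {"meeting_prep": meeting_prep}, ticks=2)
print(outputs[-1])
```

Because each recipe writes back to the store, the second briefing contains more context than the first, which is the compounding behaviour the speaker asks for.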

Future Directions and User Engagement

  • Emphasis is placed on understanding the evolving landscape of automation tools rather than solely focusing on specific platforms like Open Brain.
  • The speaker encourages asking better questions regarding agent capabilities and their application in everyday tasks beyond business contexts.

Video description

My site: https://natebjones.com
Full Story w/ Prompts: https://natesnewsletter.substack.com/p/every-ai-agent-you-use-has-the-same?r=1z4sm5&utm_campaign=post&utm_medium=web&showWelcomeOnShare=true

What's really happening with AI agents that claim to do the work for you? The common story is that outcome-focused AI agents have finally arrived, but the reality is that most of them still can't answer three basic questions.

In this video, I share the inside scoop on which AI agents actually deliver outcomes and which are still living on demo energy:

  • Why verifiability is the hidden foundation of every real agent
  • How three questions separate genuine agents from expensive hype
  • What Lindy, Google Opal, Sauna, and Obvious actually get right
  • Where the three-layer architecture points for builders who want control

Operators and builders who apply these three questions before committing will avoid the hype cycle and invest in tools that compound value over time.

Chapters

00:00 Why Outcome Agents Exist Now
01:45 The $285 Billion SaaS Sell-Off Explained
03:30 Co-Work's Real Limitations
05:15 Why Code Made Agents Work First
07:00 Three Questions That Separate Real from Fake
09:00 Scoring Claude Co-Work Against the Framework
11:00 Lindy: Executive Automation or Overhyped?
13:30 Sauna: Memory as Architecture, Not Feature
16:00 Google Opal: Free but Fragile
18:30 Obvious: Most Ambitious, Least Known
20:30 Three Principles Every Real Agent Needs
22:30 The Three-Layer Architecture for Builders
24:30 Build vs. Buy and What Comes Next

Subscribe for daily AI strategy and news. For deeper playbooks and analysis: https://natesnewsletter.substack.com/

Listen to this video as a podcast.

  • Spotify: https://open.spotify.com/show/0gkFdjd1wptEKJKLu9LbZ4
  • Apple Podcasts: https://podcasts.apple.com/us/podcast/ai-news-strategy-daily-with-nate-b-jones/id1877109372