Cursor vs. Claude Code - Which is the Best AI Coding Agent?

Comparison of Coding Agents: Cursor vs. Claude Code

User Experience (UX) Insights

  • The speaker compares the user experience of two coding agents, Cursor and Claude Code, while working on a Rails app.
  • Cursor's interface makes agent interaction the primary way to change code within a fully featured IDE, but some design elements feel clunky.
  • Multiple prompts and terminal commands can become confusing because the interface panes are small and it is not always clear when the agent is waiting for input.
  • The speaker prefers Claude Code's single-pane CLI interface, which focuses solely on agent interaction without the distraction of file management.

Code Quality Challenges

  • The speaker discusses their Rails app project involving email bots that require updates and improvements after nine months of inactivity.
  • Tasks included cleaning up tests, updating gems, replacing LangChain with direct OpenAI API calls, and adding support for Anthropic.
  • Both agents used Claude 3.7 Sonnet as the underlying model; however, Cursor had an advantage because it could search the web for documentation when it got stuck.

Performance in Task Execution

  • While attempting to add Anthropic support, Claude Code struggled with syntax issues and ultimately wrote its own implementation instead of finding existing documentation.
  • Cursor successfully searched online for documentation to resolve issues related to integrating Anthropic into the Rails app.

Cost Analysis

  • Using Claude Code cost approximately $8 for about 90 minutes of work on three coding tasks, which is reasonable for a single session but potentially high with frequent use.
  • In contrast, Cursor uses a subscription model costing $20 per month for 500 premium requests; this exercise used only a fraction of that allowance (less than $2 worth).
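
The arithmetic behind these figures can be sketched as follows. The dollar amounts, the 90-minute session, and the 500-request allowance come from the summary; the figure of 45 requests used is a hypothetical value chosen only to be consistent with "less than $2", not a number the speaker reported.

```python
# Figures quoted in the summary above.
CLAUDE_CODE_SESSION_USD = 8.00    # metered cost of the ~90-minute session
SESSION_MINUTES = 90
CURSOR_MONTHLY_USD = 20.00        # subscription price
CURSOR_PREMIUM_REQUESTS = 500     # premium requests included per month

# Hypothetical: the speaker only says "less than $2" worth was used.
ASSUMED_CURSOR_REQUESTS_USED = 45


def cursor_cost_per_request() -> float:
    """Effective price of one premium request under the subscription."""
    return CURSOR_MONTHLY_USD / CURSOR_PREMIUM_REQUESTS


def cursor_session_cost(requests_used: int) -> float:
    """Share of the subscription consumed by a given number of requests."""
    return requests_used * cursor_cost_per_request()


def claude_code_hourly_rate() -> float:
    """Metered session cost scaled to one hour."""
    return CLAUDE_CODE_SESSION_USD / SESSION_MINUTES * 60


if __name__ == "__main__":
    cursor_usd = cursor_session_cost(ASSUMED_CURSOR_REQUESTS_USED)
    print(f"Cursor per request: ${cursor_cost_per_request():.2f}")
    print(f"Cursor session (assumed): ${cursor_usd:.2f}")
    print(f"Claude Code per hour: ${claude_code_hourly_rate():.2f}")
    print(f"Cost ratio: {CLAUDE_CODE_SESSION_USD / cursor_usd:.1f}x")
```

Under that assumed request count the ratio lands between 4x and 5x; a different request count would shift it, so the result is only as good as the estimate of how many premium requests the session consumed.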

Conclusion on Value Proposition

  • Overall, Claude Code is significantly more expensive than Cursor under frequent use.

Claude Code vs. Cursor Agent: A Comparative Analysis

Cost Comparison

  • Claude Code is approximately four times more expensive than Cursor Agent, raising the question of whether its functionality justifies the price.

Autonomy and Trust in Coding Agents

  • Initially hesitant, the speaker found that Claude Code gained their trust over time, so they gradually allowed it to run commands autonomously without repeated permission requests.
  • In contrast, Cursor Agent lacked a mechanism for earning trust; it kept asking for permission even after prior approvals, and the only alternative was to switch on full autonomy. The speaker was reluctant to enable that mode.
  • The speaker hopes future updates to Cursor Agent will add incremental permissioning similar to Claude Code's model of earned trust.
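
A minimal sketch of what incremental permissioning could look like, assuming approvals are remembered per program name. `PermissionGate`, `request`, and the "once" / "always" / "deny" answers are hypothetical names used for illustration; this is not how either tool is actually implemented.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PermissionGate:
    """Remembers which programs the user has approved for unattended use."""

    approved: set[str] = field(default_factory=set)
    prompts: int = 0  # how many times the user was interrupted

    def request(self, command: str, ask: Callable[[str], str]) -> bool:
        """Return True if the command may run.

        `ask` is called only when the program has not been approved yet and
        must return "once", "always", or "deny" (hypothetical answers).
        """
        parts = command.split()
        program = parts[0] if parts else ""
        if program in self.approved:
            return True  # trust was earned earlier; no prompt needed
        self.prompts += 1
        answer = ask(command)
        if answer == "always":
            self.approved.add(program)
        return answer in ("once", "always")


if __name__ == "__main__":
    gate = PermissionGate()
    gate.request("rspec spec/models", lambda cmd: "always")  # prompts once
    gate.request("rspec spec/bots", lambda cmd: "deny")      # no prompt
    print(gate.prompts)
```

Keying approvals on the program name is a deliberately coarse choice for the sketch; a real tool would likely want finer-grained rules, for example treating `git status` and `git push` differently.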

Software Development Life Cycle Integration

  • Having handed over much of the code writing, the speaker leaned on test-driven development and preferred the agent that worked best with the test suite. Claude Code excelled in this area compared to Cursor Agent.
  • The workflow with Claude Code involved writing tests first, building the feature, confirming the tests passed, and committing the changes. The resulting commit messages were better crafted than those generated by Cursor Agent.

User Experience and Interaction

  • Because Claude Code runs in the terminal, executing commands and handling test output felt more natural than the less integrated experience with Cursor Agent.
  • The speaker appreciated Cursor's repository UI for browsing commits and branches, but noted that its auto-generated commit messages were less detailed than those produced by Claude Code.

Overall Impressions and Recommendations

  • Both coding agents completed their tasks on a stalled project, and their effectiveness surprised the speaker, who had been skeptical that LLMs could code without human oversight. The speaker noted that the agents were far from perfect and that this personal project is less complex than typical professional codebases.

Video description

Cursor Agent and Claude Code dropped within a week of each other. I wanted to find out which coding agent was better. So I gave them each three tasks on a non-trivial web app running in production and ranked each agent on UX, Code Quality, Cost, Autonomy, and Tests & Version Control.
tldw: I preferred Claude's CLI-based UX. I found Cursor's Agent controls a bit clunky: it wasn't always clear where and when I needed to click. For as much agency as I was giving to the agent, I also didn't feel like I needed 2/3 of the screen taken up by a file editor. A full-blown IDE might be overwrought as agents get good.
Both agents were powered by Claude 3.7 Sonnet, so a lot of the line-level code quality was the same. Claude Code had a more holistic understanding of my codebase, but Cursor's ability to search the web for documentation got it out of some jams.
Claude Code's metered pricing can get expensive! I racked up about $8 during ~90 minutes of hands-on-keyboard time. Not a lot relative to software development costs, but not trivial at scale either. Cursor includes a lot of Agent use with its $20/month subscription, and my rough estimation clocks Cursor Agent at 4-5x cheaper than Claude Code (probably because Cursor is using less context of your codebase). For developers already paying for Cursor, it'll be interesting to see if they're willing to pay extra for Claude Code.
Claude Code gains your trust through iterative permission granting. Cursor Agent has two modes: approve every change, or YOLO mode. By the end of my session with Claude Code, I was letting it do almost everything, because it had earned the right incrementally. I was never brave enough to click the YOLO button on Cursor, so my development experience felt like button mashing.
Claude Code worked better with my test suite, and I'm embracing tests more as I let LLMs write more of my code. Claude Code also wrote the prettiest commit messages I've ever pushed to git.
In the end, I preferred the experience of working with Claude Code, but I won't be giving up my AI-powered IDE any time soon, and I imagine the Cursor team will be rapidly improving on the developer experience. The bigger picture here: both agents successfully accomplished the tasks I gave them and got me unstuck on an abandoned project. I sort of can't believe we're here. I've been bullish on LLMs for code for a couple of years, but I've long thought that a Human in the Loop is what made LLM-powered coding viable. This experiment changed my mind. The agents were far from perfect, and my tasks and codebase are less complex than what developers often do at work, but to me, the trajectory is clear. This is where software development is headed. If you're a developer, even if you're skeptical, especially if you're skeptical, you owe it to yourself to at least spend a weekend working with these tools on a side project.