I Cut My AI Agent Costs 70% With One Change (Manifest)

Name: I Cut My AI Agent Costs 70% With One Change (Manifest)
Uploaded: 2026-05-03T12:00:30.000Z
Duration: 12 min 58 s

Introduction to Manifest and Cost Reduction

Overview of Manifest

The speaker shares their experience with Manifest, noting a 70% reduction in token costs while using the same agent and tasks due to better routing.

Many AI agents use expensive models for simple tasks like classification and summarization, leading to inflated bills.

Understanding the Problem

Agents often make thousands of calls, most of which are straightforward; however, they default to high-cost models for these basic operations.

Writing custom routing logic can complicate code with numerous if-else statements that may break easily with prompt changes.

How Manifest Works

Functionality of Manifest

Manifest acts as an intermediary between your agent and various models, scoring requests across 23 dimensions to route them efficiently.

It operates through a single endpoint without requiring rewrites or complex setups, allowing for seamless integration into existing workflows.

Real-Time Dashboard Features

The dashboard provides real-time updates on token usage, cost per agent, and budget tracking, potentially reducing costs by up to 70%.

Technical Insights into Routing

Mechanism of Action

Manifest functions as a controller that determines the best model for each request without calling another LLM, ensuring low latency (under 2 milliseconds).

It supports hundreds of models from various providers while maintaining efficient routing intelligence compared to other tools like Open Router or Light LLM.

Advantages and Limitations

Benefits of Using Manifest

Users benefit from significant savings by utilizing existing subscription plans rather than incurring additional token costs.

The dashboard allows users to monitor expenses across different models in real time without major rewrites needed for existing clients.

Considerations Before Adoption

While setup is relatively simple, it still requires managing API keys and wiring providers; some developers desire more SDK options.

Ideal for those running multiple agents making frequent small calls; not recommended for users seeking zero setup complexity.

Video description

If you’re building AI agents with tools like OpenClaw, LangChain, or Claude Code, there’s a good chance you’re massively overpaying for LLM usage without realizing it. In this video, we see how I reduced my AI agent costs by 70% with one simple change—no rewrites, no complex routing logic, just smarter model selection. I’ll show you exactly how Manifest, an open-source LLM router, sits between your agent and 600+ models to automatically route each request to the cheapest capable option in under 2 milliseconds. 🔗 Relevant Links Manifest Repo - https://github.com/mnfst/manifest Manifest Site - https://manifest.build/ ❤️ More about us Radically better observability stack: https://betterstack.com/ Written tutorials: https://betterstack.com/community/ Example projects: https://github.com/BetterStackHQ 📱 Socials Twitter: https://twitter.com/betterstackhq Instagram: https://www.instagram.com/betterstackhq/ TikTok: https://www.tiktok.com/@betterstack LinkedIn: https://www.linkedin.com/company/betterstack 📌 Chapters: 0:00 Reduce AI Agent Costs (70% Savings Explained) 0:37 Why AI Agents Are So Expensive (Hidden LLM Costs) 0:50 The Real Problem: Bad LLM Routing 1:45 Live Demo: 70% Cheaper AI Agent (Manifest Setup) 2:57 What Manifest Is (Open-Source LLM Router Explained) 3:18 Why Most LLM Calls Don’t Need GPT-4 / Claude Opus 3:53 Manifest vs OpenRouter vs LiteLLM (Best LLM Router?) 4:50 Self-Hosted vs Cloud LLM Routing (Privacy + Cost) 5:09 Pros & Cons of Manifest (Dev Feedback) 6:20 Is Manifest Worth It? (Who Should Use It) 6:45 Final Verdict: Cut Your AI Costs Without Rewriting Code