Minimax M2.5 - What Makes This Different!
How Much Does Running a Frontier AI Model Cost?
Overview of AI Model Costs
- The cost to run a frontier AI model like Claude Opus ranges from $15 to $20 per hour, depending on throughput.
- GPT-5 models are expected to have similar pricing, with the new Spark model potentially costing even more.
- Minimax's newly launched M2.5 model claims a significantly lower operational cost of just $1 per hour.
Performance and Evidence
- Despite skepticism about low-cost claims, third-party evidence supports the effectiveness of Minimax's M2.5 model for software engineering and office tasks.
- The video will explore insights from various organizations regarding this model and its potential as a competitor in the market.
Minimax's Development Journey
- Minimax is recognized as one of China's leading AI companies, focusing on improving models across multiple domains including LLMs, vision, and audio.
- Currently, the M2.5 model is not available as open weights; however, it has been shared with several companies for broader access.
Competitive Landscape
- Companies like Ollama are partnering with Minimax to serve the model through their cloud offerings, temporarily at no cost.
- The M2.5 model has two versions available: one operating at 50 tokens per second and another at 100 tokens per second.
Benchmarking Against Competitors
- Initial benchmarks suggest that the M2.5 model competes well against other models such as Opus 4.5 and Gemini 3 Pro.
- OpenHands has benchmarked this model as an "unlocked" version that could rival top proprietary coding assistants.
What Makes OpenHands' Findings Significant?
Insights from OpenHands
- OpenHands grew out of a Carnegie Mellon project (originally called OpenDevin) aimed at creating an open-source alternative to the Devin coding assistant.
- Their research into coding harnesses has yielded valuable insights into effective coding practices and tools.
Cost Comparison Analysis
- OpenHands highlights that while the M2.5 may lag behind Opus in performance, it offers substantial cost savings—over 90% cheaper than competitors.
- Detailed comparisons show how the M2.5 performs better economically for long-duration tasks compared to Claude models.
Detailed Pricing Breakdown
Token Pricing Structure
- For the faster M2.5 lightning model (100 tokens/second), costs are approximately $0.30 per million input tokens and $2.40 per million output tokens.
- The slower version (50 tokens/second), suitable for most applications, costs around $1.20 per million output tokens.
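As a sanity check, the per-token prices above can be converted into an hourly figure for the lightning tier. This is a rough sketch; the 10:1 input-to-output token ratio is an assumption for illustration, not a published figure.

```python
# Convert the quoted per-token prices into an hourly running cost for the
# lightning tier (100 tokens/second of sustained output generation).
INPUT_PRICE = 0.30 / 1_000_000    # $ per input token
OUTPUT_PRICE = 2.40 / 1_000_000   # $ per output token

output_tokens_per_hour = 100 * 3600            # 360,000 tokens
output_cost = output_tokens_per_hour * OUTPUT_PRICE

# Assumed ratio: agent loops typically re-read far more context than they
# emit; 10 input tokens per output token is a guess for illustration.
input_cost = 10 * output_tokens_per_hour * INPUT_PRICE

print(f"output ~ ${output_cost:.2f}/h, input ~ ${input_cost:.2f}/h")
```

The output side alone comes to about $0.86/hour, which is consistent with the roughly-$1-per-hour headline claim before input tokens are counted.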
Overall Cost Efficiency
- The operational costs indicate that running this model continuously can be significantly less expensive than using Claude Opus or GPT-5 series models.
Exploring the Impact of Cheap Tokens on AI Development
The Role of Affordable Tokens in AI Applications
- The accessibility of inexpensive tokens may enable broader use cases for AI, such as always-on agents that perform tasks without incurring high costs.
- Potential applications include coding, continuous integration/deployment, and deep research, prompting builders to consider new ways to implement these agents.
Insights into Model Improvement Rates
- A blog post reveals significant improvements in model performance since June last year, outpacing proprietary models from companies like Anthropic and Google.
- This rapid improvement raises questions about the methods used by Minimax to enhance their models effectively.
Understanding Model Size and Training Techniques
- Minimax's models are relatively small compared to other proprietary models; they do not rely solely on larger datasets or more extensive training.
- The key to their success lies in scaling reinforcement learning (RL), which is a common strategy among leading foundation model companies.
Reinforcement Learning Environments
- Many tasks within the company have been transformed into training environments for RL, contributing to improved performance in office-related tasks.
- There are already hundreds of thousands of such environments created with support from various startups focused on RL task development.
Challenges and Innovations in Asynchronous Training
- Scaling up RL training is challenging because agent rollouts are slow; asynchronous scheduling lets training proceed across many environments simultaneously rather than waiting on the slowest rollout.
- On-policy versus off-policy learning is explained with a cooking analogy: on-policy gives immediate feedback on the current recipe but is slower, while off-policy iterates faster at the risk of repeating mistakes before feedback arrives.
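The asynchronous scheduling idea can be sketched in a few lines. This is a toy illustration of the general pattern (collect rollouts without blocking, discard ones that have gone too stale), not Minimax's actual implementation; all names here are invented.

```python
from collections import deque

class AsyncScheduler:
    """Toy async rollout buffer: the learner never blocks on slow
    environments, but discards trajectories that are too off-policy."""

    def __init__(self, max_staleness=2):
        self.version = 0            # current policy-weights version
        self.buffer = deque()       # (version_when_generated, trajectory)
        self.max_staleness = max_staleness

    def submit(self, traj_version, trajectory):
        # rollout workers call this whenever an episode finishes
        self.buffer.append((traj_version, trajectory))

    def next_batch(self):
        # keep only trajectories close enough to the current policy
        batch = [t for v, t in self.buffer
                 if self.version - v <= self.max_staleness]
        self.buffer.clear()
        return batch

    def step(self):
        self.version += 1           # learner applied a gradient update

sched = AsyncScheduler(max_staleness=1)
sched.submit(0, "traj_a")
sched.step(); sched.step()          # policy is now 2 versions ahead
sched.submit(2, "traj_b")
print(sched.next_batch())           # only "traj_b" survives the filter
```

The `max_staleness` knob is the on-policy/off-policy dial from the cooking analogy: 0 means strictly fresh feedback, larger values trade freshness for throughput.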
Enhancements through a Tree-Structured Merging Strategy
- Minimax employs asynchronous scheduling and a tree-structured merging strategy that enables efficient weight updates during training across multiple tasks.
- This approach reportedly achieves a 40x speedup compared to traditional one-task-at-a-time training methods.
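Tree-structured merging can be illustrated with a toy example: per-task weight updates are combined pairwise, so N branches need only about log2(N) sequential merge rounds instead of N. The pairwise-averaging rule below is a stand-in; the actual merge operator is not public.

```python
def merge_pair(a, b):
    # stand-in merge rule: average the two branches' weight deltas
    return {k: (a[k] + b[k]) / 2 for k in a}

def tree_merge(branches):
    """Combine per-task updates pairwise in a binary tree:
    ~log2(N) sequential rounds instead of N sequential merges."""
    while len(branches) > 1:
        merged = [merge_pair(branches[i], branches[i + 1])
                  for i in range(0, len(branches) - 1, 2)]
        if len(branches) % 2:          # odd branch carries over
            merged.append(branches[-1])
        branches = merged
    return branches[0]

# four task branches collapse in two rounds instead of three merges
print(tree_merge([{"w": 0.0}, {"w": 4.0}, {"w": 8.0}, {"w": 4.0}]))
```

Because each round's merges are independent, they can run in parallel, which is where the claimed speedup over one-task-at-a-time training would come from.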
Understanding RL Training Techniques
The Secret of Training Models
- Companies like Google, OpenAI, and Anthropic have not disclosed their methods for solving complex training problems in reinforcement learning (RL), which remains a closely guarded secret.
- Early reasoning models from OpenAI, such as o1, introduced the idea of breaking tasks into steps for verification during RL rollouts, but offered little transparency about their reward mechanisms.
Innovations in Reinforcement Learning
- Minimax is utilizing an alternative approach to GRPO and PPO called CISPO, which appears crucial for maintaining stability in mixture of experts models during large-scale RL training.
- The extensive use of hundreds of thousands of internal RL environments contributes significantly to achieving superior performance in various skills.
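Based on how CISPO has been publicly described (clipping the importance-sampling weight itself, rather than clipping per-token updates as PPO does), the difference can be sketched for a single token. This is a scalar illustration built on that description; the constants and function names are assumptions, not Minimax's code.

```python
import math

def ppo_term(logp_new, logp_old, adv, eps=0.2):
    # PPO clipped surrogate: once the ratio leaves [1-eps, 1+eps], the
    # clipped branch wins and the token stops contributing a gradient
    r = math.exp(logp_new - logp_old)
    return min(r * adv, max(1 - eps, min(r, 1 + eps)) * adv)

def cispo_term(logp_new, logp_old, adv, eps_high=2.0):
    # CISPO-style: clip (and detach) the importance weight, then apply a
    # REINFORCE-style objective -- every token keeps a gradient through
    # logp_new, just with a bounded weight
    w = min(math.exp(logp_new - logp_old), eps_high)
    return w * adv * logp_new

# a token ~3.3x more likely under the new policy: PPO caps the surrogate
# at ratio 1.2, while CISPO caps the weight at 2.0 without zeroing the
# gradient signal for that token
print(ppo_term(-0.5, -1.7, 1.0), cispo_term(-0.5, -1.7, 1.0))
```

Keeping a (bounded) gradient on every token is the property usually cited for stabilizing large-scale RL on mixture-of-experts models, where hard clipping can silence rare but important expert activations.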
Competitive Landscape and Model Performance
- Anthropic's Cowork product reportedly benefits from numerous RL environments tailored for collaborative tasks, enhancing generalization across different coding challenges.
- Minimax's model is a 230 billion parameter mixture-of-experts design that activates only 10 billion parameters per token, showing impressive results despite being smaller than many proprietary models.
Pricing and Accessibility
- Minimax's subscription pricing may look high at first, but it is an annual fee rather than a monthly one, which makes it competitive with offerings like Claude Code.
- Monthly costs are significantly lower than those associated with other leading models such as Opus and Codex.
Future Developments and Testing
- Minimax's offerings are already available on platforms like OpenRouter, allowing users to experiment with different pricing tiers based on token usage.
- The company is headquartered in Singapore with data centers in the US. Separately, there are ongoing security concerns around agent tools like OpenClaw that warrant careful testing before public-facing use.
- Anticipation is building around upcoming releases from various companies, including DeepSeek, ahead of Chinese New Year; user feedback will be valuable for assessing these new developments.
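Since the model is listed on OpenRouter, a minimal request can go through OpenRouter's OpenAI-compatible chat-completions endpoint. The model slug `minimax/minimax-m2.5` below is a guess; check the OpenRouter catalog for the actual identifier.

```python
import json
import os
import urllib.request

def build_request(prompt: str) -> urllib.request.Request:
    payload = {
        # hypothetical slug -- verify the exact model ID on openrouter.ai
        "model": "minimax/minimax-m2.5",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        "https://openrouter.ai/api/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ.get('OPENROUTER_API_KEY', '')}",
            "Content-Type": "application/json",
        },
    )

# Uncomment to send the request (requires OPENROUTER_API_KEY to be set):
# with urllib.request.urlopen(build_request("Say hi")) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

Because the endpoint is OpenAI-compatible, the same payload shape works for comparing the lightning and standard tiers by swapping the model slug.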