Anthropic's New AI Just Changed Everything
Opus 4.6: The Latest AI Model from Anthropic
Introduction to Opus 4.6
- Anthropic has released Opus 4.6, touted as the most powerful AI model yet, just 72 hours after OpenAI launched Codex and GPT 5.3.
- The video is recorded at Clawcon, where the speaker met Peter, founder of Clawbot (now OpenClaw).
Key Features of Opus 4.6
- Opus 4.6 offers a 1 million token context window in beta, which lets Claude read and keep in context roughly ten novels' worth of text; the speaker notes that using the full window may slightly reduce intelligence.
- It scored 76% on long-context benchmarks, far ahead of its predecessor, Opus 4.5, which scored 18%.
- Claude identified over 500 zero-day vulnerabilities in open-source code without any special training, showcasing its impressive capabilities.
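
The "about ten novels" figure can be sanity-checked with some quick arithmetic. The sketch below is only a rough estimate: the words-per-token ratio and the novel length are assumed rules of thumb, not numbers from the video.

```python
# Assumed conversion factors (illustrative, not from the source):
WORDS_PER_TOKEN = 0.75     # common rule of thumb for English text
WORDS_PER_NOVEL = 80_000   # assumed length of a typical novel


def novels_in_context(context_tokens: int) -> float:
    """Estimate how many novels of the assumed length fit in a context window."""
    return context_tokens * WORDS_PER_TOKEN / WORDS_PER_NOVEL


print(f"{novels_in_context(1_000_000):.1f} novels")
```

With these assumptions a 1 million token window holds a little over nine novels, which is consistent with the rounded figure quoted above.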
Improvements Over Previous Versions
- Enhanced coding skills allow for better planning and reliability in larger codebases; improved debugging skills help catch mistakes autonomously.
- New features include adaptive thinking, which lets Claude adjust how much reasoning effort it spends based on task complexity, much as people adjust their behavior in different social contexts.
Performance Comparisons
- AI leads at Notion and GitHub are excited about the new model's capabilities; it outperformed GPT 5.2 by 144 Elo points on enterprise benchmarks.
- Anthropic is positioning itself against OpenAI by keeping ads out of its models while criticizing OpenAI's potential introduction of ads.
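
To put the 144-point Elo gap in perspective, the standard Elo formula converts a rating difference into an expected head-to-head score. The sketch below assumes the benchmark uses the conventional 400-point scale, which the video does not confirm; the function name is illustrative.

```python
def elo_expected_score(rating_gap: float) -> float:
    """Expected score of the higher-rated side under the standard Elo model.

    Assumes the conventional 400-point logistic scale.
    """
    return 1.0 / (1.0 + 10 ** (-rating_gap / 400.0))


# A 144-point lead corresponds to winning roughly 70% of pairwise comparisons.
print(round(elo_expected_score(144), 3))
```

Under that assumption, a 144-point lead means the higher-rated model would be preferred in roughly seven out of ten direct comparisons.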
Practical Applications and Testing
- Initial tests included a USD-to-euro conversion with adaptive thinking enabled; results varied between versions because of external factors such as Wi-Fi quality.
- A comparison of Python function generation for finding prime numbers showed that both versions performed well but with differences in speed and accuracy of responses.
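
The summary does not reproduce the code either model generated. For readers who want a concrete picture of the task, here is a minimal sketch of a straightforward prime finder using trial division; the function names are illustrative and this is not either model's actual output.

```python
import math


def is_prime(n: int) -> bool:
    """Return True if n is prime, testing odd divisors up to sqrt(n)."""
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3 are prime
    if n % 2 == 0:
        return False
    for divisor in range(3, math.isqrt(n) + 1, 2):
        if n % divisor == 0:
            return False
    return True


def primes_up_to(limit: int) -> list[int]:
    """Collect every prime in the range 2..limit inclusive."""
    return [n for n in range(2, limit + 1) if is_prime(n)]


print(primes_up_to(20))
```

Trial division is easy to read and verify, which makes it a reasonable baseline when comparing generated solutions for correctness.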
AGI Predictions and Model Comparisons
Evaluating Coding Solutions
- The speaker tests adaptive thinking on a simple coding task with both models (4.5 and 4.6), noting that model 4.6 produced a more efficient solution with O(n log n) time complexity.
- The speaker likens model 4.6 to a senior engineer and model 4.5 to a junior developer.
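
The video summary does not show the solution that earned the O(n log n) remark. One well-known efficient way to list primes is the Sieve of Eratosthenes, sketched below as an assumption about the kind of approach involved; it runs in O(n log log n) time, which falls within the quoted bound.

```python
def sieve_primes(limit: int) -> list[int]:
    """Return all primes <= limit using the Sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_candidate = [True] * (limit + 1)
    is_candidate[0] = is_candidate[1] = False
    p = 2
    while p * p <= limit:
        if is_candidate[p]:
            # Cross off multiples of p, starting at p*p because smaller
            # multiples were already removed by smaller primes.
            for multiple in range(p * p, limit + 1, p):
                is_candidate[multiple] = False
        p += 1
    return [n for n, flag in enumerate(is_candidate) if flag]


print(sieve_primes(30))
```

The sieve trades O(n) memory for speed: it never divides, and each composite is crossed off only by its prime factors.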
Inquiry into AGI Timelines
- The speaker prompts both models to predict when Artificial General Intelligence (AGI) will be achieved, seeking well-researched responses.
- Model 4.5 estimates AGI could occur within 5 to 20 years but acknowledges the uncertainty surrounding such predictions.
Diverse Perspectives on AGI Arrival
- Various experts' opinions are shared: Eric Schmidt predicts AGI in 3 to 5 years; Jensen Huang is cited for a statement from March 2024; others express skepticism about current capabilities.
- Model 4.6 groups the predictions into optimists (2026-2028), a middle ground (2028-2035), and skeptics (2035+), suggesting that the only real consensus is uncertainty.
Research Insights and Recommendations
- The speaker appreciates model 4.6's resources for further reading, including "80,000 Hours Overview AI" and "Metaculus AGI Forecast," highlighting its depth of research compared to model 4.5.
- Model 4.6 concludes with a prediction of AGI by the year 2032 but emphasizes the wide error bars associated with this estimate due to current limitations in AI systems.
Final Thoughts on Predictions
- The discussion reflects on the credibility of various predictions, noting that industry leaders often hype timelines while academics suggest longer horizons.
- Ultimately, both models provide close predictions (2030 vs. 2032), leading the speaker to favor model 4.6 for its comprehensive reasoning and resource suggestions regarding future developments in AGI technology.