El peligro de depender de Claude y cómo ahorrar tokens IA
Anthropic's Cloud Code Changes and Personal AI Agents
Anthropic's Cloud Code Plan Changes
- Recently, Anthropic quietly removed the Cloud Code from its Pro plan (€20), with no official announcement or communication to subscribers. This change was discovered by users and went viral.
- After 20 hours, Anthropic reversed the decision, claiming it was a test affecting only 2% of new users. Amol Abasare stated that current plans were not designed for high usage levels.
Transitioning Away from Cloud Code
- The speaker decided to stop using Cloud Code for most daily tasks, opting instead for free or lower-cost alternatives.
- Two personal agents will be used in parallel: one for basic tasks (like note processing) and another for more complex tasks, referred to as a "digital brain."
Task Categorization and Model Usage
- Tasks are categorized into three types:
- Basic tasks (e.g., sorting notes, summarizing audio): can be handled by decent models available for free.
- Intermediate tasks (e.g., drafting emails): require slightly more advanced models but do not necessitate expensive options like Opus.
- Advanced tasks (e.g., programming): best suited for complex models like Opus due to their unique capabilities.
Limitations of Current Models
- The current issue is that Cloud Code operates on a single model basis, limiting users to Anthropic’s more expensive models while cheaper alternatives exist that can perform similar functions.
- Many users are locked into these systems due to vendor lock-in strategies employed by companies like Google and OpenAI.
Strategy Moving Forward
- The goal is to match task complexity with appropriate AI models: use inexpensive models for basic tasks, intermediate ones for medium complexity, and advanced ones only when necessary.
- A new multimodal personal agent called Hermes Agent is introduced as an alternative. It operates independently across various platforms (Telegram, WhatsApp, Discord).
Features of Hermes Agent
- Hermes Agent is open-source and designed to work continuously without human intervention. It can be hosted on local servers or VPS setups.
- Unlike other services backed by venture capitalists that may change pricing unpredictably, Hermes aims to grow alongside user needs without financial constraints.
Three Key Differences of Hermes
Multimodal Capabilities
- Hermes is a true multimodal agent, capable of interacting with 600 different models, unlike Cloud Code which only communicates with Aneropic models.
- Users can mix and match tasks across various models (e.g., combining Yemini Flash with Opus) within the same workflow, leading to significant token savings.
Continuous Operation and Communication
- Hermes operates 24/7 and has native communication capabilities through platforms like Telegram, Discord, and Slack without needing additional plugins.
- It features a built-in scheduler that allows users to automate tasks such as processing emails at specific times.
Self-Improvement Mechanism
- A standout feature of Hermes is its wired self-improvement engine that evaluates performance after every 15 actions, allowing it to autonomously create or enhance skills based on user experience.
- This self-improvement mechanism applies principles of compound effect and knowledge management directly into the agent's functionality.
Comparison with Open Cloud
Misconceptions about Self-Learning Skills
- There are misconceptions in forums comparing Open Cloud's self-learning capabilities to those of Hermes; they are fundamentally different in execution.
- The distinction lies in how improvements are integrated: Open Cloud relies on programmed skills while Hermes' enhancements occur natively within its core code.
Analogy for Understanding Improvement
- An analogy is drawn between hydration methods: Hermes provides continuous improvement akin to intravenous fluid delivery rather than sporadic oral intake, emphasizing efficiency and effectiveness in learning.
Installation Process Overview
Quick Setup Instructions
- The installation process for Hermes is straightforward and can be completed in under ten minutes using terminal commands provided by the speaker.
- Users will clone the repository without requiring sudo access; a quick setup option is available for ease of use.
Model Configuration Options
- During setup, users can select from various AI providers; the flexibility allows for easy switching between models based on task requirements.
Using Open Router for AI Models
Simplified Access to Multiple Models
- Open Router serves as a single access point for over 200 AI models, streamlining configuration by requiring only one private key entry instead of multiple API keys.
Cost Management Features
- Users can set spending limits when creating their private keys on Open Router, ensuring budget control while utilizing free model options.
Open Routro Configuration and Model Selection
Setting Up Open Routro
- The speaker demonstrates how to create a configuration in Open Routro by copying the private key and pasting it into the terminal.
- They choose the Nvidia Nemotron 3 Super model, which has 120 billion parameters, emphasizing that it's a free version suitable for their needs.
Understanding Model Capabilities
- The speaker explains that models are segmented and used based on specific tasks, allowing for efficient resource management while keeping costs low.
- They highlight Nvidia's reputation as a leading manufacturer of AI processors and note that the Nemotron model is capable of handling basic to medium-level tasks effectively.
Importance of Model Selection
- A critical insight shared is that not all models are suitable for every skill; using more expensive models may be necessary for complex tasks.
- The speaker warns about potential future price increases for AI models, stressing the importance of understanding current pricing structures to make informed decisions.
Granularizing Models for Efficiency
- Emphasizing the need to granularize model selection, they provide an example where smaller models fail with detailed Spanish instructions due to insufficient capacity.
- They share personal experiences with various models, indicating that larger parameter counts (like 120 billion) are often required to execute certain skills correctly.
Executing Tasks with Hermes Agent
- The speaker runs a command in the terminal to check if the Nemotron model is configured correctly before proceeding with task execution.
- They explain how Nexus MD serves as a universal layer connecting their digital brain with external AI tools, facilitating seamless operation across different agents.
Task Execution and Results Verification
- After executing a task related to email idea generation, they verify its success by checking their notes in Obsidian.
- The results show accurate property creation and proper inference of task requirements, demonstrating effective integration between Hermes agent and Nvidia's model.
Conclusion on Model Suitability
- The speaker concludes that selecting appropriate skills will dictate which AI model is needed; thus creating test batteries can help map out suitable options.
- They emphasize that not all high-capacity models are necessary for simpler tasks since there are specialized models trained specifically for those skills.
Virtual Assistance and Digital Brain
The Role of Virtual Agents
- Discussion on the use of virtual agents that work 24/7, employing various personal agents and tools to handle tasks based on complexity.
- Suggestion for users to consider a cost-effective plan (Cloud Code Pro) for programming tasks, emphasizing its utility for specific complex tasks while delegating simpler ones.
Delegation and Efficiency
- Emphasis on the ability to delegate up to 70% of tasks using Hermes and Nanotron, highlighting efficiency in task management.
- Insight into the importance of control over digital systems; it’s not about how much you pay but who manages your digital brain.
Future-Proofing Your Digital Presence
- Acknowledgment that everyone will eventually have a digital brain; the focus should be on when this will happen rather than if.
- Introduction of resources such as newsletters, templates, and a community (Cerreo Digital + AI), aimed at building systems collaboratively rather than passively consuming information.