This AI Agent can actually self-evolve… just watch

Name: This AI Agent can actually self-evolve… just watch
Uploaded: 2026-04-27T22:25:34.000Z
Duration: 1 h 32 min 50 s
Description: Wanna learn how to code with AI? Go here: https://www.skool.com/new-society Follow me on Instagram - https://www.instagram.com/davidondrej1/ Follow me on Twitter - https://x.com/DavidOndrej1 Space Agent: https://space-agent.ai/login Github: https://github.com/agent0ai/space-agent Subscribe if you're serious about AI. Space Agent is a self-evolving agent and it's free and open source.

Introduction to Space Agent

Overview of Self-Updating Agents

The Space Agent is described as the first self-updating agent, capable of creating user interfaces (UIs) that other agents cannot.

Van, the creator of Space Agent and founder of Agent Zero, introduces the product and its unique capabilities.

Limitations of Traditional Agents

Traditional agents operate on limited layers, often restricted by their communication interfaces like WhatsApp or Telegram.

Even with rich web UIs, traditional agents can only perform limited actions without modifying backend processes.

Key Features of Space Agent

Client-Side Operation

Unlike traditional agents, Space Agent runs client-side in a JavaScript runtime, allowing it to mutate the displayed page directly.

Dynamic Communication

The agent can dynamically render various content types such as prices, charts, news, and even games based on user requests.

User Experience with Space Agent

Initial User Interaction

Users are presented with an empty space upon entering Space Agent for the first time; no setup is required for use.

Open Source Accessibility

Users can try out Space Agent for free via GitHub; it offers a guest account option requiring minimal setup.

Demonstration of Functionality

Example Interaction

A demonstration shows how users can ask about weather conditions in Prague and receive responses both in chat and through widgets.

Technical Insights

The agent's response mechanism involves generating raw text without additional formatting or tokens unless necessary.

Efficiency and Token Management

Token Efficiency Explained

The conversation loop is designed to be token-efficient; responses are generated using minimal tokens while maintaining clarity.

Simplified Response Mechanism

The agent responds in plain text when possible, avoiding unnecessary complexity in its output.

Token Efficiency and System Design

Token Management in AI Models

The discussion highlights the efficiency of generating widgets with minimal token usage, noting that it only took about 280 tokens to create one.

As AI models become more expensive, particularly with GBD 5.5 Pro costing $180 per million output tokens, there is a growing interest in "tokconomics" to optimize costs associated with running inference.

Prompt Optimization Techniques

The speaker describes an automated research process using Codeex to iterate on system prompts, creating three versions: conservative, medium change, and wild variations for testing.

Continuous optimization led to version number 250 of the prompt, focusing on token efficiency and reliability through iterative testing.

Dynamic User Interface Creation

A demonstration of a notes app showcases how multiple widgets can cooperate within a dynamic space that allows resizing and rearranging elements easily.

The implications suggest that users can design custom UIs without traditional coding or server hosting; everything runs in the browser using JavaScript.

Future Operating Systems Concept

The vision presented indicates a future where operating systems may not require traditional apps or interfaces since agents could manage tasks like email sorting directly through user commands.

Notes App Functionality Overview

The notes app developed supports various features such as folder management, visual editing, markdown view conversion, copy-pasting images/attachments—functioning effectively as a complete notes application.

Local Data Management and Security

Browser Limitations and Backend Solutions

Browsers have security limitations preventing file management on host systems; thus, a thin Node.js backend manages user permissions and file storage securely.

Native App Advantages

Users are encouraged to download the native app for local data storage without server communication; this setup enhances privacy by keeping all files on the user's machine.

Dashboard Creation Using Space Agent

Surveillance Dashboard Example

A surveillance dashboard was created quickly by selecting public IP cameras; the agent generated this interface in just minutes after initial camera selection.

Time Investment for Development

Building the notes app required more time due to detailed instructions (approximately 10 minutes), while simpler dashboards were constructed rapidly (1–2 minutes).

Space Agent: Orchestrating Dynamic Systems

Overview of Space Agent's Capabilities

Space Agent excels in orchestrating multiple systems, allowing for dynamic UI management and control over various agents from a single dashboard.

The speaker connects Space Agent to an instance of Agent Zero on their local machine, creating a chat interface that utilizes the Agent Zero API for communication.

Functionality and Performance

The integration features an embedded browser window that allows full control over the Agent Zero web UI without relying on APIs or shortcuts.

Initial prototypes are typically fast due to their simplicity; however, this implementation maintained speed even after adding functionalities.

Technical Implementation

The agent transcribes web pages into a simplified format, enabling it to interact with elements like buttons and input fields by referencing them with unique identifiers.

Commands sent by the agent (e.g., "click button 25") yield updated transcriptions of the page state, facilitating seamless interaction.

Challenges and Solutions

Web crawling presents challenges due to complex structures like iframes and shadow DOMs; solutions involved injecting hacks into the browser renderer for better accessibility.

A combination of AI assistance and manual adjustments led to an effective transcription method that optimizes interactive element recognition while minimizing token usage.

Token Management Strategy

Historical states are not retained in memory once they become outdated, preventing unnecessary token consumption during interactions.

By managing transient data effectively, the system maintains efficiency in token usage while ensuring relevant context is always available for ongoing conversations.

Future Enhancements

There is a desire to demonstrate live updates within custom UI elements, showcasing how easily modifications can be made using Space Agent's capabilities.

The speaker suggests starting with pre-built interfaces or conducting research-driven modifications as part of future demonstrations.

Research Harness Development and UI Interaction

Overview of the Research Harness

The UI is designed to facilitate communication between the agent and itself, allowing for a two-way interaction. A research harness has been created and tested on Agent Zero and GPD 5.5.

Custom instructions can be set for each space, enabling tailored interactions based on specific requirements or characteristics.

Testing the Research Functionality

During testing with Claude Mythos, the agent receives messages from the UI and updates a research template in real-time, gathering notes and creating a markup file based on predefined templates.

The flexibility of creating custom UIs allows users to design interfaces beyond standard options like WhatsApp or terminal interfaces.

Enhancements in Output Formatting

Suggestions were made to improve formatting by adding colors or converting outputs into PDF format for better presentation.

The chat interface offers different display modes; compact mode is preferred for clarity when multiple elements are present.

Exporting Features and Persistence

A new feature was added to export research as a PDF, which enhances usability since this functionality was not originally included in the space's features.

The persistence of data is highlighted; refreshing the page does not lose any changes made by the agent due to how widgets are created and stored.

Creating New UI Elements

Adding Interactive Components

There’s an intention to create various UI elements from scratch, starting with a Kanban board styled similarly to Trello, emphasizing colorful designs.

Developers are encouraged to make components extensible while considering performance; faster models than Opus 4.7 can be utilized during development.

Performance Insights

Initial development used GPT 5.4 mini for quick responses but found that Gemma 4 performed comparably well locally despite being less reliable than other models like Claude Sonet or Claude Opus.

Additional Features Exploration

Following the creation of a Kanban board, there’s interest in integrating diverse functionalities such as stock price charts for companies like Nvidia, Apple, and Google (Alphabet), exploring API limitations for data retrieval.

Agent Capabilities and Limitations

Overview of Agent Functionality

The training data for agents is crucial as it influences future iterations, similar to a tech stack that evolves over time.

Running the agent in a browser prevents API blocks typically encountered by VPS-hosted agents, allowing smoother operation.

Limitations of Space Agent

Space Agent is not suitable for tasks requiring operating system access or low-level operations like installing Linux packages.

It cannot perform background jobs since it operates on the front end; turning off the computer halts its functionality.

Future developments may address background job capabilities, but current focus remains on user feedback and needs.

Potential and User Engagement

Developers believe they have only tapped into about 5% of the agent's potential, indicating room for exploration and innovation.

The concept of a "new paradigm" suggests that understanding and utilizing these agents will take time for both developers and users.

User Experience with Spaces

Functionality of Spaces

Users can create multiple persistent spaces for different projects without incurring additional costs; each space is lightweight in storage.

Development Insights

A step sequencer was developed quickly (around 20 minutes), showcasing rapid prototyping capabilities within the platform.

Accessibility of AI Tools

Engaging Non-Tech Users

The platform serves as an entry point for non-tech individuals to experience AI's power through simple English commands without needing technical knowledge or installations.

Local Inference Requirements

To utilize local inference, users need a powerful GPU; models from Hugging Face can be integrated easily into the system.

Understanding Space Agent and Its Development

Concept of Compatible Models

The discussion begins with the mention of a community repository containing numerous models, highlighting that currently, only the Jim 4 model is relevant for writing JavaScript and understanding system properties.

Philosophy Behind Space Agent

The creator shares their thought process in developing Space Agent, emphasizing a desire to connect various ideas and create an accessible agent similar to Agent Zero.

Self-Modifying Capabilities

A key feature of Space Agent is its ability to fully control itself, including modifying its own modules dynamically based on user needs. This allows for personalized development environments.

Client-Side Processing Advantages

Unlike other systems like Agent Zero that rely on backend processing (Python), Space Agent operates entirely on the client side, resulting in minimal server load (2% CPU usage).

Browser as the Chosen Platform

The decision to use a browser was driven by the need for cross-platform compatibility without installation requirements. Browsers provide a standardized runtime across all operating systems.

Challenges with Browser Limitations

While browsers offer great security features (sandboxing), they also present challenges such as cross-origin issues which require workarounds during development.

Development Process Using AI Tools

Utilizing Codeex for Development

The developer reveals that they did not write any code manually; instead, they utilized Codeex to build the entire codebase efficiently.

Importance of Documentation

An essential part of the workflow involves creating an agents.mmd documentation framework that helps maintain context and prevents redundancy in coding tasks.

Comprehensive Documentation Strategy

Every aspect being developed is documented simultaneously within agents.mmd files, allowing the AI agent to understand core principles and functionalities effectively.

This structured approach ensures clarity in both development and future adaptations of software projects.

Agent Development Insights

Current State of Agent Development

The speaker emphasizes the need for all preferences and decisions to be documented in markdown files within the repository, highlighting ongoing challenges with agent decision-making.

Despite improvements over the past couple of years, there are still numerous instances where manual corrections are necessary. Documentation helps agents avoid repeating mistakes.

Recent advancements like GBD 5.5 and DeepSc V4 are expected to significantly reduce existing issues within 3 to 6 months.

Acceleration in Development

The development timeline for Agent Zero took three months initially, while Space Agent was developed in about four weeks, showcasing a marked acceleration in productivity.

The MVP (Minimum Viable Product) for Space Agent was completed even faster, indicating a substantial increase in efficiency—estimated at a speed-up factor of 12 to 20 times compared to earlier projects.

Coding Skills and AI Integration

The speaker argues that coding remains an essential skill despite advancements in AI tools; they can quickly identify inefficiencies and guide AI towards better solutions.

Continuous monitoring of AI outputs is crucial; the speaker actively checks multiple code execution windows for errors during development.

Tool Preferences: Codex vs. Cloud Code

Preference for Codex over Cloud Code is discussed; Codex performs better with hierarchical documentation and consistent updates compared to Cloud Code's tendency to overlook documentation changes.

An anecdote illustrates frustrations with Cloud Code's inefficiency when fixing minor issues, contrasting it with Codex’s superior performance.

Cost Efficiency and Value Proposition

A comparison between OpenAI's pricing plans reveals significant differences in value; OpenAI offers more inference capabilities at similar costs compared to competitors.

While both tools have their strengths, Codex is preferred for complex tasks requiring significant refactoring or debugging due to its effectiveness.

Relationship Between Agent Zero and Space Agent

Both products are developed by the same team but serve different use cases; users may utilize both simultaneously depending on their needs.

Space Agent is designed as a more dynamic tool that interacts closely with users, whereas Agent Zero operates autonomously without user interruptions.

How to Use Space Agent for Interactive Browsing

Interactive Browser Features

The browser is fully interactive, allowing users to navigate websites like Google and solve captchas directly.

Users can set up an anonymous browsing experience through the Space Agent platform by visiting space-agent.ai and entering their API key.

Setting Up the API Key

New users can find the option to set their LLM API key on the Space Agent website, which defaults to OpenRouter for easy access to various AI models.

Users are guided on how to create an API key from OpenRouter, ensuring a seamless integration with Space Agent.

User Experience and Model Performance

The initial user experience is designed to be impressive and tolerant, especially for new users interacting with the "Big Bang" space.

Sonet is highlighted as a balanced model in terms of cost and performance, with discussions about potential future releases (Sonet 4.7 or 5).

Unique Features of Space Agent

A notable feature of Space Agent is its time travel capability, allowing users to revert changes made within their directories easily.

Changes are tracked automatically in a Git repository format, enabling users to undo actions without needing external assistance.

Admin Mode Functionality

An admin mode provides a persistent interface that allows users to fix issues even if they break the main application interface completely.

This mode enables interaction with agents and management of installed modules while maintaining access to time travel features.

Troubleshooting and Support

If an agent breaks itself due to critical dependency issues, manual intervention may be required; however, Space Agent simplifies this process significantly.

Beginners often face challenges during debugging but can utilize built-in tools within Space Agent for easier troubleshooting.

Getting Started with Space Agent

Users are encouraged to explore both the GitHub repository and the official website for comprehensive information about setup and usage.

The easiest way for newcomers is trying out the live version available on space-agent.ai before downloading any local applications.