Generative UI: Specs, Patterns, and the Protocols Behind Them (MCP Apps, A2UI, AG-UI)
Building AI Applications with Generative UI
Introduction to the Session
- The session focuses on building AI applications, particularly generative UI, in the context of Copilot Kit and the AGUI protocol.
- Tyler introduces himself as a founding engineer at Copilot Kit and is joined by Nathan, a senior developer who engages with the community to understand user needs.
Overview of Copilot Kit
- Copilot Kit is described as an open-source framework for creating AI copilots (user-facing agentic applications) with significant adoption (30,000 GitHub stars).
- The framework serves clients ranging from Fortune 100 companies to startups and is used in production by 10% of Fortune 500 companies.
- The agenda includes discussions on the state of Agentic UI, patterns in agentic UI, and future trends leading into 2026.
State of Agentic UI
- The evolution from serving models to serving agents highlights how applications are becoming more agentic; Claude Co-work is cited as a model for current developments.
- Agentic applications fall into two categories: SaaS copilots (e.g., the HubSpot copilot), which help users navigate complex SaaS applications, and productivity copilots (e.g., Cursor).
Challenges in Building Agentic Applications
- A key challenge is that agentic applications break traditional request-response paradigms due to their long-running tasks which can take seconds or minutes.
- Streaming content makes these tasks appear faster while still taking substantial time; this necessitates new approaches when developing software around agents.
- Developers must manage both structured and unstructured data inputs/outputs, including voice files and text content.
Complexity of Non-deterministic UIs
- Non-deterministic UIs created by agents introduce additional complexity; developers need to handle unpredictable outputs generated by these systems effectively.
- AGUI aims to address these complexities associated with building agent-based systems.
Agent User Interaction Protocol Overview
Introduction to Agent User Interaction Protocol
- The Agent User Interaction Protocol connects AI backends (such as LangGraph, CrewAI, and ADK) to front-end frameworks such as React, Angular, and Flutter/Dart.
- An amendment will be made to the slide later; the discussion covers MCP for tools, context, and resources, along with the recent addition of MCP apps, which can send iframes for embedding content.
Components of the Protocol
- A2A handles agent-to-agent communication and can serve as a transport for A2UI, a declarative JSON specification optimized for generating structured components.
- AGUI serves as a standardization layer for building applications such as chat interfaces, defining 16 events emitted by agents.
Streaming Support and Technical Details
- Currently, A2UI does not support streaming in the traditional sense: it sends one initial event followed by subsequent events all at once, although those events can be delivered over SSE or WebSockets.
- The ecosystem has seen adoption from major frameworks, including Google's ADK and Microsoft's stack (Agent Framework and Semantic Kernel), contributing to AGUI's growth.
Adoption Metrics and Generative UI Context
- AGUI has reached approximately 2 million weekly downloads since launch and supports various agent stacks while enabling generative UI implementations through MCP apps.
- The focus on generative UI highlights its importance in current discussions around application development using protocols like AGUI.
Event Structure in AGUI
- AGUI consists of 16 standard events facilitating agent-user communication; streamed content, for example, is denoted by delta events indicating an ongoing message transmission.
- Streaming works through state deltas, where document generations are transmitted over time; messages are treated as a special type of state within this architecture.
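The delta-based streaming described above can be sketched in a few lines. This is an illustrative sketch, not the official SDK: the TEXT_MESSAGE_* event names follow AGUI's event family, but the exact payload fields here are simplified assumptions.

```typescript
// Illustrative sketch of folding AGUI-style TEXT_MESSAGE_* delta events
// into a complete message on the client. Payload fields are simplified.
type AgUiEvent =
  | { type: "TEXT_MESSAGE_START"; messageId: string; role: "assistant" }
  | { type: "TEXT_MESSAGE_CONTENT"; messageId: string; delta: string }
  | { type: "TEXT_MESSAGE_END"; messageId: string };

// Concatenate the delta chunks of a streamed message as they arrive.
function collectMessage(events: AgUiEvent[]): string {
  let text = "";
  for (const ev of events) {
    if (ev.type === "TEXT_MESSAGE_CONTENT") text += ev.delta;
  }
  return text;
}

const stream: AgUiEvent[] = [
  { type: "TEXT_MESSAGE_START", messageId: "m1", role: "assistant" },
  { type: "TEXT_MESSAGE_CONTENT", messageId: "m1", delta: "Hello" },
  { type: "TEXT_MESSAGE_CONTENT", messageId: "m1", delta: ", world" },
  { type: "TEXT_MESSAGE_END", messageId: "m1" },
];

const fullText = collectMessage(stream); // "Hello, world"
```

A real client would render partial text as each delta arrives rather than waiting for the END event; the fold above just shows the accumulation model.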
Transport Mechanisms
- Various transport methods are available, including streamable HTTP and WebSocket implementations, with flexibility for other protocols such as WebRTC.
- AGUI operates on a client-server architecture similar to MCP, where servers represent agents and clients represent front-end applications.
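As a concrete transport sketch, an event can be serialized as a server-sent-events (SSE) frame. AGUI is transport-agnostic, so this is one option among several; the field names are illustrative.

```typescript
// Sketch: serializing an AGUI-style event as an SSE frame.
// SSE frames are "data: <payload>" lines terminated by a blank line.
function toSseFrame(event: { type: string; [key: string]: unknown }): string {
  return `data: ${JSON.stringify(event)}\n\n`;
}

const frame = toSseFrame({ type: "TEXT_MESSAGE_CONTENT", delta: "Hi" });
```

The client side simply strips the `data: ` prefix from each frame and `JSON.parse`s the payload back into an event object.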
Demonstration Segment
- A demonstration is set up to showcase the practical application of these concepts; confirmation of screen visibility indicates readiness for further exploration.
Introduction to Generative UI
Overview of the Demo
- The speaker introduces a demo showcasing a new template being developed for partner frameworks, emphasizing an overhaul of existing templates.
- A button on the left allows real-time viewing of events as they occur, demonstrating interaction with the agent by sending greetings.
- The demo connects to a LangGraph agent, illustrating how text-message events are processed and displayed in real time.
Understanding Generative UI
- The speaker transitions to discussing generative UI, specifically static generative UI, which generates components based on documentation searches.
- Static generative UI is defined as rendering pre-built developer-controlled components (like cards or widgets) in response to agent activity.
- An example is provided with a weather component that displays various weather metrics such as humidity and wind levels.
Types of Generative UI
Categories of Generative UI
- The speaker outlines three types of generative UI: static generative UI, open generative UI, and declarative generative UI.
- Static generative UI maps data generated by large language models (LLMs) onto existing front-end components for structured content display.
Implementation Details
- The process involves mapping tool calls or agent state to specific front-end components using AGUI events for seamless integration.
- A code example illustrates how a simple agent can be created and used within this framework, highlighting compatibility layers with partner frameworks.
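The mapping described above can be sketched as a small registry from tool-call names to developer-owned render functions. Copilot Kit's real API uses React hooks; the plain functions and the `get_weather` arguments here are assumptions made to keep the example self-contained.

```typescript
// Static generative UI, sketched: tool-call names map to pre-built,
// developer-controlled render functions. All names are illustrative.
type Renderer = (args: Record<string, unknown>) => string;

const registry = new Map<string, Renderer>();

// Register a pre-built weather card for the get_weather tool.
registry.set(
  "get_weather",
  (args) => `WeatherCard(temp=${args.temperature}, humidity=${args.humidity})`
);

// On a tool-call event, look up the matching component; unknown tools
// fall back to rendering the raw arguments as text.
function renderToolCall(name: string, args: Record<string, unknown>): string {
  const render = registry.get(name);
  return render ? render(args) : JSON.stringify(args);
}

const card = renderToolCall("get_weather", { temperature: 21, humidity: 40 });
```

This is the "simple mental model" in code form: the LLM picks the tool and fills the arguments, but the developer owns every pixel of the component that renders them.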
Static Generative UI in Action
Code Demonstration
- A minimal code example shows how the get weather tool returns static data like temperature and humidity through an instantiated agent.
- The right side displays actual code used in the demo; all code will be made available on GitHub for reference.
User Experience Insights
- The co-pilot chat experience integrates various components while maintaining pixel-perfect design through static mappings.
- Further exploration shows how headless UIs can provide direct programmatic access to agents via hooks like `useAgent`.
Generative UI: Pros and Cons
Understanding the Basics of Generative UI
- The content rendering process involves mapping tool calls to components, where state and structured data are also linked to specific components. This creates a straightforward mental model for developers.
- Developers have significant control over how data generated by the LLM is rendered, which enhances UI customization and satisfies design requirements.
- A major drawback, however, is coupling: because teams are often split into front-end and back-end roles, changes in one area can break functionality in the other.
Challenges of Coupling in Generative UI
- High coupling occurs when front-end teams must align closely with back-end definitions of tool calls, leading to potential issues if API contracts change.
- As applications grow with more tools and UIs, maintaining component mappings becomes increasingly complex, necessitating additional development efforts for each new feature or tool.
Exploring Open-ended Generative UI
- The next evolution in generative UI is open-ended systems like MCP apps that provide flexible content generation through HTML or embedded iframes.
- MCP apps allow developers to insert dynamic content directly into applications but come with risks associated with direct browser manipulation.
Real-world Applications of MCP Apps
- MCP apps have evolved from previous iterations (MCP-UI and Anthropic's work), now enabling integration with platforms like Zillow for enhanced user experiences.
- A demo showcases embedding an iframe from an MCP server that generates 3D rendered content dynamically based on user input, illustrating practical use cases for real-time interaction.
Iterating on User Experience with 3D Content
- Users can interactively request modifications to 3D objects (e.g., generating a cube), demonstrating the flexibility of generative UIs in creating tailored visual experiences.
- The ability to iterate on designs—like transforming a simple cube into a Rubik's cube—highlights the potential for engaging user interactions within applications powered by generative technologies.
Low Coupling in Frontend and Backend Development
Challenges of Low Coupling
- The low coupling between frontend and backend allows for flexible rendering, but it introduces unpredictability in UI presentation.
- Variability in UI can lead to inconsistent user experiences; for example, a Rubik's cube demo appears differently each time it's rendered.
- Security concerns arise from embedding iframes, which can expose applications to XSS vulnerabilities. MCP apps address these issues through their protocol.
Web First Experience
- The current approach is primarily web-focused, with native client adaptations being more complex due to the iframe method.
- Embedding interactive widgets like Zillow or HubSpot into applications showcases the potential of this technology.
Declarative UI: A New Approach
Understanding Declarative UI
- Declarative UI represents a semi-open set of constrained UIs driven by a declarative specification, allowing some flexibility while maintaining structure.
- This approach enables agents to generate JSON that renders specific components based on user requests (e.g., credit card application).
Characteristics of Declarative UI
- It balances between fully open-ended generative UIs and static generative UIs by allowing defined parameters for rendering.
- Typically described using cards and widgets with recurring elements such as forms or charts.
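The constrained-but-flexible balance described above can be sketched as a tiny declarative renderer: the agent emits JSON, and the client walks it, instantiating only components it recognizes. The component names (`card`, `text`, `button`) are made up for illustration and are not the actual A2UI spec.

```typescript
// Sketch of declarative generative UI: a constrained JSON spec rendered
// by walking a known component vocabulary. Names are hypothetical.
type Spec =
  | { component: "card"; title: string; children: Spec[] }
  | { component: "text"; value: string }
  | { component: "button"; label: string; action: string };

function render(spec: Spec): string {
  switch (spec.component) {
    case "card":
      return `[card "${spec.title}" ${spec.children.map(render).join(" ")}]`;
    case "text":
      return `[text "${spec.value}"]`;
    case "button":
      return `[button "${spec.label}"]`;
  }
}

// A spec the agent might emit for a credit card application request.
const generated: Spec = {
  component: "card",
  title: "Credit card application",
  children: [
    { component: "text", value: "Apply in two minutes" },
    { component: "button", label: "Start", action: "start_application" },
  ],
};

const markup = render(generated);
```

Because the vocabulary is closed, the agent can be creative in composition while the client stays in full control of what each component looks like.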
A2UI: The Transport Mechanism
Introduction to A2UI
- AGUI serves as a transport mechanism for A2UI's declarative content, facilitating communication between agents and frontends.
- As an early partner on A2UI, the focus is on delivering specifications that allow dynamic component generation.
Practical Application of A2UI
- The composer project demonstrates how A2UI works by generating components from user-defined specifications without knowing their final appearance upfront.
Component Generation Example
Interactive Component Creation
- An example involves taking a flight-card component and modifying it interactively using Copilot Kit within the widget editor.
- The process includes asking follow-up questions to refine the generated component based on user input.
Understanding Declarative UI and Its Applications
Exploring JSON Content Generation
- The speaker discusses the generation of content in JSON format, highlighting a declarative approach to creating user interfaces. They humorously note the LLM's attempt at humor with phrases like "Adventure awaits" and "Buckle up for the fun ride."
- An arbitrary request to make content funnier leads to the addition of a button labeled "make me laugh," demonstrating how iterative design can enhance user interaction.
Component Rendering and Interaction
- Components are generated on the fly: the left side shows what is being created while the right side displays the rendering; Copilot Kit facilitates this dynamic interaction.
- The AI Composer lets users pre-build components that agents can render, emphasizing flexibility in composing various A2UI elements into applications.
States and User Input
- The speaker explains different states that an agent can enter when generating UI components, showcasing how user input or agent-generated content fills out these states.
- A practical example illustrates how an agent could generate a form for submitting articles based on user requests, integrating seamlessly with platforms like Slack through backend connections.
Extensibility and Control Over UI Design
- Users have control over individual component designs such as inputs and buttons while maintaining pixel-perfect accuracy. This separation allows for flexible yet structured UI development.
- The discussion highlights potential interactions across multiple MCP servers, enabling diverse ways for users to engage with generated UIs.
Pros and Cons of Declarative UI
- A question arises about whether mini-applications like calculators can be generated; while possible, this requires collaboration with agents due to limits on pre-built logic within the A2UI spec.
- Advantages include low coupling between front-end and back-end teams, allowing precise control over component rendering while accommodating common use cases effectively across various frameworks (React, Angular, etc.).
- However, constraints exist; fully custom UI needs may not be met due to spec limitations. For instance, specific features like calculator displays might require custom solutions outside standard specifications.
Understanding AGUI and A2UI in Generative UI
Variability in User Interfaces
- The structure of user interfaces (UIs) can vary unpredictably, affecting how tasks are performed. This variability is not just about aesthetics but also the underlying framework.
- AGUI supports generative UIs by embracing, extending, and innovating within this landscape, particularly through its integration with Copilot Kit.
Support for New Technologies
- AGUI facilitates embedding A2A (agent-to-agent) meshes into any agent framework, allowing seamless interaction across different protocols.
- With the evolution of MCP apps, day-one support is provided for rendering these applications on the front end via AGUI.
State Management in Agents
- The concept of agent state is crucial for creating responsive UIs; it allows systems like ChatGPT or Claude Co-work tools to maintain context while generating content.
- Agents serve as abstractions over LLMs (Large Language Models), enabling them to manage state effectively—transforming stateless interactions into stateful ones.
Structuring Data and State Sharing
- Messages and structured data represent two types of state that agents can handle. This capability allows developers to create applications that share state between front-end and back-end seamlessly.
- AGUI introduces a first-class concept of state management, where state snapshots can be transported over the network.
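The snapshot-plus-delta model can be sketched as follows. The delta format below is a simplified JSON-Patch-like subset assumed for illustration; the actual wire format may differ.

```typescript
// Sketch of AGUI-style shared state: a snapshot establishes full state,
// then delta events patch it incrementally as the agent works.
type State = { document: string };
type Delta = { op: "replace"; path: "/document"; value: string };

function applyDeltas(snapshot: State, deltas: Delta[]): State {
  const next = { ...snapshot };
  for (const d of deltas) {
    // Only the top-level "/document" path is handled in this sketch.
    if (d.path === "/document") next.document = d.value;
  }
  return next;
}

// Initial snapshot, then an incremental update as the agent writes.
const snapshot: State = { document: "" };
const updated = applyDeltas(snapshot, [
  { op: "replace", path: "/document", value: "Draft v1" },
]);
```

Sending deltas instead of full snapshots keeps long-running document generations cheap to transmit: each event carries only what changed.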
Interactivity Through State Updates
- Users can send input along with new states back to their agents, enhancing interactivity. For instance, editing generated documents reflects changes understood by the agent.
- Copilot Kit manages security configurations while letting users modify state and messages easily through simple interactions like button clicks.
Agent Collaboration and Future Directions
Bidirectional Interaction with Agents
- The agent's awareness of changes allows for a powerful interaction model where users can edit documents while the agent generates content.
- The process involves using `agent.state.document` to manage document state, enabling seamless updates as users make edits.
- This collaborative approach is foundational to the Copilot framework, emphasizing user-agent cooperation in content creation.
Interactive Features: App Mode
- The introduction of "app mode" allows users to create sticky notes within the interface, enhancing interactive capabilities.
- Users can modify sticky notes, demonstrating how bidirectional state syncing keeps the agent informed of changes in real-time.
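The bidirectional sync behind the sticky-notes demo can be sketched as follows: the user edits shared state in the UI, and the client sends the updated state back so the agent sees the change on its next turn. The event shape and field names are illustrative, not the actual spec.

```typescript
// Sketch of bidirectional state sync: a UI edit produces an outbound
// state update that the frontend hands back to the agent runtime.
type Note = { id: string; text: string };

function editNote(notes: Note[], id: string, text: string) {
  // Apply the user's edit locally.
  const updated = notes.map((n) => (n.id === id ? { ...n, text } : n));
  // Outbound payload keeping the agent informed of the change.
  return { type: "STATE_UPDATE", state: { notes: updated } };
}

const outbound = editNote([{ id: "n1", text: "todo" }], "n1", "done");
```

Because state flows both ways, the agent's next generation can build on edits the user made directly in the interface rather than overwriting them.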
Agent Steering and Human-in-the-Loop
- Agent steering is highlighted as a critical feature that enables users to guide agents through interactive interfaces rather than solely relying on code inputs.
- The concept of "human in the loop" ensures that user feedback is integrated into agent operations, improving outcomes through active participation.
Self-improving Layers and Auto RLHF
- The development of self-improving layers (auto RLHF) leverages user feedback for training agents, enhancing their performance over time.
- Successful models like Cursor and Windsurf utilize automatic human feedback to refine their coding capabilities through reinforcement learning techniques.
Future Perspectives on Generative UI
- Agents are positioned to transform UI/UX paradigms by facilitating generative interactions that adapt based on user collaboration.
- Generative UI patterns will allow applications to dynamically generate experiences tailored to user needs, marking a significant evolution in application design.
- Emphasis on auto RLHF suggests ongoing exploration into scalable learning mechanisms for agents, indicating a trend towards more intelligent systems.
Generative UI and Its Implications
Overview of Generative UI
- Generative UI is being discussed across various platforms, including blogs and YouTube videos from major companies like Google and Microsoft, highlighting the importance of gathering diverse perspectives on the topic.
Security Concerns in Generative UI
- There are ongoing discussions about security in generative UI applications; a link was shared to the repository where these conversations take place, specifically regarding MCP apps.
Declarative Content and AUI
- The question arises of who determines the UI components for declarative content: declarative UI is the concept, while A2UI is an implementation of it.
- Various generative and declarative UI specifications exist, with JSON optimized for generating components; the choice of components ultimately depends on user preference.
Maturity of A2UI
- A2UI launched in December last year and is approaching its 1.0 version; regular interactions occur with their team, including contributions to the React renderer for building applications.
Future Developments in A2UI
- Upcoming developments include a React implementation expected by Q1 2026, with current efforts focused on achieving maturity before reaching version 1.0.
Clarification on Component Types
- A clarification was made regarding A2UI components versus AGUI components, due to the similar naming; A2UI focuses on transporting JSON content that is rendered into components.
Functionality of Generated Components
- Generated components can indeed be used to fill actual forms and interact with back-end systems, allowing agents to map button clicks to actions effectively.
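The button-to-action mapping mentioned above can be sketched like this. The action name `submit_form` and the handler wiring are hypothetical, introduced only to illustrate the round trip from a generated component back to the agent.

```typescript
// Sketch: a click on a generated component's button is mapped back to a
// tool invocation on the agent.
type ActionHandler = (payload: Record<string, unknown>) => string;

const actions = new Map<string, ActionHandler>([
  ["submit_form", (payload) => `tool:submit_form(${JSON.stringify(payload)})`],
]);

// Called by the renderer when a button whose spec carries an "action"
// field is clicked.
function onUserAction(name: string, payload: Record<string, unknown>): string {
  const handler = actions.get(name);
  if (!handler) throw new Error(`unknown action: ${name}`);
  return handler(payload);
}

const call = onUserAction("submit_form", { email: "a@b.co" });
```

The key point is that the generated UI is not inert: form fields and buttons resolve to real backend calls the agent can act on.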
Composing Complex Views
- Users can compose complex views from component parts within A2UI specifications that support various formatting options for different contexts such as chat or dashboards.
Audio Elements in Applications
- Applications can incorporate audio elements like recording features alongside other media types (images/videos), enhancing interactivity within generated UIs.
Latency Overhead Considerations
- The latency introduced by AGUI/Copilot Kit is minimal, since it serves as a translation and standardization layer rather than a hosted solution, facilitating compatibility across frameworks like Angular and React.
AGUI Deployment and Performance Insights
Performance Metrics
- The performance overhead of AGUI is described as "pretty minimal," with the latency added to backend communication being less than a millisecond.
Production Readiness
- AGUI is confirmed to be ready for large-scale deployment and is already in production use. This includes various integrations and collaborations with numerous users.
Ecosystem and Collaborations
- The AGUI ecosystem has expanded significantly beyond its core maintainers, involving many contributors who integrate different UI frameworks regularly. This growth reflects the increasing adoption of AGUI across diverse sectors.
Industry Applications
- AGUI is utilized in highly regulated industries such as finance, healthcare (Medicare), and cybersecurity, showcasing its versatility and reliability in critical applications.
Community Engagement
- Users are encouraged to reach out via LinkedIn, email, Slack, or Discord for any unanswered questions regarding AGUI. Links to community resources can be found on their documentation page or website.