AI Agent Dev School 1 pt 2

Running AI Models Locally and Understanding Embeddings

Challenges of Running AI Models Locally

  • The speaker discusses the difficulties of running the Eliza stack locally on a MacBook, noting performance issues due to hardware limitations.
  • To streamline the process, they plan to copy environment variables and configurations from an old project into a new repository for easier access.

Understanding Embeddings in AI

  • The speaker introduces embeddings, explaining that they convert words into vectors, allowing for comparisons based on proximity in a numerical space.
  • They reference resources like Google’s documentation and SL FL examples as valuable for understanding how embeddings work and their applications in AI.
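
To make "proximity in a numerical space" concrete, here is a minimal sketch that compares toy vectors with cosine similarity. The three-dimensional vectors are invented for illustration; a real embedding model would supply much longer ones.

```typescript
// Toy 3-dimensional "embeddings". The numbers are invented for illustration;
// real embedding models output hundreds or thousands of dimensions.
const pizza = [0.9, 0.8, 0.1];
const hotDog = [0.8, 0.9, 0.2];
const laptop = [0.1, 0.2, 0.9];

// Cosine similarity: close to 1 when two vectors point the same way,
// close to 0 when they are unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const pizzaVsHotDog = cosineSimilarity(pizza, hotDog);
const pizzaVsLaptop = cosineSimilarity(pizza, laptop);
console.log(pizzaVsHotDog.toFixed(2), pizzaVsLaptop.toFixed(2));
```

With these made-up numbers the two foods score higher against each other than either does against the laptop, which is the kind of comparison an embedding search relies on.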

Learning Resources for Deep Understanding

  • The speaker recommends Andrej Karpathy's "Neural Networks: Zero to Hero" playlist as an excellent resource for learning about embeddings and tokenizers from scratch.
  • The playlist is praised for its comprehensive approach, enabling learners to build models like GPT-1 by the end of the series.

Visualizing Word Relationships through Vectors

  • The concept of placing complex sentences into low-dimensional spaces is discussed, with examples illustrating how words can be related (e.g., pizza vs. hot dog).
  • The speaker explains how relationships between concepts (like king vs. queen) can be represented along gender vectors within high-dimensional spaces.
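
The king and queen relationship can be sketched as vector arithmetic. The two-dimensional vectors below are invented so that one axis loosely stands for royalty and the other for gender; real embeddings do not have such cleanly labeled axes.

```typescript
// Invented 2-dimensional vectors: [royalty, masculinity].
type Vec = number[];
const king: Vec = [0.9, 0.9];
const queen: Vec = [0.9, 0.1];
const man: Vec = [0.1, 0.9];
const woman: Vec = [0.1, 0.1];
const vocab: Record<string, Vec> = { king, queen, man, woman };

function addVec(a: Vec, b: Vec): Vec {
  return a.map((x, i) => x + b[i]);
}

function subVec(a: Vec, b: Vec): Vec {
  return a.map((x, i) => x - b[i]);
}

// Euclidean distance between two vectors.
function distanceBetween(a: Vec, b: Vec): number {
  return Math.sqrt(subVec(a, b).reduce((sum, x) => sum + x * x, 0));
}

// Return the vocabulary word whose vector is closest to the target.
function nearestWord(target: Vec, candidates: Record<string, Vec>): string {
  let best = "";
  let bestDistance = Infinity;
  for (const word of Object.keys(candidates)) {
    const d = distanceBetween(target, candidates[word]);
    if (d < bestDistance) {
      best = word;
      bestDistance = d;
    }
  }
  return best;
}

// Moving "king" along the man -> woman direction lands next to "queen".
const genderOffset = subVec(woman, man);
const shiftedKing = addVec(king, genderOffset);
console.log(nearestWord(shiftedKing, vocab)); // prints "queen"
```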

Tools and Techniques for Exploring Embeddings

  • TensorFlow's Embedding Projector is introduced as a tool for visualizing word relationships in three dimensions, projected down from high-dimensional vectors.

Discord Integration and Character Creation

Initial Setup and Challenges

  • The speaker discusses issues with the Claude model, indicating it is stuck in a loop while generating text responses.
  • There are initialization problems with the Llama service, which should work but appears messy; the speaker expresses concern about its functionality.
  • The conversation shifts to personal goals, emphasizing the importance of being strong and driven.

Character Development

  • The speaker suggests creating a new character for fun, asking for ideas from the audience. Kevin Hart and Snoop Dogg are mentioned as potential characters.
  • Snoop Dogg is characterized as a successful rapper and actor who is cool, chill, and funny; he has interests in web3 projects like Sandbox.

Crafting Character Lore

  • A humorous dialogue is created around Snoop Dogg's persona, including references to smoking blunts as part of his character traits.
  • The speaker clarifies that character bios do not need full sentences but can be listed as lore or attributes about Snoop Dogg.

Prompt Engineering Insights

  • There's an emphasis on prompt engineering's role in enhancing character development; suggestions include how to effectively inject bio information into prompts.
  • The discussion includes various ways to describe Snoop Dogg’s skills in music-making and lifestyle choices.

Integrating Discord Client

  • A humorous scenario is presented about having a personal employee whose job is to roll blunts for Snoop Dogg; this leads into technical discussions about integrating characters into Discord.
  • Steps are outlined for enabling Discord as a client for interaction with the created character, highlighting challenges faced during setup.

Technical Implementation Details

  • The process of adding Discord as a client involves navigating through code complexities; there’s mention of needing to import specific clients like Twitter and Telegram too.

Creating a Discord Bot from Scratch

Setting Up the Discord Application

  • The speaker encounters an error due to missing Discord configuration and decides to create a new application in the Discord Developer Portal, naming it "Snoop Dogg."
  • The process involves obtaining an application ID and resetting the bot token, which requires multi-factor authentication (MFA).
  • After signing in, the speaker copies the bot token securely to avoid exposure.
  • The bot token is stored as an environment variable for security; users are instructed to place their own bot token here.
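
A minimal sketch of reading the Discord settings from the environment instead of hard-coding them. The variable names DISCORD_APPLICATION_ID and DISCORD_API_TOKEN are assumptions here; check the project's example .env file for the names it actually expects.

```typescript
// Hypothetical config shape and variable names, for illustration only.
interface DiscordConfig {
  applicationId: string;
  botToken: string;
}

type Env = Record<string, string | undefined>;

function loadDiscordConfig(env: Env): DiscordConfig {
  const applicationId = env["DISCORD_APPLICATION_ID"];
  const botToken = env["DISCORD_API_TOKEN"];
  if (!applicationId || !botToken) {
    // Fail early with a clear message instead of a confusing login error later.
    throw new Error(
      "Missing Discord configuration: set DISCORD_APPLICATION_ID and DISCORD_API_TOKEN"
    );
  }
  return { applicationId, botToken };
}

// In Node.js you would pass process.env; a literal object keeps the sketch runnable anywhere.
const discordConfig = loadDiscordConfig({
  DISCORD_APPLICATION_ID: "123456789012345678",
  DISCORD_API_TOKEN: "replace-with-your-bot-token",
});
// Log only the non-secret part of the configuration.
console.log(`Loaded config for application ${discordConfig.applicationId}`);
```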

Inviting the Bot to a Server

  • To invite the bot, users generate an invite link (the speaker points to discord.js for this), which requires the application's client ID.
  • The speaker demonstrates how to paste the application ID into the invite link and successfully invites Snoop Dogg into a server.
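
The invite link is an ordinary URL with the application ID in the query string. A minimal sketch of building one is shown below; the ID is a placeholder and the helper name buildInviteUrl is made up for this example.

```typescript
// Build a bot invite URL for a Discord application.
// `permissions` is the permissions integer; 0 requests no extra permissions.
function buildInviteUrl(applicationId: string, permissions: number = 0): string {
  const query = [
    `client_id=${encodeURIComponent(applicationId)}`,
    `permissions=${permissions}`,
    "scope=bot",
  ].join("&");
  return `https://discord.com/api/oauth2/authorize?${query}`;
}

// Placeholder application ID; use the one from the Developer Portal.
console.log(buildInviteUrl("123456789012345678"));
```

Opening the printed URL in a browser lets a server administrator pick which server the bot should join.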

Configuring Bot Permissions

  • After the invite, Snoop Dogg isn't running yet. The speaker assigns roles like "agent" and "verified" but notes that the bot won't respond until it is running.
  • On the first attempt to run Snoop Dogg, a "disallowed intents" error appears; this is common when the required privileged intents haven't been enabled for the application.

Adjusting Gateway Intents

  • Users must enable specific privileged gateway intents: server members, message content, and presence updates for proper functionality.
  • After saving these settings, Snoop Dogg responds correctly in Discord. However, there are minor bugs such as double responses that need fixing.
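
For reference, gateway intents are a bitfield that the bot sends when it connects. The sketch below combines the three privileged intents using the bit positions listed in Discord's Gateway documentation; the helper names are made up for this example, and a library such as discord.js would normally build this value for you.

```typescript
// Bit positions of the three privileged intents, per Discord's Gateway docs.
const PrivilegedIntents = {
  GUILD_MEMBERS: 1 << 1, // "Server Members Intent" in the portal
  GUILD_PRESENCES: 1 << 8, // "Presence Intent"
  MESSAGE_CONTENT: 1 << 15, // "Message Content Intent"
} as const;

// OR the individual bits together into one integer.
function combineIntents(intents: number[]): number {
  return intents.reduce((acc, bit) => acc | bit, 0);
}

// Check whether a bitfield includes a given intent.
function hasIntent(bitfield: number, intent: number): boolean {
  return (bitfield & intent) === intent;
}

const requestedIntents = combineIntents([
  PrivilegedIntents.GUILD_MEMBERS,
  PrivilegedIntents.GUILD_PRESENCES,
  PrivilegedIntents.MESSAGE_CONTENT,
]);
console.log(requestedIntents); // prints 33026
```

If the bot requests a privileged bit that has not been toggled on in the Developer Portal, the connection is rejected, which matches the "disallowed intents" error described above.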

Final Thoughts on Development Process

  • The speaker summarizes the development journey: cloning repositories, installing dependencies, creating character files, launching from command line, and getting interaction within Discord.
  • Future plans include adding Twitter integration by simply entering credentials into the system.

Transitioning to Advanced Topics

  • A transition is made towards advanced topics for further exploration in future sessions.
  • Discussion about "coj Journey," a startup idea aimed at connecting people through AI interactions based on shared interests or professions.

Understanding the Eliza Framework and Its Applications

Introduction to Agent Features

  • Discussion on enabling features for agents to connect users with others they might like, though it's not currently a priority.

Technical Aspects of Character Files

  • Explanation of using typed versions in TypeScript for character files, emphasizing the importance of preventing mistakes through enumeration.
  • Clarification that the project is unrelated to any cryptocurrency, focusing instead on the Eliza GitHub repository.
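
A minimal sketch of why a typed character file helps. The enum members and field names below are hypothetical and are not the framework's actual type definitions; the point is that a misspelled client or provider fails at compile time instead of at runtime.

```typescript
// Hypothetical enums and fields, for illustration only.
enum ClientName {
  Discord = "discord",
  Twitter = "twitter",
  Telegram = "telegram",
}

enum ModelProvider {
  OpenAI = "openai",
  Anthropic = "anthropic",
  LlamaLocal = "llama_local",
}

interface CharacterFile {
  name: string;
  clients: ClientName[];
  modelProvider: ModelProvider;
  // Short phrases are fine; entries do not need to be full sentences.
  bio: string[];
  lore: string[];
}

const snoop: CharacterFile = {
  name: "Snoop Dogg",
  clients: [ClientName.Discord],
  modelProvider: ModelProvider.Anthropic,
  bio: ["successful rapper and actor", "cool, chill, and funny"],
  lore: ["interested in web3 projects like Sandbox"],
};

// Writing ClientName.Discrod above would be a compile error, not a silent bug.
console.log(`${snoop.name} runs on: ${snoop.clients.join(", ")}`);
```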

Historical Context of Eliza

  • Overview of the original ELIZA, a chatbot program created by Joseph Weizenbaum at MIT in 1966 and the namesake of the framework, marking a significant development in human-computer interaction.
  • Description of how Eliza engages users with broad questions, leading to what is known as the "Eliza effect," where users attribute more humanity to AI than it possesses.

The Eliza Effect and Its Implications

  • Insight into the psychological phenomenon where people perceive simple language models as more intelligent or human-like than they are, highlighting its relevance in current AI discussions.

Future Developments and Enhancements

  • Mention of plans to integrate a news API into agents for real-time updates based on user interactions.
  • Outline of upcoming advanced features for agent development, including sentiment analysis towards individuals based on their communication style.

Permissions and Knowledge Integration

  • Clarification that no OAuth permissions are needed beyond three privileged intents for Discord integration.
  • Discussion about adding knowledge sources (like books or bios) into agents via text arrays and existing character file repositories.

Practical Implementation Tips

  • Instructions on extracting data from various formats (e.g., tweets or documents) into character files using command-line tools like npx.

Adding Bots to Discord Servers

  • Guidance provided on how to add bots to personal Discord servers by referring to resources available at discord.js.

Comparison with Other Frameworks

  • Distinction made between Truth Terminal and Eliza; Truth Terminal is not open-source and operates differently by looping conversations without human oversight.

How to Add Knowledge in TypeScript Files

Adding a New Knowledge Field

  • To add knowledge in a TypeScript file, create a new field called "knowledge" and make it an array. This allows for optional inclusion of knowledge within the character object.
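
A minimal sketch of an optional knowledge array. The interface is a simplified stand-in for the real character type, and the sample strings are placeholders.

```typescript
// Simplified, hypothetical character shape with an optional knowledge field.
interface CharacterFile {
  name: string;
  bio: string[];
  knowledge?: string[]; // optional: characters without it still type-check
}

// Read knowledge safely whether or not the field is present.
function knowledgeFor(character: CharacterFile): string[] {
  return character.knowledge ?? [];
}

const withKnowledge: CharacterFile = {
  name: "Snoop Dogg",
  bio: ["successful rapper and actor"],
  knowledge: ["Placeholder fact taken from a book or bio."],
};

const withoutKnowledge: CharacterFile = {
  name: "Plain Agent",
  bio: ["no extra knowledge"],
};

console.log(knowledgeFor(withKnowledge).length, knowledgeFor(withoutKnowledge).length); // prints 1 0
```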

Security Recommendations

  • For security purposes, it's recommended to store sensitive information like Twitter API secrets in environment variables rather than directly in character files. This is especially important if managing multiple character files with different accounts.

System Directives and Hallucination Prevention

  • Implementing system directives can help prevent AI from generating false information (hallucinations). If the AI doesn't know something, it should explicitly state that instead of fabricating an answer.

Weighting Contextual Information

  • The order of context provided to language models affects their output; items at the bottom of the context are weighted more heavily than those at the top. This influences how responses are generated based on previous inputs.
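
One way to act on this is to sort context sections so the most important text lands last. The sketch below is an assumption about how one might do it, not the framework's implementation; the section names, weights, and the "say you don't know" directive are illustrative.

```typescript
interface ContextSection {
  label: string;
  text: string;
  weight: number; // higher weight = placed closer to the bottom of the prompt
}

// Sort ascending by weight so the heaviest section ends up last.
function assembleContext(sections: ContextSection[]): string {
  return sections
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => a.weight - b.weight)
    .map((s) => `# ${s.label}\n${s.text}`)
    .join("\n\n");
}

const assembledPrompt = assembleContext([
  {
    label: "Instructions",
    text: "If you do not know the answer, say so instead of guessing.",
    weight: 3,
  },
  { label: "Bio", text: "cool, chill, and funny", weight: 1 },
  { label: "Recent messages", text: "user: what's the news today?", weight: 2 },
]);
console.log(assembledPrompt);
```

Here the directive against fabricating answers is given the highest weight so that it sits at the bottom of the assembled context.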

Targeting Accounts with Specific Criteria

  • While direct targeting of accounts based on criteria (like location or keywords) isn't currently possible, modifications can be made to the Twitter client package to enable this feature. Caution is advised when engaging with users who may not appreciate AI interactions.

Understanding Documentation and Package Architecture

Importance of Documentation

  • Comprehensive documentation is crucial for understanding code functionality. It requires significant effort to produce quality documentation that aids user comprehension.

Package Structure Overview

  • The core package architecture includes various components such as database adapters and plugins, all integrated into a central core package. This structure supports flexibility across different environments where TypeScript can be utilized.

Integration Capabilities

Database Adapters and Client Connections

Understanding Database Adapters

  • The discussion begins with an explanation of database adapters, specifically focusing on SQLite as the default file-based adapter that operates locally using a .sqlite file.
  • A potential issue is raised regarding changes in bot personality not reflecting due to a SQLite database that may be storing old data, indicating the need to locate this database for troubleshooting.
  • The speaker expresses confusion about the location of the SQLite database, humorously questioning whether it could be hidden or misplaced within the project structure.

Locating and Managing Databases

  • After some deliberation, it's suggested that the SQLite database might be located in a 'data' folder, which is deemed reasonable for storage.
  • Deleting the db.sqlite file from the agent folder will erase all memories associated with agents; caution is advised before performing this action.

Alternative Database Options

  • Besides SQLite, other databases like PostgreSQL and Supabase are mentioned as alternatives. SQL.js can also run in-browser as a version of SQLite.

Client Types and Their Functions

Overview of Client Functionality

  • Clients serve as connections between agents and external sources such as social media networks. Various types of clients are introduced:
  • Direct Client: Provides REST API access for app communication.
  • Discord Client: Handles voice and messaging interactions.
  • Telegram Client: Manages commands and image handling.
  • Twitter Client: Facilitates posting and responding to users.

Autonomous Loop with Auto Client

  • The Auto client runs autonomously in a loop, allowing for analysis and auto trading based on set parameters. This feature is still being developed further.

Integrating Front-End Applications

Connecting Eliza to Front-End Repositories

  • A recommendation is made to connect Eliza agents to front-end applications using an open-source project called Live Video Chat.
  • An example interaction showcases how users can engage with Eliza through this integration, highlighting its fun potential while referencing humorous scenarios.

Addressing Twitter Limitations

Enhancements Needed for Twitter Integration

  • A request arises regarding fixing word limit issues on Twitter posts exceeding 280 characters. It’s noted that different APIs handle long-form posts versus standard tweets.
  • Discussion includes recent updates by Boshi related to media uploads but acknowledges limitations concerning long-form post capabilities.

Future Development Plans

Upcoming Features and Examples

Core Concepts of Agent Runtime

Overview of Agent Functionality

  • The discussion introduces the concept of using a provider, evaluator, and action to retrieve information, such as news updates.
  • Each agent operates within its own runtime environment that maintains the current state and character details, including model providers.

Creating and Managing Agents

  • A new constant called runtime is defined to create an agent's runtime, which can handle multiple agents simultaneously.
  • The process includes creating agents with specific plugins based on conditions like wallet public keys or private keys.
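
A rough sketch of the pattern described above: build a runtime per character and attach plugins only when the relevant settings exist. The class, the plugin objects, and the WALLET_PUBLIC_KEY setting name are simplified assumptions, not the framework's actual API.

```typescript
// Hypothetical, simplified shapes for illustration.
interface AgentPlugin {
  name: string;
}

interface CharacterFile {
  name: string;
  modelProvider: string;
}

class AgentRuntime {
  readonly character: CharacterFile;
  readonly plugins: AgentPlugin[];

  constructor(options: { character: CharacterFile; plugins: AgentPlugin[] }) {
    this.character = options.character;
    this.plugins = options.plugins;
  }
}

type Settings = Record<string, string | undefined>;

const bootstrapPlugin: AgentPlugin = { name: "bootstrap" };
const solanaPlugin: AgentPlugin = { name: "solana" };

function createAgent(character: CharacterFile, settings: Settings): AgentRuntime {
  const plugins: AgentPlugin[] = [bootstrapPlugin];
  // Only load wallet-related functionality when a wallet key is configured.
  if (settings["WALLET_PUBLIC_KEY"]) {
    plugins.push(solanaPlugin);
  }
  return new AgentRuntime({ character, plugins });
}

const runtime = createAgent(
  { name: "Snoop Dogg", modelProvider: "anthropic" },
  { WALLET_PUBLIC_KEY: undefined }
);
console.log(runtime.plugins.map((p) => p.name)); // prints [ 'bootstrap' ]
```

Calling createAgent once per character yields one runtime per agent, which is how several agents can run side by side in the same process.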

Plugin Integration

  • Plugins serve as wrappers for various functionalities; individual primitives can also be passed directly without needing a plugin.
  • Actions can be customized by importing them from different sources (e.g., Solana), allowing flexibility in how agents perform tasks.

Understanding Providers, Actions, and Evaluators

Key Abstractions for Agent Functionality

  • Providers, actions, and evaluators are essential abstractions that enhance what an agent can do across different clients.
  • Clients are necessary but secondary; the focus is on the capabilities of the agent rather than client-specific features.

Contextual Information Handling

  • The context refers to all information available during interactions; it should adapt dynamically based on relevant inputs from providers.
  • An example illustrates how an agent could import news while considering formatting differences for various platforms like Discord or Twitter.

Role of Providers in Information Gathering

Input Mechanisms for Agents

  • Providers supply information into the agent's context; they act as input sources that inform decision-making processes.
  • Future support will include streaming audio and images as part of the context provided to agents.

Dynamic Context Assembly

  • The assembly of context is dynamic rather than static; relevant providers contribute data based on situational needs (e.g., computer terminal provider).
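
A minimal sketch of dynamic context assembly: each provider decides whether it is relevant to the incoming message, and only relevant providers contribute text. The interface and the relevance checks are assumptions made for this example, and the provider outputs are placeholders.

```typescript
// Hypothetical shapes; real providers would typically be asynchronous.
interface ChatMessage {
  text: string;
}

interface ContextProvider {
  name: string;
  isRelevant(message: ChatMessage): boolean;
  get(message: ChatMessage): string;
}

function mentions(message: ChatMessage, keyword: string): boolean {
  return message.text.toLowerCase().indexOf(keyword) !== -1;
}

const newsProvider: ContextProvider = {
  name: "news",
  isRelevant: (m) => mentions(m, "news"),
  get: () => "Latest headlines: (placeholder result from a news API)",
};

const terminalProvider: ContextProvider = {
  name: "terminal",
  isRelevant: (m) => mentions(m, "terminal"),
  get: () => "Terminal state: (placeholder working directory and last command)",
};

// Ask every provider, keep only the relevant ones, and join their output.
function buildContext(message: ChatMessage, providers: ContextProvider[]): string {
  return providers
    .filter((p) => p.isRelevant(message))
    .map((p) => `[${p.name}] ${p.get(message)}`)
    .join("\n");
}

const providers = [newsProvider, terminalProvider];
console.log(buildContext({ text: "Any news today?" }, providers));
```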

Understanding Agent Actions and Evaluators

Overview of Agent Actions

  • The agent can perform various actions, such as ordering a pizza or minting a coin. While pizza ordering is not currently implemented, it illustrates the potential for future actions.
  • An action could involve calling an API to retrieve information, demonstrating how agents can interact with external data sources.
  • Actions are defined as tasks that an agent executes; examples include continuing a conversation or ignoring user input.
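
A minimal sketch of actions as named tasks that the agent can pick from. The interface is a simplified assumption, and the two sample actions mirror the "continue" and "ignore" examples mentioned above.

```typescript
// Hypothetical, simplified action shape.
interface AgentAction {
  name: string;
  description: string;
  // Returns the reply text, or null when the agent should stay silent.
  handler(userText: string): string | null;
}

const continueAction: AgentAction = {
  name: "CONTINUE",
  description: "Keep the conversation going with a follow-up.",
  handler: (userText) => `Tell me more about "${userText}".`,
};

const ignoreAction: AgentAction = {
  name: "IGNORE",
  description: "Say nothing in response to the message.",
  handler: () => null,
};

// Look up the action the model asked for and run it.
function runAction(
  actions: AgentAction[],
  requestedName: string,
  userText: string
): string | null {
  for (const action of actions) {
    if (action.name === requestedName) {
      return action.handler(userText);
    }
  }
  // Unknown names are treated as "do nothing" rather than crashing.
  return null;
}

const registeredActions = [continueAction, ignoreAction];
console.log(runAction(registeredActions, "CONTINUE", "pizza"));
```

An action that calls an external API would follow the same shape, with the handler doing the request and returning the formatted result.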

Role of Evaluators

  • Evaluators allow the agent to reflect on its current scenario by extracting relevant information from ongoing interactions.
  • A practical example includes using a trust score database to analyze marketplace conversations and determine high-conviction bets based on user inputs.

Plugin Development and Integration

  • Discussion about creating plugins for functionalities like perplexity providers, emphasizing the importance of integrating multiple components within a single plugin framework.
  • The Solana plugin serves as an example where actions, evaluators, and providers are effectively utilized in code.

Exploring Provider Functions

Wallet Provider Functionality

  • The wallet provider retrieves current asset information from users' wallets, requiring an API connection for functionality.
  • It processes various token balances and converts values between currencies (e.g., USD to SOL), providing comprehensive portfolio insights.

Information Injection Process

  • Providers serve as standard abstractions that bring information into the system; they require runtime context to function correctly.
  • If a public key exists for the wallet, it establishes connections to RPC hosts and formats portfolio data into strings for further use.
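
The formatting step can be sketched as follows. In this example the balances and prices are passed in as plain data with invented values; the real provider would fetch them over an API or RPC connection, and the function name is hypothetical.

```typescript
interface TokenBalance {
  symbol: string;
  amount: number;
  priceUsd: number;
}

// Returns a text summary for the agent's context, or null when no wallet is configured.
function formatPortfolio(
  walletPublicKey: string | undefined,
  balances: TokenBalance[],
  solPriceUsd: number
): string | null {
  if (!walletPublicKey) {
    return null;
  }
  const lines = balances.map(
    (b) => `${b.symbol}: ${b.amount} ($${(b.amount * b.priceUsd).toFixed(2)})`
  );
  const totalUsd = balances.reduce((sum, b) => sum + b.amount * b.priceUsd, 0);
  const totalSol = totalUsd / solPriceUsd; // express the USD total in SOL
  return [
    `Wallet ${walletPublicKey}`,
    ...lines,
    `Total: $${totalUsd.toFixed(2)} (${totalSol.toFixed(2)} SOL)`,
  ].join("\n");
}

// Invented balances and prices, for illustration only.
const sampleBalances: TokenBalance[] = [
  { symbol: "SOL", amount: 2, priceUsd: 100 },
  { symbol: "USDC", amount: 50, priceUsd: 1 },
];
console.log(formatPortfolio("placeholder-public-key", sampleBalances, 100));
```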

Trust Score Evaluation Mechanism

Trust Score Calculation

  • Trust score providers calculate scores based on virtual transactions derived from conversations, influencing how agents perceive user reliability.
  • This mechanism allows agents to respond appropriately based on trust levels established through virtual buys in discussions.

Technical Considerations for Performance

  • Users inquire about hardware performance; running models efficiently may depend on choosing appropriate systems (e.g., Mac vs. Nvidia GPUs).

Understanding the Template and Action Mechanism

Overview of Templates

  • Discussion begins about potential bugs in the system, particularly regarding repeated words.
  • Introduction to a recommendation template that utilizes macros to incorporate context from recent messages or recommendations.
  • Explanation of how Discord message handlers work within templates, showcasing examples of actions where knowledge is injected into the system.
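
A minimal sketch of macro substitution in a template. The double-brace syntax and the key names recentMessages and recentRecommendations are assumptions for this example.

```typescript
// Replace every {{key}} with the matching value from state.
// Keys that are missing from state become empty strings.
function fillTemplate(template: string, state: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match: string, key: string) =>
    Object.prototype.hasOwnProperty.call(state, key) ? state[key] : ""
  );
}

const recommendationTemplate = [
  "Recent messages:",
  "{{recentMessages}}",
  "",
  "Known recommendations:",
  "{{recentRecommendations}}",
  "",
  "Extract any new buy or sell recommendations from the conversation.",
].join("\n");

const filledTemplate = fillTemplate(recommendationTemplate, {
  recentMessages: "alice: I think this token is a strong buy",
  recentRecommendations: "(none yet)",
});
console.log(filledTemplate);
```

The filled string is what gets sent to the model; providers and attachments would be injected the same way, as additional keys in the state object.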

Composition and Functionality

  • Description of how providers are injected into templates, including attachments like PDFs which provide summaries and names.
  • The process involves filling out templates with extensive information, leading to a completion phase where all data is synthesized.

Evaluation Process

  • Evaluators run at the end to extract key details such as recommendations, ticker symbols, contract addresses, and conviction levels for buy/sell decisions.
  • Emphasis on preventing duplication by checking if previous recommendations have been seen before processing new ones.
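
The duplicate check can be sketched like this. The record shape and the choice of key (contract address plus buy or sell) are assumptions for illustration; the extraction itself, which a model would perform, is left out.

```typescript
interface Recommendation {
  ticker: string;
  contractAddress: string;
  type: "buy" | "sell";
  conviction: "low" | "medium" | "high";
}

// Keep only recommendations that have not been seen before,
// and remember the new ones in `seen` for the next run.
function filterNewRecommendations(
  extracted: Recommendation[],
  seen: Record<string, boolean>
): Recommendation[] {
  const fresh: Recommendation[] = [];
  for (const rec of extracted) {
    const key = `${rec.contractAddress}:${rec.type}`;
    if (!seen[key]) {
      seen[key] = true;
      fresh.push(rec);
    }
  }
  return fresh;
}

// Placeholder data; the address is not a real contract.
const seenRecommendations: Record<string, boolean> = {};
const firstBatch: Recommendation[] = [
  { ticker: "ABC", contractAddress: "addr-1", type: "buy", conviction: "high" },
  { ticker: "ABC", contractAddress: "addr-1", type: "buy", conviction: "high" },
];
const freshRecommendations = filterNewRecommendations(firstBatch, seenRecommendations);
console.log(freshRecommendations.length); // prints 1
```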

Exploring Actions: The pump.fun Example

Introduction to Actions

  • The discussion moves to actions, focusing on the origin story of the pump.fun action and its initial implementation without much oversight.

Validation Mechanism

  • Explanation of validation functions that determine whether an action should be available based on user permissions (e.g., whitelisted users).
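
A minimal sketch of a validation function that hides an action from users who are not whitelisted. The action name, user IDs, and interface are made up for this example.

```typescript
// Hypothetical, simplified shapes.
interface ChatMessage {
  userId: string;
  text: string;
}

interface ValidatedAction {
  name: string;
  description: string;
  // Decides whether this action may be offered for the given message.
  validate(message: ChatMessage): boolean;
}

// Placeholder IDs of users allowed to trigger sensitive actions.
const whitelistedUserIds = ["user-123"];

const createTokenAction: ValidatedAction = {
  name: "CREATE_TOKEN",
  description: "Create a new token. Only use when the user explicitly asks for one.",
  validate: (message) => whitelistedUserIds.indexOf(message.userId) !== -1,
};

// Only actions that pass validation are shown to the model as options.
function availableActions(
  actions: ValidatedAction[],
  message: ChatMessage
): ValidatedAction[] {
  return actions.filter((action) => action.validate(message));
}

const offered = availableActions([createTokenAction], {
  userId: "user-999",
  text: "make me a coin",
});
console.log(offered.map((a) => a.name)); // prints []
```

Because an action that fails validation never reaches the prompt, the model cannot select it for that user.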

Action Descriptions and Context Injection

  • Importance of providing clear descriptions for actions so that models can understand when it’s appropriate to call them; issues with small models hallucinating incorrect action names are noted.

Handler Functionality in Actions

Execution Flow

  • Details on what happens when an action is called: retrieving wallet info and generating token metadata through a pump.fun template.

Token Creation Process

  • Steps involved in minting tokens include creating them successfully while returning relevant contract addresses back to the user interface (e.g., Discord).

Callback Mechanism

Understanding Action Abstraction in AI Agents

The Concept of Actions

  • The speaker discusses a unique abstraction for actions within AI systems, emphasizing that actions are registered and executed as a whole rather than through separate tools.
  • Unlike traditional systems that require multiple steps (e.g., using a browser tool), this approach simplifies the process by encapsulating all necessary tasks into one action, reducing cognitive load on the agent.
  • This abstraction allows for seamless execution of complex commands like summarization without needing to call external tools or services, enhancing efficiency.

Comparison with Existing Tools

  • The speaker contrasts their method with OpenAI's Assistants API, which utilizes various tools but lacks the streamlined approach they advocate for.
  • They highlight that while existing systems have lists of tools and decide when to use them, their model features a more robust mode change based on user requests.

Implementing Actions in Custom Agents

  • To create custom agents, users can modify character files to include specific actions or plugins during initialization.
  • Plugins can contain lists of actions and evaluators; however, caution is advised against enabling potentially harmful actions by default.

Future Developments and Capabilities

  • The current setup does not support different plugins per character but suggests potential modifications could allow this flexibility.
  • Upcoming developments will focus on creating characters capable of evaluating interactions based on user behavior, exploring multi-dimensional responses.

Community Engagement and On-chain Activities

  • As the session nears its end, the speaker invites questions from participants about practical applications of AI agents in blockchain environments.

Integration and Development in the Ethereum Ecosystem

Current Developments and Integrations

  • The speaker discusses ongoing work with Ethereum, specifically mentioning the anticipation of the release of Agent Kit for TypeScript on npm, which will facilitate integration with various plugins.
  • Emphasis is placed on allowing developers to build their preferred plugins independently before moving towards unification, particularly around EVM (Ethereum Virtual Machine), as it becomes more standardized.
  • The conversation highlights a new GitHub plugin that enables interaction with GitHub functionalities such as responding to issues and making pull requests.

Simplifying User Experience

  • There is a focus on keeping the integration process simple for users, avoiding complex installations like Python to enhance accessibility.
  • A question arises about integrating trending topics from Twitter into bots, indicating a desire for real-time data access without relying heavily on external APIs.

Enhancements in Data Handling

  • Discussion includes how certain file types (like TypeScript files) are prioritized for coding prompts within the system.
  • The limitations of current Twitter client capabilities are acknowledged, but there’s potential for future enhancements by adding endpoints for missing features.

Performance Insights and Model Comparisons

  • The speaker expresses mixed feelings about GPT models, stating that while GPT-4 has seen updates improving creative writing, they find it less effective compared to previous versions like GPT-3.
  • There's an exploration of how character knowledge can be enhanced using trending data from platforms like Twitter, although this requires additional programming effort.

Client Customization and Local Development

  • Instructions are provided regarding launching client-side applications locally. The speaker mentions using localhost settings to test functionality during development phases.

How to Create a Custom Plugin for Twitter Integration

Overview of Plugin Development

  • The speaker discusses starting with the Eliza starter and mentions using Bootstrap and Node plugins, emphasizing the flexibility of creating custom packages or code.
  • A suggestion is made to define a custom plugin, including providers and a specific client, such as a Twitter client, indicating that developers can modify existing code from the main Eliza package.

Copying Existing Code

  • The speaker highlights the option to copy the entire Twitter client package into one's project, allowing for extensive modifications. This includes accessing low-level functionalities provided by Twitter.
  • It is noted that the current Twitter client does not utilize most available features; thus, copying and pasting relevant code (like index.ts) can enable developers to expand functionality.

Community Engagement and Future Plans

  • The speaker expresses intent to engage with the community through Discord after taking a break, inviting others to join in building and streaming sessions.
  • Plans are shared for regular development sessions on Thursdays or Tuesdays, aiming for consistency while accommodating holidays or travel schedules.

Organizing Collaborative Efforts

  • An invitation is extended for community members to help organize curriculum ideas and projects within Discord channels. This aims at fostering collaboration among participants.

Video description

AI Agent Dev School with Shaw, using Eliza https://github.com/ai16z/eliza