How to Build Your Own JARVIS AI Agent 100% Free! | LiveKit Tutorial

How to Create Your Own Advanced Jarvis AI Voice Agent

Introduction to Jarvis AI Voice Agent

  • This video tutorial demonstrates how to create an advanced Jarvis AI voice agent using LiveKit and Python, entirely with free tools.
  • The speaker engages in a light-hearted conversation with the AI, showcasing its ability to switch between text and voice communication.

Features of the AI Agent

  • The tutorial will cover how to equip the AI agent with tools for various tasks that can be executed through Python functions, such as checking weather or stock prices.
  • The speaker introduces "Friday" as the updated version of Jarvis, indicating improvements over previous iterations.

Overview of LiveKit

  • LiveKit is presented as an open-source tool with more sophisticated capabilities than paid options such as Vapi, Retell, or ElevenLabs. It requires some programming knowledge but gives complete control over the agent's functionality.
  • Major companies like OpenAI use LiveKit for their voice components, which the speaker cites as evidence of its reliability and sophistication. Users can also run it locally for better data privacy.

Setting Up LiveKit

  • To begin, users sign up for a LiveKit account and create a project on the platform. They name the project (e.g., "Jarvis 1.0") and generate the API keys needed for integration later on.
  • The important credentials are the websocket URL, the API key, and the API secret. They must be stored securely, because they cannot be retrieved once the window showing them is closed.
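
Because these three values are needed every time the agent starts, it can help to check for them up front. The following is a minimal sketch, assuming the credentials are exposed as environment variables named LIVEKIT_URL, LIVEKIT_API_KEY, and LIVEKIT_API_SECRET; those names and the helper missing_credentials are assumptions for illustration, so match them to whatever your own setup uses.

```python
import os
from typing import Mapping

# Assumed variable names; adjust them to match your own .env file.
REQUIRED_VARS = ("LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET")


def missing_credentials(env: Mapping[str, str]) -> list[str]:
    """Return the names of required variables that are absent or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_credentials(os.environ)
    if missing:
        print("Missing credentials:", ", ".join(missing))
    else:
        print("All LiveKit credentials are set.")
```

Running a check like this before starting the agent turns a confusing connection failure into a clear message about which value is missing.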

Preparing Python Environment

  • Users are advised to set up a Python environment in any IDE (Visual Studio Code is recommended) and to install Copilot for coding assistance. A virtual environment must be created and activated before installing libraries from the requirements.txt file.
  • The libraries required for the project are listed in the requirements file, and users install them with pip from the terminal.

Implementing Google API Key

  • A Google API key is required to use Gemini in the project. Users who do not already have a Google Cloud account must create one, then set up a new cloud project to hold the key. The video walks through Google's interface to obtain the key safely.

Coding Implementation Steps

Initial Setup of Agent.py

  • Sample code from LiveKit's documentation is copied into agent.py and then modified to use Gemini's model instead of OpenAI's. The imports are updated to match, including those for the prompts file created earlier (prompts.py).

Defining Prompts

  • Two prompts are defined: one serves as the system instructions that govern how the agent behaves, and the other gives session-specific instructions applied at the start of each interaction (e.g., a greeting message). Together they shape how the agent interacts with the user.
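
As a rough illustration of the two-prompt split, a prompts.py file could look like the sketch below. The constant names AGENT_INSTRUCTION and SESSION_INSTRUCTION and the prompt wording are assumptions made for this example, not the exact text used in the video.

```python
# prompts.py (sketch): the wording and the constant names are illustrative.

# System instructions: persona and behaviour that hold for the whole session.
AGENT_INSTRUCTION = """
You are a personal assistant called Friday.
Answer briefly, in one or two sentences.
If the user asks you to do something, confirm it and then use the matching tool.
""".strip()

# Session instructions: what the agent should do when a conversation starts.
SESSION_INSTRUCTION = """
Greet the user and ask how you can help.
""".strip()
```

Keeping the prompts in their own module means the persona can be changed without touching the agent code that imports them.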

Running Local Tests

  • After implementing changes in code structure including defining entry points and session management functions within agent.py, users can test their setup locally by running specific commands in their terminal window.

The initial interaction showcases basic conversational abilities of Friday when prompted by user input.

Enhancing Functionality

Multimodal Interaction Capabilities

  • The speaker discusses enabling video capabilities so that Friday can visually interact with users through camera input alongside voice responses.

This feature aims at making interactions more dynamic beyond just audio communication.

Adding Task Performance Abilities

Tool Definitions

  • Users learn how to define tools in tools.py that give Friday access to external APIs (such as a weather service) and search engines (DuckDuckGo), so it can carry out tasks in response to user requests.

Asynchronous functions are utilized here for efficient execution without blocking operations.
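
A minimal sketch of a non-blocking weather tool, using only the standard library, might look like this. The wttr.in endpoint, the function name get_weather, and the injectable fetch parameter are assumptions for illustration; the framework-specific step that registers the function as an agent tool is left out so the sketch runs on its own.

```python
import asyncio
import urllib.parse
import urllib.request
from typing import Callable


def _fetch_text(url: str) -> str:
    """Blocking HTTP GET that returns the response body as text."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8").strip()


async def get_weather(city: str, fetch: Callable[[str], str] = _fetch_text) -> str:
    """Get the current weather for a city.

    Args:
        city: Name of the city, for example "Vienna".
    """
    # wttr.in is assumed here as a free weather service that needs no key.
    url = f"https://wttr.in/{urllib.parse.quote(city)}?format=3"
    try:
        # Run the blocking request in a worker thread so the event loop
        # (and the voice session running on it) is not stalled.
        return await asyncio.to_thread(fetch, url)
    except OSError as exc:  # urllib's URLError and timeouts subclass OSError
        return f"Could not retrieve weather for {city}: {exc}"
```

Outside an agent, the function can be tried with asyncio.run(get_weather("Vienna")). Returning an error string instead of raising lets the agent tell the user what went wrong rather than failing silently.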

Testing Functionality

  • Once the tools are in place, testing shows that Friday not only responds verbally but also executes tasks, such as fetching weather data or running web searches, in response to user queries.

Users receive feedback about successful executions logged into console outputs confirming operational integrity throughout interactions.

Creating Custom Functions in Python

Overview of Function Creation

  • Users can create additional functions for tasks like setting appointments or sending emails by defining the business logic and parameters needed.
  • A new function for sending messages via Gmail was added based on user requests, utilizing asynchronous programming to handle email sending.

Email Function Implementation

  • The speaker emphasizes the importance of providing variable descriptions to ensure the AI agent correctly interprets them without confusion.
  • To use Gmail for applications, users must specify their account and create an app password through Gmail settings; a tutorial link is provided for guidance.

Sending Emails with Python

  • The process involves defining the recipient, connecting to the email server, and executing the send command; code is available for download to simplify implementation.
  • Environment variables stored in a .env file hold sensitive information such as API keys and passwords, so functions can read them without hard-coding secrets.
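
The steps above (define the recipient, connect to the mail server, send) can be sketched with Python's standard smtplib and email modules. The environment variable names GMAIL_USER and GMAIL_APP_PASSWORD and the helper build_email are assumptions for this example. The sketch is synchronous for brevity; inside an async tool it could be run through asyncio.to_thread.

```python
import os
import smtplib
from email.message import EmailMessage
from typing import Optional


def build_email(sender: str, to_email: str, subject: str, message: str,
                cc_email: Optional[str] = None) -> EmailMessage:
    """Assemble the message; kept separate so it can be checked without sending."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = to_email
    msg["Subject"] = subject
    if cc_email:
        msg["Cc"] = cc_email
    msg.set_content(message)
    return msg


def send_email(to_email: str, subject: str, message: str,
               cc_email: Optional[str] = None) -> str:
    """Send an email through Gmail.

    Args:
        to_email: Recipient address.
        subject: Subject line.
        message: Plain-text body.
        cc_email: Optional address to copy.
    """
    # Hypothetical variable names; use whatever your .env file defines.
    user = os.getenv("GMAIL_USER")
    password = os.getenv("GMAIL_APP_PASSWORD")
    if not user or not password:
        return "Email not sent: Gmail credentials are not configured."

    msg = build_email(user, to_email, subject, message, cc_email)
    # Gmail's SMTP endpoint, upgraded to TLS with STARTTLS on port 587.
    with smtplib.SMTP("smtp.gmail.com", 587) as server:
        server.starttls()
        server.login(user, password)
        server.send_message(msg)
    return f"Email sent to {to_email}."
```

The Args section of the docstring is the kind of parameter description the speaker recommends, since it tells the agent what each value means.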

Testing Email Functionality

Running in Development Mode

  • The speaker runs the application in development mode using LiveKit's playground feature for testing purposes.

Interaction with AI Assistant

  • The AI assistant named Friday interacts with users, asking for details such as email address, subject, and message content before sending an email.
  • Confirmation of successful email delivery is provided after inputting necessary details into the system.

Launching as an App

Template Code Usage

  • LiveKit provides template code for launching the agent as an Android app; iOS users need a different template because of platform differences.

Installation Requirements

  • Users must install specific software packages using terminal commands; links are provided in the video description for convenience.

Setting Up Token Server

Creating a Token Server

  • Instructions are given for creating a token server in LiveKit's sandbox environment; the sandbox ID obtained here is needed in the following steps.

Command Execution

  • Users run commands in their terminal to copy the sample code into a local folder, then replace the placeholders with their actual sandbox ID and API credentials.

Developing Android Applications

Using Android Studio

  • After downloading necessary files, users should open their project folder in Android Studio IDE to begin development work on their app.

Running on Emulator or Physical Device

  • Instructions are provided on how to run apps either through emulators or directly on physical devices by pairing them via wireless debugging options found in developer settings.

Final Demonstration and Future Plans

Successful App Functionality

  • The assistant successfully retrieves weather information upon request during demonstration, showcasing functionality after installation.

Future Developments

  • Plans are hinted at regarding future features and potential series continuation focused on enhancing AI capabilities within applications.
Video description

📝 Sign up for the waiting list for the offer on how to get into AI Engineering for Voice Agents: https://form.typeform.com/to/VUoyMgrw

🚀 Build Your Own JARVIS AI Agent – 100% Free!
🎙️ Talk to it. 👀 Let it see you. 💬 Chat with it. ☁️ Check the weather. 🌐 Search the web. 📧 Send emails. 📱 Deploy it as an Android app!
All in one epic tutorial – and it's completely FREE.

In this video, I’ll show you how to create your very own AI Voice Agent using LiveKit and other free tools. This isn’t just a basic chatbot – this JARVIS-like agent is super responsive, intelligent, and packed with powerful features:
✅ Real-time voice & text conversation
✅ Camera vision integration
✅ Web search capabilities
✅ Weather-checking tool
✅ Email messaging system
✅ Android app deployment tutorial

Whether you're a beginner or an AI enthusiast, this step-by-step guide will walk you through everything you need to build and deploy your personal assistant – no paid APIs or expensive tools required.

🔧 Tech stack: LiveKit, Open Source Tools, Custom Python Scripting
📲 Final product: Fully functional AI assistant running on your Phone, on your PC and in your Browser!
💼 Work with me: https://form.typeform.com/to/X1oLmajR

🔗 Chapters:
0:00 Intro
1:35 LiveKit Explanation
4:00 Setting Up LiveKit
5:44 VS Code Set Up
8:50 Gemini API Key
11:02 Python Implementation
16:16 Testing Initial JARVIS
16:44 Upgrade to FRIDAY
17:12 Testing Initial FRIDAY
17:33 Enable Video & Using Web App
19:24 Test Web App (Playground)
21:16 Weather Check & Web Search Tool
25:37 Testing Weather Check & Web Search Tool
28:29 Send Email Tool
30:17 Testing Send Email Tool
31:33 Launch it as Android App!
37:25 Testing Android App
37:53 Outro

💡 Subscribe for more AI & tech tutorials
👍 Like the video if you found it helpful
💬 Drop your questions & progress in the comments!

GitHub Repository: https://github.com/ruxakK/friday_jarvis
Android App Example: https://github.com/livekit-examples/android-voice-assistant?tab=readme-ov-file
How to create a Google App Password: https://www.youtube.com/watch?v=wniM7sU0bmU&ab_channel=Woodpecker.co
Here is the first part of Jarvis using ElevenLabs (not needed for this tutorial): https://www.youtube.com/watch?v=ECBmgtxd_Zk&t=52s&ab_channel=Thanh-yDavidNguyen

#AI #JARVIS #LiveKit #VoiceAgent #Gemini #FreeAI #OpenSource #AndroidApp #AITutorial #ArtificialIntelligence #PythonTutorial #IronMan