Coding Local AI Agent in Python (Ollama + LangGraph)

Name: Coding Local AI Agent in Python (Ollama + LangGraph)
Uploaded: 2025-05-23T16:00:15.000Z
Duration: 1 h 40 min 43 s

Building a Local AI Agent in Python

Introduction to AI Agents

An AI agent is defined as a system capable of autonomously completing tasks that may involve multiple steps.

Unlike basic LLMs, which can perform simple actions like listing or summarizing emails, an AI agent can dynamically execute complex tasks using various tools.

Tools and Technologies Used

The tutorial will utilize Python along with Langchain, Langraph, and Olama for local models to build the AI agent.

A GPU with sufficient VRAM is required; the presenter uses an Nvidia GeForce RTX 3060Ti with 8 GB of VRAM to run a model called Quen 3. Smaller models are also viable options.

Upcoming Event: Nvidia GTC

The presenter mentions attending the Nvidia GTC event in Paris, highlighting its significance for those interested in machine learning and GPUs. Keynote by CEO Jensen Huang is scheduled for June 11th.

Attendees can participate in workshops at reduced prices and enter a raffle for an RDX5090 signed by Jensen Huang if they register through a provided link.

Overview of the Project

The goal is to create an AI agent that operates independently on personal hardware without relying on external APIs from companies like OpenAI or Google. This setup emphasizes privacy and control over data.

The agent will be able to engage in conversations and utilize two specific tools: one for listing unread emails and another for summarizing them, showcasing its multi-tool capabilities.

Demonstration of Tool Usage

The presenter demonstrates how the agent lists unread emails from different senders while providing reasoning behind its outputs based on email content (e.g., deadlines, reminders).

Mail Assistant AI Agent Setup

Overview of the Mail Assistant Functionality

The response evaluation process involves calling another tool to summarize reminders and to-dos, such as birthdays and tasks like going to the gym or reading.

The assistant can list unread emails and summarize those from specific senders, demonstrating its ability to call multiple tools in sequence for efficient processing.

An aggregated summary is provided after processing emails, including relevant details while excluding unnecessary information.

Building the Mail Assistant AI Agent

The setup will involve creating a customizable mail assistant AI agent that can be tailored with additional tools and models for enhanced functionality.

Initial steps include opening a terminal and navigating to a working directory where dependencies will be installed without needing external API keys.

Installing Dependencies

Two methods are available for installing packages: using pip (or pip3) or opting for UV, a faster Rust-based package manager.

Using UV requires initializing a project which creates necessary files in the directory, allowing for streamlined package management.

Setting Up Project Requirements

Essential packages include langchain, langchain-ol, lang graph, and IMAP-client for connecting to mailing servers; these are crucial for the mail assistant's operation.

After installation, an empty code file is prepared for development.

Configuring Mailbox Access

To replicate this project, users need Olama installed locally along with access credentials (IMAP host, email address, password).

Users must identify their IMAP server by searching online based on their email provider (e.g., Gmail), ensuring they have correct connection details.

Finalizing Connection Details

Specific connection requirements include knowing your email address and password alongside the IMAP host; these are essential for accessing your mailbox effectively.

How to Set Up Olama and Use Python for Email Management

Installing Olama

To install Olama, visit olama.com, select the appropriate version for your operating system (Mac, Linux, or Windows), and download it.

Choosing the Right Model

Select a model based on your hardware capabilities. For example, Quen 3 with 8 billion parameters is recommended if you have sufficient VRAM; otherwise, consider models with fewer parameters (4 billion or 1.7 billion).

Be aware that lower parameter models may perform poorly in recognizing necessary tools due to their limited intelligence.

Setting Up the Model

After selecting a model like Quen 3, use the command olama list to verify its presence on your system.

If not installed, run olama pull quen3_col_8billion to download it. Ensure Olama is running using O Lama Surf before connecting via Python.

Configuring Environment Variables

Create an .n file for storing sensitive information such as IMAP host details and user credentials instead of hardcoding them into your script.

Populate the .n file with variables: IMAP_host, IMAP_user, and IMAP_password. Save this file securely after entering your password.

Importing Required Modules in Python

Begin coding by importing essential modules: OS for environment variables, JSON for data handling, and typing for type definitions.

Utilize libraries like IMAP tools to simplify mailbox connections without directly using IMAP packages from Python.

Initializing Langchain Components

Import functions from Langchain's chat models to create an Olama chat instance. This includes decorators that define tool functionalities through docstrings.

Prepare to build a graph structure by importing necessary components from Langraph which will help manage email interactions effectively.

Building Basic Functions

Start by writing a function that connects to your mailbox while ensuring security practices are followed by loading sensitive data from environment variables rather than embedding them in code directly.

Chat Model Implementation

Defining the Chat Model

The chat model is referred to as chat_model, specifically using the 3 and 8 billion parameter versions of Llama. This specification is crucial for targeting the correct model among multiple options available.

Establishing Connection to Email

A method for connecting to email is defined, utilizing a state that passes from note to note within the Lang graph framework. The simplest state includes just message history, but it can be expanded with additional context if needed.

Creating Basic State Structure

A class named chat_state is introduced as a typed dictionary containing a list of messages, representing the most basic form of state management in this implementation. Additional context or buffers can be added later if required.

Connecting to Mailbox

The connection function (connect) logs into an IMAP mailbox using provided credentials (user and password) and specifies an initial folder for operations. It returns the mailbox object for further use in fetching emails. This approach simplifies interaction compared to other libraries like IMAP lib.

Defining Tools for Email Interaction

Tools are defined as functions callable by the large language model (LLM). One such tool, list_unread_emails, retrieves unread messages without parameters and includes documentation accessible by the agent, detailing what information will be returned (subject, date, sender, UID).

Implementing Unread Emails Functionality

Fetching Unread Emails

The function fetches unread emails while ensuring they remain marked as unread after retrieval by setting appropriate criteria (e.g., unseen status). Only headers are fetched initially since content isn't necessary at this stage. If no unread emails exist, a corresponding message is returned; otherwise, it processes and formats them into JSON format for output.

Structuring Response Data

Each unread email's details are structured into dictionaries containing UID, subject, date formatted appropriately (year-month-day hour:minute), and sender information while avoiding Python keyword conflicts through naming conventions (e.g., using 'sender' instead of 'from'). This data forms part of a response string returned by the tool call.

Summarizing Individual Emails

Creating Summary Tool

How to Summarize Emails Using IMAP and LLMs

Fetching Emails with IMAP

The process begins by connecting to the mailbox using IMAP, where emails are fetched based on a specific UID. The command next MB fetch retrieves the email.

The goal is to summarize an email identified by its UID, which is not provided by the user but obtained through a function call.

If no email is found for the given UID, an error message will indicate that summarization was not possible. This is formatted as "Could not summarize email with UID."

Preparing the Summarization Prompt

A prompt for a large language model (LLM) is constructed to summarize the retrieved email concisely.

The prompt includes key details such as subject, sender, date, and content of the email, ensuring proper formatting with line breaks.

Integrating Tools into the System

An empty string is returned initially while preparing to feed data into an LLM. This step involves annotating it as a tool for future use.

Two tools are defined: one for listing unread emails and another for summarizing emails. These tools enable agentic behavior in processing tasks.

Setting Up Language Models

A chat model (lm) is initialized using a local model provider (e.g., Olama), allowing access to tools necessary for operations.

The LLM binds these tools together so that it can utilize them effectively during execution.

Creating Raw Language Model Instances

A raw LLM instance without tool access is created specifically for tasks like summarizing emails without agentic interference.

This separation ensures that basic functions like summarization do not rely on decision-making capabilities of an agentic AI.

Implementing Workflow Graph Structure

The workflow graph represents how tasks are processed within this system, featuring nodes and edges that define actions available to the agent.

Each node can represent different functionalities (like text processing), while routers determine subsequent actions based on conditions met during execution.

Understanding Tool Calls in Language Models

Overview of Tool Integration

The process begins with a language model (LM) receiving a user query, such as "hello how are you," and generating a simple response.

When the user requests specific information, like listing unread emails, the LM recognizes the need for a tool call to retrieve that data.

The output from the tool is formatted as a JSON object containing unread emails, which is not directly shown to the user but used by the LM to formulate an answer.

Handling Complex Queries

If a user asks for both a list of unread emails and their summaries, the LM identifies this multi-step requirement and initiates another tool call.

The system maintains context throughout these interactions, allowing it to summarize specific emails based on additional criteria provided by the user.

State Management in Language Models

The state management involves keeping track of message history and understanding when further tool calls are necessary for complex queries.

This allows for multiple email summaries to be generated sequentially before responding back to the user with comprehensive information.

Building an Agent Workflow

Defining Nodes and Functions

The workflow is structured around defining nodes within an automaton-like framework where each node represents different functions or states in processing queries.

Each node takes input (the current state), processes it, and outputs an altered state for subsequent nodes.

Routing Logic

A router component determines whether to end the interaction or initiate another tool call based on conditions derived from previous messages.

This routing logic ensures efficient handling of responses by either concluding with an answer or continuing with further data retrieval.

Graph Structure Implementation

Nodes representing tools are defined alongside conditional routers within a graph structure that visualizes how different components interact during query processing.

A builder function creates this graph starting from an initial empty chat state, progressively adding nodes like LLM and tool nodes as needed.

Connecting States in Query Processing

Establishing Connections Between Nodes

Edges connect various states within this graph structure, facilitating transitions between different processing stages based on user inputs.

Connection Between Tools and LLM

Overview of Process Flow

The process involves a connection from tools to the LLM (Language Model), where the output from the tool must be processed by the LLM before concluding any operation.

A conditional edge is introduced, allowing the router to either direct to tools or end the process based on its return value.

Routing Logic

If the router indicates "tools," it directs to the defined tools node; otherwise, it leads to an end state. This routing logic is crucial for managing flow within the system.

The initial sketch of this flow was represented visually, showing connections from start to LLM, then through a router leading either to tools or an end state.

Compiling and Running Code

Setting Up State

The code initializes with an empty state dictionary containing messages as an empty list and prompts user input for instructions or quitting.

An endless loop is established for continuous user interaction until a quit command is issued.

Invoking Graph

User messages are appended into the state, and then graph invocation proceeds using this updated state. The last message in history is retrieved for processing.

Debugging Issues

Common Errors Encountered

Initial attempts at running resulted in errors due to incorrect assignment in if statements and missing modules like IMAP tools.

Adjustments were made regarding pluralization in function calls (e.g., changing "add conditional edge" to "add conditional edges").

Testing Functionality

User Interaction Testing

After resolving bugs, testing involved querying how assistance could be provided, including listing unread emails as a simple tool call.

Combining Queries

Users can also combine queries effectively; however, limitations arise with less capable models that struggle with complex requests compared to advanced models like GPT-4.

Identifying Code Issues

Tool Node Limitations

A significant issue was identified where tool nodes returned blank message histories instead of appending new responses correctly.

Proposed Solution

A function named tools_node was proposed that would append results rather than overwrite existing message history, improving overall performance.

Final Testing Steps

Running Final Commands

Email Management with AI Tools

Utilizing AI for Email Summarization

The speaker discusses the process of extracting User IDs (UIDs) from a model's output to enhance clarity when using a smaller AI model. This explicit approach is necessary due to limitations in the 8 billion parameter model.

An example is provided where an email summarization tool is called, yielding a response about project deadlines and sender information, demonstrating practical application.

The speaker tests the system by asking it to list unread emails and summarize those from a specific sender, Florian, without mentioning UIDs. This raises questions about the tool's ability to function effectively under these conditions.

A successful summary of unread emails from Florian is achieved, revealing reminders and tasks. The speaker notes that having full email history may have contributed to this success.