How to Build Deep Research Google Docs AI AGENTS - Full Tutorial
Happy New Year and Introduction to Autonomous Agents
Overview of the Setup
- The speaker wishes everyone a Happy New Year and introduces the dual-agent system being demonstrated, consisting of a search agent connected to Google Docs and a report writer agent.
- The report writer agent autonomously generates search queries based on its needs to complete reports, showcasing an interactive workflow between searching for information and writing.
Functionality of the Agents
- The search agent retrieves information about Bitcoin ETF inflows from Brave Search, which is then utilized by the report writer to expand the document.
- The speaker plans to delve into the code behind this setup, explaining how it operates and discussing potential new features that will be added live during the demonstration.
Technical Requirements and Setup
Necessary Components
- To set up this system, users need a Brave API key, access to Google APIs (including Drive and Docs), and specific credentials in JSON format.
- A future video may provide guidance for those unfamiliar with setting up these components.
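The notes do not show how these settings are wired in, so the following is only a minimal sketch of collecting them in one place. The environment variable names and the `AgentConfig` class are hypothetical, and the two document IDs correspond to the report and search documents described under Document Management.

```python
import os
from dataclasses import dataclass


@dataclass
class AgentConfig:
    brave_api_key: str
    credentials_path: str  # path to the Google credentials JSON file
    report_doc_id: str
    search_doc_id: str


def load_config(env=None):
    """Read settings from environment variables (variable names are hypothetical)."""
    env = os.environ if env is None else env
    required = {
        "brave_api_key": "BRAVE_API_KEY",
        "credentials_path": "GOOGLE_CREDENTIALS_JSON",
        "report_doc_id": "REPORT_DOC_ID",
        "search_doc_id": "SEARCH_DOC_ID",
    }
    # Fail early with one message listing everything that is missing.
    missing = [name for name in required.values() if not env.get(name)]
    if missing:
        raise RuntimeError("Missing settings: " + ", ".join(missing))
    return AgentConfig(**{field: env[name] for field, name in required.items()})
```

Failing early on a missing key avoids starting an agent that would only break on its first API call.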
Document Management
- Two different document IDs are used: one for content generated by the report writer and another for managing searches conducted by the search agent.
- Functions are implemented to append content rather than replace it in order to continuously build upon existing reports as new information is gathered.
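One way to append rather than replace is an `insertText` request with an `endOfSegmentLocation`, sent through the Docs API `documents().batchUpdate` call. The helper names below are hypothetical and the video's own functions may differ. `docs_service` is assumed to be a Docs API v1 client created elsewhere with the google-api-python-client library.

```python
def build_append_request(text):
    """Return a batchUpdate body that appends text at the end of the document body."""
    if not text.endswith("\n"):
        text += "\n"  # keep successive entries on separate lines
    return {
        "requests": [
            {
                "insertText": {
                    # An empty endOfSegmentLocation targets the end of the main body.
                    "endOfSegmentLocation": {},
                    "text": text,
                }
            }
        ]
    }


def append_to_doc(docs_service, document_id, text):
    """Send the append request; existing content in the document is left untouched."""
    body = build_append_request(text)
    return docs_service.documents().batchUpdate(documentId=document_id, body=body).execute()
```

Because the request never names an index, it does not need to read the document length first.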
Search Query Generation Process
Initial Search Queries
- An initial search query is generated based on a specified topic (e.g., "Bitcoin 2024 summary") which guides subsequent searches for relevant information.
- The process involves monitoring changes in documents; if new search results are detected, they trigger further enhancements to the report.
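The change monitoring described above could be sketched as follows. This is an assumption about the approach, not the code from the video: it hashes the document text on each poll and reports a change only when the hash differs from the previous one. `ChangeWatcher` and `watch` are hypothetical names, and the 30-second default mirrors the interval mentioned later in these notes.

```python
import hashlib
import time


class ChangeWatcher:
    """Detects changes in a document by comparing content hashes between polls."""

    def __init__(self, fetch_text):
        self._fetch_text = fetch_text  # callable returning the document text
        self._last_digest = None

    def poll(self):
        """Return the new text if the document changed since the last poll, else None."""
        text = self._fetch_text()
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest == self._last_digest:
            return None
        first_poll = self._last_digest is None
        self._last_digest = digest
        # Assumption: the first poll only records a baseline and triggers nothing.
        return None if first_poll else text


def watch(watcher, on_change, interval_seconds=30, max_polls=None, sleep=time.sleep):
    """Poll repeatedly and call on_change(text) whenever the document changes."""
    polls = 0
    while max_polls is None or polls < max_polls:
        changed = watcher.poll()
        if changed is not None:
            on_change(changed)
        polls += 1
        sleep(interval_seconds)
```

In practice `fetch_text` would read one of the two Google Docs, and `on_change` would start the next search or the next report update.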
Enhancing Reports with New Information
- The agents combine assigned topics with current content and newly retrieved data from searches to enrich reports effectively.
- Each new search query is meant to explore a related topic or fill a knowledge gap, and is kept concise (four words at most).
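As an illustration of the query step, the sketch below combines the topic, the current report, and the latest search results into a prompt, then trims the model's reply to four words. The prompt wording and both function names are hypothetical; the actual prompt used in the video is not reproduced in these notes.

```python
def build_query_prompt(topic, report_text, search_results):
    """Combine topic, current report, and latest results into one instruction."""
    return (
        f"Topic: {topic}\n\n"
        f"Current report:\n{report_text}\n\n"
        f"Latest search results:\n{search_results}\n\n"
        "Suggest ONE new web search query that explores a related topic or "
        "fills a gap in the report. Use at most four words. "
        "Reply with the query only."
    )


def clamp_query(raw_reply, max_words=4):
    """Strip surrounding quotes and whitespace, then keep at most max_words words."""
    words = raw_reply.strip().strip("\"'").split()
    return " ".join(words[:max_words])
```

Clamping in code is a safety net, since a model may ignore the length limit stated in the prompt.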
Monitoring Changes and Fetching Data
Continuous Monitoring Mechanism
- A loop mechanism runs every 30 seconds (adjustable interval), checking for changes in documents that indicate updates from previous searches.
- If changes are detected due to new queries provided by other agents, it prompts additional searches using Brave's API.
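A search call against Brave's web search API might look like the sketch below, which uses only the standard library. The endpoint, the `X-Subscription-Token` header, and the `web.results` response shape follow Brave's public API documentation as best understood here and should be checked against the current docs. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

BRAVE_ENDPOINT = "https://api.search.brave.com/res/v1/web/search"


def build_brave_request(query, api_key, count=5):
    """Build the GET request for a web search without sending it."""
    params = urllib.parse.urlencode({"q": query, "count": count})
    return urllib.request.Request(
        f"{BRAVE_ENDPOINT}?{params}",
        headers={"Accept": "application/json", "X-Subscription-Token": api_key},
    )


def extract_results(payload):
    """Keep only title, url, and description from each web result."""
    results = payload.get("web", {}).get("results", [])
    return [
        {
            "title": item.get("title", ""),
            "url": item.get("url", ""),
            "description": item.get("description", ""),
        }
        for item in results
    ]


def brave_search(query, api_key, count=5, timeout=10):
    """Run the search and return the simplified result list."""
    request = build_brave_request(query, api_key, count)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_results(json.load(response))
```

The simplified results can then be appended to the search document for the report writer to pick up.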
Conclusion of Technical Workflow
AI Agents and Report Generation
Overview of the Setup
- The speaker discusses the simplicity of setting up their system, indicating a desire to test it further before introducing a third feature or "agent."
- A new feature is proposed that will evaluate whether a report is complete, functioning as an LLM (Large Language Model) judge to determine if scripts should be halted.
Testing Queries with Google Docs Agents
- The focus shifts to testing AI agents by initiating a report on "AI agents 2025," requiring two terminals for different agent functionalities.
- An initial search for "AI agent Market forecast 2025" yields mixed results, prompting further exploration into relevant information.
Evaluating Search Results
- The speaker notes that while some search results are not ideal, they provide context for writing the report.
- A second query about "AI agent use cases" reveals limited insights but highlights increasing applications in various sectors like chatbots and routine tasks.
Exploring Player Feedback on Path of Exile 2
- Transitioning to gaming, the speaker decides to investigate player feedback on "Path of Exile 2," leveraging YouTube videos as informal research sources.
- Initial impressions from players indicate mixed reactions regarding game difficulty and unique abilities, aligning with observed sentiments from video content.
Iterative Research Process
- Further searches yield diverse opinions on level design and gameplay mechanics; however, some results are deemed less relevant.
- The speaker reflects on how iterative searching can enhance report quality by gathering comprehensive data over time.
Structuring Final Reports with New Agent Development
- After extensive testing, the need arises for a new agent capable of structuring reports based on gathered research data.
- This new agent would incorporate an LLM judge to approve or deny reports based on their completeness and relevance.
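The approve-or-deny step could be sketched as a prompt plus a strict parser for the reply. Everything here is an assumption for illustration: `llm` stands for any callable that takes a prompt string and returns the model's text, and the APPROVE/DENY wording is hypothetical.

```python
def build_judge_prompt(topic, report_text):
    """Ask the model for a one-word verdict on the report."""
    return (
        f"You are reviewing a research report on: {topic}\n\n"
        f"Report:\n{report_text}\n\n"
        "Is the report complete and relevant to the topic? "
        "Answer with exactly one word: APPROVE or DENY."
    )


def parse_verdict(reply):
    """Return True only when the reply starts with APPROVE; anything else is a denial."""
    return reply.strip().upper().startswith("APPROVE")


def judge_report(llm, topic, report_text):
    """Run the judge and convert its reply into a boolean."""
    return parse_verdict(llm(build_judge_prompt(topic, report_text)))
```

Treating every unexpected reply as a denial keeps the agents running instead of stopping them on an ambiguous answer.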
Building Out the System
- Plans are laid out for building the system out further with the help of Cursor.
- A detailed prompt is introduced outlining how the main agent will utilize gathered content to create final reports effectively.
Creating a Research Agent with Google Docs
Overview of the Main Agent Functionality
- The main agent is designed to gather five entries before generating a final report, which will stop all running scripts once completed.
- The speaker expresses uncertainty about achieving the desired outcome in one attempt but proceeds to run the code for testing.
- In the initial setup, the content evaluation returns true or false; the speaker notes that this may need adjusting later.
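The five-entry gate can be illustrated with a small counter. The notes do not say how entries are delimited in the document, so the separator below is purely an assumption, as are the function names.

```python
ENTRY_SEPARATOR = "\n---\n"  # hypothetical delimiter between research entries
REQUIRED_ENTRIES = 5         # threshold mentioned in the notes


def split_entries(notes_text):
    """Split the research document into non-empty entries."""
    return [chunk.strip() for chunk in notes_text.split(ENTRY_SEPARATOR) if chunk.strip()]


def ready_for_final_report(notes_text, required=REQUIRED_ENTRIES):
    """Return True once enough entries have been gathered to start the evaluation."""
    return len(split_entries(notes_text)) >= required
```

The main agent would call `ready_for_final_report` on each poll and only hand the notes to the evaluation step once it returns true.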
Code Implementation Steps
- Updates are made to the main agent's code, including credential retrieval and monitoring research progress.
- A check is added to ensure proper functionality within Google Docs, indicating ongoing modifications to improve performance.
- The process requires five entries before evaluation; the speaker speeds up this part of the demonstration.
Evaluation and Error Handling
- After gathering entries, an error occurs during content evaluation, prompting a shutdown of scripts for troubleshooting.
- The speaker identifies that an empty document caused issues and plans to modify the code to handle such cases effectively.
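A guard for the empty-document case, combined with a simple retry, might look like the sketch below. It is an assumption about how such handling could work rather than the fix made in the video. `read_notes` and `evaluate` are hypothetical callables supplied by the caller.

```python
import time


def evaluate_with_retry(read_notes, evaluate, attempts=3, delay_seconds=5, sleep=time.sleep):
    """Return evaluate(notes) as a bool, or False if the notes stay empty or every attempt fails."""
    for attempt in range(1, attempts + 1):
        notes = read_notes().strip()
        if notes:
            try:
                return bool(evaluate(notes))
            except Exception as error:  # keep the agent alive on evaluation errors
                print(f"Evaluation attempt {attempt} failed: {error}")
        else:
            # An empty document is skipped instead of being sent to the evaluator.
            print(f"Attempt {attempt}: document is empty, nothing to evaluate yet")
        if attempt < attempts:
            sleep(delay_seconds)
    return False
```

Returning False on failure means the surrounding loop simply keeps gathering research instead of crashing.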
Final Report Generation
- A retry mechanism is implemented after an initial failure; successful data insertion leads to generating a final report based on gathered research notes.
- All assisting processes shut down autonomously upon completion of tasks, allowing for efficient operation without constant oversight.
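The notes do not describe how the helper scripts are told to stop. One simple possibility, shown here only as an assumption, is a flag file that the main agent creates once the final report is written and that each helper checks in its polling loop. The file name and function names are hypothetical.

```python
from pathlib import Path

STOP_FLAG = Path("stop.flag")  # hypothetical shared flag file


def request_shutdown(flag_path=STOP_FLAG):
    """Called by the main agent after the final report has been written."""
    flag_path.write_text("final report written\n", encoding="utf-8")


def should_stop(flag_path=STOP_FLAG):
    """Checked by the search and writer agents at the top of each polling cycle."""
    return flag_path.exists()


def clear_shutdown(flag_path=STOP_FLAG):
    """Remove a stale flag before starting a new research run."""
    flag_path.unlink(missing_ok=True)
```

A flag file works across separate terminals without any inter-process messaging, at the cost of needing a cleanup step before the next run.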
Reflections on Project Success
- The speaker expresses satisfaction with the project outcomes and highlights improvements in using Cursor software for coding tasks.