Cómo usar CHATGPT (api) sobre la base de datos de tu Empresa [Y te regalo la APP]

Cómo usar CHATGPT (api) sobre la base de datos de tu Empresa [Y te regalo la APP]

Introduction to the Role of a Data Scientist

In this section, the speaker discusses the common misconception about the role of a data scientist and highlights the importance of data analysis for various stakeholders.

The Reality of Being a Data Scientist

  • Many people think that being a data scientist involves primarily training models and creating algorithms.
  • However, in reality, a significant portion of a data scientist's time is spent on data analysis for managers and product teams.

Creating an AI Application Prototype

This section introduces the concept of building an AI application prototype that allows business users to perform their own analyses without relying on technical expertise.

Building an AI Application Prototype

  • The goal is to create an AI application prototype that enables business users to conduct their own analyses.
  • The prototype will connect directly to the company's database using the GPT chat API and utilize Streamlit as the user interface.

Architecture Overview

This section provides an overview of how the system works, including the user interface design and backend integration with GPT chat API and a relational database.

System Architecture

  • The application will have a simple chat-style interface where users can input natural language queries for analysis.
  • Streamlit will be used to create this user interface, which connects directly to Python backend code.
  • The backend code will handle user queries by passing them to OpenAI's GPT chat API with specific prompts.
  • The GPT chat API will generate SQL code based on user queries, allowing connection with a relational database for analysis.

User Interface Design and Backend Integration

This section focuses on designing a simple chat-style user interface using Streamlit and integrating it with the backend code for data analysis.

User Interface Design and Backend Integration

  • The user interface will be designed as a chat-style interface where users can input their natural language queries.
  • Streamlit will be used to create this interface, which directly connects to the Python backend code.
  • The backend code will handle user queries by passing them to OpenAI's GPT chat API with specific prompts.
  • The generated SQL code from the GPT chat API will allow connection with a relational database for analysis.

Connecting to the Database and Generating SQL Queries

This section explains how the backend code connects to the database and generates SQL queries based on user queries using OpenAI's GPT chat API.

Connecting to the Database and Generating SQL Queries

  • The Python backend code connects to a relational database, in this case, SQLite, using a class called sql_database.
  • After loading the database, the backend imports necessary APIs, including OpenAI's GPT chat API.
  • The environment variable is set using os module to store the result of OpenAI's GPT chat API call made in a previous script.
  • The backend then proceeds to process user queries by creating an instance of an LM (language model).

Utilizing a Relational Database for Analysis

This section discusses utilizing a relational database for analysis purposes and emphasizes that no technical knowledge is required from users.

Utilizing a Relational Database for Analysis

  • A simple relational database structure is used for demonstration purposes, specifically focusing on sales data in an e-commerce company.
  • The sales table includes columns such as invoice ID, product ID, product name, quantity sold, invoice date, unit price, customer ID, country, and total sales amount.
  • Users can make natural language queries without any technical knowledge, and the application will return the corresponding results.

User-Oriented System Design

This section highlights that the system is designed to be user-oriented, requiring no technical expertise from users.

User-Oriented System Design

  • The system is designed to cater to users who do not possess technical knowledge.
  • Users only need to make natural language requests for the desired analysis, and the application will handle all technical aspects.
  • Unlike other systems like Tableau or Power BI, this application does not require any technical skills from users.

Disclaimer and Legal Considerations

This section provides a disclaimer regarding data privacy and legal responsibilities when using OpenAI's GPT chat API.

Disclaimer and Legal Considerations

  • When connecting to OpenAI's GPT chat API and sending data, it is essential to exercise caution with real data and comply with company policies.
  • The video emphasizes that this content is for educational purposes only and does not imply any recommendations for usage.

Database Structure Overview

This section provides an overview of the database structure used in the demonstration, focusing on sales data in an e-commerce company.

Database Structure Overview

  • The database used in this demonstration is a simplified version of an e-commerce company's database.
  • The main table is the "sales" table, which includes columns such as invoice ID, product ID, product name, quantity sold, invoice date, unit price, customer ID, country, and total sales amount.
  • The example shows approximately 542,000 records in the sales table.

Application Components Overview

This section provides an overview of the application components, including the database, scripts, and their functionalities.

Application Components Overview

  • The application is built using Visual Studio and consists of the following components:
  • Database: A SQLite database.
  • PowerPoint: A diagram illustrating the system architecture (not necessary for the application).
  • Scripts:
  • "api_script": Contains the OpenAI GPT chat API key.
  • "backend_script": Handles user queries and interacts with OpenAI's GPT chat API.
  • "frontend_script": Generates the user interface using Streamlit.

Backend Script Details

This section focuses on the backend script responsible for handling user queries and interacting with OpenAI's GPT chat API.

Backend Script Details

  • The backend script consists of several steps:
  1. Loading the SQL database using a class called sql_database.
  1. Importing necessary APIs, including OpenAI's GPT chat API.
  1. Setting up an environment variable to store the result from OpenAI's GPT chat API call made in a previous script.

Backend Script Continued

This section continues discussing the backend script responsible for handling user queries and interacting with OpenAI's GPT chat API.

Backend Script Continued

  • After setting up environment variables, the backend script imports required libraries such as os to utilize environment variables.
  • The script then proceeds to create an instance of an LM (language model), which represents the language model used by OpenAI's GPT chat API.

Frontend Script Details

This section focuses on the frontend script responsible for generating a user interface using Streamlit.

Frontend Script Details

  • The frontend script is responsible for generating the user interface using Streamlit.
  • It connects to the backend code and interacts with it to handle user queries and display results.
  • Streamlit provides an easy way to create a user-friendly interface for users to input their natural language queries.

Understanding User Queries and Model Selection

In this section, the speaker discusses the importance of understanding user queries and introduces the model they will be using, specifically the 3.5 Turbo version which is more affordable than the GPT-4.

User Query Processing

  • The speaker explains that they will be using a package called "lanchain" to process user queries.
  • Two main approaches can be used: chaining processes together or utilizing agents.
  • Since no external tools are needed in this case, they opt for the simpler approach of chaining processes.

SQL Database Change

  • The speaker mentions that they will use the "sql database change" option to transform natural language queries into SQL objects.
  • This step allows them to convert user input into a format that can be understood by the system.

Defining Prompt Templates

  • The success of language models depends on providing detailed instructions and desired output formats.
  • By defining prompt templates, users only need to provide their query while all other necessary information is added automatically.

User Input and Function Creation

  • Users only need to provide their query without having to write all the additional code.
  • A function called "consulta" is created to handle user queries by formatting them according to predefined templates.
  • The function then calls Open API with the formatted query, retrieves results, and returns them as output.

Building the Interface

In this section, the speaker focuses on constructing an interface for users to interact with instead of directly writing code.

Launching Streamlit Application

  • To launch the application, open a terminal and run it using Streamlit package.
  • The application opens in a web browser with a simple chat-like interface named "No Moleste Usuario de Negocio."

Testing Backend Functionality

  • The speaker demonstrates the functionality of the backend by asking a question through the chat interface.
  • They ask about the number of records in a table and receive the correct answer, confirming that the backend is working.

Interface Construction

  • The code for constructing the interface is explained briefly, with mentions of buttons and click events.
  • Two lists are created to store questions and corresponding answers.

Conclusion and Further Explanation

In this section, the speaker concludes by mentioning that more advanced examples will be covered later. They also provide instructions for downloading the code.

Code Explanation

  • The code is not explained in detail but can be downloaded from the description.
  • Streamlit package is imported along with other necessary packages.
  • Titles and lists are created to store questions and answers.

Overall Summary

This transcript covers two main sections: understanding user queries and model selection, as well as building an interface using Streamlit. In the first section, various aspects such as query processing, SQL database change, prompt templates, and function creation are discussed. The second section focuses on launching a Streamlit application, testing backend functionality through a chat-like interface, and briefly explaining code construction.

Function "consulta" in the Backend

The function "consulta" is responsible for processing user queries and executing the necessary functionality. It takes the user's question from the chat input, passes it to the backend, and executes the "consulta" function. This function calls the Chat GPT API to retrieve an SQL code, which is then passed to the database using Lanchain. The database resolves the query and returns the response to be displayed to the user.

Implementation of "consulta" Function

  • The "consulta" function processes user queries entered in the chat.
  • It calls the Chat GPT API to generate an SQL code based on the user's question.
  • The generated SQL code is passed to the database using Lanchain.
  • The database resolves the query and returns a response that will be shown to the user.

Testing a Simple Query

In this section, a simple query is tested to check if it can successfully resolve a basic question. The query asks for three countries with the highest total sales.

Testing a Simple Query

  • A query is made asking for three countries with the highest total sales.
  • The response shows that United Kingdom has sales of 8,187,000 and Netherlands has sales of 284,000.
  • To verify if this information is correct, an SQL query is executed directly on the database.
  • The SQL query retrieves country names and their corresponding total sales from a table called "ventas".
  • The results confirm that United Kingdom and Netherlands have exactly matching sales figures as obtained through our application.

Testing a Complex Query

This section tests a more complex query that requires advanced SQL structures such as subqueries or Common Table Expressions (CTEs). The query compares the percentage representation of each product's sales out of the total sales.

Testing a Complex Query

  • A query is made to retrieve the top five products based on quantity sold.
  • The response shows that "World to Gliders" has sold 53,847 units and "Jumbo Bag" has sold 47,300 units.
  • To verify this information, an SQL query is executed directly on the database.
  • The SQL query involves calculating the total units sold for all products and then dividing each product's quantity by the total to obtain the percentage representation.
  • The results match with what our application provided.

Conclusion

The transcript covers the implementation and testing of two types of queries: a simple one for retrieving countries with high sales and a more complex one involving percentage calculations. Both queries are successfully executed using backend functions and validated against direct SQL queries on the database.

Sales Analysis

This section discusses the sales analysis of different products.

Top Selling Products

  • The top-selling product is "World War 2" with 53,847 units sold, accounting for 1.04% of total sales.
  • The second best-selling product is "Jumbo Back" with 47,367 units sold.

Application of Artificial Intelligence in Business

  • This video showcases another application of artificial intelligence in business and how it can add value to companies as data scientists.

Remember to download the transcript for a more detailed understanding.

Video description

👉 Descarga todos los ficheros para replicar la app del vídeo: 1. Haz click (o copia-pega) en este link: https://bit.ly/chatdatabase 2. Te llevará a una página con las instrucciones y un botón rojo 3. Haz click en el botón rojo 4. Te aparecerá un nuevo botón que pone “Suscribe” para suscribirte al canal 5. Haz click y vuelve a la página anterior 6. Descarga los ficheros En este vídeo vamos a construir una app de inteligencia artificial que: - permita a un usuario decir en lenguaje natural qué consulta quiere hacer - usando langchain la consulta le llegue a la api de ChatGpt   - conecte con una base de datos relacional y haga la consulta - devuelva el resultado en lenguaje natural El poder de este sistema es que para un usuario de negocio ya no será necesario: - aprender SQL - aprender herramientas de BI personal como PowerBI o Tableau - tener que hacer cada petición al equipo de data science / data analytics Herramientas usadas para esta app: • Python • Langchain • Api ChatGPT • Streamlit • Inteligencia artificial