Curso Básico Power BI 2025 - Aula 2 - Editando a Base de Dados com Power Query
Introduction to Power BI - Lesson 2
Overview of the Lesson
- The second lesson of the basic Power BI course continues from the first, which covered installation and initial edits. This session focuses on importing tables and automating data treatment processes in Power Query.
Course Engagement
- Viewers are encouraged to like the video if they enjoy the course content, helping it reach a wider audience. Additionally, free downloadable materials for practice are provided in the video description.
Importing Data into Power BI
- The instructor revisits previous lessons by importing a new table related to clients after having imported a store registration table in the last session.
- The client table includes essential information such as client code, name (formatted as surname, first name), gender, and date of birth.
Steps for Data Importation
- To import data into Power BI, navigate to "Home" tab > "Import Excel Workbook," then locate and select the desired file (e.g., client registration).
- After selecting the sheet containing data, users should check it and click "Transform Data" to edit before finalizing its import into Power BI.
Editing Data in Power Query
Renaming Tables
- It is important to rename imported tables meaningfully; for instance, changing "Sheet1" to "Client Registration."
Improving Table Format
- The instructor highlights that names may be formatted incorrectly (e.g., reversed order). Editing directly in Excel isn't ideal due to consistent formatting issues; thus, using Power Query is recommended.
Automation Benefits
- Edits made within Power Query are saved as applied steps. If new clients are added later, simply clicking "Refresh" will automatically apply all previous adjustments.
Using Example Column Feature
Text Manipulation Tool
- A powerful tool within Power Query is the “Column from Examples” feature found under “Add Column.” This allows users to specify how they want text columns formatted through examples.
Providing Examples for Formatting
- Users must provide examples of how they want their data transformed. For instance, entering correctly formatted names helps guide Power Query's understanding of desired outcomes.
Finalizing Changes
Completing Edits
- After providing sufficient examples for formatting names correctly (e.g., Juana P. Apoliano), users can finalize changes by clicking OK.
Renaming New Columns
- Once a new column with properly formatted names is created, it's advisable to delete any original columns that contained incorrect formats for clarity and organization.
This structured approach ensures that learners can easily follow along with each step while also grasping key concepts discussed throughout this lesson on using Power BI effectively.
Understanding Power Query Functionality
The Independence of Client Columns
- The client column does not rely on the full name column to function, as demonstrated by removing the full name column without affecting client data.
Power Query's Non-Interference with Excel
- Edits made in Power Query do not alter the original Excel spreadsheet; changes are isolated within Power Query.
Historical Tracking of Changes
- Power Query saves all applied steps, allowing users to review and revert to previous stages of their data transformations.
Data Integrity and Safety
- Users can safely delete columns in Power Query since all actions are recorded in history, ensuring that original data remains intact in Excel.
Efficient Updates for New Data
- When new clients or stores are added, users can simply refresh the query to apply all previous transformations automatically.
Enhancing Data Analysis with Gender and Age Information
Improving Gender Representation
- To create more informative sales graphs, gender abbreviations (M/F) need to be replaced with full terms (Masculine/Feminine).
Quick Substitution Techniques
- Users can utilize the Transform tab in Power Query for quick substitutions of values within a column.
Importance of Age Over Birth Date Alone
- While birth dates provide essential information, knowing a client's age is crucial for deeper analysis like age demographics.
Analyzing Age Groups for Sales Insights
- Understanding age groups allows businesses to analyze sales trends among different demographics effectively.
Calculating Age from Birth Dates
Creating an Age Column
- A new column should be created to calculate age based on birth dates rather than replacing existing date information.
Calculating Years Instead of Days
- The initial calculation provides age in days; however, it needs adjustment to reflect years for practical use.
Understanding Age Calculation and Categorization in Data Analysis
Working with Date and Time Data
- The speaker discusses options for working with date and time data, specifically focusing on calculating age. They demonstrate how to view age in total years.
- It is noted that the calculated ages may vary based on the current date settings on different computers, leading to potential discrepancies in displayed values.
Rounding Age Values
- The speaker emphasizes the importance of rounding down age values to avoid unnecessary decimal places, stating that knowing someone is 33.7 years old is less relevant than simply knowing they are 33.
Analyzing Age Groups for Marketing Strategies
- The need to categorize individuals into age groups (e.g., young, adult, senior) is introduced as a strategy for analyzing product sales across different demographics.
- By understanding which products appeal to specific age groups, marketers can tailor their advertising strategies effectively.
Creating Conditional Columns for Age Categories
- A new column will be created to classify individuals into defined age categories: young (up to 25), adult (26 to 55), and senior (56+).
- The speaker outlines an example categorization system but notes that users can define categories according to their preferences.
Implementing Conditional Logic in Data Columns
- Instructions are provided on how to create a conditional column that assigns an age category based on specified criteria using Power Query.
- The process involves setting conditions where if a person's age is less than or equal to 25, they are categorized as "young"; otherwise, further conditions determine if they fall into "adult" or "senior."
Finalizing Data Types in Power BI
- It’s highlighted that each column must have a defined data type within Power BI for proper functionality—dates should be recognized as dates, text as text, and numbers accordingly.
- The importance of ensuring the newly created categorical column has an appropriate data type is emphasized for effective analysis and visualization.
Power Query: Simplifying Data Transformation
Introduction to Power Query Editing
- The speaker demonstrates how to edit a client table in Power Query, emphasizing the simplicity of using the mouse for navigation and minimal keyboard input.
- To add a new table, the speaker suggests clicking "Close & Apply" to transfer the updated client table into Power BI.
Importing New Data Sources
- The process of importing data from an Excel workbook is explained, highlighting the option to bring in a product registration table directly from Power Query.
- Within Power Query, there are limited options when importing; only transformation options are available instead of direct loading into Power BI.
Handling Decimal Separator Issues
- The speaker identifies a common issue where unit prices and costs use periods as decimal separators instead of commas, leading to incorrect data representation in Power Query.
- An example shows that prices like 188.5 appear incorrectly as 188 50 due to this formatting discrepancy between English and Portuguese conventions.
Correcting Data Formatting
- The need for correction arises because imported data often misinterprets decimal points as text rather than numerical values.
- To rectify this, the speaker explains dividing numbers by 100 to adjust for two decimal places correctly.
Merging Sales Tables
- The discussion transitions to merging sales tables from different years (2022, 2023, and 2024), which are initially separated but should be combined for better analysis.
- By stacking these tables vertically within Power Query, users can analyze historical sales more effectively without editing each year’s data individually.
How to Combine Tables in Power BI
Combining Multiple Tables into One
- The speaker discusses the process of combining multiple tables in Power BI, emphasizing that they can be merged into a single table for better data management.
- Two options are available when adding queries: "Append Queries" and "Append Queries as New." The latter creates a new table while keeping the original tables intact.
- Using "Append Queries" without creating a new one would result in a disorganized table, mixing data from different years (2022, 2023, 2024).
- The speaker opts for "Append Queries as New," allowing the creation of a comprehensive sales database while preserving individual year tables.
- After selecting the three tables to combine, the speaker confirms their choice and prepares to create a new consolidated table.
Finalizing Table Creation
- Once all desired tables are selected (2022, 2023, and 2024), the speaker names the new table “Base Vendas” (Sales Base).
- After naming the table, modifications are finalized by clicking on “Close & Apply,” which loads changes into Power BI.
- Unlike Excel's limit of one million rows, Power BI can handle millions of rows efficiently without significant performance issues.
Data Management and Updates
- The speaker highlights that all created tables remain accessible within Power BI’s interface for future reference or edits.
- Any updates made in Excel will reflect in Power BI upon refreshing; this includes sorting customer data by ID for better organization.
Testing Data Refresh
- A demonstration is provided where a new client is added directly to an Excel sheet to test if it appears in Power BI after refresh.
- The speaker adds a new client with specific details but notes that it initially does not appear in Power BI until refreshed.
Refreshing Data in Power BI
- By clicking on “Refresh,” all data is reloaded from scratch. This ensures any recent changes or additions from Excel are incorporated into Power BI seamlessly.
- Upon completion of the refresh process, updated information reflects accurately within Power BI—showing corrected formats and calculated fields like age group classifications.
Editing Capabilities Post Refresh
- All transformations made during previous sessions remain saved. Users can return to edit further by accessing “Transform Data” within Power BI.
Power BI Data Management Techniques
Hiding Unnecessary Tables in Power BI
- The speaker discusses the organization of data tables within Power BI, noting that there are multiple tables for sales, clients, stores, and products. However, not all tables need to be displayed if their information is already consolidated in a larger sales table.
- To streamline the view in Power BI, the speaker demonstrates how to hide specific sheets (tables) from being displayed. This involves right-clicking on the table and selecting an option to disable its load into Power BI.
- The process of disabling the load for each unnecessary table (2022, 2023, and 2024) is explained. By unchecking this option, these tables will no longer be loaded into Power BI while still being available for data processing in Power Query.
- After applying these changes by clicking "Close & Apply," the hidden tables will no longer occupy space within Power BI's interface but remain functional for data aggregation purposes.
Next Steps in Data Relationships
- The session concludes with a preview of upcoming lessons focused on establishing relationships between different tables. This functionality will allow users to integrate information from various sources into a single report effectively. Viewers are encouraged to engage with the content by liking and subscribing for future updates.