Scanning Microsoft Power BI Resource in Cloud Data Governance and Catalog (CDGC)
How to Scan Microsoft Power BI Resources in Cloud Data Governance
Overview of Microsoft Power BI and CDGC
- Vivek Sim introduces himself as a Senior Solutions Specialist at Informatica, discussing the focus on scanning Microsoft Power BI resources within Cloud Data Governance and Catalog (CDGC).
- The session will cover an overview of Microsoft Power BI, the objects that can be extracted into CDGC, prerequisites for configuration, resource creation in Metadata Command Center (MCC), job monitoring, and demonstrations.
- Emphasis is placed on the fact that the discussion pertains specifically to Microsoft Power BI Cloud rather than any on-premises versions.
Objects Extracted from Power BI
- Key objects that can be extracted from Microsoft Power BI into CDGC include workspaces, dashboards, datasets, reports, tiles, and workbooks.
- It is noted that column-level information for Power BI reports cannot be extracted into CDGC.
Prerequisites for Configuration
- Before configuring the Microsoft Power BI resource in Metadata Command Center (MCC), organizations must ensure they have the required licenses.
- CDGC supports two types of connections:
  - Admin user connection, which requires a URL, client ID, username, and password.
  - Service principal connection, which requires a URL, client ID, tenant ID, and client secret.
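The two connection types differ only in which credentials they need, which can be sketched as two small records with a completeness check. This is an illustrative sketch, not the MCC data model: the class names, field names, and the `missing_fields` helper are all hypothetical.

```python
from dataclasses import dataclass, fields
from typing import List, Union


@dataclass(frozen=True)
class AdminUserConnection:
    # Fields mirror the admin user connection described above (names are hypothetical).
    url: str
    client_id: str
    username: str
    password: str


@dataclass(frozen=True)
class ServicePrincipalConnection:
    # Fields mirror the service principal connection described above (names are hypothetical).
    url: str
    client_id: str
    tenant_id: str
    client_secret: str


Connection = Union[AdminUserConnection, ServicePrincipalConnection]


def missing_fields(conn: Connection) -> List[str]:
    """Return the names of any fields left empty or whitespace-only."""
    return [f.name for f in fields(conn) if not getattr(conn, f.name).strip()]
```

A check like this could be run before saving a resource, so that a missing tenant ID or client secret is caught before the first scan fails.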
Setting Up Admin User Credentials
- Instructions are provided for logging into the Azure portal and creating a new user if one does not already exist. This includes providing an alias name and a display name.
- After creating the user account, roles such as Microsoft 365 Global Administrator or Power BI Service Administrator need to be assigned.
Creating a Native Application
- Steps are outlined for creating a native application in Azure Active Directory including registration and setting API permissions necessary for accessing Power BI services.
- Important settings include enabling public client flows under advanced settings after granting admin consent for required permissions.
Admin User Setup in Workspaces
- The created user must also be added as an admin user across all relevant workspaces to refresh datasets effectively.
- Refreshing datasets involves selecting each workspace under the admin portal before proceeding with scans.
Resource Creation in Metadata Command Center
- A new resource for Microsoft Power BI is created by entering details like cloud URL and client ID from the previously created native application.
- Options are available to include/exclude personal workspaces or filter specific workspaces based on modification dates during setup.
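The effect of the personal-workspace option and the modification-date filter can be sketched as a simple selection over workspace records. This is only an illustration of the filtering logic, not how MCC implements it: the record keys (`name`, `is_personal`, `modified`) and the function name are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


def select_workspaces(
    workspaces: List[Dict],
    include_personal: bool,
    modified_within_days: int,
    now: Optional[datetime] = None,
) -> List[str]:
    """Return names of workspaces that pass both filters.

    Each workspace is a dict with hypothetical keys:
    'name' (str), 'is_personal' (bool), 'modified' (timezone-aware datetime).
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=modified_within_days)
    selected = []
    for ws in workspaces:
        if ws["is_personal"] and not include_personal:
            continue  # personal workspaces are excluded by configuration
        if ws["modified"] < cutoff:
            continue  # not modified recently enough
        selected.append(ws["name"])
    return selected
```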
Connection Assignment Process
- After running successful scans of resources in MCC, connection assignments must be made to link extracted data with actual resources.
- Monitoring connection assignments shows which connections still require assignment; completing them ensures proper linkage between Power BI reports and the Snowflake tables they use.
Viewing Extracted Objects in Data Governance Catalog
- Once connection assignments are completed successfully, users can browse Data Governance and Catalog, where all objects extracted from Power BI appear.
Power BI Service Principal Setup and Dashboard Overview
Overview of Power BI Reports
- The discussion begins with an introduction to a Power BI report, specifically the "Superstore dashboard," which features two tiles.
Creating a Service Principal in Azure
- Instructions are provided for logging into the Azure portal, navigating to Azure Active Directory, and creating a service principal through app registrations.
- After registration, it is crucial to note the application client ID and directory tenant ID for future resource creation in MCC (Metadata Command Center).
Configuring API Permissions
- Steps include generating a new client secret under certificates and secrets, noting that this value will be masked after creation.
- Users must add permissions for the Power BI service by selecting delegated permissions and granting admin consent.
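Before entering the service principal's values in MCC, it can help to confirm that the client ID, tenant ID, and client secret actually yield a token. The sketch below builds (but does not send) a client-credentials token request to the Microsoft identity platform. Whether CDGC authenticates in exactly this way is an assumption, and the function name is hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Default scope for the Power BI service when using the client-credentials flow.
POWER_BI_SCOPE = "https://analysis.windows.net/powerbi/api/.default"


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> Request:
    """Build a POST request for an access token; the caller decides when to send it."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": POWER_BI_SCOPE,
        }
    ).encode("ascii")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

Passing the returned request to `urllib.request.urlopen` would send it; a JSON response containing an `access_token` field indicates that the three values are consistent.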
Group Management for Service Principals
- A new group should be created in Azure AD where the service principal can be added as a member.
Admin Portal Configuration
- In the Power BI admin portal, settings must be adjusted to allow the service principal to use APIs. This includes enabling read-only admin APIs and enhanced responses for metadata.
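One way to check that the admin API setting has taken effect is to call a read-only admin endpoint with the service principal's token. The sketch below builds a request for the Power BI admin "modified workspaces" endpoint without sending it. That CDGC uses this particular endpoint is an assumption, the function name is hypothetical, and the timestamp format shown is one ISO 8601 UTC form that should be checked against the Power BI REST API reference.

```python
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

ADMIN_BASE = "https://api.powerbi.com/v1.0/myorg/admin"


def build_modified_workspaces_request(
    access_token: str,
    modified_since: Optional[datetime] = None,
    exclude_personal: bool = True,
) -> Request:
    """Build a GET request listing workspace IDs visible to the admin APIs."""
    params = {"excludePersonalWorkspaces": str(exclude_personal)}
    if modified_since is not None:
        # Expects a timezone-aware datetime; it is converted to UTC before formatting.
        params["modifiedSince"] = modified_since.astimezone(timezone.utc).strftime(
            "%Y-%m-%dT%H:%M:%SZ"
        )
    url = f"{ADMIN_BASE}/workspaces/modified?{urlencode(params)}"
    return Request(url, headers={"Authorization": f"Bearer {access_token}"}, method="GET")
```

If the service principal has not been allowed to use the read-only admin APIs, sending this request is expected to fail with an authorization error rather than return workspace IDs.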
Granting Workspace Access
- To refresh datasets effectively, access must be granted at each workspace level by adding the service principal as an admin.
Resource Creation in Metadata Command Center
- The process involves logging into MCC and creating a new Microsoft Power BI resource, filling in the necessary details such as the client ID and tenant ID.
Scanning Workspaces
- Users can specify filters for workspaces modified within a certain timeframe during resource setup. Stakeholder assignments can also be configured here.
Monitoring Resource Execution
- After saving the resource, users can run it and monitor the execution statistics once the scan completes.
Data Governance Insights
- In Data Governance and Catalog, users can browse the objects extracted from Power BI, including datasets, reports, and dashboards.
Exploring Dashboard Lineage
- The lineage of dashboards is explored; specific references are made to dataset relationships within the Superstore dashboard context.
Detailed View of Superstore Dashboard Components
- The Superstore workspace contains various elements such as datasets named "Superstore sales on Snowflake" along with associated reports and dashboards.
Final Observations on Dashboard Tiles
Connection Assignment in Power BI and Snowflake
Overview of the Dashboard
- The dashboard shows return orders by customer and has two tiles, both of which also appear in CDGC.
- The first object is labeled as a reference data set because its connection assignment has not been completed.
Performing Connection Assignment
- Navigate to Metadata Command Center; select "Monitor" followed by "Connection Assignment."
- Identify the Power BI resource with two reference data sets; recognize that GCS_HC_Vivek is a schema under a Snowflake resource.
Assigning Schema to Snowflake Resource
- Locate the appropriate Snowflake resource and confirm that the schema matches GCS_HC_Vivek, where actual tables reside.
- Execute the assignment process for this schema to ensure proper linkage with Power BI.
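The matching step behind a connection assignment can be sketched as a lookup from each reference schema name to the catalogued resource that contains a schema of the same name. This only illustrates the idea; it is not MCC's actual logic. The function name and the resource name `Snowflake_Demo` in the usage note are hypothetical, and the case-insensitive comparison assumes unquoted Snowflake identifiers.

```python
from typing import Dict, List, Optional


def propose_assignments(
    reference_schemas: List[str],
    catalog_schemas: Dict[str, List[str]],
) -> Dict[str, Optional[str]]:
    """Map each reference schema to the first resource holding a same-named schema.

    catalog_schemas maps a resource name to the schema names it contains.
    Unmatched reference schemas map to None and still need manual assignment.
    """
    # Index catalogued schemas by upper-cased name for case-insensitive lookup.
    index: Dict[str, str] = {}
    for resource, schemas in catalog_schemas.items():
        for schema in schemas:
            index.setdefault(schema.upper(), resource)
    return {ref: index.get(ref.upper()) for ref in reference_schemas}
```

For example, `propose_assignments(["GCS_HC_Vivek"], {"Snowflake_Demo": ["PUBLIC", "GCS_HC_VIVEK"]})` maps `GCS_HC_Vivek` to `Snowflake_Demo`, while a schema that no catalogued resource contains maps to `None`.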
Finalizing Connection Assignment
- After successful completion of the connection assignment job, return to Data Governance and Catalog, then refresh the dashboard.