Overview of Health Care Data Analytics
Introduction to Healthcare Data Analytics
Overview of Healthcare Data
- The lecture introduces the basics of healthcare data analytics, focusing on various types of data and available technologies.
- It discusses the definition of big data and the unique challenges associated with managing healthcare data.
Importance of Analytics in Healthcare
- A quote from Peter Sondergaard likens information to the oil of the 21st century and analytics to the combustion engine, underscoring how central data has become to modern healthcare.
- The Institute of Medicine highlights complexities and inefficiencies in America's healthcare system that hinder quality care and economic stability.
Learning Healthcare Systems
- A learning healthcare system is defined as one that generates and applies evidence for collaborative patient-provider decisions, fostering continuous improvement in care quality.
Data Aggregation Challenges
Information Systems in Hospitals
- Hospitals utilize various electronic health record systems alongside specialized departmental systems (e.g., lab, imaging), each capturing specific patient data but lacking comprehensive datasets.
Need for Centralized Data
- To gain insights into individual patients or groups, it's essential to aggregate data from multiple systems for analysis and reporting purposes.
Clinical Data Warehousing
ETL Process Explained
- A clinical data warehouse consolidates patient data from different sources using an ETL (Extraction, Transformation, Load) process to prepare it for analysis.
- The transformation step ensures consistency across varying formats used by different systems (e.g., gender representation).
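The ETL steps above can be sketched in a few lines of Python. This is a minimal illustration, not the design of any particular warehouse: the two source extracts, the gender codes, the target value set, and the `patient` table are all assumptions made up for the example, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical extracts: each source system encodes gender differently.
LAB_ROWS = [{"patient_id": "L-001", "gender": "M"},
            {"patient_id": "L-002", "gender": "F"}]
EHR_ROWS = [{"patient_id": "E-100", "gender": "1"},
            {"patient_id": "E-101", "gender": "9"}]

# Assumed target value set for the warehouse (illustrative only).
GENDER_MAP = {"M": "male", "F": "female", "U": "unknown",
              "1": "male", "2": "female", "9": "unknown"}


def extract():
    """Extraction: pull rows from every source system."""
    for source, rows in (("lab", LAB_ROWS), ("ehr", EHR_ROWS)):
        for row in rows:
            yield source, row


def transform(source, row):
    """Transformation: map source-specific codes onto one value set."""
    gender = GENDER_MAP.get(row["gender"], "unknown")
    return (source, row["patient_id"], gender)


def load(conn, records):
    """Load: write the harmonized records into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS patient "
        "(source TEXT, patient_id TEXT, gender TEXT)"
    )
    conn.executemany("INSERT INTO patient VALUES (?, ?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for the real warehouse
load(conn, [transform(source, row) for source, row in extract()])
warehouse_rows = conn.execute(
    "SELECT source, patient_id, gender FROM patient ORDER BY patient_id"
).fetchall()
print(warehouse_rows)
```

Keeping the three steps as separate functions means the code mapping can change without touching extraction or loading.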
Linking Patient Records
- A master patient index is necessary to link a patient's identifiers across various systems effectively.
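As a rough sketch of the idea, a master patient index can be pictured as a lookup from (source system, local identifier) to a single enterprise identifier. The systems, identifiers, and record fields below are hypothetical. Real indexes also rely on demographic matching, which is not shown here.

```python
# Hypothetical master patient index:
# (source system, local identifier) -> enterprise-wide patient identifier.
MPI = {
    ("ehr", "E-100"): "P1",
    ("lab", "L-001"): "P1",
    ("imaging", "I-77"): "P1",
    ("ehr", "E-101"): "P2",
}


def enterprise_id(system, local_id):
    """Return the enterprise identifier, or None if the record is unmatched."""
    return MPI.get((system, local_id))


def link_records(records):
    """Group records from different systems under one enterprise identifier."""
    linked = {}
    for rec in records:
        eid = enterprise_id(rec["system"], rec["local_id"])
        if eid is None:
            continue  # unmatched records would need manual review
        linked.setdefault(eid, []).append(rec)
    return linked


records = [
    {"system": "ehr", "local_id": "E-100", "item": "visit"},
    {"system": "lab", "local_id": "L-001", "item": "blood culture"},
    {"system": "ehr", "local_id": "E-101", "item": "visit"},
    {"system": "lab", "local_id": "L-999", "item": "glucose"},  # not in the index
]
linked = link_records(records)
for eid, recs in linked.items():
    print(eid, [r["item"] for r in recs])
```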
Understanding Analytics
Definition and Scope of Analytics
- Analytics encompasses discovering meaningful patterns in data; it includes steps like collection, preparation, analysis, interpretation, and reporting.
Types of Analytics
- According to NIST's formal definition from 2015, analytics involves methods and tools used to derive value from raw data through systematic processes.
Understanding Diagnostic, Predictive, and Prescriptive Analytics
Overview of Analytics Types
- Diagnostic analytics, one of four types of analytics, is a form of advanced analytics that examines data to answer the question: "Why did it happen?" It sits within a broader framework in which descriptive analytics forms the simplest level.
- Diagnostic analytics is more valuable but also more complex than descriptive analytics. Predictive analytics comes next, being more challenging yet more valuable still, while prescriptive analytics represents the highest complexity and value.
Descriptive Analytics
- Descriptive analytics simply describes data using common statistics such as laboratory test counts or average patient age. Results are often presented as pie charts, bar charts, tables, or narratives.
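For instance, both statistics mentioned above can be computed with the Python standard library. The test names and ages are invented sample data.

```python
from collections import Counter
from statistics import mean

# Invented sample data for illustration.
lab_tests = ["CBC", "glucose", "CBC", "blood culture", "CBC", "glucose"]
patient_ages = [34, 51, 67, 45, 29]

test_counts = Counter(lab_tests)   # how many times each test was ordered
average_age = mean(patient_ages)   # arithmetic mean of patient ages

# Present the counts as a simple table, most frequent test first.
for test, count in test_counts.most_common():
    print(f"{test:<15}{count}")
print(f"Average age: {average_age:.1f}")
```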
Diagnostic Analytics Tools
- Tools for diagnostic analytics include drill-down techniques, data discovery methods, and correlation analysis. An example from Kaiser Permanente illustrates how they analyzed infant data to classify sepsis risk in newborns.
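Correlation analysis, one of the tools named above, can be illustrated with a hand-written Pearson coefficient. The two variables and their values are made up for the sketch and are unrelated to the Kaiser Permanente study. A strong correlation points to where to look next, but does not by itself establish a cause.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# Made-up example: number of active medications vs. length of stay in days.
medications = [2, 4, 5, 7, 9]
length_of_stay = [1, 3, 3, 6, 8]
r = pearson(medications, length_of_stay)
print(f"r = {r:.2f}")
```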
Understanding Sepsis
- Sepsis is a life-threatening condition caused by an infection leading to systemic inflammatory responses. If untreated, it can progress to septic shock with high mortality rates.
Predictive Analytics Attributes
- Gartner outlines four attributes of predictive analytics:
  - Emphasis on prediction over description (e.g., predicting which infants may develop sepsis).
  - Rapid analysis within hours or days due to the urgency of conditions like sepsis.
  - Business relevance of insights that directly impact care decisions.
  - Ease of use for clinical staff without requiring extensive technical knowledge.
Limitations of Predictive Analytics
- Michael Wu emphasizes that predictive analytics cannot tell us what will happen in the future; it can only forecast what might happen, because its models are probabilistic.
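The probabilistic nature of such forecasts can be seen in a toy logistic model. The feature names, the intercept, and the weights below are invented for illustration and have no clinical validity. The point is only that the output is a probability between 0 and 1, not a statement of what will happen.

```python
from math import exp

# Invented coefficients for illustration only; not a validated clinical model.
INTERCEPT = -4.0
WEIGHTS = {"temp_above_38": 1.5, "elevated_heart_rate": 1.0}


def predicted_risk(features):
    """Return a probability in (0, 1) from a logistic model."""
    z = INTERCEPT + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + exp(-z))


low = predicted_risk({"temp_above_38": 0, "elevated_heart_rate": 0})
high = predicted_risk({"temp_above_38": 1, "elevated_heart_rate": 1})
print(f"no risk factors: {low:.3f}, both risk factors: {high:.3f}")
```

Even the higher of the two values is still a likelihood. A given patient in that group may or may not go on to develop the condition.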
Prescriptive Analytics Definition
- Prescriptive analytics answers "What should be done?" using techniques such as graph analysis and machine learning. It represents the most advanced form of data analysis.
Steps in Data Analysis Process
- Identify the Problem: Clearly define what needs to be studied and why it matters for patient care or the institution. Identify the stakeholders who have a vested interest in the results.
- Data Identification: Determine what data is needed and where it resides across systems; identify responsible contacts for retrieval efforts.
- Plan Development: Create a plan for both analysis and retrieval processes while ensuring all necessary records are accounted for during extraction steps.
Data Retrieval and Analysis Process
Steps in Data Retrieval and Preparation
- A data retrieval process begins with developing an analysis plan, which requires consultation with a statistician to address key questions such as the target population, sample size, and appropriate statistical tests.
- After formulating the plan, data extraction from relevant systems occurs. It's crucial to check for completeness to ensure all necessary records have been retrieved.
- If any discrepancies are found during the completeness check, adjustments to the extraction plan may be needed, leading to another round of data extraction from source systems.
- Once a complete dataset is obtained, errors must be identified and corrected. Common issues include transposed letters in names or incorrect values that need addressing.
- Data synchronization is essential; for instance, patient gender might be coded differently across systems (e.g., "M/F/U" in one and "1/2/9" in another). All records must use consistent value sets before being imported into the destination system for analysis.
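The completeness and consistency checks described in these steps might look like the sketch below. The function names, the record layout, and the allowed value set are hypothetical. The checks only flag problems, leaving the correction to a later step.

```python
def completeness_report(expected_ids, extracted_rows, key="patient_id"):
    """Compare the identifiers we expected with those actually extracted."""
    expected = set(expected_ids)
    extracted = {row[key] for row in extracted_rows}
    return {
        "missing": sorted(expected - extracted),      # need another extraction
        "unexpected": sorted(extracted - expected),   # outside the target population
    }


def rows_outside_value_set(rows, field, allowed):
    """Return rows whose value for `field` is not in the agreed value set."""
    return [row for row in rows if row.get(field) not in allowed]


# Hypothetical extraction result.
expected_ids = ["P1", "P2", "P3"]
extracted_rows = [
    {"patient_id": "P1", "gender": "M"},
    {"patient_id": "P2", "gender": "2"},  # coded with a different value set
]

report = completeness_report(expected_ids, extracted_rows)
bad_rows = rows_outside_value_set(extracted_rows, "gender", {"M", "F", "U"})
print(report)
print(bad_rows)
```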
Conducting Data Analysis
- Once the data has been imported into the analysis system, which may range from a complex clinical data warehouse to a simple desktop setup, verify that it is ready for the analyses laid out in the original plan.
- The actual analysis involves executing the planned statistical analyses while collaborating with a statistician for accurate interpretation of results.
Communicating Results Effectively
- Clear communication of findings is critical for decision-makers. Selecting appropriate visualizations based on data type enhances understanding; categorical data can utilize bar charts or tables while quantitative data may employ histograms or scatter plots.
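The rule of thumb above can be written down as a small helper, shown here with a text-only bar chart for categorical data. Both function names are made up for this sketch, and the rule is deliberately simplistic: it treats any all-numeric list as quantitative.

```python
from collections import Counter


def suggest_charts(values):
    """Suggest chart types based on whether the data is numeric or categorical."""
    if not values:
        raise ValueError("no data to visualize")
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return ["histogram", "scatter plot"] if numeric else ["bar chart", "table"]


def text_bar_chart(categories):
    """Render category counts as a plain-text bar chart, largest first."""
    counts = Counter(categories)
    width = max(len(label) for label in counts)
    return "\n".join(
        f"{label.ljust(width)} | {'#' * n} ({n})"
        for label, n in counts.most_common()
    )


admitting_units = ["ER", "ICU", "ER", "ward", "ER", "ward"]
print(suggest_charts(admitting_units))
print(text_bar_chart(admitting_units))
print(suggest_charts([36.6, 38.2, 37.1]))
```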
- Upon completing analyses and visualizations, a report must be generated detailing the original problem addressed, methodology used, results obtained, and supporting visuals. This new knowledge should then be shared with stakeholders identified initially.
Implementation of Findings
- Finally, implementing new insights requires active participation from stakeholders to effectively tackle the original problem presented at the beginning of this process.
For further reading on these topics:
- "Six Steps of an Analytics Project" by Jad Kandja