Unsupervised Learning: Dimensionality Reduction
Introduction to Unsupervised Learning
Overview of Unsupervised Learning
- The lecture introduces the unsupervised learning paradigm, contrasting it with supervised learning, which focuses on regression and classification tasks.
- Unsupervised learning is described as a vaguer, less sharply defined problem that typically serves as a pre-processing step rather than an end goal.
- The primary aim of unsupervised learning is to build models that compress, explain, and group data.
Practical Example of Unsupervised Learning
- An example illustrates how unsupervised learning can be applied in marketing by summarizing large volumes of tweets about Coca-Cola.
- The challenge involves grouping millions of tweets into manageable categories for effective reporting to management.
- Grouping tweets allows for easier interpretation and actionable insights, highlighting the importance of human interpretation post-analysis.
Dimensionality Reduction in Unsupervised Learning
Purpose and Application
- Dimensionality reduction aims at compression and simplification, particularly useful in handling large datasets like gene expression levels from numerous individuals.
- A practical scenario involves managing a massive matrix representing gene expressions across many subjects, emphasizing the need for data transmission efficiency.
Mechanism of Dimensionality Reduction
- The process involves creating two models: an encoder that compresses high-dimensional data into lower dimensions (d') and a decoder that reconstructs the original data from this compressed format.
Understanding Encoder-Decoder Mechanisms
The Goal of Encoding and Decoding
- The primary objective is for g(f(x_i)) to approximate x_i. A perfect match would yield zero reconstruction error, but a close approximation is acceptable.
- To measure the approximation, compute the squared norm of the difference, ||g(f(x_i)) - x_i||^2, and aim to minimize its sum over all training inputs.
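This objective can be sketched in a few lines of Python (NumPy is assumed; `f` and `g` stand for any candidate encoder/decoder):

```python
import numpy as np

def reconstruction_error(f, g, X):
    """Sum of squared reconstruction errors: sum_i ||g(f(x_i)) - x_i||^2."""
    return sum(float(np.sum((g(f(x)) - x) ** 2)) for x in X)

# A perfect encoder/decoder pair (here, the identity) gives zero error.
X = [np.array([1.0, 0.8]), np.array([2.0, 2.2])]
print(reconstruction_error(lambda x: x, lambda u: u, X))  # 0.0
```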
Dimensionality Reduction Example
- An example illustrates dimensionality reduction where input data has 2 dimensions (d = 2) and is reduced to 1 dimension (d' = 1).
- Four training points are provided: (1, 0.8), (2, 2.2), (3, 3.2), and (4, 3.8).
Encoder and Decoder Functions
- The encoder function f(x) maps a two-dimensional vector to a scalar by calculating x_1 - x_2 .
- The decoder function g(u) takes this scalar output and returns a two-dimensional vector in the form of (u, u) .
Evaluation of Initial Encoder/Decoder Pair
- Testing the first encoder-decoder pair reveals poor performance: distinct inputs such as (1, 0.8) and (4, 3.8) both encode to 0.2 and decode to the same output (0.2, 0.2), so the original points cannot be told apart.
- This initial setup fails to retain original input information despite reducing dimensionality.
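A minimal sketch of this first pair on the four training points (NumPy assumed; the function and variable names are mine):

```python
import numpy as np

def f(x):
    """First encoder: maps a 2-D point to the scalar x_1 - x_2."""
    return x[0] - x[1]

def g(u):
    """First decoder: maps a scalar back to the 2-D point (u, u)."""
    return np.array([u, u])

points = [np.array(p) for p in [(1, 0.8), (2, 2.2), (3, 3.2), (4, 3.8)]]
for x in points:
    print(x, "-> code", round(f(x), 3), "-> reconstruction", g(f(x)))
# Both (1, 0.8) and (4, 3.8) encode to roughly 0.2, so they decode to the
# same point (0.2, 0.2) -- the compression has discarded too much.
```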
Improved Encoder/Decoder Functions
- A new pair of functions is introduced: f̃(x) = (x_1 + x_2)/2 and g̃(u) = (u, u).
- Applying these functions yields better results with outputs such as (0.9, 0.9), which are closer to their respective original points.
Comparison of Performance
- The improved encoder-decoder pair demonstrates significantly better accuracy in reconstructing original data compared to the first pair.
- Visual representation confirms that outputs from the second set closely align with original data points while maintaining proximity among them.
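The difference can be checked numerically. The sketch below (NumPy assumed, helper names mine) compares the total squared reconstruction error of the two pairs on the four training points:

```python
import numpy as np

points = [np.array(p) for p in [(1, 0.8), (2, 2.2), (3, 3.2), (4, 3.8)]]

def total_error(f, g):
    """Sum of ||g(f(x)) - x||^2 over the training points."""
    return sum(float(np.sum((g(f(x)) - x) ** 2)) for x in points)

# First pair: f(x) = x_1 - x_2
err_first = total_error(lambda x: x[0] - x[1], lambda u: np.array([u, u]))
# Improved pair: f~(x) = (x_1 + x_2) / 2
err_tilde = total_error(lambda x: (x[0] + x[1]) / 2, lambda u: np.array([u, u]))

print(err_first, err_tilde)  # the improved pair's error is far smaller
```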
Dimensionality Reduction Algorithm Overview
Simplified Dimensionality Reduction Process
- The discussion introduces a simplified dimensionality reduction algorithm that operates by selecting between two pairs of encoder-decoder functions, denoted f, g and f̃, g̃.
- In practice, a full algorithm would search over a much larger (in principle infinite) family of candidate encoder and decoder functions to determine the optimal pair for dimensionality reduction.
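Restricted to just two candidate pairs, the simplified algorithm reduces to picking whichever pair has the smaller total reconstruction error (a sketch, assuming NumPy; the dictionary of candidates is mine):

```python
import numpy as np

points = [np.array(p) for p in [(1, 0.8), (2, 2.2), (3, 3.2), (4, 3.8)]]

# The two candidate encoder/decoder pairs from the lecture.
candidates = {
    "f, g":   (lambda x: x[0] - x[1],       lambda u: np.array([u, u])),
    "f~, g~": (lambda x: (x[0] + x[1]) / 2, lambda u: np.array([u, u])),
}

def total_error(pair):
    """Total squared reconstruction error of an (encoder, decoder) pair."""
    f, g = pair
    return sum(float(np.sum((g(f(x)) - x) ** 2)) for x in points)

best = min(candidates, key=lambda name: total_error(candidates[name]))
print(best)  # -> f~, g~
```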