Computer Vision - Lecture 7.1 (Learning in Graphical Models: Conditional Random Fields)
Introduction to Learning in Graphical Models
Overview of Previous Lectures
- This lecture is the seventh in a series on graphical models, following an introduction and a basic inference algorithm (belief propagation) discussed in previous lectures.
- Lecture six covered applications of graphical models, emphasizing how prior knowledge can be integrated into models, such as smoothness in depth maps for stereo vision.
Focus of Current Lecture
- The current lecture will address parameter estimation within graphical models, specifically how to learn parameters from datasets.
- The structure of the lecture includes three units: terminology of conditional random fields (CRFs), learning problems with linear/log-linear parameters, and deep structured models with non-linear dependencies.
Understanding Conditional Random Fields
Definition and Importance
- Conditional random fields (CRFs) are introduced as a necessary extension of Markov random fields (MRFs), which model a single joint distribution over variables without conditioning on an input.
- As a running example, an MRF over 100 random variables for image denoising is used, with the distribution written as the exponential of a sum of log-factors (log-potentials).
Potentials and Probability Distribution
- The probability distribution defined by MRF requires normalization through a partition function that accounts for all possible states.
- Inference tasks include estimating marginal distributions and maximum a posteriori solutions relevant for computer vision applications.
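The MRF quantities above (sum of log-potentials, partition function, marginals, MAP) can be made concrete with a brute-force sketch. This is a minimal toy example, not from the lecture: a chain of 3 binary variables with illustrative unary and pairwise log-potentials.

```python
import itertools
import math

# Toy chain MRF over n = 3 binary variables (all numbers are illustrative).
# p(x) = (1/Z) * exp( sum_i phi(x_i) + sum_{neighbors} psi(x_i, x_j) )
n = 3

def phi(xi):
    # unary log-potential: slight preference for state 1
    return 0.5 if xi == 1 else 0.0

def psi(xi, xj):
    # pairwise log-potential: smoothness, neighboring agreement scores higher
    return 1.0 if xi == xj else 0.0

def log_potential_sum(x):
    # sum of all log-potentials for one full assignment x
    s = sum(phi(xi) for xi in x)
    s += sum(psi(x[i], x[i + 1]) for i in range(n - 1))
    return s

# Partition function: brute-force sum over all 2^n assignments.
states = list(itertools.product([0, 1], repeat=n))
Z = sum(math.exp(log_potential_sum(x)) for x in states)

# Marginal p(x_0 = 1) and the MAP assignment.
p0 = sum(math.exp(log_potential_sum(x)) for x in states if x[0] == 1) / Z
x_map = max(states, key=log_potential_sum)
```

Brute force is exponential in the number of variables; belief propagation from the earlier lectures exploits the graph structure to avoid this enumeration.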
Learning Parameters in Graphical Models
Parameter Estimation Challenges
- The lecture shifts focus to the learning problem: estimating parameters like lambda from datasets rather than just one instance.
- A remark clarifies that potentials admit different conventions: depending on interpretation, high potential values may correspond to high probability (or, under an energy convention, to low probability); here they are treated generically.
Need for Structured Output Learning
Introduction to Conditional Random Fields
- The mapping function f with parameters w maps an input space x to an output space y , the focus being structured output learning, where outputs are complex structured objects.
- Outputs can include various forms such as images, semantic segmentation maps, text, parse trees, and computer programs.
Definition and Structure of CRFs
- In a conditional random field (CRF), the relationship between inputs and outputs is explicitly defined; we write the distribution of y , which represents the output.
- Compared to the MRF notation, the roles of the variables are relabeled when discussing CRFs: given a noisy image x , the model produces a denoised image y .
Input and Output Variables
- The notation in Markov Random Fields (MRF) uses only x ; in CRFs, input variables are denoted x and output variables y .
- Unary potentials depend on both input and output variables. Pairwise potentials also consider these dependencies.
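A small sketch can illustrate how unary potentials couple output to input while pairwise potentials act between outputs. This is a toy 1-D denoising example with illustrative quadratic terms and weights (lam_u, lam_p), not the lecture's exact model.

```python
# Toy 1-D denoising CRF: all weights and functional forms are illustrative.
def unary(y_i, x_i, lam_u=1.0):
    # ties each output value to the observed noisy input value
    return -lam_u * (y_i - x_i) ** 2

def pairwise(y_i, y_j, lam_p=0.5):
    # smoothness prior on neighboring output values
    return -lam_p * (y_i - y_j) ** 2

def log_score(y, x, lam_u=1.0, lam_p=0.5):
    # unnormalized log p(y | x): sum of unary and pairwise terms
    s = sum(unary(yi, xi, lam_u) for yi, xi in zip(y, x))
    s += sum(pairwise(y[i], y[i + 1], lam_p) for i in range(len(y) - 1))
    return s
```

An output that stays close to the input and is smooth receives a higher (less negative) score than one that deviates everywhere.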
Learning Parameters in CRFs
- The set of all input variables is denoted as calligraphic X , while outputs are represented by calligraphic Y . Learning involves estimating model parameters.
- In supervised learning scenarios like denoising or semantic segmentation, we infer optimal parameters based on annotated input-output pairs.
Generalization of Conditional Random Fields
- This discussion provides a specific example related to image denoising but can be generalized for broader applications within CRFs.
- A more general form stacks unary, pairwise, and higher-order feature functions into one long vector that is multiplied by a parameter vector.
Inner Product Representation
- The general representation can be expressed through inner products; different parameters may exist for each unary potential or combination of pairs.
- We compute the inner product of parameter vector w with concatenated feature functions from the conditional random field.
Normalization and Probability Distribution
- To achieve a proper conditional probability distribution from this model, normalization via a partition function is necessary. The partition function depends on the input (it sums over all outputs) but is not a function of any particular output.
Feature Functions in Graphical Models
- Feature functions operate on both inputs and outputs; they may decompose according to graphical model structures.
- The dimensionality of the feature space depends on problem complexity; more flexible models with additional parameters in turn require larger datasets to learn from.
Conclusion: Tractability Through Graphical Model Structures
Concatenating Features in Graphical Models
Understanding Feature Concatenation
- The process involves concatenating unary features across all pixels and pairwise features across all neighboring pixel pairs into a single large vector.
- A parameter vector w is introduced, with dimensions defined by the number of output nodes m and the dimensionality of the feature space d .
- This flexible approach allows for individual parameters for each feature as defined by the graphical model, enhancing model complexity and adaptability.
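The concatenation step can be sketched in a few lines: with m output nodes and d features per node, the stacked feature vector and the matching parameter vector both have length m·d, so every node's features get their own weights. The numbers below are illustrative.

```python
# Concatenating per-node features into one long vector (illustrative values).
m, d = 4, 3                                   # m output nodes, d features each
node_features = [[float(i + j) for j in range(d)] for i in range(m)]

# Flatten the m x d feature table into one vector of length m * d.
phi = [f for node in node_features for f in node]

# One parameter per concatenated feature entry.
w = [0.1] * (m * d)

# Model score is the inner product of w with the concatenated features.
inner = sum(wk * fk for wk, fk in zip(w, phi))
```

Sharing parameters across nodes (using the same d weights for every node) would shrink w back to length d; the concatenated form is the fully individualized variant.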
Learning Parameters from Data
- The partition function is described as the sum over the entire output state space; since this sum is exponential in the number of output variables, exploiting the graphical model structure is what keeps learning and inference tractable.
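For a small discrete problem, the partition function Z(x) and the resulting conditional probabilities can be computed by brute force. This sketch uses a toy 3-pixel problem with binary labels and illustrative weights; it exists only to make the exponential sum concrete.

```python
import itertools
import math

# Toy CRF with binary output labels; weights are illustrative.
LABELS = (0, 1)

def score(x, y, w_unary=1.0, w_pair=0.5):
    # unnormalized log-score: unary (data) + pairwise (smoothness) terms
    s = sum(-w_unary * (yi - xi) ** 2 for yi, xi in zip(y, x))
    s += sum(-w_pair * (y[i] - y[i + 1]) ** 2 for i in range(len(y) - 1))
    return s

def partition(x):
    # Z(x): brute-force sum over all |LABELS|^len(x) output assignments --
    # exact, but exponential in the number of output variables
    return sum(math.exp(score(x, y))
               for y in itertools.product(LABELS, repeat=len(x)))

def log_prob(x, y):
    # log p(y | x) = score(x, y) - log Z(x)
    return score(x, y) - math.log(partition(x))
```

Note that Z depends on x (each input has its own normalizer), which is why CRF learning must recompute or approximate it per training example.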