Computer Vision - Lecture 7.1 (Learning in Graphical Models: Conditional Random Fields)

Introduction to Learning in Graphical Models

Overview of Previous Lectures

  • This lecture is the seventh in a series on graphical models, following an introduction and a basic inference algorithm (belief propagation) discussed in previous lectures.
  • Lecture six covered applications of graphical models, emphasizing how prior knowledge can be integrated into models, such as smoothness in depth maps for stereo vision.

Focus of Current Lecture

  • The current lecture will address parameter estimation within graphical models, specifically how to learn parameters from datasets.
  • The structure of the lecture includes three units: terminology of conditional random fields (CRFs), learning problems with linear/log-linear parameters, and deep structured models with non-linear dependencies.

Understanding Conditional Random Fields

Definition and Importance

  • Conditional random fields are introduced as a necessary extension of Markov random fields (MRFs), which model a joint distribution over a single set of variables without conditioning on an input.
  • An example MRF for image denoising with 100 random variables is revisited: the joint distribution is written as the exponential of a sum of log-factors (log-potentials).

Potentials and Probability Distribution

  • The probability distribution defined by MRF requires normalization through a partition function that accounts for all possible states.
  • Inference tasks include estimating marginal distributions and maximum a posteriori solutions relevant for computer vision applications.
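The role of the partition function and of marginal inference can be made concrete on a tiny toy MRF. The following is a minimal sketch, not the lecture's model: a chain of three binary variables with a smoothness log-potential (the names `psi_pair` and `beta` are illustrative), where the partition function is computed by brute-force enumeration.

```python
import itertools
import math

# Toy MRF: a chain of 3 binary variables with a pairwise smoothness
# log-potential that rewards neighboring variables taking equal values.
# beta and psi_pair are illustrative choices, not from the lecture.
beta = 1.0

def psi_pair(a, b):
    return beta if a == b else 0.0

def log_score(x):
    # sum of pairwise log-potentials along the chain
    return sum(psi_pair(x[i], x[i + 1]) for i in range(len(x) - 1))

states = list(itertools.product([0, 1], repeat=3))

# Partition function: sum of exp(log-score) over ALL joint states
Z = sum(math.exp(log_score(x)) for x in states)

def p(x):
    # normalized probability of one joint state
    return math.exp(log_score(x)) / Z

# Marginal of the first variable, obtained by summing out the others
p_x0_is_1 = sum(p(x) for x in states if x[0] == 1)
print(Z, p_x0_is_1)
```

Enumerating all 2^n states is only feasible for toy problems; this is exactly why the belief propagation algorithm from the earlier lectures matters.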

Learning Parameters in Graphical Models

Parameter Estimation Challenges

  • The lecture shifts focus to the learning problem: estimating parameters such as lambda from a dataset of examples rather than from a single instance.
  • A remark clarifies the two conventions for potentials: depending on the sign convention, high potential values may correspond to high probabilities (scores) or to low probabilities (energies); they will be treated generically here.

Need for Structured Output Learning

Understanding Conditional Random Fields

Introduction to Conditional Random Fields

  • A mapping function f with parameters w maps an input space x to an output space y; the focus is structured output learning, where the outputs are complex structured objects.
  • Outputs can include various forms such as images, semantic segmentation maps, text, parse trees, and computer programs.

Definition and Structure of CRFs

  • In a conditional random field (CRF), the model explicitly defines the conditional distribution p(y | x) of the output y given the input x.
  • Compared to the MRF notation, the variables are relabeled: given a noisy image x, the model produces a denoised image y.

Input and Output Variables

  • The notation in Markov random fields (MRFs) uses only x; in CRFs, input variables are denoted x and output variables y.
  • Unary potentials depend on both input and output variables. Pairwise potentials also consider these dependencies.
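How the unary potentials tie outputs to the input while pairwise potentials enforce smoothness can be sketched for a 1-D binary denoising toy problem. This is an illustrative sketch: the weights `lam_unary` and `lam_pair` are made-up values, and for simplicity the pairwise term here depends only on neighboring outputs, although in general it may also depend on the input.

```python
# Minimal CRF-style log-score for 1-D binary denoising (sketch).
# Unary potentials tie each output y_i to the noisy input x_i;
# pairwise potentials encourage agreement between neighboring outputs.

def crf_log_score(x, y, lam_unary=2.0, lam_pair=0.5):
    n = len(x)
    # unary term: reward outputs that match the observation
    unary = sum(lam_unary * (1.0 if y[i] == x[i] else 0.0) for i in range(n))
    # pairwise term: reward smooth (agreeing) neighboring outputs
    pair = sum(lam_pair * (1.0 if y[i] == y[i + 1] else 0.0) for i in range(n - 1))
    return unary + pair

x = [1, 0, 1, 1]                        # noisy observation
print(crf_log_score(x, [1, 0, 1, 1]))   # copy the input
print(crf_log_score(x, [1, 1, 1, 1]))   # smooth away the outlier
```

The relative magnitude of the two weights controls the trade-off between data fidelity and smoothness; learning these weights from data is exactly the topic of the following sections.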

Learning Parameters in CRFs

  • The set of all input variables is denoted as calligraphic X , while outputs are represented by calligraphic Y . Learning involves estimating model parameters.
  • In supervised learning scenarios like denoising or semantic segmentation, we infer optimal parameters based on annotated input-output pairs.
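For log-linear models, supervised parameter estimation by maximum conditional likelihood has a standard gradient: observed features minus expected features, grad_w log p(y*|x) = phi(x, y*) - E_{p(y|x)}[phi(x, y)]. The sketch below computes this by brute-force enumeration on a toy binary problem; the feature definitions and the helper name `log_linear_gradient` are illustrative, not the lecture's.

```python
import itertools
import math

def phi(x, y):
    # illustrative features: unary match indicators + pairwise agreement indicators
    unary = [1.0 if yi == xi else 0.0 for xi, yi in zip(x, y)]
    pair = [1.0 if y[i] == y[i + 1] else 0.0 for i in range(len(y) - 1)]
    return unary + pair

def log_linear_gradient(w, x, y_star):
    # grad of log p(y*|x) = observed features - expected features under p(y|x)
    ys = list(itertools.product([0, 1], repeat=len(x)))
    scores = [sum(wi * fi for wi, fi in zip(w, phi(x, y))) for y in ys]
    Z = sum(math.exp(s) for s in scores)
    expected = [0.0] * len(w)
    for y, s in zip(ys, scores):
        p = math.exp(s) / Z
        for k, fk in enumerate(phi(x, y)):
            expected[k] += p * fk
    observed = phi(x, y_star)
    return [o - e for o, e in zip(observed, expected)]

x = [1, 0, 1]
y_star = [1, 0, 1]      # annotated ground-truth output
w = [0.0] * 5           # 3 unary + 2 pairwise parameters
print(log_linear_gradient(w, x, y_star))
```

Note that evaluating the expectation again requires summing over all outputs, i.e. computing the partition function; in practice this is done with the inference algorithms discussed earlier rather than enumeration.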

Generalization of Conditional Random Fields

  • This discussion provides a specific example related to image denoising but can be generalized for broader applications within CRFs.
  • A more general form stacks the unary, pairwise, and higher-order feature functions into one long vector, whose inner product with a parameter vector gives the model score.

Inner Product Representation

  • The general representation can be expressed through inner products; different parameters may exist for each unary potential or combination of pairs.
  • We compute the inner product of parameter vector w with concatenated feature functions from the conditional random field.
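The inner-product representation can be sketched directly: concatenate all feature functions into one vector phi(x, y) and score a candidate output as w . phi(x, y). The feature definitions below are illustrative, not the lecture's exact ones.

```python
def phi(x, y):
    # unary features: one indicator per output node (does y_i match x_i?)
    unary = [1.0 if yi == xi else 0.0 for xi, yi in zip(x, y)]
    # pairwise features: one indicator per neighboring pair (do they agree?)
    pair = [1.0 if y[i] == y[i + 1] else 0.0 for i in range(len(y) - 1)]
    return unary + pair  # concatenation into one long feature vector

def score(w, x, y):
    # inner product of the parameter vector with the concatenated features
    f = phi(x, y)
    assert len(w) == len(f)
    return sum(wi * fi for wi, fi in zip(w, f))

x = [1, 0, 1]
y = [1, 1, 1]
w = [2.0, 2.0, 2.0, 1.0, 1.0]  # one parameter per feature dimension
print(score(w, x, y))
```

Because each feature dimension gets its own parameter, the model can weight every unary and pairwise factor individually, which is the flexibility discussed below.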

Normalization and Probability Distribution

  • To obtain a proper conditional probability distribution from this model, normalization via a partition function is necessary; the partition function depends on the input x (through the summation over all outputs) but not on any particular output y.

Feature Functions in Graphical Models

  • Feature functions operate on both inputs and outputs; they may decompose according to graphical model structures.
  • The dimensionality of the feature space depends on problem complexity; larger datasets allow more flexible models with additional parameters.

Conclusion: Tractability Through Graphical Model Structures

Concatenating Features in Graphical Models

Understanding Feature Concatenation

  • The process concatenates the unary features across all pixels and the pairwise features across all neighboring pixel pairs into a single large vector.
  • A parameter vector w is introduced; its dimension is determined by the number of output nodes m and the dimensionality d of the feature space.
  • This flexible approach allows an individual parameter for each feature defined by the graphical model, enhancing model complexity and adaptability.

Learning Parameters from Data

  • The partition function is described as the sum over the entire output state space of y; it plays a crucial role in learning, since the number of output states grows exponentially with the number of output variables.
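The input-dependent partition function Z(x) can be sketched by brute force for a small log-linear CRF: Z(x) sums exp(w . phi(x, y)) over every possible output y, and dividing by it yields a properly normalized conditional distribution. Feature and weight choices below are illustrative.

```python
import itertools
import math

def phi(x, y):
    # illustrative features: unary match indicators + pairwise agreement indicators
    unary = [1.0 if yi == xi else 0.0 for xi, yi in zip(x, y)]
    pair = [1.0 if y[i] == y[i + 1] else 0.0 for i in range(len(y) - 1)]
    return unary + pair

def Z(w, x):
    # partition function: sum over the ENTIRE output state space
    total = 0.0
    for y in itertools.product([0, 1], repeat=len(x)):
        s = sum(wi * fi for wi, fi in zip(w, phi(x, y)))
        total += math.exp(s)
    return total

x = [1, 0, 1]
w = [1.0, 1.0, 1.0, 0.5, 0.5]
z = Z(w, x)
# With Z(x) in hand, p(y|x) = exp(w . phi(x, y)) / Z(x) sums to one:
probs = [math.exp(sum(wi * fi for wi, fi in zip(w, phi(x, y)))) / z
         for y in itertools.product([0, 1], repeat=len(x))]
print(z, sum(probs))
```

The enumeration has 2^m terms for m binary outputs, which is why exploiting the graphical-model structure, as in the tractability discussion above, is essential in practice.
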

Video description

Lecture: Computer Vision (Prof. Andreas Geiger, University of Tübingen) Course Website with Slides, Lecture Notes, Problems and Solutions: https://uni-tuebingen.de/fakultaeten/mathematisch-naturwissenschaftliche-fakultaet/fachbereiche/informatik/lehrstuehle/autonomous-vision/lectures/computer-vision/