Computer Vision - Lecture 7.1 (Learning in Graphical Models: Conditional Random Fields)

Introduction to Learning in Graphical Models

Overview of Previous Lectures

  • This lecture is the seventh in a series on graphical models, following an introduction and a basic inference algorithm (belief propagation) discussed in previous lectures.
  • Lecture six covered applications of graphical models, emphasizing how prior knowledge can be integrated into models, such as smoothness in depth maps for stereo vision.

Focus of Current Lecture

  • The current lecture will address parameter estimation within graphical models, specifically how to learn parameters from datasets.
  • The structure of the lecture includes three units: terminology of conditional random fields (CRFs), learning problems with linear/log-linear parameters, and deep structured models with non-linear dependencies.

Understanding Conditional Random Fields

Definition and Importance

  • Conditional random fields are introduced as a necessary extension of Markov random fields (MRFs), which model a joint distribution over a single set of variables without conditioning on an input.
  • An example MRF for image denoising with 100 random variables is revisited: the joint distribution is written as the exponential of a sum of log-factors (log-potentials).

Potentials and Probability Distribution

  • The probability distribution defined by MRF requires normalization through a partition function that accounts for all possible states.
  • Inference tasks include estimating marginal distributions and maximum a posteriori solutions relevant for computer vision applications.
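The role of the partition function and of marginal inference can be made concrete on a tiny toy MRF. The following is a minimal sketch, not the lecture's model: a chain of three binary variables with a smoothness log-potential (the names `psi_pair` and `beta` are illustrative), where the partition function is computed by brute-force enumeration.

```python
import itertools
import math

# Toy MRF: a chain of 3 binary variables with a pairwise smoothness
# log-potential that rewards neighboring variables taking equal values.
# beta and psi_pair are illustrative choices, not from the lecture.
beta = 1.0

def psi_pair(a, b):
    return beta if a == b else 0.0

def log_score(x):
    # sum of pairwise log-potentials along the chain
    return sum(psi_pair(x[i], x[i + 1]) for i in range(len(x) - 1))

states = list(itertools.product([0, 1], repeat=3))

# Partition function: sum of exp(log-score) over ALL joint states
Z = sum(math.exp(log_score(x)) for x in states)

def p(x):
    # normalized probability of one joint state
    return math.exp(log_score(x)) / Z

# Marginal of the first variable, obtained by summing out the others
p_x0_is_1 = sum(p(x) for x in states if x[0] == 1)
print(Z, p_x0_is_1)
```

Enumerating all 2^n states is only feasible for toy problems; this is exactly why the belief propagation algorithm from the earlier lectures matters.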

Learning Parameters in Graphical Models

Parameter Estimation Challenges

  • The lecture shifts focus to the learning problem: estimating parameters such as lambda from a dataset of examples rather than from a single instance.
  • A remark clarifies the two conventions for potentials: depending on the sign convention, high potential values may correspond to high probabilities (scores) or to low probabilities (energies); they will be treated generically here.

Need for Structured Output Learning

Understanding Conditional Random Fields

Introduction to Conditional Random Fields

  • A mapping function f with parameters w maps an input space x to an output space y; the focus is structured output learning, where the outputs are complex structured objects.
  • Outputs can include various forms such as images, semantic segmentation maps, text, parse trees, and computer programs.

Definition and Structure of CRFs

  • In a conditional random field (CRF), the model explicitly defines the conditional distribution p(y | x) of the output y given the input x.
  • Compared to the MRF notation, the variables are relabeled: given a noisy image x, the model produces a denoised image y.

Input and Output Variables

  • The notation in Markov random fields (MRFs) uses only x; in CRFs, input variables are denoted x and output variables y.
  • Unary potentials depend on both input and output variables. Pairwise potentials also consider these dependencies.
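How the unary potentials tie outputs to the input while pairwise potentials enforce smoothness can be sketched for a 1-D binary denoising toy problem. This is an illustrative sketch: the weights `lam_unary` and `lam_pair` are made-up values, and for simplicity the pairwise term here depends only on neighboring outputs, although in general it may also depend on the input.

```python
# Minimal CRF-style log-score for 1-D binary denoising (sketch).
# Unary potentials tie each output y_i to the noisy input x_i;
# pairwise potentials encourage agreement between neighboring outputs.

def crf_log_score(x, y, lam_unary=2.0, lam_pair=0.5):
    n = len(x)
    # unary term: reward outputs that match the observation
    unary = sum(lam_unary * (1.0 if y[i] == x[i] else 0.0) for i in range(n))
    # pairwise term: reward smooth (agreeing) neighboring outputs
    pair = sum(lam_pair * (1.0 if y[i] == y[i + 1] else 0.0) for i in range(n - 1))
    return unary + pair

x = [1, 0, 1, 1]                        # noisy observation
print(crf_log_score(x, [1, 0, 1, 1]))   # copy the input
print(crf_log_score(x, [1, 1, 1, 1]))   # smooth away the outlier
```

The relative magnitude of the two weights controls the trade-off between data fidelity and smoothness; learning these weights from data is exactly the topic of the following sections.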

Learning Parameters in CRFs

  • The set of all input variables is denoted as calligraphic X , while outputs are represented by calligraphic Y . Learning involves estimating model parameters.
  • In supervised learning scenarios like denoising or semantic segmentation, we infer optimal parameters based on annotated input-output pairs.
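For log-linear models, supervised parameter estimation by maximum conditional likelihood has a standard gradient: observed features minus expected features, grad_w log p(y*|x) = phi(x, y*) - E_{p(y|x)}[phi(x, y)]. The sketch below computes this by brute-force enumeration on a toy binary problem; the feature definitions and the helper name `log_linear_gradient` are illustrative, not the lecture's.

```python
import itertools
import math

def phi(x, y):
    # illustrative features: unary match indicators + pairwise agreement indicators
    unary = [1.0 if yi == xi else 0.0 for xi, yi in zip(x, y)]
    pair = [1.0 if y[i] == y[i + 1] else 0.0 for i in range(len(y) - 1)]
    return unary + pair

def log_linear_gradient(w, x, y_star):
    # grad of log p(y*|x) = observed features - expected features under p(y|x)
    ys = list(itertools.product([0, 1], repeat=len(x)))
    scores = [sum(wi * fi for wi, fi in zip(w, phi(x, y))) for y in ys]
    Z = sum(math.exp(s) for s in scores)
    expected = [0.0] * len(w)
    for y, s in zip(ys, scores):
        p = math.exp(s) / Z
        for k, fk in enumerate(phi(x, y)):
            expected[k] += p * fk
    observed = phi(x, y_star)
    return [o - e for o, e in zip(observed, expected)]

x = [1, 0, 1]
y_star = [1, 0, 1]      # annotated ground-truth output
w = [0.0] * 5           # 3 unary + 2 pairwise parameters
print(log_linear_gradient(w, x, y_star))
```

Note that evaluating the expectation again requires summing over all outputs, i.e. computing the partition function; in practice this is done with the inference algorithms discussed earlier rather than enumeration.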

Generalization of Conditional Random Fields

  • This discussion provides a specific example related to image denoising but can be generalized for broader applications within CRFs.
  • A more general form stacks the unary, pairwise, and higher-order feature functions into one long vector, whose inner product with a parameter vector gives the model score.

Inner Product Representation

  • The general representation can be expressed through inner products; different parameters may exist for each unary potential or combination of pairs.
  • We compute the inner product of parameter vector w with concatenated feature functions from the conditional random field.
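The inner-product representation can be sketched directly: concatenate all feature functions into one vector phi(x, y) and score a candidate output as w . phi(x, y). The feature definitions below are illustrative, not the lecture's exact ones.

```python
def phi(x, y):
    # unary features: one indicator per output node (does y_i match x_i?)
    unary = [1.0 if yi == xi else 0.0 for xi, yi in zip(x, y)]
    # pairwise features: one indicator per neighboring pair (do they agree?)
    pair = [1.0 if y[i] == y[i + 1] else 0.0 for i in range(len(y) - 1)]
    return unary + pair  # concatenation into one long feature vector

def score(w, x, y):
    # inner product of the parameter vector with the concatenated features
    f = phi(x, y)
    assert len(w) == len(f)
    return sum(wi * fi for wi, fi in zip(w, f))

x = [1, 0, 1]
y = [1, 1, 1]
w = [2.0, 2.0, 2.0, 1.0, 1.0]  # one parameter per feature dimension
print(score(w, x, y))
```

Because each feature dimension gets its own parameter, the model can weight every unary and pairwise factor individually, which is the flexibility discussed below.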

Normalization and Probability Distribution

  • To obtain a proper conditional probability distribution from this model, normalization via a partition function is necessary; the partition function depends on the input x (through the summation over all outputs) but not on any particular output y.

Feature Functions in Graphical Models

  • Feature functions operate on both inputs and outputs; they may decompose according to graphical model structures.
  • The dimensionality of the feature space depends on problem complexity; larger datasets allow more flexible models with additional parameters.

Conclusion: Tractability Through Graphical Model Structures

Concatenating Features in Graphical Models

Understanding Feature Concatenation

  • The process concatenates the unary features across all pixels and the pairwise features across all neighboring pixel pairs into a single large vector.
  • A parameter vector w is introduced; its dimension is determined by the number of output nodes m and the dimensionality d of the feature space.
  • This flexible approach allows an individual parameter for each feature defined by the graphical model, enhancing model complexity and adaptability.

Learning Parameters from Data

  • The partition function is described as the sum over the entire output state space of y; it plays a crucial role in learning, since the number of output states grows exponentially with the number of output variables.
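The input-dependent partition function Z(x) can be sketched by brute force for a small log-linear CRF: Z(x) sums exp(w . phi(x, y)) over every possible output y, and dividing by it yields a properly normalized conditional distribution. Feature and weight choices below are illustrative.

```python
import itertools
import math

def phi(x, y):
    # illustrative features: unary match indicators + pairwise agreement indicators
    unary = [1.0 if yi == xi else 0.0 for xi, yi in zip(x, y)]
    pair = [1.0 if y[i] == y[i + 1] else 0.0 for i in range(len(y) - 1)]
    return unary + pair

def Z(w, x):
    # partition function: sum over the ENTIRE output state space
    total = 0.0
    for y in itertools.product([0, 1], repeat=len(x)):
        s = sum(wi * fi for wi, fi in zip(w, phi(x, y)))
        total += math.exp(s)
    return total

x = [1, 0, 1]
w = [1.0, 1.0, 1.0, 0.5, 0.5]
z = Z(w, x)
# With Z(x) in hand, p(y|x) = exp(w . phi(x, y)) / Z(x) sums to one:
probs = [math.exp(sum(wi * fi for wi, fi in zip(w, phi(x, y)))) / z
         for y in itertools.product([0, 1], repeat=len(x))]
print(z, sum(probs))
```

The enumeration has 2^m terms for m binary outputs, which is why exploiting the graphical-model structure, as in the tractability discussion above, is essential in practice.
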

Video description

Lecture: Computer Vision (Prof. Andreas Geiger, University of Tübingen) Course Website with Slides, Lecture Notes, Problems and Solutions: https://uni-tuebingen.de/fakultaeten/mathematisch-naturwissenschaftliche-fakultaet/fachbereiche/informatik/lehrstuehle/autonomous-vision/lectures/computer-vision/