Dmitry Ulyanov - Deep Image Prior

Introduction

Dmitry introduces himself and his recent work on deep image prior, which was accepted to CVPR this year. He explains the concept of image restoration in various tasks such as denoising, inpainting, and super resolution.

Deep Image Prior

  • Dmitry discusses his recent paper called "Deep Image Prior" and how it can be useful as an image prior for generating images.
  • He explains that image restoration involves taking a degraded image and restoring it to its original state through denoising, inpainting, or super resolution.
  • Dmitry defines super resolution as increasing the spatial resolution of an image without introducing new information. He also explains that a prior is our knowledge about the world that we use to restore images.
  • To teach computers to do super resolution, we can collect a dataset of paired low-resolution and high-resolution images and train a neural network to learn the prior from this dataset.

Explicit Priors

  • Another way to do super resolution is by using explicit priors or handcrafted priors. This involves synthesizing an image subject to certain constraints such as being an image of a face, having natural lighting conditions, etc.

Introduction

In this section, the speaker introduces the concept of explicit priors and how they differ from implicit priors. They also explain that the paper will explore constructing an explicit prior using neural networks.

Explicit Priors vs Implicit Priors

  • Explicit priors are constructed by hand, while implicit priors are learned from data through a training process.
  • The paper aims to construct an explicit prior using neural networks in a different way than traditional methods.

Importance of Network Structure

In this section, the speaker explains why it is important to understand the role of network structure in deep learning.

Role of Network Structure

  • Deep learning's success is attributed to its ability to learn on large datasets.
  • However, understanding the importance of network structure and other components like hyperparameters is also crucial.
  • The paper aims to eliminate the learning process and show that useful information for image restoration can be found within network structure alone.

Results Comparison

In this section, the speaker compares results from different methods for super resolution tasks.

Super Resolution Results Comparison

  • A ground truth image is downsampled by a factor of four and then upscaled with two different methods: bicubic interpolation and a trained neural network.
  • The trained neural network produces better results than bicubic interpolation but requires training on a dataset.
  • The proposed method, Deep Image Prior, can generate comparable results without being trained on any dataset.

Formalism for Image Restoration

In this section, the speaker introduces formalism for image restoration tasks.

Formalism for Image Restoration

  • Given a clean image X degraded by some process, we aim to restore it given only a corrupted version X_hat.
  • Using Bayes' rule, we maximize the likelihood times the prior instead of maximizing the posterior distribution directly.
  • The likelihood relates the clean image to the corrupted image, and the prior encodes our knowledge of the world, such as what a natural image looks like.

Introduction to Image Restoration

In this section, the speaker introduces the concept of image restoration and explains how it can be approached using maximum likelihood estimation.

Maximum Likelihood Estimation

  • The likelihood is often modeled as a normal distribution, a bell curve centered at the observed corrupted image.
  • Without prior information, we cannot restore an image better than its corrupted version.
  • The maximum posterior estimate is the same as the maximum likelihood estimate when the prior is uniform.
  • Likelihood votes for images that follow degradation process while prior votes for natural images. Maximum posterior lies between these two.
  • In practice, one minimizes the negative log-likelihood (the data term) plus the negative log-prior (the constraint) instead of maximizing the posterior.
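
In symbols (with x the clean image and x_hat the corrupted one, as in the formalism above), the estimate being described is:

```latex
x^{*} = \arg\max_{x}\, p(x \mid \hat{x})
      = \arg\max_{x}\, p(\hat{x} \mid x)\, p(x)
      = \arg\min_{x}\, \Big[\, \underbrace{-\log p(\hat{x} \mid x)}_{\text{data term}} \;\; \underbrace{-\, \log p(x)}_{\text{prior / regularizer}} \,\Big]
```

A uniform prior makes the last term constant, which is why the maximum posterior estimate then collapses to the maximum likelihood estimate.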

Parameterization and Optimization

This section discusses different ways to approach optimization problems in image restoration.

Optimizing in Image Space

  • Start with an initial estimate and compute gradient with respect to X.
  • Update X until convergence point is reached.

Optimizing in Parameter Space

  • Every image is an output of a function that maps values from parameter space theta to images X.
  • Optimize over theta instead of images by computing the gradient with respect to theta and updating it until convergence.

Using G as Hyperparameter

  • Treat the function G as a hyperparameter and tune it so that desired images are easy to reach while undesired ones are suppressed.
  • Function G defines a prior that can be used for optimization.
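
To make the image-space vs. parameter-space distinction concrete, here is a toy sketch (my illustration, not from the talk) on a 1-D "image": fitting a noisy signal directly just reproduces the noise, while fitting through a smooth parameterization G(theta) — here a low-frequency cosine/sine basis, solved in closed form for brevity rather than by gradient descent — acts as a prior.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
t = np.linspace(0, 1, n)
clean = np.sin(2 * np.pi * t)                 # toy 1-D "clean image"
noisy = clean + 0.3 * rng.standard_normal(n)  # corrupted version

# Optimizing in image space: the minimizer of ||x - noisy||^2 over x
# is the noisy signal itself -- no restoration happens.
x_direct = noisy.copy()

# Optimizing in parameter space: x = G(theta) = B @ theta, where B is a
# low-frequency cosine/sine basis. G can only express smooth signals,
# so the parameterization itself acts as a prior.
k = 5
B = np.stack([np.cos(np.pi * j * t) for j in range(k)] +
             [np.sin(np.pi * j * t) for j in range(1, k)], axis=1)
theta, *_ = np.linalg.lstsq(B, noisy, rcond=None)  # closed-form minimizer
x_param = B @ theta

err_direct = np.mean((x_direct - clean) ** 2)
err_param = np.mean((x_param - clean) ** 2)
print(err_direct, err_param)  # the parameterized fit lands closer to the clean signal
```

The choice of G (here, the basis) is exactly the hyperparameter being discussed; Deep Image Prior replaces this handcrafted G with a convolutional network.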

Introduction to Deep Image Prior

In this section, the speaker introduces the concept of deep image prior and explains how it works.

What is Deep Image Prior?

  • As a simple example, every image can be expressed as a matrix-vector multiplication.
  • The coefficients applied to produce each image are the parameters, called theta.
  • Deep Image Prior proposes a different parameterization for neural networks.
  • Unlike typical uses of neural networks, Deep Image Prior fixes the input and varies the weights to get different outputs.

How Does Deep Image Prior Work?

  • We take parameters and insert them into our neural network.
  • We initialize an input with noise or fix an image to be used throughout the process.
  • We iteratively solve an optimization task with our favorite gradient-based method.
  • When we find the optimal theta, we can easily get an optimal image by forward passing this fixed input through the network with parameters theta star.
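
The procedure above can be sketched in a few lines. This toy numpy version (my illustration, not the paper's code) uses a tiny two-layer MLP in place of the deep convolutional network the paper actually uses, a fixed random input z, and plain gradient descent on the weights theta = (W1, W2):

```python
import numpy as np

rng = np.random.default_rng(0)
n, h, m = 64, 32, 16

x_hat = rng.standard_normal(n)          # stand-in for the corrupted image
z = rng.standard_normal(m)              # fixed input, never changed
W1 = 0.1 * rng.standard_normal((h, m))  # theta: the weights we optimize
W2 = 0.1 * rng.standard_normal((n, h))

def forward(W1, W2):
    a = np.tanh(W1 @ z)
    return W2 @ a, a

lr = 0.01
losses = []
for _ in range(500):                    # fixed iteration budget (early stopping)
    x, a = forward(W1, W2)
    r = x - x_hat                       # residual of ||f_theta(z) - x_hat||^2
    losses.append(float(r @ r))
    gW2 = np.outer(r, a)                # gradients (factor of 2 absorbed in lr)
    ga = W2.T @ r
    gW1 = np.outer(ga * (1 - a ** 2), z)
    W1 -= lr * gW1
    W2 -= lr * gW2

x_star, _ = forward(W1, W2)             # x* = forward pass with theta*
```

The image is never optimized directly: only the weights move, and the restored image falls out of a single forward pass with the optimal parameters, exactly as described above.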

Advantages of Using Deep Image Prior

In this section, the speaker discusses why deep image prior is useful and its advantages over other parametrizations.

Why Use Deep Image Prior?

  • It has high impedance to noise: gradient descent fits structured images much more readily than noise.
  • It is more complicated for a network to overfit unstructured noise than structured content.

Advantages of Using Deep Image Prior

  • It is a universal approximator, so in principle it can represent any image.

Introduction to Priors

In this section, the speaker discusses priors and their use in selecting the right local minimum for optimization.

Using Priors for Optimization

  • Priors are used to select the right local minimum for optimization.
  • Without a prior, optimization may converge to an undesired local minimum.
  • Deep image prior allows us to stop the optimization process at a desired point before overfitting occurs.

Data Term and Regularization

In this section, the speaker explains how data term and regularization are used in image restoration.

Maximizing Posterior Distribution

  • The clean image is first degraded and then restored by maximizing the posterior distribution.
  • The data term expresses the relation between the clean and corrupted image.
  • Early stopping is necessary to prevent overfitting of corrupted images.

Image Restoration Tasks

In this section, the speaker discusses different tasks related to image restoration.

Different Image Restoration Tasks

  • Denoising involves using L2 distance between synthesized and corrupted images as a data term.
  • Inpainting involves generating an image that matches the known pixel values while the network fills in the unknown pixels.
  • Super-resolution involves generating an image that when downsampled matches a low-resolution or corrupted version of itself.
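
The three data terms follow one pattern; this numpy sketch is my illustration (the downsampler here is simple average pooling, a stand-in for whatever operator the task actually defines):

```python
import numpy as np

def denoising_term(x, x_hat):
    """L2 distance between the synthesized and the corrupted image."""
    return np.sum((x - x_hat) ** 2)

def inpainting_term(x, x_hat, mask):
    """L2 distance computed only over the known pixels (mask == 1)."""
    return np.sum((mask * (x - x_hat)) ** 2)

def downsample(x, factor):
    """Average-pooling downsampler (stand-in for the true operator)."""
    return x.reshape(-1, factor).mean(axis=1)

def super_resolution_term(x, x_lr, factor):
    """L2 distance between the downsampled synthesis and the low-res image."""
    return np.sum((downsample(x, factor) - x_lr) ** 2)
```

In each case the network output x is pushed through the task's degradation model before being compared with the corrupted observation; only the degradation model changes between tasks.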

Feature Inversion

In this section, the speaker explains feature inversion and the importance of priors in this task.

Feature Inversion

  • Feature inversion involves synthesizing an image with the same features as a given input image.
  • Priors are necessary to prevent generation of weird images.

Example of Denoising

In this section, the speaker provides an example of denoising using deep image prior.

Denoising with Deep Image Prior

  • Deep image prior is used to denoise a real image with JPEG artifacts.
  • The optimization process starts from random parameters and gradually improves until it overfits the corrupted image.
  • Early stopping is necessary to obtain optimal results before overfitting occurs.

Deep Image Prior

In this section, the speaker discusses the Deep Image Prior and its ability to generate high-quality images without using external information.

Super Resolution

  • Deep Image Prior and bicubic interpolation use no external information beyond the corrupted image itself.
  • Over the course of the optimization, a sharp zebra image emerges.
  • Bicubic interpolation produces blurry results compared to Deep Image Prior.

Inpainting

  • The network can generate an image that has the same values in pixels we know and interpolate nearby regions for inpainting.
  • Textures are recreated quite nicely, and there are no halos or remaining text in the image.
  • A shallow network cannot utilize content effectively for inpainting.

Architecture Comparison

  • Different architectures were tested, including an encoder-decoder with skip connections, as in U-Net.
  • A large depth is necessary to compress images before restoring them with an encoder-decoder architecture.
  • Many skip connections allow input to be easily converted to output, leading to overfitting.

Conclusion

In this section, the speaker concludes that Deep Image Prior works best when it is hard for the network to overfit an image. They also discuss how a fully connected layer would make it easy for a network to transform a fixed input into an image.

Overfitting Images

  • A depth of six is necessary because it's complicated for networks to overfit images at this depth.
  • If there was only one layer with 1x1 filters, then it would be easy for networks to overfit images.
  • Fully connected layers make it easy for networks to transform fixed inputs into images.

Understanding Deep Neural Networks

In this section, the speaker discusses how deep neural networks work and how they store information.

Using Skip Connections in Encoder-Decoder Models

  • The smallest path from input to output should be long to avoid overfitting.
  • With many skip connections, the network can copy the input straight to the output without using context, so strong regularization is needed to discourage relying on them.
  • Finding the right learning rate is crucial for successful image processing.

Feature Inversion with Priors

  • Using a deep image prior instead of a TV prior results in more natural-looking images.
  • Deep neural networks store information about color and low-level details even in the last layers of the network.
  • Activation maximization can be used to diagnose a network's performance on specific classes.

Image Restoration Methods

  • The deep image prior is a powerful tool for denoising and restoring images.
  • Deep neural networks impose strong priors on image space, capturing low-level statistics before any learning takes place.

Introduction

The speaker introduces the topic of their research and explains how it can lead to new insights.

Understanding Bottlenecks

  • The research aims to understand bottlenecks in the network structure.
  • This understanding can help determine if the data set or network structure needs to be changed.

Inception Score and Metrics

The speaker discusses Inception Score and other metrics used in engineering models.

Inception Score

  • Inception score is used to compare the distribution that a model generates with the target distribution.
  • It is not applicable in this case since there is no generative process involved.

Metrics for Learning Networks

  • Different metrics, such as L1 and Laplace, can be used to train networks for tasks like denoising and super resolution.
  • These metrics generate different artifacts, such as less blur, but do not utilize all information in the data set.

Training Process

The speaker explains how the network architecture was created and trained.

Random Initialization

  • The network was randomly initialized with no transfer learning involved.
  • No pre-trained models were used.

Human Gradient Descent

  • The architecture was arrived at by "human gradient descent": researchers iteratively refining designs by hand over many papers.
  • That it works here without additional tuning suggests these hand-refined designs are on the right track.

Objective Function

The speaker describes the objective function of the network.

Network Input and Objective

  • The input throughout the whole process is a noisy image.
  • The objective is for the network to generate an image with the same pixel values as those of the known image.

Pairwise Distance Calculation

  • A pairwise distance calculation is done between generated images and corrupted images.
  • Distances are multiplied by a mask so that only relevant pixels are considered.

Training

  • The network is trained to generate the same pixels as those in the known image.
  • The objective is to examine what happens in all other pixels.

Determining the Number of Iterations for Super-Resolution

In this section, the speaker discusses how to determine the number of iterations needed for super-resolution.

Fixed Number of Iterations

  • A fixed number of iterations that works best on a holdout dataset is chosen.
  • The iteration number is fixed at 2040 or 2400 and applied to all images.
  • Optimization can be done as long as desired.

Early Stopping

  • Early stopping is important for tasks like denoising and super-resolution.
  • The iteration number may need to be tuned as a hyperparameter on different data with different noise structures or down-sampling.

Using Deep Image Prior for Synthetic Data

In this section, the speaker discusses using deep image prior when synthetic data is used.

Unknown Parameters

  • Sometimes it's difficult to construct a dataset with regular images, so synthetic data must be used.
  • When using synthetic data, you may not know how the down-sampling was performed or the parameters of the noise.
  • Deep image prior can be used in these cases, but someone needs to validate whether it's enough and if the iteration number is correct.

Nonlinear Manifolds in Image Space

In this section, the speaker discusses nonlinear manifolds in image space.

Open Question

  • It's hard to say whether agreement in capsule networks could be a stronger prior or how it could be used as a prior.
  • It's an open question that requires further discussion.

Video description

Deep convolutional networks have become a popular tool for image generation and restoration. Generally, their excellent performance is imputed to their ability to learn realistic image priors from a large number of example images. In this paper, we show that, on the contrary, the structure of a generator network is sufficient to capture a great deal of low-level image statistics prior to any learning. In order to do so, we show that a randomly-initialized neural network can be used as a handcrafted prior with excellent results in standard inverse problems such as denoising, super-resolution, and inpainting. Furthermore, the same prior can be used to invert deep neural representations to diagnose them and to restore images based on flash-no flash input pairs. Apart from its diverse applications, our approach highlights the inductive bias captured by standard generator network architectures. It also bridges the gap between two very popular families of image restoration methods: learning-based methods using deep convolutional networks and learning-free methods based on handcrafted image priors such as self-similarity.

Dmitry Ulyanov received his degree in Machine Learning at Moscow State University and is now a Ph.D. student at the Skoltech Institute. His supervisors are Victor Lempitsky and Andrea Vedaldi, and his work is mostly focused on image synthesis and generative models. Dmitry also serves as a teaching assistant for the Deep Learning class at Skoltech and at Yandex's School of Data Analysis. He has worked at Yandex and interned at Google. Dmitry is a prize winner in more than 10 Data Science contests and runs a class on competitive Data Science on Coursera.