Computer Vision - Lecture 2.1 (Image Formation: Primitives and Transformations)
Introduction to Image Formation in Computer Vision
Overview of the Lecture
- The lecture introduces the fundamental concepts necessary for building computer vision systems, focusing on understanding how a 3D scene is projected onto a 2D image.
- It outlines the structure of the lecture, which consists of four subunits covering basic primitives and transformations, geometric image formation, photometric processes, and camera processing.
Key Topics Covered
- The first unit introduces geometric primitives and the transformations that act on them.
- The second unit discusses how geometric features are projected onto a 2D image plane.
- The third unit focuses on how light changes as it travels from the source to the camera.
- The final unit examines how images are processed and stored within the camera system.
Geometric Primitives and Transformations
Fundamental Building Blocks
- Geometric primitives such as points, lines, and planes are essential for describing 3D shapes.
- Basic transformations in both 2D and 3D space are discussed, following Szeliski's book for foundational knowledge.
Homogeneous Coordinates
- Homogeneous coordinates extend conventional (inhomogeneous) coordinates: a 2D point gains a third coordinate and is treated as an element of projective space.
- This projective space, $\mathbb{P}^2$, consists of all three-dimensional vectors except the zero vector: $\mathbb{P}^2 = \mathbb{R}^3 \setminus \{(0, 0, 0)\}$.
Understanding Homogeneous Coordinates
Properties of Homogeneous Coordinates
- Homogeneous vectors differ only by scale; thus, they form equivalence classes where vectors like (1,1,1) and (2,2,2) are considered equivalent.
- This construction facilitates expressing points at infinity and simplifies mathematical operations related to projections in computer vision.
Conversion Between Coordinate Systems
- Converting an inhomogeneous vector (e.g., x = (x1, x2)) to a homogeneous vector involves appending a one: x̄ = (x1, x2, 1). The result is called the augmented vector.
Understanding Homogeneous and Inhomogeneous Coordinates
Conversion Between Coordinate Systems
- To convert from homogeneous to inhomogeneous coordinates, divide the homogeneous vector by its last element $\tilde{w}$; the result is the augmented vector, whose last element is one.
- For example, dividing the homogeneous vector $\tilde{\mathbf{x}} = (\tilde{x}, \tilde{y}, \tilde{w})$ by $\tilde{w}$ yields the augmented vector $\bar{\mathbf{x}} = (\tilde{x}/\tilde{w}, \tilde{y}/\tilde{w}, 1)$; dropping the trailing one gives the inhomogeneous point.
- The relationship between homogeneous, inhomogeneous, and augmented vectors is encapsulated in one equation, $\bar{\mathbf{x}} = \tilde{\mathbf{x}} / \tilde{w}$. A special case arises when $\tilde{w} = 0$, leading to ideal points, or points at infinity.
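The two conversions can be sketched in a few lines of Python. This is a minimal illustration rather than code from the lecture; the function names `to_homogeneous` and `to_inhomogeneous` are hypothetical, and points are plain tuples.

```python
def to_homogeneous(x):
    """Append a 1 to an inhomogeneous point, giving the augmented vector."""
    return tuple(x) + (1.0,)


def to_inhomogeneous(x_tilde):
    """Divide by the last element w and drop it; undefined for ideal points."""
    *coords, w = x_tilde
    if w == 0:
        raise ValueError("ideal point (w = 0) has no inhomogeneous representation")
    return tuple(c / w for c in coords)


# (2, 4, 2) and (1, 2, 1) belong to the same equivalence class.
print(to_inhomogeneous((2.0, 4.0, 2.0)))  # (1.0, 2.0)
print(to_homogeneous((1.0, 2.0)))         # (1.0, 2.0, 1.0)
```

Raising an error for w = 0 is a design choice of this sketch: it makes explicit that ideal points exist only in the homogeneous representation.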
Ideal Points and Their Representation
- Ideal points cannot be represented with inhomogeneous coordinates, since the conversion would divide by $\tilde{w} = 0$. Homogeneous coordinates therefore make it possible to express points at infinity without using the infinity symbol.
- A homogeneous vector with a last element of zero corresponds to a point at infinity, simplifying representation within coordinate systems.
Visualizing Coordinate Relationships
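- A visual illustration shows how homogeneous coordinates relate to the xy-plane: a homogeneous vector defines a ray through the origin of the three-dimensional space, and dividing by $\tilde{w}$ selects the point where that ray intersects the plane $\tilde{w} = 1$. The first two coordinates of that point are the inhomogeneous coordinates.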
- The projection process resembles perspective projection, indicating its utility as a representation method for geometric transformations.
Expressing Lines Using Homogeneous Coordinates
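- Lines can also be expressed using homogeneous coordinates. A line is denoted $\tilde{\mathbf{l}} = (a, b, c)$; taking the inner product of this vector with an augmented point $\bar{\mathbf{x}} = (x, y, 1)$ and setting it to zero gives the line equation $\tilde{\mathbf{l}}^\top \bar{\mathbf{x}} = ax + by + c = 0$.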
- All points satisfying this equation lie on the line; thus, it can be expressed through inner products involving homogeneous vectors.
Normalization and Special Cases of Lines
- The line equation remains valid when the line vector is multiplied by any nonzero scalar, because line vectors are also equivalence classes. Normalizing the line vector to $\tilde{\mathbf{l}} = (n_x, n_y, d)$ with $\|\mathbf{n}\| = 1$ gives it geometric meaning: $\mathbf{n}$ is the unit normal perpendicular to the line and $d$ is its distance from the origin.
- The line at infinity is defined as $\tilde{\mathbf{l}}_\infty = (0, 0, 1)$ and passes through all ideal points. It cannot be normalized, because its normal part $(0, 0)$ has zero length and normalizing would divide by zero.
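Normalization and the point-on-line test can be sketched as follows. The helper names `normalize_line` and `lies_on` are hypothetical, and the sketch assumes the convention that a line (a, b, c) satisfies ax + by + c = 0, so the distance from the origin is the absolute value of the normalized last element.

```python
import math


def normalize_line(l):
    """Scale (a, b, c) so that the normal (a, b) has unit length."""
    a, b, c = l
    norm = math.hypot(a, b)
    if norm == 0:
        raise ValueError("line at infinity (0, 0, c) cannot be normalized")
    return (a / norm, b / norm, c / norm)


def lies_on(l, x, tol=1e-9):
    """Check the inner product of the line with the augmented point (x, y, 1)."""
    return abs(l[0] * x[0] + l[1] * x[1] + l[2]) < tol


l = (3.0, 4.0, -10.0)            # the line 3x + 4y - 10 = 0
print(normalize_line(l))         # (0.6, 0.8, -2.0)
print(lies_on(l, (2.0, 1.0)))    # True, since 3*2 + 4*1 - 10 = 0
```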
Cross Product Properties
Understanding Homogeneous Coordinates and Their Applications
Introduction to Skew Symmetric Matrices
- The skew-symmetric matrix $[\mathbf{a}]_\times$ of a vector $\mathbf{a} = (a_1, a_2, a_3)$ is defined such that multiplying it with a vector $\mathbf{b} = (b_1, b_2, b_3)$ yields the cross product of the two vectors: $[\mathbf{a}]_\times \mathbf{b} = \mathbf{a} \times \mathbf{b}$.
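A small pure-Python check of this identity is given below. The helper names `cross`, `skew`, and `matvec` are illustrative and not part of the lecture material.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def skew(a):
    """Skew-symmetric matrix [a]_x such that [a]_x b equals a x b."""
    a1, a2, a3 = a
    return [[0, -a3, a2],
            [a3, 0, -a1],
            [-a2, a1, 0]]


def matvec(M, v):
    """Multiply a 3x3 matrix (list of rows) with a 3-vector."""
    return tuple(sum(m * x for m, x in zip(row, v)) for row in M)


a, b = (1, 2, 3), (4, 5, 6)
print(matvec(skew(a), b))  # (-3, 6, -3)
print(cross(a, b))         # (-3, 6, -3)
```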
Distinguishing Between Vectors and Matrices
- In this course, square brackets denote matrices and round brackets denote vectors. This distinction helps clarify the representation of mathematical objects.
Intersection of Lines in Homogeneous Coordinates
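- The intersection point of two lines in homogeneous coordinates is the cross product of the two line vectors: $\tilde{\mathbf{x}} = \tilde{\mathbf{l}}_1 \times \tilde{\mathbf{l}}_2$.
- Conversely, the line joining two points is the cross product of the two point vectors, $\tilde{\mathbf{l}} = \bar{\mathbf{x}}_1 \times \bar{\mathbf{x}}_2$; the proof is straightforward and left as an exercise.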
Example: Finding Intersection Points
- For the lines $y = 1$ and $x = 2$, the corresponding line vectors are $\tilde{\mathbf{l}}_1 = (0, 1, -1)$ and $\tilde{\mathbf{l}}_2 = (1, 0, -2)$.
- Their cross product is $(-2, -1, -1)$; dividing by the last element gives the intersection point $(2, 1)$, as expected.
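The example can be reproduced with a few lines of Python. The `cross` helper is written out here so the snippet is self-contained; the line vectors follow the convention ax + by + c = 0.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


l1 = (0, 1, -1)   # y = 1  ->  0*x + 1*y - 1 = 0
l2 = (1, 0, -2)   # x = 2  ->  1*x + 0*y - 2 = 0

x_tilde = cross(l1, l2)                                   # homogeneous intersection
x = (x_tilde[0] / x_tilde[2], x_tilde[1] / x_tilde[2])    # divide by last element

print(x_tilde)  # (-2, -1, -1)
print(x)        # (2.0, 1.0)
```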
Parallel Lines and Ideal Points
- For two parallel lines, the computed intersection point is (0, -1, 0). Because the last element is zero, this is an ideal point: the lines intersect at infinity.
Relationship Between Points at Infinity and Lines
- The inner product between a point at infinity and the line at infinity is zero, which confirms that every point at infinity lies on the line at infinity.
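Both statements can be checked numerically. The summary does not say which parallel lines were used, so the lines x = 1 and x = 2 below are an assumption; the cross product then comes out as (0, 1, 0), which is the same ideal point as (0, -1, 0) because homogeneous vectors are equal up to scale.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Inner product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))


l1 = (1, 0, -1)      # assumed example line x = 1
l2 = (1, 0, -2)      # assumed example line x = 2 (parallel to l1)
l_inf = (0, 0, 1)    # line at infinity

p = cross(l1, l2)
print(p)               # (0, 1, 0): last element zero, so an ideal point
print(dot(p, l_inf))   # 0: the ideal point lies on the line at infinity
```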
Transitioning from Linear to Polynomial Equations
- Homogeneous coordinates also make it easy to express more complex algebraic objects. Conic sections, for example, are described by a quadratic equation, $\bar{\mathbf{x}}^\top \mathbf{Q} \bar{\mathbf{x}} = 0$.
Types of Conic Sections
- Conic sections arise as the intersection of a plane with a cone. Depending on how the plane cuts the cone, the conic (defined by the matrix Q) is a circle, ellipse, parabola, or hyperbola.
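As a worked example added for illustration (not taken from the lecture), assume the conic is written as a quadratic form in the augmented point $\bar{\mathbf{x}} = (x, y, 1)$. The unit circle then corresponds to a diagonal matrix Q:

```latex
\bar{\mathbf{x}}^\top \mathbf{Q}\, \bar{\mathbf{x}} = 0,
\qquad
\mathbf{Q} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix}
\quad\Longrightarrow\quad
x^2 + y^2 - 1 = 0 .
```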
Resources for Further Study
- Recommended reading includes "Hartley and Zisserman" for deeper insights into multiview geometry related to camera calibration.
Extending Concepts to 3D Points
- Similar principles apply when transitioning from 2D to 3D points using homogeneous coordinates within projective space P3.
Representation of 3D Planes
Understanding 3D Geometry and Transformations
The Concept of Normals and Planes
- The normal vector is perpendicular to the plane, with 'd' representing the distance to the origin. An exception is the plane at infinity, which contains all ideal points, i.e., all points with $\tilde{w} = 0$.
- A coordinate system illustrates a plane at distance 'd', with a normalized normal vector 'n' corresponding to that plane.
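A minimal sketch of the plane representation follows. It assumes a plane is stored as a 4-vector (a, b, c, d) with ax + by + cz + d = 0; the helper names are hypothetical. Under this sign convention the origin has signed distance d after normalization, so the distance to the origin is |d|.

```python
import math


def normalize_plane(m):
    """Scale (a, b, c, d) so that the normal (a, b, c) has unit length."""
    a, b, c, d = m
    norm = math.sqrt(a * a + b * b + c * c)
    if norm == 0:
        raise ValueError("plane at infinity (0, 0, 0, d) cannot be normalized")
    return tuple(v / norm for v in m)


def signed_distance(m, x):
    """Signed distance of the 3D point x from the plane m."""
    nx, ny, nz, d = normalize_plane(m)
    return nx * x[0] + ny * x[1] + nz * x[2] + d


m = (0.0, 0.0, 2.0, -4.0)                      # the plane 2z - 4 = 0, i.e. z = 2
print(normalize_plane(m))                      # (0.0, 0.0, 1.0, -2.0)
print(signed_distance(m, (5.0, -3.0, 2.0)))    # 0.0, the point lies on the plane
print(signed_distance(m, (0.0, 0.0, 0.0)))     # -2.0, the origin is 2 units away
```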
Representing 3D Lines
- Expressing points on a line as a linear combination of two points (p and q) results in using six parameters for four degrees of freedom, which is inefficient.
- Alternative representations include the two-plane parametrization and Plücker coordinates, detailed further in the books by Szeliski and by Hartley and Zisserman.
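The two-point form mentioned above can be written out as follows (the parameter name $\lambda$ is chosen here for illustration). The points p and q contribute six numbers, but sliding either point along the line leaves the line unchanged, which removes two of them and leaves four degrees of freedom.

```latex
\mathbf{r}(\lambda) = (1 - \lambda)\,\mathbf{p} + \lambda\,\mathbf{q},
\qquad \lambda \in \mathbb{R}, \quad \mathbf{p}, \mathbf{q} \in \mathbb{R}^3 .
```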
Understanding Quadric Surfaces
- The 3D analog of 2D conics are quadric surfaces represented by quadratic equations in homogeneous coordinates.
- Quadrics are significant in multi-view geometry studies and serve as modeling primitives for scene understanding through compact representations.
Applications of Superquadrics
- Superquadrics generalize quadric surfaces and can represent geometric objects by simple parts such as cuboids, aiding in shape abstraction and compression.
- This method allows for efficient storage while preserving the dominant semantic meaning of scenes.
Introduction to Transformations
- The discussion transitions into transformations, starting with basic 2D transformations such as translation.
- Translation adds the same two-dimensional vector to every point: $\mathbf{x}' = \mathbf{x} + \mathbf{t}$. In homogeneous coordinates the same operation becomes a single matrix multiplication.
Chaining Transformations
- Using matrices makes it easy to chain or invert transformations; this applies not only to translations but also to other transformation types such as Euclidean and similarity transformations.
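A short NumPy sketch of translation as a 3x3 matrix, with chaining by matrix product and undoing by matrix inverse. The function name `translation` and the example offsets are illustrative.

```python
import numpy as np


def translation(tx, ty):
    """3x3 homogeneous matrix that translates 2D points by (tx, ty)."""
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])


T1 = translation(2.0, 3.0)
T2 = translation(-1.0, 4.0)
x_bar = np.array([1.0, 1.0, 1.0])       # augmented point (1, 1)

chained = T2 @ T1                       # apply T1 first, then T2
print(chained @ x_bar)                  # [2. 8. 1.]

# Inverting the matrix undoes the transformation.
restored = np.linalg.inv(T1) @ (T1 @ x_bar)
print(np.allclose(restored, x_bar))
```

The right-to-left order of the product mirrors function composition: the matrix closest to the point is applied first.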
Exploring Euclidean Transformations
- Euclidean transformations combine rotation and translation, expressed as $\mathbf{x}' = \mathbf{R}\mathbf{x} + \mathbf{t}$, where $\mathbf{R}$ is a rotation matrix from SO(2), i.e., an orthonormal matrix with determinant one.
- These transformations preserve distances between points post-transformation, unlike affine transformations which do not maintain this property.
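The distance-preserving property can be checked numerically. The sketch below builds a Euclidean transformation as a 3x3 homogeneous matrix; the function name `euclidean` and the chosen angle, translation, and points are arbitrary example values.

```python
import numpy as np


def euclidean(theta, tx, ty):
    """3x3 homogeneous matrix: rotate by theta (radians), then translate."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, tx],
                     [s,  c, ty],
                     [0.0, 0.0, 1.0]])


E = euclidean(np.pi / 6, 2.0, -1.0)
p = np.array([0.0, 0.0, 1.0])           # augmented point (0, 0)
q = np.array([3.0, 4.0, 1.0])           # augmented point (3, 4)

dist_before = np.linalg.norm(q[:2] - p[:2])
dist_after = np.linalg.norm((E @ q)[:2] - (E @ p)[:2])

print(dist_before)                          # 5.0
print(np.isclose(dist_before, dist_after))  # distances agree up to rounding
```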
Similarity Transformations Explained
Affine Transformations and Degrees of Freedom
Understanding Affine Transformations
- An affine transformation replaces the scaled rotation of a similarity transformation with an arbitrary 2×2 matrix: $\mathbf{x}' = \mathbf{A}\mathbf{x} + \mathbf{t}$. Besides scaling, rotating, and translating the original shape, it can therefore also shear it; this is the next level in the hierarchy of transformations.
- The transformation has six degrees of freedom: four from the arbitrary matrix $\mathbf{A}$ and two from the translation.
- While angles are not preserved in this transformation, parallel lines remain parallel post-transformation.
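The following NumPy sketch illustrates both properties with an arbitrary example matrix (the values are assumptions chosen for easy arithmetic): two parallel segments stay parallel, while two perpendicular directions do not stay perpendicular.

```python
import numpy as np

# Example affine transformation: arbitrary 2x2 part plus a translation.
A = np.array([[2.0, 1.0, 3.0],
              [0.0, 1.0, -1.0],
              [0.0, 0.0, 1.0]])


def apply_affine(M, p):
    """Apply a 3x3 affine matrix to a 2D point (the last coordinate stays 1)."""
    return (M @ np.array([p[0], p[1], 1.0]))[:2]


def cross2d(u, v):
    """Scalar 2D cross product; zero exactly when u and v are parallel."""
    return u[0] * v[1] - u[1] * v[0]


# Two parallel segments with directions (1, 2) and (2, 4).
d1 = apply_affine(A, (1, 2)) - apply_affine(A, (0, 0))
d2 = apply_affine(A, (7, 5)) - apply_affine(A, (5, 1))
print(np.isclose(cross2d(d1, d2), 0.0))   # True: still parallel

# Two perpendicular directions (1, 0) and (0, 1).
e1 = apply_affine(A, (1, 0)) - apply_affine(A, (0, 0))
e2 = apply_affine(A, (0, 1)) - apply_affine(A, (0, 0))
print(e1 @ e2)                            # 2.0: no longer perpendicular
```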
Perspective Transformation
- A perspective transformation (homography) allows each of the four corners of a square to move independently to a new location; with two coordinates per corner, this gives eight degrees of freedom.
- It is represented by a three-by-three homogeneous matrix that is defined only up to scale, and therefore has eight degrees of freedom instead of nine.
Homogeneous Coordinates
- Homogeneity applies not just to vectors but also to matrices: a transformation matrix and any nonzero scalar multiple of it represent the same transformation.
- Perspective transformations do not preserve parallel lines but ensure that straight lines remain straight after transformation.
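A small NumPy sketch of both points. The matrix entries are made-up example values and `apply_homography` is a hypothetical helper. Scaling the matrix does not change the mapped point, and an ideal point (the common direction of a family of parallel lines) can map to a finite point, which is why parallelism is lost.

```python
import numpy as np

# Example perspective transformation; the entries are arbitrary.
H = np.array([[1.0, 0.2, 0.0],
              [0.1, 1.0, 0.0],
              [0.001, 0.002, 1.0]])


def apply_homography(M, x):
    """Map a 2D point through a 3x3 homography and divide by the last element."""
    x_tilde = M @ np.array([x[0], x[1], 1.0])
    return x_tilde[:2] / x_tilde[2]


x = (10.0, 20.0)
# H and 3*H represent the same transformation.
print(np.allclose(apply_homography(H, x), apply_homography(3.0 * H, x)))  # True

# The ideal point (1, 0, 0) is the direction shared by all horizontal lines.
v = H @ np.array([1.0, 0.0, 0.0])
print(v[2] != 0)          # True: the image is a finite (vanishing) point
print(v[:2] / v[2])       # its inhomogeneous coordinates
```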
Transforming Lines with Homogeneous Representations
Co-vectors and Line Equations
- Co-vectors such as 2D lines transform differently from points. Requiring that transformed points still satisfy the transformed line equation determines how the line vector must change.
- The transformed line is $\tilde{\mathbf{l}}' = \mathbf{H}^{-\top} \tilde{\mathbf{l}}$, where $\mathbf{H}$ is the matrix that transforms points.
Projective Transformation Representation
- The action on co-vectors like 2D lines can be represented by the transposed inverse of the original matrix used for point transformations.
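The rule can be verified numerically: if a point lies on a line, the transformed point must lie on the transformed line. The sketch below uses an arbitrary example matrix and an example line; both are assumptions made for illustration.

```python
import numpy as np

# Example point transformation; the entries are arbitrary.
H = np.array([[1.0, 0.2, 0.0],
              [0.1, 1.0, 0.0],
              [0.001, 0.002, 1.0]])

l = np.array([1.0, -1.0, 0.5])          # the line x - y + 0.5 = 0
x1 = np.array([0.0, 0.5, 1.0])          # two augmented points on that line
x2 = np.array([1.0, 1.5, 1.0])

# Lines transform with the transposed inverse of the point transformation.
l_prime = np.linalg.inv(H).T @ l

# Transformed points satisfy the transformed line equation (up to rounding).
print(np.isclose(l_prime @ (H @ x1), 0.0))   # True
print(np.isclose(l_prime @ (H @ x2), 0.0))   # True
```

The check works because the product of the transformed line and the transformed point reduces to the original inner product of line and point, which is zero.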
Hierarchy of Transformations
Overview of 2D Transformations
- A hierarchy exists among various transformations: translation, Euclidean, similarity, affine, and projective. Each has distinct representations and properties regarding degrees of freedom.
Properties Preservation
- Similarity transformations preserve angles and parallelism while translations maintain orientation along with other properties.
- Projective transformations primarily preserve straight lines without maintaining angles or distances.
Estimating Parameters for Transformations
Focus on Perspective Transformation
Understanding Homography and Its Estimation
Degrees of Freedom in Homography
- The homography transformation has eight degrees of freedom, requiring at least four correspondences between two images to estimate it accurately.
- Each pixel correspondence provides two constraints (x and y coordinates), leading to the need for a minimum of four correspondences, although more can be used for an over-determined system.
Representing Correspondences
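- The relationship between a point before and after the transformation is expressed in homogeneous coordinates as $\tilde{\mathbf{x}}'_i = \mathbf{H} \tilde{\mathbf{x}}_i$, where $\tilde{\mathbf{x}}_i$ is the original point and $\tilde{\mathbf{x}}'_i$ the transformed point.
- Because homogeneous vectors are defined only up to scale, the two sides need to agree in direction but not in magnitude. This is expressed with a cross product: $\tilde{\mathbf{x}}'_i \times \mathbf{H} \tilde{\mathbf{x}}_i = \mathbf{0}$.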
Linear System Formulation
- The equation can be rewritten as a linear equation in the vector $\mathbf{h}$, which stacks the nine entries of the transformation matrix, allowing a structured approach to solving for the unknowns.
- This leads to a linear system $\mathbf{A}_i \mathbf{h} = \mathbf{0}$ with three rows per correspondence, of which only two are linearly independent, so the third row can be dropped.
Constructing the Matrix
- Each correspondence therefore yields two equations, i.e., a 2 by 9 matrix.
- Stacking the matrices of N correspondences gives a 2N by 9 matrix $\mathbf{A}$; with more than four correspondences the system is over-determined, which improves the estimate of the vector $\mathbf{h}$.
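For reference, the two retained rows for one correspondence can be written out by expanding the cross-product constraint. The notation is an assumption of this sketch: $\mathbf{h}_k^\top$ denotes the k-th row of $\mathbf{H}$ (so $\mathbf{h}$ stacks the rows), and $\tilde{\mathbf{x}}'_i = (\tilde{x}'_i, \tilde{y}'_i, \tilde{w}'_i)$.

```latex
\underbrace{\begin{bmatrix}
\mathbf{0}^\top & -\tilde{w}'_i\,\tilde{\mathbf{x}}_i^\top & \tilde{y}'_i\,\tilde{\mathbf{x}}_i^\top \\
\tilde{w}'_i\,\tilde{\mathbf{x}}_i^\top & \mathbf{0}^\top & -\tilde{x}'_i\,\tilde{\mathbf{x}}_i^\top
\end{bmatrix}}_{\mathbf{A}_i \,\in\, \mathbb{R}^{2 \times 9}}
\begin{bmatrix} \mathbf{h}_1 \\ \mathbf{h}_2 \\ \mathbf{h}_3 \end{bmatrix}
= \mathbf{0}
```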
Optimization Problem
- The solution minimizes the squared error $\|\mathbf{A}\mathbf{h}\|^2$ subject to a constraint on $\mathbf{h}$.
- A Lagrangian formulation with a multiplier enforces that the norm of $\mathbf{h}$ equals one, which removes the scale ambiguity and rules out the trivial solution $\mathbf{h} = \mathbf{0}$.
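Written out, the constrained problem takes the following form. This is a sketch; the sign in front of the multiplier $\lambda$ is a matter of convention.

```latex
\mathbf{h}^* = \operatorname*{argmin}_{\mathbf{h}} \; \|\mathbf{A}\mathbf{h}\|_2^2 + \lambda\,(\|\mathbf{h}\|_2^2 - 1)
             = \operatorname*{argmin}_{\mathbf{h}} \; \mathbf{h}^\top \mathbf{A}^\top \mathbf{A}\,\mathbf{h} + \lambda\,(\mathbf{h}^\top \mathbf{h} - 1)
```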
Singular Value Decomposition Approach
- The solution of the optimization problem is the right singular vector associated with the smallest singular value of the matrix $\mathbf{A}$.
- This method parallels concepts discussed in PCA but employs singular value decomposition rather than eigenvalue decomposition.
Eigenvalue Problem Context
- Setting the gradient of this expression to zero leads to an eigenvalue problem for $\mathbf{A}^\top \mathbf{A}$, similar to the one encountered in PCA.
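The estimation procedure described above (two rows per correspondence, stacked into one matrix, then the singular vector of the smallest singular value) can be sketched with NumPy. The function name `estimate_homography` is hypothetical, the input points are assumed to be inhomogeneous pixel coordinates (so the last homogeneous element is 1), and the sketch omits refinements such as coordinate normalization that a robust implementation would likely add.

```python
import numpy as np


def estimate_homography(src, dst):
    """Estimate H (up to scale) from N >= 4 point pairs, dst ~ H @ src."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows = []
    for (x, y), (xp, yp) in zip(src, dst):
        xb = np.array([x, y, 1.0])          # augmented source point
        zero = np.zeros(3)
        # Two independent rows of the cross-product constraint (w' = 1).
        rows.append(np.concatenate([zero, -xb, yp * xb]))
        rows.append(np.concatenate([xb, zero, -xp * xb]))
    A = np.vstack(rows)                     # shape (2N, 9)
    _, _, Vt = np.linalg.svd(A)             # singular values sorted descending
    h = Vt[-1]                              # unit-norm right singular vector
    return h.reshape(3, 3)                  # rows of H stacked in h


# Synthetic check with a known (made-up) homography.
H_true = np.array([[1.0, 0.2, 0.5],
                   [0.1, 1.1, -0.3],
                   [0.001, 0.002, 1.0]])
src = np.array([[0, 0], [1, 0], [1, 1], [0, 1], [0.5, 0.2]], dtype=float)
proj = (H_true @ np.c_[src, np.ones(len(src))].T).T
dst = proj[:, :2] / proj[:, 2:]

H_est = estimate_homography(src, dst)
print(np.allclose(H_est / H_est[2, 2], H_true / H_true[2, 2]))
```

Because the result is defined only up to scale, the comparison divides both matrices by their bottom-right entry before checking agreement.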
Estimation and Homography in Image Stitching
Defining Correspondences Between Images
- The discussion begins with the concept of estimation applied to systems expressed in homogeneous coordinates, emphasizing the importance of defining correspondences between two images taken from the same viewpoint.
- The images must be captured from the same viewpoint, but the camera may be rotated about its own center between shots, so the individual images can cover different parts of the scene.
- The process involves identifying specific points (correspondences) on both images, such as notable features on a hill, which are crucial for aligning and stitching the images together into a panorama.
- By determining at least four correspondences between two images, one can utilize Singular Value Decomposition (SVD) to solve for the homography matrix H , which defines how one image can be warped into another's space.