Chapter 4: Geometry and Coordinate Systems

To represent objects in a digital world, we need a mathematical framework to define their position, size, and orientation. This is the role of coordinate systems and geometry.

Euclidean Geometry

Euclidean geometry is the study of plane and solid figures on the basis of axioms and theorems employed by the Greek mathematician Euclid. In computer graphics, it forms the basis for defining shapes using points, lines, and planes.

Point: A zero-dimensional object specifying a position in space.
Line: A one-dimensional object that extends infinitely in both directions.
Plane: A two-dimensional flat surface that extends infinitely.

Cartesian Coordinate Systems

The Cartesian coordinate system is the most common way to represent geometry.

2D Coordinate System

In 2D, a point $P$ is represented by a pair $(x, y)$ , where $x$ and $y$ are its distances from two perpendicular axes.

Standard Screen Coordinates: Typically, the origin $(0, 0)$ is at the top-left corner, with $x$ increasing to the right and $y$ increasing downwards.
Mathematical Coordinates: The origin is at the center, with $x$ increasing to the right and $y$ increasing upwards.

3D Coordinate System

In 3D, a third axis ( $z$ ) is added, representing depth. There are two main types:

Right-Handed System: Standard in mathematics and many 3D tools (e.g., Blender). If you curl your fingers from $x$ to $y$ , your thumb points to $+z$ .
Left-Handed System: Often used in game engines (e.g., DirectX, Unity). $+z$ points into the screen.

Homogeneous Coordinates

Homogeneous coordinates are a system of coordinates used in projective geometry. In computer graphics, they are essential because they allow us to represent translations, rotations, and projections using a single matrix multiplication.

Instead of representing a 2D point as $(x, y)$ , we represent it as $(x, y, w)$ . Similarly, a 3D point $(x, y, z)$ becomes $(x, y, z, w)$ .

For most points, $w = 1$ .
To convert back to Cartesian coordinates, divide each component by $w$ : $(x/w, y/w, z/w)$ .
If $w = 0$ , the point represents a direction or a point at infinity.

Why use them?

In standard Cartesian coordinates, translation is an addition operation, while rotation and scaling are multiplications. By using 4x4 matrices and homogeneous coordinates (adding the $w$ component), translation also becomes a multiplication. This allows us to chain multiple transformations into a single matrix.

Coordinate Spaces in the Rendering Pipeline

In a 3D application, an object's coordinates pass through several "spaces":

Object Space (Local Space): Coordinates relative to the object's origin (e.g., the center of a 3D character).
World Space: Coordinates relative to the entire scene's origin.
View Space (Camera Space): Coordinates relative to the camera's position and orientation.
Clip Space: Coordinates after being projected by the camera's lens.
Screen Space: The final 2D coordinates on the display monitor.

Vectors and Normalization

A vector represents a direction and a magnitude. In graphics, we often use unit vectors (vectors with a magnitude of 1). The process of converting any vector into a unit vector is called normalization.

$\hat{v} = \frac{v}{\|v\|}$

Unit vectors are crucial for lighting calculations, as they simplify dot product results to the cosine of the angle between two directions.

Summary

Geometry and coordinate systems provide the spatial vocabulary for computer graphics. Homogeneous coordinates are the "glue" that allows us to perform complex transformations efficiently using linear algebra. Understanding the transition between different coordinate spaces is the key to building any 3D rendering system.