The destination space for the vertex shader (referred to earlier as screen space) is more precisely called projection space. The visible part of projection space is the unit-radius cube from (–1, –1, –1) to (1, 1, 1); anything outside this cube gets clipped and thrown out. The x and y axes map across the viewport, the part of the screen in which any rendered output will be displayed, with (–1, –1, z) corresponding to the lower left corner, (1, 1, z) to the upper right, and (0, 0, z) to the center. The rasterizer uses the z coordinate to assign a depth value to every fragment it generates; if the framebuffer has a depth buffer, these depth values can be compared against the depth values of previously rendered fragments, allowing parts of newly rendered objects to be hidden behind objects that have already been rendered into the framebuffer. (x, y, –1) is the near plane and maps to the nearest depth value; at the other end, (x, y, 1) is the far plane and maps to the farthest depth value. Fragments with z coordinates outside that range get clipped against these planes, just as they are against the edges of the screen.
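To make the x, y mapping concrete, here's a sketch in GLSL of what the rasterizer does internally when it maps projection-space coordinates onto the viewport (the function name and signature here are just for illustration; this isn't code we write ourselves):

// Illustration only: the rasterizer performs this mapping for us.
// (-1, -1) lands on the lower left corner of the viewport,
// (1, 1) on the upper right, and (0, 0) in the center.
vec2 ndc_to_window(vec2 xy, vec2 viewport_size)
{
    return (xy + vec2(1.0)) * 0.5 * viewport_size;
}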
Projection space is computationally convenient for the GPU, but it's not very usable by itself for modeling vertices within a scene. Rather than input projection-space vertices directly to the pipeline, most programs use the vertex shader to project objects into it. The pre-projection coordinate system used by the program is called world space, and can be moved, scaled, and rotated relative to projection space in whatever way the program needs. Within world space, objects also need to move around, changing position, orientation, size, and shape. Both of these operations, mapping world space to projection space and positioning objects in world space, are accomplished by performing transformations with mathematical structures called matrices.
Linear transformations are operations on an object that preserve the relative size and orientation of parts within the object while uniformly changing its overall size or orientation. They include rotation, scaling, and shearing. If you've ever used the "free transform" tool in Photoshop or GIMP, these are the sorts of transformations it performs. You can think of a linear transformation as taking the x, y, and z axes of your coordinate space and mapping them to a new set of arbitrary axes x', y', and z':
For clarity, the figure is two-dimensional, but the same idea applies to 3d. To represent a linear transformation numerically, we can take the vector values of those new axes and arrange them into a 3×3 matrix. We can then perform an operation called matrix multiplication to apply a linear transformation to a vector, or to combine two transformations into a single matrix that represents the combined transformation. In standard mathematical notation, the axes of a matrix are written as columns going left-to-right. In GLSL and in the OpenGL API, a matrix is represented as an array of vectors, each vector representing a column of the matrix; in source code, this makes the values look transposed from their mathematical notation. This is called column-major order (as opposed to row-major order, in which each vector element of the matrix array would be a row of the matrix). GLSL provides 2×2, 3×3, and 4×4 matrix types named mat2 through mat4. It also overloads its multiplication operator for use between matn values of the same type, and between matns and vecns, to perform matrix-matrix and matrix-vector multiplication.
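For example, here's a sketch (the values are made up purely for illustration) of how a mat3 constructor's arguments read transposed from conventional mathematical notation:

// Each vec3 argument is a *column* of the matrix, so in mathematical
// notation this matrix would be written:
//     | 1.0  0.0  2.0 |
//     | 0.0  1.0  3.0 |
//     | 0.0  0.0  1.0 |
mat3 m = mat3(
    vec3(1.0, 0.0, 0.0),  // first column: where the x axis maps to
    vec3(0.0, 1.0, 0.0),  // second column: where the y axis maps to
    vec3(2.0, 3.0, 1.0)   // third column: where the z axis maps to
);
vec3 v = m * vec3(1.0, 1.0, 1.0);  // matrix-vector multiplication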
A nice property of linear transformations is that they work well with the rasterizer's linear interpolation. If we transform all of the vertices of a triangle using the same linear transformation, every point on its surface will retain its relative position to the vertices, so textures and other varying values will transform with the vertices they fill out.
Note that all linear transformations occur relative to the origin, that is, the (0, 0, 0) point of the coordinate system, which remains constant through a linear transformation. (Any matrix multiplied by the zero vector gives back the zero vector, so a 3×3 matrix always maps the origin to itself.) Because of this, moving an object around in space, called translation in mathematical terms, is not a linear transformation, and cannot be represented with a 3×3 matrix or composed into other 3×3 linear transform matrices. We'll see how to integrate translation into transformation matrices shortly. For now, let's try some linear transformations:
We'll start by writing a shader that spins our rectangle around the z axis. Using the timer uniform value as a rotation angle, we'll construct a rotation matrix, using the sin and cos functions to rotate our matrix axes around the unit circle. The shader looks like this:
uniform float timer;

attribute vec4 position;

varying vec2 texcoord;
varying float fade_factor;

void main()
{
    mat3 rotation = mat3(
        vec3( cos(timer),  sin(timer),  0.0),
        vec3(-sin(timer),  cos(timer),  0.0),
        vec3(        0.0,         0.0,  1.0)
    );
    gl_Position = vec4(rotation * position.xyz, 1.0);
    texcoord = position.xy * vec2(0.5) + vec2(0.5);
    fade_factor = sin(timer) * 0.5 + 0.5;
}
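To connect this back to the axis-mapping picture (an aside, not part of the shader): multiplying the matrix by the unit x axis returns exactly the matrix's first column, a point on the unit circle at angle timer:

// rotation applied to the x axis yields the matrix's first column:
vec3 rotated_x = rotation * vec3(1.0, 0.0, 0.0);
// rotated_x == vec3(cos(timer), sin(timer), 0.0)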
You probably noticed that the rectangle appears horizontally distorted as it rotates. This is because our window is wider than it is tall, so a unit along the x axis of projection space covers more screen distance than the same unit along the y axis. The window is 400 pixels wide and 300 pixels high, giving it an aspect ratio of 4:3 (the width divided by the height). (This will change if we resize the window, but we won't worry about that for now.) We can compensate by applying a scaling matrix that scales the x axis by the reciprocal of the aspect ratio, as in the shader below:
mat3 window_scale = mat3(
    vec3(3.0/4.0, 0.0, 0.0),
    vec3(    0.0, 1.0, 0.0),
    vec3(    0.0, 0.0, 1.0)
);
mat3 rotation = mat3(
    vec3( cos(timer),  sin(timer),  0.0),
    vec3(-sin(timer),  cos(timer),  0.0),
    vec3(        0.0,         0.0,  1.0)
);
gl_Position = vec4(window_scale * rotation * position.xyz, 1.0);
texcoord = position.xy * vec2(0.5) + vec2(0.5);
fade_factor = sin(timer) * 0.5 + 0.5;
Note that the order in which we rotate and scale is important. Unlike scalar multiplication, matrix multiplication is noncommutative: changing the order of the arguments gives different results. This should make intuitive sense: "rotate an object, then squish it horizontally" gives a different result from "squish an object horizontally, then rotate it". In matrix math, transformation sequences are written out right-to-left, backwards compared to English: scale * rotate * vector rotates the vector first, whereas rotate * scale * vector scales first.
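As a sketch of the difference, using the matrices from our shader:

// These two products are different matrices, and give visibly
// different results when applied to the same vertex:
vec3 rotate_then_squish = window_scale * rotation * position.xyz;
vec3 squish_then_rotate = rotation * window_scale * position.xyz;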
Now that we've compensated for the distortion of our window's projection space, we've revealed a dirty secret: our input rectangle is really a square, and it doesn't match the aspect ratio of our image, leaving it scrunched. We need to scale it back outward, this time before we rotate, as in the shader below:
mat3 window_scale = mat3(
    vec3(3.0/4.0, 0.0, 0.0),
    vec3(    0.0, 1.0, 0.0),
    vec3(    0.0, 0.0, 1.0)
);
mat3 rotation = mat3(
    vec3( cos(timer),  sin(timer),  0.0),
    vec3(-sin(timer),  cos(timer),  0.0),
    vec3(        0.0,         0.0,  1.0)
);
mat3 object_scale = mat3(
    vec3(4.0/3.0, 0.0, 0.0),
    vec3(    0.0, 1.0, 0.0),
    vec3(    0.0, 0.0, 1.0)
);
gl_Position = vec4(window_scale * rotation * object_scale * position.xyz, 1.0);
(Alternately, we could change our vertex array and apply a scaling transformation to our generated texcoords. But I promised we wouldn't be changing the C code anymore in this chapter.)
With this shader, our rectangle now rotates the way we would expect it to:
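As an aside, since matrix multiplication composes transformations, we could just as well collapse the three matrices into a single mat3 before applying it; this sketch is equivalent to the gl_Position line above:

// Equivalent formulation: combine the transformations once, apply once.
mat3 transform = window_scale * rotation * object_scale;
gl_Position = vec4(transform * position.xyz, 1.0);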
The window_scale matrix conceptually serves a different purpose from the rotation and object_scale matrices. While the latter two (rotation and object_scale) set up our input vertices to be where we want them in world space, window_scale serves to project world space into projection space in a way that gives an undistorted final render. Matrices used to orient objects in world space, like our rotation and object_scale matrices, are called model-view matrices, because they are used both to transform models and to position them relative to the viewport. The matrix we use to project, in this case window_scale, is called the projection matrix. Although both kinds of matrix behave the same, and the line drawn between them is mathematically arbitrary, the distinction is useful: a 3d application will generally need only a few projection matrices that change rarely (usually only if the window size or screen resolution changes), while there can be countless model-view matrices for all of the objects in a scene, updating constantly as the objects animate. The figure below summarizes the roles:
[Figure: the roles played by our shader's matrices]

    Model Matrix:       rotation
    View Matrix:        object_scale
    Projection Matrix:  window_scale
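One common way to exploit this split (sketched here with hypothetical uniform names; it isn't what our current program does) is to supply the projection and model-view matrices as separate uniforms, so each can be updated only as often as it actually changes:

uniform mat3 projection;  // set rarely, e.g. when the window is resized
uniform mat3 model_view;  // set every frame as objects animate

attribute vec4 position;

void main()
{
    gl_Position = vec4(projection * model_view * position.xyz, 1.0);
}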
Projecting with a scaling matrix, as we're doing here, produces an orthographic projection, in which objects in 3d space are rendered at a constant scale regardless of their distance from the viewport. Orthographic projections are useful for rendering two-dimensional display elements, such as the UI controls of a game or graphics tool, and in modeling applications where the artist needs to see the exact scales of different parts of a model, but they don't present 3d scenes the way most viewers expect. To demonstrate this, let's break out of the 2d plane and alter our shader to rotate the rectangle around the x axis, as in the shader below:
const mat3 projection = mat3(
    vec3(3.0/4.0, 0.0, 0.0),
    vec3(    0.0, 1.0, 0.0),
    vec3(    0.0, 0.0, 1.0)
);
mat3 rotation = mat3(
    vec3(1.0,         0.0,         0.0),
    vec3(0.0,  cos(timer),  sin(timer)),
    vec3(0.0, -sin(timer),  cos(timer))
);
mat3 scale = mat3(
    vec3(4.0/3.0, 0.0, 0.0),
    vec3(    0.0, 1.0, 0.0),
    vec3(    0.0, 0.0, 1.0)
);
gl_Position = vec4(projection * rotation * scale * position.xyz, 1.0);
With an orthographic projection, the rectangle doesn't very convincingly rotate in 3d space; it just sort of accordions up and down. This is because the top and bottom edges of the rectangle remain the same apparent size as they move toward and away from the view. In the real world, objects appear smaller in our field of view in proportion to how far from our eyes they are. This effect is called perspective, and transforming objects to take perspective into account is called perspective projection. Perspective projection is accomplished by shrinking objects proportionally to their distance from the "eye". An easy way to do this is to divide each point's position by some function of its z coordinate. Let's arbitrarily decide that zero on the z axis remains unscaled, and that points elsewhere on the z axis scale by half their distance from zero. Correspondingly, let's also scale the z axis by half, so that the end of the rectangle coming toward us doesn't get clipped against the near plane as it gets magnified. We end up with the shader code in naive-perspective-rotation.v.glsl:
const mat3 projection = mat3(
    vec3(3.0/4.0, 0.0, 0.0),
    vec3(    0.0, 1.0, 0.0),
    vec3(    0.0, 0.0, 0.5)
);
mat3 rotation = mat3(
    vec3(1.0,         0.0,         0.0),
    vec3(0.0,  cos(timer),  sin(timer)),
    vec3(0.0, -sin(timer),  cos(timer))
);
mat3 scale = mat3(
    vec3(4.0/3.0, 0.0, 0.0),
    vec3(    0.0, 1.0, 0.0),
    vec3(    0.0, 0.0, 1.0)
);
vec3 projected_position = projection * rotation * scale * position.xyz;
float perspective_factor = projected_position.z * 0.5 + 1.0;
gl_Position = vec4(projected_position / perspective_factor, 1.0);
Now the overall shape of the rectangle appears to rotate in perspective, but the texture mapping is all kinky. This is because perspective projection is a nonlinear transformation—different parts of the rectangle get scaled differently depending on how far away they are. This interferes with the linear interpolation the rasterizer applies to the texture coordinates across the surface of our triangles. To properly project texture coordinates as well as other varying values in perspective, we need a different approach that takes the rasterizer into account.
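As a preview of that approach (a sketch only, not our final code): gl_Position is a four-component vector, and after the vertex shader runs, the GPU divides x, y, and z by the w component. If we store our perspective factor in w instead of dividing by it ourselves, the rasterizer can also use w to interpolate texcoord and fade_factor in a perspective-correct way:

// Sketch: put the perspective factor in w and let the GPU divide,
// so varyings get perspective-correct interpolation too.
gl_Position = vec4(projected_position, perspective_factor);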