We'll be using only python and its official GUI, tkinter (so no official 3D engine will be used like OpenGL(PyOpenGL, PyGame) or Direct3D).

Everything here is available on the Internet but it is time-consuming to gather all the info.

## Spaces

#### Object/Model Space(3D)

We design our model (e.g. a cube) in the model's coordinate system(CS). The center of the object is the origin.

The object can be scaled, rotated here. This is the model's local CS.

See the image below (World-Space): the cube's local CS is the one defined by the red axes.

#### World Space(3D)

Let's say that we designed a cube in model-space. We want to populate our world by many cubes, so we take the cube from model space and put it into our world (the one defined by the blue axes).

For example we put the cube in our world so that the cube's center is at (2, 5, -3).

Now we want to put another cube (the same one from Model-Space) but now we scale it by 2 and put it at (-4, -4, 2).

Finally we want to put a cube into our world that is standing only on one of its edges: we rotate the cube 45 degrees around the Z-axis and then put it into our world at e.g. (4, 4, 4).

These actions (rotations, scaling and "put") can be achieved by applying transformations.

#### View/Camera Space(3D)

This is relative to the camera (i.e what we see from our world). They say that the camera is at (0,0,0) and is looking down the negative Z axis (in a right-handed CS) but it seems that it is at (0,0,1). The camera-position is also called "the eye"-position.

We treat the camera as if it would be a model. We apply the same transformations as above in case of the cube but now we need to invert everything.

If we want our camera to be at (0, 0, 2) in our world then we need to use (0, 0, -2) translation in our transformation because we want the world to come to us.

#### Screen Space(2D)

By mapping the 3D View-Space to 2D we will be in Screen-Space.

## Coordinate System handedness

3D cartesian coordinate system (X, Y, Z axes [plural of axis]):

Left-handed (Direct3D) +Z is into the screen towards the wall behind the monitor, Right-handed (OpenGL) +Z is out of the screen towards us.

To convert between the two: just negate the Z-coordinate.

We'll be using the right-handed CS like OpenGL.

## Vector

A point can be represented in 3D-space (in Cartesian coordinate system) by its three coordinates: P(x, y, z)

Vector v=[v1, v2, v3] points from the origin(where the three axes intersect) to point P, v[1.15, 1.35, 0.9].

A vector shows a direction and has a magnitude.

#### Homogeneous Coordinates

Point P(x, y, z, 1) is in homogeneous coordinates. The last (the fourth) coordinate is called w.

In order to be able to multiply a 4*4 matrix (see later) we need this fourth coordinate: v = [x, y, z, w].

If after the multiplication(i.e. transformation) the transformed w' is not 1 then we will do:

x' = x'/w'

It is important that only a homogeneous point with w=1 can be used as a cartesian-point.

We won't code w into the Vector3D class. We will only assume it.

## Matrix

A 4*4 matrix:

Simply put, a matrix is a two dimensional array (first index is the row number and the second one is the column). If the number of the rows is equal to that of the columns then we have a square (or quadratic) matrix. A matrix can be e.g. 3*5 (3 rows * five columns) too. In 3D programming only 4*4 matrices are used.

#### Row-major vs Column--major Matrices

"Majorness" means how the vectors are aligned in a matrix.

In a __column-major__ matrix the components of a vector are in one column (v, u, f are vectors):

v1 u1 f1

M = v2 u2 f2

v3 u3 f3

If we want to multiply this M matrix with a vector a=[a1, a2, a3]: (M*a)

a1

a2

a3

v1 u1 f1 x1

v2 u2 f2 x2

v3 u3 f3 x3

The resulting vector is x(x1, x2, x3)

x1 = v1*a1+u1*a2+f1*a3

__Column-major__ matrices are multiplied from the right (vector "a" is on the right).

In case of __column-major__ matrices the order of the transformations(see later):

P' = Rx*Ry*Rz*T*P

where Rx, Ry and Rz are the rotation matrices, T is the translation matrix, P is the point to transform and P' is the transformed point.

In a __row-major__ matrix the components of a vector are in one row (v, u, f are vectors): (a*M)

v1 v2 v3

M = u1 u2 u3

f1 f2 f3

If we want to multiply this M matrix with a vector a=[a1, a2, a3]:

a1 a2 a3

v1 v2 v3

x1 x2 x3 u1 u2 u3

f1 f2 f3

The resulting vector is x(x1, x2, x3)

x1 = a1*v1+a2*u1+a3*f1

x2 = a1*v2+a2*u2+a3*f2

x3 = a1*v3+a2*u3+a3*f3

__Row-major__ matrices are multiplied from the left (vector "a" is on the left).

In case of __row-major__ matrices the order of the transformations(see later):

P' = P*T*Rz*Ry*Rx

where Rx, Ry and Rz are the rotation matrices, T is the translation matrix, P is the point to transform and P' is the transformed point.

A matrix need to be transposed(reflect its elements on its diagonal) if want to change between left and right majorness (i.e. multiplication).

Matrix-layout (row- or column-major) matters only when the user sets or gets the items of a matrix by indexing.

In case of e.g. multiplication we know in the code that the components of the matrix-vectors need to be multiplied with the vector.

We'll be using column-major matrices like OpenGL.

## Scale Matrix

Example in 2D (scaling by 1/2 both in case of X and Y):

## Shear Matrix

SHxy =

SHxz =

SHyz =

SH = SHxy*SHxz*SHyz

Example in 2D:

## Rotation Matrix

Rx =

Ry =

Rz =

R = Rx*Ry*Rz

Example in 2D:

The order of the rotations matter:

Rxy = Rx*Ry is different from Rxy = Ry*Rx.

Change the signs of the sines in all three matrices in order to get Left-handed coordinate system.

"Right Hand Rule" for rotations: grasp axis with right hand with thumb oriented in positive direction, fingers will then curl in direction of positive rotation for that axis.

## Translation Matrix

T =

The offset (p, q, r) will be added to the point to be translated. Where p is x, q is y and r is z.

Example in 2D:

## Projection Matrix

I coudn't find an image of a column-major projection matrix so this is a row-major one (need to be transposed).

There can be orthogonal and perspective projections. This is a perspective projection matrix.

P =

The points are projected on the plane which is 1 unit away from the camera.

Where fov (field of view) is usually between 30 and 90 degrees. fov here is the horizontal angle of the screen.

f: zFar (far clipping plane), can be e.g. 100.0

n: zNear (near clipping plane), can be e.g. 0.1

Note that this matrix didn't work.

However it works if we change it like this:

P[(3, 3)] = 1.0 (0 to 1.0)

Other variants of this matrix can be found in the code. They take the aspect ratio of the viewport (screen) into account so a cube will be a cube even if the screen is not square-shaped. It's worth experimenting with them.

Note that the projection matrix will project to [-1; 1], so first we will get rid of all the points that are out of the screen (viewing-volume or viewing frustum):

This image shows a left-handed CS, so the Z-axis need to be reflected on the XY-plane.

We work with polygons (e.g. a face of the cube). What we will do is to check every vertex (or point) of a polygon and if at least one is in the viewing-volume then we will draw the whole polygon. So we won't do clipping.

inviewingvolume = False

For every vertex of the polygon:

Let P'(x', y', z') be a projected point of the polygon.

if ((-1.0 <= x' <= 1.0) and (-1.0 <= y' <= 1.0) and (-1.0 <= z' <= 1.0))

inviewingvolume = True

break

If inviewingvolume is true then we will draw the polygon.

This is important because it will be faster because we will only draw the polygons that are visible.

Not to mention that without this a polygon behind us would be drawn on the screen (x and y in range but z out of range).

Now we know if a polygon need to be drawn. The next step: the projected points need to be scaled on screen. The direction of Y need to be negated because according to screen coordinates Y is zero at the top.

The simplest is to use a toScreen matrix:

WIDTH/2 0 0 WIDTH/2

toScreen = 0 -HEIGHT/2 0 HEIGHT/2

0 0 1 0

0 0 0 1

Here WIDTH is the width of the screen and HEIGHT is the height of the screen (e.g. 640*480). Note that toScreen[(1,1)] negates the Y coordinate and toScreen[(0,3)] and toScreen[(1,3)] are the offsets.

Check the "toScreenCoords" function in the code for details.

## Transformations

Transformations need to be applied in this order: Scaling, rotating, translating and projecting.

Let's forget about scaling and shearing for a moment and let T be the transformation matrix:

| Rx1 Ry1 Rz1 0 | | 0 0 0 X |

T = R*Tr = | Rx2 Ry2 Rz2 0 | * | 0 0 0 Y |

| Rx3 Ry3 Rz3 0 | | 0 0 0 Z |

| 0 0 0 1 | | 0 0 0 1 |

where R=Rx*Ry*Rz and Tr is the translation matrix.

What do the vectors mean in T? T is a 4*4 column-major matrix. Its first 3 dimensional vectors(3*3 submatrix) contain the rotated X, Y and Z axes. The first vector is called Right, the second one is called Up and the third one is called Forward. The rotated X, Y and Z axes respectively. The 3rd column of the inverted rotation matrix contains the vector the camera is facing. However inverting a matrix is slow.

The last vector (column) contains the translation in the rotated coordinate system:

| R1 U1 F1 Tx |

T = | R2 U2 F2 Ty |

| R3 U3 F3 Tz |

| 0 0 0 1 |

(Tx, Ty, Tz) are the projections of the point(X, Y, Z) on the rotated axes. The rotated translation. They are the dot products.

## Visibility Algorithms

Polygons can be in the screen, outside of the screen or partly in the screen.

1. If the polygon is entirely out of screen (viewing volume) then it needn't be rendered. That is easy to implement (see the Projection-Matrix-section) and it is called __viewing frustum culling__.

2. If the polygon is only partly in the screen then __clipping__ should be applied (i.e. cutting the polgon into two parts, draw only the visible part and throw away the part that is outside of the screen).

3. If the polygon is fully in the screen then it can be visible or hidden by other polygons (maybe only partly).

In case of 2. and 3. when there are several polygons in the screen we need to decide in what order to draw them (which one is behind the other one).

See __Painter's algorithm__, __BSP-trees__ and __Z-Buffering__ below. During the drawing, the polygons of an object that face away from us needn't be drawn (See __Backface Culling__ below).

#### Painter's Algorithm

This is the simplest one. Draw the polygons according to their distance from the viewer. First draw the farthest one and the closest one last.

Compute the average Z in case of every polygon and use that.

However overlapping polygons can cause the algorithm to fail.

#### BSP-Trees

It solves the problem of the overlapping polygons of the Painter's Algorithm.

However it is more complicated.

#### Z-Buffering

This is pixel-level and slow unless HW-based. This gives the best result.

There are other algorithms as well.

## Backface Culling

Polygons (or faces) of an object needn't be drawn even if in the screen, if they face away from us because the other faces of the object will hide them.

We can skip drawing these polygons in order to have better performance.

To achieve this we can use surface-normals or a area of the polygon (or the sum of the edges).

#### Surface Normal

First we must calculate the normal-vectors of the polygons(faces or surfaces). Next we have to calculate our view-vector (the direction the camera is facing). This just means to compute the vector from the camera-position to one of the vertices of the polygon.

By computing the dot product between these two vectors we will get the angle between them. Depending on the sign of the dot product the polygon is visible or invisible.

This has been implemented in tut5.

#### Area of Polygon (or Sum of Edges)

Our polygons need to be drawn either in clockwise or counter-clockwise order.

This means that we should create the vertices of a polygon(or face) of an object as if we would stand facing that polygon.

We can chose either clockwise or counter-clockwise but all of our polygons need to be defined in that order.

There is a simple algorithm to get to know if a vertices of a polygon is in clockwise or counter-clockwise order (it works only for convex polygons):

Sum over the edges (x2-x1)*(y2+y1):

Point[0] = (5, 0) edge[0]: (6-5)*(4+0) = 4

Point[1] = (6, 4) edge[1]: (4-6)*(5+4) = -18

Point[2] = (4, 5) edge[2]: (1-4)*(5+5) = -30

Point[3] = (1, 5) edge[3]: (1-1)*(0+5) = 0

Point[4] = (1, 0) edge[4]: (5-1)*(0+0) = 0

--------

-44

If our polygons were drawn in a clockwise order then this -44 means that this projected polygon is in a counter-clockwise order, so it shouldn't be drawn because it cannot be seen.

if (x2-x1)*(y2+y1) > 0 ===> Clockwise

Note that maybe in a left-handded CS this needs to be reversed:

if (x2-x1)*(y2+y1) < 0 ===> Clockwise

This has been implemented in tut2.

For the "area of the polygon"-method see the link to Paul Burke's article below.

## Example

#### tut1

Needs both tut1.py and matrix.py

To start:

Red: +X

Green: +Y

Blue: +Z

We can rotate around the X and Y axes by pressing the left mouse-button and moving the mouse (dragging).

If we press the left-control key too, then we will rotate around the Z axis.

There are the lines in the code:

self.ang = [0.0, 0.0, 0.0]

self.trans = [0.0, 0.0, 0.0]

ang contains the angles in degrees to rotate (x, y, z). if set to e.g. [0.0, 45.0, 0.0] then an initial rotation will take place around the Y axis.

trans contains the translation (x, y, z). If set to e.g. (0, 0, -2) then the camera will be put to (0, 0, 2) so the cube will be farther away.

#### tut2

Needs both tut2.py and matrix.py

To start:

python ./tut2.py

It works the same way as tut1 (rotations, self.ang and self.trans)

Added backface-culling and the cube is filled. Changed to toScreen-matrix.

If we change the line in isPolygonFrontFace in the code:

return summa > 0.0

to

then it will be frontface-culling

#### tut3

Needs both tut3.py and matrix.py

To start:

python ./tut3.py

It works the same way as tut2 (rotations, self.ang and self.trans).

Changed to another projection matrix (they say that it's OpenGL's; takes the aspect ratio of the screen into account).

A circle in 3D. We draw a circle in the xy-plane by rotating one initial point and connecting the dots with lines.

#### Tut4

Needs both tut4.py and matrix.py

To start:

python ./tut4.py

It works the same way as tut3 (rotations, self.ang and self.trans) but scaled by 1/2.

A sphere starting from an icosahedron. Recursion-level can be changed in the create() of the IcoSphere class.

See the link to Andreas Kahler's webpage for details.

#### Tut5

Needs both tut5.py and matrix.py

To start:

python ./tut5.py

self.ang and self.trans are in Camera-CS.

Backface culling with surface normals.

Collision detection added. There are six walls (planes) but they are not shown.

We can move with the arrow keys (Up: forward, Down: backward, Left: turn left, Right: turn right), 'u': upward and 'd': downward. Press 'Esc' to quit.

The position, rotation and forward-vector of the camera are shown.

#### Clipping

A cube can be moved with the mouse. Cohen-Sutherland line-clipping algorithm in 3D.

Change "PLANE = 1" to e.g. "PLANE = 0.8" to see better.

To start:

python ./clipping.py

Double buffering is missing and we should think about having our objects only form triangles or from any kinds of polygons.

#### The wepages that helped me the most:

Scratchapixel, Basic lessons, Geometry:

Scratchapixel, Advanced lessons, Perspective Projection Matrix:

Collision detection:

How to draw a 3D circle:

How to draw a sphere:

OpenGL:

robert dot pluto at gmail dot com