This post provides a high-level (and non-rigorous) overview, a map of concepts/terms to jog one's memory. Inspired by here
Vector: Think 2D or 3D points, but vectors can be more general objects. They live in a "vector space" if they follow certain rules
Scalar: a real number
Vector space: a (generally infinite) set of vectors, together with two operations (addition, scalar multiplication) that obey certain rules. See here for all 8 properties
Linear combination: Given a set of n vectors and n scalars, multiply each vector by its corresponding scalar, and add them up
Span: Given a set of n vectors, all the vectors that can be generated by linear combinations (by varying the scalars); equivalently, the smallest vector space containing these n vectors.
Linear dependence/independence: Given n vectors, if none of them can be written as a linear combination of others, the collection is linearly independent
Basis: a set of vectors that are linearly independent and span the entire vector space
Subspace: A smaller vector space contained within a bigger one, e.g. a line through the origin in a 2D plane. Must contain the zero vector (else it wouldn't be a vector space itself).
Rank: The number of basis vectors needed to span a space; for a transformation, the dimension of its output (its column space)
In a 2D plane,
Basis vectors are generally called i, j
2 collinear vectors cannot span the whole plane (they are linearly dependent). They can only span a single line, a subspace. Or, in the worst case, they may both be zero, in which case they span just a point
Any 2 non-collinear vectors can form a basis of the plane (see the sketch after this list for a quick numerical check)
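A quick numerical check of the two cases above (a minimal numpy sketch; the example vectors are arbitrary choices):

```python
import numpy as np

# Columns (1, 2) and (2, 4) are collinear: rank 1, they only span a line.
collinear = np.array([[1, 2],
                      [2, 4]])
print(np.linalg.matrix_rank(collinear))    # 1

# Columns (2, 3) and (-1, 2) are not collinear: rank 2, they form a basis of the plane.
independent = np.array([[2, -1],
                        [3,  2]])
print(np.linalg.matrix_rank(independent))  # 2
```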
Examples of vector spaces: n-dimensional coordinate systems, matrices, polynomials (finite or infinite)
Linear transformation: f(au + bv) = af(u) + bf(v), for all vectors u, v and all scalars a, b
Matrix-vector multiplication (as a linear transformation): Consider a 2D plane. If we have basis vectors u, v, a matrix records the new locations u -> f(u), v -> f(v). Therefore any point au + bv in the old coordinates is transformed to f(au + bv) = af(u) + bf(v), hence a linear transformation.
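A minimal sketch of this view in numpy (the matrix and vector are arbitrary examples): multiplying a matrix by a vector is exactly the corresponding linear combination of the matrix's columns.

```python
import numpy as np

A = np.array([[2, -1],
              [3,  2]])   # columns are f(i) = (2, 3) and f(j) = (-1, 2)
v = np.array([4, 5])      # v = 4*i + 5*j

print(A @ v)                          # [ 3 22]
print(4 * A[:, 0] + 5 * A[:, 1])      # [ 3 22], the same linear combination of the columns
```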
Matrix-matrix multiplication (as function composition): Getting a single equivalent transformation to replace 2 transformations applied one after the other. The first matrix describes where i, j land after the first transformation. To find where its column 1 and column 2 land after the second transformation, we do matrix-vector multiplications. So mat-mat mul is 2 mat-vec muls side by side
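A sketch of composition with two example transformations (a 90-degree rotation followed by a shear; the choice is arbitrary):

```python
import numpy as np

rotate = np.array([[0, -1],
                   [1,  0]])   # 90-degree counterclockwise rotation
shear  = np.array([[1, 1],
                   [0, 1]])    # shear along x

composed = shear @ rotate      # one matrix for "first rotate, then shear"

v = np.array([1, 2])
print(composed @ v)            # [-1  1]
print(shear @ (rotate @ v))    # [-1  1], applying the two transformations one after the other

# Each column of the product is the second matrix applied to a column of the first.
print(np.allclose(composed[:, 0], shear @ rotate[:, 0]))  # True
```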
Column space: Span of columns of your transformation matrix
Null space/kernel: The subspace that maps to 0 after the transformation is applied. Obviously for a full rank transformation only 0 will map to 0. So this measures "information loss"
Dimensional transformations: We could also transform 2D vectors into 3D (3x2 matrix) or squish 3D down to 2D (2x3 matrix; this must have a non-trivial null space, so information is lost). These result in non-square matrices
Orthonormal transformations: Starting with unit/orthogonal vectors, the transformed basis vectors are still orthogonal and unit. Think rotation matrices
The ith column of the matrix describes where the ith basis vector lands.
If we want to rotate counterclockwise by 90 degrees, i (1, 0) -> (0, 1) and j (0, 1) -> (-1, 0), so the matrix/linear transformation is [[0, -1], [1, 0]]
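Sketch: the same matrix built column by column from where the basis vectors land (numpy assumed):

```python
import numpy as np

i_image = np.array([0, 1])    # where i = (1, 0) lands
j_image = np.array([-1, 0])   # where j = (0, 1) lands
R = np.column_stack([i_image, j_image])
print(R)                      # [[ 0 -1]
                              #  [ 1  0]]
print(R @ np.array([1, 1]))   # [-1  1]: the point (1, 1) rotated by 90 degrees
```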
If the linear transformation has linearly dependent columns, it will squish all points onto a line (or a point), i.e. into a lower-dimensional subspace
Why is (AB)C = A(BC)? (Hint: both sides mean "apply C, then B, then A"; function composition is associative.)
Extend this understanding from 2D to 3D and we see why we need a 3x3 matrix to describe linear transformations
In the vector space of polynomials represented by the basis {1, x, x^2}, the derivative serves as a linear transformation. The mappings are the following (do you see why, and can you generalize to higher degrees? A matrix version follows the list):
(1,0,0) -> (0,0,0)
(0,1,0) -> (1,0,0)
(0,0,1) -> (0,2,0)
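A sketch of the same mapping written as a matrix acting on coefficient vectors (coefficients ordered as {1, x, x^2}; the example polynomial is arbitrary):

```python
import numpy as np

# Columns are the images of the basis vectors 1, x, x^2 under d/dx.
D = np.array([[0, 1, 0],
              [0, 0, 2],
              [0, 0, 0]])

p = np.array([5, 3, 4])   # represents 5 + 3x + 4x^2
print(D @ p)              # [3 8 0], i.e. 3 + 8x, the derivative
print(np.linalg.det(D))   # 0.0: the derivative transformation is not invertible
```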
For a full-rank transformation, why is 0 the only vector that maps to 0?
A change of basis should be trivial when thought of as a transformation. What if someone is using a coordinate system whose x axis is [2,3] and y axis is [-1,2]? That just means canonical x axis [1,0] maps to [2,3] and canonical y axis [0,1] maps to [-1,2], giving us a transformation matrix with those columns. Applying it to a point written in the new coordinates gives the same point in canonical coordinates; for the opposite direction (canonical to new) we'd use the inverse matrix. See this.
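A sketch of that change of basis with the vectors above (the example point is arbitrary):

```python
import numpy as np

B = np.array([[2, -1],
              [3,  2]])          # columns: the new basis vectors, written in canonical coordinates

v_new = np.array([1, 2])          # a point described in the new coordinate system
v_canonical = B @ v_new           # the same point in canonical coordinates
print(v_canonical)                # [0 7]

print(np.linalg.inv(B) @ v_canonical)   # [1. 2.]: back from canonical to the new coordinates
```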
Determinants: The ratio of areas covered by the basis vectors after vs before the transformation. For example i, j cover area = 1, but if we scale both axes by 2 ([[2,0],[0,2]]), we now cover an area of 4. Note that the area of any region scales by the same ratio as the unit square after the transformation.
Negative determinants: mean the plane is "flipped" over. The orientation of the basis vectors has changed
Hence if the determinant is 0, we are squishing the 2D space down to a line (or a point)
In polynomial space the transformation formed by the derivative has zero determinant, because it is non-invertible (the constant term vanishes). The rank decreases by 1
Note the link between negative determinant values and the "right-hand rule" of the cross product etc.
Why is det(AB) = det(A)det(B)? (Hint: applying B scales areas by det(B), then applying A scales them by det(A).)
A transformation with determinant = 0 must map inputs into a lower-dimensional subspace
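A few numerical illustrations of the statements above (numpy; the matrices are arbitrary examples):

```python
import numpy as np

print(np.linalg.det(np.array([[2, 0],
                              [0, 2]])))   #  4.0: areas scale by 4

print(np.linalg.det(np.array([[0, 1],
                              [1, 0]])))   # -1.0: areas preserved, but orientation flipped

print(np.linalg.det(np.array([[1, 2],
                              [2, 4]])))   #  ~0.0: dependent columns, the plane is squished onto a line

# det(AB) = det(A) det(B)
A = np.array([[2, 1], [0, 3]])
B = np.array([[1, 4], [2, 1]])
print(np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B)))  # True
```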
Inner products (written <u, v>, generalizing the dot product) must satisfy:
Bilinearity: <aw + bx, cy + dz> = a<w, cy + dz> + b<x, cy + dz> = ac<w, y> + ad<w, z> + bc<x, y> + bd<x, z>
Symmetry: <f,g> = <g,f>
Positive definiteness: <f,f> >= 0. Equality iff f=0
Dot product: Measures how aligned 2 vectors are
Through the lens of a transformation into a subspace (or duality): If we have a vector v = (a, b) that we want to project onto a unit vector u = (x, y), we can also think of this as a transformation into a 1D subspace (the line spanned by u). Thinking in terms of transformations, the projections of i and j onto u are simply x and y. By linearity, the projection of v onto u is then ax + by. Thus dot products are transformations into 1D subspaces
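A sketch of this duality: the dot product with u is the same as applying the 1x2 matrix whose entries are u's coordinates (example numbers are arbitrary):

```python
import numpy as np

u = np.array([3, 4])
v = np.array([2, 5])

print(u.reshape(1, 2) @ v)   # [26]: a 1x2 matrix transforming v into a 1D space
print(np.dot(u, v))          # 26: the same number as the dot product
```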
The dot product of a vector with a basis vector gives us the scalar contribution of that basis vector (assuming an orthonormal basis)
A vector space need not have a unique inner product. For example, in the vector space of polynomials we could use either of the following:
Treat the coefficients as a vector and perform regular dot product.
Integral inner product: integrate the product of the two polynomials over some interval (think Fourier transforms)
Note that the 2 inner products come from thinking of polynomials as being represented by 2 different sets of basis vectors; the sketch below contrasts them on a concrete pair
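A sketch contrasting the two inner products on a concrete pair of polynomials (the integration interval [0, 1] is an arbitrary choice):

```python
import numpy as np

p = np.array([1, 2, 0])   # 1 + 2x      (coefficients in the basis {1, x, x^2})
q = np.array([0, 0, 3])   # 3x^2

# Inner product 1: treat the coefficient vectors as ordinary vectors.
print(np.dot(p, q))       # 0: orthogonal under this inner product

# Inner product 2: integrate p(x) * q(x) over [0, 1].
prod = np.polynomial.polynomial.polymul(p, q)       # coefficients of p(x) * q(x)
anti = np.polynomial.polynomial.polyint(prod)       # coefficients of the antiderivative
print(np.polynomial.polynomial.polyval(1, anti)
      - np.polynomial.polynomial.polyval(0, anti))  # 2.5: not orthogonal under this one
```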
Cross product: Only applicable to special vector spaces like R^3 (and R^7)
A vector orthogonal to both inputs, with magnitude equal to the area of the parallelogram enclosed by the 2 vectors, and direction determined by their orientation (which fixes the sign)
So it is related to the determinant (signed area/volume). But note that determinants are defined for any R^n, while cross products exist only for R^3 / R^7
Please check this out
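A sketch relating the cross product's magnitude to a 2x2 determinant (the example vectors are chosen to lie in the xy-plane so the comparison is easy to see):

```python
import numpy as np

a = np.array([2, 0, 0])
b = np.array([1, 3, 0])

print(np.cross(a, b))       # [0 0 6]: orthogonal to both a and b

# Its length equals the parallelogram area, i.e. the 2x2 determinant of the xy-components.
print(np.linalg.det(np.array([[2, 1],
                              [0, 3]])))   # 6.0
```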
Linear equations: Ax = v asks what vector x, when transformed through A, becomes v. Possible cases:
determinant != 0: There exists a unique inverse transformation, so x = A^(-1)v is the unique solution
determinant = 0: we have lost information and can't transform back
Either no solution (if v lies outside the column space), or infinitely many solutions
Null space computation: Solve Ax = 0
Check out Gaussian elimination, row echelon form, Cramer's rule, etc.
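A closing sketch: solving Ax = v when the determinant is non-zero, and finding a null space basis (via the SVD, one common approach) when it is zero. The matrices reuse earlier examples:

```python
import numpy as np

A = np.array([[2, -1],
              [3,  2]])
v = np.array([0, 7])

print(np.linalg.det(A))        # 7.0: non-zero, so a unique solution exists
print(np.linalg.solve(A, v))   # [1. 2.]

B = np.array([[1, 2],
              [2, 4]])          # determinant 0: dependent columns, non-trivial null space
_, s, Vt = np.linalg.svd(B)
# Rows of Vt with (near-)zero singular values span the null space of B.
print(Vt[s < 1e-10])            # a multiple of (2, -1), up to sign/scaling: B @ (2, -1) = 0
```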