Algebra for Notation and Geometry
Brief overview of matrix and vector notation, including size, transpose, inverse, determinant, multiplication, sets of numbers and vectors, vector norms, and transformations in the context of machine learning.
Linear Algebra
Linear algebra is the language of space manipulation.
Machine learning is controlled geometric transformation.
Why Linear Algebra Matters in ML
Machine learning deals with:
- Multiple features
- Large datasets
- Efficient computation
- Vectorized operations
Instead of writing nested loops, we use matrix operations to compute predictions and updates efficiently.
This is why multivariate linear algebra is essential.
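A minimal sketch of the loop-versus-matrix idea, assuming a small made-up feature matrix `X` and weight vector `w` (both hypothetical, chosen only for illustration). It computes the same predictions twice: once with nested Python loops, and once with a single matrix-vector product.

```python
import numpy as np

# Hypothetical data: 3 examples, 2 features (values are illustrative only)
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
w = np.array([0.5, -1.0])

# Loop version: one multiply-add per (example, feature) pair
preds_loop = np.zeros(X.shape[0])
for i in range(X.shape[0]):
    for j in range(X.shape[1]):
        preds_loop[i] += X[i, j] * w[j]

# Vectorized version: a single matrix-vector product
preds_vec = X @ w
```

Both versions give the same numbers; the vectorized form hands the work to NumPy's compiled routines rather than the Python interpreter.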
Think of ML as:
- Data → points in high-dimensional space
- Features → axes
- Models → transformations
- Training → adjusting geometry
- Loss → distance between vectors
- Optimization → walking downhill in space
- Gradient descent becomes directional movement
- Regularization becomes shrinking vector norms
- Overfitting becomes high-dimensional distortion
- RAG embeddings become spatial similarity
- Attention becomes weighted projection
Mathematical object
A mathematical object is an abstract concept that can be assigned to a symbol as its value and can therefore appear in formulas.
- Examples: numbers, expressions, shapes, functions, and sets.
- Complex objects: theorems, proofs.

Tensor
Algebraic object that describes a multilinear relationship between sets of algebraic objects associated with a vector space.
- Latin: tendere meaning 'to stretch'
Scalar
A scalar is a single number. In this document, scalars are real numbers (elements of ℝ).
- A single number, a 0-dimensional tensor.
- Examples: 0, −0.642, 2, 3.456
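As a rough illustration of "0-dimensional tensor", the sketch below uses NumPy's `ndim` attribute, which counts array axes. The variable names and the small 2 × 2 matrix are assumptions made only for this example.

```python
import numpy as np

s = np.array(3.456)                    # scalar: 0 axes
v = np.array([460, 232, 315, 178])     # vector: 1 axis
M = np.array([[1, 2],
              [3, 4]])                 # matrix: 2 axes

# Number of axes for each object
dims = (s.ndim, v.ndim, M.ndim)
```

Here `dims` is `(0, 1, 2)`: a scalar, a vector, and a matrix are tensors with 0, 1, and 2 axes respectively.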
Matrices ($A$)
A matrix is a 2D array (a table) of numbers arranged in rows and columns.
Representation:
In mathematics:
- $A \in \mathbb{R}^{4 \times 2}$, where $A$ is a real-valued matrix with 4 rows and 2 columns.
In programming:
import numpy as np
# A 3 × 4 matrix: 3 rows, 4 columns
A = np.array([[85, 76, 66, 5],
              [94, 75, 18, 28],
              [68, 40, 71, 5]])
In theory:
- Uppercase letters (A, B, X) → Matrices
- Lowercase letters (x, y, z) → Vectors or scalars
Matrix Size
- A ∈ ℝᵐˣⁿ or A (m × n)
→ Matrix A has m rows and n columns
Example:
If A is 3 × 2, it has 3 rows and 2 columns.
Dimension: $m \times n$
Where:
- $m$ = number of rows
- $n$ = number of columns
- Example: the matrix in the mathematical example above is $4 \times 2$.
Square Matrix
- A matrix with the same number of rows and columns ($m = n$).
Element notation:
- $A_{ij}$: the element in the $i$-th row and $j$-th column.
Example:
- $A_{11} = 85$ → Row 1, Column 1
- $A_{32}$ → Row 3, Column 2
- $A_{41}$ → Row 4, Column 1
- $A_{23}$ → Row 2, Column 3
- $A_{64}$ → undefined: the 6th row, 4th column does not exist
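A small sketch of element access in NumPy, using the 3 × 4 array from the programming example. Because NumPy is zero-indexed, the mathematical element $A_{ij}$ corresponds to `A[i - 1, j - 1]`; the helper name `element` is hypothetical and exists only for this illustration.

```python
import numpy as np

A = np.array([[85, 76, 66, 5],
              [94, 75, 18, 28],
              [68, 40, 71, 5]])

rows, cols = A.shape   # number of rows, number of columns


def element(matrix, i, j):
    """Return the one-indexed element (row i, column j) of a 2D array."""
    return matrix[i - 1, j - 1]


a_11 = element(A, 1, 1)   # row 1, column 1
a_23 = element(A, 2, 3)   # row 2, column 3

# Asking for a row that does not exist raises IndexError
try:
    element(A, 6, 4)
    out_of_range = False
except IndexError:
    out_of_range = True
```

For this particular array, `A.shape` is `(3, 4)`, so any row index above 3 is out of range.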
Use in Machine Learning
A matrix can represent a data matrix, model parameters, or a transformation.
If we have:
- $m$ training examples
- $n$ features
The data matrix is $X \in \mathbb{R}^{m \times n}$, with dimension $m \times n$, where:
- Each row = one training example
- Each column = one feature
Vectors ($v$)
A vector is a matrix with one column (an $n \times 1$ matrix).
- Represents a point in high-dimensional space
- Latin: vector, meaning "carrier" or "driver"
- Has a direction and a magnitude
Represented as:
In mathematics:
- ℝ → Set of real numbers
  Example: 0, −0.642, 2, 3.456
- ℝ² → Set of 2-dimensional vectors
  Example: any $\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ with $x_1, x_2 \in \mathbb{R}$
- ℝⁿ → Set of n-dimensional vectors
- v ∈ ℝ² → Vector v belongs to ℝ²
In Programming
y = np.array([460, 232, 315, 178])
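In theory:
- Lowercase letters (x, y, z) → Vectors or scalars
- Uppercase letters (A, B, X) → Matrices
Dimension: $n \times 1$ for a vector with $n$ elements ($v \in \mathbb{R}^n$)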
In ML, vectors represent:
- A data point → a vector
- A feature column → a direction
- A model weight vector → a direction of best fit
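Example:
- The vector y above is a 4 × 1 matrix, or a 4-dimensional vector ($y \in \mathbb{R}^4$).
Element Indexing
$v_i$ = the $i$-th element of $v$.
- In mathematics, indexing usually starts at 1.
- In programming, indexing often starts at 0.
- Unless otherwise specified, assume one-indexed notation in linear algebra.
Example:
- With y = [460, 232, 315, 178], $y_1 = 460$ in one-indexed notation, which is y[0] in NumPy.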
🔹 Vector Norms
- ‖v‖₁ → L1 norm
- ‖v‖₂, ‖v‖ → L2 norm (Euclidean norm)
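For reference, the L1 norm is $\|v\|_1 = \sum_i |v_i|$ and the L2 norm is $\|v\|_2 = \sqrt{\sum_i v_i^2}$. The sketch below computes both with `np.linalg.norm` and checks them against the formulas; the vector `v` is a made-up example.

```python
import numpy as np

v = np.array([3.0, -4.0])   # illustrative vector

# L1 norm: sum of absolute values -> |3| + |-4|
l1 = np.linalg.norm(v, ord=1)
l1_manual = np.sum(np.abs(v))

# L2 (Euclidean) norm: square root of the sum of squares -> sqrt(9 + 16)
l2 = np.linalg.norm(v)      # ord defaults to the 2-norm for vectors
l2_manual = np.sqrt(np.sum(v ** 2))
```

For this vector the L1 norm is 7 and the L2 norm is 5. These are the quantities that L1 and L2 regularization penalize in a weight vector.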
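Transpose ($A^T$)
Transpose swaps rows and columns: each row of $A$ becomes a column of $A^T$.
- Aᵀ → Transpose of matrix A
- vᵀ → Transpose of vector v
If $A \in \mathbb{R}^{m \times n}$, then $A^T \in \mathbb{R}^{n \times m}$.
Element-wise: $(A^T)_{ij} = A_{ji}$
- A column vector becomes a row vector.
Given:
$A = \begin{bmatrix} a & b & c \\ d & e & f \end{bmatrix}$
Then:
$A^T = \begin{bmatrix} a & d \\ b & e \\ c & f \end{bmatrix}$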
Used heavily in:
- Normal Equation
- Gradient derivations
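A short sketch of the transpose in NumPy, reusing the document's 3 × 4 array and its 4-element vector (written here as an explicit 4 × 1 column, which is an assumption about layout made for this example).

```python
import numpy as np

A = np.array([[85, 76, 66, 5],
              [94, 75, 18, 28],
              [68, 40, 71, 5]])      # shape (3, 4)

At = A.T                              # shape (4, 3)

# Explicit column vector (4 x 1) and its transpose, a row vector (1 x 4)
y = np.array([[460], [232], [315], [178]])
yt = y.T

# Element-wise rule: (A^T)[i, j] == A[j, i]
same_element = At[1, 2] == A[2, 1]
```

Note that a 1-D NumPy array such as `np.array([460, 232, 315, 178])` has a single axis, so `.T` leaves it unchanged; a 2-D shape like `(4, 1)` is needed to see a column turn into a row.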
Identity Matrix ($I$)
The identity matrix is the matrix equivalent of the number 1.
It is a square matrix with:
- 1’s on the diagonal
- 0’s everywhere else
Property: $AI = IA = A$ (with $I$ sized to match $A$ on each side)
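The identity property can be checked numerically. This sketch assumes a made-up 2 × 3 matrix, so the identity on the left is 2 × 2 and the one on the right is 3 × 3.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])   # 2 x 3, illustrative values

I2 = np.eye(2)   # 2 x 2 identity
I3 = np.eye(3)   # 3 x 3 identity

left = I2 @ A    # multiplying on the left needs the row-sized identity
right = A @ I3   # multiplying on the right needs the column-sized identity
```

Both products reproduce `A` exactly, which is what "equivalent of the number 1" means here.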
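Inverse Matrix ($A^{-1}$)
The inverse plays the role for matrices that the reciprocal plays for numbers: multiplying by $A^{-1}$ undoes multiplying by $A$.
- Only square matrices can have inverses.
The matrix inverse satisfies: $A A^{-1} = A^{-1} A = I$
Used in the Normal Equation: $\theta = (X^T X)^{-1} X^T y$, where $\theta$ is the parameter vector.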
Not all square matrices are invertible.
1. Invertible / non-singular matrix
A matrix that can be inverted.
- A square matrix has an inverse if and only if it is full rank (its rows and columns are linearly independent).
2. Non-invertible / singular / degenerate matrix
A matrix that does not have an inverse.
- It has no inverse because it is not full rank (its rows or columns are linearly dependent).
Causes of a non-invertible matrix in the Normal Equation ($X^T X$):
- Redundant features: two features related by a linear equation $x_2 = k x_1$, e.g. size in feet and size in meters.
- More features than training examples ($m \le n$): delete some features or use regularization.
Octave functions for inverting a matrix:
- pinv(A): pseudoinverse; returns a result even if A is not invertible
- inv(A): inverse; requires A to be invertible
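The sketch below uses NumPy's counterparts of the Octave functions, `np.linalg.inv` and `np.linalg.pinv`, rather than Octave itself. The matrices are made up: `A` is invertible, while `S` mimics a redundant feature because its second column is exactly twice its first.

```python
import numpy as np

# Invertible example: columns are linearly independent
A = np.array([[4.0, 7.0],
              [2.0, 6.0]])
A_inv = np.linalg.inv(A)
check = A @ A_inv                     # should be close to the identity

# Singular example: second column = 2 * first column (redundant feature)
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])
rank_S = np.linalg.matrix_rank(S)     # less than 2, so S is not full rank

# The pseudoinverse is still defined for the singular matrix
S_pinv = np.linalg.pinv(S)
reconstructed = S @ S_pinv @ S        # a defining property: S S+ S = S
```

The pseudoinverse does not make `S` invertible; it returns the Moore-Penrose generalization, which coincides with the ordinary inverse when one exists.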
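Determinant ($\det(A)$)
The determinant tells us whether a matrix is invertible.
For a 2 × 2 matrix:
$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \quad \det(A) = ad - bc$
If $\det(A) \neq 0$: the matrix is invertible.
If $\det(A) = 0$: the matrix is singular (not invertible), and the linear system $Ax = b$ has:
- Either no solution
- Or infinitely many solutions
Use in Machine Learning:
- The Normal Equation requires matrix inversion.
Closed-form solution: $\theta = (X^T X)^{-1} X^T y$
- In practice, we solve the linear system with numerical methods instead of forming the inverse explicitly, because explicit inversion can be numerically unstable.
- Regularization can help make the matrix invertible by adding a small positive value $\lambda$ to the diagonal of $X^T X$ (Ridge Regression).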
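A minimal sketch of these points, under stated assumptions: the data are made up so that $y = 1 + x$ exactly, the first column of `X` is an intercept column of ones, and the ridge strength `lam` is an arbitrary illustrative value. It solves the normal equation with `np.linalg.solve` rather than an explicit inverse, then repeats the solve with $\lambda I$ added to $X^T X$.

```python
import numpy as np

# 2 x 2 determinant: ad - bc
A = np.array([[4.0, 7.0],
              [2.0, 6.0]])
det_manual = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]
det_np = np.linalg.det(A)

# Hypothetical regression data: intercept column plus one feature
X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([2.0, 3.0, 4.0])

XtX = X.T @ X
Xty = X.T @ y

# Solve (X^T X) theta = X^T y without forming the inverse
theta = np.linalg.solve(XtX, Xty)

# Ridge variant: add lam to the diagonal (this simple form also
# penalizes the intercept, which many implementations avoid)
lam = 0.1
theta_ridge = np.linalg.solve(XtX + lam * np.eye(XtX.shape[0]), Xty)
```

With this data the unregularized solution recovers intercept 1 and slope 1, and the ridge solution has a smaller norm, in line with "regularization becomes shrinking vector norms".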
A matrix is a transformation of space.
All machine learning models are compositions of transformations.
If $y = Ax$, then $A$ transforms the vector $x$ into a new vector $y$.
🔹 Transformations
- T : ℝ² → ℝ³
  → T maps vectors from 2D space to 3D space
- T(v) = w
  → Vector v ∈ ℝ² is transformed into w ∈ ℝ³
Geometrically, a matrix can:
- Stretch
- Compress
- Rotate
- Reflect
- Shear
- Project
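A few of these geometric effects can be sketched with small matrices. The specific matrices below (a 90° rotation, an axis-aligned scaling, and a 3 × 2 map from ℝ² to ℝ³) are illustrative choices, not taken from the text.

```python
import numpy as np

# Rotation by 90 degrees counterclockwise
angle = np.pi / 2
R = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle),  np.cos(angle)]])
rotated = R @ np.array([1.0, 0.0])      # x-axis unit vector -> y-axis

# Stretch along x by 2, compress along y by 0.5
S = np.array([[2.0, 0.0],
              [0.0, 0.5]])
scaled = S @ np.array([1.0, 1.0])

# A 3 x 2 matrix acts as a map T : R^2 -> R^3
T = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
w = T @ np.array([2.0, 3.0])            # 2D input, 3D output
```

The shape of the matrix fixes the input and output spaces: an $m \times n$ matrix takes vectors in ℝⁿ to vectors in ℝᵐ.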
Matrix Addition/Subtraction
When Is Addition Allowed?
Addition is done element by element.
Two matrices can be added only if they have the same dimensions.
If:
$A, B \in \mathbb{R}^{m \times n}$
then:
$C = A + B$, with $C_{ij} = A_{ij} + B_{ij}$
where $C$ is also an $m \times n$ matrix.
Subtraction is the same but with minus signs.
Scalar Multiplication / Division
Scalar multiplication is multiplying every element of a matrix by a single number (scalar).
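Element-wise addition, subtraction, and scalar multiplication or division can be sketched as follows; the two 2 × 2 matrices are made-up examples.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[10, 20],
              [30, 40]])

C = A + B     # element-wise sum
D = B - A     # element-wise difference
E = 3 * A     # every element multiplied by the scalar 3
F = B / 10    # every element divided by the scalar 10
```

One caveat when using NumPy: its broadcasting rules allow some additions between arrays of different shapes (for example a matrix plus a row vector), which strict matrix addition does not, so it can be worth checking shapes explicitly.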
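Matrix-Matrix Multiplication
- AB → Matrix multiplication of A and B
(Valid only if inner dimensions match)
Given 2 matrices:
- $A$ is $m \times n$, or $A \in \mathbb{R}^{m \times n}$
- $B$ is $n \times p$, or $B \in \mathbb{R}^{n \times p}$
Then:
$C = AB$
where $C$ is a new matrix with dimensions:
- $m \times p$, or $C \in \mathbb{R}^{m \times p}$
- The inner dimensions must match ($n$): $(m \times n)(n \times p) \to (m \times p)$
Element-wise: $C_{ij} = \sum_{k=1}^{n} A_{ik} B_{kj}$ (sum over k)
Properties:
- Not commutative: $AB \neq BA$ in general, so order matters
- Associative: $(AB)D = A(BD)$ for any matrix $D$ with $p$ rows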
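The shape rule and both properties can be checked on small integer matrices. All matrices below are made up for illustration.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])        # 2 x 3
B = np.array([[1, 0],
              [0, 1],
              [2, 2]])           # 3 x 2

C = A @ B                        # (2 x 3)(3 x 2) -> 2 x 2

# Element-wise definition: C[i, j] = sum over k of A[i, k] * B[k, j]
c_00 = sum(A[0, k] * B[k, 0] for k in range(A.shape[1]))

# Not commutative: even for square matrices, PQ and QP usually differ
P = np.array([[1, 2],
              [3, 4]])
Q = np.array([[0, 1],
              [1, 0]])
PQ = P @ Q
QP = Q @ P

# Associative: grouping does not change the result
left = (A @ B) @ P
right = A @ (B @ P)
```

Here `A @ B` is 2 × 2 while `B @ A` would be 3 × 3, so the two orders do not even have the same shape.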
Use in Machine Learning
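Most of the computation in deep learning is matrix multiplication:
- Inputs × Weights
- Weights × Activations
- Gradient updates
Neural network forward pass (one layer): $z = Wx + b$, $a = g(z)$, where $W$ is the weight matrix, $b$ the bias vector, and $g$ the activation function.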
Backpropagation is also matrix calculus.
Understanding multivariate linear algebra makes deep learning much easier to grasp.
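As a rough sketch of a forward pass, the code below runs one dense layer over a small batch. Everything here is an assumption for illustration: the input values, the weights `W`, the bias `b`, the choice of ReLU as the activation, and the batch convention in which each row of `X` is one example (so the layer computes `X @ W + b`).

```python
import numpy as np

# Hypothetical batch: 2 examples, 2 features each (rows are examples)
X = np.array([[1.0, 2.0],
              [3.0, -1.0]])

# Hypothetical layer: 2 inputs -> 3 units
W = np.array([[1.0, -1.0, 0.5],
              [0.0,  2.0, -1.0]])
b = np.array([0.0, 1.0, 0.0])

Z = X @ W + b             # linear step: matrix product plus bias
A1 = np.maximum(Z, 0.0)   # ReLU activation, applied element-wise
```

The whole batch passes through the layer with a single matrix product, which is the same vectorization idea applied to neural networks.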
Vectorization: Matrix-Vector Multiplication
If:
- $A$ is an $m \times n$ matrix
- $x$ is an $n \times 1$ vector
Then
- $Ax$ produces an $m \times 1$ vector
Use in Machine Learning
- This gives predictions for all training examples in one operation.
- Faster computation: optimized hardware usage (CPU/GPU)
- Clean mathematical formulation
Linear regression hypothesis: $h_\theta(x) = \theta^T x$
For all training examples: $\hat{y} = X\theta$
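A minimal sketch of the hypothesis applied to every example at once. The feature values and the parameter vector `theta` (intercept first, then slope) are hypothetical, and the column of ones is the usual way to fold the intercept into the matrix product.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])          # one feature, 3 examples
theta = np.array([1.0, 2.0])           # [intercept, slope], illustrative

# Design matrix: a column of ones for the intercept, then the feature
X = np.column_stack([np.ones_like(x), x])   # shape (3, 2)

# Per-example hypothesis: theta^T x for each row
preds_single = np.array([theta @ row for row in X])

# All examples in one matrix-vector product
preds_all = X @ theta
```

The result is one prediction per training example, so `X @ theta` has as many entries as `X` has rows.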
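Dot Product ($u \cdot v$)
The dot product is defined between two vectors of the same dimension.
If:
$u, v \in \mathbb{R}^n$
Then their dot product is:
$u \cdot v = u^T v$
It produces a single number (a scalar).
- u · v or ⟨u, v⟩ → Dot product of vectors
Dot product formula:
$u \cdot v = \sum_{i=1}^{n} u_i v_i = u_1 v_1 + \dots + u_n v_n$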
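The dot product can be computed from its definition or with NumPy. The vectors below are made up, and the cosine similarity at the end is an extra illustration of how the dot product measures alignment between vectors.

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, -5.0, 6.0])

# Definition: sum of element-wise products -> 4 - 10 + 18
dot_manual = sum(ui * vi for ui, vi in zip(u, v))

# NumPy equivalents
dot_np = np.dot(u, v)
dot_at = u @ v

# Cosine similarity: dot product divided by the product of L2 norms
cos_sim = dot_np / (np.linalg.norm(u) * np.linalg.norm(v))
```

All three ways of computing the dot product give 12 for these vectors, and the cosine similarity always falls between -1 and 1.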
Summary
Key ideas:
- Vectors represent features and parameters
- Matrices represent datasets
- Matrix multiplication enables fast prediction
- Transpose and inverse enable optimization
- Vectorization is essential for performance
