Algebra for Notation and Geometry

Brief overview of matrix and vector notation, including size, transpose, inverse, determinant, multiplication, sets of numbers and vectors, vector norms, and transformations in the context of machine learning.

Linear Algebra

Linear algebra is the language of space manipulation.

Machine learning is controlled geometric transformation.

Why Linear Algebra Matters in ML

Machine learning deals with:

Multiple features
Large datasets
Efficient computation
Vectorized operations

Instead of writing nested loops, we use matrix operations to compute predictions and updates efficiently.

This is why multivariate linear algebra is essential.

Think of ML as:

Data → points in high-dimensional space
Features → axes
Models → transformations
Training → adjusting geometry
Loss → distance between vectors
Optimization → walking downhill in space
Gradient descent becomes directional movement
Regularization becomes shrinking vector norms
Overfitting becomes high-dimensional distortion
RAG embeddings become spatial similarity
Attention becomes weighted projection

Mathematical object

A mathematical object is an abstract concept that can be assigned to a symbol and can therefore appear in formulas.

Examples: numbers, expressions, shapes, functions, and sets.
More complex objects: theorems and proofs.

Tensor

A tensor is an algebraic object that describes a multilinear relationship between sets of algebraic objects associated with a vector space.

Latin: tendere meaning 'to stretch'

Scalar ( $s \in \mathbb{R}$ )

Scalars are the real numbers used in linear algebra.

A single number, a 0-dimensional tensor.
Example: $5$ , $-3.14$ , $\pi$ , $e$

Matrices( $A \in \mathbb{R}^{m \times n}$ )

A matrix is a 2D array (a table) of numbers.

$A = \begin{bmatrix} 85 & 76 & 66 & 5 \\ 94 & 75 & 18 & 28 \\ 68 & 40 & 71 & 5 \end{bmatrix}$

Representation:

In mathematics:

$A \in \mathbb{R}^{3 \times 4}$

Where $A$ is a real-valued matrix with 3 rows and 4 columns.

In programming:

import numpy as np

A = np.array([[85, 76, 66, 5],
              [94, 75, 18, 28],
              [68, 40, 71, 5]])

In theory:

Uppercase letters (A, B, X) → Matrices
Lowercase letters (x, y, z) → Vectors or scalars

Matrix Size

A ∈ ℝᵐˣⁿ or A (m × n)
→ Matrix A has m rows and n columns

Example:
If A is 3 × 2, it has 3 rows and 2 columns.

Dimension:

$(m \times n)$

Where:

$m$ = rows
$n$ = columns
Example: the matrix $A$ above is $3 \times 4$.

Square Matrix

A matrix with the same number of rows and columns ( $m = n$ ).

Element notation:

$A_{ij}$

The element in the $i$-th row and $j$-th column.

Example:

$A_{11} = 85$ → Row 1, Column 1
$A_{32} = 40$ → Row 3, Column 2
$A_{14} = 5$ → Row 1, Column 4
$A_{23} = 18$ → Row 2, Column 3
$A_{64}$ is undefined → Row 6 does not exist ($A$ has only 3 rows)

Use in Machine Learning

Matrices represent data matrices, model parameters, and transformations.

If we have:

$m$ training examples
$n$ features

The data matrix is:

$X = \begin{bmatrix} \text{---}\; x^{(1)} \;\text{---} \\ \text{---}\; x^{(2)} \;\text{---} \\ \vdots \\ \text{---}\; x^{(m)} \;\text{---} \end{bmatrix}$

Dimension $m \times n$ where:

Each row = one training example
Each column = one feature
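The row/column layout of the data matrix can be sketched in NumPy. The feature values below are made up for illustration and are not from the article.

```python
import numpy as np

# Hypothetical dataset: m = 3 training examples, n = 2 features.
# The numbers are illustrative only.
X = np.array([[2104.0, 3.0],
              [1416.0, 2.0],
              [1534.0, 3.0]])

m, n = X.shape             # m rows (examples), n columns (features)
first_example = X[0]       # x^(1): the first row of the data matrix
second_feature = X[:, 1]   # one column: the same feature for every example
```

Slicing a row gives one training example, while slicing a column gives one feature across all examples.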

Vectors( $\vec{x}$ )

A vector is a matrix with a single column.

Represents a point in $n$-dimensional space
Latin: vector, meaning "carrier" or "driver"
Has a direction (hence the arrow in $\vec{x}$) and a magnitude

$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$

Represent as:

In Maths

$y \in \mathbb{R}^4$

ℝ → Set of real numbers
Example: 0, −0.642, 2, 3.456
ℝ² → Set of 2-dimensional vectors

Example:

$v = \begin{bmatrix} 1 \\ 3 \end{bmatrix}$

ℝⁿ → Set of n-dimensional vectors
v ∈ ℝ² → Vector v belongs to ℝ²

In Programming

y = np.array([460, 232, 315, 178])

In Theory:

Lowercase letters (x, y, z) → Vectors

Dimension: $n \times 1$

In ML, vectors represent:

A data point → a vector
A feature column → a direction
A model weight vector → a direction of best fit

Example

$y = \begin{bmatrix} 460 \\ 232 \\ 315 \\ 178 \end{bmatrix}$

A 4 × 1 matrix, or a 4-dimensional vector

Element Indexing

$y_i$ = i-th element.

In mathematics, indexing usually starts at 1.
In programming indexing often starts at 0.
Unless otherwise specified, assume one-indexed notation in linear algebra.

Example:

$y_1 = 460$
$y_2 = 232$
$y_3 = 315$
$y_4 = 178$
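The difference between one-indexed math notation and zero-indexed code is easy to get wrong, so here is a small NumPy sketch using the same $y$ and the matrix $A$ from earlier. The variable names `y_1`, `y_4`, and `a_32` are just labels chosen for this example.

```python
import numpy as np

y = np.array([460, 232, 315, 178])
A = np.array([[85, 76, 66, 5],
              [94, 75, 18, 28],
              [68, 40, 71, 5]])

# Math index i corresponds to NumPy index i - 1.
y_1 = y[0]       # y_1 in math notation
y_4 = y[3]       # y_4 in math notation
a_32 = A[2, 1]   # A_32: row 3, column 2
```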

🔹 Vector Norms

‖v‖₁ → L1 norm

$\|v\|_1 = \sum_{i=1}^{n} |v_i|$

‖v‖₂, ‖v‖ → L2 norm (Euclidean norm)

$\|v\|_2 = \sqrt{\sum_{i=1}^{n} v_i^2}$
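As a quick check of the two formulas, the sketch below computes both norms by hand and with `np.linalg.norm`. The vector is an arbitrary example.

```python
import numpy as np

v = np.array([3.0, -4.0])   # arbitrary example vector

# Direct translations of the formulas above.
l1_manual = np.sum(np.abs(v))          # |3| + |-4|
l2_manual = np.sqrt(np.sum(v ** 2))    # sqrt(9 + 16)

# The same norms using NumPy's built-in routine.
l1_numpy = np.linalg.norm(v, ord=1)
l2_numpy = np.linalg.norm(v)           # default for a 1-D array is the L2 norm
```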

Transpose ( $\mathbf{x}^T$ )

Transpose swaps rows and columns.

Aᵀ → Transpose of matrix A
vᵀ → Transpose of vector v

Transpose flips rows into columns.

If: $A \in \mathbb{R}^{m \times n}$ then: $A^T \in \mathbb{R}^{n \times m}$

Element-wise:

$(A^T)_{ij} = A_{ji}$

A column vector becomes a row vector.

Given:

$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$

Then:

$A^T = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}$
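The same example can be reproduced in NumPy with the `.T` attribute. This is a minimal sketch; the column vector `x` is an extra illustration, not part of the example above.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])    # shape (2, 3)
A_T = A.T                    # shape (3, 2)

# A column vector (3 x 1) becomes a row vector (1 x 3).
x = np.array([[1], [2], [3]])
x_T = x.T
```

Note that a 1-D NumPy array such as `np.array([1, 2, 3])` has no row/column orientation, so `.T` leaves it unchanged; the explicit 2-D shape is needed to see the flip.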

Used heavily in:

Normal Equation
Gradient derivations

Identity Matrix ( $I$ )

The identity matrix is the matrix equivalent of the number 1.

It is a square matrix with:

1’s on the diagonal
0’s everywhere else

$I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

Property:

$AI = IA = A$
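A small sketch of the identity property in NumPy, using `np.eye`. One detail worth noticing: when $A$ is not square, the identity on the left and the identity on the right have different sizes. The matrix values are arbitrary.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])   # 2 x 3, arbitrary values

I2 = np.eye(2)   # 2 x 2 identity, multiplies A from the left
I3 = np.eye(3)   # 3 x 3 identity, multiplies A from the right

left = I2 @ A    # still equal to A
right = A @ I3   # still equal to A
```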

Inverse Matrix( $A^{-1}$ )

The inverse of a matrix plays the role of division: multiplying by $A^{-1}$ undoes multiplication by $A$.

Only square matrices can have inverses.

Matrix inverse satisfies:

$A^{-1}A = AA^{-1} = I$

Used in Normal Equation:

$\theta = (X^T X)^{-1} X^T y$

Not all square matrices are invertible.

1. Invertible/ non-singular Matrix

A matrix that can be inverted.

A square matrix has an inverse exactly when it is full rank (its rows and columns are linearly independent).

2. Non-Invertible/ Singular Matrix/ Degenerate Matrix

A matrix that does not have an inverse.

It has no inverse because it is not full rank (its rows or columns are linearly dependent).

Causes of a non-invertible $X^T X$ in the Normal Equation:

Redundant features: two features related by a linear equation $x_2 = k x_1$, e.g. size in feet and size in meters
More features than training examples ($m \le n$): delete some features or use regularization

Octave functions for inverting a matrix:

pinv(A) : Pseudo-inverse, returns a result even if the matrix is not invertible
inv(A) : Inverse, for invertible matrices only
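For readers using Python rather than Octave, the NumPy counterparts are `np.linalg.inv` and `np.linalg.pinv`. The sketch below is illustrative only: the matrices are made up, and the singular one has a second column that is exactly twice the first, mimicking a redundant feature.

```python
import numpy as np

# Invertible matrix (det = 4*6 - 7*2 = 10).
A = np.array([[4.0, 7.0],
              [2.0, 6.0]])
A_inv = np.linalg.inv(A)

# Singular matrix: column 2 = 2 * column 1 (a redundant "feature").
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])

try:
    np.linalg.inv(S)            # expected to fail for a singular matrix
    inv_succeeded = True
except np.linalg.LinAlgError:
    inv_succeeded = False

# The pseudo-inverse is defined even when S has no inverse.
S_pinv = np.linalg.pinv(S)
```

The pseudo-inverse is not a true inverse here (`S @ S_pinv` is not the identity), but it satisfies the weaker property `S @ S_pinv @ S == S`, which is what makes it usable in least-squares problems.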

Determinant ( $\det(A)$ )

The determinant tells us whether a matrix is invertible.

For a 2 × 2 matrix:

$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$

$\det(A) = ad - bc$

If $\det(A) \neq 0$: The matrix is invertible.

If $\det(A) = 0$: The matrix is singular (not invertible), and the linear system $Ax = b$ has:

Either no solution
Or infinitely many solutions
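The 2 × 2 formula can be checked against `np.linalg.det`. The matrices below are arbitrary examples. Because the result is a floating-point number, it is safer to compare against zero with a tolerance than with `==`.

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [2.0, 4.0]])

# ad - bc, written out from the formula above.
det_manual = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]
det_numpy = np.linalg.det(A)

# A singular example: the second row is twice the first.
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])
is_singular = np.isclose(np.linalg.det(S), 0.0)
```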

Use in Machine Learning:

Normal Equation requires matrix inversion.

Closed-form solution:

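$\theta = (X^T X)^{-1} X^T y$

In practice, we avoid computing the inverse explicitly and use numerically stable solvers (for example a linear-system or least-squares solver) instead.
Regularization makes $X^T X$ invertible by adding a small positive value $\lambda$ to its diagonal (Ridge Regression): $\theta = (X^T X + \lambda I)^{-1} X^T y$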
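A minimal sketch of the closed-form solution in NumPy, under some stated assumptions: the data are synthetic (generated from $y = 1 + 2x$ with no noise), the regularization strength `lam` is an arbitrary illustrative value, and the ridge variant here regularizes every parameter including the intercept, which is a simplification.

```python
import numpy as np

# Synthetic data from y = 1 + 2x, for illustration only.
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])   # column of ones = intercept term
y = 1.0 + 2.0 * x

# Normal equation, solved as a linear system rather than forming the inverse.
theta = np.linalg.solve(X.T @ X, X.T @ y)

# Least-squares solver: same solution here, and it also copes with
# rank-deficient X where X^T X would be singular.
theta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# Ridge: add a small value to the diagonal of X^T X.
lam = 0.1   # hypothetical regularization strength
theta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
```

Solving the system with `np.linalg.solve` or `np.linalg.lstsq` gives the same $\theta$ as the formula, without ever materializing $(X^T X)^{-1}$.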

A matrix is a transformation of space.

Many machine learning models, neural networks in particular, are compositions of transformations.

If:

$y = Ax$

Then $A$ transforms vector $x$ into a new vector $y$ .

🔹 Transformations

T : ℝ² → ℝ³
→ T maps vectors from 2D space to 3D space
T(v) = w
→ Vector v ∈ ℝ² is transformed into w ∈ ℝ³

Geometrically, a matrix can:

Stretch
Compress
Rotate
Reflect
Shear
Project
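Each of these effects corresponds to a concrete matrix. The sketch below applies a few standard 2 × 2 examples to one vector, plus a 3 × 2 matrix as an instance of a map $T : \mathbb{R}^2 \to \mathbb{R}^3$. The particular matrices and the vector are chosen for illustration.

```python
import numpy as np

v = np.array([1.0, 0.0])

stretch   = np.array([[2.0, 0.0], [0.0, 1.0]])    # stretch x by a factor of 2
compress  = np.array([[0.5, 0.0], [0.0, 1.0]])    # compress x by a factor of 2
rotate90  = np.array([[0.0, -1.0], [1.0, 0.0]])   # rotate 90 degrees counterclockwise
reflect_x = np.array([[1.0, 0.0], [0.0, -1.0]])   # reflect across the x-axis
shear     = np.array([[1.0, 1.0], [0.0, 1.0]])    # shear along the x direction
project_x = np.array([[1.0, 0.0], [0.0, 0.0]])    # project onto the x-axis

stretched = stretch @ v
rotated = rotate90 @ v
sheared = shear @ np.array([0.0, 1.0])
projected = project_x @ np.array([3.0, 4.0])

# T : R^2 -> R^3 is represented by a 3 x 2 matrix.
T = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
w = T @ v   # w lives in R^3
```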

Matrix Addition/Subtraction

When Is Addition Allowed?

Addition is done element by element.

Two matrices can be added only if they have the same dimensions.

If:

$A, B \in \mathbb{R}^{m \times n}$

then:

$C = A + B$, where $C_{ij} = A_{ij} + B_{ij}$, is also an $m \times n$ matrix.

$A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \quad B = \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix}$

$A + B = \begin{bmatrix} 1+5 & 2+6 \\ 3+7 & 4+8 \end{bmatrix} = \begin{bmatrix} 6 & 8 \\ 10 & 12 \end{bmatrix}$

Subtraction is the same but with minus signs.

Scalar Multiplication / Division

Scalar multiplication multiplies every element of a matrix by a single number (a scalar): $(cA)_{ij} = c\,A_{ij}$. Division by a nonzero scalar $c$ is the same as multiplication by $1/c$.
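Element-wise addition, subtraction, and scalar multiplication map directly onto NumPy operators. The sketch reuses the $A$ and $B$ from the addition example; the scalars 3 and 2 are arbitrary.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

total = A + B        # element-wise addition
difference = B - A   # element-wise subtraction
scaled = 3 * A       # scalar multiplication
halved = A / 2       # scalar division = multiplication by 1/2

# Caution: NumPy "broadcasting" can combine arrays of different shapes
# (e.g. a matrix plus a row vector), which is more permissive than the
# strict same-dimensions rule of matrix addition.
```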

Matrix-Matrix Multiplication

Entry formula: $C_{ij} = \sum_{k=1}^{n} A_{ik} B_{kj}$ (row $i$ of $A$ times column $j$ of $B$)

AB → Matrix multiplication of A and B
(Valid only if inner dimensions match)

Given 2 Matrices:

$A$ is $(m \times n)$ or $A \in \mathbb{R}^{m \times n}$
$B$ is $(n \times p)$ or $B \in \mathbb{R}^{n \times p}$

Then:

$C = AB$

where $C$ is a new matrix with dimensions:

$C$ is $(m \times p)$ or $C \in \mathbb{R}^{m \times p}$
Inner dimensions must match ($n$): $(m \times n)(n \times p) \rightarrow (m \times p)$

Properties:

Not commutative: in general $AB \ne BA$, so order matters
Associative: $(AB)C = A(BC)$
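The shape rule and both properties can be verified with NumPy's `@` operator. The matrices below are arbitrary small examples chosen so the arithmetic is easy to follow.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])      # 2 x 3
B = np.array([[1, 0],
              [0, 1],
              [1, 1]])         # 3 x 2

C = A @ B                      # (2 x 3)(3 x 2) -> 2 x 2

# One entry from the sum formula: C_11 = sum over k of A_1k * B_k1.
c_11 = sum(A[0, k] * B[k, 0] for k in range(A.shape[1]))

# Not commutative: PQ and QP differ for these square matrices.
P = np.array([[1, 2],
              [3, 4]])
Q = np.array([[0, 1],
              [1, 0]])
PQ = P @ Q
QP = Q @ P

# Associative: the grouping does not change the result.
left_grouping = (P @ Q) @ P
right_grouping = P @ (Q @ P)
```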

Use in Machine Learning

Most of the computation in deep learning is matrix multiplication:

Inputs × Weights
Weights × Activations
Gradient updates

Neural network forward pass: $Z = WX + b$

Backpropagation is also matrix calculus.

Understanding multivariate linear algebra makes deep learning much easier to grasp.
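The forward pass $Z = WX + b$ can be sketched for a single layer as follows. The layer sizes, batch size, and random values are hypothetical. For $WX$ to be well defined, this sketch stores each example as a column of $X$, which differs from the row-per-example data matrix used earlier.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 3 inputs, 2 output units, batch of 4 examples.
n_in, n_out, batch = 3, 2, 4

W = rng.normal(size=(n_out, n_in))   # weights: (n_out x n_in)
X = rng.normal(size=(n_in, batch))   # inputs: each COLUMN is one example
b = np.zeros((n_out, 1))             # bias: broadcast across the batch columns

Z = W @ X + b                        # pre-activations: (n_out x batch)

# The same computation for the first example alone.
z_first = W @ X[:, 0] + b[:, 0]
```

One matrix product handles the whole batch; column `j` of `Z` equals the result of processing example `j` on its own.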

`Vectorization`: Matrix-Vector Multiplication

If:

$X$ is an $m \times n$ matrix
$\theta$ is an $n \times 1$ vector

Then

$h = X\theta$

produces $h$, an $m \times 1$ vector.

Use in Machine Learning

This gives predictions for all training examples in one operation.
Faster computation: optimized hardware usage (CPU/GPU)
Clean mathematical formulation

Linear regression hypothesis:

$h_\theta(x) = \theta^T x$

For all training examples:

$h = X\theta$
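To make the "nested loops versus one matrix operation" contrast concrete, here is a small sketch that computes the same predictions both ways. The data and parameter values are invented for the example.

```python
import numpy as np

# Hypothetical data: 3 examples, 2 features (first column is the intercept term).
X = np.array([[1.0, 2.0],
              [1.0, 3.0],
              [1.0, 5.0]])
theta = np.array([0.5, 2.0])

# Loop version: one prediction at a time, one feature at a time.
m, n = X.shape
h_loop = np.zeros(m)
for i in range(m):
    for j in range(n):
        h_loop[i] += theta[j] * X[i, j]

# Vectorized version: all predictions in one matrix-vector product.
h_vec = X @ theta
```

Both give the same vector of predictions; the vectorized form delegates the loops to optimized native code.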

Dot Product( $a \cdot b$ )

The dot product is defined between two vectors of the same dimension.

If:

$x, y \in \mathbb{R}^n$

Then their dot product is:

$x^T y$

$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \quad y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$

$x^T y = x_1 y_1 + x_2 y_2 + \dots + x_n y_n$

It produces a single number (a scalar).

u · v or ⟨u, v⟩ → Dot product of vectors

Dot product formula:

$u \cdot v = \sum_{i=1}^{n} u_i v_i$
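The sum formula and NumPy's built-in dot product give the same scalar. The vectors below are arbitrary.

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

# Direct translation of the sum formula.
dot_manual = sum(u[i] * v[i] for i in range(len(u)))   # 1*4 + 2*5 + 3*6

# Equivalent NumPy forms.
dot_np = np.dot(u, v)
dot_at = u @ v
```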

Summary

Key ideas:

Vectors represent features and parameters
Matrices represent datasets
Matrix multiplication enables fast prediction
Transpose and inverse enable optimization
Vectorization is essential for performance

Written by Hitesh Sahu, a passionate developer and blogger.

Fri Feb 27 2026


