Hitesh Sahu

Recommender Systems: Collaborative Filtering, Content-Based Filtering, and Hybrid Approaches

Comprehensive guide to recommender systems, covering collaborative filtering, content-based filtering, and hybrid approaches, with practical implementation examples and best practices for building effective recommendation engines.

Hitesh Sahu
Written by Hitesh Sahu, a passionate developer and blogger.

Fri Feb 27 2026


← Previous

Anomaly Detection Using Multivariate Gaussian Distribution

Next →

Collaborative Filtering: Building Recommender Systems with Feature Learning

Recommender Systems 🍿

A recommender system predicts how users would rate items they have not yet rated.

Example:

  • $n_u$ = number of users 👤
  • $n_m$ = number of movies 📺

Users rate some movies (1–5 stars), but many ratings are missing.
Goal: predict the missing ratings.


Content-Based Recommendation 🎬

The system recommends items whose content matches the user's preferences.

Predict ratings based on movie features ($x^{(i)}$) and user preferences ($\theta^{(j)}$).


1. Feature vector ($x^{(i)}$) 📺

Each movie $i$ is represented by a feature vector $x^{(i)}$ that describes its content.

Example features:

  • $x_1$ = romance
  • $x_2$ = action
  • $x_3$ = genre
  • $x_4$ = actors
  • $x_5$, ... = further features as needed

With only the romance and action features, the vector is:

$$x^{(i)} = \begin{bmatrix} 1 \\ x_1 \\ x_2 \end{bmatrix}$$

The first element is the bias feature:

$$x_0 = 1$$

If we have $n$ features, then:

$$x^{(i)} \in \mathbb{R}^{n+1}$$

Movie Feature Vector

Example movie features:

| Movie | Romance ($x_1$) | Action ($x_2$) |
| --- | --- | --- |
| Love at Last | 0.9 | 0 |
| Romance Forever | 1.0 | 0.01 |
| Swords vs Karate | 0 | 0.9 |
| Nonstop Car Chases | 0.1 | 1.0 |
| Cute Puppies of Love | 0.99 | 0 |
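
As a minimal sketch, the example features above can be turned into feature vectors with NumPy by prepending the bias feature $x_0 = 1$ to each movie. The variable names (`movie_titles`, `raw_features`, `X`) are illustrative choices and do not come from the original post.

```python
import numpy as np

# Raw features (romance, action) for the five example movies.
movie_titles = [
    "Love at Last",
    "Romance Forever",
    "Swords vs Karate",
    "Nonstop Car Chases",
    "Cute Puppies of Love",
]
raw_features = np.array([
    [0.9, 0.0],
    [1.0, 0.01],
    [0.0, 0.9],
    [0.1, 1.0],
    [0.99, 0.0],
])

# Prepend the bias feature x_0 = 1, so each row is a vector in R^(n+1).
n_movies, n_features = raw_features.shape
X = np.hstack([np.ones((n_movies, 1)), raw_features])

print(X.shape)  # (5, 3): five movies, n + 1 = 3 entries each
print(X[4])     # feature vector for "Cute Puppies of Love"
```

Storing one movie per row keeps the layout compatible with the matrix products used for prediction later on.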

2. User Preference Model ($\theta^{(j)}$) 👤

Each user $j$ has their own parameter vector, which represents that user's preference for each feature:

$$\theta^{(j)} \in \mathbb{R}^{n+1}$$

Example:

  • Alice likes romance → high weight on $x_1$
  • Bob likes action → high weight on $x_2$



Rating Prediction ($\hat{y}^{(i,j)}$)

Predicted rating of movie $i$ for user $j$:

$$\hat{y}^{(i,j)} = (\theta^{(j)})^T x^{(i)}$$

where:

  • $\theta^{(j)}$ = user preference vector
  • $x^{(i)}$ = movie feature vector

$y^{(i,j)}$ = actual rating by user $j$ on movie $i$ (defined only if the user has rated the movie)

For each user, this is just linear regression.
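
Because every prediction is a dot product, the ratings of all movies for all users can be computed with a single matrix product. The sketch below assumes NumPy and reuses the example movie features from this post. Alice's vector matches the worked example in this post; Bob's vector `[0, 0, 5]` is a hypothetical value chosen only to mirror "Bob likes action".

```python
import numpy as np

# Movie feature vectors, one per row (columns: bias x_0, romance x_1, action x_2).
X = np.array([
    [1.0, 0.9, 0.0],    # Love at Last
    [1.0, 1.0, 0.01],   # Romance Forever
    [1.0, 0.0, 0.9],    # Swords vs Karate
    [1.0, 0.1, 1.0],    # Nonstop Car Chases
    [1.0, 0.99, 0.0],   # Cute Puppies of Love
])

# One preference vector per user, one user per row.
Theta = np.array([
    [0.0, 5.0, 0.0],    # Alice: high weight on romance
    [0.0, 0.0, 5.0],    # Bob: high weight on action (assumed values)
])

# predictions[i, j] = (theta^(j))^T x^(i)
predictions = X @ Theta.T

print(np.round(predictions, 2))
```

Each column of `predictions` holds one user's predicted ratings, so Alice's column is highest for the romance movies and Bob's for the action movies.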

Example

Movie: Cute Puppies of Love

Feature vector:

$$x = \begin{bmatrix} 1 \\ 0.99 \\ 0 \end{bmatrix}$$

Alice's preference vector:

$$\theta^{(1)} = \begin{bmatrix} 0 \\ 5 \\ 0 \end{bmatrix}$$

Prediction $\hat{y}^{(i,j)}$:

$$\hat{y} = (\theta^{(1)})^T x$$

Result:

$$\hat{y} = 0 \times 1 + 5 \times 0.99 + 0 \times 0 = 4.95$$

Predicted rating ≈ 5 stars.
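
The same calculation can be checked in a few lines of NumPy. This is only a sketch of the worked example; the helper name `predict_rating` is an illustrative choice.

```python
import numpy as np

def predict_rating(theta, x):
    """Return the predicted rating theta^T x for one user and one movie."""
    return float(theta @ x)

x = np.array([1.0, 0.99, 0.0])           # Cute Puppies of Love: bias, romance, action
theta_alice = np.array([0.0, 5.0, 0.0])  # Alice's preference vector

rating = predict_rating(theta_alice, x)
print(round(rating, 2))  # 4.95
```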


Rating Prediction for a Single User

Let

  • $i$ = movie/item index
  • $j$ = user index
  • $m^{(j)}$ = number of movies rated by user $j$

Rating indicator for movie $i$ and user $j$:

$r(i,j) = 1$ if user $j$ has rated movie $i$

$r(i,j) = 0$ if user $j$ has not rated movie $i$

Actual rating of movie $i$ given by user $j$ (defined only if $r(i,j) = 1$), for example:

$$y^{(i,j)} = 3$$

Parameter vector for user $j$, which describes the kind of movies user $j$ likes:

$$\theta^{(j)}$$

Predicted Rating

For user $j$ and movie $i$, the predicted rating is:

$$\hat{y}^{(i,j)} = (\theta^{(j)})^T x^{(i)}$$
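
One way to hold this notation in code is a ratings matrix `Y` with one row per movie and one column per user, plus an indicator matrix `R` derived from it. The sketch below assumes NumPy; the ratings are made-up values for illustration, and marking missing ratings with `NaN` is a design choice of this example rather than something the post prescribes.

```python
import numpy as np

# Hypothetical ratings: rows are movies i, columns are users j.
# NaN marks a rating that the user has not given.
Y = np.array([
    [5.0, np.nan, 1.0],
    [np.nan, 4.0, 1.0],
    [1.0, 2.0, 5.0],
    [1.0, np.nan, 4.0],
    [3.0, 5.0, np.nan],
])

# r(i, j) = 1 if user j has rated movie i, else 0.
R = (~np.isnan(Y)).astype(int)

# m^(j) = number of movies rated by user j (column sums of R).
m_per_user = R.sum(axis=0)

print(R)
print(m_per_user)  # [4 3 4]
```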

💰 Cost Function

We want predictions close to actual ratings.

Predicted rating:

$$\hat{y}^{(i,j)} = (\theta^{(j)})^T x^{(i)}$$

The prediction error is the difference between the predicted and the actual rating:

$$\text{Error} = \text{Predicted Rating} - \text{Actual Rating}$$

which is equal to

$$\hat{y}^{(i,j)} - y^{(i,j)} = (\theta^{(j)})^T x^{(i)} - y^{(i,j)}$$

For user $j$, minimize the regularized squared error over the movies that user has rated:

$$\min_{\theta^{(j)}} \frac{1}{2} \sum_{i:r(i,j)=1} \left( (\theta^{(j)})^T x^{(i)} - y^{(i,j)} \right)^2 + \frac{\lambda}{2} \sum_{k=1}^{n} (\theta_k^{(j)})^2$$

In content-based recommendation the movie features $x^{(i)}$ are given, so the minimization is over the user's parameters $\theta^{(j)}$.

Goal:

  • minimize prediction error
  • regularize the user's parameters

Cost function

$$J(\theta^{(j)}) = \frac{1}{2} \sum_{i:r(i,j)=1} \left((\theta^{(j)})^T x^{(i)} - y^{(i,j)}\right)^2 + \frac{\lambda}{2} \sum_{k=1}^{n}(\theta_k^{(j)})^2$$

First term = sum of squared prediction errors over the movies rated by user $j$.

Second term = regularization (prevents overfitting). The sum starts at $k = 1$, so the bias parameter $\theta_0^{(j)}$ is not regularized.
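
A minimal NumPy sketch of this per-user cost is shown below. The function name `user_cost`, the boolean mask `rated` (standing in for $r(i,j)$), and the ratings in `y` are assumptions made for illustration; only the movie features and Alice's $\theta$ come from this post.

```python
import numpy as np

def user_cost(theta, X, y, rated, lam):
    """Regularized squared-error cost J(theta^(j)) for a single user.

    theta: (n+1,) parameters, X: (n_m, n+1) features with bias column,
    y: (n_m,) ratings, rated: (n_m,) boolean mask, lam: regularization strength.
    """
    errors = X[rated] @ theta - y[rated]                 # only rated movies
    squared_error = 0.5 * np.sum(errors ** 2)
    regularization = 0.5 * lam * np.sum(theta[1:] ** 2)  # bias theta_0 excluded
    return squared_error + regularization

X = np.array([
    [1.0, 0.9, 0.0],
    [1.0, 1.0, 0.01],
    [1.0, 0.0, 0.9],
    [1.0, 0.1, 1.0],
    [1.0, 0.99, 0.0],
])
theta = np.array([0.0, 5.0, 0.0])            # Alice's preference vector
y = np.array([5.0, 5.0, np.nan, 1.0, 5.0])   # hypothetical ratings, NaN = unrated
rated = ~np.isnan(y)

cost_no_reg = user_cost(theta, X, y, rated, lam=0.0)
cost_reg = user_cost(theta, X, y, rated, lam=1.0)
print(cost_no_reg, cost_reg)
```

Masking with `rated` before computing the errors keeps unrated movies out of the sum, which is what the condition $i : r(i,j) = 1$ expresses.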

All Users

We learn parameters for all users:

$$J(\theta^{(1)},...,\theta^{(n_u)}) = \frac{1}{2} \sum_{j=1}^{n_u} \sum_{i:r(i,j)=1} \left((\theta^{(j)})^T x^{(i)} - y^{(i,j)}\right)^2 + \frac{\lambda}{2} \sum_{j=1}^{n_u}\sum_{k=1}^{n}(\theta_k^{(j)})^2$$

Minimize this to learn all user preferences.
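
The cost over all users can be sketched in vectorized form, assuming NumPy and a layout with one movie per row of `X`, one user per row of `Theta`, and `Y`/`R` of shape (movies, users). The function name `total_cost`, the convention that `Y` stores 0 where a rating is missing, and the small data set are all assumptions of this example.

```python
import numpy as np

def total_cost(Theta, X, Y, R, lam):
    """Cost J(theta^(1), ..., theta^(n_u)) summed over all users.

    Theta: (n_u, n+1), X: (n_m, n+1), Y: (n_m, n_u) with 0 for missing ratings,
    R: (n_m, n_u) with entries 0 or 1, lam: regularization strength.
    """
    errors = (X @ Theta.T - Y) * R                          # zero out unrated pairs
    squared_error = 0.5 * np.sum(errors ** 2)
    regularization = 0.5 * lam * np.sum(Theta[:, 1:] ** 2)  # skip bias column
    return squared_error + regularization

# Tiny illustrative data set: two movies, two users, one rating each.
X = np.array([[1.0, 0.9, 0.0],
              [1.0, 0.0, 0.9]])
Theta = np.array([[0.0, 5.0, 0.0],
                  [0.0, 0.0, 5.0]])
Y = np.array([[5.0, 0.0],
              [0.0, 4.0]])
R = np.array([[1, 0],
              [0, 1]])

cost_no_reg = total_cost(Theta, X, Y, R, lam=0.0)
cost_reg = total_cost(Theta, X, Y, R, lam=1.0)
print(cost_no_reg, cost_reg)
```

Multiplying by `R` plays the role of the inner sum over $i : r(i,j) = 1$, so entries without a rating contribute nothing to the error term.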


Cost Optimization

Parameters are learned using:

  • Gradient Descent
  • or advanced optimizers (L-BFGS, Conjugate Gradient)

Updates look similar to linear regression updates.
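
A hedged sketch of batch gradient descent for this cost follows, assuming NumPy. For user $j$ the gradient is $\sum_{i:r(i,j)=1} \left((\theta^{(j)})^T x^{(i)} - y^{(i,j)}\right) x_k^{(i)}$, plus $\lambda \theta_k^{(j)}$ for $k \ge 1$. The function names, the ratings in `Y`, the learning rate `alpha`, and the iteration count are illustrative assumptions, not values from the post.

```python
import numpy as np

def total_cost(Theta, X, Y, R, lam):
    """Regularized squared-error cost over all users."""
    errors = (X @ Theta.T - Y) * R
    return 0.5 * np.sum(errors ** 2) + 0.5 * lam * np.sum(Theta[:, 1:] ** 2)

def gradient_step(Theta, X, Y, R, lam, alpha):
    """One batch gradient descent update for every user's parameters."""
    errors = (X @ Theta.T - Y) * R       # (n_m, n_u), zero where unrated
    grad = errors.T @ X                  # (n_u, n+1): sum_i error * x_k
    grad[:, 1:] += lam * Theta[:, 1:]    # regularize all but the bias
    return Theta - alpha * grad

X = np.array([
    [1.0, 0.9, 0.0],
    [1.0, 1.0, 0.01],
    [1.0, 0.0, 0.9],
    [1.0, 0.1, 1.0],
    [1.0, 0.99, 0.0],
])
# Hypothetical ratings for two users; 0 is a placeholder where R is 0.
Y = np.array([[5.0, 1.0], [5.0, 0.0], [0.0, 5.0], [1.0, 5.0], [5.0, 1.0]])
R = np.array([[1, 1], [1, 0], [0, 1], [1, 1], [1, 1]])

lam, alpha = 0.1, 0.05
Theta = np.zeros((2, 3))                 # start from all-zero preferences
initial_cost = total_cost(Theta, X, Y, R, lam)
for _ in range(500):
    Theta = gradient_step(Theta, X, Y, R, lam, alpha)
final_cost = total_cost(Theta, X, Y, R, lam)

print(initial_cost, final_cost)
print(np.round(Theta, 2))
```

The update has the same form as regularized linear regression, applied to each user's row of `Theta` independently. In practice the learning rate usually needs tuning for the data at hand.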


Limitation

Content-based recommendation requires hand-crafted features for every item (movie).

In many real systems these features are:

  • missing
  • hard to define

This leads to the next method, collaborative filtering, which is described in the next post.
