

Recommender Systems: Collaborative Filtering, Content-Based Filtering, and Hybrid Approaches

Comprehensive guide to recommender systems, covering collaborative filtering, content-based filtering, and hybrid approaches, with practical implementation examples and best practices for building effective recommendation engines.

Written by Hitesh Sahu, a passionate developer and blogger.

Fri Feb 27 2026


Recommender Systems

A recommender system predicts how users would rate items they have not yet rated.

Example:

  • $n_u$ = number of users
  • $n_m$ = number of movies

Users rate some movies (1–5 stars), but many ratings are missing.
Goal: predict the missing ratings.
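The setup above can be sketched as a small ratings matrix. This is an illustrative toy example (the variable names `Y` and `R` follow the notation used later in this post); missing ratings are marked with `np.nan`:

```python
import numpy as np

# Toy ratings matrix Y (movies x users): rows = n_m movies, cols = n_u users.
# np.nan marks a rating the user has not given; the goal is to fill these in.
Y = np.array([
    [5.0, 5.0, 0.0, np.nan],   # Love at Last
    [5.0, np.nan, 0.0, 0.0],   # Romance Forever
    [np.nan, 0.0, 5.0, 4.0],   # Swords vs Karate
])

R = ~np.isnan(Y)               # R[i, j] is True iff user j rated movie i
print(R.sum())                 # 9 observed ratings out of 12 entries
```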


Content-Based Idea

Each movie has features describing its content.

Example features:

  • $x_1$ = romance level
  • $x_2$ = action level

Example movie features:

| Movie | Romance ($x_1$) | Action ($x_2$) |
|---|---|---|
| Love at Last | 0.9 | 0 |
| Romance Forever | 1.0 | 0.01 |
| Swords vs Karate | 0 | 0.9 |
| Nonstop Car Chases | 0.1 | 1.0 |
| Cute Puppies of Love | 0.99 | 0 |

Each movie is represented by a feature vector:

$$x^{(i)} = \begin{bmatrix} 1 \\ x_1 \\ x_2 \end{bmatrix}$$

The first element is the bias feature:

$$x_0 = 1$$

If we have $n$ features, then:

$$x^{(i)} \in \mathbb{R}^{n+1}$$
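A minimal sketch of building these feature vectors from the table above, prepending the bias feature $x_0 = 1$ to each row (the array names are illustrative):

```python
import numpy as np

# Movie feature table from above: columns are (romance x1, action x2).
features = np.array([
    [0.9, 0.0],    # Love at Last
    [1.0, 0.01],   # Romance Forever
    [0.0, 0.9],    # Swords vs Karate
    [0.1, 1.0],    # Nonstop Car Chases
    [0.99, 0.0],   # Cute Puppies of Love
])

# Prepend the bias feature x0 = 1, so each x^(i) lives in R^(n+1).
X = np.hstack([np.ones((features.shape[0], 1)), features])
print(X[4])   # feature vector for "Cute Puppies of Love": bias, romance, action
```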

User Preference Model

Each user $j$ has their own parameter vector:

$$\theta^{(j)} \in \mathbb{R}^{n+1}$$

It represents the user's preferences for features.

Example:

  • Alice likes romance → high weight on $x_1$
  • Bob likes action → high weight on $x_2$

Rating Prediction

Predicted rating of user $j$ for movie $i$:

$$\hat{y}^{(i,j)} = (\theta^{(j)})^T x^{(i)}$$

This is just linear regression.

Example

Movie: Cute Puppies of Love

Feature vector:

$$x = \begin{bmatrix} 1 \\ 0.99 \\ 0 \end{bmatrix}$$

Alice's preference vector:

$$\theta^{(1)} = \begin{bmatrix} 0 \\ 5 \\ 0 \end{bmatrix}$$

Prediction:

$$\hat{y} = (\theta^{(1)})^T x$$

Result:

$$\hat{y} = 5 \times 0.99 = 4.95$$

Predicted rating ≈ 5 stars.
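The worked example above is a single dot product, which can be checked in a few lines of NumPy:

```python
import numpy as np

# Worked example: Alice's parameters and the "Cute Puppies of Love" features.
x = np.array([1.0, 0.99, 0.0])          # [x0, romance, action]
theta_alice = np.array([0.0, 5.0, 0.0])

y_hat = theta_alice @ x                 # (theta^(1))^T x
print(y_hat)                            # 4.95
```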


Cost Function

Single User

Let

  • $j$ = user index
  • $i$ = movie/item index

Rating

  • $r(i,j) = 1$ if user $j$ has rated movie $i$, and $0$ otherwise
  • $y^{(i,j)}$ = rating given by user $j$ to movie $i$ (defined only where $r(i,j) = 1$)

  • $\theta^{(j)}$ = parameter vector for user $j$ (encodes the kinds of movies user $j$ likes)

Predicted Rating

For user $j$ and movie $i$, the predicted rating is:

$$(\theta^{(j)})^T x^{(i)}$$

For user $j$, minimize the squared error over the movies that user has rated:

$$J(\theta^{(j)}) = \frac{1}{2} \sum_{i:r(i,j)=1} \left((\theta^{(j)})^T x^{(i)} - y^{(i,j)}\right)^2 + \frac{\lambda}{2} \sum_{k=1}^{n} \left(\theta_k^{(j)}\right)^2$$

Here $m^{(j)}$ = number of movies rated by user $j$. The conventional $\frac{1}{2m^{(j)}}$ factor is simplified to $\frac{1}{2}$: dropping the constant $m^{(j)}$ does not change which $\theta^{(j)}$ minimizes the cost.

Second term = regularization (prevents overfitting).

All Users

We learn parameters for all users:

$$J(\theta^{(1)}, \dots, \theta^{(n_u)}) = \frac{1}{2} \sum_{j=1}^{n_u} \sum_{i:r(i,j)=1} \left((\theta^{(j)})^T x^{(i)} - y^{(i,j)}\right)^2 + \frac{\lambda}{2} \sum_{j=1}^{n_u} \sum_{k=1}^{n} \left(\theta_k^{(j)}\right)^2$$

Minimize this to learn all user preferences.
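The all-users cost above can be vectorized in NumPy. This is a minimal sketch under the conventions used in this post (movies as rows, users as columns; the function name and argument names are illustrative):

```python
import numpy as np

def content_based_cost(Theta, X, Y, R, lam):
    """Regularized squared-error cost over all users.

    Theta: (n_u, n+1) user parameter vectors (column 0 multiplies the bias).
    X:     (n_m, n+1) movie feature vectors (column 0 is the bias x0 = 1).
    Y:     (n_m, n_u) ratings; only entries with R == 1 contribute.
    R:     (n_m, n_u) indicator, R[i, j] = 1 if user j rated movie i.
    lam:   regularization strength lambda (the bias theta_0 is not regularized).
    """
    errors = (X @ Theta.T - Y) * R                 # zero out unrated entries
    cost = 0.5 * np.sum(errors ** 2)
    reg = 0.5 * lam * np.sum(Theta[:, 1:] ** 2)    # skip theta_0
    return cost + reg
```

Multiplying the error matrix by the indicator `R` implements the $\sum_{i:r(i,j)=1}$ restriction without any explicit loop.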


Optimization

Parameters are learned using:

  • Gradient Descent
  • or advanced optimizers (L-BFGS, conjugate gradient)

Updates look similar to linear regression updates.
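One batch gradient-descent step can be written in vectorized form. This is a sketch following the cost defined above (function and argument names are illustrative); the gradient of the regularization term skips the bias parameter $\theta_0$, just as in regularized linear regression:

```python
import numpy as np

def gradient_step(Theta, X, Y, R, lam, alpha):
    """One batch gradient-descent update on all user parameter vectors.

    Theta: (n_u, n+1), X: (n_m, n+1), Y and R: (n_m, n_u),
    lam: regularization strength, alpha: learning rate.
    """
    errors = (X @ Theta.T - Y) * R       # (n_m, n_u), zero where unrated
    grad = errors.T @ X                  # (n_u, n+1): sum of error * x over movies
    grad[:, 1:] += lam * Theta[:, 1:]    # regularize all but theta_0
    return Theta - alpha * grad
```

Repeating this step drives each $\theta^{(j)}$ toward the minimizer of the cost, exactly as in linear regression run once per user.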


Why It's Called Content-Based

Because predictions depend on item features:

  • romance
  • action
  • genre
  • actors
  • etc.

The system recommends items whose content matches the user's preferences.
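Once the parameters are learned, producing recommendations is just ranking unseen items by predicted rating. A hypothetical sketch for a single user (movie titles and the `seen` mask are illustrative):

```python
import numpy as np

movies = ["Love at Last", "Romance Forever", "Swords vs Karate"]
X = np.array([[1, 0.9, 0.0], [1, 1.0, 0.01], [1, 0.0, 0.9]])
theta = np.array([0.0, 5.0, 0.0])       # a romance-loving user
seen = np.array([True, False, False])   # already rated "Love at Last"

preds = X @ theta                       # predicted rating for every movie
order = np.argsort(-preds)              # best first
recs = [movies[i] for i in order if not seen[i]]
print(recs[0])                          # "Romance Forever"
```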


Limitation

Requires hand-crafted features for items.

In many real systems:

  • item features are often unavailable
  • or hard to define by hand

This leads to the next method:

Collaborative Filtering.
