
Collaborative Filtering: Building Recommender Systems with Feature Learning

Learn how collaborative filtering powers modern recommender systems by simultaneously learning user preferences and item features from rating data. Understand the optimization objective, matrix factorization approach, and how gradient-based methods enable scalable recommendations.

Written by Hitesh Sahu, a passionate developer and blogger.

Fri Feb 27 2026


Collaborative Filtering 🫱🏻‍🫲🏽

Recommender system technique that learns both user preferences and item features automatically from rating data.

  • Unlike content-based methods, we do not know the features of movies beforehand.

  • Collaborative filtering learns hidden features of users and items so it can predict missing ratings and recommend things people will likely enjoy.

Chicken-and-Egg Problem 🥚

Previously we saw two ideas:

  1. If movie features $x^{(i)}$ are known, we can learn user parameters $\theta^{(j)}$.
  2. If user parameters $\theta^{(j)}$ are known, we can learn movie features $x^{(i)}$.

Instead of alternating between them, collaborative filtering learns both simultaneously.
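To make the two sub-problems concrete, here is a minimal NumPy sketch of the alternating view: fix $\theta$ and solve for each movie's $x^{(i)}$, then fix $x$ and solve for each user's $\theta^{(j)}$. All data, dimensions, and hyperparameters below are illustrative, not from the post:

```python
import numpy as np

# Toy ratings matrix (movies x users); np.nan marks a missing rating.
Y = np.array([[5.0, 4.0, 1.0],
              [5.0, np.nan, 1.0],
              [1.0, 1.0, 5.0],
              [1.0, 1.0, 5.0]])
R = ~np.isnan(Y)               # r(i,j) = 1 where a rating exists
Yf = np.nan_to_num(Y)          # zeros in the unrated slots

n, lam = 2, 0.1                # latent feature dimension, regularization
rng = np.random.default_rng(0)
X = rng.normal(scale=0.5, size=(Y.shape[0], n))      # movie features x^(i)
Theta = rng.normal(scale=0.5, size=(Y.shape[1], n))  # user params theta^(j)

for _ in range(30):
    # Idea 1: user params known -> solve for each movie's features.
    for i in range(Y.shape[0]):
        T = Theta[R[i]]        # params of users who rated movie i
        X[i] = np.linalg.solve(T.T @ T + lam * np.eye(n), T.T @ Yf[i, R[i]])
    # Idea 2: movie features known -> solve for each user's params.
    for j in range(Y.shape[1]):
        M = X[R[:, j]]         # features of movies user j rated
        Theta[j] = np.linalg.solve(M.T @ M + lam * np.eye(n), M.T @ Yf[R[:, j], j])

pred = X @ Theta.T             # fills in every (movie, user) cell
```

The missing rating `Y[1, 1]` gets filled in by `pred[1, 1]`; because movie 1's observed ratings mirror movie 0's, the prediction comes out high.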

Why Is It Called Collaborative?

Many users rate movies.

Their ratings collaboratively help the system learn features.

Result:

  • Better movie representations
  • Better recommendations for everyone
```mermaid
flowchart TD
    A[Randomly Initialize User Preferences θ] --> B[Learn Movie Features x]
    B --> C[Update User Preferences θ]
    C --> D[Update Movie Features x]
    D --> E[Repeat Until Convergence]
```

Key Idea

People with similar tastes tend to like similar things.

Collaborative filtering simultaneously learns:

  • user preferences $\theta$
  • item features $x$

directly from the rating matrix, without manually defining features.

```mermaid
flowchart TD
    A[Users Rate Movies] --> B[Learn User Preference Vectors]
    B --> C[Learn Movie Feature Vectors]
    C --> D[Predict and Update Missing Ratings for New Movies]
    D --> E[Generate New Recommendations]
```

Movie Feature Matrix $x^{(i)}$

| Movie     | Romance | Action |
| --------- | ------- | ------ |
| Titanic   | 0.9     | 0.1    |
| Notebook  | 0.95    | 0.05   |
| Avengers  | 0.1     | 0.9    |
| John Wick | 0.05    | 0.95   |

$$x^{(1)} = \begin{bmatrix} 0.9 \\ 0.1 \end{bmatrix}$$

where

  • $x_1$ = romantic level
  • $x_2$ = action level

From this we infer that $x^{(1)}$ describes a movie that:

  • is highly romantic
  • has little action

No Intercept Term

Unlike previous models:

  • We remove the intercept feature $x_0 = 1$.

$$x^{(i)} \in \mathbb{R}^n, \quad \theta^{(j)} \in \mathbb{R}^n$$

Reason:

Since the algorithm learns all features automatically, it can learn a constant feature itself if needed.

User Movie Rating Matrix

| User  | Titanic | The Notebook | Avengers | John Wick |
| ----- | ------- | ------------ | -------- | --------- |
| Alice | ⭐⭐⭐⭐⭐ | ⭐ | ⭐ | ⭐ |
| Bob   | ⭐⭐⭐⭐ | ? | ⭐ | ⭐ |
| Carol | ⭐ | ⭐ | ⭐ | ⭐⭐⭐⭐⭐ |

Prediction of user $j$'s rating for movie $i$:

$$\hat{y}_{ij} = \theta^{(j)T} x^{(i)}$$

User Preferences Matrix $\theta^{(j)}$

| User  | Likes Romance | Likes Action |
| ----- | ------------- | ------------ |
| Alice | 0.95          | 0.05         |
| Bob   | 0.85          | 0.15         |
| Carol | 0.05          | 0.95         |

  • These features are not manually defined.
  • The algorithm learns them from ratings.

Observation

  • Alice and Bob have similar taste
  • Both dislike action movies
  • Both like romantic movies

So if Bob has not rated The Notebook, we can predict:

  • Bob will probably rate it highly.

The algorithm uses behavior of other users to predict what someone will like.
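As a quick sanity check of the dot-product prediction, here are Alice and Titanic with the illustrative numbers from the two tables above:

```python
# Illustrative vectors taken from the tables above.
x_titanic = [0.9, 0.1]      # [romance, action] features of Titanic
theta_alice = [0.95, 0.05]  # Alice's learned preferences

# Predicted affinity: theta^T x (a plain dot product)
score = sum(t * f for t, f in zip(theta_alice, x_titanic))
# 0.95 * 0.9 + 0.05 * 0.1 = 0.855 + 0.005 = 0.86 -> high affinity
```

A romance lover meeting a romance movie yields a high score; swapping in Carol's preferences `[0.05, 0.95]` would yield a low one.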

Learning Movie Features

If user parameters $\theta^{(j)}$ are known, we can learn movie features $x^{(i)}$.

Minimize the prediction error:

$$\min_{x^{(i)}} \sum_{j:r(i,j)=1} \left(\theta^{(j)T}x^{(i)} - y_{ij}\right)^2 + \frac{\lambda}{2}\sum_{k=1}^{n}\left(x_k^{(i)}\right)^2$$

Where:

  • $y_{ij}$ = actual rating
  • $r(i,j) = 1$ if a rating exists
  • $\lambda$ = regularization strength
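This per-movie objective translates almost directly into NumPy. A sketch, where the function name, toy numbers, and argument layout are mine, not the post's:

```python
import numpy as np

def movie_cost(x_i, Theta, y_i, r_i, lam):
    """Squared error over users who rated movie i, plus the L2 penalty on x^(i)."""
    err = Theta[r_i] @ x_i - y_i[r_i]     # theta^(j)^T x^(i) - y_ij, per rater
    return np.sum(err ** 2) + (lam / 2) * np.sum(x_i ** 2)

# Toy check using the three users' preference vectors from the earlier table.
Theta = np.array([[0.95, 0.05], [0.85, 0.15], [0.05, 0.95]])
y_i = np.array([5.0, 4.0, 1.0])          # their ratings of this movie
r_i = np.array([True, True, True])       # all three rated it
cost = movie_cost(np.array([0.9, 0.1]), Theta, y_i, r_i, lam=1.0)
```

Note the cost is large here because the toy features live on a 0-1 scale while the ratings are 1-5; minimizing over $x^{(i)}$ would rescale the features to fit.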

Learning All Movie Features

```mermaid
flowchart TD
    A[Current Movie Features x] --> B[Predict User Ratings]
    B --> C[Compute Error]
    C --> D[Compute Gradient]
    D --> E[Update Features]
    E --> F[Better Predictions]
```

For all movies:

$$\min_{x^{(1)},\dots,x^{(n_m)}} \sum_{i=1}^{n_m} \sum_{j:r(i,j)=1} \left(\theta^{(j)T}x^{(i)} - y_{ij}\right)^2 + \frac{\lambda}{2}\sum_{i=1}^{n_m}\sum_{k=1}^{n}\left(x_k^{(i)}\right)^2$$
Each feature is then updated by gradient descent:

New Feature = Old Feature − Learning Rate × Gradient

We are updating:

$$x_k^{(i)}$$

Predicted rating:

$$(\theta^{(j)})^T x^{(i)}$$

This is:

User Preferences · Movie Features = Predicted Rating

Error Term

The prediction error is:

$$(\theta^{(j)})^T x^{(i)} - y^{(i,j)}$$

i.e., the predicted rating minus the actual rating (the cost squares this difference).

If the:

  • error is large → update more
  • error is small → update less

Regularization ($\lambda$)

$$\lambda x_k^{(i)}$$

Prevents features from becoming too large.

Helps reduce overfitting.
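Putting the error and regularization terms together, one vectorized gradient step over all movie features might look like this. A sketch: the function name and `alpha` (the learning rate) are my additions, and the conventional 1/2 factor on the squared error is absorbed into `alpha`:

```python
import numpy as np

def update_movie_features(X, Theta, Y, R, lam, alpha):
    """One gradient-descent step on every movie feature vector x^(i)."""
    E = (X @ Theta.T - np.nan_to_num(Y)) * R   # prediction errors, 0 where unrated
    grad = E @ Theta + lam * X                 # error term + regularization term
    return X - alpha * grad                    # new = old - learning_rate * gradient
```

Large errors produce large entries in `E` and therefore big updates; small errors barely move the features, exactly as described above.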


Collaborative Filtering Algorithm

1. Initialize $x$ and $\theta$ randomly

Initialize with small random values:

$$x^{(i)}, \quad \theta^{(j)}$$

We do this to break symmetry, so the algorithm can learn distinct features.

2. Minimize the cost function $J(x, \theta)$

  • Estimate movie features: fix $\theta$, learn $x$
  • Estimate user preferences: fix $x$, learn $\theta$
  • Repeat until convergence.

If we:

  • fix $x$ and minimize $J$ w.r.t. $\theta$, we recover the user learning problem.
  • fix $\theta$ and minimize $J$ w.r.t. $x$, we recover the movie feature learning problem.

Instead of alternating between them, we optimize both together.

Minimize cost with:

  • Gradient Descent
  • Advanced optimizers (e.g., Conjugate Gradient, L-BFGS)
The Cost Function $J(x, \theta)$

We combine both learning problems into a single cost function:

$$J(x, \theta) = \frac{1}{2} \sum_{(i,j):r(i,j)=1} \left(\theta^{(j)T}x^{(i)} - y^{(i,j)}\right)^2 + \frac{\lambda}{2} \sum_{j=1}^{n_u}\sum_{k=1}^{n}\left(\theta_k^{(j)}\right)^2 + \frac{\lambda}{2} \sum_{i=1}^{n_m}\sum_{k=1}^{n}\left(x_k^{(i)}\right)^2$$

Where:

  • $y^{(i,j)}$ = rating user $j$ gave movie $i$
  • $r(i,j) = 1$ if a rating exists, otherwise $0$
  • $x^{(i)}$ = feature vector for movie $i$
  • $\theta^{(j)}$ = parameter vector for user $j$

This objective:

  • penalizes prediction error
  • regularizes user parameters
  • regularizes movie features
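This objective translates term by term into NumPy. A sketch, where `R` is the boolean indicator matrix for $r(i,j)$ and the function name is mine:

```python
import numpy as np

def cofi_cost(X, Theta, Y, R, lam):
    """Joint cost J(x, theta): squared error on observed ratings + both L2 penalties."""
    E = (X @ Theta.T - np.nan_to_num(Y)) * R       # only observed cells contribute
    return (np.sum(E ** 2) / 2                     # prediction error term
            + (lam / 2) * np.sum(Theta ** 2)       # regularize user parameters
            + (lam / 2) * np.sum(X ** 2))          # regularize movie features
```

Masking by `R` before summing is what restricts the error term to cells where $r(i,j) = 1$.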

3. Rating Prediction

Once the model is trained, predicted rating:

$$\hat{y}^{(i,j)} = \theta^{(j)T}x^{(i)}$$

If user $j$ has not rated movie $i$, we predict their rating using this value.

4. Result

The algorithm learns:

  • movie feature vectors $x^{(i)}$
  • user preference vectors $\theta^{(j)}$

from the rating matrix alone, without manually defining movie features.
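End to end, the whole recipe (random initialization, joint gradient descent on $J$, then prediction) fits in a short script. Everything below, including the toy data, hyperparameters, and iteration count, is my illustration rather than the post's code:

```python
import numpy as np

# Toy ratings (movies x users); np.nan = unrated.
Y = np.array([[5.0, 4.0, 1.0],
              [5.0, np.nan, 1.0],
              [1.0, 1.0, 5.0]])
R = ~np.isnan(Y)
Yf = np.nan_to_num(Y)

n, lam, alpha = 2, 0.02, 0.02           # latent dim, regularization, learning rate
rng = np.random.default_rng(42)
X = rng.normal(scale=0.5, size=(Y.shape[0], n))      # step 1: random init
Theta = rng.normal(scale=0.5, size=(Y.shape[1], n))

for _ in range(5000):                   # step 2: minimize J(x, theta) jointly
    E = (X @ Theta.T - Yf) * R          # errors on observed ratings only
    X_grad = E @ Theta + lam * X        # gradient w.r.t. movie features
    Theta_grad = E.T @ X + lam * Theta  # gradient w.r.t. user preferences
    X -= alpha * X_grad                 # simultaneous update of both
    Theta -= alpha * Theta_grad

pred = X @ Theta.T                      # step 3: predicted rating matrix
missing = pred[1, 1]                    # step 4: the unrated cell, now filled in
```

Both parameter sets move on every iteration, which is the "optimize both together" idea; in practice the same objective is handed to an advanced optimizer such as L-BFGS instead of hand-rolled gradient descent.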

