Collaborative Filtering: Building Recommender Systems with Feature Learning
Learn how collaborative filtering powers modern recommender systems by simultaneously learning user preferences and item features from rating data. Understand the optimization objective, matrix factorization approach, and how gradient-based methods enable scalable recommendations.
Collaborative Filtering
Recommender system technique that learns both user preferences and item features automatically from rating data.
Unlike content-based methods, we do not know the features of movies beforehand.
Collaborative filtering learns hidden features of users and items so it can predict missing ratings and recommend things people will likely enjoy.
Why It’s Called Collaborative
Many users rate movies.
Their ratings collaboratively help the system learn features.
Result:
- Better movie representations
- Better recommendations for everyone
Key Idea
People with similar tastes tend to like similar things.
Collaborative filtering simultaneously learns:
- user preferences
- item features
directly from the rating matrix, without manually defining features.
Movie Feature Matrix
| Movie | Romance | Action |
|---|---|---|
| Titanic | 0.9 | 0.1 |
| Notebook | 0.95 | 0.05 |
| Avengers | 0.1 | 0.9 |
| John Wick | 0.05 | 0.95 |
where:
- $x_1$ = romance level
- $x_2$ = action level

From this we infer, for example:
- Titanic ($x_1 = 0.9$) is romantic
- Titanic ($x_2 = 0.1$) is not an action movie
No Intercept Term
Unlike previous models:
- We remove the intercept feature $x_0 = 1$.
Reason: since the algorithm learns all features automatically, it can learn a constant feature itself if needed.
User Movie Rating Matrix
| User | Titanic | The Notebook | Avengers | John Wick |
|---|---|---|---|---|
| Alice | ⭐⭐⭐⭐⭐ | ⭐ | ⭐ | ⭐ |
| Bob | ⭐⭐⭐⭐ | ? | ⭐ | ⭐ |
| Carol | ⭐ | ⭐ | ⭐ | ⭐⭐⭐⭐⭐ |
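In code, a rating matrix like the one above is commonly stored as a ratings array together with an indicator of which entries exist. A minimal sketch (the 0-for-missing convention is just one illustrative choice):

```python
import numpy as np

# Ratings as a (movies x users) array; 0 marks "not yet rated".
# Rows: Titanic, The Notebook, Avengers, John Wick. Cols: Alice, Bob, Carol.
Y = np.array([
    [5, 4, 1],
    [1, 0, 1],   # Bob's missing rating for The Notebook
    [1, 1, 1],
    [1, 1, 5],
])

# Indicator r(i, j): 1 where user j rated movie i, 0 otherwise.
R = (Y > 0).astype(int)
```

The indicator lets the learning algorithm skip missing entries instead of treating them as zero-star ratings.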
Prediction of user $j$ rating movie $i$: $(\theta^{(j)})^T x^{(i)}$
User Preferences Matrix
| User | Likes Romance | Likes Action |
|---|---|---|
| Alice | 0.95 | 0.05 |
| Bob | 0.85 | 0.15 |
| Carol | 0.05 | 0.95 |
- These features are not manually defined.
- The algorithm learns them from ratings.
Observation
- Alice and Bob have similar taste
- Both dislike action movies
- Both like romantic movies
So if Bob has not rated The Notebook, we can predict:
- Bob will probably rate it highly.
The algorithm uses behavior of other users to predict what someone will like.
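To make this concrete, here is a sketch using the illustrative feature and preference values from the tables above (these numbers are made up, not learned). Predicted ratings are dot products $\theta^{(j)\top} x^{(i)}$, scaled by 5 purely to map onto the star scale:

```python
import numpy as np

# Movie feature matrix X, one row per movie: [romance, action].
X = np.array([
    [0.90, 0.10],   # Titanic
    [0.95, 0.05],   # The Notebook
    [0.10, 0.90],   # Avengers
    [0.05, 0.95],   # John Wick
])

# User preference matrix Theta, one row per user: [likes romance, likes action].
Theta = np.array([
    [0.95, 0.05],   # Alice
    [0.85, 0.15],   # Bob
    [0.05, 0.95],   # Carol
])

# Predicted rating of user j for movie i is theta^(j) . x^(i);
# the factor of 5 is only an illustrative rescaling to stars.
pred = 5 * Theta @ X.T          # shape (3 users, 4 movies)

bob_notebook = pred[1, 1]
print(f"Bob's predicted rating for The Notebook: {bob_notebook:.2f}")
```

Because Bob's preferences resemble Alice's, his predicted rating for The Notebook comes out high, while his predicted ratings for the action movies stay low.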
Learning Movie Features
If the user parameters $\theta^{(1)}, \dots, \theta^{(n_u)}$ are known, we can learn the feature vector $x^{(i)}$ of movie $i$.

Minimize prediction error:

$$\min_{x^{(i)}} \; \frac{1}{2} \sum_{j : r(i,j) = 1} \left( (\theta^{(j)})^T x^{(i)} - y^{(i,j)} \right)^2 + \frac{\lambda}{2} \sum_{k=1}^{n} \left( x_k^{(i)} \right)^2$$

Where:
- $y^{(i,j)}$ = actual rating user $j$ gave movie $i$
- $r(i,j) = 1$ if rating exists
- $\lambda$ = regularization parameter
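This per-movie objective can be sketched in code as follows (symbols as defined above; the concrete numbers in the check are made up for illustration):

```python
import numpy as np

def movie_cost(x_i, Theta, y_i, r_i, lam):
    """Regularized squared error for a single movie's feature vector x_i.

    x_i   : (n,) feature vector for movie i
    Theta : (n_u, n) user parameter vectors, one row per user
    y_i   : (n_u,) ratings given to movie i (only entries with r_i == 1 count)
    r_i   : (n_u,) 1 where user j rated movie i, else 0
    lam   : regularization strength lambda
    """
    errors = (Theta @ x_i - y_i) * r_i          # zero out missing ratings
    return 0.5 * np.sum(errors ** 2) + 0.5 * lam * np.sum(x_i ** 2)

# Toy check: two users, two features (hypothetical values).
Theta = np.array([[1.0, 0.0], [0.0, 1.0]])
y_i = np.array([5.0, 1.0])
r_i = np.array([1, 0])                          # second user has not rated
J = movie_cost(np.array([4.0, 0.0]), Theta, y_i, r_i, lam=0.1)
```

Multiplying the errors by `r_i` implements the sum over only those $j$ with $r(i,j) = 1$.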
Learning All Movie Features
For all movies:

$$\min_{x^{(1)}, \dots, x^{(n_m)}} \; \frac{1}{2} \sum_{i=1}^{n_m} \sum_{j : r(i,j) = 1} \left( (\theta^{(j)})^T x^{(i)} - y^{(i,j)} \right)^2 + \frac{\lambda}{2} \sum_{i=1}^{n_m} \sum_{k=1}^{n} \left( x_k^{(i)} \right)^2$$
Collaborative Filtering Algorithm
Chicken-and-Egg Problem
Previously we saw two ideas:
- If movie features $x^{(1)}, \dots, x^{(n_m)}$ are known, we can learn user parameters $\theta^{(1)}, \dots, \theta^{(n_u)}$.
- If user parameters are known, we can learn the movie features.
Instead of alternating between them, collaborative filtering learns both simultaneously.
1. Initialize randomly.

Initialize $x^{(1)}, \dots, x^{(n_m)}$ and $\theta^{(1)}, \dots, \theta^{(n_u)}$ with small random values.
2. Minimize the cost function
- Estimate movie features: Fix $\theta$, learn $x$
- Estimate user preferences: Fix $x$, learn $\theta$
- Repeat until convergence.
If we:
- fix $x$ and minimize w.r.t. $\theta$, we recover the user learning problem.
- fix $\theta$ and minimize w.r.t. $x$, we recover the movie feature learning problem.
Instead of alternating between them, we optimize both together.
Minimize cost with:
- Gradient Descent
- Advanced optimizers (e.g., Conjugate Gradient, L-BFGS)
We combine both learning problems into a single cost function:

$$J(x^{(1)}, \dots, x^{(n_m)}, \theta^{(1)}, \dots, \theta^{(n_u)}) = \frac{1}{2} \sum_{(i,j) : r(i,j) = 1} \left( (\theta^{(j)})^T x^{(i)} - y^{(i,j)} \right)^2 + \frac{\lambda}{2} \sum_{i=1}^{n_m} \sum_{k=1}^{n} \left( x_k^{(i)} \right)^2 + \frac{\lambda}{2} \sum_{j=1}^{n_u} \sum_{k=1}^{n} \left( \theta_k^{(j)} \right)^2$$

Where:
- $y^{(i,j)}$ = rating user $j$ gave movie $i$
- $r(i,j) = 1$ if rating exists, otherwise $0$
- $x^{(i)}$ = feature vector for movie $i$
- $\theta^{(j)}$ = parameter vector for user $j$
This objective:
- penalizes prediction error
- regularizes user parameters
- regularizes movie features
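The combined objective can be written as a single NumPy function. A minimal sketch (the sanity-check values at the end are made up):

```python
import numpy as np

def collab_cost(X, Theta, Y, R, lam):
    """Combined objective J over all movie features X and user parameters Theta.

    X     : (n_m, n) movie feature vectors, one row per movie
    Theta : (n_u, n) user parameter vectors, one row per user
    Y     : (n_m, n_u) ratings; Y[i, j] = rating user j gave movie i
    R     : (n_m, n_u) indicator; R[i, j] = 1 where a rating exists
    lam   : regularization strength lambda
    """
    err = (X @ Theta.T - Y) * R                  # error on observed entries only
    return (0.5 * np.sum(err ** 2)
            + 0.5 * lam * np.sum(X ** 2)         # regularize movie features
            + 0.5 * lam * np.sum(Theta ** 2))    # regularize user parameters

# Tiny sanity check: a perfect fit costs only the regularization penalty.
X = np.array([[1.0, 0.0], [0.0, 1.0]])
Theta = np.array([[1.0, 0.0], [0.0, 1.0]])
Y = X @ Theta.T                                  # ratings match predictions exactly
R = np.ones_like(Y)
J = collab_cost(X, Theta, Y, R, lam=0.1)
```

Note that one matrix product `X @ Theta.T` computes every prediction $(\theta^{(j)})^T x^{(i)}$ at once, and the elementwise multiply by `R` restricts the error term to the observed ratings.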
3. Rating Prediction
Once the model is trained, the predicted rating of user $j$ for movie $i$ is:

$$(\theta^{(j)})^T x^{(i)}$$

If user $j$ has not rated movie $i$, we predict their rating using this value.
4. Result
The algorithm learns:
- movie feature vectors $x^{(i)}$
- user preference vectors $\theta^{(j)}$
from the rating matrix alone, without manually defining movie features.
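Putting the pieces together, here is a minimal end-to-end sketch: random initialization, joint batch gradient descent on the combined objective, then prediction of a missing rating. All data and hyperparameters are toy values chosen for illustration; a real system would use far more data and an advanced optimizer such as L-BFGS:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy rating matrix (rows = movies, cols = users); 0 marks "not rated".
Y = np.array([
    [5.0, 4.0, 1.0],
    [5.0, 0.0, 1.0],   # user 1 (col 1) has not rated this movie
    [1.0, 1.0, 5.0],
])
R = (Y > 0).astype(float)

n_m, n_u = Y.shape
n = 2                                # number of latent features (a choice)
lam, alpha = 0.1, 0.02               # regularization and learning rate

# 1. Initialize X and Theta with small random values.
X = 0.1 * rng.standard_normal((n_m, n))
Theta = 0.1 * rng.standard_normal((n_u, n))

def cost(X, Theta):
    err = (X @ Theta.T - Y) * R
    return 0.5 * np.sum(err ** 2) + 0.5 * lam * (np.sum(X ** 2) + np.sum(Theta ** 2))

initial = cost(X, Theta)

# 2. Joint gradient descent on the combined objective.
for _ in range(5000):
    err = (X @ Theta.T - Y) * R          # zeros where no rating exists
    X_grad = err @ Theta + lam * X       # dJ/dX
    Theta_grad = err.T @ X + lam * Theta # dJ/dTheta
    X -= alpha * X_grad
    Theta -= alpha * Theta_grad

final = cost(X, Theta)

# 3. Predict the missing rating (movie 1, user 1) as theta^T x.
pred = X[1] @ Theta[1]
print(f"cost {initial:.1f} -> {final:.3f}, predicted missing rating {pred:.2f}")
```

Because the second movie's observed ratings mirror the first movie's, the learned feature vectors end up similar, and the interpolated rating for the missing entry lands near the neighbor's value.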
