Recommender Systems: Collaborative Filtering, Content-Based Filtering, and Hybrid Approaches
Comprehensive guide to recommender systems, covering collaborative filtering, content-based filtering, and hybrid approaches, with practical implementation examples and best practices for building effective recommendation engines.
Recommender Systems 🍿
A recommender system predicts how users would rate items they have not yet rated.
Example:
- $n_u$ = number of users 👤
- $n_m$ = number of movies 📺
Users rate some movies (1–5 stars), but many ratings are missing.
Goal: predict the missing ratings.
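A minimal sketch of how such a ratings table might be stored, assuming NumPy and using `np.nan` for a missing rating. The rating values and the names `Y` and `R` are illustrative assumptions, not data from this post.

```python
import numpy as np

# Hypothetical ratings (1-5 stars): rows are movies, columns are users.
# np.nan marks a movie the user has not rated yet.
Y = np.array([
    [5.0, 5.0, 1.0, 1.0],
    [5.0, np.nan, np.nan, 1.0],
    [np.nan, 4.0, 1.0, np.nan],
    [1.0, 1.0, 5.0, 4.0],
    [1.0, 1.0, 5.0, np.nan],
])

n_m, n_u = Y.shape        # number of movies, number of users
R = ~np.isnan(Y)          # R[i, j] is True where user j has rated movie i

print("movies:", n_m, "users:", n_u)
print("known ratings:", int(R.sum()))
print("ratings to predict:", int((~R).sum()))
```

Keeping a separate boolean mask makes it easy to restrict later computations to the ratings that actually exist.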
Content-Based Recommendation 🎬
The system recommends items whose content matches the user's preferences.
Predict ratings based on movie features ($x$) and user preferences ($\theta$).

1. Feature vector 📺
Each movie has features describing its content.
Each movie $i$ is represented by a feature vector: $x^{(i)}$
Example:
- $x_1$ = romance
- $x_2$ = action
- $x_3$ = genre
- $x_4$ = actors
- $x_n$ = etc.
The first element is the bias feature: $x_0 = 1$
If we have $n$ features, then: $x^{(i)} \in \mathbb{R}^{n+1}$

Example movie features:
| Movie | Romance ($x_1$) | Action ($x_2$) |
|---|---|---|
| Love at Last | 0.9 | 0 |
| Romance Forever | 1.0 | 0.01 |
| Swords vs Karate | 0 | 0.9 |
| Nonstop Car Chases | 0.1 | 1.0 |
| Cute Puppies of Love | 0.99 | 0 |
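
The table can be turned into a feature matrix with the bias feature prepended. This is a small sketch assuming NumPy; the names `features` and `X` are illustrative, and the bias value of 1 follows the usual linear regression convention.

```python
import numpy as np

movies = ["Love at Last", "Romance Forever", "Swords vs Karate",
          "Nonstop Car Chases", "Cute Puppies of Love"]

# Columns: romance (x1), action (x2), copied from the table above.
features = np.array([
    [0.9, 0.0],
    [1.0, 0.01],
    [0.0, 0.9],
    [0.1, 1.0],
    [0.99, 0.0],
])

# Prepend the bias feature x0 = 1 to every movie.
X = np.hstack([np.ones((features.shape[0], 1)), features])

print(X.shape)                      # 5 movies, n + 1 = 3 entries each
print(movies[4], "->", X[4])        # feature vector including the bias term
```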
2. User Preference Model 👤
Each user $j$ has their own parameter vector: $\theta^{(j)} \in \mathbb{R}^{n+1}$
It represents the user's preferences for features.
Example:
- Alice likes romance → high weight on $\theta_1$
- Bob likes action → high weight on $\theta_2$

Rating Prediction
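Predicted rating of movie $i$ for user $j$:
$$\hat{y}^{(i,j)} = (\theta^{(j)})^T x^{(i)}$$
$y^{(i,j)}$ = actual rating by user $j$ on movie $i$ (if defined)
where:
- $\theta^{(j)}$ = user preference vector
- $x^{(i)}$ = movie feature vector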
This is just linear regression.
Example
Movie: Cute Puppies of Love
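Feature vector: $x = [1, 0.99, 0]^T$
Alice's preference vector: $\theta = [0, 5, 0]^T$
Prediction: $\theta^T x = 0 \cdot 1 + 5 \cdot 0.99 + 0 \cdot 0$
Result: $4.95$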
Predicted rating ≈ 5 stars.
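The arithmetic can be checked in a few lines of NumPy. The feature vector comes from the table (with the bias feature in front); the preference vector `[0, 5, 0]` is an assumed value for a user who strongly prefers romance, chosen only for illustration.

```python
import numpy as np

x = np.array([1.0, 0.99, 0.0])      # bias, romance, action for "Cute Puppies of Love"
theta = np.array([0.0, 5.0, 0.0])   # assumed preferences: all weight on romance

prediction = theta @ x              # dot product theta^T x
print(f"predicted rating: {prediction:.2f}")
```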
Rating Prediction for single User
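Let
- $i$ = movie/item index
- $j$ = user index
Where:
- $m^{(j)}$ = number of movies rated by user $j$
Rating indicator for movie $i$ and user $j$:
$r(i,j) = 1$ if user $j$ has rated movie $i$
$r(i,j) = 0$ if user $j$ has not rated movie $i$
$y^{(i,j)}$ = actual rating of movie $i$ given by user $j$ (defined only if $r(i,j) = 1$)
$\theta^{(j)}$ = parameter vector for user $j$, describing the kind of movies user $j$ likes
Predicted Rating
For user $j$ and movie $i$, the predicted rating is $(\theta^{(j)})^T x^{(i)}$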
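Predictions for every movie and user pair can be computed with one matrix product, then filtered down to the pairs that have no rating yet. This is a sketch assuming NumPy; the user parameters in `Theta` and the rating indicator `R` are made-up values for illustration.

```python
import numpy as np

# Movie features with the bias column first (romance and action values from the table).
X = np.array([
    [1.0, 0.9, 0.0],    # Love at Last
    [1.0, 1.0, 0.01],   # Romance Forever
    [1.0, 0.0, 0.9],    # Swords vs Karate
    [1.0, 0.1, 1.0],    # Nonstop Car Chases
    [1.0, 0.99, 0.0],   # Cute Puppies of Love
])

# Assumed parameter vectors, one row per user (illustrative values only).
Theta = np.array([
    [0.0, 5.0, 0.0],    # user 0: strong romance preference
    [0.0, 0.0, 5.0],    # user 1: strong action preference
])

# Assumed rating indicator: R[i, j] is True if user j has rated movie i.
R = np.array([
    [True, True],
    [True, False],
    [False, True],
    [True, True],
    [False, False],
])

predictions = X @ Theta.T            # predictions[i, j] = dot(Theta[j], X[i])
for i, j in zip(*np.nonzero(~R)):    # only the unrated (movie, user) pairs
    print(f"movie {i}, user {j}: {predictions[i, j]:.2f}")
```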
💰 Cost Function
We want predictions close to actual ratings.
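Predicted rating: $\hat{y}^{(i,j)} = (\theta^{(j)})^T x^{(i)}$
The prediction error can be calculated as $\hat{y}^{(i,j)} - y^{(i,j)}$
which is equal to $(\theta^{(j)})^T x^{(i)} - y^{(i,j)}$
For user $j$, minimize the squared error over the $m^{(j)}$ movies that user has rated (those with $r(i,j) = 1$):
$$\min_{\theta^{(j)}} \frac{1}{2m^{(j)}} \sum_{i:r(i,j)=1} \left((\theta^{(j)})^T x^{(i)} - y^{(i,j)}\right)^2$$
Goal:
- minimize prediction error
- regularize the parameters $\theta^{(j)}$
Cost function
$$J(\theta^{(j)}) = \frac{1}{2m^{(j)}} \sum_{i:r(i,j)=1} \left((\theta^{(j)})^T x^{(i)} - y^{(i,j)}\right)^2 + \frac{\lambda}{2m^{(j)}} \sum_{k=1}^{n} \left(\theta_k^{(j)}\right)^2$$
First term = mean squared error over the movies rated by user $j$.
Second term = regularization (prevents overfitting).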
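A sketch of the single-user cost in NumPy, assuming the usual convention that the bias parameter is left out of the regularization term. The function name `user_cost`, the ratings, and the parameter values are hypothetical.

```python
import numpy as np

def user_cost(theta, X_rated, y, lam):
    """Regularized squared-error cost for one user.

    X_rated: (m, n + 1) features of the m movies this user rated, bias column first
    y:       (m,) ratings the user gave to those movies
    theta:   (n + 1,) parameter vector of the user
    lam:     regularization strength (lambda)
    """
    m = len(y)
    errors = X_rated @ theta - y
    mse_term = (errors @ errors) / (2 * m)
    reg_term = lam / (2 * m) * np.sum(theta[1:] ** 2)   # theta[0] (bias) is not regularized
    return mse_term + reg_term

# Three movies the user has rated (features from the table, ratings assumed).
X_rated = np.array([[1.0, 0.9, 0.0], [1.0, 1.0, 0.01], [1.0, 0.0, 0.9]])
y = np.array([5.0, 5.0, 1.0])
theta = np.array([0.0, 5.0, 0.0])

print("cost without regularization:", user_cost(theta, X_rated, y, lam=0.0))
print("cost with lambda = 1:", user_cost(theta, X_rated, y, lam=1.0))
```

With a larger `lam`, large weights are penalized more heavily, so the same `theta` yields a higher cost.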
All Users
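We learn the parameters $\theta^{(1)}, \dots, \theta^{(n_u)}$ of all $n_u$ users together:
$$J(\theta^{(1)}, \dots, \theta^{(n_u)}) = \frac{1}{2} \sum_{j=1}^{n_u} \sum_{i:r(i,j)=1} \left((\theta^{(j)})^T x^{(i)} - y^{(i,j)}\right)^2 + \frac{\lambda}{2} \sum_{j=1}^{n_u} \sum_{k=1}^{n} \left(\theta_k^{(j)}\right)^2$$
The per-user factor $1/m^{(j)}$ is dropped here: it is a positive constant for each user, so it does not change which $\theta^{(j)}$ minimizes the cost.
Minimize this to learn all user preferences.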
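The all-users cost can be sketched in vectorized form. This version assumes the common formulation that sums squared errors over rated pairs with a factor of 1/2 and no per-user averaging, and it skips the bias column when regularizing. The function name `total_cost` and all data values are illustrative.

```python
import numpy as np

def total_cost(Theta, X, Y, R, lam):
    """Squared-error cost summed over all users and their rated movies.

    Theta: (n_u, n + 1) user parameters, X: (n_m, n + 1) movie features,
    Y: (n_m, n_u) ratings (np.nan where missing), R: (n_m, n_u) boolean mask.
    """
    # Errors only where a rating exists; unrated pairs contribute zero.
    errors = np.where(R, X @ Theta.T - Y, 0.0)
    reg = lam / 2 * np.sum(Theta[:, 1:] ** 2)   # bias column is not regularized
    return 0.5 * np.sum(errors ** 2) + reg

X = np.array([[1.0, 0.9, 0.0], [1.0, 1.0, 0.01], [1.0, 0.0, 0.9],
              [1.0, 0.1, 1.0], [1.0, 0.99, 0.0]])
Theta = np.array([[0.0, 5.0, 0.0],     # assumed: user 0 prefers romance
                  [0.0, 0.0, 5.0]])    # assumed: user 1 prefers action
Y = np.array([[5.0, 1.0], [5.0, np.nan], [np.nan, 5.0],
              [1.0, 5.0], [np.nan, np.nan]])
R = ~np.isnan(Y)

print("cost, lambda = 0:", total_cost(Theta, X, Y, R, lam=0.0))
print("cost, lambda = 1:", total_cost(Theta, X, Y, R, lam=1.0))
```

Using `np.where` instead of multiplying by the mask avoids propagating `nan` from the missing entries of `Y`.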
Cost Optimization
Parameters are learned using:
- gradient descent
- or advanced optimizers (L-BFGS, conjugate gradient)
Updates look similar to linear regression updates.
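As a rough illustration, batch gradient descent for a single user might look like the following. It assumes the regularized squared-error cost averaged over the user's rated movies, an unregularized bias parameter, and hand-picked values for the learning rate `alpha`, `lam`, and the ratings. None of these come from the post.

```python
import numpy as np

def cost_and_grad(theta, X, y, lam):
    """Cost and gradient for one user; X holds only the movies the user rated."""
    m = len(y)
    errors = X @ theta - y
    reg = theta.copy()
    reg[0] = 0.0                                   # do not regularize the bias
    cost = (errors @ errors + lam * (reg @ reg)) / (2 * m)
    grad = (X.T @ errors + lam * reg) / m          # same form as linear regression
    return cost, grad

def gradient_descent(X, y, lam=0.1, alpha=0.1, steps=500):
    theta = np.zeros(X.shape[1])
    history = []
    for _ in range(steps):
        cost, grad = cost_and_grad(theta, X, y, lam)
        history.append(cost)
        theta = theta - alpha * grad               # simultaneous update of all parameters
    return theta, history

# Four rated movies: two romance films rated 5, two action films rated 1 (assumed ratings).
X = np.array([[1.0, 0.9, 0.0], [1.0, 1.0, 0.01], [1.0, 0.0, 0.9], [1.0, 0.1, 1.0]])
y = np.array([5.0, 5.0, 1.0, 1.0])

theta, history = gradient_descent(X, y)
print("learned theta:", np.round(theta, 2))
print("cost at start:", history[0], "cost at end:", history[-1])
```

The learned romance weight ends up larger than the action weight, matching a user who rates romance films highly.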
Limitation
Requires hand-crafted features for items (movies).
In many real systems:
- features are missing
- features are hard to define
This leads to the next method, collaborative filtering, described in the next post.
