Machine Learning: Introduction and Core Algorithms

Beginner-friendly introduction to machine learning, covering key concepts, model types, supervised and unsupervised learning, and essential algorithms such as linear regression, logistic regression, decision trees, and clustering.

Written by Hitesh Sahu, a passionate developer and blogger.

Tue Feb 24 2026

Machine Learning 🤖

AI

AI is the field of study of "intelligent agents": any device that perceives its environment and takes actions that maximize its chance of achieving its goals.

ML

ML is the study of computer algorithms that improve automatically through experience.

  • ML is a subset of AI
  • It learns from data
  • It improves performance (P) with experience (E) while performing a task (T)

Older definition -- Arthur Samuel (1959)

The field of study that gives computers the ability to learn without being explicitly programmed.

Modern definition -- Tom Mitchell (1997)

A computer program is said to learn from experience E with respect to some task T and performance measure P if its performance on T, as measured by P, improves with experience E.

Example: spam filtering

  • E (Experience): user-labeled emails
  • T (Task): classify emails as spam or not spam
  • P (Performance measure): fraction of correctly classified emails
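As a minimal sketch, the performance measure P from the spam example can be computed as plain accuracy. The function name `accuracy` and the sample labels are made up for illustration; they are not from the original post.

```python
def accuracy(predictions, labels):
    """Performance measure P: fraction of emails classified correctly."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and equal in length")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


# Hypothetical data: 1 = spam, 0 = not spam.
labels = [1, 0, 1, 1, 0]       # experience E: user-provided labels
predictions = [1, 0, 0, 1, 0]  # task T: the model's spam / not-spam decisions
p = accuracy(predictions, labels)  # 4 of 5 predictions match the labels
print(p)  # 0.8
```

If P rises as the model sees more labeled emails, the program is learning in Mitchell's sense.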

Use Cases

ML is powerful when:

  • Handling problems too complex to hard-code
  • Finding hidden patterns in large datasets

1. Large Datasets Exist

  • Web Analytics data
  • Medical records
  • Biological data

2. Problems Are Hard to Hand-Code

  • Autonomous driving
  • Handwriting recognition
  • Natural language processing (NLP)
  • Computer vision

3. Self-Customizing Systems

  • Amazon recommendations
  • Netflix recommendations

Types of ML Algorithms

1. Supervised Learning

You give the algorithm input data and the correct outputs (“right answers”), and it learns to predict outputs for new inputs.

Training set:

  • You are given labeled data.

$(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \dots, (x^{(m)}, y^{(m)})$

Where:

  • $x^{(i)}$ is the input (features)

  • $y^{(i)}$ is the correct output (label)

  • Goal: Learn a function that maps inputs → outputs.

  • Example: Find a decision boundary separating positive and negative examples.

In supervised learning, we know the correct answers and train a model to predict them.

Examples

  • Spam filtering with labeled emails
  • Diabetes classification with labeled patients
  • Cancer type prediction

Types

1.1 Regression

Regression means predicting a continuous value output.

Example: Housing Price Prediction (Regression)

Predict the price of a house based on its size.

  • Feature (x): House size (square feet)
  • Output (y): Price (continuous value)

We are given historical data:

| Size (sq ft) | Price ($) |
|--------------|-----------|
| 1000         | 200000    |
| 1500         | 300000    |
| 2000         | 400000    |

The algorithm may:

  • Fit a straight line (Linear Regression)
  • Fit a quadratic curve (Polynomial Regression)

Different models may produce different predictions.
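The straight-line fit can be sketched with ordinary least squares on the three houses from the table. The helper name `fit_line` and the 1250 sq ft query are illustrative assumptions, not part of the original.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


sizes = [1000, 1500, 2000]         # feature x: house size in sq ft
prices = [200000, 300000, 400000]  # output y: price in dollars

slope, intercept = fit_line(sizes, prices)
predicted = slope * 1250 + intercept  # price estimate for a 1250 sq ft house
print(slope, intercept, predicted)  # 200.0 0.0 250000.0
```

The sample data lies exactly on a line, so the fit is perfect here. On real, noisy data a line and a quadratic curve would generally give different predictions.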

1.2 Classification

Classification means predicting a discrete category as output.

  • We train using past labeled examples.
  • The output is restricted to specific categories (for example, 0 or 1).

Example: Breast Cancer Detection (Classification)

What is the probability this tumor is malignant?

  • Malignant (1)
  • Benign (0)

Using One Feature

  • Feature: Tumor size
  • Output: 0 or 1

Even if there are multiple categories:

  • 0 → No cancer
  • 1 → Type 1 cancer
  • 2 → Type 2 cancer
  • 3 → Type 3 cancer

It is still classification because the output is from a finite set of categories.

Multiple Features

In real problems, we use more than one feature:

  • Tumor size
  • Age
  • Clump thickness
  • Uniformity of cell size
  • Uniformity of cell shape

The algorithm learns a decision boundary that separates categories.
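For the one-feature tumor example, logistic regression is one model that outputs a probability of malignancy and a decision boundary. The sketch below trains it with plain gradient descent. The tumor sizes, labels, learning rate, and step count are all made-up assumptions for illustration.

```python
import math


def sigmoid(z):
    """Map any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, learning_rate=0.1, steps=5000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y  # prediction minus label
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b


# Hypothetical tumor sizes (cm) with labels: 0 = benign, 1 = malignant.
tumor_sizes = [1.0, 1.5, 2.0, 2.5, 3.5, 4.0, 4.5, 5.0]
is_malignant = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = train_logistic(tumor_sizes, is_malignant)


def probability_malignant(size):
    return sigmoid(w * size + b)


def classify(size):
    # The decision boundary is the size at which the probability crosses 0.5.
    return 1 if probability_malignant(size) >= 0.5 else 0


print(classify(1.0), classify(5.0))
```

With more features, such as age or clump thickness, the same idea applies with one weight per feature, and the boundary becomes a line or surface instead of a single threshold.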


2. Unsupervised Learning

There are no labeled outputs. The system tries to find structure in the data.

Training set:

  • You are given unlabeled data.
  • No labels
  • No correct answers
  • No predefined categories

$x^{(1)}, x^{(2)}, \dots, x^{(m)}$

Where:

  • $x^{(i)}$ is the input (features)
  • There are no $y$ labels.

Goal

Discover hidden structure in the data.

"Here is the data. Can you find structure in it?"

We do not tell the algorithm what the correct output is; we ask it to find patterns on its own.

  • Common task: clustering
  • Advanced example: the cocktail party problem (blind source separation)

2.1 Clustering

The algorithm automatically groups similar data points together.

  • Used to find patterns

We are not told:

  • How many groups exist
  • What the groups represent
  • Which example belongs to which group

The algorithm discovers that on its own.

Examples

  • Given market data, identify patterns in buying behavior
  • Given news articles, find topics
  • Given data center logs, find machines that frequently work together
  • Given social network data, find groups or communities
  • Given customer data, find market segments
  • Given astronomical data, find galaxies
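As a rough sketch of how grouping can work, here is a small k-means implementation. k-means is one common clustering algorithm; the original post does not name one. Unlike the general description above, k-means needs the number of clusters `k` chosen in advance. The 2D points and the first-`k`-points initialization are assumptions made to keep the example deterministic.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, iterations=10):
    # Deterministic start for this sketch: the first k points are the centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = mean_point(members)
    return centroids, labels


# Hypothetical unlabeled data: two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The algorithm is never told which point belongs to which group. The labels `0` and `1` are arbitrary cluster indices, not meaningful categories.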

2.2 Blind Source Separation

Separating mixed signals into original independent components.

The Cocktail Party Problem

Several people speak at the same time, and each microphone in the room records a different mixture of their voices. Given only the mixed recordings, the algorithm must:

  • Detect that multiple sources exist
  • Separate them into independent signals
  • Recover the original voices

No labels are given:

  • We do not tell the algorithm what each voice sounds like
  • It discovers structure in the signal
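The cocktail party setup is often written as a linear mixing model. This formalization is an assumption added for illustration; the symbols $s$, $x$, $A$, and $W$ do not appear in the original. With $n$ independent source signals and $n$ microphones:

$$
x(t) = A\, s(t), \qquad \hat{s}(t) = W\, x(t), \qquad W \approx A^{-1}
$$

Here $s(t)$ holds the unknown original voices, $x(t)$ holds the observed recordings, and $A$ is the unknown mixing matrix. A method such as independent component analysis (ICA) estimates an unmixing matrix $W$ from $x(t)$ alone, by making the components of $\hat{s}(t)$ as statistically independent as possible. The sources can typically be recovered only up to their order and scale.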

Difference Between Supervised and Unsupervised Learning

| Aspect | Supervised Learning | Unsupervised Learning |
|--------|---------------------|-----------------------|
| Data | Labeled data (input + correct output) | Unlabeled data (input only) |
| Goal | Learn mapping from input → output | Discover hidden structure or patterns |
| Output Type | Continuous (regression) or discrete (classification) | Clusters, groups, latent structure |
| Example Problem | House price prediction | Customer segmentation |
| Example Problem | Spam detection | Grouping news articles |
| Human Guidance | Requires correct answers during training | No correct answers provided |
| Typical Tasks | Regression, Classification | Clustering, Dimensionality Reduction |
| Evaluation | Compare predictions with true labels | Evaluate structure quality (e.g., cohesion, separation) |
| Use Case | When you know what you want to predict | When you want to explore unknown patterns |

3. Reinforcement Learning

4. Recommender Systems
