AI for Students · Class 8 · Age 12–13 · Lesson 3 of 12

How AI Learns: Classification and Prediction 🎯

From sorting mangoes to predicting exam results — this lesson explains how AI learns to put things in categories and make predictions, the two tasks it does most often.

📘 Class 8 · Lesson 3 🕐 45–55 min 🚫 No coding needed 🆓 Free lesson
Illustrated scene: Indian student sorting cards into two bins labelled 'pass' and 'fail', representing AI classification

Story · Asha's Mango Sorting Machine

How Does a Machine Know Which Mango Is Ripe? 🥭

Asha, 13, from Rajahmundry, visited her aunt's mango orchard. Workers were sorting mangoes — ripe, unripe, or too ripe — by looking at colour, feeling firmness, and smelling the fruit.

A company had recently installed a sorting machine that used a camera and AI to sort the mangoes automatically. Asha was curious: "Does the machine smell them too?"

The engineer laughed. "Not smell — but it analyses colour and shape with 98% accuracy. We showed it 50,000 photos of mangoes with labels: ripe, unripe, overripe. After training, it sorts better than a tired worker at 4am."

Asha asked the important question: "What happens if it sees a mango variety it has never been trained on?" The engineer nodded. "Good question. It will probably make mistakes. That is why we keep adding new training data when we expand to new varieties."

👉 Asha has just seen classification in action. This lesson explains how it works — and its sibling task, prediction — in simple terms.
Section 1 of 7

📂 Classification vs Prediction — The Two Big Tasks

Much of what AI models do falls into one of two categories:

🏷️ Classification

Assign an input to one of a fixed set of categories. The output is a label.

Examples: spam/not spam · ripe/unripe/overripe · cat/dog/bird · disease/no disease · pass/fail

📈 Prediction (Regression)

Predict a number on a continuous scale. The output is a value.

Examples: predict exam score · predict tomorrow's rainfall · predict house price · predict energy consumption

Rule of thumb: If the output is a category (yes/no, type A/B/C), it is classification. If the output is a number on a scale, it is regression (prediction).
Section 2 of 7

🔢 How Classification Works: A Simple Walk-Through

Let us use a simple example: predicting whether a student will pass or fail, based on two features: hours studied per day and attendance percentage.

  1. Collect labelled data. You gather records of 1,000 past students, each with study hours, attendance, and whether they passed or failed.
  2. The model looks for a boundary. Imagine plotting students on a graph: x-axis = study hours, y-axis = attendance. Passing students cluster in one region; failing students in another. The model tries to find a line (or curve) that separates them.
  3. That line is the decision boundary. For a new student, the model checks which side of the boundary their data falls on and predicts accordingly.
  4. Confidence scores. Most models do not just say "pass"; they say "72% chance of passing." This probability is called a confidence score. Decisions made with low confidence (e.g. 51% vs 49%) should be treated with more caution.
  5. Test the boundary. The model is tested on students it has never seen, to check whether the boundary generalises to new data or is overfit, meaning it has only memorised the training data.
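Optional, for curious readers: steps 2 to 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions. The weights `W_HOURS`, `W_ATTENDANCE` and `BIAS` are made-up numbers picked by hand, whereas a real model would learn them from the labelled records. The function names are hypothetical too.

```python
import math

# Hypothetical weights, picked by hand for illustration only.
# A real model would learn these from labelled training data.
W_HOURS = 1.2        # how much each study hour per day counts
W_ATTENDANCE = 0.08  # how much each attendance percentage point counts
BIAS = -8.0          # shifts where the decision boundary sits


def pass_probability(hours, attendance):
    """Return the model's estimated probability (0 to 1) of passing."""
    score = W_HOURS * hours + W_ATTENDANCE * attendance + BIAS
    # The logistic function squeezes any score into the range 0..1.
    # A score of 0 lies on the decision boundary (probability 0.5).
    return 1 / (1 + math.exp(-score))


def classify(hours, attendance):
    """Return (label, confidence) for one student."""
    p = pass_probability(hours, attendance)
    if p >= 0.5:
        return "pass", p
    return "fail", 1 - p


for hours, attendance in [(4, 90), (1, 50), (2.5, 64)]:
    label, confidence = classify(hours, attendance)
    print(f"{hours} h/day, {attendance}% attendance -> {label} ({confidence:.0%} confident)")
```

The third student in the loop sits very close to the boundary, so the confidence is only just above 50%. That is the kind of prediction that deserves extra caution.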
Section 3 of 7

📊 Measuring How Good a Classifier Is

Saying a model is "95% accurate" sounds impressive. But accuracy alone can be misleading. Here is why.

Imagine 1,000 emails: 950 are not spam, 50 are spam. A model that simply labels everything as "not spam" would be 95% accurate — but useless at actually catching spam.
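Optional: the arithmetic behind this example can be checked with a short Python sketch. The "lazy" model here is hypothetical; it simply answers "not spam" every time.

```python
# 1,000 emails as described above: 950 not spam, 50 spam.
actual = ["not spam"] * 950 + ["spam"] * 50

# A "lazy" model that ignores the email and always gives the same answer.
predicted = ["not spam"] * len(actual)

# Accuracy = share of emails where the prediction matches the truth.
correct = sum(1 for a, p in zip(actual, predicted) if a == p)
accuracy = correct / len(actual)

# How many real spam emails did the model actually catch?
spam_caught = sum(
    1 for a, p in zip(actual, predicted) if a == "spam" and p == "spam"
)

print(f"Accuracy: {accuracy:.0%}")
print(f"Spam emails caught: {spam_caught}")
```

The accuracy comes out at 95%, yet the number of spam emails caught is zero.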

The confusion matrix

| | Predicted: Spam | Predicted: Not Spam |
| Actually: Spam | True Positive (TP) ✅ | False Negative (FN) ❌ (missed spam) |
| Actually: Not Spam | False Positive (FP) ❌ (wrongly blocked) | True Negative (TN) ✅ |

Key metrics from the confusion matrix
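Three commonly used metrics can be worked out from the four counts in a confusion matrix. Accuracy is (TP + TN) divided by the total. Precision is TP / (TP + FP): of everything flagged as spam, how much really was spam? Recall is TP / (TP + FN): of all the real spam, how much was caught? The Python sketch below is optional and uses standard definitions. The counts for the second filter are invented for illustration, and the function name `metrics` is hypothetical.

```python
def metrics(tp, fn, fp, tn):
    """Compute accuracy, precision and recall from confusion-matrix counts."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    # Precision is undefined if the model never predicts "spam".
    precision = tp / (tp + fp) if (tp + fp) > 0 else None
    # Recall is undefined if there is no real spam in the data.
    recall = tp / (tp + fn) if (tp + fn) > 0 else None
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# The "label everything as not spam" model from the example above.
lazy_model = metrics(tp=0, fn=50, fp=0, tn=950)

# An invented, more useful filter on the same 1,000 emails.
better_filter = metrics(tp=40, fn=10, fp=5, tn=945)

print("Lazy model:   ", lazy_model)
print("Better filter:", better_filter)
```

The lazy model has a recall of 0 because it catches none of the 50 spam emails, even though its accuracy is 95%. The invented filter catches 40 of the 50, so its recall is 0.8.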

Why this matters: In medical diagnosis, high recall (catching as many of the real cases as possible) is critical, because missing a real disease (a false negative) is dangerous. In fraud detection, high precision (being right when the model raises an alarm) matters, because wrongly blocking a legitimate transaction (a false positive) hurts trust. The right metric depends on the real-world consequences of each type of error.
Section 4 of 7

🌳 Decision Trees: AI You Can Read

One of the most intuitive ML models is the decision tree. It asks a series of yes/no questions about the features of an input and follows a branch for each answer until it reaches a conclusion.

Example decision tree for mango classification:
  • Is the mango yellow? → If YES → Is it firm? → If YES → RIPE; If NO → OVERRIPE
  • Is the mango yellow? → If NO → Is it mostly green? → If YES → UNRIPE; If NO → OVERRIPE
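Optional: because a decision tree is just a chain of yes/no questions, the mango tree above can be written directly as nested `if` statements. This is a hand-written sketch of the example tree, not a trained model. The function name `classify_mango` and its three yes/no inputs are assumptions made for illustration.

```python
def classify_mango(is_yellow, is_firm, is_mostly_green):
    """Follow the example tree and return RIPE, UNRIPE or OVERRIPE."""
    if is_yellow:
        # Yellow branch: firmness decides between ripe and overripe.
        if is_firm:
            return "RIPE"
        return "OVERRIPE"
    # Not-yellow branch: colour decides between unripe and overripe.
    if is_mostly_green:
        return "UNRIPE"
    return "OVERRIPE"


print(classify_mango(is_yellow=True, is_firm=True, is_mostly_green=False))
print(classify_mango(is_yellow=False, is_firm=True, is_mostly_green=True))
```

Every path through the code matches one path through the tree, which is why a person can read it and check each rule.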

Decision trees are easy to understand and explain. A farmer or doctor can look at the tree and verify whether the rules make sense. This is called interpretability — and it is an important property for AI systems used in high-stakes decisions.

More complex models (like neural networks with millions of weights) often produce more accurate results, but they are much harder to explain. They are often called "black boxes".

Important trade-off: Better accuracy often comes with less explainability. Choosing the right balance depends on the application — a slightly less accurate model that you can explain may be far more trustworthy and safe to deploy than a highly accurate black box.
Section 5 of 7

📈 Prediction (Regression) — Estimating Values

When the output is a number, the task is called regression. The model tries to find a mathematical relationship between the input features and the output value.

Simple example: Predicting exam score from study hours.
If you studied 3 hours → model predicts 65 marks
If you studied 5 hours → model predicts 78 marks
The model finds the best-fitting line (or curve) through the training data points.
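Optional: here is a minimal Python sketch of fitting a straight line to data, using the standard least-squares formula. The six training points are invented for illustration and were chosen so that the predictions land near the numbers in the example. The names `fit_line` and `predict_marks` are hypothetical.

```python
# Invented training data for illustration: study hours and exam marks.
hours = [1, 2, 3, 4, 5, 6]
marks = [50, 60, 65, 72, 78, 85]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares best-fitting line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = how x and y vary together, divided by how much x varies.
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(hours, marks)


def predict_marks(study_hours):
    """Predict marks for a given number of study hours using the fitted line."""
    return slope * study_hours + intercept


for h in (3, 5):
    print(f"{h} hours -> about {predict_marks(h):.0f} marks")
```

With this invented data, the line predicts roughly 65 marks for 3 hours and roughly 78 marks for 5 hours. Unlike a classifier, the output is a number, not a label.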

Real Indian examples of prediction (regression)

| Prediction task | Input features | Predicted output |
| Crop yield prediction | Rainfall, temperature, soil type, fertiliser | Expected harvest in tonnes/hectare |
| Electricity demand forecasting | Time of day, day of week, season, temperature | Expected demand in MW |
| Flood risk estimation | Rainfall over 48 hours, river level, soil moisture | Probability of flooding (0–100%) |
| Hospital readmission risk | Age, diagnosis, treatment, previous admissions | Probability of readmission within 30 days |
Section 6 of 7

⚠️ When AI Gets It Wrong: Error Analysis

Every AI classifier makes mistakes. Understanding the pattern of those mistakes (which inputs it gets wrong, and how often) is often more useful than knowing the overall accuracy score alone.

Asha's question, answered: When the mango sorting machine sees a new variety it was not trained on, it may confidently classify it incorrectly — because the patterns it learned do not apply to the new variety. This is why engineers keep expanding the training data and regularly test models on new situations.
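Optional: one simple form of error analysis is to measure accuracy separately for each group instead of only overall. The Python sketch below does this for two mango varieties. All the records are invented for illustration, and the group names `known_variety` and `new_variety` are placeholders.

```python
from collections import defaultdict

# Invented results: (variety, actual label, label predicted by the machine).
records = [
    ("known_variety", "ripe", "ripe"),
    ("known_variety", "unripe", "unripe"),
    ("known_variety", "overripe", "overripe"),
    ("known_variety", "ripe", "ripe"),
    ("new_variety", "ripe", "unripe"),
    ("new_variety", "ripe", "overripe"),
    ("new_variety", "unripe", "unripe"),
    ("new_variety", "overripe", "ripe"),
]


def accuracy_by_group(rows):
    """Return a dict mapping each group to its share of correct predictions."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, actual, predicted in rows:
        total[group] += 1
        if actual == predicted:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


overall = sum(1 for _, a, p in records if a == p) / len(records)
per_group = accuracy_by_group(records)

print(f"Overall accuracy: {overall:.1%}")
for group, acc in per_group.items():
    print(f"{group}: {acc:.0%}")
```

In this made-up data the overall score hides the problem: the machine is right every time on the variety it was trained on, but right only once in four tries on the new one.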
Section 7 of 7

🗺️ Key Vocabulary Summary

| Term | Simple meaning |
| Classification | Sorting input into a fixed set of categories (pass/fail, ripe/unripe) |
| Regression (prediction) | Predicting a number on a continuous scale (exam score, rainfall) |
| Decision boundary | The line (or surface) that separates different categories in the model's learned space |
| Confidence score | The probability the model assigns to its prediction (e.g. 73% confident this is spam) |
| Confusion matrix | A table showing correct and incorrect predictions broken down by category |
| False positive | Predicted positive but actually negative (wrongly blocked email) |
| False negative | Predicted negative but actually positive (missed real spam) |
| Decision tree | A model that makes predictions using a series of yes/no questions; interpretable and explainable |
| Interpretability | How well humans can understand why a model made a particular prediction |

🎯 Quiz — Lesson 3

8 questions

1. Which of these is a classification task?
2. What is a "decision boundary" in a classification model?
3. A spam filter that simply labels every email as "not spam" achieves 95% accuracy on a dataset where 95% of emails are legitimate. This shows:
4. In medical diagnosis AI, which metric is MOST important to maximise?
5. Why are decision trees valued for high-stakes AI applications (medical, legal, financial)?
6. A model predicts the probability of rain tomorrow as 58%. This 58% is called:
7. A model is 96% accurate for urban users but only 72% accurate for rural users. This difference is:
8. Which of these is a regression (prediction) task?

📝 Worksheet — Classify Your World

In your notebook, answer these questions:

  1. Think of 3 AI systems you have used or read about. For each one, say whether it is mainly a classification task, a regression task, or both.
  2. Describe a real-world situation in India where a false negative from an AI classifier would be dangerous. What would you recommend the engineers prioritise — recall or precision?
  3. Design a simple decision tree (3–5 yes/no questions) to classify whether a given day is good for flying a kite (consider wind, rain, and time of year).

📋 Note for Parents and Teachers

What this lesson covers: Classification vs regression, the training process for classifiers, the confusion matrix and evaluation metrics, decision trees and interpretability, and real Indian examples of both tasks. No mathematics beyond percentages is required.

Discussion prompts:
