Regression: Predicting Numbers 📈

Class 9 · Age 13–14 · Lesson 6 of 12 · 🆓 Free
Scatter plot with a regression line on a laptop screen, student pointing at predicted price of a house on a chart

Meet Ananya: Class 9, Bengaluru

Ananya's family is looking to buy a flat. Her father told her: "Location, size, age of building: all these affect the price." Ananya opened Magicbricks.com and saw 200 flats listed in Whitefield. She thought: "Could an AI learn from these 200 listings and predict the price of any new flat in that area?"

That's exactly what regression does. It predicts not a category (like pass/fail) but a specific number (like ₹45 lakhs). She ran the numbers after this lesson, and the model predicted within ₹3 lakhs of actual prices 80% of the time. Let's learn how.

Classification vs Regression
Two Types of Supervised Learning

You've already learned classification. Regression is its sibling. Both are supervised learning, but with one key difference:

Classification

Predicts a Category

  • Output: one label from a fixed list
  • Pass / Fail
  • Spam / Not Spam
  • Cat / Dog / Bird
  • Metric: Accuracy, F1

Regression

Predicts a Number

  • Output: any continuous number
  • Flat price: ₹42,50,000
  • Temperature tomorrow: 34.5°C
  • Student marks: 78.3
  • Metric: MAE, RMSE, R²
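
To make the difference in output type concrete, here is a minimal Python sketch. It is a hand-written rule, not a trained model, and the names `LABELS`, `PASS_MARK` and `to_label`, as well as the pass mark of 35, are assumptions made up for this illustration. The number 78.3 is the student-marks example from the list above.

```python
# Illustrative only: a hand-written rule, not a trained model.
LABELS = ("Fail", "Pass")   # classification: a fixed list of categories
PASS_MARK = 35              # assumed pass mark, made up for this example


def to_label(marks: float) -> str:
    """Turn a number (regression-style output) into a category."""
    return LABELS[1] if marks >= PASS_MARK else LABELS[0]


regression_output = 78.3                      # any value on a continuous scale
classification_output = to_label(regression_output)

print(regression_output, type(regression_output).__name__)          # 78.3 float
print(classification_output, type(classification_output).__name__)  # Pass str
```

The regression-style value could just as well have been 78.2 or 78.4, while the classification-style value can only ever be one of the entries in `LABELS`.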
Part 1
Linear Regression: The Simplest Model

Linear regression finds the best straight line through your data points. That line is the model, and you use it to predict any new value.

Figure: Study Hours vs Exam Marks (scatter plot with regression line). x-axis: study hours, 1h to 9h; y-axis: marks, 25 to 100.

Each orange dot = one student. The dark line = the linear regression prediction line. A new student studying 6 hours → read up from 6h to the line → predicted marks ≈ 72.

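The equation of the line is: y = mx + b, where x is the input (study hours), y is the predicted value (marks), m is the slope (how much y changes for each extra unit of x) and b is the intercept (the predicted y when x = 0).

Linear regression learns the best m and b by minimising the total squared error between its predictions and the actual values. This is called Ordinary Least Squares (OLS), and for a small dataset like the ones in this lesson it is typically solved in milliseconds.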
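
For a single input, the least-squares m and b can be worked out with two short formulas: m is the sum of (x − mean of x) × (y − mean of y) divided by the sum of (x − mean of x)², and b is (mean of y) − m × (mean of x). The sketch below applies them in plain Python. The five data points and the function name `fit_line` are made up for illustration; they are not data from this lesson.

```python
# Made-up data for illustration: five students' study hours and marks.
hours = [1, 3, 5, 7, 9]
marks = [20, 35, 50, 70, 85]


def fit_line(xs, ys):
    """Return (m, b) for the least-squares line y = m*x + b."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope: how x and y vary together, divided by how much x varies.
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    denominator = sum((x - x_mean) ** 2 for x in xs)
    m = numerator / denominator
    # Intercept: the fitted line always passes through (x_mean, y_mean).
    b = y_mean - m * x_mean
    return m, b


m, b = fit_line(hours, marks)
print(f"m = {m:.2f}, b = {b:.2f}")                 # m = 8.25, b = 10.75
print(f"Prediction for 6 hours: {m * 6 + b:.2f}")  # 60.25
```

Libraries such as scikit-learn do the same job for you and also handle more than one input column, which is why the lesson's own code uses `LinearRegression` instead of these formulas.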
Part 2
Regression Evaluation Metrics

Accuracy doesn't make sense for regression because there's no "correct category". Instead we use error metrics:

MAE
Mean Absolute Error
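Average absolute difference between predicted and actual. "Off by ₹3 lakhs on average." Easy to understand.
RMSE
Root Mean Squared Error
Like MAE but penalises large errors more, because each error is squared before averaging. Good when big mistakes are especially costly.
R² Score
R-squared (Coefficient of Determination)
0 = model is no better than always predicting the mean. 1 = perfect predictions. It can even fall below 0 when the model does worse than predicting the mean. As a rough target, aim for R² > 0.85; what counts as a good score depends on the problem.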
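
All three metrics can be computed by hand, which helps show what each one measures. The sketch below uses four made-up actual and predicted marks (not results from this lesson) and only Python's standard library.

```python
import math

# Made-up marks for four students, for illustration only.
actual = [50, 60, 70, 80]
predicted = [52, 58, 75, 79]

errors = [p - a for p, a in zip(predicted, actual)]  # [2, -2, 5, -1]
n = len(actual)

# MAE: average size of the errors, ignoring their sign.
mae = sum(abs(e) for e in errors) / n

# RMSE: square each error, average, then take the square root.
rmse = math.sqrt(sum(e ** 2 for e in errors) / n)

# R²: compare the model's squared error with that of always predicting the mean.
mean_actual = sum(actual) / n
ss_res = sum(e ** 2 for e in errors)                  # model's squared error
ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # mean-only squared error
r2 = 1 - ss_res / ss_tot

print(f"MAE:  {mae:.2f}")   # 2.50
print(f"RMSE: {rmse:.2f}")  # 2.92
print(f"R²:   {r2:.3f}")    # 0.932
```

RMSE (2.92) comes out higher than MAE (2.50) here because the single 5-mark miss counts for more once it is squared.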
Part 3
Build a Marks Predictor in Python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, r2_score
import matplotlib.pyplot as plt

# Generate student study/marks data
np.random.seed(42)
n = 100
study_hours = np.random.uniform(1, 9, n)
marks = 10 + 8.5 * study_hours + np.random.normal(0, 5, n)
marks = np.clip(marks, 20, 100)

df = pd.DataFrame({'study_hours': study_hours, 'marks': marks})

# Single feature regression
X = df[['study_hours']]   # 2D array (required by sklearn)
y = df['marks']

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Train linear regression
model = LinearRegression()
model.fit(X_train, y_train)

print(f"Slope (m): {model.coef_[0]:.2f}  ← each extra hour adds ~{model.coef_[0]:.1f} marks")
print(f"Intercept (b): {model.intercept_:.2f}")

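# Evaluate
y_pred = model.predict(X_test)
print(f"\nMAE: {mean_absolute_error(y_test, y_pred):.2f} marks")
print(f"R²:  {r2_score(y_test, y_pred):.3f}")

# Predict for 5 hours study (pass a DataFrame with the training column name
# so scikit-learn does not warn about missing feature names)
pred = model.predict(pd.DataFrame({'study_hours': [5]}))[0]
print(f"\nPredicted marks for 5 hours study: {pred:.1f}")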

# Plot
plt.figure(figsize=(8,5))
plt.scatter(X_test, y_test, alpha=0.6, label='Actual', color='#f97316')
plt.plot(X_test.sort_values('study_hours'),
         model.predict(X_test.sort_values('study_hours')),
         color='#7c2d12', linewidth=2, label='Predicted line')
plt.xlabel('Study Hours')
plt.ylabel('Marks')
plt.title('Linear Regression: Study Hours vs Marks')
plt.legend()
plt.show()

🎚 Try It: Study Hours → Marks Predictor

This uses the linear equation marks = 10 + 8.5 × study_hours (approximately the line the model above learns). For example, at 5 study hours:

Predicted Marks
52.5
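
If the interactive slider is not available, the same exploration can be done in a few lines of Python. This is a minimal sketch using the approximate equation above; the function name `predicted_marks` is made up for this example.

```python
def predicted_marks(study_hours: float) -> float:
    """Approximate line from this lesson: marks = 10 + 8.5 * study_hours."""
    return 10 + 8.5 * study_hours


# Step through the same 1 to 9 hour range used for the generated data.
for hours in range(1, 10):
    print(f"{hours} h -> {predicted_marks(hours):.1f} marks")
```

Each extra hour adds 8.5 marks, which is exactly what the slope in the equation says.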

🧪 Check Your Understanding: Lesson 6 Quiz

1. What is the key difference between classification and regression?
a) Classification needs more data than regression
b) Classification predicts a category; regression predicts a continuous number
c) Regression is only used for time series data
d) Classification uses neural networks; regression uses decision trees

2. In the linear equation y = mx + b, what does "b" represent?
a) The slope: how much y increases per unit of x
b) The number of data points
c) The intercept: the predicted y value when x = 0
d) The error of the model

3. A flat price predictor has MAE = ₹2.5 lakhs. This means:
a) The model is always wrong by exactly ₹2.5 lakhs
b) On average, predictions are off by ₹2.5 lakhs
c) 2.5% of predictions are wrong
d) The model is 97.5% accurate

4. An R² score of 0.92 means:
a) 92% of test rows were predicted correctly
b) The model explains 92% of the variance in the data, a very good fit
c) The model made errors in 8% of cases
d) 92 rows were in the training set

5. Which of these is a regression problem?
a) Will this email be spam or not spam?
b) What is the next word in this sentence?
c) What will the temperature be in Hyderabad tomorrow (°C)?
d) Is this image a cat, dog, or bird?

6. Why do we write X = df[['study_hours']] (double brackets) instead of X = df['study_hours'] in scikit-learn?
a) Double brackets are required for string columns
b) scikit-learn requires X to be a 2D array (DataFrame), not a 1D Series
c) It's just a Python convention, both work equally
d) To include multiple copies of the same column

7. RMSE penalises large errors more than MAE. When is this useful?
a) When all errors are equally important
b) When a few very large prediction errors are especially harmful (e.g., bridge load calculations)
c) Only for classification problems
d) When the dataset has fewer than 100 rows

8. Using the equation marks = 10 + 8.5 × study_hours, what would a student who studies 4 hours be predicted to score?
a) 34
b) 42
c) 44
d) 48