AI for Students · Class 8 · Age 12–13 · Lesson 6 of 12

Computer Vision: How AI Sees 👁️

You can glance at a photo and instantly know what is in it. How does a machine do the same thing? This lesson goes inside computer vision — from pixels and numbers to crop disease AI and eye health screening in India.

📘 Class 8 · Lesson 6 🕐 45–55 min 🚫 No coding needed 🆓 Free lesson
[Illustration: an Indian student pointing a smartphone at a plant leaf, with colourful grid overlays representing pixel detection and AI vision]

Story · Riya and the Diseased Tomato Plant

The App Identified It in 3 Seconds. But How? 🍅

Riya, 13, from Pune, was visiting her uncle's tomato farm on the outskirts of the city when she noticed some plants had brown spots on their leaves. Her uncle, who had been farming for 20 years, said it looked like early blight — but he was not certain.

Riya opened the Plantix app, took a photo of the affected leaf, and within 3 seconds the app identified it as early blight caused by Alternaria solani. It also showed the confidence level (91%) and recommended a treatment.

"How does it see this in a photo?" Riya asked. "A photo is just colours, right?"

Her uncle smiled. "It is all numbers. And the AI has seen millions of photos of diseased plants. It has learned which patterns of numbers go with which diseases."

👉 This lesson explains exactly how a computer reads an image, how AI learns to recognise patterns in images, and how this technology is being used across India right now.
Section 1 of 6

🔢 How a Computer Reads an Image

Every digital image is a grid of tiny squares called pixels. Each pixel has a colour, which is stored as three numbers (0–255) representing the amount of red, green, and blue light. A colour image that is 1,000 × 1,000 pixels contains 3 million numbers.

Here is a tiny 8×8 demonstration. Each cell is one pixel. The lighter cells have higher values; darker cells are lower:
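
Optional, for curious readers: a minimal Python sketch of such a grid. The pixel values in `image` are made up for illustration (a bright patch on a dark background), and the `render` function and its five shading characters are assumptions of this sketch, not part of any real app.

```python
# A made-up 8x8 grayscale image: each number is one pixel (0 = black, 255 = white).
image = [
    [0,   0,   0,   60,  60,  0,   0,   0],
    [0,   0,   60,  120, 120, 60,  0,   0],
    [0,   60,  120, 200, 200, 120, 60,  0],
    [60,  120, 200, 255, 255, 200, 120, 60],
    [60,  120, 200, 255, 255, 200, 120, 60],
    [0,   60,  120, 200, 200, 120, 60,  0],
    [0,   0,   60,  120, 120, 60,  0,   0],
    [0,   0,   0,   60,  60,  0,   0,   0],
]

SHADES = " .:*#"  # five brightness bands, darkest to lightest


def render(grid):
    """Turn each pixel value into a character: higher value, denser character."""
    lines = []
    for row in grid:
        # value * 5 // 256 maps 0..255 onto the band indexes 0..4
        lines.append("".join(SHADES[value * len(SHADES) // 256] for value in row))
    return "\n".join(lines)


print(render(image))
```

The computer only ever stores the numbers. The picture you see is just one way of displaying them.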

When AI processes an image, it is literally reading a large table of numbers. Your eye and brain do something similar at incredible speed — you just never think about it at the pixel level.

Scale note: A typical phone camera photo is about 12 megapixels (12 million pixels). As a colour image, that is 36 million numbers. A modern computer vision model typically shrinks the photo to a much smaller size first and can then analyse it in a fraction of a second.
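
Optional, for curious readers: the arithmetic above can be checked with a few lines of Python. The function name `numbers_in_image` is made up for this sketch, and 4000 × 3000 is an assumed photo shape chosen because it gives exactly 12 million pixels.

```python
CHANNELS = 3  # red, green, blue


def numbers_in_image(width, height, channels=CHANNELS):
    """Total numbers stored for an image: one number per channel per pixel."""
    return width * height * channels


# The 1,000 x 1,000 example
print(numbers_in_image(1000, 1000))  # 3000000

# A 12-megapixel photo, assuming a 4000 x 3000 shape
print(numbers_in_image(4000, 3000))  # 36000000
```

Because each value from 0 to 255 fits in one byte, the 12-megapixel photo takes about 36 million bytes before any compression.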
Section 2 of 6

🔍 How AI Learns to Recognise Images: Convolution

If you showed someone a crop disease photo and said "this is blight", they would learn by looking at patterns — the shape of the spots, their colour, their distribution on the leaf. AI does something similar using a technique called convolution.

What convolution does — simple explanation

A convolutional neural network (CNN) scans the image with small sliding filters, like looking at the image through a tiny window. Each filter is trained to detect one specific low-level feature: an edge, a colour patch, or a texture.
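
Optional, for curious readers: a minimal sketch of one sliding filter in plain Python. The image and the filter values are made up for illustration. A real CNN learns its filter values from training data and uses many filters at once; here the filter is written by hand so the result is easy to follow. As in most deep learning libraries, the filter is applied without flipping it.

```python
def convolve(image, kernel):
    """Slide `kernel` over `image` (no padding) and return the feature map."""
    k = len(kernel)  # the kernel is k x k
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    feature_map = []
    for top in range(out_h):
        row = []
        for left in range(out_w):
            # Multiply each pixel in the window by the matching filter weight, then add up
            total = sum(
                image[top + i][left + j] * kernel[i][j]
                for i in range(k)
                for j in range(k)
            )
            row.append(total)
        feature_map.append(row)
    return feature_map


# Made-up 4x6 image: dark on the left (0), bright on the right (255)
image = [
    [0, 0, 0, 255, 255, 255],
    [0, 0, 0, 255, 255, 255],
    [0, 0, 0, 255, 255, 255],
    [0, 0, 0, 255, 255, 255],
]

# Hand-written filter that responds where brightness increases from left to right
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

for row in convolve(image, vertical_edge):
    print(row)  # [0, 765, 765, 0] for both rows
```

The large values in the middle of each row mark the place where dark meets bright. The zeros mark flat areas with no edge.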

The layer pipeline

Raw pixels → Edge detection → Shape features → Object parts → Classification
Analogy: Imagine learning to identify birds. First you notice beaks, feathers, wing shapes. Then you combine: long curved beak + small body = sunbird. Then you classify: sunbird vs sparrow. CNN layers work in a similar way, building up from simple features to complex objects.
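
Optional, for curious readers: a toy version of the "simple features first, decision last" idea. Everything here is an assumption for illustration: the function names, the threshold of 100, the rule "4 or more strong edges means spotted", and the two tiny made-up leaf images. A real CNN learns its stages from data and has many more layers.

```python
def edge_strength(image):
    """Stage 1: compare each pixel with its right-hand neighbour."""
    return [
        [abs(row[x + 1] - row[x]) for x in range(len(row) - 1)]
        for row in image
    ]


def count_strong_edges(edges, threshold=100):
    """Stage 2: summarise the edge map as a single number (a simple feature)."""
    return sum(value > threshold for row in edges for value in row)


def classify(image):
    """Stage 3: a hand-written rule standing in for the learned classification step."""
    strong = count_strong_edges(edge_strength(image))
    return "spotted" if strong >= 4 else "plain"


# Made-up grayscale "leaves": 120 is healthy leaf, 10 is a dark spot
plain_leaf = [[120] * 6 for _ in range(4)]
spotted_leaf = [
    [120, 120, 10, 120, 120, 120],
    [120, 120, 10, 120, 120, 120],
    [120, 120, 120, 120, 10, 120],
    [120, 120, 120, 120, 120, 120],
]

print(classify(plain_leaf))    # plain
print(classify(spotted_leaf))  # spotted
```

Each stage only looks at the output of the stage before it, which is the same layering idea as in the pipeline above.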
Section 3 of 6

🇮🇳 Computer Vision in India: Real Applications Right Now

Plantix — Crop Disease AI
Developed by the German company PEAT, which has roots at Leibniz University Hannover. Farmers photograph their affected plants; the AI identifies the disease and recommends treatment. The app has over 10 million downloads across India. It trains on user-submitted photos, so it keeps improving.
Diabetic Retinopathy Screening
Aravind Eye Hospital (Madurai) and startups like Remidio and NetrAI use AI to analyse retinal photographs for signs of diabetic retinopathy, a leading cause of blindness. The AI was trained on over 100,000 retinal images. This enables screening in rural areas without a specialist ophthalmologist on site.
ISRO Satellite Image Analysis
ISRO uses computer vision on satellite images to monitor crop health across states, detect deforestation, track flood extent, and identify urban growth. What would take surveyors months can be done in hours with AI-powered image analysis.
Road Safety and Traffic AI
Several Indian cities (Hyderabad, Bengaluru) use camera-based AI to monitor traffic density, detect signal violations, and identify accident-prone road conditions. The system reads license plates and analyses vehicle flow patterns in real time.
Heritage and Document Preservation
AI vision tools are being used to digitise and transcribe handwritten Indian manuscripts — in Sanskrit, Telugu, Tamil, and other scripts — helping preserve irreplaceable historical documents by converting images of handwriting into searchable digital text.
Aadhaar Face Authentication
India's Aadhaar biometric system uses face recognition AI to verify identity at various service points. This is one of the world's largest deployments of face recognition, covering over a billion enrolled residents.
Section 4 of 6

🔒 Privacy and Face Recognition: Important Concerns

Computer vision is powerful — but face recognition in public spaces raises serious ethical and privacy questions that India and the world are still working through.

Discussion question: Should facial recognition be used in Indian schools for attendance tracking? What are the benefits? What are the risks? Who should decide?
Section 5 of 6

🏆 What Computer Vision Is Still Bad At

Computer vision now matches or beats human accuracy on several image classification benchmarks. But it still has real limitations:

Limitation | Example
Adversarial examples | A tiny change to pixel values, often invisible to humans, can completely fool a vision model. In one well-known demonstration, a few stickers on a stop sign made a model read it as a speed limit sign.
Out-of-distribution failure | A crop disease model trained in one region may fail on the same disease in different lighting, soil colour, or leaf variety.
Common sense understanding | A model can classify "a dog on a table" but has no understanding of why that is unusual.
3D understanding | A photo is 2D, so inferring depth and 3D structure from flat images is still challenging, especially in uncontrolled environments.
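
Optional, for curious readers: a toy sketch of an adversarial example. The "model" here is an assumption for illustration: a made-up linear scorer with 16 hand-picked weights, not a real vision network. The labels, the function names, and the step size of 3 are also made up. Attacks on real networks follow a similar idea but use the network's own gradients to decide which way to push each pixel.

```python
def score(pixels, weights):
    """Weighted sum of pixel values; above zero means 'stop sign' in this toy model."""
    return sum(p * w for p, w in zip(pixels, weights))


def predict(pixels, weights):
    return "stop sign" if score(pixels, weights) > 0 else "speed limit"


def nudge(pixels, weights, step):
    """Move every pixel by `step` in the direction that lowers the score."""
    adjusted = []
    for p, w in zip(pixels, weights):
        change = -step if w > 0 else step
        adjusted.append(min(255, max(0, p + change)))  # keep values in 0..255
    return adjusted


weights = [0.5, -0.5] * 8   # made-up model weights for a 16-pixel image
original = [130, 126] * 8   # made-up image the model labels correctly
altered = nudge(original, weights, step=3)

print(predict(original, weights))  # stop sign
print(predict(altered, weights))   # speed limit
print(max(abs(a - b) for a, b in zip(original, altered)))  # 3
```

No pixel changed by more than 3 out of 255, a difference a person would struggle to see, yet the toy model's answer flipped.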
Section 6 of 6

🗺️ Key Vocabulary Summary

Term | Simple meaning
Pixel | The smallest unit of a digital image; in a colour image each pixel is stored as 3 numbers (R, G, B)
Computer vision | The field of AI that enables machines to interpret and classify visual information from images or video
Convolution | Scanning an image with a small filter to detect specific features at each location
CNN | Convolutional Neural Network: the type of model most commonly used for image recognition tasks
Feature map | The output of a convolution layer, showing where specific features appear in an image
Face recognition | Using AI to identify a person by their facial features; raises significant privacy concerns
Adversarial example | An image modified in a tiny way that humans cannot notice, but that fools a vision model into a wrong classification

👁️ Quiz — Lesson 6

8 questions

1. What is a pixel?
2. In a Convolutional Neural Network, what do the earliest layers typically detect?
3. How does the Plantix app identify crop diseases?
4. Why is diabetic retinopathy AI screening particularly important for rural India?
5. What is an "adversarial example" in computer vision?
6. Face recognition AI has been shown to perform worse on darker skin tones. The most likely cause is:
7. ISRO uses computer vision on satellite images to:
8. What does convolution do in a CNN?

📝 Worksheet — Computer Vision in Your Daily Life

In your notebook, answer these questions:

  1. List 3 apps or services you have used personally that use computer vision (e.g. photo search, face unlock, camera filters, QR code scanning). For each one, describe what input it takes and what output it produces.
  2. If you were designing a computer vision system to detect potholes in Indian roads using dashcam footage, what challenges would you face? Think about: what would you need in the training data? What could go wrong? Who could be harmed by errors?
  3. The section on face recognition raised privacy concerns. List one benefit and one risk of using face recognition for school attendance in India. Write one rule you would recommend if such a system were used.

📋 Note for Parents and Teachers

What this lesson covers: Pixels as numbers, convolutional neural networks (layers, features, classification), six real Indian applications of computer vision, privacy concerns around face recognition (including demographic accuracy gaps and mass surveillance), and limitations of vision AI. The lesson is designed to build both technical understanding and responsible-use thinking.

Discussion prompts: