The App Identified It in 3 Seconds. But How? 🍅
Riya, 13, from Pune, was visiting her uncle's tomato farm on the outskirts of the city when she noticed some plants had brown spots on their leaves. Her uncle, who had been farming for 20 years, said it looked like early blight — but he was not certain.
Riya opened the Plantix app, took a photo of the affected leaf, and within 3 seconds the app identified it as early blight caused by Alternaria solani. It also showed the confidence level (91%) and recommended a treatment.
"How does it see this in a photo?" Riya asked. "A photo is just colours, right?"
Her uncle smiled. "It is all numbers. And the AI has seen millions of photos of diseased plants. It has learned which patterns of numbers go with which diseases."
🔢 How a Computer Reads an Image
Every digital image is a grid of tiny squares called pixels. Each pixel has a colour, which is stored as three numbers (0–255) representing the amount of red, green, and blue light. A colour image that is 1,000 × 1,000 pixels contains 3 million numbers.
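The arithmetic can be checked in a few lines of Python. This is only a sketch; the example colour (roughly tomato red) is an assumption chosen for illustration.

```python
# Count how many numbers are needed to store a colour image.
width, height = 1000, 1000      # image size in pixels
channels = 3                    # red, green, blue

total_numbers = width * height * channels
print(total_numbers)            # 3000000

# One pixel is just three numbers, each between 0 and 255.
tomato_red = (255, 99, 71)      # assumed example colour, roughly tomato red
for name, value in zip(("red", "green", "blue"), tomato_red):
    assert 0 <= value <= 255
    print(name, value)
```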
Here is a tiny 8×8 demonstration. Each cell is one pixel. The lighter cells have higher values; darker cells are lower:
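One way to see such a grid is to write it out in Python. The values below are made up for illustration, and the image is greyscale (one number per pixel instead of three) to keep the table small. The dark patch in the middle stands in for a spot on a lighter leaf.

```python
# A made-up 8x8 greyscale image: 0 = black, 255 = white.
image = [
    [200, 200, 200, 200, 200, 200, 200, 200],
    [200, 210, 210, 210, 210, 210, 210, 200],
    [200, 210, 120,  60,  60, 120, 210, 200],
    [200, 210,  60,  30,  30,  60, 210, 200],
    [200, 210,  60,  30,  30,  60, 210, 200],
    [200, 210, 120,  60,  60, 120, 210, 200],
    [200, 210, 210, 210, 210, 210, 210, 200],
    [200, 200, 200, 200, 200, 200, 200, 200],
]

# Print the table of numbers the computer actually stores.
for row in image:
    print(" ".join(f"{value:3d}" for value in row))

# Print a rough picture: darker pixels get heavier characters.
shades = "#+. "   # ordered from darkest to lightest
for row in image:
    print("".join(shades[value * len(shades) // 256] for value in row))
```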
When AI processes an image, it is literally reading a large table of numbers. Your eyes and brain also turn light into signals and pick out patterns at incredible speed; you just never think about it at the pixel level.
🔍 How AI Learns to Recognise Images: Convolution
If you showed someone a crop disease photo and said "this is blight", they would learn by looking at patterns — the shape of the spots, their colour, their distribution on the leaf. AI does something similar using a technique called convolution.
What convolution does — simple explanation
A convolutional neural network (CNN) scans the image with small sliding filters, like looking at the image through a tiny window that moves across it. Each filter in the early layers is trained to detect one specific low-level feature: an edge, a colour patch, a texture.
- Early layers detect: edges, corners, and colour patches
- Middle layers combine those into: shapes and textures (a curved edge + brown = spot-like feature)
- Deeper layers combine those into: object parts (spots on leaf surface)
- Final layers make the: classification decision (early blight vs late blight vs healthy)
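The sliding-filter idea can be sketched in plain Python. The 5×5 image and the vertical-edge filter below are hand-written assumptions for illustration; in a real CNN the filter values are learned during training, and libraries use much faster implementations.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, step of 1) and return the feature map.

    Like most CNN libraries, this multiplies and sums without flipping the kernel.
    """
    k = len(kernel)
    out_height = len(image) - k + 1
    out_width = len(image[0]) - k + 1
    feature_map = []
    for top in range(out_height):
        row = []
        for left in range(out_width):
            total = 0
            for i in range(k):
                for j in range(k):
                    total += image[top + i][left + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map

# Toy image: dark on the left (0), bright on the right (9).
image = [
    [0, 0, 9, 9, 9],
    [0, 0, 9, 9, 9],
    [0, 0, 9, 9, 9],
    [0, 0, 9, 9, 9],
    [0, 0, 9, 9, 9],
]

# Hand-written filter that responds to dark-to-bright vertical edges.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, kernel)
for row in feature_map:
    print(row)   # each row prints [27, 27, 0]
```

The large values (27) mark the window positions that contain the edge; the 0 marks the flat bright region where there is no edge to find.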
The layer pipeline
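A rough way to picture the pipeline is a chain of small functions: convolution, an activation step (ReLU), pooling, and a final scoring step turned into confidences by softmax. The sketch below is a toy under stated assumptions: ReLU, 2×2 max pooling and softmax are common choices rather than details given in this lesson, and the filter and the values in `class_weights` are invented by hand, not learned, so the printed confidences say nothing about real plants.

```python
import math

def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, step of 1)."""
    k = len(kernel)
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(k) for j in range(k))
            for c in range(len(image[0]) - k + 1)
        ]
        for r in range(len(image) - k + 1)
    ]

def relu(feature_map):
    """Keep positive responses, replace negative ones with 0."""
    return [[max(0, v) for v in row] for row in feature_map]

def max_pool_2x2(feature_map):
    """Shrink the map by keeping the largest value in each 2x2 block."""
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, len(feature_map[0]) - 1, 2)
        ]
        for r in range(0, len(feature_map) - 1, 2)
    ]

def softmax(scores):
    """Turn raw scores into confidences that add up to 1."""
    biggest = max(scores)
    exps = [math.exp(s - biggest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up 6x6 greyscale "leaf" with a dark 2x2 spot in the middle.
image = [
    [9, 9, 9, 9, 9, 9],
    [9, 9, 9, 9, 9, 9],
    [9, 9, 2, 2, 9, 9],
    [9, 9, 2, 2, 9, 9],
    [9, 9, 9, 9, 9, 9],
    [9, 9, 9, 9, 9, 9],
]
kernel = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]   # hand-written edge filter

pooled = max_pool_2x2(relu(convolve2d(image, kernel)))
flat = [v for row in pooled for v in row]        # flatten to one list

# Hypothetical final-layer weights, one list per class (NOT learned).
class_weights = {
    "early blight": [0.0, 0.1, 0.0, 0.1],
    "late blight": [0.1, 0.0, 0.1, 0.0],
    "healthy": [0.0, -0.1, 0.0, -0.1],
}
labels = list(class_weights)
scores = [sum(w * x for w, x in zip(class_weights[label], flat)) for label in labels]
confidences = softmax(scores)
best_label = labels[confidences.index(max(confidences))]

for label, confidence in zip(labels, confidences):
    print(f"{label}: {confidence:.0%}")
print("prediction:", best_label)
```

The confidence shown by an app is typically this kind of softmax output: a number between 0 and 1 for each possible class, with the largest one reported as the answer.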
🇮🇳 Computer Vision in India: Real Applications Right Now
🔒 Privacy and Face Recognition: Important Concerns
Computer vision is powerful — but face recognition in public spaces raises serious ethical and privacy questions that India and the world are still working through.
- Mass surveillance risk: If cameras in public spaces can identify every face, every movement of every citizen can potentially be tracked without their knowledge or consent.
- Accuracy varies across groups: Multiple studies have found face recognition systems perform worse on women and people with darker skin tones — meaning higher false identification rates for groups that were underrepresented in training data.
- Risk of misidentification: If a face recognition system falsely identifies an innocent person as a suspect, the consequences can be severe. Several wrongful arrests linked to facial recognition errors have been documented in other countries.
- Lack of regulation: India does not yet have a comprehensive law specifically governing facial recognition use by government or law enforcement. The debate about appropriate limits is ongoing.
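The "accuracy varies across groups" concern can be measured rather than guessed. The sketch below computes a false match rate separately for each group; the records, group names and the helper `false_match_rates` are invented toy examples, not real measurements of any system.

```python
from collections import defaultdict

# Made-up test records: (group, system said "match", faces truly match).
records = [
    ("group A", False, False), ("group A", False, False),
    ("group A", False, False), ("group A", True, False),
    ("group A", True, True), ("group A", True, True),
    ("group B", False, False), ("group B", False, False),
    ("group B", True, False), ("group B", True, False),
    ("group B", True, True), ("group B", True, True),
]

def false_match_rates(records):
    """For each group: wrongly matched pairs / all pairs that truly do not match."""
    wrong = defaultdict(int)
    total = defaultdict(int)
    for group, predicted_match, truly_match in records:
        if not truly_match:
            total[group] += 1
            if predicted_match:
                wrong[group] += 1
    return {group: wrong[group] / total[group] for group in total}

rates = false_match_rates(records)
for group, rate in rates.items():
    print(f"{group}: false match rate {rate:.0%}")
```

In this toy data the system wrongly matches group B twice as often as group A. A single overall accuracy figure would hide that gap, which is why results need to be reported group by group.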
🏆 What Computer Vision Is Still Bad At
Computer vision models have matched or exceeded human accuracy on some image classification benchmarks. But the technology still has real limitations:
| Limitation | Example |
|---|---|
| Adversarial examples | A small, deliberately crafted change can completely fool a vision model. It may be a change to pixel values too slight for humans to see, or a physical change such as a few stickers on a stop sign that makes the AI read it as a speed limit sign |
| Out-of-distribution failure | A crop disease model trained in one region may fail on the same disease in different lighting, soil colour, or leaf variety |
| Common sense understanding | A model can classify "a dog on a table" but has no understanding of why that is unusual |
| 3D understanding | A photo is 2D — inferring depth and 3D structure from flat images is still challenging, especially in uncontrolled environments |
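The adversarial-examples row can be illustrated with a toy model. The classifier below is an assumption made for this sketch: a simple weighted sum over 100 pixel values (scaled 0 to 1) with hand-picked weights, far simpler than a real CNN. Every pixel is nudged by at most 0.02, about 5 steps on the 0–255 scale, in the direction that lowers the score.

```python
def score(pixels, weights):
    """Weighted sum of pixel values; positive means 'stop sign'."""
    return sum(p * w for p, w in zip(pixels, weights))

def classify(pixels, weights):
    return "stop sign" if score(pixels, weights) > 0 else "speed limit sign"

# Hypothetical model weights and image (pixel values scaled to 0..1).
weights = [0.5, -0.5] * 50
pixels = [0.51, 0.49] * 50

# Nudge each pixel slightly against its weight to push the score down.
epsilon = 0.02
adversarial = [
    p - epsilon if w > 0 else p + epsilon
    for p, w in zip(pixels, weights)
]

print(classify(pixels, weights))        # stop sign
print(classify(adversarial, weights))   # speed limit sign
print(max(abs(a - p) for a, p in zip(adversarial, pixels)))  # about 0.02
```

No single pixel changes much, but many tiny changes that all push in the same direction add up to a different answer.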
🗺️ Key Vocabulary Summary
| Term | Simple meaning |
|---|---|
| Pixel | The smallest unit of a digital image — each pixel is stored as 3 numbers (R, G, B) |
| Computer vision | The field of AI that enables machines to interpret and classify visual information from images or video |
| Convolution | Scanning an image with a small filter to detect specific features at each location |
| CNN | Convolutional Neural Network — the type of model most commonly used for image recognition tasks |
| Feature map | The output of a convolution layer — showing where specific features appear in an image |
| Face recognition | Using AI to identify a person by their facial features — raises significant privacy concerns |
| Adversarial example | An image modified in a tiny way that humans cannot notice, but that fools a vision model into a wrong classification |
👁️ Quiz — Lesson 6
📝 Worksheet — Computer Vision in Your Daily Life
In your notebook, answer these questions:
- List 3 apps or services you have used personally that use computer vision (e.g. photo search, face unlock, camera filters, QR code scanning). For each one, describe what input it takes and what output it produces.
- If you were designing a computer vision system to detect potholes in Indian roads using dashcam footage, what challenges would you face? Think about: what would you need in the training data? What could go wrong? Who could be harmed by errors?
- The section on face recognition raised privacy concerns. List one benefit and one risk of using face recognition for school attendance in India. Then write one rule you would recommend if such a system were used.
📋 Note for Parents and Teachers
What this lesson covers: Pixels as numbers, convolutional neural networks (layers, features, classification), six real Indian applications of computer vision, privacy concerns around face recognition (including demographic accuracy gaps and mass surveillance), and limitations of vision AI. The lesson is designed to build both technical understanding and responsible-use thinking.
Discussion prompts:
- "The lesson mentions that face recognition performs worse for darker skin tones. What should be done before deploying such systems in India? Who should be responsible?"
- "Plantix can help a farmer identify crop disease in 3 seconds on a phone. Under what conditions is this genuinely useful, and under what conditions could it mislead?"