computer vision - iffatAGheyas/computer-vision-handbook GitHub Wiki

📘 What is Computer Vision?

Computer Vision is a field of Computer Science and Artificial Intelligence concerned with enabling machines to “see” and understand the visual world. Where humans rely on their eyes and brains, computers use cameras to capture images and videos, and algorithms or trained models to interpret them.

🎥 Real-World Analogy

Consider how you:

Look at a traffic light and decide when to cross the road
Recognise a friend in a photograph
Watch a video clip and understand who is speaking and what’s happening

Computer Vision aims to replicate these everyday tasks in software.

🔎 Core Tasks in Computer Vision

Image Classification: “Is this image a cat or a dog?”
Object Detection: “Where is the car in this image?”
Segmentation: “Which pixels belong to the car?”
Video Analysis: “Is the person walking or running?”
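
To make the difference between the first three tasks concrete, here is a minimal sketch in plain Python. It is not how real systems work (they use trained models); the toy image, the `THRESHOLD` value, and the function names `segment`, `detect` and `classify` are all assumptions made for illustration. The point is only the shape of each answer: one label per image, one box per object, one yes/no per pixel. Video Analysis is left out because it needs more than one frame.

```python
# Toy 6x6 grayscale image: 0 = background, 9 = a bright "object".
IMAGE = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 9, 9, 0, 0],
    [0, 0, 9, 9, 9, 0],
    [0, 0, 9, 9, 9, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
THRESHOLD = 5  # hypothetical cut-off between background and object


def segment(image, threshold=THRESHOLD):
    """Segmentation: one True/False answer per pixel."""
    return [[pixel > threshold for pixel in row] for row in image]


def detect(mask):
    """Detection: one bounding box (top, left, bottom, right), or None."""
    coords = [
        (r, c)
        for r, row in enumerate(mask)
        for c, is_object in enumerate(row)
        if is_object
    ]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))


def classify(mask):
    """Classification: one label for the whole image."""
    return "object" if any(any(row) for row in mask) else "background"


mask = segment(IMAGE)
label = classify(mask)                       # "What is in the image?"
box = detect(mask)                           # "Where is it?"
pixel_count = sum(sum(row) for row in mask)  # "Which pixels belong to it?"
print(label, box, pixel_count)
```

Here the three answers come from the same thresholded mask, which is a shortcut that only works on a toy image; in practice each task is usually handled by a model trained for it.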

🔄 How It Works (Overview)

Input: An image or a video frame
Processing: Algorithms extract features from the pixels (e.g. edges, shapes)
Understanding: A model classifies, detects or tracks objects based on those features
Output: A decision (e.g. “face detected”) or an action (e.g. unlocking a phone)
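
The four stages above can be sketched end to end in a few lines of plain Python. This is an illustration under stated assumptions, not a real vision system: the feature is a simple difference between neighbouring pixels, the "model" is a hand-written rule, and the names `extract_edges`, `understand`, `run_pipeline`, `min_strength` and `min_count` are hypothetical.

```python
def extract_edges(image):
    """Processing: absolute difference between horizontally adjacent pixels."""
    return [
        [abs(row[c + 1] - row[c]) for c in range(len(row) - 1)]
        for row in image
    ]


def understand(edge_map, min_strength=4, min_count=3):
    """Understanding: a hand-written rule standing in for a trained model.

    Reports True when at least `min_count` edge values reach `min_strength`.
    Both thresholds are arbitrary values chosen for this toy example.
    """
    strong = sum(1 for row in edge_map for value in row if value >= min_strength)
    return strong >= min_count


def run_pipeline(image):
    edges = extract_edges(image)   # Processing
    found = understand(edges)      # Understanding
    return "edge detected" if found else "no edge"  # Output


# Input: a 4x4 frame that is dark on the left and bright on the right.
frame = [
    [1, 1, 8, 8],
    [1, 1, 8, 8],
    [1, 1, 8, 8],
    [1, 1, 8, 8],
]
# Input: a uniform frame with no change in brightness anywhere.
flat = [[3] * 4 for _ in range(4)]

print(run_pipeline(frame))
print(run_pipeline(flat))
```

A production system would typically replace `extract_edges` with a library routine or learned features, and `understand` with a trained classifier, but the flow from input to output stays the same.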