Computer vision is the science of perceiving and understanding the world around you through images and videos. Throughout this tutorial, we'll be learning about a variety of computer vision applications. We'll also dive into the details of image analysis techniques, programming and computer vision tools and libraries (such as OpenCV), and we'll learn more about the math behind certain applications.
Computer vision is used in AI systems to visually perceive the world by gathering images, analyzing data and eventually responding to it.
The structure of animal and insect eyes differs based on how their vision systems have evolved and adapted to their environment and behavior; vision systems change to help a creature fulfill the tasks it needs to survive.
Bees and many other insects, for instance, have compound eyes that consist of multiple lenses (as many as 30,000 lenses in a single compound eye), all of which are low resolution. As a result, they are not very good at recognizing things from a distance, but they are very sensitive to motion, which is essential when flying fast. In essence, there is a variety of visual systems, and each is suited to its own task.
If you’d like to learn more about compound eyes, take a look at this reference.
A close up of the compound eyes of a bee; to the right are many tightly-spaced lenses that form a single compound eye.
The basis of any AI system is that it can: 1) perceive its environment and 2) take actions, based on those perceptions. And computer vision is used to perceive and construct a physical model of the world, so that an AI system can then take the appropriate action.
It's important to note that vision is only one aspect of perception. Just think of how you observe the world: through sight, but also through smell, sound, and many other "sensors" that humans have. It's the same with AI systems; computer vision is just one - visual - way to perceive physical surroundings.
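The perceive-then-act loop described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the names `Perception`, `perceive`, and `act` are hypothetical, the "image" is a small grid of grayscale values, and a fixed brightness threshold stands in for the trained detector a real system would use.

```python
from dataclasses import dataclass


@dataclass
class Perception:
    """Hypothetical summary of what the vision step found in one frame."""
    obstacle_ahead: bool


def perceive(frame):
    """Toy perception step: treat any pixel brighter than 200 as an obstacle.

    `frame` is a list of rows of grayscale values (0-255). The threshold is
    an assumption for illustration, not a real detection method.
    """
    found = any(pixel > 200 for row in frame for pixel in row)
    return Perception(obstacle_ahead=found)


def act(perception):
    """Toy decision step: choose an action based on the perceived state."""
    return "brake" if perception.obstacle_ahead else "continue"


clear_frame = [[10, 20], [30, 40]]
blocked_frame = [[10, 255], [30, 40]]

print(act(perceive(clear_frame)))    # continue
print(act(perceive(blocked_frame)))  # brake
```

Keeping perception and action in separate functions mirrors the two-part description: the vision step could be swapped for another sensor's processing without changing how the action is chosen.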
In the case of self-driving cars, computer vision is used to analyze the visual input from cameras mounted on the car (computer vision is not used to analyze data from other sensors such as radar and LiDAR, which use radio waves and lasers, respectively). Computer vision is used to look at image and video data to intelligently detect lane markings, vehicles, pedestrians, and other elements in the environment, in order to navigate safely!
In general, computer vision is used in many applications to recognize objects and their behavior. Here are some examples:
Computer vision is used for vehicle and pedestrian recognition and tracking (to see their speed and predict movement).
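To make "see their speed and predict movement" concrete, here is a minimal sketch of what can be computed once an object has already been detected in consecutive frames. It assumes the detector supplies one pixel position per frame, that the scale `meters_per_pixel` is constant (which ignores perspective), and that motion continues at constant velocity. The function names and sample numbers are hypothetical.

```python
import math


def estimate_speed(positions, fps, meters_per_pixel):
    """Average speed in m/s of one tracked object.

    positions: list of (x, y) pixel centroids, one per consecutive frame.
    fps: frames per second of the video.
    meters_per_pixel: assumed constant scale between pixels and meters.
    """
    if len(positions) < 2:
        raise ValueError("need at least two positions to estimate speed")
    # Sum the distance travelled between each pair of consecutive frames.
    pixels = sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))
    seconds = (len(positions) - 1) / fps
    return pixels * meters_per_pixel / seconds


def predict_next(positions):
    """Constant-velocity prediction: repeat the last frame-to-frame step."""
    (x0, y0), (x1, y1) = positions[-2], positions[-1]
    return (2 * x1 - x0, 2 * y1 - y0)


# Made-up centroids of one pedestrian across three frames.
track = [(100, 50), (103, 54), (106, 58)]

print(estimate_speed(track, fps=30, meters_per_pixel=0.05))  # about 7.5
print(predict_next(track))  # (109, 62)
```

Real trackers typically smooth noisy detections and account for camera geometry; this sketch only shows the underlying arithmetic.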
An AI system, using computer vision, can learn to recognize images of cancerous tissue and help with early detection and diagnosis.
Brain MRI, in which a tumor is recognized and colorized in an image
Computer vision can be trained to recognize and tag (or label) faces or different features in any given photo library. This is already a feature that many of our phones have!
Face recognition with labels for emotions
As a complement to image recognition, computer vision is also used to retrieve relevant images based on some search text or a given label; this is called image retrieval. For example, searching the term "sunflower" in Google Images should return relevant images of sunflowers!
Searching for images of a "sunflower"
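As a rough sketch of label-based image retrieval, suppose a recognition step has already tagged each photo with a set of labels. Retrieval then reduces to looking up the search term among those labels. The file names, labels, and the `retrieve` function below are all made up for illustration; a real search engine ranks results with far more signals than an exact label match.

```python
def retrieve(index, query):
    """Return the file names whose label set contains the query term."""
    term = query.strip().lower()
    return sorted(name for name, labels in index.items() if term in labels)


# Hypothetical output of an image-recognition step: file name -> labels.
photo_labels = {
    "img_001.jpg": {"sunflower", "field"},
    "img_002.jpg": {"dog", "park"},
    "img_003.jpg": {"sunflower", "vase"},
}

print(retrieve(photo_labels, "Sunflower"))  # ['img_001.jpg', 'img_003.jpg']
```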
Using AI techniques and computer vision, a system can recognize behavior in images and caption it correctly.
Automatically captioned image; caption reads: "a man is eating a hot dog in a crowd"