Computer vision is the area of technology that allows a computer to see and recognize the objects around it. Just as humans learn to recognize an object through experience, computer vision models can be trained to do the same. For example, when we see a photo of our mother, a brief glance is enough to know it is her. This works because we already know her face so well that, from just a little input, our brain can fill in the complete picture.
Computer vision allows computers to identify and process objects in much the same way humans do. The example we encounter most often today is scanning a QR code with a smartphone to make a payment. When the QR code is detected by a computer (in this case, a smartphone), the decoded data triggers a specific action: a payment order, unlocking something, or another activity.
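To make this concrete, here is a minimal sketch of that detection step using OpenCV's built-in QRCodeDetector in Python. The image file name is a placeholder, and what an application does with the decoded payload depends entirely on its purpose:

```python
# A minimal sketch of QR-code detection with OpenCV's QRCodeDetector.
# The file name "photo_of_code.png" is a placeholder.
import cv2

img = cv2.imread("photo_of_code.png")
detector = cv2.QRCodeDetector()

# detectAndDecode returns the decoded text, the corner points of the
# code within the image, and a rectified binary image of the code.
data, points, _ = detector.detectAndDecode(img)

if points is not None and data:
    # The decoded payload is what triggers the action, e.g. a payment URL.
    print("Decoded QR payload:", data)
else:
    print("No QR code found in the image.")
```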
To make it easier to understand how computer vision works, think of assembling a jigsaw puzzle. We have pieces of an image scattered about, each with a different shape and different edges, and we have to fit them together into a unified whole. When we assemble a puzzle, our brain matches edge to edge, pairing pieces that fit so the final image comes together correctly. This is essentially what computer vision does when it interprets an object and assigns it a label.
In computer vision, this task is carried out by a neural network that separates the different parts of the image, identifies the edges, and then combines these pieces step by step into a representation of the complete image. The computer is never handed the "big picture" up front; instead, it builds one from these parts and labels the image on its own (e.g., cat, face, or paper).
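To illustrate the edge-identification step, here is a minimal sketch that convolves an image with Sobel kernels, a classic hand-designed edge detector. The file name is a placeholder, and this shows only the very first stage of what a full network does:

```python
# A minimal sketch of the "identify the edges" step: convolving an
# image with Sobel kernels. Requires numpy, scipy, and imageio;
# "photo.png" is a placeholder file name.
import numpy as np
import imageio.v3 as iio
from scipy.ndimage import convolve

img = iio.imread("photo.png").astype(float)
if img.ndim == 3:
    img = img.mean(axis=2)  # collapse RGB to grayscale intensities

# Sobel kernels respond to horizontal and vertical intensity changes.
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
sobel_y = sobel_x.T

gx = convolve(img, sobel_x)  # horizontal gradients
gy = convolve(img, sobel_y)  # vertical gradients

# The gradient magnitude is large where neighboring pixels differ
# sharply, i.e. along the edges of objects.
edges = np.hypot(gx, gy)
print("strongest edge response:", edges.max())
```

A neural network for vision works on the same principle, except that it learns its own filters from data instead of using fixed kernels like Sobel's.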
So when we feed the computer millions of images of cats, those images are passed through learning algorithms that analyze the colors in the photos, the shapes, the distances between shapes, where one object borders another, and so on, until the model has identified the profile of features needed to label an object a "cat".
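The passage does not name a specific model, but this kind of feature learning is usually done with a convolutional neural network. Below is a minimal sketch of one in PyTorch; the layer sizes, the 64x64 input, and the two-class "cat vs. not cat" setup are illustrative assumptions, not a prescribed architecture:

```python
# A minimal sketch of a convolutional network that could learn
# "cat vs. not cat" from labeled images. All sizes here are
# illustrative assumptions.
import torch
import torch.nn as nn

class TinyCatNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Convolutions learn local patterns: edges in the first
        # layer, then combinations of edges (textures, shapes)
        # in deeper layers.
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),   # 64x64 -> 32x32
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),   # 32x32 -> 16x16
        )
        # The classifier turns the learned feature maps into a
        # score for each label: "cat" or "not cat".
        self.classifier = nn.Linear(32 * 16 * 16, 2)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

model = TinyCatNet()
fake_batch = torch.randn(4, 3, 64, 64)  # 4 random 64x64 RGB "images"
print(model(fake_batch).shape)          # torch.Size([4, 2])
```

Training a network like this on millions of labeled cat photos is what gradually tunes its filters until the "cat" profile described above emerges.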