If you consider a child's eyes as a pair of biological cameras, they take one picture about every 200 milliseconds, the average time it takes to make an eye movement. So by age three, a child would have seen hundreds of millions of pictures of the real world. That's a lot of training examples.
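The "hundreds of millions" figure can be checked with quick arithmetic. The sketch below is only a rough estimate: the function name and the `waking_fraction` parameter are assumptions added for illustration, since the talk gives only the 200-millisecond interval and the age of three.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # ignores leap years


def estimated_pictures(age_years, ms_per_picture=200, waking_fraction=1.0):
    """Rough count of 'pictures' seen by a given age.

    ms_per_picture comes from the talk (one picture about every 200 ms).
    waking_fraction is an assumption, not from the talk: the share of
    time the eyes are open (1.0 means no sleep at all).
    """
    viewing_ms = age_years * SECONDS_PER_YEAR * 1000 * waking_fraction
    return int(viewing_ms // ms_per_picture)


# Upper bound: eyes open around the clock for three years.
print(estimated_pictures(3))                       # 473040000

# Assumed half of each day awake.
print(estimated_pictures(3, waking_fraction=0.5))  # 236520000
```

Either way the count lands in the hundreds of millions, so the claim holds across a wide range of assumptions about sleep.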
Using big data to train algorithms may seem obvious now, but that was not the case in 2007.