I Crawl, I See, I Learn: Teaching Computers to Think
Human beings are remarkably good at learning. Every day, our brains process visual data from the world around us, learning common-sense relationships and drawing inferences about things we may have never seen before. For example, when faced with an image of a strange creature, we infer that it is an insect, despite never having seen it before, based on its physical characteristics and the context in which the image was taken.
But is it possible for a computer to learn common sense from visual data in a similar fashion, just by browsing images found on the internet? Researchers at the Robotics Institute (http://goo.gl/se0wTo) and Language Technologies Institute (http://goo.gl/9aPxwH) of Carnegie Mellon University (CMU) believe so.
Assistant Research Professor Abhinav Gupta (http://goo.gl/fG0Rbm) aims to build the world’s largest visual knowledge base with the Never Ending Image Learner (NEIL) program.
Gupta, working alongside PhD student and former Google intern Abhinav Shrivastava (http://goo.gl/og37Ho) and PhD student Xinlei Chen (http://goo.gl/tQ8sCv), has developed NEIL to automatically extract visual knowledge from the Internet, enabling it to learn common sense relationships between objects and categories. For example:

- Deer can be a kind of / look similar to Antelope
- Car can have a part Wheels
- Sunflower is/has Yellow
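Relationships like those above are naturally expressed as subject-predicate-object triples. The sketch below shows one minimal way to store and query such triples; the function names and data structures are illustrative assumptions, not NEIL's actual implementation:

```python
# Illustrative sketch: storing learned relationships as
# (subject, predicate, object) triples. Names are hypothetical.

knowledge_base = set()

def add_relation(subject, predicate, obj):
    """Record one learned relationship as a triple."""
    knowledge_base.add((subject, predicate, obj))

def relations_for(subject):
    """Return every (predicate, object) pair known for a concept."""
    return [(p, o) for (s, p, o) in knowledge_base if s == subject]

# The example relationships from the article:
add_relation("Deer", "looks similar to", "Antelope")
add_relation("Car", "has part", "Wheels")
add_relation("Sunflower", "has attribute", "Yellow")

print(relations_for("Car"))  # [('has part', 'Wheels')]
```

A set of triples like this is the simplest possible knowledge graph: new relationships can be added as they are discovered, and queries over the graph let later learning stages reuse what earlier stages found.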
Recipient of a Google Focused Research Award (http://goo.gl/hn59r) and #9 in CNN’s top 10 ideas of 2013 (http://goo.gl/cFXK1F), NEIL runs 24 hours a day, 7 days a week, using a small amount of human-labeled visual data in conjunction with a large amount of unlabeled data to iteratively learn reliable and robust visual models. NEIL is then able to develop associations between different concepts, using these learned relationships to further improve the visual models themselves. This is in contrast to traditional machine learning approaches, in which concepts are isolated and recognized independently.
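The iterative loop described above resembles semi-supervised self-training: start from a few labeled examples, label only the unlabeled examples the current model is confident about, retrain on the grown set, and repeat. The following is a minimal sketch of that general pattern using a toy nearest-centroid classifier on 1-D features; it is an assumption-laden illustration of the idea, not NEIL's actual algorithm:

```python
# Minimal self-training sketch (illustrative, not NEIL's algorithm):
# repeatedly label the most confident unlabeled points and retrain.

def centroid(points):
    return sum(points) / len(points)

def self_train(labeled, unlabeled, rounds=3, margin=1.0):
    """labeled: dict mapping label -> list of 1-D feature values.
    unlabeled: list of 1-D feature values to be absorbed."""
    pool = list(unlabeled)
    for _ in range(rounds):
        # "Retrain": recompute each class centroid from current labels.
        centroids = {lab: centroid(pts) for lab, pts in labeled.items()}
        still_unlabeled = []
        for x in pool:
            # Distance from x to each class centroid, closest first.
            dists = sorted((abs(x - c), lab) for lab, c in centroids.items())
            best, runner_up = dists[0], dists[1]
            # Accept only confident points: a clear winner by `margin`.
            if runner_up[0] - best[0] >= margin:
                labeled[best[1]].append(x)
            else:
                still_unlabeled.append(x)
        pool = still_unlabeled
    return labeled, pool

seed = {"cat": [0.0, 1.0], "dog": [10.0, 11.0]}
grown, leftover = self_train(seed, [0.5, 9.5, 5.2])
```

The confidence margin is what keeps the loop from polluting its own training set: ambiguous points (here, 5.2) stay unlabeled rather than being forced into a class, mirroring the emphasis on learning reliable models before expanding them.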
To date, NEIL has learned to identify 1,500 objects and 1,200 scenes, and has made 2,500 associations between scenes and objects, from the millions of images found on the Internet. It is the hope of Gupta et al. that NEIL will learn new relationships between different concepts without the need for human-labeled data, and in the process develop the common sense needed for better perception, reasoning, and decision making.
To learn more details about NEIL and see what it has learned, visit the program page linked below.