3D Nearest-Neighbor Geometry Matching: Detailed 3D models of a scene from a single image
People are remarkably good at inferring the geometry of a physical space from a single photograph, or at recognizing that two dissimilar photographs are merely images of the same room taken from different physical perspectives. While these may be simple and natural tasks for us, building systems that can understand images as accurately as humans is one of the main goals in the field of Computer Vision.
Although many current techniques for scene understanding, such as 2D Nearest Neighbor Search (http://goo.gl/Z4hZb), show impressive results, they are limited in that they are unable to generalize a scene to the arbitrary geometric perspective from which a photo was taken. Furthermore, they require large amounts of manually annotated data and precise geometry estimates to match a single image to an accurate model.
In 3DNN: Viewpoint Invariant 3D Geometry Matching for Scene Understanding, Google Software Engineer +Scott Satkin and Carnegie Mellon School of Computer Science Professor Martial Hebert introduce the 3D Nearest-Neighbor (3DNN) algorithm, a viewpoint-invariant scene matching approach that produces detailed 3D models of a scene from a single image.
Decoupling viewpoint and geometry, the algorithm first estimates the viewpoint from which an image was captured, then searches for a 3D model that closely matches the scene from that viewpoint. The matched 3D model then undergoes geometry refinement, in which object locations in the model are adjusted so that they more precisely match the image.
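The decoupled pipeline can be sketched roughly as follows. This is an illustrative toy, not the paper's implementation: the feature "renderer", the similarity measure, and the greedy refinement step are all simplified stand-ins for the actual rendered-image comparison and optimization used in 3DNN.

```python
import math

# A "model" is a dict of object name -> (x, y) floor position; a viewpoint
# is reduced to a single yaw angle. All of this is a hypothetical sketch.

def render_features(model, viewpoint):
    """Stand-in renderer: project each object position into the camera
    frame given by the viewpoint's yaw (a toy 1D projection)."""
    yaw = viewpoint["yaw"]
    return [model[name][0] * math.cos(yaw) + model[name][1] * math.sin(yaw)
            for name in sorted(model)]

def similarity(f1, f2):
    """Negative sum of absolute feature differences (higher is better)."""
    return -sum(abs(a - b) for a, b in zip(f1, f2))

def match_3dnn(image_feats, viewpoint, model_library):
    """Step 2: with the viewpoint already estimated (step 1), return the
    library model whose rendering best matches the image features."""
    return max(model_library,
               key=lambda m: similarity(render_features(m, viewpoint),
                                        image_feats))

def refine_geometry(model, viewpoint, image_feats, step=0.1, iters=20):
    """Step 3: greedy local search that nudges each object's position
    while the rendered-vs-image similarity keeps improving."""
    model = dict(model)
    for _ in range(iters):
        improved = False
        for name, (x, y) in model.items():
            best = similarity(render_features(model, viewpoint), image_feats)
            for dx, dy in [(step, 0), (-step, 0), (0, step), (0, -step)]:
                model[name] = (x + dx, y + dy)
                s = similarity(render_features(model, viewpoint), image_feats)
                if s > best:
                    best, x, y, improved = s, x + dx, y + dy, True
            model[name] = (x, y)
        if not improved:
            break
    return model
```

Because viewpoint and geometry are handled separately, the same library of 3D models can be matched against photos taken from any perspective, which is the key difference from purely 2D nearest-neighbor matching.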
The ability of a system to infer the geometry of a scene from a single image enables many applications: realistically rendering additional objects into scenes, interactive 3D image editing tools, individual object detection and localization, and more. To learn more, read the full paper at http://goo.gl/TSz6a3