The results of the 2013 ImageNet Large Scale Visual Recognition Challenge are out!

The NYU teams did quite well (yes, there are several NYU teams).

The competition had three components:
- Classification: ImageNet dataset with 1000 categories
- Classification+Localization: same, but one must provide a bounding box
- Detection: 200 categories, possibly several objects per image

On classification, Matt Zeiler's "Clarifai" system won with less than 12% error (top 5), followed by NUS, Andrew Howard, the Zeiler-Fergus team (NYU), the OverFeat team (NYU), and the University of Amsterdam-Euvision team. Other teams were above 15% error.
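For readers unfamiliar with the "top 5" metric quoted above: an image counts as correct if its true label is among the five classes the system scores highest. Here is a minimal sketch of that computation in plain Python. The function name `top_k_error` and the toy scores are made up for illustration; this is not the official evaluation code.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of images whose true label is NOT among the k highest-scoring classes.

    scores: list of per-image lists, one score per class.
    labels: list of true class indices, one per image.
    """
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by decreasing score, truncated to the top k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)


# Toy example with 6 classes (hypothetical numbers, not competition data).
toy_scores = [
    [0.30, 0.25, 0.20, 0.15, 0.07, 0.03],  # true class 5 is ranked last: a miss
    [0.30, 0.25, 0.20, 0.15, 0.07, 0.03],  # true class 4 is ranked fifth: a hit
]
toy_labels = [5, 4]

error = top_k_error(toy_scores, toy_labels)
```

With `k=1` the same function gives the ordinary top-1 error, which is always at least as large as the top-5 figure.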

Matt Zeiler finished his PhD with +Rob Fergus at NYU a few weeks ago. Matt has been working on ImageNet classification for about a year, and had a chance to fine-tune his system. The ZF system is what he built with Rob, while the Clarifai system is what he has built since he graduated.

The OverFeat team (composed of +Pierre Sermanet, David Eigen, Michael Mathieu, +Xiang Zhang, +Rob Fergus, and myself) has had less time to tune its classification entry, and concentrated on the classification+localization competition. We won that one handily, although there was only one other entry (a mighty one: VGG/Oxford).

Our OverFeat entries are fairly standard convolutional networks (surprise!) with a few training tricks. Our best entry (14.1% error) is a committee of 7 convnets, while our second entry (15.6%) is a single convnet. Unlike many of the other teams, we use our own GPU implementation interfaced to the Torch7 package (http://torch.ch) and we do not use Alex Krizhevsky's GPU code.
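To give a rough idea of what a "committee" of networks means: each network produces class probabilities for an image, and the committee combines them into one prediction. The sketch below assumes the simplest combination rule, a plain average of the members' probabilities; that rule is an assumption for illustration and not necessarily what the OverFeat entry does. The name `committee_predict` and the numbers are hypothetical.

```python
def committee_predict(member_probs):
    """Average class probabilities over committee members.

    member_probs: list with one entry per model, each a list of
    per-class probabilities for the same image.
    Returns (averaged probabilities, index of the winning class).
    """
    n_models = len(member_probs)
    n_classes = len(member_probs[0])
    # Per-class mean across all members (assumed combination rule).
    avg = [sum(p[c] for p in member_probs) / n_models for c in range(n_classes)]
    best = max(range(n_classes), key=lambda c: avg[c])
    return avg, best


# Three hypothetical members voting on a 3-class problem.
members = [
    [0.6, 0.3, 0.1],  # this member prefers class 0
    [0.2, 0.5, 0.3],  # this member prefers class 1
    [0.1, 0.6, 0.3],  # this member prefers class 1
]
avg_probs, winner = committee_predict(members)
```

In this toy case the committee picks class 1 even though one member prefers class 0, which is the kind of error-cancelling effect that makes a committee more accurate than a single net.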

Our OverFeat team also participated in the detection challenge. We did pretty well, but our numbers were still increasing quite rapidly by the time of the deadline, and we think we can do much better. The detection challenge was won by U of Amsterdam-Euvision, with NEC and OverFeat close behind. The other teams are quite a way behind. Our system was pre-trained on ImageNet-1K before being fine-tuned on the 200-category detection dataset (which is why it appears on a dark grey background in the table of results).

#ImageNet   #convnet   #Torch7   #computervision   #machinelearning   #deeplearning  