A crop of new papers submitted to ICLR 2015 by various combinations of my co-authors from Facebook AI Research and NYU.
"Fast Convolutional Nets With fbfft: A GPU Performance Evaluation" by Nicolas Vasilache, Jeff Johnson, Michael Mathieu, Soumith Chintala, Serkan Piantino, Yann LeCun: two FFT-based implementations of convolutional layers on GPU. The first one is built around NVIDIA's cuFFT, and the second one around custom FFT code called fbfft. This follows our ICLR 2014 paper "Fast Training of Convolutional Networks through FFTs" by Michael Mathieu, Mikael Henaff, Yann LeCun (http://arxiv.org/abs/1312.5851). http://arxiv.org/abs/1412.7580
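The speed-up in both implementations rests on the convolution theorem: convolution in the spatial domain becomes pointwise multiplication in the Fourier domain. Here is a minimal NumPy sketch of that identity for circular 2D convolution; the function names are mine, and the paper's CUDA kernels of course handle batching, padding, and precision far more carefully than this toy.

```python
import numpy as np

def fft_conv2d(image, kernel):
    """Circular 2D convolution via FFT: pointwise product of spectra."""
    H, W = image.shape
    kh, kw = kernel.shape
    # Zero-pad the kernel to the image size, then multiply in Fourier domain.
    k_padded = np.zeros((H, W))
    k_padded[:kh, :kw] = kernel
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(k_padded)))

def direct_circular_conv2d(image, kernel):
    """Reference direct circular convolution, O(H*W*kh*kw)."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            for u in range(kh):
                for v in range(kw):
                    out[(i + u) % H, (j + v) % W] += image[i, j] * kernel[u, v]
    return out

rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8))
ker = rng.standard_normal((3, 3))
# The two results agree up to floating-point error.
assert np.allclose(fft_conv2d(img, ker), direct_circular_conv2d(img, ker))
```

The FFT route costs O(n² log n) per channel regardless of kernel size, which is why it pays off for the larger kernels benchmarked in the paper.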
"Deep learning with Elastic Averaging SGD" by Sixin Zhang, Anna Choromanska, Yann LeCun: a way to distribute the training of deep nets over multiple GPUs by linking the parameter vectors of the workers with "elastics". http://arxiv.org/abs/1412.6651
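A minimal single-process sketch of the elastic idea, assuming a synchronous variant (the paper's workers run asynchronously across GPUs, and the names and hyperparameters below are mine): each worker's parameters are pulled toward a shared center variable by an elastic term, and the center is pulled back toward the workers.

```python
import numpy as np

def easgd_round(workers, center, grads, eta=0.1, rho=0.5):
    """One synchronous elastic-averaging round; returns new (workers, center).

    workers: (p, d) array, one parameter vector per worker.
    center:  (d,) shared center variable.
    grads:   (p, d) per-worker gradients at the current parameters.
    """
    elastic = rho * (workers - center)              # "elastic" pull toward center
    new_workers = workers - eta * (grads + elastic)
    new_center = center + eta * rho * np.sum(workers - center, axis=0)
    return new_workers, new_center

# Toy quadratic objective f(x) = 0.5 * ||x||^2, so each worker's grad is x.
rng = np.random.default_rng(1)
workers = rng.standard_normal((4, 3))   # 4 workers, 3 parameters each
center = np.zeros(3)
for _ in range(200):
    workers, center = easgd_round(workers, center, grads=workers)
print(np.linalg.norm(center))  # center converges toward the optimum at 0
```

The elastic term lets workers explore away from the center while still pooling their progress, which is the point of the method.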
"Audio Source Separation with Discriminative Scattering Networks" by Pablo Sprechmann, Joan Bruna, Yann LeCun: audio source separation using convnets operating on coefficients of a scattering transform of the signal. http://arxiv.org/abs/1412.7022
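To give a feel for why scattering coefficients make a convenient front-end: a first-order scattering coefficient is the averaged modulus of a band-pass filter response, which makes it stable to shifts of the input. A toy 1D sketch, using simple Gaussian band-pass windows as an illustrative stand-in for the paper's actual wavelet filter bank (function name and filter choices are mine):

```python
import numpy as np

def scattering_order1(x, centers=(4, 8, 16)):
    """Toy first-order scattering: band-pass filter, modulus, global average."""
    n = len(x)
    X = np.fft.fft(x)
    freqs = np.fft.fftfreq(n) * n          # integer frequency bins
    coeffs = []
    for c in centers:
        # Gaussian band-pass window centered at frequency c.
        window = np.exp(-0.5 * ((freqs - c) / (c / 2)) ** 2)
        band = np.fft.ifft(X * window)     # complex filter response
        coeffs.append(np.mean(np.abs(band)))  # modulus + averaging
    return np.array(coeffs)

rng = np.random.default_rng(3)
x = rng.standard_normal(128)
shifted = np.roll(x, 5)
diff = np.max(np.abs(scattering_order1(x) - scattering_order1(shifted)))
print(diff)  # ~0: the averaged modulus is invariant to circular shifts
```

The real transform cascades this wavelet-modulus operation and keeps localized (rather than global) averages, but the stability property sketched here is the reason it is a useful representation for the separation network to operate on.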
"Unsupervised Learning of Spatiotemporally Coherent Metrics" by Ross Goroshin, Joan Bruna, Jonathan Tompson, David Eigen, Yann LeCun: training sparse convolutional auto-encoders so that the pooled features change minimally between successive video frames. Beautiful filters come out of it. http://arxiv.org/abs/1412.6056
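The temporal-coherence ("slowness") part of the objective can be sketched in a few lines. This is my reading of the setup, not the authors' code: the group size, the L2-pooling choice, and the function names are illustrative.

```python
import numpy as np

def l2_pool(z, group_size=2):
    """L2 pooling over disjoint groups of feature channels."""
    return np.sqrt(np.sum(z.reshape(-1, group_size) ** 2, axis=1))

def slowness_penalty(z_t, z_t1, group_size=2):
    """L1 distance between pooled codes of two successive frames."""
    return np.sum(np.abs(l2_pool(z_t, group_size) - l2_pool(z_t1, group_size)))

# Two codes that differ only by a rotation inside each pooled pair incur
# zero penalty: the pooled magnitudes match even though the raw codes differ,
# so the encoder is free to represent within-group variation.
z_t  = np.array([1.0, 0.0, 0.0, 2.0])
z_t1 = np.array([0.0, 1.0, 2.0, 0.0])
print(slowness_penalty(z_t, z_t1))  # 0.0
```

Penalizing change in the *pooled* features, rather than in the raw code, is what pushes the learned groups toward locally invariant, temporally stable directions.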
"Explorations on high dimensional landscapes" by Levent Sagun, V. Ugur Guney, Yann LeCun: another take on the application of random matrix theory to the geometry of the loss surface in deep nets. We use a "teacher network" and a "student network" scenario to see if SGD manages to find a zero-energy global minimum that we know exists. Bottom line: SGD can't find it, but it doesn't matter because the (local) minima that it finds are equally good as far as test error is concerned. http://arxiv.org/abs/1412.6615
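The teacher/student construction is simple to sketch: targets come from a fixed random "teacher" network, and a student with the identical architecture is trained on them, so a zero-loss global minimum is known to exist by construction (the teacher's own weights). The network sizes, step size, and step count below are arbitrary toy choices of mine.

```python
import numpy as np

def forward(x, w1, w2):
    """One-hidden-layer tanh network; returns hidden layer and output."""
    h = np.tanh(x @ w1)
    return h, h @ w2

rng = np.random.default_rng(2)

# Teacher defines the data; its weights are a zero-energy global minimum.
w1_t, w2_t = rng.standard_normal((5, 8)), rng.standard_normal((8, 1))
x = rng.standard_normal((256, 5))
_, y = forward(x, w1_t, w2_t)

# Student starts elsewhere and runs plain full-batch gradient descent
# on the squared error.
w1, w2 = rng.standard_normal((5, 8)), rng.standard_normal((8, 1))
for _ in range(500):
    h, pred = forward(x, w1, w2)
    e = (pred - y) / len(x)
    w2 -= 0.5 * (h.T @ e)                           # grad w.r.t. output weights
    w1 -= 0.5 * (x.T @ ((e @ w2.T) * (1 - h**2)))   # grad w.r.t. hidden weights

final_loss = np.mean((forward(x, w1, w2)[1] - y) ** 2)
print(final_loss)  # low, but typically not the exact zero the teacher attains
```

The paper's question is precisely whether descent from a random start reaches the known zero, and the finding is that it lands in other minima of essentially equal test error.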