Yann LeCun
13,129 followers

Yann's posts

Post has attachment
The slides of my CVPR 2015 keynote talk are available here: https://drive.google.com/open?id=0BxKBnD5y2M8NVHRiVXBnOVpiYUk&authuser=0

The last time I gave a keynote at CVPR was in 2000. I talked about many of the same topics as this year: ConvNets for recognition and detection, and structured prediction (graph transformer networks). For those interested in archaeology, the slides of my CVPR 2000 keynote talk are here: https://drive.google.com/open?id=0BxKBnD5y2M8NOVl0azMzOVd6dUU&authuser=0

Post has shared content
The recent surge of interest in convolutional nets in computer vision has brought many new applications and many new ideas.

But some young researchers have been working with ConvNets for years. Many of them are my former students. They suffered through the dark years when getting a ConvNet paper accepted at a computer vision conference was very difficult. A lot of their work is pre-2012, pre-AlexNet, pre-ImageNet.

Many ideas now being "rediscovered" are things they take for granted. They naturally feel pretty disappointed when an idea they thought of as obvious is reinvented, renamed, presented as an innovation, or given an award. They are entitled to feel annoyed when it is an idea they published 5 or 10 years ago, or one they couldn't get published as recently as 3 years ago.

I have zero reason to be annoyed myself (I'm doing fine, thank you), but I can't blame them for their feelings.
There is a trend at CVPR 2015 of renaming existing deep learning techniques following minor modifications.

For example, "hypercolumns" have been commonly known as "skip connections" in the deep learning community for a long time. The authors justify this renaming in that they maintain resolution of the skip connections as opposed to subsampling in previous work.

Similarly, the name "fully convolutional training" makes it sound like an innovation, whereas, as Yann LeCun puts it, "there is no such thing as a fully-connected layer, there are only 1x1 convolutions". Not only is every convolutional network fully convolutional by nature, but doing backprop from multiple outputs has been done since the 90s, and more recently in OverFeat and in Tompson's work. Hence the name "fully convolutional training" is not very accurate; it is really just spatial backpropagation, and it is not novel.
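
A tiny numpy check of that remark (sizes are illustrative, not taken from any of the papers): the same fully-connected weights, applied at every position of a larger feature map, act as a 1x1 convolution, which is what turns a classifier into a sliding-window predictor.

import numpy as np

c_in, c_out = 64, 10
W = np.random.randn(c_out, c_in)            # "fully-connected" weights over c_in features
feat = np.random.randn(c_in, 8, 8)          # a feature map larger than the 1x1 training size

# "1x1 convolution": apply W independently at each of the 8x8 spatial positions.
out = np.einsum('oc,chw->ohw', W, feat)     # shape (c_out, 8, 8)

# At any single position this is exactly the fully-connected layer's output.
assert np.allclose(out[:, 3, 5], W @ feat[:, 3, 5])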

These papers present good work and good results, are definitely worth reading, and properly cite the previous literature. However, renaming existing techniques after slight modifications and presenting them as novel is not appropriate. It adds noise to the naming conventions and undermines previous work. Interestingly, this trend comes from groups that are fairly new to the field.

I understand that, given the current deluge of deep learning work and the accelerating pace of research, a catchy name can be helpful. However, the community should not tolerate names that overlap this much with existing techniques. It is also damaging when such names are introduced at a conference as large as CVPR, where many people are still relatively new to deep learning.

Post has attachment
Today, we are announcing the creation of a new branch of Facebook AI Research in Paris.

The news was picked up by the press on both sides of the Atlantic Ocean.

The journal Nature just published a review paper on deep learning co-authored by myself, Yoshua Bengio, and Geoff Hinton.

It is part of a "Nature Insight" supplement on Machine Intelligence, with two other articles on machine learning by Zoubin Ghahramani (probabilistic machines) and Michael Littman (reinforcement learning).

Paper: http://www.nature.com/nature/journal/v521/n7553/full/nature14539.html

Insight supplement on AI: http://www.nature.com/nature/journal/v521/n7553/index.html#insight

Post has shared content
A nice and largely accurate article in The Chronicle of Higher Education about the history of neural nets and deep learning, with quotes from +Geoffrey Hinton, +Terrence Sejnowski, +Yoshua Bengio, and yours truly.

http://chronicle.com/article/The-Believers/190147/

Post has shared content
Facebook AI Research is open sourcing fbcunn, FAIR's deep learning package for the Torch7 development environment.

This package provides a number of classes and tools for training convolutional nets and other deep learning models. Our library uses super-fast FFT-based convolutions running on NVIDIA GPUs. The package allows training on multiple GPUs. A technical paper gives all the details.
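
The idea behind the FFT-based convolutions is the convolution theorem: pointwise multiplication in the frequency domain. A minimal single-channel numpy sketch (illustrative only; the actual fbcunn kernels are batched GPU code and use cross-correlation, see the technical paper):

import numpy as np

def fft_conv2d(x, k):
    # Zero-pad both arrays to the full linear-convolution size, multiply
    # their 2D FFTs, and transform back.
    out_shape = (x.shape[0] + k.shape[0] - 1, x.shape[1] + k.shape[1] - 1)
    X = np.fft.rfft2(x, out_shape)
    K = np.fft.rfft2(k, out_shape)
    return np.fft.irfft2(X * K, out_shape)

x = np.random.randn(32, 32)
k = np.random.randn(5, 5)
y = fft_conv2d(x, k)        # "full" convolution, shape (36, 36)
valid = y[4:-4, 4:-4]       # crop to the "valid" region, shape (28, 28)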

A complete script to train a convolutional net on the ImageNet dataset is provided.

The FAIR blog post announcing the release is here: https://research.facebook.com/blog/879898285375829/fair-open-sources-deep-learning-modules-for-torch/

- Torch website: http://torch.ch/
- fbcunn Github repo: https://github.com/facebook/fbcunn
- ImageNet Training Script: https://github.com/facebook/fbcunn/tree/master/examples/imagenet
- Technical paper on the FFT method used in fbcunn: http://arxiv.org/abs/1412.7580

Our announcement was picked up by quite a few news sites (NY Times, Wired, Gigaom, The Verge, Techcrunch, Venturebeat, ZDNet).

Oren Etzioni (director of the Paul Allen Institute for AI) is quoted in The Verge article: "Whenever you're dealing with a for-profit lab, whether it's Google or Facebook, the question is to what extent will they be part of the academic community and, you know, play nice with others,"

Exactly right, Oren. We very much see ourselves as part of the research community. Research accelerates when people share ideas and tools. We hope that making our tools available will enable brilliant, creative, and fearless young researchers to invent brand new things and push the field forward.

Press coverage:
- NY Times: http://bits.blogs.nytimes.com/2015/01/16/facebook-offers-artificial-intelligence-tech-to-open-source-group
- Gigaom: https://gigaom.com/2015/01/16/facebook-open-sources-tools-for-bigger-faster-deep-learning-models/
- Techcrunch: http://techcrunch.com/2015/01/16/facebook-open-sources-some-of-its-deep-learning-tools/
- Wired: http://www.wired.com/2015/01/facebook-open-sources-trove-ai-tools/
- Venturebeat: http://venturebeat.com/2015/01/16/facebook-opens-up-about-more-of-its-cutting-edge-deep-learning-tools/
- ZDNet: http://www.zdnet.com/article/facebook-open-sources-ai-tools-possibly-turbo-charges-deep-learning/
- The Verge: http://www.theverge.com/2015/1/16/7556691/facebook-artificial-intelligence-research-torch-optimization

Post has attachment
A crop of new papers submitted to ICLR 2015 by various combinations of my co-authors from Facebook AI Research and NYU.

http://arxiv.org/abs/1412.7580 : "Fast Convolutional Nets With fbfft: A GPU Performance Evaluation" by Nicolas Vasilache, Jeff Johnson, Michael Mathieu, Soumith Chintala, Serkan Piantino, Yann LeCun: two FFT-based implementations of convolutional layers on GPU. The first one is built around NVIDIA's cuFFT, and the second one around custom FFT code called fbfft. This follows our ICLR 2014 paper "Fast Training of Convolutional Networks through FFTs" by Michael Mathieu, Mikael Henaff, Yann LeCun: http://arxiv.org/abs/1312.5851

http://arxiv.org/abs/1412.6651 "Deep learning with Elastic Averaging SGD" by Sixin Zhang, Anna Choromanska, Yann LeCun: a way to distribute the training of deep nets over multiple GPUs by linking the parameter vectors of the workers with "elastics".
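
Roughly, each worker is pulled toward a shared center parameter vector by an elastic force, and the center drifts toward the average of the workers. A toy synchronous numpy sketch (the quadratic objective and the hyperparameters are made up for illustration):

import numpy as np

def grad(x):                      # gradient of the toy objective 0.5 * ||x - 1||^2
    return x - 1.0

n_workers, dim = 4, 10
eta, rho = 0.1, 0.9               # learning rate and elastic strength
alpha = eta * rho
workers = [np.random.randn(dim) for _ in range(n_workers)]
center = np.zeros(dim)            # the shared "center" parameter vector

for step in range(200):
    diffs = [w - center for w in workers]
    for i in range(n_workers):
        # each worker follows its own gradient plus an elastic pull toward the center
        workers[i] -= eta * (grad(workers[i]) + rho * diffs[i])
    # the center moves toward the average of the workers
    center += alpha * sum(diffs)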

http://arxiv.org/abs/1412.7022 "Audio Source Separation with Discriminative Scattering Networks" by Pablo Sprechmann, Joan Bruna, Yann LeCun: audio source separation using convnets operating on coefficients of a scattering transform of the signal.

http://arxiv.org/abs/1412.6056 "Unsupervised Learning of Spatiotemporally Coherent Metrics" by Ross Goroshin, Joan Bruna, Jonathan Tompson, David Eigen, Yann LeCun: training sparse convolutional auto-encoders so that the pooled features change minimally between successive video frames. Beautiful filters come out of it.
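
Schematically, the objective combines reconstruction, sparsity, and a slowness penalty on the pooled codes of consecutive frames. A rough numpy sketch of the loss terms (stand-in weights and names, not the paper's exact formulation):

import numpy as np

def coherence_loss(frame_t, rec_t, z_t, frame_t1, rec_t1, z_t1, lam=0.5, gamma=0.1):
    # z_t, z_t1 are pooled codes of two consecutive frames; rec_* are their reconstructions.
    reconstruction = np.sum((rec_t - frame_t) ** 2) + np.sum((rec_t1 - frame_t1) ** 2)
    slowness = np.sum(np.abs(z_t1 - z_t))      # pooled features should barely change over time
    sparsity = np.sum(np.abs(z_t)) + np.sum(np.abs(z_t1))
    return reconstruction + lam * slowness + gamma * sparsity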

http://arxiv.org/abs/1412.6615 "Explorations on high dimensional landscapes" by Levent Sagun, V. Ugur Guney, Yann LeCun: another take on the application of random matrix theory to the geometry of the loss surface of deep nets. We use a "teacher network" and a "student network" scenario to see whether SGD manages to find a zero-energy global minimum that we know exists. Bottom line: SGD can't find it, but it doesn't matter, because the (local) minima it does find are just as good as far as test error is concerned.
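
In miniature, the setup looks like this (plain numpy, sizes and learning rate made up, nothing like the scale of the actual experiments): a fixed random teacher defines the targets, so a zero-error student exists by construction, and the question is whether SGD finds it.

import numpy as np

rng = np.random.default_rng(0)
d_in, d_hid = 10, 20

# Teacher: a fixed random one-hidden-layer tanh network.
W1_t, W2_t = rng.standard_normal((d_hid, d_in)), rng.standard_normal((1, d_hid))
# Student: same architecture, random init, trained by SGD to match the teacher.
W1, W2 = rng.standard_normal((d_hid, d_in)), rng.standard_normal((1, d_hid))

lr = 0.01
for step in range(20000):
    x = rng.standard_normal((d_in, 1))
    y = W2_t @ np.tanh(W1_t @ x)          # teacher target
    h = np.tanh(W1 @ x)
    err = (W2 @ h) - y                    # student error on this sample
    # manual backprop of the squared error 0.5 * err**2
    gW2 = err @ h.T
    gW1 = ((W2.T @ err) * (1 - h ** 2)) @ x.T
    W2 -= lr * gW2
    W1 -= lr * gW1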