Profile

Stan Bileschi
Works at Google
Attended MIT
134 followers|131,198 views

Stream

Stan Bileschi

Shared publicly  - 
 
 
Our Deep Learning Neural Networks just became the best artificial recognisers of Chinese characters (from the ICDAR 2013 competition), approaching human performance [1]. First author is Dan Claudiu Cireșan.
 
Why is this important? For example, all major smartphone companies want you to point your cell phone camera at text written in a foreign language, say, Chinese metro signs or lunch menus, and get a reliable translation.

As always in such competitions, GPU-based pure supervised gradient descent (40-year-old backprop à la Paul Werbos) was applied to our deep and wide multi-column networks with alternating max-pooling layers and convolutional layers (multi-column MPCNN) [2,3]. Most if not all leading IT companies and research labs are now using this technique, too.
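For readers who want a concrete picture of two ingredients named above, here is a minimal sketch in plain Python. It shows 2x2 max-pooling on a single feature map and a multi-column prediction formed by averaging the class probabilities of several independently trained columns. The function names, the toy numbers, and the use of plain averaging to combine columns are illustrative assumptions, not the code from [2,3].

```python
from typing import List

Matrix = List[List[float]]


def max_pool_2x2(feature_map: Matrix) -> Matrix:
    """Non-overlapping 2x2 max-pooling (assumes even height and width)."""
    pooled = []
    for r in range(0, len(feature_map), 2):
        row = []
        for c in range(0, len(feature_map[0]), 2):
            # Keep only the strongest activation in each 2x2 block.
            row.append(max(
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            ))
        pooled.append(row)
    return pooled


def average_columns(column_probs: List[List[float]]) -> List[float]:
    """Average the per-class probabilities predicted by each column."""
    n_cols = len(column_probs)
    n_classes = len(column_probs[0])
    return [sum(col[k] for col in column_probs) / n_cols
            for k in range(n_classes)]


def predict(column_probs: List[List[float]]) -> int:
    """Return the index of the class with the highest averaged probability."""
    avg = average_columns(column_probs)
    return max(range(len(avg)), key=avg.__getitem__)


# Hypothetical 4x4 feature map produced by some convolutional layer.
feature_map = [
    [1.0, 3.0, 2.0, 0.0],
    [4.0, 2.0, 1.0, 5.0],
    [0.0, 1.0, 7.0, 2.0],
    [3.0, 2.0, 1.0, 1.0],
]
pooled = max_pool_2x2(feature_map)

# Hypothetical outputs of three columns on a 3-class problem.
columns = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.6, 0.3],
]
winner = predict(columns)
```

In this toy case the first column alone would pick class 0, but the averaged vote picks class 1, which is the point of combining several columns.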
 
In 2011, such multi-column MPCNNs became the first artificial devices to achieve human-competitive performance [3] on major benchmarks, including the MNIST handwritten digits of +Yann LeCun, possibly the most famous benchmark of machine learning. Chinese handwriting à la ICDAR, however, is much harder, as there are not just 10 classes (one for each digit) but 3755.

None of us speaks a word of Chinese.

The report [1] also mentions a funny preprocessing bug.

When we started Deep Learning research over 20 years ago [4], slow computers forced us to focus on toy applications. How things have changed! Today, deep NNs can already learn to rival human pattern recognisers in certain domains. And each decade we gain another factor of 100-1000 in terms of raw computational power per cent.

[1] http://arxiv.org/abs/1309.0261 (1 September 2013)

[2] D. C. Cireșan, U. Meier, J. Masci, L. M. Gambardella, J. Schmidhuber. Flexible, High Performance Convolutional Neural Networks for Image Classification. IJCAI 2011, Barcelona, 2011. Preprint: http://arxiv.org/abs/1102.0183

[3] D. C. Cireșan, U. Meier, J. Schmidhuber. Multi-column Deep Neural Networks for Image Classification. CVPR 2012, pp. 3642-3649, 2012. http://www.idsia.ch/~juergen/cvpr2012.pdf Preprint: http://arxiv.org/abs/1202.2745

[4] http://www.idsia.ch/~juergen/deeplearning.html

Stan Bileschi

Shared publicly  - 
 
To all my programming friends: if you've ever wanted to learn Common Lisp, or want some refreshing practice, please check out my new open source project at 
https://github.com/google/lisp-koans/
It was briefly number 6 on Hacker News!
#lisp
Yann LeCun:
I didn't know you were a Lisp kind of guy, Stan.
Also, I didn't realize you were at Google now.

Stan Bileschi

Shared publicly  - 
 
Cambridge is now full of the packed slush of a ski resort. 

Stan Bileschi

Shared publicly  - 
 
Winterpocalypse, Cambridge.

Stan Bileschi

Shared publicly  - 
 
 
Google Flight Search now lets you specify the number of passengers, and whether each is a child or an adult, to simplify the booking process.

Stan Bileschi

Shared publicly  - 
 
Gary Shteyngart, a writer I enjoy reading ("Russian Debutante's Handbook", "Absurdistan", etc.), writes about using Google Glass.
http://www.newyorker.com/reporting/2013/08/05/130805fa_fact_shteyngart?currentPage=all

Stan Bileschi

Shared publicly  - 
 
 
If you hear Travis’ “Hit Me Baby One More Time”, you might immediately know it’s a cover of the Britney Spears song, even though the tempo, instrumentation, and arrangement are all strikingly different.  Our brains, excellent pattern recognition devices, are able to easily pinpoint similar melodies even when many of the other audio characteristics of a song are quite dissimilar.  But, can computers listen to, and recognize, music the same way we do?

To help computers do just that, Googlers Thomas Walters, David Ross, and Richard Lyon built a system capable of identifying the melodic similarity between any two audio tracks.  It does this with the help of what they call an intervalgram, a depiction of the correlation between two songs, sets of which form the basis of a system for detection of similar or identical melodies across a database of music.

A heat map showing the intervalgram similarity of both versions of “Hit Me Baby One More Time” is shown below, where the horizontal and vertical axes represent each version of the song. Both songs start at the bottom left corner, with the color of each pixel showing how similar the intervalgrams are for the two songs at each point in time; blue colors are low similarity and red colors are high similarity. The bright diagonal line along the middle shows the best similarity between the versions. Lines parallel to this diagonal are 'echoes' where one segment of the music matches another segment in a different place (for example, multiple verses will match each other).

In addition to showing that the two songs are a good match for each other, one can also learn something about the structure of the song.  To learn more about intervalgrams, visit http://goo.gl/8s6yp
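The heat map described above can be imitated with a toy similarity matrix. The sketch below is not the intervalgram system itself: the per-frame feature vectors are hypothetical stand-ins, and cosine similarity is an assumed measure chosen only to show how a bright main diagonal and parallel 'echo' diagonals arise.

```python
import math
from typing import List, Sequence

Frame = Sequence[float]


def cosine(u: Frame, v: Frame) -> float:
    """Cosine similarity between two feature vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_matrix(song_a: List[Frame], song_b: List[Frame]) -> List[List[float]]:
    """Entry [i][j] compares frame i of song A with frame j of song B."""
    return [[cosine(fa, fb) for fb in song_b] for fa in song_a]


def diagonal_score(matrix: List[List[float]], offset: int = 0) -> float:
    """Mean similarity along the diagonal where column = row + offset."""
    values = [
        matrix[i][i + offset]
        for i in range(len(matrix))
        if 0 <= i + offset < len(matrix[0])
    ]
    return sum(values) / len(values) if values else 0.0


# Hypothetical melody features: the fourth frame repeats the first (a "verse").
original = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
# The "cover" has the same melodic pattern at different loudness (scaled vectors).
cover = [[2, 0, 0], [0, 3, 0], [0, 0, 1], [5, 0, 0]]

heat = similarity_matrix(original, cover)
main_diagonal = diagonal_score(heat, offset=0)   # time-aligned match
echo_diagonal = diagonal_score(heat, offset=3)   # verse matching a later verse
off_diagonal = diagonal_score(heat, offset=1)    # misaligned frames
```

Here the main diagonal scores highest because the two versions line up in time, the offset-3 diagonal also scores high because the repeated verse matches itself, and the offset-1 diagonal scores low.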

And if you just like the music, you can find Britney’s original version at http://goo.gl/ZG623 and Travis’ cover at http://goo.gl/0M2cJ.

Stan Bileschi

Shared publicly  - 
 
I guess there should be some limits:  http://i.imgur.com/KvQPN4Q.png

Stan Bileschi

Shared publicly  - 
 
 
The videos of NIPS 2012 have been out for a while.

For talks connected with Deep Learning, have a look at:
- Stéphane Mallat: Classification with Deep Invariant Scattering Networks http://videolectures.net/nips2012_mallat_classification/
- Geoffrey Hinton: Dropout: a simple and effective way to improve neural networks http://videolectures.net/nips2012_hinton_networks/
- Nicolas Le Roux: A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets http://videolectures.net/nips2012_le_roux_gradient_method/

Sadly, none of the deep learning related workshops were recorded.
People
In his circles
121 people
Have him in circles
134 people
TOMASO POGGIO's profile photo
Paul Rosendall's profile photo
Adam Tripi's profile photo
katie p.'s profile photo
Susan Tripi's profile photo
Matt Wallace's profile photo
Tom Lee's profile photo
Education
  • MIT
    AI, 2000 - 2009
  • University at Buffalo, The State University of New York
    CS, EE, 1996 - 2000
  • Fairport High School
    snark, 1992 - 1996
Work
Occupation
Software Engineer at Google.
Employment
  • Google
    Software Engineer, 2012 - present
  • DataXu
    Machine Learning Specialist, 2009 - 2012
  • MIT
    PostDoc, 2006 - 2009
Basic Information
Gender
Male
Public - 6 months ago
reviewed 6 months ago
Enjoyable for meeting up with friends for beers. Wide selection of cask ales and heavier styles.
Public - a year ago
reviewed a year ago
Public - 2 years ago
reviewed 2 years ago
4 reviews
The Royal East has great food. You will not be disappointed.
Food: Excellent, Decor: Very good, Service: Very good
Public - a year ago
reviewed a year ago