Profile

Cover photo
Hao Huang
8 followers|65,151 views
AboutPostsPhotosVideos

Stream

Hao Huang

Shared publicly  - 
 
 
Learning to Execute and Neural Turing Machines

I'd like to draw your attention to two papers that have been posted in the last few days from some of my colleagues at Google that I think are pretty interesting and exciting:

  Learning to Execute: http://arxiv.org/abs/1410.4615

  Neural Turing Machines: http://arxiv.org/abs/1410.5401

The first paper, "Learning to Execute", by +Wojciech Zaremba and +Ilya Sutskever attacks the problem of trying to train a neural network to take in a small Python program, one character at a time, and to predict its output.  For example, as input, it might take:

"i=8827
c=(i-5347)
print((c+8704) if 2641<8500 else 5308)"

During training, the model is given that the desired output for this program is "12185".  During inference, though, the model is able to generalize to completely new programs and does a pretty good of learning a simple Python interpreter from examples.


The second paper, "Neural Turing Machines", by +alex graves, Greg Wayne, and +Ivo Danihelka from Google's DeepMind group in London, couples an external memory ("the tape") with a neural network in a way that the whole system, including the memory access, is differentiable from end-to-end.  This allows the system to be trained via gradient descent, and the system is able to learn a number of interesting algorithms, including copying, priority sorting, and associative recall.

Both of these are interesting steps along the way of having systems learn more complex behavior, such as learning entire algorithms, rather than being used for just learning functions.

(Edit: changed link to Learning to Execute paper to point to the top-level Arxiv HTML page, rather than to the PDF).
Abstract: We extend the capabilities of neural networks by coupling them to external memory resources, which they can interact with by attentional processes. The combined system is analogous to a Turing Machine or Von Neumann architecture but is differentiable end-to-end, allowing it to be ...
1
Add a comment...
Hao hasn't shared anything on this page with you.