Learning to Execute
and Neural Turing Machines
I'd like to draw your attention to two papers that have been posted in the last few days from some of my colleagues at Google that I think are pretty interesting and exciting:
Learning to Execute: http://arxiv.org/abs/1410.4615
Neural Turing Machines: http://arxiv.org/abs/1410.5401
The first paper, "Learning to Execute", by +Wojciech Zaremba and +Ilya Sutskever, attacks the problem of training a neural network to take in a small Python program, one character at a time, and predict its output. For example, as input, it might take:
print((c+8704) if 2641<8500 else 5308)
During training, the model is told that the desired output for this program is "12185". During inference, though, the model is able to generalize to completely new programs, and it does a pretty good job of learning a simple Python interpreter from examples.
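To make the setup concrete, here is a toy sketch of how (program, output) training pairs of this flavor could be generated: build a small random program as a string, and get the target output by actually running it. The program template and number ranges here are illustrative assumptions, not the paper's actual data generator.

```python
import random

def make_example(rng):
    # Build a tiny program in the style of the example above:
    # print((a+b) if c<d else e). The template is a stand-in for
    # the paper's richer program generator.
    a, b, c, d, e = (rng.randint(0, 9999) for _ in range(5))
    program = f"print(({a}+{b}) if {c}<{d} else {e})"
    # The target is what a real Python interpreter would print.
    target = str((a + b) if c < d else e)
    return program, target

def to_chars(s):
    # The model consumes the program one character at a time.
    return list(s)

program, target = make_example(random.Random(0))
```

A seq2seq model is then trained to map `to_chars(program)` to the characters of `target`.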
The second paper, "Neural Turing Machines", by +Alex Graves, Greg Wayne, and +Ivo Danihelka
from Google's DeepMind group in London, couples an external memory ("the tape") with a neural network in a way that the whole system, including the memory access, is differentiable from end-to-end. This allows the system to be trained via gradient descent, and the system is able to learn a number of interesting algorithms, including copying, priority sorting, and associative recall.
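The differentiable-memory idea can be sketched in a few lines of numpy: instead of reading or writing one memory slot, the controller attends softly to every slot, so gradients flow through the addressing. This is only a toy illustration of the read and erase/add write operations, with made-up sizes, not the paper's full controller or addressing mechanism.

```python
import numpy as np

def read(memory, weights):
    # Soft read: a weighted sum over all memory rows, so the result
    # is differentiable w.r.t. both the weights and the memory.
    return weights @ memory

def write(memory, weights, erase, add):
    # Soft write (erase then add, as in the NTM paper): every row is
    # modified a little, in proportion to its attention weight.
    memory = memory * (1 - np.outer(weights, erase))
    return memory + np.outer(weights, add)

# Hypothetical toy sizes: 4 memory slots, each of width 3.
M = np.zeros((4, 3))
w = np.array([0.7, 0.2, 0.1, 0.0])  # attention over the slots
M = write(M, w, erase=np.ones(3), add=np.array([1.0, 2.0, 3.0]))
r = read(M, w)
```

Because every step is a smooth function of the attention weights, the whole read/write loop can be trained end-to-end with gradient descent.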
Both of these are interesting steps toward systems that learn more complex behavior, such as entire algorithms, rather than just simple functions.
(Edit: changed link to Learning to Execute paper to point to the top-level Arxiv HTML page, rather than to the PDF).