Profile

Albert Zeyer
Works at RWTH Aachen
Attends RWTH Aachen
Lives in Wuppertal
161 followers|258,765 views

Stream

Albert Zeyer

Discussion  - 
 
http://arxiv.org/abs/1507.01526

Grid LSTM by DeepMind
Abstract: This paper introduces Grid Long Short-Term Memory, a network of LSTM cells arranged in a multidimensional grid that can be applied to vectors, sequences or higher dimensional data such as images. The network differs from existing deep LSTM architectures in that the cells are connected ...

Albert Zeyer

Discussion  - 
 
How do you initialize the softmax bias?

I think a zero initialization might not be the best for softmax.

It can be interpreted, in a way, as the prior log probability of the classes. For N classes, you could thus initialize them all with log(1/N). Or maybe with the prior log probabilities themselves.

Any experiences?
4 comments
 
Yes, you can just do the shifting in your softmax function implementation (Theano does this automatically for example).
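
The shift mentioned in this comment also bears on the question itself. Softmax is invariant to adding the same constant to every input, so a bias of log(1/N) for all classes gives the same output as a zero bias. Only class-dependent log priors change the initial prediction. A minimal sketch in plain Python, where the class priors are made-up numbers chosen for illustration:

```python
import math

def softmax(z):
    # Subtract the maximum before exponentiating. This is the "shifting":
    # softmax(z) == softmax(z - c) for any constant c, and it avoids overflow.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

n_classes = 4
zero_bias = [0.0] * n_classes
uniform_bias = [math.log(1.0 / n_classes)] * n_classes

# Hypothetical class priors, e.g. relative class frequencies in the training data.
priors = [0.7, 0.1, 0.1, 0.1]
prior_bias = [math.log(p) for p in priors]

# With zero weights (or zero input), the layer output is softmax(bias).
p_zero = softmax(zero_bias)
p_uniform = softmax(uniform_bias)
p_prior = softmax(prior_bias)
print(p_zero, p_uniform, p_prior)
```

Here `p_zero` and `p_uniform` are the same uniform distribution, while `p_prior` reproduces the assumed priors.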

Albert Zeyer

Discussion  - 
 
http://arxiv.org/abs/1406.6247

Recurrent Models of Visual Attention, by +Volodymyr Mnih, Nicolas Heess, +alex graves, +koray kavukcuoglu (Google DeepMind)

The title reminded me of Recurrent Processing during Object Recognition (http://psych.colorado.edu/~oreilly/papers/OReillyWyatteHerdEtAl13.pdf), by +Randall O'Reilly et al.
Abstract: Applying convolutional neural networks to large images is computationally expensive because the amount of computation scales linearly with the number of image pixels. We present a novel recurrent neural network model that is capable of extracting information from an image or video by ...

Albert Zeyer

Discussion  - 
 
If someone wants to play with Speech Recognition.
 
Hi,
I thought this might be an appropriate forum to announce that we recently made a 1000-hour corpus of read English speech available for download at http://www.openslr.org/12/. Example scripts are included in Kaldi.
Open Speech and Language Resources.

Albert Zeyer

Discussion  - 
 
 
Learning to Execute and Neural Turing Machines

I'd like to draw your attention to two papers that have been posted in the last few days from some of my colleagues at Google that I think are pretty interesting and exciting:

  Learning to Execute: http://arxiv.org/abs/1410.4615

  Neural Turing Machines: http://arxiv.org/abs/1410.5401

The first paper, "Learning to Execute", by +Wojciech Zaremba and +Ilya Sutskever attacks the problem of trying to train a neural network to take in a small Python program, one character at a time, and to predict its output.  For example, as input, it might take:

"i=8827
c=(i-5347)
print((c+8704) if 2641<8500 else 5308)"

During training, the model is given that the desired output for this program is "12184".  During inference, though, the model is able to generalize to completely new programs and does a pretty good job of learning a simple Python interpreter from examples.


The second paper, "Neural Turing Machines", by +alex graves, Greg Wayne, and +Ivo Danihelka from Google's DeepMind group in London, couples an external memory ("the tape") with a neural network in a way that the whole system, including the memory access, is differentiable from end-to-end.  This allows the system to be trained via gradient descent, and the system is able to learn a number of interesting algorithms, including copying, priority sorting, and associative recall.

Both of these are interesting steps along the way of having systems learn more complex behavior, such as learning entire algorithms, rather than being used for just learning functions.

(Edit: changed link to Learning to Execute paper to point to the top-level Arxiv HTML page, rather than to the PDF).
Abstract: We extend the capabilities of neural networks by coupling them to external memory resources, which they can interact with by attentional processes. The combined system is analogous to a Turing Machine or Von Neumann architecture but is differentiable end-to-end, allowing it to be ...

Albert Zeyer

Unsupervised Learning  - 
 
You probably know how you can easily calculate some running average of a sequence of vectors. I wonder about how I can do something similar for the variance, esp. the covariance matrix.

I explained my thoughts in detail here: http://math.stackexchange.com/questions/794322/mean-and-variance-normalization-of-vectors

I'm posting this here because you might be the better target group to answer this. The question has unfortunately not gained much traction on Math.SE. Maybe I also just worded it badly.

Edit: I just found this: http://metaoptimize.com/qa/questions/8307/regularizing-a-covariance-matrix. This might be related... I have to read further into the material.
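
For illustration, one common way to carry the running-average idea over to the covariance is an exponentially weighted update. This is only a sketch of that standard recursion, not the approach from the linked question or from the comments. The class name `RunningMoments`, the smoothing factor `alpha` and the identity start value are assumptions made for this example.

```python
class RunningMoments:
    """Exponentially weighted running mean and covariance of a vector stream."""

    def __init__(self, dim, alpha=0.05):
        self.alpha = alpha  # assumed smoothing factor in (0, 1)
        self.mean = [0.0] * dim
        # Start from the identity matrix (an assumption; any symmetric
        # positive semi-definite start value would do).
        self.cov = [[1.0 if i == j else 0.0 for j in range(dim)]
                    for i in range(dim)]

    def update(self, x):
        a = self.alpha
        # Deviation from the old mean, then move the mean towards x.
        delta = [xi - mi for xi, mi in zip(x, self.mean)]
        self.mean = [mi + a * di for mi, di in zip(self.mean, delta)]
        # cov <- (1 - a) * (cov + a * delta * delta^T)
        # Each step adds a scaled outer product to a scaled old estimate,
        # so the estimate stays symmetric and positive semi-definite.
        n = len(delta)
        self.cov = [
            [(1.0 - a) * (self.cov[i][j] + a * (delta[i] * delta[j]))
             for j in range(n)]
            for i in range(n)
        ]

# Usage: a constant stream should give mean == x and covariance close to 0.
est = RunningMoments(dim=2, alpha=0.1)
for _ in range(500):
    est.update([3.0, -1.0])
print(est.mean, est.cov)
```

Smaller values of `alpha` average over a longer history at the cost of adapting more slowly.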
19 comments
 
Albert, in hindsight I think there are serious numerical problems with what I described (e.g. maintaining positive definiteness). I think what you really want is a square-root information filter. Take a look at http://www.dpi.physik.uni-goettingen.de/cns/modules/BibtexModule/uploads/PDF/liuwoergoettermarkelic2012a.pdf for a modern readable algorithm. The matrix S there is your matrix M (you can demean first in the obvious way).

Albert Zeyer

Shared publicly  - 
 
Just installed MacOSX 10.8. It broke some things for me.

- Command line dev tools like GCC were not available anymore. In Xcode, under Preferences, under Downloads, clicking on "Install" on Command Line Tools fixed this.

- Many Python tools stopped working because most/all `easy_install`ed packages got lost. From what I tested so far, that was pycrypto and IPython. After installing the command line tools (GCC and co), just re`easy_install`ing the stuff worked.

- My self-developed screensaver stopped working. It always crashed with a segfault. It seems that some NSObject ARC related changes caused this crash. See here for my fixes: http://goo.gl/omV0b
4 comments
 
You buffoon, I installed the Python and Subversion bindings with Homebrew just today. I am a leet hax0r. Deal with it.
Have him in circles
161 people
Kareem Moussa's profile photo
Mat Gan (cosmoio)'s profile photo
Ilya Kulikov's profile photo
Bryan Joseph's profile photo
Koji Jäger's profile photo
Zeynep Yılmaz's profile photo
Sebastian Schulz's profile photo
How Videos's profile photo
Erik Lindroos's profile photo

Albert Zeyer

Shared publicly  - 
 
 
Some interesting figures from our Similarity explorer:
Semantic overlap of "Christmas" with:
Jesus Christ: 10%
TV: 20%
Santa Claus: 24%
Show: 32%
Holiday: 34%
.... Happy Holidays!
Compare the meaning of any two texts by overlaying their semantic fingerprints. % related. Left Expression. Combined. Right Expression. Press tab to transform entered text into a term. Connecting operators such as AND, OR, SUB and XOR may also be used to form complex expressions ...

Albert Zeyer

Discussion  - 
 
Via +Yann LeCun : "+Amnon Shashua talks about SimNet. He describes it as a generalization of ConvNet in which the dot product is replaced by a "similarity" function and the pooling is the log(sum(exp())) operator (that many of us have been using) with per-input bias (which can be seen as a convolution in the log domain)."
Abstract: We present a deep layered architecture that generalizes classical convolutional neural networks (ConvNets). The architecture, called SimNets, is driven by two operators, one being a similarity function whose family contains the convolution operator used in ConvNets, and the other is a ...
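
To make the pooling operator from the quote concrete, here is a rough sketch of log(sum(exp())) with a per-input bias, in plain Python. It is not the SimNet implementation. The function name `logsumexp_pool` and the flat list inputs are assumptions made for this example.

```python
import math

def logsumexp_pool(window, bias):
    """Return log(sum_i exp(x_i + b_i)) for one pooling window."""
    z = [x + b for x, b in zip(window, bias)]
    # Factor out the maximum so that exp() never overflows.
    m = max(z)
    return m + math.log(sum(math.exp(v - m) for v in z))

window = [1.0, 2.0, 3.0]
no_bias = [0.0, 0.0, 0.0]
pooled = logsumexp_pool(window, no_bias)

# Large inputs stay finite thanks to the max shift.
pooled_big = logsumexp_pool([1000.0, 1000.0], [0.0, 0.0])
print(pooled, pooled_big)
```

The result always lies between max(x_i + b_i) and that maximum plus log(n), so it acts as a smooth maximum. Taking exp() of the result gives sum_i exp(b_i) * exp(x_i), a weighted sum of exp(x_i) with weights exp(b_i), which is one way to read the "convolution in the log domain" remark.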
 
Very nice work!  Would this "grand theory" be able to explain the quirkiness of ConvNet discussed in the paper by Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, +Ian Goodfellow, and +Rob Fergus [ Intriguing properties of neural networks, http://arxiv.org/pdf/1312.6199v2.pdf ].  +Yann LeCun ?

Albert Zeyer

Discussion  - 
 
Distributed vector representations, word embeddings, zero-shot learning.

Recognize a cat having only seen dogs, but having read about both dogs and cats.

word2vec, DISSECT, Paragraph Vector. Phrase Representations using RNN Encoder–Decoder, Life-long Off-policy Learning, Neural Tensor Networks.

AGI.

Albert Zeyer

Discussion  - 
 
I just read 'Knowledge Matters: Importance of Prior Information for Optimization' by Gulcehre and Bengio from 2013. http://arxiv.org/pdf/1301.4083.pdf

It's about a simple task: recognizing whether a picture contains three identical shapes or not. The paper reports that all state-of-the-art ML algorithms fail to solve it without guidance.

I wonder a bit about that. I would have expected that a combination of some unsupervised deep convolutional NN with a supervised NN would be able to learn that, maybe even without the convolution, given correspondingly more training data.

I searched a bit for follow-up work on methods that can solve this task, possibly also using evolutionary algorithms.

Does anyone have comments on that?
9 comments
 
While I agree that thinking about how to represent equality is interesting, again note that in this case, the task can be learned once you learned about categories (with supervision in this case). To even talk about this form of equality (in terms of matching categories across multiple concurrently present object instances) you have to have an explicit or implicit representation of what makes a category in the first place.

So I think this task is difficult not necessarily because learning about equality is difficult, but because to solve the task, you need to learn invariant object recognition first, and apparently if the supervision signal is only the equality signal then learning the object recognition implicitly is difficult (and it would be nice to analyse further why).

So if you ask how people learn this task, well, they probably learn invariant object recognition first, i.e. to recognize object categories across many instances and views, before they learn/reason about relationships between multiple categories...

As for learning invariant object recognition, of course there are many ideas how to do that including in relatively unsupervised ways.  Coming back to that point once more, like I said, an unsupervised algorithm is not going to discover the categories unless there is structure in the data that reflects the categories (e.g. transformation sequences) and/or there is some relevant information built into the algorithm (e.g. in this case, knowledge about transformations).

Given a set of images, I can arbitrarily assign category boundaries. E.g. in this case, what if I now decide the categories aren't about the shape types, but about how many pixels there are, or whether a shape is symmetric or not, etc. An unsupervised algorithm isn't going to read my mind--all it has is the data and the assumptions built into the algorithm.

Albert Zeyer

Shared publicly  - 
Reportage / Documentary - For more than seven years, Gustl Mollath from Nürnberg has been held in a closed psychiatric ward. The film is the first on television to examine the case comprehensively.
3 comments
 
Hey Pelya! :)
Education
  • RWTH Aachen
    PhD Machine Learning, present
  • RWTH Aachen
    M.S. Mathematics, 2013
  • RWTH Aachen
    M.S. Computer Science, 2013
Basic Information
Gender
Male
Story
Tagline
Machine Learning Research Developer
Introduction
I am a developer. You can see some of my projects here:
https://github.com/albertz, http://www.az2000.de/ and https://sourceforge.net/users/albertzeyer/.

I am also one of the developers of OpenLieroX:
http://openlierox.sf.net

My music player:

I work on Deep Neural Networks and Speech Recognition here:

Work
Occupation
Research
Skills
Machine Learning, Programming, Algorithms
Employment
  • RWTH Aachen
    PhD Student, present
    Machine Learning, Speech Recognition, Deep Learning, Neural Networks, ...
  • inmation
    Senior Software Engineer, present
Places
Currently
Wuppertal
Previously
Aachen - Turku - Bochum
Contact Information
Home
Email
Albert Zeyer's +1's are the things they like, agree with, or want to recommend.
Video-Clip "Die Story im Ersten: Der Fall Mollath" | ARD Mediathek | Das...
www.ardmediathek.de

Reportage / Documentary - For more than seven years, Gustl Mollath from Nürnberg has been held in a closed psychiatric ward. The film covers

Project Zomboid: Giving Indie gaming a black eye? « Icrontic Gaming
gaming.icrontic.com

Project Zomboid: Giving Indie gaming a black eye? David Kenkel (NiGHTS) A dramatic tale of an Indie gaming community, theft, intrigue, and a

Iron Sky
www.google.com

You did NAZI this coming! Iron Sky Did you miss us? In 1945 the Nazis lost the Second World War, but defeated they were

Jailbreak iPhone 4S iOS 5 Untethered Status, Who Is Going To Do It?
cydiahelp.com

Jailbreak iPhone 4S on iOS 5. iPhone 4S Jailbreak Untethered and Tethered. iOS 5 Untethered Jailbreak on iPhone 4S. Everything about iPhone

OpenLieroX
www.openlierox.net

OpenLieroX - extremely addictive realtime worms shoot-em-up backed by an active gamers community. Play the most famous Liero clone! Windows,

Food: Excellent, Decor: Excellent, Service: Excellent
Public - 3 years ago
reviewed 3 years ago
1 review