Juergen Schmidhuber
Towards True AI Since 1987

2,900 followers
Juergen's posts

Post has attachment
NIPS 2016 Symposium: Recurrent Neural Networks and Other Machines that Learn Algorithms (Thursday, December 8, 2016, Barcelona) - Call for Posters

Soon after the birth of modern computer science in the 1930s, two fundamental questions arose: 1. How can computers learn useful programs from experience, as opposed to being programmed by human programmers? 2. How can one program parallel multiprocessor machines, as opposed to traditional serial architectures? Both questions found natural answers in the field of Recurrent Neural Networks (RNNs), which are brain-inspired general purpose computers that can learn parallel-sequential programs or algorithms encoded as weight matrices.
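
To make the "programs encoded as weight matrices" point concrete, here is a minimal NumPy sketch (not part of the original announcement; the task and weights are hypothetical): a one-unit threshold RNN whose hand-set weights implement a simple sequential program, namely remembering whether a 1 has ever appeared in a binary input stream.

```python
import numpy as np

def rnn_step(h, x, W, U, b):
    """One step of a threshold RNN: h_t = step(W h_{t-1} + U x_t + b)."""
    return (W @ h + U @ x + b > 0).astype(float)

# The weight values ARE the program: a one-unit latch that remembers
# whether a 1 has ever occurred in the binary input sequence.
W = np.array([[1.0]])   # recurrent weight: carry the stored bit forward
U = np.array([[1.0]])   # input weight: an input 1 sets the latch
b = np.array([-0.5])    # threshold

h = np.zeros(1)
for x_t in [0, 0, 1, 0, 0]:
    h = rnn_step(h, np.array([x_t], dtype=float), W, U, b)
print(h)  # [1.] -- the network remembers the 1 seen at step 3
```

A learning algorithm would instead find W, U and b from data; the point is only that the learned weight matrices play the role of a parallel-sequential program.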

The first RNNaissance NIPS workshop dates back to 2003: http://people.idsia.ch/~juergen/rnnaissance.html . Since then, a lot has happened. Some of the most successful applications in machine learning (including deep learning) are now driven by RNNs such as Long Short-Term Memory, e.g., speech recognition, video recognition, natural language processing, image captioning, and time series prediction. Thanks to the world's most valuable public companies, billions of people can now access this technology through their smartphones and other devices, e.g., in the form of Google Voice or on Apple's iOS. Reinforcement-learning and evolutionary RNNs are solving complex control tasks from raw video input. Many RNN-based methods learn sequential attention strategies.

At this symposium, we will review the latest developments in all of these fields, and focus not only on RNNs, but also on learning machines in which RNNs interact with external memory, such as neural Turing machines, memory networks, and related memory architectures such as fast weight networks and neural stack machines. In this context we will also discuss asymptotically optimal program search methods and their practical relevance.
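
As a concrete illustration of one ingredient shared by these memory-augmented architectures (not taken from any particular paper; shapes and the sharpness parameter below are hypothetical), here is a minimal NumPy sketch of a differentiable content-based read from an external memory matrix, the basic operation a controller RNN can use to retrieve stored vectors.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def content_read(memory, key, beta=5.0):
    """Differentiable content-based read: compare a key emitted by the
    controller against each memory row (cosine similarity), turn the
    similarities into attention weights, and return the weighted sum."""
    sims = memory @ key / (np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8)
    w = softmax(beta * sims)           # larger beta -> more focused read
    return w @ memory, w               # read vector and attention weights

memory = np.random.randn(8, 16)                # 8 slots, 16-dim contents (toy sizes)
key = memory[3] + 0.1 * np.random.randn(16)    # a noisy query for slot 3
read, w = content_read(memory, key)
print(w.round(2))                              # attention should peak near slot 3
```

Writing works analogously, with the attention weights deciding how strongly each memory row is overwritten.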

Our target audience has heard a bit about RNNs, the deepest of all neural networks, but will be happy to hear a summary of the basics again before delving into the latest advanced topics, to see and understand what has recently become possible. All invited talks will be followed by open discussions, with further discussions during a poster session. Finally, we will also have a panel discussion on the bright future of RNNs, and their pros and cons.

A tentative list of speakers can be found at the symposium website: http://people.idsia.ch/~rupesh/rnnsymposium2016/index.html



Call for Posters

We invite researchers and practitioners to submit poster abstracts for presentation during the symposium (min. 2 pages, no page limit). All contributions related to the symposium theme are encouraged. The organizing committee will select posters to maximize quality and diversity within the available display space.

Non-anonymous abstracts should be emailed by the corresponding authors to rnn.nips2016@gmail.com. Selected abstracts will be advertised on the symposium website, and posters will be visible throughout the duration of the symposium. NIPS attendees will interact with poster presenters during the light dinner break (6:30 - 7:30 PM). The submission deadline is October 15, 23:59 CET.



Jürgen Schmidhuber & Sepp Hochreiter & Alex Graves & Rupesh Srivastava


#artificialintelligence
#deeplearning
#machinelearning
#computervision



Post has attachment

Update of 13 February 2017: New Jobs for PostDocs and PhD students thanks to Google DeepMind, NVIDIA and SNF: Please follow instructions under http://people.idsia.ch/~juergen/jobs2017.html


Update of 11 November 2016: We have an open call for a Tenure-Track Assistant Professor position - deadline Dec 15: http://www.usi.ch/call-inf-assistant-professor-tenure-track-291728.pdf . We encourage experts in Computer Vision / Machine Learning / Neural Networks / Deep Learning to apply!

Fall 2016 - jobs for postdocs and PhD students: Join the Deep Learning team (since 1991) that won more competitions than any other. We are seeking researchers for the project RNNAIssance, based on this tech report on “learning to think”: http://arxiv.org/abs/1511.09249 . The project is about general purpose artificial intelligence for agents living in partially observable environments, controlled by reinforcement learning recurrent neural networks (RNNs) and supported by unsupervised predictive RNN world models. Location: The Swiss AI Lab, IDSIA, in Switzerland, the world’s leading science nation, and most competitive country for the 7th year in a row. Competitive Swiss salary. Preferred start: As soon as possible. More details and instructions can be found here: http://people.idsia.ch/~juergen/rnnai2016.html

#artificialintelligence
#deeplearning
#machinelearning
#computervision

Post has attachment
Microsoft wins ImageNet 2015 through feedforward LSTM without gates

Microsoft Research dominated the ImageNet 2015 contest with a deep neural network of 150 layers [1]. Congrats to Kaiming He & Xiangyu Zhang & Shaoqing Ren & Jian Sun on the great results [2]!

Their CNN layers compute G(F(x)+x), which is essentially a feedforward Long Short-Term Memory (LSTM) [3] without gates!

Their net is similar to the very deep Highway Networks [4] (with hundreds of layers), which are feedforward LSTMs with forget gates (= gated recurrent units) [5].
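
For readers who want to see the claimed correspondence spelled out, here is a minimal NumPy sketch (hypothetical layer sizes and random weights, not the authors' code) contrasting a residual layer, which computes G(F(x)+x), with a highway layer, whose learned gate interpolates between the transformed and the untransformed input.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

d = 16
W_f, b_f = np.random.randn(d, d) * 0.1, np.zeros(d)        # transform F / H
W_t, b_t = np.random.randn(d, d) * 0.1, np.zeros(d) - 2.0  # gate T (negative bias: mostly copy the input)

def residual_layer(x):
    # Residual layer: G(F(x) + x); here G = relu and F = affine + relu.
    return relu(relu(W_f @ x + b_f) + x)

def highway_layer(x):
    # Highway layer: T(x) * H(x) + (1 - T(x)) * x, with a learned gate T
    # (the feedforward analogue of an LSTM forget gate).
    h = relu(W_f @ x + b_f)
    t = sigmoid(W_t @ x + b_t)
    return t * h + (1.0 - t) * x

x = np.random.randn(d)
print(residual_layer(x).shape, highway_layer(x).shape)  # (16,) (16,)
```

If the gate is held constant, the highway layer reduces, up to scaling, to the ungated residual form, which is roughly the sense in which the post describes the residual layer as a feedforward LSTM without gates.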

The authors mention the vanishing gradient problem, but do not mention my very first student Sepp Hochreiter (now professor) who identified and analyzed this fundamental deep learning problem in 1991, years before anybody else did [6].

Apart from the above, I liked the paper [1] a lot. LSTM concepts keep invading CNN territory [e.g., 7a-e], also through GPU-friendly multi-dimensional LSTMs [8].

References

[1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. Deep Residual Learning for Image Recognition. http://arxiv.org/abs/1512.03385

[2] ImageNet Large Scale Visual Recognition Challenge 2015 (ILSVRC2015) - Results: http://www.image-net.org/challenges/LSVRC/2015/results

[3] S. Hochreiter, J. Schmidhuber. Long Short-Term Memory. Neural Computation, 9(8):1735-1780, 1997. Led to a lot of follow-up work http://people.idsia.ch/~juergen/rnn.html, and is now heavily used by leading IT companies around the world.

[4] R. K. Srivastava, K. Greff, J. Schmidhuber. Training Very Deep Networks. NIPS 2015; http://arxiv.org/abs/1505.00387

[5] F. A. Gers, J. Schmidhuber, F. Cummins. Learning to Forget: Continual Prediction with LSTM. Neural Computation, 12(10):2451-2471, 2000. ftp://ftp.idsia.ch/pub/juergen/FgGates-NC.pdf

[6] Hochreiter, S. (1991). Untersuchungen zu dynamischen neuronalen Netzen. Diploma thesis, TU Munich. Advisor: J. Schmidhuber. Overview: http://people.idsia.ch/~juergen/fundamentaldeeplearningproblem.html

[7a] 2011: first superhuman CNNs http://people.idsia.ch/~juergen/superhumanpatternrecognition.html
[7b] 2011: first human-competitive CNNs for handwriting http://people.idsia.ch/~juergen/handwriting.html
[7c] 2012: first CNN to win a segmentation contest http://people.idsia.ch/~juergen/deeplearningwinsbraincontest.html
[7d] 2012: first CNN to win an object discovery contest http://people.idsia.ch/~juergen/deeplearningwinsMICCAIgrandchallenge.html
[7e] Scholarpedia: http://www.scholarpedia.org/article/Deep_Learning

[8] M. Stollenga, W. Byeon, M. Liwicki, J. Schmidhuber. Parallel Multi-Dimensional LSTM, with Application to Fast Biomedical Volumetric Image Segmentation. NIPS 2015; http://arxiv.org/abs/1506.07452

Link: http://people.idsia.ch/~juergen/microsoft-wins-imagenet-through-feedforward-LSTM-without-gates.html

#computervision
#deeplearning
#machinelearning
#artificialintelligence

Post has attachment
How to Learn an Algorithm (video). I review 3 decades of our research on both gradient-based and more general problem solvers that search the space of algorithms running on general purpose computers with internal memory. Architectures include traditional computers, Turing machines, recurrent neural networks, fast weight networks, stack machines, and others. Some of our algorithm searchers are based on algorithmic information theory and are optimal in asymptotic or other senses. Most can learn to direct internal and external spotlights of attention. Some of them are self-referential and can even learn the learning algorithm itself (recursive self-improvement). Without a teacher, some of them can reinforcement-learn to solve very deep algorithmic problems (involving billions of steps) infeasible for more recent memory-based deep learners. And algorithms learned by our Long Short-Term Memory recurrent networks defined the state-of-the-art in handwriting recognition, speech recognition, natural language processing, machine translation, image caption generation, etc. Google and other companies made them available to over a billion users.

The video was taped on Oct 7 2015 during MICCAI 2015 at the Deep Learning Meetup Munich:  http://www.meetup.com/en/deeplearning/events/225423302/  Link to video: https://www.youtube.com/watch?v=mF5-tr7qAF4

Similar talk at the Deep Learning London Meetup of Nov 4 2015: http://www.meetup.com/Deep-Learning-London/events/225841989/ (video not quite ready yet)

Most of the slides for these talks are here: http://people.idsia.ch/~juergen/deep2015white.pdf

These also include slides for the AGI keynote in Berlin http://agi-conf.org/2015/keynotes/, the IEEE distinguished lecture in Seattle (Microsoft Research, Amazon), the INNS BigData plenary talk in San Francisco, the keynote for the Swiss eHealth summit, two MICCAI 2015 workshops, and a recent talk for CERN (some of the above were videotaped as well).

Parts of these talks (and some of the slides) are also relevant for upcoming talks in the NYC area (Dec 4-6 and 13-16) and at NIPS workshops in Montreal:

1. Reasoning, Attention, Memory (RAM) Workshop, NIPS 2015 https://research.facebook.com/pages/764602597000662/reasoning-attention-memory-ram-nips-workshop-2015/

2. Deep Reinforcement Learning Workshop, NIPS 2015 http://rll.berkeley.edu/deeprlworkshop/

3. Applying (machine) Learning to Experimental Physics (ALEPH) Workshop, NIPS 2015 http://yandexdataschool.github.io/aleph2015/pages/keynote-speakers.html

More videos: http://people.idsia.ch/~juergen/videos.html

Also available now: Scholarpedia article on Deep Learning: http://www.scholarpedia.org/article/Deep_Learning

Finally, a recent arXiv preprint: On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models. http://arxiv.org/abs/1511.09249

#machinelearning
#artificialintelligence
#computervision
#deeplearning


Post has attachment
Announcing Brainstorm Open Source Software for Neural Networks

We are open-sourcing a new neural networks library called Brainstorm, developed over the past year at the Swiss AI Lab IDSIA by PhD students Klaus Greff and Rupesh Srivastava: https://github.com/IDSIA/brainstorm

Brainstorm is designed to make neural networks fast, flexible and fun. Lessons learned from earlier open source projects led to new design elements compatible with multiple platforms and computing backends.

Brainstorm already has a robust base feature set, including recurrent neural networks (RNNs) such as LSTM and Clockwork RNN, as well as 2D Convolution/Pooling and Highway layers, on CPU and GPU. All data is considered sequential, and RNNs are first-class citizens.

We hope the community will help us to further improve Brainstorm.

#machinelearning
#artificialintelligence
#computervision
#deeplearning

http://people.idsia.ch/~juergen/brainstorm.html

Post has attachment
The good news came on 8/5/15: I am recipient of the 2016 IEEE CIS Neural Networks Pioneer Award, “for pioneering contributions to deep learning and neural networks.” The list of all awardees since 1991 is here: http://cis.ieee.org/award-recipients.html

#machinelearning
#artificialintelligence
#computervision
#deeplearning

Post has attachment
Critique of Paper by "Deep Learning Conspiracy" (Nature 521 p 436)

Machine learning is the science of credit assignment. The machine learning community itself profits from proper credit assignment to its members. The inventor of an important method should get credit for inventing it. She may not always be the one who popularizes it. Then the popularizer should get credit for popularizing it (but not for inventing it). Relatively young research areas such as machine learning should adopt the honor code of mature fields such as mathematics: if you have a new theorem, but use a proof technique similar to somebody else's, you must make this very clear. If you "re-invent" something that was already known, and only later become aware of this, you must at least make it clear later.

As a case in point, let me now comment on a recent article in Nature (2015) about "deep learning" in artificial neural networks (NNs), by LeCun & Bengio & Hinton (LBH for short), three CIFAR-funded collaborators who call themselves the "deep learning conspiracy" (e.g., LeCun, 2015). They heavily cite each other. Unfortunately, however, they fail to credit the pioneers of the field, which originated half a century ago. All references below are taken from the recent deep learning overview (Schmidhuber, 2015), except for a few papers listed beneath this critique focusing on nine items.

1. LBH's survey does not even mention the father of deep learning, Alexey Grigorevich Ivakhnenko, who published the first general, working learning algorithms for deep networks (e.g., Ivakhnenko and Lapa, 1965). A paper from 1971 already described a deep learning net with 8 layers (Ivakhnenko, 1971), trained by a highly cited method still popular in the new millennium. Given a training set of input vectors with corresponding target output vectors, layers of additive and multiplicative neuron-like nodes are incrementally grown and trained by regression analysis, then pruned with the help of a separate validation set, where regularisation is used to weed out superfluous nodes. The numbers of layers and nodes per layer can be learned in problem-dependent fashion.
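
Purely to illustrate the training scheme described above, here is a rough NumPy reconstruction in the spirit of that method (not Ivakhnenko's original code; the data, layer width and stopping rule are hypothetical): pairs of features feed small quadratic units fitted by regression, the best units survive according to a separate validation set, and their outputs become the inputs of the next, deeper layer.

```python
import numpy as np
from itertools import combinations

def fit_unit(Xtr, ytr, Xva, i, j):
    """Fit one quadratic 'neuron' on feature pair (i, j) by least squares."""
    def design(X):
        a, b = X[:, i], X[:, j]
        return np.stack([np.ones_like(a), a, b, a * b, a * a, b * b], axis=1)
    coef, *_ = np.linalg.lstsq(design(Xtr), ytr, rcond=None)
    return coef, design(Xva) @ coef, design(Xtr) @ coef

def grow_deep_net(Xtr, ytr, Xva, yva, width=6, max_layers=4):
    """Grow layers of pairwise quadratic units; keep the best ones according
    to a separate validation set; stop when validation error stops improving."""
    best_err = np.inf
    for layer in range(max_layers):
        candidates = []
        for i, j in combinations(range(Xtr.shape[1]), 2):
            coef, pva, ptr = fit_unit(Xtr, ytr, Xva, i, j)
            err = np.mean((pva - yva) ** 2)
            candidates.append((err, ptr, pva))
        candidates.sort(key=lambda c: c[0])
        top = candidates[:width]
        if top[0][0] >= best_err:                      # pruning / stopping by validation error
            break
        best_err = top[0][0]
        Xtr = np.stack([c[1] for c in top], axis=1)    # survivors feed the next layer
        Xva = np.stack([c[2] for c in top], axis=1)
        print(f"layer {layer + 1}: best validation MSE {best_err:.4f}")
    return best_err

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = X[:, 0] * X[:, 1] + np.sin(X[:, 2]) + 0.1 * rng.normal(size=300)
grow_deep_net(X[:200], y[:200], X[200:], y[200:])
```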

2. LBH discuss the importance and problems of gradient descent-based learning through backpropagation (BP), and cite their own papers on BP, plus a few others, but fail to mention BP's inventors. BP's continuous form was derived in the early 1960s (Bryson, 1961; Kelley, 1960; Bryson and Ho, 1969). Dreyfus (1962) published the elegant derivation of BP based on the chain rule only. BP's modern efficient version for discrete sparse networks (including FORTRAN code) was published by Linnainmaa (1970). Dreyfus (1973) used BP to change weights of controllers in proportion to such gradients. By 1980, automatic differentiation could derive BP for any differentiable graph (Speelpenning, 1980). Werbos (1982) published the first application of BP to NNs, extending thoughts in his 1974 thesis (cited by LBH), which did not have Linnainmaa's (1970) modern, efficient form of BP. BP for NNs on computers 10,000 times faster per Dollar than those of the 1960s can yield useful internal representations, as shown by Rumelhart et al. (1986), who also did not cite BP's inventors.
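
For concreteness (an editorial sketch, not part of the historical record above): what all of these derivations amount to is the chain rule applied in reverse through the computational graph. A minimal NumPy example for a tiny two-layer net, with the analytic gradient checked against a finite difference:

```python
import numpy as np

def forward_backward(W1, W2, x, y):
    """Backpropagation as the plain chain rule on a tiny 2-layer net
    (hypothetical shapes; loss = 0.5 * ||W2 tanh(W1 x) - y||^2)."""
    h = np.tanh(W1 @ x)
    out = W2 @ h
    loss = 0.5 * np.sum((out - y) ** 2)
    # reverse pass: propagate dLoss/d(...) from the output back to the weights
    d_out = out - y                    # dL/d(out)
    dW2 = np.outer(d_out, h)           # dL/dW2
    d_h = W2.T @ d_out                 # dL/dh
    d_hpre = d_h * (1 - h ** 2)        # through tanh
    dW1 = np.outer(d_hpre, x)          # dL/dW1
    return loss, dW1, dW2

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(2, 4))
x, y = rng.normal(size=3), rng.normal(size=2)
loss, dW1, dW2 = forward_backward(W1, W2, x, y)

# quick finite-difference check of one weight's gradient
eps = 1e-6
W1p = W1.copy(); W1p[0, 0] += eps
num = (forward_backward(W1p, W2, x, y)[0] - loss) / eps
print(abs(num - dW1[0, 0]) < 1e-4)     # True (up to numerical error)
```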

3. LBH claim: "Interest in deep feedforward networks [FNNs] was revived around 2006 (refs 31-34) by a group of researchers brought together by the Canadian Institute for Advanced Research (CIFAR)." Here they refer exclusively to their own labs, which is misleading. For example, by 2006, many researchers had used deep nets of the Ivakhnenko type for decades. LBH also ignore earlier, closely related work funded by other sources, such as the deep hierarchical convolutional neural abstraction pyramid (e.g., Behnke, 2003b), which was trained to reconstruct images corrupted by structured noise, enforcing increasingly abstract image representations in deeper and deeper layers. (BTW, the term "Deep Learning" (the very title of LBH's paper) was introduced to Machine Learning by Dechter (1986), and to NNs by Aizenberg et al (2000), none of them cited by LBH.)

4. LBH point to their own work (since 2006) on unsupervised pre-training of deep FNNs prior to BP-based fine-tuning, but fail to clarify that this was very similar in spirit and justification to the much earlier successful work on unsupervised pre-training of deep recurrent NNs (RNNs) called neural history compressors (Schmidhuber, 1992b, 1993b). Such RNNs are even more general than FNNs. A first RNN uses unsupervised learning to predict its next input. Each higher level RNN tries to learn a compressed representation of the information in the RNN below, to minimise the description length (or negative log probability) of the data. The top RNN may then find it easy to classify the data by supervised learning. One can even "distill" a higher, slow RNN (the teacher) into a lower, fast RNN (the student), by forcing the latter to predict the hidden units of the former. Such systems could solve previously unsolvable very deep learning tasks, and started our long series of successful deep learning methods since the early 1990s (funded by Swiss SNF, German DFG, EU and others), long before 2006, although everybody had to wait for faster computers to make very deep learning commercially viable. LBH also ignore earlier FNNs that profit from unsupervised pre-training prior to BP-based fine-tuning (e.g., Maclin and Shavlik, 1995). They cite Bengio et al.'s post-2006 papers on unsupervised stacks of autoencoders, but omit the original work on this (Ballard, 1987).
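
A toy sketch of the history-compressor principle described above (deliberately simplified: a count-based next-symbol predictor stands in for the lower-level predictive RNN, and the "higher level" simply receives whatever the lower level failed to predict):

```python
from collections import defaultdict

class NextSymbolPredictor:
    """Count-based next-symbol predictor standing in for a lower-level
    predictive RNN (kept deliberately simple for this sketch)."""
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def predict(self, prev):
        table = self.counts[prev]
        return max(table, key=table.get) if table else None

    def update(self, prev, nxt):
        self.counts[prev][nxt] += 1

def compress(sequence):
    """History-compressor principle: only symbols the lower level failed to
    predict are passed up, so the higher level sees a shorter sequence."""
    predictor, passed_up, prev = NextSymbolPredictor(), [], None
    for t, sym in enumerate(sequence):
        if predictor.predict(prev) != sym:   # surprise -> pass up (with time stamp)
            passed_up.append((t, sym))
        predictor.update(prev, sym)
        prev = sym
    return passed_up

seq = list("abcabcabcabdabcabc")
print(compress(seq))  # the repetitive part is predicted away; early symbols and surprises get through
```

On repetitive input, the stream passed up quickly becomes much shorter than the raw sequence, which is the sense in which the higher level works on a compressed description of the data.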

5. LBH write that "unsupervised learning (refs 91-98) had a catalytic effect in reviving interest in deep learning, but has since been overshadowed by the successes of purely supervised learning." Again they almost exclusively cite post-2005 papers co-authored by themselves. By 2005, however, this transition from unsupervised to supervised learning was an old hat, because back in the 1990s, our unsupervised RNN-based history compressors (see above) were largely phased out by our purely supervised Long Short-Term Memory (LSTM) RNNs, now widely used in industry and academia for processing sequences such as speech and video. Around 2010, history repeated itself, as unsupervised FNNs were largely replaced by purely supervised FNNs, after our plain GPU-based deep FNN (Ciresan et al., 2010) trained by BP with pattern distortions (Baird, 1990) set a new record on the famous MNIST handwritten digit dataset, suggesting that advances in exploiting modern computing hardware were more important than advances in algorithms. While LBH mention the significance of fast GPU-based NN implementations, they fail to cite the originators of this approach (Oh and Jung, 2004).

6. In the context of convolutional neural networks (ConvNets), LBH mention pooling, but not its pioneer (Weng, 1992), who replaced Fukushima's (1979) spatial averaging by max-pooling, today widely used by many, including LBH, who write: "ConvNets were largely forsaken by the mainstream computer-vision and machine-learning communities until the ImageNet competition in 2012," citing Hinton's 2012 paper (Krizhevsky et al., 2012). This is misleading. Earlier, committees of max-pooling ConvNets were accelerated on GPU (Ciresan et al., 2011a), and used to achieve the first superhuman visual pattern recognition in a controlled machine learning competition, namely, the highly visible IJCNN 2011 traffic sign recognition contest in Silicon Valley (relevant for self-driving cars). The system was twice better than humans, and three times better than the nearest non-human competitor (co-authored by LeCun of LBH). It also broke several other machine learning records, and surely was not "forsaken" by the machine-learning community. In fact, the later system (Krizhevsky et al. 2012) was very similar to the earlier 2011 system. Here one must also mention that the first official international contests won with the help of ConvNets actually date back to 2009 (three TRECVID competitions) - compare Ji et al. (2013). A GPU-based max-pooling ConvNet committee also was the first deep learner to win a contest on visual object discovery in large images, namely, the ICPR 2012 Contest on Mitosis Detection in Breast Cancer Histological Images (Ciresan et al., 2013). A similar system was the first deep learning FNN to win a pure image segmentation contest (Ciresan et al., 2012a), namely, the ISBI 2012 Segmentation of Neuronal Structures in EM Stacks Challenge.
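
The difference between the two pooling operations mentioned above, in a short NumPy illustration on a toy feature map (sizes are hypothetical):

```python
import numpy as np

fmap = np.arange(16.0).reshape(4, 4)               # toy 4x4 feature map
blocks = fmap.reshape(2, 2, 2, 2).swapaxes(1, 2)   # 2x2 non-overlapping windows
print(blocks.mean(axis=(2, 3)))                    # spatial averaging (subsampling)
print(blocks.max(axis=(2, 3)))                     # max-pooling: keep the strongest response
```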

7. LBH discuss their FNN-based speech recognition successes in 2009 and 2012, but fail to mention that deep LSTM RNNs had outperformed traditional speech recognizers on certain tasks already in 2007 (Fernández et al., 2007) (and traditional connected handwriting recognisers by 2009), and that today's speech recognition conferences are dominated by (LSTM) RNNs, not by FNNs of 2009 etc. While LBH cite work co-authored by Hinton on LSTM RNNs with several LSTM layers, this approach was pioneered much earlier (e.g., Fernandez et al., 2007).

8. LBH mention recent proposals such as "memory networks" and the somewhat misnamed "Neural Turing Machines" (which do not have an unlimited number of memory cells like real Turing machines), but ignore very similar proposals of the early 1990s, on neural stack machines, fast weight networks, self-referential RNNs that can address and rapidly modify their own weights during runtime, etc (e.g., AMAmemory 2015). They write that "Neural Turing machines can be taught algorithms," as if this was something new, although LSTM RNNs were taught algorithms many years earlier, even entire learning algorithms (e.g., Hochreiter et al., 2001b).

9. In their outlook, LBH mention "RNNs that use reinforcement learning to decide where to look" but not that they were introduced a quarter-century ago (Schmidhuber & Huber, 1991). Compare the more recent Compressed NN Search for large attention-directing RNNs (Koutnik et al., 2013).

One more little quibble: While LBH suggest that "the earliest days of pattern recognition" date back to the 1950s, the cited methods are actually very similar to linear regressors of the early 1800s, by Gauss and Legendre. Gauss famously used such techniques to recognize predictive patterns in observations of the asteroid Ceres.

LBH may be backed by the best PR machines of the Western world (Google hired Hinton; Facebook hired LeCun). In the long run, however, historic scientific facts (as evident from the published record) will be stronger than any PR. There is a long tradition of insights into deep learning, and the community as a whole will benefit from appreciating the historical foundations.

The contents of this critique may be used (also verbatim) for educational and non-commercial purposes, including articles for Wikipedia and similar sites.

References not yet in the survey (Schmidhuber, 2015):

Y. LeCun, Y. Bengio, G. Hinton (2015). Deep Learning. Nature 521, 436-444. http://www.nature.com/nature/journal/v521/n7553/full/nature14539.html

Y. LeCun (2015). IEEE Spectrum Interview by L. Gomes, Feb 2015: http://spectrum.ieee.org/automaton/robotics/artificial-intelligence/facebook-ai-director-yann-lecun-on-deep-learning

R. Dechter (1986). Learning while searching in constraint-satisfaction problems. University of California, Computer Science Department, Cognitive Systems Laboratory. First paper to introduce the term "Deep Learning" to Machine Learning.

I. Aizenberg, N. N. Aizenberg, and J. P. L. Vandewalle (2000). Multi-Valued and Universal Binary Neurons: Theory, Learning and Applications. Springer Science & Business Media. First book to introduce the term "Deep Learning" to Neural Networks. Compare a popular G+ post on this: https://plus.google.com/100849856540000067209/posts/7N6z251w2Wd?pid=6127540521703625346&oid=100849856540000067209.

J. Schmidhuber (2015). Deep learning in neural networks: An overview. Neural Networks, 61, 85-117. Preprint: http://arxiv.org/abs/1404.7828

AMAmemory (2015): Answer at reddit AMA (Ask Me Anything) on "memory networks" etc (with references): http://www.reddit.com/r/MachineLearning/comments/2xcyrl/i_am_j%C3%BCrgen_schmidhuber_ama/cp0q12t


#machinelearning
#artificialintelligence
#computervision
#deeplearning

Link: http://people.idsia.ch/~juergen/deep-learning-conspiracy.html

Post has attachment
Who introduced the term “deep learning” to the field of Machine Learning (ML) and Neural Networks (NNs)? Just a few days ago we had an interesting discussion about this on the connectionists mailing list http://www.cnbc.cmu.edu/connectionists

Although Ivakhnenko had working, deep learning nets in the 1960s (still used in the new millennium), and Fukushima had them in the 1970s, and backpropagation also was invented back then, nobody called this “deep learning.”

In other contexts, the term has been around for centuries, but apparently it was first introduced to the field of Machine Learning in a paper by Rina Dechter at AAAI 1986 (thanks to Brian Mingus for pointing this out). She wrote not only about “deep learning,” but also “deep first-order learning” and “second-order deep learning.” Her paper was not about NNs though: http://www.aaai.org/Papers/AAAI/1986/AAAI86-029.pdf

To my knowledge, the term was introduced to the NN field by Aizenberg & Aizenberg & Vandewalle's book (2000): "Multi-Valued and Universal Binary Neurons: Theory, Learning and Applications.” They wrote about “deep learning of the features of threshold Boolean functions, one of the most important objects considered in the theory of perceptrons …”  http://www.researchbooks.org/0792378245/MULTI-VALUED-UNIVERSAL-BINARY-NEURONS/

The Google-generated graph seems to indicate that the term’s popularity went up right after Aizenberg et al.’s book came out in 2000. However, this graph is not limited to NN-specific usage. (Thanks to Antoine Bordes and Yoshua Bengio for pointing this out.)

Although my own team has published on deep learning for a quarter-century, we adopted the terminology only in the new millennium. Our first paper with the word combination “learn deep” in the title appeared at GECCO 2005: ftp://ftp.idsia.ch/pub/juergen/gecco05gomez.pdf

Of course, all of this is just syntax, not semantics. The real deep learning pioneers did their work in the 1960s and 70s!

I also mentioned this somewhere deep down in the AMA at reddit: http://www.reddit.com/r/MachineLearning/comments/2xcyrl/i_am_j%C3%BCrgen_schmidhuber_ama/cpfrrnr

Post has attachment
Ask me anything! I’ll try to answer at reddit.com in this thread: http://www.reddit.com/r/MachineLearning/comments/2xcyrl/i_am_j%C3%BCrgen_schmidhuber_ama/ . The AMA started on 4th March 2015; I'll keep answering questions in the next few days. Please bear with my sluggish responses!

Here is a short introduction about me from my website (you can read more about my lab’s work at http://people.idsia.ch/~juergen/):

Since age 15 or so, Jürgen Schmidhuber's main scientific ambition has been to build an optimal scientist through self-improving Artificial Intelligence (AI), then retire. He has pioneered self-improving general problem solvers since 1987, and Deep Learning Neural Networks (NNs) since 1991. The recurrent NNs (RNNs) developed by his research groups at the Swiss AI Lab IDSIA (USI & SUPSI) & TU Munich were the first RNNs to win official international contests. They recently helped to improve connected handwriting recognition, speech recognition, machine translation, optical character recognition, image caption generation, and are now in use at Google, Microsoft, IBM, Baidu, and many other companies. IDSIA's Deep Learners were also the first to win object detection and image segmentation contests, and achieved the world's first superhuman visual classification results, winning nine international competitions in machine learning & pattern recognition (more than any other team). His research group also established the field of mathematically rigorous universal AI and optimal universal problem solvers. His formal theory of creativity & curiosity & fun explains art, science, music, and humor. He also generalized algorithmic information theory and the many-worlds theory of physics, and introduced the concept of Low-Complexity Art, the information age's extreme form of minimal art. Since 2009 he has been a member of the European Academy of Sciences and Arts. He has published 333 peer-reviewed papers, earned seven best paper/best video awards, and is the recipient of the 2013 Helmholtz Award of the International Neural Networks Society.