Profile

Nickolay Shmyrev
Lives in Москва
416 followers|163,006 views

Stream

Nickolay Shmyrev

Shared publicly  - 
 
CMUSphinx-powered app for League of Legends https://play.google.com/store/apps/details?id=nl.selwyn420.vast
Voice Activated Summoner Timer or VAST is a timer for the players of League of Legends / LoL to keep track of summoner spell cooldowns. VAST can be used either by voice commands or the GUI. Functions:
- Automatically detects enemy champions and their summoner spells.
- Takes summoner spell reduction into account, such as Insight and Boots of Lucidity.
- Plays an audio cue when summoners are up.
- Customisable reaction time and warning offset.
- Works o...

Nickolay Shmyrev

Shared publicly  - 
 
This is a big technical problem to solve, and a pretty interesting one:

http://gizmodo.com/tv-report-on-accidental-amazon-orders-triggers-attempte-1790958217

Also, i-vectors do not really work for short utterances.
In millions of homes across the country, Amazon’s voice-controlled personal assistant, Alexa, is listening. And whether you want to or not, she’s ready to play.

Nickolay Shmyrev

Shared publicly  - 
 
Learning with huge memory

Recently a set of papers was published about "memorization" in neural networks. For example:

Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer https://openreview.net/forum?id=B1ckMDqlg

also

Understanding deep learning requires rethinking generalization https://openreview.net/forum?id=Sy8gdB9xx

It seems that a large-memory system has a point. You don't need millions of computing cores in a CPU, which is too power-hungry anyway; you could just go ahead with a very large memory and a reasonable number of cores that access it by hashing (think of Shazam or randlm, or G2P by analogy). You probably do not need heavy tying either.

The advantages are: you can quickly incorporate new knowledge by just putting new values in memory; you can model corner cases, since they all remain accessible; and, again, you are much more energy-efficient.

Maybe someday we will see mobile phones with 1Tb of memory.
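
To make the "memory accessed with hashing" idea a bit more concrete, here is a minimal sketch. It is only a toy illustration, not how Shazam, randlm, or any G2P system is actually implemented: the class name `HashedMemory`, the slot count, and the example pronunciation are all assumptions made up for this example. It shows that new knowledge is incorporated by a single write and that rare entries stay addressable without any retraining.

```python
import hashlib


class HashedMemory:
    """Toy key-value memory addressed by hashing (hypothetical illustration)."""

    def __init__(self, num_slots=1 << 16):
        self.num_slots = num_slots
        # Each slot holds a small bucket so that colliding keys are not lost.
        self.slots = [[] for _ in range(num_slots)]

    def _index(self, key):
        # A stable hash of the key selects the slot to access.
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.num_slots

    def write(self, key, value):
        # Adding or updating knowledge is one memory write, no retraining.
        bucket = self.slots[self._index(key)]
        for i, (stored_key, _) in enumerate(bucket):
            if stored_key == key:
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def read(self, key, default=None):
        # Lookup touches one slot, however large the table is.
        for stored_key, value in self.slots[self._index(key)]:
            if stored_key == key:
                return value
        return default


# G2P-style usage: the pronunciation string is an invented example.
memory = HashedMemory()
memory.write("sphinx", "S F IH NG K S")
print(memory.read("sphinx"))
print(memory.read("unknown-word", default="<no entry>"))
```

A real system would trade exactness for space (randlm-style structures are lossy), which this sketch does not attempt to model.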

Nickolay Shmyrev

Shared publicly  - 
 
In the year of voice interfaces, this becomes a pretty interesting thought:

http://www.cs.utexas.edu/users/EWD/transcriptions/EWD06xx/EWD667.html
On the foolishness of "natural language programming". Since the early days of automatic computing we have had people that have felt it as a shortcoming that programming required the care and accuracy that is characteristic for the use of any formal symbolism. They blamed the mechanical slave for ...

Nickolay Shmyrev

Shared publicly  - 
 
The Zero Cost project results were recently published at http://slim-sig.irisa.fr/me16proc. There were not too many participants, but the research and results are pretty interesting. Basically, GMMs worked better for unsupervised adaptation.

This research is very relevant to building speech recognition for a language from public resources on the internet.

The cool thing is there is a leaderboard, so it is still possible to submit results:

http://www.zero-cost.org

Thanks to +Xavier Anguera
Date, Team name, Title, Devel local (WER), Devel (WER), Test (WER). 1. 2016-09-13 14:26:01, BUT, BUT - Babel Kaldi BLSTM 8kHz - LM tune Late submission, 12.2 6.4 | 36.0 | 19.1. ELSA | FORVO | RHINOSPIKE, 17.6 6.2 | 56.4 | 16.9. ELSA | FORVO | RHINOSPIKE, 46.3 4.6 | 52.6 | 32.2 | 84.7
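
For readers unfamiliar with the WER columns in the leaderboard, word error rate is the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. Below is a minimal sketch of that computation. The function name `word_error_rate` is made up for this example, and real scoring tools usually also normalize the text before scoring, which is not shown here.

```python
def word_error_rate(reference, hypothesis):
    """Return WER in percent using word-level Levenshtein distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return 100.0 * prev[-1] / len(ref)


# One deleted word out of six reference words.
print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.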

Nickolay Shmyrev

Shared publicly  - 
rhubarb-lip-sync - Rhubarb Lip-Sync is a command-line tool that automatically creates 2D mouth animation from voice recordings. You can use it for characters in computer games, in animated cartoons...

Nickolay Shmyrev

Shared publicly  - 
 
Hagen Soltau moved from IBM to Google and started to work at Google scale: a training set of 125,000 hours for YouTube captions.

Neural Speech Recognizer: Acoustic-to-Word LSTM Model for Large Vocabulary Speech Recognition

Hagen Soltau, Hank Liao, Hasim Sak

https://arxiv.org/pdf/1610.09975.pdf

Nickolay Shmyrev

Shared publicly  - 
Think today's computers are smart? Just look at what's coming. Meet a multinational bullpen of computer scientists who are rapidly bridging the divide between humans and machines.

Nickolay Shmyrev

Shared publicly  - 
News from Cambridge businesses. Network members upload news here about their products, services and achievements.

Nickolay Shmyrev

Shared publicly  - 
 
Not quite a scientific paper, but "memorization" is a very promising concept.

UNDERSTANDING DEEP LEARNING REQUIRES RE-THINKING GENERALIZATION
Chiyuan Zhang et al

https://openreview.net/pdf?id=Sy8gdB9xx

Nickolay Shmyrev

Shared publicly  - 
World chess champion Magnus Carlsen narrowly retained his title against Russian challenger Sergey Karjakin after the match went to rapid playoffs in New York. Carlsen played a brilliant queen sacrifice to force a checkmate with seconds on the clock.

Nickolay Shmyrev

Shared publicly  - 
 
There is something very appealing in end-to-end learning. Trainers and recognizers are a hundred lines of code in total, with no need for huge codebases. Things like language identification also become very simple; you don't need i-vectors or anything. If only the world were that simple.
 
TensorFlow implementation of speech recognition based on DeepMind's WaveNet paper https://goo.gl/SDI3ps and using the VCTK corpus on a Titan X GPU.
speech-to-text-wavenet - Speech-to-Text-WaveNet : End-to-end sentence level English speech recognition based on DeepMind's WaveNet and tensorflow
Comment from Daniel Povey: impressive.
Basic Information
Gender
Male
Other names
Николай Шмырёв
Places
Currently
Москва
Previously
Астрахань
Nickolay Shmyrev's +1's are the things they like, agree with, or want to recommend.
Unsupervised Feature Selection on Data Streams / Streaming Anomaly Detec...
nuit-blanche.blogspot.com

Today, we see the use of streaming algorithms to figure out anomaly detection or unsupervised feature selection. Streaming Anomaly Detection

Solving the Cocktail Party Problem with a 3D Printed Metamaterial Disc
3dprint.com

If you have ever tried to give vocal directions to your smartphone while amidst a group of voices, you know how hard it is for a computer to

Neural Word Embeddings as Implicit Matrix Factorization
nuit-blanche.blogspot.com

Recently at the Paris Machine Learning meetup there was a brief presentation on Word2Vec by Charles Ollion. Well, I was wondering about the

How Crowdsourcing Will Help Startups Build Their Own Versions of Siri | ...
www.wired.com

Speech recognition is hard, even for the world’s largest tech companies. Apple and Google draw on massive collections of recordings of real

Interspeech 2014 Recap
spokenlanguageprocessing.blogspot.com

This year's Interspeech was in Singapore. Singapore is, in some ways, a very easy venue to travel to. It's a modern, cosmopolitan city. They

Miro
market.android.com

This magic mirror loves you. You are the most beautiful girl on the world for him. He will try to satisfy your most crazy desires, just ask

Theory of Convex Optimization for Machine Learning / Estimation in high ...
nuit-blanche.blogspot.com

Sebastien Bubeck just came out with a monograph on the Theory of Convex Optimization for Machine Learning while Roman Vershynin just release

Intel Pays Up To $30M For A Personal Assistant Platform From Ginger Soft...
techcrunch.com

Apple has Siri, and now Intel has Ginger. The chipmaker has made one more acquisition to bolster its advanced computing and artificial intel

Smile - Smart Photo Annotation
market.android.com

Are you tired from taking pictures and then not being able to find them in between the hundreds of pictures on your smartphone? Are you tire

Ispikit
plus.google.com

Ispikit helps you practice, assess and improve your English pronunciation

With A Voice Interface API For Any App, Wit.ai Wants To Be The Twilio Fo...
techcrunch.com

Last year, voice technology giant Nuance quietly acquired VirtuOz, a developer of virtual assistants for online sales, marketing and support

Hot. Cool. Yours. Fin! Sochi Olympics close with breathtaking show
rt.com

After two weeks of cheering, daring and record breaking, the Sochi 2014 Olympic Games finally bids farewell as athletes and fans gather one

Saturday Morning Video: Unraveling dolphin communication complexity: Pas...
nuit-blanche.blogspot.com

For some odd reason, this video from last saturday on the Analyzing Animal Vocal Sequences Investigative Workshop could not play correctly.

Chrome hack lets websites keep listening after you close the tab
www.theverge.com

Toying around with voice-recognition apps, developer Tal Ater noticed something strange. Because of a quirk in Chrome's microphone settings,

Sunday Morning Insight: Randomization is not a dirty word
nuit-blanche.blogspot.com

From [8] The recent announcement of Yann LeCun's appointment as a director of the new Artificial Intelligence Lab at Facebook and Geoff Hint

Machined Learnings: ICML 2013: Sparse, Deep, and Random
www.machinedlearnings.com

ICML 2013 was a great conference this year, kudos to the organizers. It's too big for an individual to get a comprehensive view of everythin

OpenEars 1.3.0 out now with Pocketsphinx and Sphinxbase .8 | Politepix
www.politepix.com

Sign up for the Politepix OpenEars frameworks mailing list here in order to receive infrequent notifications of when OpenEars frameworks (su

Speech recognition engine PocketSphinx landed in Ubuntu 13.10 by default...
www.iloveubuntu.net

Months ago, the developers announced and explained Ubuntu's converged vision, where a singular OS is to power phones, tablets, desktops, TVs

Deep Thoughts on ICASSP 2013
spokenlanguageprocessing.blogspot.com

ICASSP 2013 is wrapping up today in Vancouver. Unfortunately, I missed the last day (and sessions on speech synthesis and prosody that I wou

Blame the linguists!
thelousylinguist.blogspot.com

Pullum has let me down. His latest NLP lament isn’t nearly as enraging or baffling as his previous posts. I basically agree with his points