Patrick Nguyen

Stream

Patrick Nguyen

Shared publicly  - 
 
Thank you, Google, thank you, for spell-checking my email in whichever language I happen to type it.

Patrick Nguyen

Shared publicly  - 
 
Mind-blowing. There's a recording of the Virtual Rachmaninoff on Spotify.
2 comments
 
I ordered the book, we shall see. One of his LISP programs is called "Markov".

Patrick Nguyen

Shared publicly  - 
 
More data is better data.
 
Large Scale Language Modeling in Automatic Speech Recognition by +Ciprian Chelba 

At Google, we’re able to use the large amounts of data made available by the Web’s fast growth. Two such data sources are the anonymized queries on google.com and the web itself. They help improve automatic speech recognition through large language models: Voice Search makes use of the former, whereas YouTube speech transcription benefits significantly from the latter. 

The language model is the component of a speech recognizer that assigns a probability to the next word in a sentence given the previous ones. As an example, if the previous words are “new york”, the model would assign a higher probability to “pizza” than to, say, “granola”. The n-gram approach to language modeling (predicting the next word based on the previous n-1 words) is particularly well-suited to such large amounts of data: it scales gracefully, and the non-parametric nature of the model allows it to grow with more data. For example, on Voice Search we were able to train and evaluate 5-gram language models consisting of 12 billion n-grams, built using large vocabularies (1 million words), and trained on as many as 230 billion words. 
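
To make the n-gram idea concrete, here is a minimal sketch of a count-based model that estimates the probability of the next word from the previous n-1 words. It is an illustration only, not the system described in the post: the toy corpus, the function names `train_ngram_counts` and `next_word_probability`, and the `<s>`/`</s>` padding markers are all assumptions, and the smoothing and back-off that large models rely on are left out.

```python
from collections import Counter, defaultdict


def train_ngram_counts(sentences, n=3):
    """Count how often each word follows each (n-1)-word context."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # Pad with start markers so the first words also have a full context.
        tokens = ["<s>"] * (n - 1) + sentence.lower().split() + ["</s>"]
        for i in range(n - 1, len(tokens)):
            context = tuple(tokens[i - n + 1:i])
            counts[context][tokens[i]] += 1
    return counts


def next_word_probability(counts, context, word):
    """Maximum-likelihood estimate of P(word | context); 0.0 if the context is unseen."""
    followers = counts.get(tuple(context))
    if not followers:
        return 0.0
    return followers[word] / sum(followers.values())


# Toy corpus, invented for illustration only.
corpus = [
    "new york pizza is great",
    "i love new york pizza",
    "new york pizza near me",
    "new york granola bars",
]
counts = train_ngram_counts(corpus, n=3)
p_pizza = next_word_probability(counts, ("new", "york"), "pizza")
p_granola = next_word_probability(counts, ("new", "york"), "granola")
print(p_pizza, p_granola)
```

On this toy corpus "pizza" follows "new york" more often than "granola" does, so it receives the higher estimate. Because the model is just a table of counts, adding more text simply adds more entries, which is the sense in which such a model grows with the data.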

The computational effort pays off, as highlighted by the plot below: both word error rate (a measure of speech recognition accuracy) and search error rate (a metric we use to evaluate the output of the speech recognition system when used in a search engine) decrease significantly with larger language models.
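
For readers unfamiliar with word error rate, the sketch below computes it in the usual way, as the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The function name `word_error_rate` and the example sentences are assumptions made for illustration; the post does not say how its own scoring was implemented, and search error rate is not covered here.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Invented example: one substituted word and one dropped word
# against a five-word reference.
wer = word_error_rate("new york pizza near me", "new york pisa near")
print(wer)
```

A lower value means fewer recognition mistakes; a perfect transcript scores 0.0.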

http://goo.gl/GqHOs: A more detailed summary of results on Voice Search and a few YouTube speech transcription tasks, written by +Ciprian Chelba, +Dan Bikel, +Masha Shugrina, +Patrick Nguyen and Shankar Kumar (http://goo.gl/QhQCl), presents our results when increasing both the amount of training data and the size of the language model estimated from such data. Depending on the task, the availability and amount of training data used, the language model size, and the performance of the underlying speech recognizer, we observe relative reductions in word error rate of between 6% and 10%, for systems across a wide range of operating points.
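
A "relative" reduction is measured against the baseline error rate rather than in absolute percentage points. The short sketch below shows the arithmetic; the function name `relative_reduction` and the baseline and improved error rates are hypothetical values chosen for illustration and are not results from the summary.

```python
def relative_reduction(baseline_wer, new_wer):
    """Fraction of the baseline error rate removed by the new system."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical numbers, not taken from the source: a baseline at 20% WER
# improved to 18% WER is 2 points absolute, roughly 10% relative.
r = relative_reduction(0.20, 0.18)
print(f"{r:.1%}")
```

The same absolute gain is a larger relative reduction on a system that already has a low error rate, which is why the relative figure is quoted across different operating points.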

Cross-posted with the Research Blog: http://googleresearch.blogspot.com/

Patrick Nguyen

Shared publicly  - 
 
Disclaimer: the hand is not actually drawing.

Patrick Nguyen

Shared publicly  - 
5 comments
 
Fantastic!

Patrick Nguyen

Shared publicly  - 
 
Yep. That's the speech button right there.
 
Just a hint for +Google ... anyone else agree?
Basic Information
Gender
Male