Nickolay Shmyrev

Text Normalization Challenge at Kaggle, sponsored by Google.

"As many of us can attest, learning another language is tough. Picking up on nuances like slang, dates and times, and local expressions, can often be a distinguishing factor between proficiency and fluency. This challenge is even more difficult for a machine.

Many speech and language applications, including text-to-speech synthesis (TTS) and automatic speech recognition (ASR), require text to be converted from written expressions into appropriate "spoken" forms. This is a process known as text normalization, and helps convert 12:47 to "twelve forty-seven" and $3.16 into "three dollars, sixteen cents."

However, one of the biggest challenges when developing a TTS or ASR system for a new language is to develop and test the grammar for all these rules, a task that requires quite a bit of linguistic sophistication and native speaker intuition.

In this competition, you are challenged to automate the process of developing text normalization grammars via machine learning. This track will focus on English, while a separate one will focus on Russian here: Russian Text Normalization Challenge"
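To make the quoted examples concrete, here is a minimal hand-written sketch of the kind of rule the challenge wants to learn automatically. It covers only the two cases mentioned in the quote (clock times and small dollar amounts). The function names and regular expressions are my own illustration, not anything from the competition or from a real TTS front end.

```python
import re

# Words for 0..19 and for the tens; enough for this two-digit sketch.
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

# Hypothetical patterns: "12:47" style times and "$3.16" style amounts.
TIME_RE = re.compile(r"\b(\d{1,2}):(\d{2})\b")
MONEY_RE = re.compile(r"\$(\d{1,2})\.(\d{2})\b")


def number_to_words(n):
    """Spell out an integer in the range 0..99."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch only handles 0..99")
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


def normalize_time(match):
    """Turn a matched HH:MM into its spoken form."""
    hours, minutes = int(match.group(1)), int(match.group(2))
    if minutes == 0:
        return f"{number_to_words(hours)} o'clock"
    if minutes < 10:
        return f"{number_to_words(hours)} oh {number_to_words(minutes)}"
    return f"{number_to_words(hours)} {number_to_words(minutes)}"


def normalize_money(match):
    """Turn a matched $D.CC into its spoken form."""
    dollars, cents = int(match.group(1)), int(match.group(2))
    dollar_unit = "dollar" if dollars == 1 else "dollars"
    cent_unit = "cent" if cents == 1 else "cents"
    return (f"{number_to_words(dollars)} {dollar_unit}, "
            f"{number_to_words(cents)} {cent_unit}")


def normalize(text):
    """Apply the money rule, then the time rule, to a written string."""
    text = MONEY_RE.sub(normalize_money, text)
    return TIME_RE.sub(normalize_time, text)


print(normalize("The train leaves at 12:47 and the ticket costs $3.16."))
```

Even this toy version needs special cases for "o'clock", "oh five" and singular units, which hints at why writing such grammars by hand for every language is so laborious.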

A secret of the successful grant proposal (rarely mentioned): write the proposal for an already solved problem, and use the grant to solve the next task, which you will then use in writing the next proposal. This is the only way you can actually answer all those stupid questions like "What risks will you encounter?". Poor academia people.

We have released a Kaldi chain model for Russian.

CSTR invites you to participate in the second Voice Conversion Challenge.

The purpose of this challenge is to compare different approaches for converting source speakers' voices into different target speakers' voices included in the common corpus provided by organizers and to deeply understand the current performance and remaining issues of the voice conversion technology. Naturalness and similarity scores of the converted speech will be evaluated via listening tests.

In the first challenge, held in 2016, we focused on voice conversion strategies using a parallel corpus. In the second challenge, we will release a new database and protocols that allow participants to build their voice conversion systems based on parallel data and/or non-parallel data. We will provide baseline scripts for training GMM and DNN voice conversion systems. For more details including rules, please see

The current schedule is as follows:

Oct 1st, 2017: release of training data (and registration deadline)
Dec 1st, 2017: release of evaluation data
Dec 8th, 2017: deadline to submit the converted audio
Jan 26, 2018: notification of results
As with the first challenge, there is no participation fee. Interested participants should register their team's information online by Oct 1st, 2017.

Please feel free to contact us if you have questions.

Looking forward to hearing from you.

The Second Voice Conversion Challenge

Junichi Yamagishi & Jaime Lorenzo-Trueba (National Institute of Informatics)
Tomoki Toda (Nagoya University)
Daisuke Saito (Tokyo University)
Fernando Villavicencio (Oben)
Tomi Kinnunen (University of Eastern Finland)
Zhenhua Ling (University of Science and Technology of China)

So Google comes to open source ASR
Announcing the Speech Commands Dataset, enabling you to build basic but useful voice interfaces for applications.

Interspeech starts tomorrow

Battle rap gets a lot of attention, even in mainstream media here in Russia.

Stanford teaches software to read rap

Rappify: Adding Rhythm to Speech

HKUST develops battle rap bot

Freestyle: a Rap Battle Bot that Learns to Improvise

All sponsored by DARPA.
