Dave Orr
Works at Google
Attended Stanford University
Lives in Los Altos, CA
173 followers|82,047 views


Dave Orr

Shared publicly  - 
We started working on the suggested reminder feature two years ago, on the Research team. Thrilled to see it out there and getting some love.
Dave Orr, Google Research Product Manager, introduces two new ways to add Reminders in +Inbox by Gmail 
115 comments on original post

Dave Orr

Shared publicly  - 
I wrote a blog post about some of the language modeling work happening in Google Research. This was a little bit out of my domain, so it was interesting to learn about the ins and outs of LMs.

Trivia: the swiping keyboards are called IME keyboards, but there's no appropriate Wikipedia page for them. Someone should make one!
A Billion Words: Because today's language modeling standard should be higher
Posted by +Dave Orr, Product Manager, and +Ciprian Chelba, Research Scientist

Language is chock full of ambiguity, and it can turn up in surprising places. Many words are hard to tell apart without context: most Americans pronounce “ladder” and “latter” identically, for instance.

One key way computers use context is with language models. These are used for predictive keyboards, but also speech recognition, machine translation, spelling correction, query suggestions, and so on.
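As a rough illustration of how a language model uses context, here is a minimal sketch of a bigram model with add-one smoothing that picks between "ladder" and "latter" based on the neighboring words. The toy corpus and the function names (`train_bigrams`, `bigram_prob`, `choose`) are invented for this example; production systems use far larger data and more sophisticated models.

```python
from collections import Counter

# Toy corpus, invented purely for illustration.
CORPUS = [
    "he climbed the ladder to the roof",
    "she held the ladder steady",
    "a ladder leaned against the wall",
    "of the two options i prefer the latter",
    "the latter half of the book is better",
    "the former is cheap and the latter is fast",
]


def train_bigrams(sentences):
    """Count single tokens and adjacent token pairs, with sentence markers."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def bigram_prob(prev, word, unigrams, bigrams):
    """Add-one smoothed estimate of P(word | prev)."""
    vocab_size = len(unigrams)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)


def choose(prev, nxt, candidates, unigrams, bigrams):
    """Pick the candidate that best fits between prev and nxt."""
    def score(candidate):
        return (bigram_prob(prev, candidate, unigrams, bigrams)
                * bigram_prob(candidate, nxt, unigrams, bigrams))
    return max(candidates, key=score)


unigrams, bigrams = train_bigrams(CORPUS)
homophones = ["ladder", "latter"]
print(choose("the", "to", homophones, unigrams, bigrams))    # "climbed the ___ to"
print(choose("the", "half", homophones, unigrams, bigrams))  # "the ___ half of"
```

The two words sound the same, so the only signal is the surrounding text: the model prefers whichever candidate it has seen more often next to those neighbors.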

We believe that the field could benefit from a large, standard set with benchmarks for easy comparison and experiments with new modeling techniques. To that end, we are releasing scripts that convert a set of public data into a language modeling benchmark consisting of over a billion words, with standardized training and test splits, described in an arXiv paper.

Along with the scripts, we’re releasing the processed data in one convenient location, along with the training and test data. This will make it much easier for the research community to quickly reproduce results, and we hope will speed up progress on these tasks. The benchmark scripts and data are freely available, and can be found at

For all the researchers out there, try out this model, run your experiments, and let us know how it goes -- or publish, and we’ll enjoy finding your results at conferences and in journals. Head over to the Google Research Blog to learn more.
11 comments on original post
Input method editor. The problem is that it's a general term that's being used for this specific kind of keyboard, which is confusing.

Dave Orr

Shared publicly  - 
Here are photos from dinner at Atelier Crenn. I will never get around to sharing these if I write up every dish, so I'm not going to. For those of you who like puzzles, one of the photos is the menu. It's a poem. The lines in the poem refer to the dishes presented here: see if you can figure out which is which.
Eric Wu
It's basically still chicken soup with crunchy bits in it, right?  I think it's the only thing I've ever really enjoyed quinoa in.  I think that and the mushroom dish have been on the menu in one form or another for a couple years now.

Dave Orr

Shared publicly  - 
Writing blog posts for the research blog is fun, but sometimes it can be tricky to make sure everyone involved is happy. Although this one is just a list of other blog posts, several of which I wrote, I think it's the post I've written that required the most rewrites and edits, all in the short intro.

Dave Orr

Shared publicly  - 
I just wrote this post about a dataset where we used our NLP tools to annotate 800 million documents with Freebase MIDs. The quote is maybe kinda relevant, but mostly a nod to the name of the project.

The process was pretty crazy -- the dataset is so large that we had to fall back on a high-bandwidth, high-latency solution: CMU shipped us a set of disks with the data on them, and then after we iterated a few times, we shipped them back.

And while I wrote the blog post, most of the work was done by lots of other people, mainly my coauthors, but at least 10 other people contributed significantly. Thanks to all of them!
11 Billion Clues in 800 Million Documents: A Web Research Corpus Annotated with Freebase Concepts
Posted by +Dave Orr, +Amar Subramanya, +Evgeniy Gabrilovich, and +Michael Ringgaard, Google Research

“I assume that by knowing the truth you mean knowing things as they really are.” - Plato

When you type in a search query -- perhaps Plato -- are you interested in the string of letters you typed? Or the concept or entity represented by that string? But knowing that the string represents something real and meaningful only gets you so far in computational linguistics or information retrieval -- you have to know what the string actually refers to.

The Knowledge Graph and Freebase are databases of things, not strings, and references to them let you operate in the realm of concepts and entities rather than strings and n-grams. We’ve previously released data to help with disambiguation and recently awarded $1.2M in research grants to work on related problems.

Today we’re taking another step: releasing data consisting of nearly 800 million documents automatically annotated with over 11 billion references to Freebase entities. To learn more details, and to download the data, visit the Google Research Blog, linked below.
  • Google
    Product Manager, 2012 - present
Los Altos, CA
Stanford, CA - Socorro, NM
  • Stanford University
    Symbolic Systems, 1992 - 1996
Basic Information