John Pate
Computational Psycholinguist

John's posts

I shook the dust off the old blog to talk about why brevity and incrementality conflict in natural language, and to give some thoughts on grammaticalization.

I just got my VPN set up with Macquarie, and thought I'd share some networking stuff that has been useful to me in the recent past:

In which I have read some neuroscience:

Post has shared content

"The krumpets gnorked the koof with a shlap"

While this sentence may not make much sense, we bet you could infer quite a lot from its structure. For example, perhaps you would be able to guess that a group of things called "krumpets" did something called "gnorking" to something called a "koof", and that they did so with a "shlap".

This is because sentences in languages such as English have structure. This structure is called syntax, and knowing the syntax of a sentence is a step towards understanding its meaning. The process of taking a sentence and transforming it into a syntactic structure is called parsing. At Google, we parse a lot of text every day, in order to better understand it and be able to provide better results and services in many of our products.

There are many kinds of syntactic representations (such as sentence diagramming), and at Google, we've been focused on a certain type of syntactic representation called "dependency trees". The dependency-tree representation is centered around words and the relations between them. Each word in a sentence can either modify or be modified by other words. The various modifications can be represented as a tree, in which each node is a word.
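
As a rough illustration of the idea, a dependency tree can be stored as a list of words in which each word records which other word it modifies. The sketch below is in Python and uses a hand-written analysis of the made-up sentence; the `Token` class, the helper functions, and the relation labels (`det`, `nsubj`, and so on) are illustrative assumptions, not the output or tag set of any actual parser.

```python
from dataclasses import dataclass


@dataclass
class Token:
    index: int     # 1-based position of the word in the sentence
    word: str
    head: int      # index of the word this one modifies; 0 marks the root
    relation: str  # illustrative relation label (an assumption, not an official tag set)


# Hand-written analysis of the example sentence (an assumption, not real parser output).
SENTENCE = [
    Token(1, "The", 2, "det"),
    Token(2, "krumpets", 3, "nsubj"),
    Token(3, "gnorked", 0, "root"),
    Token(4, "the", 5, "det"),
    Token(5, "koof", 3, "dobj"),
    Token(6, "with", 3, "prep"),
    Token(7, "a", 8, "det"),
    Token(8, "shlap", 6, "pobj"),
]


def modifiers(tokens, head_index):
    """Return the tokens that directly modify the word at head_index."""
    return [t for t in tokens if t.head == head_index]


def render(tokens, head_index=0, depth=0):
    """Return the tree as indented lines, one word per line, root first."""
    lines = []
    for t in modifiers(tokens, head_index):
        lines.append("  " * depth + f"{t.word} ({t.relation})")
        lines.extend(render(tokens, t.index, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(render(SENTENCE)))
```

Here every word has exactly one head, so the whole sentence forms a tree rooted at the verb "gnorked", with "krumpets", "koof", and "with" as its direct modifiers.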

This property, by which you can infer the structure of a sentence from various hints without knowing the actual meaning of its words, is very useful. For one, it suggests that even a computer could do a reasonable job at such an analysis, and indeed it can! While still not perfect, parsing algorithms these days can analyze sentences with impressive speed and accuracy. For instance, our parser correctly analyzes the made-up sentence at the beginning of this post.

Today, Google announces the release of a very large dataset of counted dependency tree fragments from the English Books Corpus. This resource will help researchers, among other things, to model the meaning of English words over time and create better natural-language analysis tools. The resource is based on information derived from a syntactic analysis of the text of millions of English books. 
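
To give a feel for what "counted dependency tree fragments" might mean, here is a minimal Python sketch that tallies the simplest possible fragment, a single head-modifier link, across a few parsed sentences. The triple format, the toy sentences, and the `count_fragments` helper are assumptions made up for illustration; they do not describe the actual format of the released dataset.

```python
from collections import Counter

# Each sentence is a list of (head word, relation, modifier word) triples.
# Hand-written toy data (an assumption), not drawn from the real corpus.
PARSED_SENTENCES = [
    [("gnorked", "nsubj", "krumpets"), ("gnorked", "dobj", "koof")],
    [("gnorked", "nsubj", "krumpets"), ("gnorked", "dobj", "shlap")],
]


def count_fragments(sentences):
    """Count how often each head-modifier fragment occurs across all sentences."""
    counts = Counter()
    for arcs in sentences:
        counts.update(arcs)
    return counts


FRAGMENT_COUNTS = count_fragments(PARSED_SENTENCES)

if __name__ == "__main__":
    for fragment, n in FRAGMENT_COUNTS.most_common():
        print(n, fragment)
```

Counts like these, gathered separately for books from different years, are the kind of evidence one could use to track how a word's typical modifiers change over time.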

To learn more, visit the Google Research Blog, linked below. 

"Principles and Parameters and Manifolds, oh my!": In which I accuse Generativism of not caring about grammar.

John K Pate. (2013) Predictability effects in language acquisition. PhD dissertation. Submitted, defense pending. 

Just like the previous post, but with pictures!

Bayesian modelling as the "right" way to do a computational-level statistical cognitive model