John Mount
Works at Win-Vector LLC


John Mount

Shared publicly  - 
In Gelman and Nolan's paper “You Can Load a Die, But You Can't Bias a Coin” (The American Statistician, November 2002, Vol. 56, No. 4), it is argued that you can't easily produce a coin that is biased when flipped (and caught). A number of variations that can be easily biased (such as spinning) are also ...

Proud to share Win-Vector LLC's new (paid) statistics course: Campaign Response Testing
I am proud to announce a new Win-Vector LLC statistics video course: Campaign Response Testing. John Mount, Win-Vector LLC. This course works through the very specific statistics problem of trying to estimate the unknown true response rates of one or more populations in responding to ...

New technical article: "Using closures as objects in R". Some writing on R as a programming language.

Don’t use the Sharpe ratio to A/B test email campaigns 
Having worked in finance I am a public fan of the Sharpe ratio. I have written about this here and here. One thing I have often forgotten (driving some bad analyses) is: the Sharpe ratio isn't appropriate for models of repeated events that already have linked mean and variance (such as Poisson ...

Win-Vector LLC is proud to announce the R data science value pack. 50% off our video course Introduction to Data Science (available at Udemy) and 30% off Practical Data Science with R (from Manning). Pick any combination of video, e-book, and/or print-book you want. Instructions below.

Nina Zumel and I are proud to announce our new data science video course: Introduction to Data Science.  Here is a half-off coupon for those of you who want to check it out (should be good for about 2 weeks): 
Michael Witbrock: +ann this may be of interest to you

As an #R programmer, have you ever wondered what can be in a data.frame column?
If you ask an R programmer, the commonly depended-upon properties of data.frame columns are: All columns in a data frame have the same length. (true, but with an asterisk); All columns in a data frame are vectors with type (see help(typeof) ) and class (see help(class) ) deriving from one of ...

New technical R article, where I get to use the term "unfulfilled promise leak"
One of the advantages of functional languages (such as R) is the ability to create and return functions “on the fly.” We will discuss one good use of this capability and what to look out for when creating functions in R. Why wrap/return functions? One of my favorite uses of “on the fly ...
Commenters: Andrej Bauer, John Mount
Yes, it turns out R is a functional language with only immutable data structures (but mutable environments). It is essentially a Scheme (it has static/lexical closures) executing fexprs (function-like things that take lazy arguments).

It just pretends, through some syntactic sugar and environment-mutation foo, to be imperative or object oriented (and it is not good at object oriented).

Deal of the Day March 15: Half off my book Practical Data Science with R. Use code dotd031515au at

The Win-Vector LLC value pack!

Half off Introduction to Data Science video course:

10% off Practical Data Science with R book

Free in-depth blog content:

And Win-Vector LLC consulting services:

I don't just write about ghosty folklore. I write about folk theorems, too.
It's a folk theorem I sometimes hear from colleagues and clients: that you must balance the class prevalence before training a classifier. Certainly, I believe that classification tends to be easier when the classes are nearly balanced, especially when the class you are actually interested in is ...
Principal Consultant, Win-Vector LLC
  • Win-Vector LLC
    Principal Consultant, present
I produce applied research, prototyping and training in information extraction, algorithms and data-mining for web-scale businesses, hedge funds and startups. Right now I do this as a consultant at Win-Vector LLC.

Please check out our book Practical Data Science with R

Also check out the Win-Vector LLC blog and our Twitter feed.
John Mount's +1's are the things they like, agree with, or want to recommend.
Estimating Generalization Error with the PRESS statistic

As we’ve mentioned on previous occasions, one of the defining characteristics of data science is the emphasis on the availability of “large”

Factors are not first-class citizens in R

The primary user-facing data types in the R statistical computing environment behave as vectors. That is: one dimensional arrays of scalar v

Frequentist inference only seems easy

Two of the most common methods of statistical inference are frequentism and Bayesianism (see Bayesian and Frequentist Approaches: Ask the Ri

R style tip: prefer functions that return data frames

While following up on Nina Zumel’s excellent Trimming the Fat from glm() Models in R I got to thinking about code style in R. And I realized

Trimming the Fat from glm() Models in R

One of the attractive aspects of logistic regression models (and linear models in general) is their compactness: the size of the model grows

Save 45% on Practical Data Science with R (expires May 21, 2014)

Please share this generous deal from Manning publications: save 45% on Practical Data Science with R through May 21, 2014. Please tweet, for

R has some sharp corners

R is definitely our first choice go-to analysis system. In our opinion you really shouldn’t use something else until you have an articulated

Some R Resources for GLMs

by Joseph Rickert Generalized Linear Models have become part of the fabric of modern statistics, and logistic regression, at least, is a “go

You don’t need to understand pointers to program using R

R is a statistical analysis package based on writing short scripts or programs (versus being based on GUIs like spreadsheets or directed wor

Oldies but Goldies: Statistical Graphics Books

I just wanted to plug for three classical books on statistical graphics that I really enjoyed reading. The books are old (that is, older tha

I can haz buzzwords?

Catty title aside, this post takes a good swing at defining terms we hear thrown around about data these days and they mostly do a good job.

Practical Data Science with R October 2013 update

A quick status update on our upcoming book “Practical Data Science with R” by Nina Zumel and John Mount. We are really happy with how the bo

[Book] Practical Data Science with R

Nina Zumel and John Mount have been working very hard on producing an exciting new book called “Practical Data Science with R.” The book has

Prefer = for assignment in R

We share our opinion that = should be preferred to the more standard <- for assignment in R. This is from a draft of the appendix of our upc

How to outrun a crashing alien spaceship

Hollywood movies are obsessed with outrunning explosions and outrunning crashing alien spaceships. For explosions the movies give the optima

Allen Bushnell, Fish Rap: Salmon on the prowl near the shore in Monterey...

Allen Bushnell Fish Rap The weather forecasts a 4- to 6-foot northwest swell this weekend, but that shouldn't slow fishing down.