John Johnson
A data story-teller
I set up a new data analysis blog
Well, I tried to write a blog post using the RStudio R Markdown system, and utterly failed. Thus, I set up a system where I could write from RStudio. So I set up a GitHub Pages blog at . There I can easily write and publish posts involv...

Windows 10 anniversary update includes a whole Linux layer - this is good news for data scientists
If you are on Windows 10, no doubt you have heard that Microsoft included the bash shell in its 2016 Windows 10 anniversary update. What you may not know is that this is much, much more than just the bash shell. This is a whole Linux layer that enables you ...

Which countries have Regrexit?
This doesn't have a lot to do with the bio part of biostatistics, but it is an interesting data analysis that I just started. In the wake of the Brexit vote, there is a petition for a redo. The data for the petition is here, in JSON format. Fortunately, in R, wo...

Little Debate: defining baseline
In an April 30, 2015 note in Nature (vol 520, p. 612), Jeffrey Leek and Roger Peng note that p-values get intense scrutiny, while all the decisions that lead up to the p-values get little debate. I wholeheartedly agree, and so I'm creating a Little Debat...

Simulating a Weibull conditional on time-to-event being greater than a given time
Recently, I had to simulate a time-to-event of subjects who have been on a study, are still ongoing at the time of a data cut, but who are still at risk of an event (e.g. progressive disease, cardiac event, death). This requires the simulation of a conditio...
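One way to approach this kind of simulation is inverse-transform sampling from the conditional survival function. The sketch below is an illustration, not necessarily the method used in the post: it assumes a Weibull with survival function S(t) = exp(-(t/scale)^shape), so that S(t | T > t0) = exp((t0/scale)^shape - (t/scale)^shape). The function name `rweibull_conditional` and all parameter values are hypothetical.

```python
import math
import random


def rweibull_conditional(n, shape, scale, t0, rng=None):
    """Draw n Weibull(shape, scale) event times conditional on T > t0.

    Solves S(t | T > t0) = u for t, where u is uniform on (0, 1]:
        t = scale * ((t0 / scale) ** shape - log(u)) ** (1 / shape)
    """
    rng = rng or random.Random()
    base = (t0 / scale) ** shape  # cumulative hazard already survived at t0
    draws = []
    for _ in range(n):
        u = 1.0 - rng.random()  # in (0, 1], so log(u) is always defined
        t = scale * (base - math.log(u)) ** (1.0 / shape)
        draws.append(max(t, t0))  # guard against floating-point rounding below t0
    return draws


if __name__ == "__main__":
    # Hypothetical example: subjects still event-free at a data cut 6 months in.
    times = rweibull_conditional(5, shape=1.5, scale=12.0, t0=6.0,
                                 rng=random.Random(2016))
    print(times)
```

Setting `t0=0` reduces this to ordinary Weibull sampling, and with `shape=1` the draws are `t0` plus an exponential time, which is a quick sanity check on the formula.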

The American Statistical Association makes a statement on p-values
This deserves to be read, re-read, re-re-read, and taken to heart. The ASA makes their statement on p-values. Some key points: p-values by themselves offer very little information on the utility of a model, the truth of a statement, or what is behind a da...

Talk to Upstate Data Science Group on Caret
Last night I gave an introduction and demo of the caret R package to the Upstate Data Science group, meeting at Furman University. It was fairly well attended (around 20 people), and well received. It was great to get out of my own comfort zone a bit (sinc...

Even the tiniest error messages can indicate an invalid statistical analysis
The other day, I was reading in a data set in R, and the function indicated that there was a warning about a parsing error on one line. I went ahead with the analysis anyway, but that small parsing error kept bothering me. I thought it was just one line of ...

The thirty-day trial
Steve Pavlina wrote about a self-help technique called the thirty-day trial. To perform the technique, you commit to 30 days of some new habit, such as quitting smoking or writing in a journal. The idea is that it’s psychologically easier to commit to somethin...
