Sean Taylor has a new blog post with a semi-serious discussion of what your choice of statistical software says about you. Here's what he says about using Python for statistics:

"You are a hacker who may have already been a programmer before you delved into statistics. You are probably willing to run alpha or beta-quality algorithms because the statistical package ecosystem is still evolving. You care about integrating your statistics code into a production codebase."

Personally, yes, I was a programmer before I became involved in statistics. And yes, I like Python because I do care about integrating statistical code into larger systems.
Last night on Twitter, I went on a bit of a rant about statistics packages (namely Stata and SPSS). My point was not that these software packages are bad per se, but that I have found them to be...
 
+John Cook Do people actually use Python for statistical work? I mean, serious work...
 
You said R: "You do not care about aesthetics, only availability of packages and getting results quickly."

That's the problem: the software has to be available and reasonably user-friendly. You shouldn't have to be a hacker to run some calculations.

SAS is a good stat package but it is so $$$$$ and hyperclosed.
 
+Daniel Lemire I use Python for serious statistical work. I write my code from scratch. I don't need a statistical library -- the statistical content of my work is new, so it's not in a library -- but I do need a mathematical library, and SciPy is fine for my needs.

More than Python, I use C++ and C#, and these languages provide even less statistical support. I've had to write the libraries I use in those languages.

I may be atypical because I'm an applied mathematician. I don't just do statistics, and I don't want to use a language that is only useful for statistics. I'm willing to put up with some inconvenience to avoid the greater inconvenience of having to use a different language for every task.
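To make the "mathematical library, not statistical library" point concrete, here is a minimal sketch of a one-sample t-test written from scratch, using SciPy only for a special function. The function name `one_sample_t` and its parameters are hypothetical, chosen for illustration. It is not code from anyone in this thread.

```python
import math
from scipy import special  # used purely as a mathematical library


def one_sample_t(xs, mu0=0.0):
    """Return (t, two-sided p-value) for H0: mean == mu0.

    Hypothetical helper for illustration; assumes len(xs) >= 2
    and that the sample is not constant.
    """
    n = len(xs)
    mean = sum(xs) / n
    # Unbiased sample variance (divides by n - 1).
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    t = (mean - mu0) / math.sqrt(var / n)
    df = n - 1
    # Two-sided tail probability of Student's t via the regularized
    # incomplete beta function: P(|T| > |t|) = I_x(df/2, 1/2),
    # where x = df / (df + t^2).
    p = special.betainc(df / 2.0, 0.5, df / (df + t * t))
    return t, p
```

Calling `one_sample_t([2.0, 4.0], mu0=0.0)` returns the t statistic and p-value as a tuple. The statistical logic lives in a few lines of plain Python, and the only library call is `scipy.special.betainc`.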
 
Python + pandas + scipy.stats + statsmodels + rpy2 + matplotlib already gives R a run for its money.