+John Cook Do people actually use Python for statistical work? I mean, serious work...

### John Cook (owner)

Discussion - Sean Taylor has a new blog post with a semi-serious discussion of what your choice of statistical software says about you. Here's what he says about using Python for statistics:

"You are a hacker who may have already been a programmer before you delved into statistics. You are probably willing to run alpha or beta-quality algorithms because the statistical package ecosystem is still evolving. You care about integrating your statistics code into a production codebase."

Personally, yes, I was a programmer before I became involved in statistics. And yes, I like Python because I do care about integrating statistical code into larger systems.

Last night on Twitter, I went on a bit of a rant about statistics packages (namely Stata and SPSS). My point was not that these software packages are bad per se, but that I have found them to be...


You said R: "You do not care about aesthetics, only availability of packages and getting results quickly."

That's the problem: it has to be available, and somewhat user friendly. You don't have to be a **hacker** to run some calculations.

SAS is a good stat package but it is so $$$$$ and hyperclosed.

For making R easier there's Zelig: http://projects.iq.harvard.edu/zelig

+Daniel Lemire I use Python for serious statistical work. I write my code from scratch. I don't need a statistical library -- the statistical content of my work is new, so it's not in a library -- but I do need a mathematical library, and SciPy is fine for my needs.

More than Python, I use C++ and C#, and these languages provide even less statistical support. I've had to write the libraries I use in those languages.

I may be atypical because I'm an applied mathematician. I don't just do statistics, and I don't want to use a language that is only useful for statistics. I'm willing to put up with some inconvenience to avoid the greater inconvenience of having to use a different language for every task.
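To make the "from scratch, with only a mathematical library" approach concrete, here is a minimal sketch. It is not code from this thread: the function names `normal_cdf` and `z_test_two_sided` are hypothetical, and the example assumes a textbook z-test with a known standard deviation, built on nothing but Python's standard `math` module.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Normal CDF expressed through the error function from the math module."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def z_test_two_sided(sample, mu0, sigma):
    """Return (z, p) for a two-sided z-test of the sample mean.

    Assumes the population standard deviation `sigma` is known.
    """
    n = len(sample)
    mean = sum(sample) / n
    z = (mean - mu0) / (sigma / math.sqrt(n))
    p = 2.0 * (1.0 - normal_cdf(abs(z)))
    return z, p


if __name__ == "__main__":
    # Hypothetical data, used only to exercise the functions.
    z, p = z_test_two_sided([1.0, 2.0, 3.0], mu0=2.0, sigma=1.0)
    print(z, p)
```

The only dependency is a special function (`math.erf`). The statistical logic itself is a few lines of arithmetic, which is roughly the trade-off described above: more to write yourself, but no language switch.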

Python + pandas + scipy.stats + statsmodels + rpy2 + matplotlib already gives R a run for its money.
