http://blog.revolutionanalytics.com/2015/07/efficient-accumulation-in-r.html

John Mount

New Win-Vector LLC technical article: "Efficient accumulation in R"

by John Mount, Data Scientist, Win-Vector LLC. R has a number of very good packages for manipulating and aggregating data (plyr, sqldf, RevoScaleR, data.table, and more), but when it comes to accumulating results the beginning R user is often at sea. The R execution model is a bit exotic, so many R users are uncertain which methods of accumulating results are efficient and which are inefficient. Accumulating wheat (Photo: Cyron Ray Macey, some ...

Update: data.table is totally the way to go.

Nina Zumel's next Win-Vector LLC technical article: Working with sessionized data 2: variable selection http://www.win-vector.com/blog/2015/07/working-with-sessionized-data-2-variable-selection/

In our previous post in this series, we introduced sessionization, or converting log data into a form that's suitable for analysis. We looked at basic considerations, like dealing with time, choosi...

Today's one minute hate: OSX "my god it's full of meh" https://www.youtube.com/watch?v=okmJ7IESJe8

New Win-Vector LLC technical finance article

What is a good Sharpe ratio?

http://www.win-vector.com/blog/2015/06/what-is-a-good-sharpe-ratio/

We have previously written that we like the investment performance summary called the Sharpe ratio (though it does have some limits). What the Sharpe ratio does is: give you a dimensionless score t...

Neural net image salad again (with code)

http://www.win-vector.com/blog/2015/06/neural-net-image-salad-again-with-code/ (with Michael Witbrock and Scott Neal Reilly, plus a call-out to Scott Draves).

Alexander Mordvintsev, Christopher Olah, and Mike Tyka recently posted a great research blog article where they tried to visualize what an image classification neural net “wants to see.” They achieve this by optimizing the input to correspond to a fixed pattern of neural net internal node ...

Text encoding is a convoluted mess

http://www.win-vector.com/blog/2015/07/text-encoding-is-a-convoluted-mess/

Modern text encoding is a convoluted mess where costs can easily exceed benefits. I admit we are in a world that has moved beyond ASCII (which at best served only English, and even then without ful...

John Mount

Anyone remember the correct text and source of this almost-remembered maxim? "Laws/axioms/rules that are true are true in the extreme." It is the mathematical idea that if a rule is correct then it is correct in all cases (even the cases that seem ridiculously hard).

New Win-Vector LLC technical article by Nina Zumel: "Working with Sessionized Data 1: Evaluating Hazard Models" http://www.win-vector.com/blog/2015/07/working-with-sessionized-data-1-evaluating-hazard-models/

When we teach data science we emphasize the data scientist's responsibility to transform available data from multiple systems of record into a wide or denormalized form. In such a “ready to analyze...

A/B test design via dynamic programming and R

http://blog.revolutionanalytics.com/2015/07/ab-testing-advertisements-with-r.html

by John Mount, Ph.D., Data Scientist at Win-Vector LLC. Win-Vector's last article on A/B testing described the scope of the realistic circumstances of A/B testing in practice and gave links to different standard solutions. In this article we will take an idealized specific situation, allowing us to show a particularly beautiful solution to one very special type of A/B test. For this article we are assigning two different advertising messages to ou...

A bit about Win-Vector LLC

http://www.win-vector.com/blog/2015/06/a-bit-about-win-vector-llc/ #R #Rlang #datascience #consulting #training #analytics

Win-Vector LLC is a consultancy founded in 2007 that specializes in research, algorithms, data science, and training. (The name is an attempt at a mathematical pun.) Win-Vector LLC can complete your high-value project quickly (some examples), and train your data science team to work much more ...

Why does designing a simple A/B test seem so complicated?

http://blog.revolutionanalytics.com/2015/06/why-does-planning-something-as-simple-as-an-ab-test-always-end-up-feeling-so-complicated.html

John Mount, Ph.D., Data Scientist at Win-Vector LLC. An A/B test is a very simple controlled experiment where one group is subject to a new treatment (often group "B") and the other group (often group "A") is considered a control group. The classic example is attempting to compare defect rates of two production processes (the current process, and perhaps a new machine). Illustration: Boris Artzybasheff (photo James Vaughan, some rights reserved). In...

