Erin Catto: You are correct. The choice of row or column vectors is independent of how matrices are stored in memory.

Column-major storage is often chosen for performance. For example, SSE version 1 does not have a fast dot product, but it does have fast splats and component-wise multiplication, so multiplying a matrix by a column vector is faster when the matrix is stored in column-major order. Likewise, row vectors are faster with row-major ordering. This means that a matrix stored in column-major order for use with column vectors has exactly the same memory layout as a matrix stored in row-major order for use with row vectors! Performance is then identical and the only difference is the API. That was the crux of my original post: I want an API that matches math books.
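To make the splat argument concrete, here is a minimal sketch in portable C++. It is not code from any particular library: `Vec4`, `Mat4`, `splat`, `mul`, `add`, `mulColumn`, `mulRow`, and `makeTranslation` are hypothetical names invented for illustration, and the scalar helpers only stand in for what would be single SSE instructions on a 4-float register.

```cpp
// Hypothetical 4-float vector standing in for one SSE register.
struct Vec4 {
    float x, y, z, w;
};

// Broadcast one scalar to all four lanes (a "splat").
inline Vec4 splat(float s) { return {s, s, s, s}; }

// Component-wise multiply and add; no horizontal (dot product) work needed.
inline Vec4 mul(Vec4 a, Vec4 b) { return {a.x * b.x, a.y * b.y, a.z * b.z, a.w * b.w}; }
inline Vec4 add(Vec4 a, Vec4 b) { return {a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w}; }

// 16 contiguous floats, viewed as four groups of four.
// Column-vector convention: group i is column i (column-major storage).
// Row-vector convention:    group i is row i    (row-major storage).
struct Mat4 {
    Vec4 g[4];
};

// M * v with column vectors: a linear combination of the columns.
inline Vec4 mulColumn(const Mat4& m, Vec4 v) {
    Vec4 r = mul(m.g[0], splat(v.x));
    r = add(r, mul(m.g[1], splat(v.y)));
    r = add(r, mul(m.g[2], splat(v.z)));
    r = add(r, mul(m.g[3], splat(v.w)));
    return r;
}

// v * M with row vectors: a linear combination of the rows.
// The arithmetic on the stored floats is the same; only the API differs.
inline Vec4 mulRow(Vec4 v, const Mat4& m) {
    return mulColumn(m, v);
}

// Translation matrix: the offset sits in the last group of four floats,
// which is the last column in one convention and the last row in the other.
inline Mat4 makeTranslation(float tx, float ty, float tz) {
    Mat4 m;
    m.g[0] = {1.0f, 0.0f, 0.0f, 0.0f};
    m.g[1] = {0.0f, 1.0f, 0.0f, 0.0f};
    m.g[2] = {0.0f, 0.0f, 1.0f, 0.0f};
    m.g[3] = {tx, ty, tz, 1.0f};
    return m;
}
```

Calling `mulColumn(makeTranslation(1, 2, 3), p)` and `mulRow(p, makeTranslation(1, 2, 3))` on the same point `p` touches the same 16 floats in the same order and gives the same result, which is the sense in which the two conventions coincide in memory.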

I also prefer column major order because it lets you see the axes of the rotated frame: column 1 is the x-axis, etc.
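As a small illustration of reading the axes off the columns, here is a 2D sketch under the column-vector convention. The names `Vec2`, `Rot2`, `makeRotation`, and `rotateVec` are hypothetical and chosen only for this example.

```cpp
#include <cmath>

struct Vec2 {
    float x, y;
};

// 2x2 rotation stored as two columns, each contiguous in memory.
struct Rot2 {
    Vec2 col1;  // where the x-axis ends up after rotation
    Vec2 col2;  // where the y-axis ends up after rotation
};

// Counter-clockwise rotation by `angle` radians.
inline Rot2 makeRotation(float angle) {
    float c = std::cos(angle);
    float s = std::sin(angle);
    return {{c, s}, {-s, c}};
}

// R * v with a column vector: v.x times column 1 plus v.y times column 2.
inline Vec2 rotateVec(const Rot2& r, Vec2 v) {
    return {r.col1.x * v.x + r.col2.x * v.y,
            r.col1.y * v.x + r.col2.y * v.y};
}
```

Rotating the unit x-axis `{1, 0}` returns `col1` and rotating the unit y-axis `{0, 1}` returns `col2`, so the rotated frame can be read directly from the stored matrix.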

Robert Winkler: Thanks, it makes sense that its origins are performance related. It looks like the latest version of SSE has a dot product, but without that, if they're doing a lot of matrix multiplication on the CPU, it makes sense to use column major.

I guess physics definitely qualifies for that, since it's rarely done on the GPU (outside of PhysX, etc.).

Thanks again.