Hey everyone. I've been playing around with MNIST, with a simple one-layer net, in the spirit of Adam Coates' 2011 AISTATS paper. There they argue that whitening provides the most significant improvement to performance. I found, however, that with MNIST whitening really messed everything up: accuracy dropped from .96 to around .7. It might be the specific type of whitening, I'm not sure. I'm using sklearn's PCA decomposition with whitening, and then training an autoencoder on the transformed data.
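For reference, sklearn's `PCA(whiten=True)` transform amounts to roughly the following (a minimal numpy sketch; the random matrix here is a made-up stand-in for the flattened MNIST design matrix):

```python
import numpy as np

# Made-up stand-in for the flattened MNIST design matrix (really 70000 x 784).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20)) @ rng.normal(size=(20, 20))  # correlated features

# Roughly what sklearn's PCA(whiten=True) computes:
# project onto the principal directions, then rescale each to unit variance.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = (Xc @ Vt.T) * np.sqrt(len(X) - 1) / S

# Whitened data has (approximately) identity covariance.
print(np.allclose(np.cov(Z, rowvar=False), np.eye(Z.shape[1]), atol=1e-6))  # True
```

The point is that after this transform every retained direction has exactly unit variance, regardless of how little variance it carried originally.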

Any thoughts?

Thanks in advance.

- The basic autoencoder, which I think is what you're using, essentially just computes PCA (at least in the linear case). So that would explain why it does nothing if you've already performed PCA. Aug 26, 2013
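That PCA connection can be checked numerically: a linear autoencoder trained with squared error is minimized by the PCA projection (this is the Eckart-Young result), so no other rank-k subspace reconstructs the data better. A toy sketch with made-up data (`k` plays the role of the autoencoder's bottleneck width):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 10)) @ rng.normal(size=(10, 10))  # made-up correlated data
Xc = X - X.mean(axis=0)
k = 3  # bottleneck width of the hypothetical linear autoencoder

def recon_error(B):
    """Squared error of reconstructing Xc from its projection onto span(B)."""
    P = B @ np.linalg.pinv(B)  # orthogonal projector onto the column space of B
    return np.sum((Xc - Xc @ P) ** 2)

# Top-k principal directions: the optimum of a linear autoencoder.
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
pca_err = recon_error(Vt[:k].T)

# No random rank-k subspace reconstructs the data better.
random_errs = [recon_error(rng.normal(size=(10, k))) for _ in range(20)]
print(pca_err < min(random_errs))  # True
```

So stacking a basic autoencoder on top of PCA-transformed data has, at best, nothing left to do in the linear regime; any gain has to come from the nonlinearity.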
- Right, that makes sense, although it's not exactly the same. And in principle it would mean you're stacking two autoencoders on top of each other, which often works...

So the smart thing to do would be to do simple covariance whitening, I guess. Aug 26, 2013

- Hi Alex,

I found a similar thing using MNIST: whitening/non-whitening, autoencoder/clustering (the same pipeline as Coates, but with MNIST).

The drop in accuracy made me paranoid that I'd messed up somewhere though.

I've noticed that Coates is careful not to actually say that whitening **always** makes an improvement.

Also, the MNIST dataset doesn't have much variance intrinsically (most pixels are solid black or white). I'm thinking this is what makes whitening misbehave (using the covariance matrix in particular), but I'm not good enough on the theory/maths to state exactly why.
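One way to make that intuition concrete: whitening rescales each principal direction by 1/sqrt(its eigenvalue), so a nearly constant pixel (tiny eigenvalue) has its noise amplified enormously. A toy sketch with one made-up "dead border pixel":

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
# Two made-up "pixels": one informative, one nearly constant (like MNIST's black border).
signal = rng.normal(scale=1.0, size=n)
border = rng.normal(scale=1e-4, size=n)  # essentially a dead pixel plus sensor noise
Xc = np.column_stack([signal, border])
Xc -= Xc.mean(axis=0)

# Whitening rescales each principal direction by 1/sqrt(its eigenvalue) ...
evals, evecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
Z = (Xc @ evecs) / np.sqrt(evals)

# ... so the near-constant direction gets a gain thousands of times larger,
# turning pure noise into a full-variance feature.
gain = 1.0 / np.sqrt(evals)
print(gain.max() / gain.min() > 1000)  # True
```

After whitening, both directions have unit variance, so the downstream learner can no longer tell the informative pixel from the amplified noise.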

This is kinda making it hard to write any sort of summary for my school assignment.

Anyway, I'm interested in how you got along with it - if you managed to get improved results, then I've probably done something wrong somewhere.

Oh, and I used ZCA (PCA whitening without the dimensionality reduction), from here: http://ufldl.stanford.edu/wiki/index.php/Whitening Nov 30, 2013

- Hi there, I eventually gave up on combining PCA and autoencoders. You might get better results combining PCA with SVMs or random forests, but I'm guessing overcomplete representations in high-dimensional autoencoders will give better results than straight PCA. Autoencoders also give a nonlinear decomposition, while PCA is a linear process. One might try kernel PCA, but the sklearn implementation doesn't like large datasets, at least in my experience. It might have been a backend issue with my installation and liblinear, though. Hope you have better luck. Dec 1, 2013
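For anyone following along, the ZCA transform from those notes looks roughly like this in numpy (`eps` is the small regularizer the UFLDL page recommends so tiny eigenvalues don't explode; the data here is made up):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(400, 8))  # made-up stand-in for image data
X[:, 0] += X[:, 1]             # introduce some correlation
Xc = X - X.mean(axis=0)

eps = 1e-5  # regularizer so near-zero eigenvalues don't blow up the rescaling
evals, evecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
W_zca = evecs @ np.diag(1.0 / np.sqrt(evals + eps)) @ evecs.T  # rotate, rescale, rotate back
Z = Xc @ W_zca

# ZCA keeps all dimensions and stays close to the original pixel space,
# but the result is still white (identity covariance).
print(np.allclose(np.cov(Z, rowvar=False), np.eye(8), atol=1e-3))  # True
```

The final rotation back (`... @ evecs.T`) is what distinguishes ZCA from plain PCA whitening; without it you get the rotated, dimension-reduced variant.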