Sander Dieleman describes in detail the deep learning method and tricks he used to win the Galaxy Zoo challenge.
- Thanks for sharing this :) If anyone has questions, feel free to ask them here as well. (Apr 5, 2014)
- Great write-up. Thanks! (Apr 5, 2014)
- Many thanks, great write-up! Our team landed in third place using a ConvNet with almost the same tricks, so I will try to follow your write-up. I am curious: how long did one pass (minibatch of 128) take you? And how did the two GPUs help? (Apr 6, 2014)
- I actually used a much more modest minibatch size of 16. Because the images are sliced into 16 parts which are processed by the same convolutional architecture, the effective minibatch size for the convolutional part of the network was 256. I'm not entirely sure how long one update takes. As a crude approximation: training the best net, with ~42M parameters, took about 67 hours for upwards of 1.5 million parameter updates. Dividing those two gives about 150-160 ms per update.
Theano can only use one GPU at a time, so I just trained two networks simultaneously. I didn't use the two GPUs together for a single network, as Krizhevsky et al. did in their ImageNet paper, if that's what you mean. (Apr 6, 2014)
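For reference, a minimal sketch of that two-GPU setup, not code from the post: two independent Theano training jobs, one per device, selected via THEANO_FLAGS before theano is imported. The script name and the body of main() are placeholders.

```python
# Sketch only: launch this script twice, once per GPU, e.g.
#   python train.py gpu0
#   python train.py gpu1
# Theano binds each process to a single device, so two processes
# give two independently trained networks.
import os
import sys

def main(device):
    # The device has to be chosen before theano is imported.
    os.environ["THEANO_FLAGS"] = "device={},floatX=float32".format(device)
    import theano  # imported here so the flags take effect
    # build the model and run the training loop here (placeholder)

if __name__ == "__main__":
    main(sys.argv[1] if len(sys.argv) > 1 else "gpu0")
```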
- Yes, that's what I meant. Many thanks! Have you compared the runtime performance of cuda-convnet (using Alex's Python controller code) and its wrapper in pylearn2? I guess they are the same :-? (Apr 7, 2014)
- I've never used cuda-convnet by itself, so I cannot compare. I started out with Theano, and after a while I replaced Theano's own convolution operator with the wrapped cuda-convnet one. This resulted in a 2x-3x speedup. I would guess that there is a small performance penalty compared to using cuda-convnet directly, but I can't quantify it. At any rate it's a small price to pay, as Theano's flexibility and ability to do symbolic differentiation are a big advantage. (Apr 7, 2014)
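For reference, this is roughly what swapping Theano's convolution for the pylearn2-wrapped cuda-convnet op looks like; a sketch assuming the input starts in Theano's usual bc01 layout, not the exact code from the winning model.

```python
# Sketch of replacing theano.tensor.nnet.conv2d with the pylearn2
# wrapper around cuda-convnet's FilterActs op.
import theano.tensor as T
from theano.sandbox.cuda.basic_ops import gpu_contiguous
from pylearn2.sandbox.cuda_convnet.filter_acts import FilterActs

images = T.tensor4("images")    # bc01: (batch, channels, rows, cols)
filters = T.tensor4("filters")  # bc01: (num filters, channels, rows, cols)

# cuda-convnet expects c01b order and contiguous GPU arrays.
images_c01b = gpu_contiguous(images.dimshuffle(1, 2, 3, 0))
filters_c01b = gpu_contiguous(filters.dimshuffle(1, 2, 3, 0))

conv = FilterActs(pad=0, partial_sum=1, stride=1)
out_c01b = conv(images_c01b, filters_c01b)

# Back to bc01, so the rest of the graph stays ordinary Theano and
# symbolic differentiation works as usual.
out = out_c01b.dimshuffle(3, 0, 1, 2)
```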
- Cool. Thanks, Sander! (Apr 7, 2014)