Bradley Neuberg
Bradley's posts

Anyone I know played with Google Springboard yet? What did you think? I'm looking to give it a try if possible.

What's the state of the art in unrolling LSTM networks? 150 layers (about the same as ResNet) or more? Are techniques similar to batch normalization used in deeply unrolled LSTM networks to aid gradient propagation?

I've been working on this the last four months and finally got a release of Cloudless 1.0 out. Cloudless is an open source computer vision pipeline for orbital satellite data, powered by data from Planet Labs and using deep learning under the covers. Tell me what you think!

I summarized some of the trends I saw in deep learning at the NIPS 2015 conference this year:

Tell me what you think.

I've set up Caffe on an Amazon EC2 GPU instance. I'm storing my data and trained weights on an EBS volume. However, I've noticed that my IO is a bit slow, which slows down Caffe. Any tips on how to speed up IO when running Caffe on EC2? I'm also cost-conscious, so I don't want to use any of the EC2 options that are too expensive.

I'm starting a deep learning paper reading group at my workplace. I'd like to start by reviewing important fundamental papers on adding memory to neural networks (LSTM, neural Turing machines, etc.) and on attention models. For those two topics, what papers do you recommend as good intros and important developments?

I'm planning on applying deep learning to orbital image data, starting with classifying and segmenting things like forest edges, cars, etc. Are people aware of strong papers in this area, specifically on applying deep learning to visual satellite data, that I should review beforehand? Are any good raw data sets or training sets publicly available for this?

I'm dealing with some large pre-processing and looking for advice on how to set up my pipeline.

I'm training a siamese network on the Labeled Faces in the Wild (LFW) dataset in order to do face verification. LFW has about 13,000 face images. A siamese network ends up training a metric function that places each face into a 2D x/y space, with similar faces clustering together. At testing time I can take a given face, see where it lies in that space, and see if it clusters near another face to do positive/negative identification.
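To make the test-time step concrete, here is a minimal sketch in Python, assuming the trained network has already mapped each face to a point in that space. The names `euclidean` and `same_person` and the threshold value are hypothetical, not part of Caffe or of my actual setup; in practice the threshold would be tuned on held-out pairs.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embedding points of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def same_person(emb_a, emb_b, threshold=1.0):
    """Hypothetical verification rule: two faces match when their
    embeddings lie closer together than the threshold (an assumed value)."""
    return euclidean(emb_a, emb_b) < threshold

# Two made-up 2D embeddings, 0.5 apart, so they count as a match here.
print(same_person((0.0, 0.0), (0.3, 0.4)))
```

The rule is only as good as the learned metric: the network does the hard work of pulling same-identity faces together, and the threshold just draws the line.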

Through a subset of the data, I've learned that the best performance happens when I pair every face with every other face to get all possible positive and negative face combinations to train on. However, if I were to do this with the full 13,000 faces, it would be a huge processing task.
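For a sense of scale, here is a rough sketch in Python of how many pairs "every face with every other face" produces. The helper names and the per-person image counts in the small example are made up for illustration; only the 13,000 figure comes from the dataset size above.

```python
def total_pairs(n):
    """Number of unordered pairs among n images: n choose 2."""
    return n * (n - 1) // 2

def positive_negative_pairs(images_per_person):
    """Split all pairs into same-person (positive) and different-person
    (negative) counts, given how many images each person has."""
    n = sum(images_per_person)
    positive = sum(c * (c - 1) // 2 for c in images_per_person)
    return positive, total_pairs(n) - positive

# Roughly 84.5 million pairs for 13,000 images.
print(total_pairs(13_000))

# Toy example with hypothetical counts: three people with 3, 2, and 1 images.
print(positive_negative_pairs([3, 2, 1]))
```

The count grows quadratically with the number of images, which is why holding every pair in memory at once breaks down long before the dataset itself gets large.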

I'm currently pre-processing my data with Python, pairing faces, shuffling them, then writing them to a LevelDB database for use with Caffe. If I try to pair every possible combo of faces with the full 13K faces I run out of memory. I need to either get much cleverer with how I'm processing these with Python, or move to some other solution to do every possible combo. What data processing pipeline do folks recommend for dealing with a large dataset like this to do the pairing and shuffling without exhausting available memory?
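One direction I can imagine, sketched below under stated assumptions: generate the pairs lazily instead of building the full list, and shuffle through a bounded buffer. The function names and the `(image_id, person)` input format are hypothetical, and the buffered shuffle is only approximate (not a uniform shuffle of all pairs). Writing each pair to the database is left out, since that depends on the storage library.

```python
import itertools
import random

def iter_pairs(labeled_ids):
    """Lazily yield (id_a, id_b, same_person) for every unordered pair.
    labeled_ids is a list of (image_id, person) tuples."""
    for (id_a, person_a), (id_b, person_b) in itertools.combinations(labeled_ids, 2):
        yield id_a, id_b, int(person_a == person_b)

def shuffle_buffer(items, buffer_size, rng):
    """Approximate shuffle that holds at most buffer_size items in memory.
    Each incoming item evicts a randomly chosen buffered item."""
    buf = []
    for item in items:
        if len(buf) < buffer_size:
            buf.append(item)
            continue
        idx = rng.randrange(buffer_size)
        yield buf[idx]
        buf[idx] = item
    rng.shuffle(buf)  # flush whatever is left in random order
    yield from buf

# Hypothetical toy input: four images of three people.
faces = [("a1", "alice"), ("a2", "alice"), ("b1", "bob"), ("c1", "carol")]
for pair in shuffle_buffer(iter_pairs(faces), buffer_size=3, rng=random.Random(0)):
    print(pair)  # in a real pipeline, write the pair to the database here
```

Memory use is bounded by `buffer_size` rather than by the number of pairs, at the cost of a weaker shuffle; a larger buffer mixes the stream better.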

Eventually I'll move from the LFW dataset to the CASIA-WebFace dataset, which has about 500K images. My data processing needs will get even more intense, so I need something that can scale up to that level.

I'm almost finished with the Hinton 2012 Coursera course. I know that two things have changed since then:

* Using statistical generative models such as RBMs, DBNs, and Sigmoid Belief Nets for pretraining is no longer done, as simply initializing your weights properly and using backprop is good enough, even for deep networks.
* The logistic activation function is not used as much, with ReLUs preferred as they are simpler and faster.

Does anyone know the scientific papers that established these two changes? I'd love to study the original papers.

I've been learning deep learning through Coursera and other online courses. Are there any master's programs yet that go into deep learning, or is it still mostly restricted to PhD programs? Do Berkeley/Cal and Stanford have master's programs in these subjects yet? How about online master's programs?