Profile

Bradley Neuberg
Worked at Dropbox, Inc.
Attended Columbia University
Lives in San Francisco, California

Stream

Bradley Neuberg

Discussion
What's the state of the art in terms of unrolling LSTM networks? 150 layers (about the same as ResNet) or more? Are any techniques similar to batch normalization used in unrolled deep LSTM networks to aid in gradient propagation?
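For context, "unrolling" just means applying the same recurrent weights once per timestep, so gradients must flow back through all of those applications. A minimal NumPy sketch of a 150-step unroll with an optional per-step normalization of the preactivations (a layer-norm-style trick; the normalization, sizes, and weight scales here are purely illustrative assumptions, not anything from this thread):

```python
import numpy as np

rng = np.random.default_rng(0)

# One recurrent cell applied T times ("unrolled" T steps). The optional
# normalization rescales each step's preactivation to zero mean / unit
# variance, a batch-norm-like idea intended to keep activations from
# exploding or vanishing across many unrolled steps.
def unrolled_rnn(x_seq, W_h, W_x, normalize=True, eps=1e-5):
    h = np.zeros(W_h.shape[0])
    for x in x_seq:  # T unrolled timesteps, same weights each step
        z = W_h @ h + W_x @ x
        if normalize:
            z = (z - z.mean()) / np.sqrt(z.var() + eps)
        h = np.tanh(z)
    return h

W_h = rng.normal(0.0, 1.5, (32, 32))  # deliberately large recurrent weights
W_x = rng.normal(0.0, 0.1, (32, 8))
seq = rng.normal(size=(150, 8))       # 150 steps, like a 150-layer unroll
h = unrolled_rnn(seq, W_h, W_x)       # stays bounded even at this depth
```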

Bradley Neuberg

Discussion
I've been working on this the last four months and finally got a release of Cloudless 1.0 out. Cloudless is an open source computer vision pipeline for orbital satellite data, powered by data from Planet Labs and using deep learning under the covers. Tell me what you think!

http://codinginparadise.org/ebooks/html/blog/introducing_cloudless.html
 
Great job

Bradley Neuberg

Discussion
I've set up Caffe on an Amazon EC2 GPU instance. I'm storing my data and trained weights on an EBS volume. However, I've noticed that my IO seems to be a bit slow, slowing down Caffe. Any tips on how to speed up IO when dealing with Caffe on EC2? I'm also conscious of cost, so I don't want to use any of the EC2 options that are too expensive.

Bradley Neuberg

Discussion
I'm planning on applying deep learning to orbital image data, starting with classifying and segmenting things like forest edges, cars, etc. Are people aware of strong papers in this area, specifically on applying deep learning to visual satellite data, that I should review beforehand? Are any good raw or training datasets publicly available for this?

Bradley Neuberg

Discussion
I'm almost finished with the Hinton 2012 Coursera course. I know that two things have changed since then:

* Pretraining with statistical generative models such as RBMs, DBNs, and Sigmoid Belief Nets is no longer done, as simply initializing your weights properly and using backprop is good enough, even for deep networks.
* The logistic activation function is not used as much, with ReLUs preferred as they are simpler and faster.

Does anyone know the scientific papers that established these two changes? I'd love to study the original papers.
Neural Networks for Machine Learning from University of Toronto.
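The papers usually credited for these two shifts are Glorot & Bengio (2010), "Understanding the difficulty of training deep feedforward neural networks," for initialization, and Nair & Hinton (2010) and Glorot, Bordes & Bengio (2011) for ReLUs. A minimal NumPy sketch of why scaled initialization plus ReLU makes plain backprop viable in deep nets (the depth, width, and the sqrt(2/fan_in) scale here are illustrative choices, not from the post):

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

# Illustrative 10-layer ReLU net: drawing weights with std sqrt(2 / fan_in)
# keeps the activation magnitudes roughly constant layer to layer, so the
# signal (and hence the gradient) neither vanishes nor explodes -- which is
# what removed the need for generative pretraining.
def forward(x, n_layers=10, width=256):
    h = x
    for _ in range(n_layers):
        W = rng.normal(0.0, np.sqrt(2.0 / h.shape[1]), size=(h.shape[1], width))
        h = relu(h @ W)
    return h

x = rng.normal(size=(64, 256))
out = forward(x)  # activation RMS stays O(1) instead of shrinking to ~0
```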
 
Hi all,
I'm new to the field of deep learning, and I'm wondering about the same question myself.

Were RBMs/DBNs/autoencoders found to be less effective (when using proper backprop with ReLUs)?

For example, I'm looking at the latest Caffe deep learning framework and not finding references to networks that use RBMs or proper textbook autoencoders.

I would very much appreciate your thoughts on this.
Thanks!

Bradley Neuberg

Discussion
I'm training a siamese network (http://yann.lecun.com/exdb/publis/pdf/chopra-05.pdf) for binary classification of facial images (i.e., are these two faces the same or different?).

By default a siamese network simply outputs a 'distance' value where the same faces have lower values and different faces have higher values.

I'd like to turn this into a binary classifier, where 1 indicates the two faces are the same and 0 indicates they are not. It seems possible to feed the siamese network's 'distance' value into another fully connected layer whose output serves as the binary classifier. I could then use backpropagation to train not only the distance layer but also the binary classifier output layer, so the network discovers the threshold distance below which two faces count as the same. Having two backpropagation targets, though, seems like it might get confusing for the network. Perhaps this actually has to be two different networks?

Am I off here? Is there a better way to do this?

I'm using Caffe to drive all of this.
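One common pattern is to treat the binary head as a single logistic unit on the distance, so "learning the threshold" is just learning a scale and bias. A NumPy sketch, not Caffe code, with made-up weight values standing in for trained ones:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical embeddings from the two branches of a siamese network.
# w and b are illustrative; in practice they'd be trained (jointly with,
# or after, the siamese metric) so that small distances push the sigmoid
# toward 1 ("same") and large distances toward 0 ("different").
def same_face_probability(emb_a, emb_b, w=-4.0, b=2.0):
    d = np.linalg.norm(emb_a - emb_b)
    return sigmoid(w * d + b)

a = np.array([0.1, 0.9, 0.2])
print(same_face_probability(a, a) > 0.5)        # → True (identical: "same")
print(same_face_probability(a, a + 5.0) < 0.5)  # → True (far apart: "different")
```

Because the head is just an affine map plus sigmoid on the distance, there is only one training signal end to end (the binary cross-entropy at the output), which avoids the two-competing-targets problem described above.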
 
Ah right, sigmoid saturates at 0 and 1. Thanks!

Communities

Bradley Neuberg

Discussion
I summarized some of the trends I saw in deep learning at the NIPS 2015 conference this year:

http://codinginparadise.org/ebooks/html/blog/ten_deep_learning_trends_at_nips_2015.html

Tell me what you think.
10 Deep Learning Trends at NIPS 2015. I attended the Neural Information Processing Systems (NIPS) 2015 conference this week in Montreal. It was an incredible experience, like drinking from a firehose of information. Special thanks to my employer Dropbox for sending me to the show (we're hiring!) ...
 
Thanks for the nice review; I have a small correction. Some people have already got TensorFlow running on Amazon AWS, and if you're not a casual coder it shouldn't be difficult to do:
https://groups.google.com/a/tensorflow.org/forum/#!topic/discuss/jRkkvsB1iWA

Bradley Neuberg

Discussion
I'm starting a deep learning paper reading group at my workplace. I'd like to start by reviewing important fundamental papers on adding memory to neural networks (LSTM, neural Turing machines, etc) and attention models. For those two topics what papers do you recommend as good intros and important developments?
 
By the way, will you be making your own list of papers available so we have access to it? Perhaps some of us could follow along with your workplace group.

Bradley Neuberg

Discussion
I'm dealing with some large pre-processing and looking for advice on how to setup my pipeline.

I'm training a siamese network on the Labelled Faces in the Wild (LFW) dataset in order to do face verification. LFW has about 13,000 different faces. A siamese network ends up training a metric function that places each face into a discrete x/y space, with similar faces clustering together. At testing time I can take a given face, see where it lies in that space, and see if it clusters near another face to do positive/negative identification.

Through a subset of the data, I've learned that the best performance happens when I pair every face with every other face to get all possible positive and negative face combinations to train on. However, if I were to do this with the full 13,000 faces, it would be a huge processing task.

I'm currently pre-processing my data with Python, pairing faces, shuffling them, then writing them to a LevelDB database for use with Caffe. If I try to pair every possible combo of faces with the full 13K faces I run out of memory. I need to either get much cleverer with how I'm processing these with Python, or move to some other solution to do every possible combo. What data processing pipeline do folks recommend for dealing with a large dataset like this to do the pairing and shuffling without exhausting available memory?

Eventually I'll move from the LFW dataset to the CASIA-WebFace dataset, which has about 500K images, so my data-processing needs will get even more intense; I'll need something that can scale up to that level.
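One way to sidestep the memory blowup is to never materialize the full set of combinations at all: generate labelled pairs lazily with a generator and write them to LevelDB in batches as they stream out. A minimal sketch with hypothetical file names (the sampling scheme here, one random positive or negative at a time, is an illustrative choice, not the only one):

```python
import random

# faces_by_person maps an identity to its image paths. Positive pairs are
# sampled within an identity, negatives across identities, on the fly --
# O(1) memory regardless of how many total combinations exist.
def pair_stream(faces_by_person, seed=0):
    rng = random.Random(seed)
    people = list(faces_by_person)
    multi = [p for p in people if len(faces_by_person[p]) > 1]
    while True:
        if rng.random() < 0.5:  # positive pair (label 1)
            person = rng.choice(multi)
            yield tuple(rng.sample(faces_by_person[person], 2)), 1
        else:                   # negative pair (label 0)
            p1, p2 = rng.sample(people, 2)
            yield (rng.choice(faces_by_person[p1]),
                   rng.choice(faces_by_person[p2])), 0

faces = {"alice": ["a1.jpg", "a2.jpg"], "bob": ["b1.jpg", "b2.jpg"]}
gen = pair_stream(faces)
batch = [next(gen) for _ in range(4)]  # write each batch to LevelDB, repeat
```

Shuffling comes for free since pairs are sampled randomly, and the same generator scales unchanged from 13K LFW faces to 500K CASIA-WebFace images because it only ever holds the file-name index in memory, not the pairs.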
 
You can avoid this by using a better scheme for sampling your examples. Check out the triplet loss in Google's FaceNet paper.
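For reference, the triplet loss idea is a hinge on squared distances: pull an anchor toward a positive (same identity) and push it at least a margin alpha away from a negative (different identity). A toy NumPy sketch with made-up 2-D embeddings:

```python
import numpy as np

# Triplet loss as in the FaceNet paper:
# max(0, ||a - p||^2 - ||a - n||^2 + alpha)
def triplet_loss(anchor, positive, negative, alpha=0.2):
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + alpha)

a = np.array([0.0, 0.0])
p = np.array([0.1, 0.0])  # close: same identity
n = np.array([1.0, 1.0])  # far: different identity
print(triplet_loss(a, p, n))  # 0.0 -- margin constraint already satisfied
```

Because the loss is zero for triplets that already satisfy the margin, training can focus on sampling the hard ones, which is exactly what avoids enumerating every possible pair.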

Bradley Neuberg

Discussion
I've been learning deep learning through Coursera and other online courses. Are there any masters programs yet that go into deep learning, or is it still restricted mostly to PhD programs? Do Berkeley and Stanford have masters programs in these subjects yet? How about online masters programs?
 
A fair number of students in the general Stanford CS masters program seem to be focusing on deep learning.

Bradley Neuberg

Discussion
I'm working on a neural network that can take two segmented facial images as input and return a binary "same/not same" answer on whether the two given images are the same person or not. Is anyone aware of any prior work in the literature that can help provide direction on a best approach to this?
 
Chris, that paper looks incredible; thanks for pointing it out to me. Milan, siamese networks look like a good fit; there's even a siamese model recently checked into the Caffe repo that I can use.

Bradley Neuberg

Discussion
Can Caffe be used for Recurrent Neural Networks (RNN)? If not, what are most people in the field using to model their RNNs?
 
Check out the pull requests for Caffe; there's one pending for integration that brings RNNs and LSTMs to Caffe.
People
In his circles
171 people
Have him in circles
3,079 people
Communities
Places
Map of the places this user has lived
Currently
San Francisco, California
Previously
McAllen, Texas - New York, NY - Kamala, Phuket, Thailand - Esalen, California
Work
Occupation
Software Engineer
Employment
  • Dropbox, Inc.
    Senior Software Engineer
  • Rojo
  • Bootstrap Foundation
  • Random Walk
  • Internet Archive
  • SitePen
Education
  • Columbia University
  • The Science Academy
Basic Information
Gender
Male
Other names
Brad