Good rant by Vincent here. I completely agree that research includes doing things no one else can do, including working with data or compute at a frontier beyond what anyone has reached before.

This is controversial, though. One criticism, which I don't buy, is that industrial work isn't research and that scale doesn't matter; I don't think that's true. A better criticism of this kind of work is that it isn't reproducible by anyone else, which opens it to the charge that it isn't really science. I disagree with that too: it is reproducible by some today, and it will be reproducible by more later. In any case, work at the frontier is always research, and Vincent is right that doing anything people haven't been able to do before should qualify as interesting research.
I often hear researchers complaining about how Google tends to publish a lot of large-scale, comparatively dumb approaches to solving problems. Guilty as charged: think of ProdLM and 'stupid backoff', the 'billion neuron' cat paper, AlphaGo, or the more recent work on obscenely large mixtures of experts and the large-scale learning-to-learn papers.
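To give a sense of just how simple 'stupid backoff' is: the whole scoring rule is a relative frequency with a constant fall-back weight. Here's a minimal sketch; the toy corpus, function names, and trigram order are mine for illustration, while the fixed α = 0.4 constant comes from Brants et al., 2007:

```python
# A minimal sketch of 'stupid backoff' (Brants et al., 2007).
# The corpus and names here are illustrative, not from any Google codebase.
from collections import Counter

ALPHA = 0.4  # fixed backoff weight from the paper; never tuned or normalized


def ngram_counts(tokens, max_order):
    """Count all n-grams up to max_order in a token list."""
    counts = Counter()
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def stupid_backoff(word, context, counts, total_tokens):
    """Relative frequency if the full n-gram was seen; otherwise back off
    to a shorter context, discounted by a constant ALPHA. Returns a score,
    not a probability: the values are deliberately left unnormalized."""
    ngram = context + (word,)
    if counts[ngram] > 0:
        denom = counts[context] if context else total_tokens
        return counts[ngram] / denom
    if context:
        return ALPHA * stupid_backoff(word, context[1:], counts, total_tokens)
    return counts[(word,)] / total_tokens  # unigram base case (0 if unseen)


tokens = "the cat sat on the mat and the cat slept".split()
counts = ngram_counts(tokens, 3)
print(stupid_backoff("sat", ("the", "cat"), counts, len(tokens)))   # seen trigram
print(stupid_backoff("slept", ("on", "the"), counts, len(tokens)))  # backs off twice
```

No discounting, no smoothing, no normalization: at web scale the raw counts did the work, which is exactly the kind of 'dumb' the critics object to.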
The charge levied against this line of work is that it uses large amounts of resources inefficiently, isn't 'clever', and as a result can't be reproduced by anyone else. But that's exactly the point! The marginal benefit of us exploring computational regimes that every other academic lab can explore just as well is inherently limited. Better to explore the frontier that few others have the resources to reach: see what happens when we go all out, try the simple stuff first, and then, if it looks promising, work backwards and make it more efficient. ProdLM gave us the focus on data for machine translation that made production-grade neural MT possible. The 'cat paper' gave us DistBelief and eventually TensorFlow. That's not waste, that's progress.