With the ubiquity of videos on the internet, the development of algorithms that can analyze, summarize, and classify their content is an active field of research. While Convolutional Neural Networks (CNNs, http://goo.gl/PqYGzp) are an effective class of models for understanding image content, their development and use for video classification has been limited by the lack of video datasets that match the scale and variety of existing image datasets.
With Large-scale Video Classification with Convolutional Neural Networks (http://goo.gl/6sOs3C), a paper to be presented at the 2014 Computer Vision and Pattern Recognition conference (http://goo.gl/vY2v9k), researchers from Google and Stanford University collaborate to study the performance of CNNs for large-scale video classification using Sports-1M, a new dataset consisting of ~1 million YouTube videos belonging to a taxonomy of 487 classes of sports.
Utilizing the information present in single, static frames as well as the complex temporal evolution present in video, the CNN learns features on full frame, low-resolution context streams along with centrally cropped high-resolution fovea streams. In doing so, the research shows that CNN architectures are capable of learning powerful features from video data, even if the provided description does not match the content, or there is significant variation on the frame level.
To support future work in this area, the Sports-1M dataset, consisting of 1,133,158 video URLs which have been annotated automatically with labels, has been made available to the research community at http://goo.gl/8imYDN. To see an example of per-frame classification results overlaid on top of a video, watch the video below.
- ClariDevOps, present
- Wepollscofounder, 2011 - 2013
- SpyFuSoftware Engineer, 2007 - 2011
- Microchip, Velocityscape, SpyFu
- ASUComputer Systems Engineering
- Buzz (current)
Students - Guide to Technical Development - Google Careers
Guide for Technical Development. Having a solid foundation in Computer Science is important to become a successful Software Engineer. This g
Testing Your Code — The Hitchhiker's Guide to Python
Testing Your Code¶. Testing your code is very important. Getting used to writing the testing code and the running code in parallel is now co
The Matrix Cheatsheet by Sebastian Raschka is licensed under a ...
Task. MATLAB/Octave. Python NumPy. R. Julia. Task. CREATING MATRICES. Creating Matrices (here: 3x3 matrix). M> A = [1 2 3; 4 5 6; 7 8 9] A =
Our Exclusive Hands-On With Microsoft's Unbelievable New Holographic Gog...
The prototype is amazing. It amplifies the special powers that Kinect introduced, using a small fraction of the energy. Project HoloLens’ ke
Conjugate prior - Wikipedia, the free encyclopedia
In Bayesian probability theory, if the posterior distributions p(θ|x) are in the same family as the prior probability distribution p(θ), the
Baidu built a supercomputer for deep learning — Tech News and Analysis
Chinese search engine company Baidu says it has built the world’s most-accurate computer vision system, dubbed Deep Image, which runs on a s