Sound causes objects to vibrate. You can capture those vibrations on camera and then recover the audio from the footage.

Generally, this requires a high-frame-rate camera, but the video also demonstrates a method of extracting the audio from regular 60 fps video by taking advantage of the rolling shutter effect (
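To make the rolling shutter idea concrete, here is a rough sketch, not the method from the video. It assumes each sensor row is read out at a slightly later time than the row above it, so a single frame holds many time samples instead of one. The function name, the `readout_fraction` parameter, and the use of per-row mean brightness as a stand-in for the real motion signal are all illustrative assumptions.

```python
import numpy as np

def rolling_shutter_signal(frames, fps, readout_fraction=1.0):
    """Turn a stack of grayscale frames into one sample per row per frame.

    frames: array of shape (n_frames, n_rows, n_cols).
    readout_fraction: hypothetical share of the frame period spent reading
    rows; real sensors pause between frames, so it is usually below 1.
    """
    frames = np.asarray(frames, dtype=float)
    n_frames, n_rows, _ = frames.shape
    frame_period = 1.0 / fps
    row_period = readout_fraction * frame_period / n_rows

    # Timestamp of every row: frame start time plus the row's readout delay.
    frame_start = np.arange(n_frames)[:, None] * frame_period
    row_offset = np.arange(n_rows)[None, :] * row_period
    times = (frame_start + row_offset).ravel()

    # Crude proxy for vibration: mean brightness of each row (assumption).
    samples = frames.mean(axis=2).ravel()
    return times, samples - samples.mean()

def demo_peak_hz(tone_hz=440.0, fps=60, n_rows=100, n_cols=4, seconds=1):
    """Synthetic check: a tone far above fps/2 is still recoverable."""
    n_frames = fps * seconds
    t = (np.arange(n_frames)[:, None] / fps
         + np.arange(n_rows)[None, :] / (fps * n_rows))
    rows = 0.5 + 0.1 * np.sin(2 * np.pi * tone_hz * t)
    frames = rows[:, :, None] * np.ones(n_cols)

    times, samples = rolling_shutter_signal(frames, fps)
    sample_rate = 1.0 / (times[1] - times[0])  # fps * n_rows here
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    return freqs[np.argmax(spectrum)]

if __name__ == "__main__":
    print(demo_peak_hz())
```

In this idealized setup, 60 fps with 100 rows gives 6000 samples per second, which is why a tone well above the 30 Hz limit of whole-frame sampling can show up in the spectrum. With `readout_fraction` below 1 the samples are no longer evenly spaced, and a plain FFT like the one in the demo would not be appropriate without handling the gaps.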
