What would life be like if you could tell your computer to cut a 3-hour
movie down to one hour?
The basic idea is to find ‘distinctive’ parts of the video, for
example, someone talking at a high pitch or lots of moving scenes
which, intuitively, would be more important than a slow scene or
They consider multiple facets of the video such as speech, camera
motion, significant differences in color, suppression of repeated
scenes and of course, identification of visually distinct segments.
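
To make the idea concrete, here is a minimal sketch of how such a summarizer might rank and pick segments. It is not the researchers' actual method: the `Segment` fields, the weights, and the greedy selection are all assumptions made for illustration. It also assumes the per-segment feature scores have already been computed somehow, and it leaves out the suppression of repeated scenes.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float         # start time in seconds
    end: float           # end time in seconds
    motion: float        # hypothetical score in [0, 1]: how much movement
    speech: float        # hypothetical score in [0, 1]: how lively the speech is
    color_change: float  # hypothetical score in [0, 1]: color difference

    @property
    def duration(self) -> float:
        return self.end - self.start


def score(seg: Segment, weights=(0.4, 0.4, 0.2)) -> float:
    """Combine the feature scores into one 'distinctiveness' value.

    The weights are arbitrary placeholders, not values from any paper.
    """
    w_motion, w_speech, w_color = weights
    return (w_motion * seg.motion
            + w_speech * seg.speech
            + w_color * seg.color_change)


def summarize(segments, target_seconds):
    """Greedily keep the highest-scoring segments that fit in the budget."""
    ranked = sorted(segments, key=score, reverse=True)
    chosen = []
    used = 0.0
    for seg in ranked:
        if used + seg.duration <= target_seconds:
            chosen.append(seg)
            used += seg.duration
    # Put the kept segments back in their original order.
    return sorted(chosen, key=lambda s: s.start)
```

Calling `summarize(segments, 3600)` on a list of scored segments would give back the ones to keep for a one-hour cut, in playback order.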
The caveat is that their test data set consists of drama “rushes” videos,
which are raw footage including the clapboards, the color tones, repeated
takes, etc. Such footage is very conducive to an algorithm like this, which could
probably explain why they had such good results (details are in the
But if this is the state of things today, I can imagine that around
five years down the line they would really be applying it to
commercial movies and television shows. It is amazing what can be
done with a combination of mathematics, statistics and computers.
Update: Microsoft Research has now done it for audio as well!