Scalable Content-Based Video Copy Detection

Michel Crucianu, Professor of Computer Science
Conservatoire National des Arts et Métiers (CNAM) in Paris


Video copy detection has multiple applications, from protecting owners against unauthorized use of their content to structuring the video collections of large archives, including video-sharing web sites. Content-based copy detection methods rely on the similarity between potential copies and the original videos. Scalability is the key issue in making these methods practical for very large video databases. This talk will present recent advances in this domain, with a focus on two different types of applications: video stream monitoring and the identification of content links in a large video database. For the first application, we aim to detect, reliably and cost-effectively, even very brief transformed copies of original video sequences from a very large database as they appear in a video stream. We show that this can be achieved with an optimized similarity-based search method that exploits refined models of the distortions undergone by the video signatures, together with local estimates of signature density. Reliability is evaluated on two ground-truth databases, and scalability is then demonstrated on much larger databases. The method makes it possible to monitor a video stream in deferred real time against a database of 280,000 hours of video. The identification of content links in large video databases also relies on copy detection, but applying the stream monitoring solution directly would result in unacceptably long processing times for large databases. We describe a new method that relies on keyframe-level descriptors embedding sets of local video signatures and on a specific indexing solution. We evaluate its reliability on ground-truth databases, then show that mining 10,000 hours of video takes a little more than three days.