Title:
A unified framework for video summarization, browsing and retrieval : with applications to consumer and surveillance video
Publication Information:
Burlington, MA : Academic Press, 2006
ISBN:
9780123693877 (hbk.)

Available:

Item Barcode: 30000004699561
Call Number: TK6680.5 U55 2006
Material Type: Open Access Book
Item Category 1: Book

Summary


Large volumes of video content are practical to access only with rapid browsing and retrieval techniques. Constructing a video table of contents (ToC) and video highlights is essential if end users are to sift through all this data and find what they want, when they want it. This reference puts forth a unified framework that integrates these functions to support efficient browsing and retrieval of video content. The authors have developed a cohesive way to create a video table of contents, video highlights, and video indices that streamline consumer and surveillance video applications.

The authors discuss the generation of a table of contents, the extraction of highlights, different techniques for audio and video marker recognition, and indexing with low-level features such as color, texture, and shape. Current applications that use this summarization and browsing technology are also reviewed. Applications such as event detection in elevator surveillance, highlight extraction from sports video, and image and video database management are considered within the proposed framework. The book presents recent research across this range of topics.


Author Notes

Ziyou Xiong is a senior research engineer/scientist in the Dynamic Modeling and Analysis group of the United Technologies Research Center.
Regunathan Radhakrishnan is currently a visiting researcher at Mitsubishi Electric Research Laboratories.
Ajay Divakaran currently leads the Data and Sensor Systems Team at the Technology Laboratory of Mitsubishi Electric Research Laboratories.
Yong Rui is a researcher in the Communication and Collaboration Systems group at Microsoft Research, where he leads the Multimedia Collaboration team.
Thomas S. Huang is a William L. Everitt Distinguished Professor of Electrical and Computer Engineering, and head of the Image Formation and Processing Group at the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign.


Table of Contents

List of Figures
List of Tables
Preface
Acknowledgments
Chapter 1 Introduction
1.1 Introduction
1.2 Terminology
1.3 Video Analysis
1.3.1 Shot Boundary Detection
1.3.2 Key Frame Extraction
1.3.3 Play/Break Segmentation
1.3.4 Audio Marker Detection
1.3.5 Video Marker Detection
1.4 Video Representation
1.4.1 Video Representation for Scripted Content
1.4.2 Video Representation for Unscripted Content
1.5 Video Browsing and Retrieval
1.5.1 Video Browsing Using ToC-Based Summary
1.5.2 Video Browsing Using Highlights-Based Summary
1.5.3 Video Retrieval
1.6 The Rest of the Book
Chapter 2 Video Table-of-Content Generation
2.1 Introduction
2.2 Related Work
2.2.1 Shot- and Key Frame-Based Video ToC
2.2.2 Group-Based Video ToC
2.2.3 Scene-Based Video ToC
2.3 The Proposed Approach
2.3.1 Shot Boundary Detection and Key Frame Extraction
2.3.2 Spatiotemporal Feature Extraction
2.3.3 Time-Adaptive Grouping
2.3.4 Scene Structure Construction
2.4 Determination of the Parameters
2.4.1 Gaussian Normalization
2.4.2 Determining W_C and W_A
2.4.3 Determining groupThreshold and sceneThreshold
2.5 Experimental Results
2.6 Conclusions
Chapter 3 Highlights Extraction from Unscripted Video
3.1 Introduction
3.1.1 Audio Marker Recognition
3.1.2 Visual Marker Detection
3.1.3 Audio-Visual Marker Association and Finer-Resolution Highlights
3.2 Audio Marker Recognition
3.2.1 Estimating the Number of Mixtures in GMMs
3.2.2 Evaluation Using the Precision-Recall Curve
3.2.3 Performance Comparison
3.2.4 Experimental Results on Golf Highlights Generation
3.3 Visual Marker Detection
3.3.1 Motivation
3.3.2 Choice of Visual Markers
3.3.3 Robust Real-Time Object Detection Algorithm
3.3.4 Results of Baseball Catcher Detection
3.3.5 Results of Soccer Goalpost Detection
3.3.6 Results of Golfer Detection
3.4 Finer-Resolution Highlights Extraction
3.4.1 Audio-Visual Marker Association
3.4.2 Finer-Resolution Highlights Classification
3.4.3 Method 1: Clustering
3.4.4 Method 2: Color/Motion Modeling Using HMMs
3.4.5 Method 3: Audio-Visual Modeling Using CHMMs
3.4.6 Experimental Results with DCHMM
3.5 Conclusions
Chapter 4 Video Structure Discovery Using Unsupervised Learning
4.1 Motivation and Related Work
4.2 Proposed Inlier/Outlier-Based Representation for "Unscripted" Multimedia Using Audio Analysis
4.3 Feature Extraction and the Audio Classification Framework
4.3.1 Feature Extraction
4.3.2 Mel Frequency Cepstral Coefficients (MFCC)
4.3.3 Modified Discrete Cosine Transform (MDCT) Features from AC-3 Stream
4.3.4 Audio Classification Framework
4.4 Proposed Time Series Analysis Framework
4.4.1 Problem Formulation
4.4.2 Kernel/Affinity Matrix Computation
4.4.3 Segmentation Using Eigenvector Analysis of Affinity Matrices
4.4.4 Past Work on Detecting "Surprising" Patterns from Time Series
4.4.5 Proposed Outlier Subsequence Detection in Time Series
4.4.6 Generative Model for Synthetic Time Series
4.4.7 Performance of the Normalized Cut for Case 2
4.4.8 Comparison with Other Clustering Approaches for Case 2
4.4.9 Performance of Normalized Cut for Case 3
4.5 Ranking Outliers for Summarization
4.5.1 Kernel Density Estimation
4.5.2 Confidence Measure for Outliers with Binomial and Multinomial PDF Models for the Contexts
4.5.3 Confidence Measure for Outliers with GMM and HMM Models for the Contexts
4.5.4 Using Confidence Measures to Rank Outliers
4.6 Application to Consumer Video Browsing
4.6.1 Highlights Extraction from Sports Video
4.6.2 Scene Segmentation for Situation Comedy Videos
4.7 Systematic Acquisition of Key Audio Classes
4.7.1 Application to Sports Highlights Extraction
4.7.2 Event Detection in Elevator Surveillance Audio
4.8 Possibilities for Future Research
Chapter 5 Video Indexing
5.1 Introduction
5.1.1 Motivation
5.1.2 Overview of MPEG-7
5.2 Indexing with Low-Level Features: Motion
5.2.1 Introduction
5.2.2 Overview of MPEG-7 Motion Descriptors
5.2.3 Camera Motion Descriptor
5.2.4 Motion Trajectory
5.2.5 Parametric Motion
5.2.6 Motion Activity
5.2.7 Applications of Motion Descriptors
5.2.8 Video Browsing System Based on Motion Activity
5.2.9 Conclusion
5.3 Indexing with Low-Level Features: Color
5.4 Indexing with Low-Level Features: Texture
5.5 Indexing with Low-Level Features: Shape
5.6 Indexing with Low-Level Features: Audio
5.7 Indexing with User Feedback
5.8 Indexing Using Concepts
5.9 Discussion and Conclusions
Chapter 6 A Unified Framework for Video Summarization, Browsing, and Retrieval
6.1 Video Browsing
6.2 Video Highlights Extraction
6.2.1 Audio Marker Detection
6.2.2 Visual Marker Detection
6.2.3 Audio-Visual Markers Association for Highlights Candidates Generation
6.2.4 Finer-Resolution Highlights Recognition and Verification
6.3 Video Retrieval
6.4 A Unified Framework for Summarization, Browsing, and Retrieval
6.5 Conclusions and Promising Research Directions
Chapter 7 Applications
7.1 Introduction
7.2 Consumer Video Applications
7.2.1 Challenges for Consumer Video Browsing Applications
7.3 Image/Video Database Management
7.4 Surveillance
7.5 Challenges of Current Applications
7.6 Conclusions
Chapter 8 Conclusions
Bibliography
About the Authors
Index