Title:
Introduction to video search engines
Personal Author:
David Gibbon
Publication Information:
Berlin : Springer, 2008
Physical Description:
xv, 274 p. : ill. ; 24 cm.
ISBN:
9783540793366
Added Author:
Zhu Liu

Call Number:
TK5105.884 G52 2008
Item Barcode:
30000010194026
Material Type:
Open Access Book
Item Category 1:
Book

Summary

The evolution of technology has set the stage for the rapid growth of the video Web: broadband Internet access is ubiquitous, and streaming media protocols, systems, and encoding standards are mature. In addition to Web video delivery, users can easily contribute content captured on low-cost camera phones and other consumer products. The media and entertainment industry no longer views these developments as a threat to its established business practices, but as an opportunity to provide services for more viewers in a wider range of consumption contexts. The emergence of IPTV and mobile video services offers unprecedented access to an ever-growing number of broadcast channels and provides the flexibility to deliver new, more personalized video services. Highly capable portable media players allow us to take this personalized content with us, and to consume it even in places where the network does not reach. Video search engines enable users to take advantage of these emerging video resources for a wide variety of applications including entertainment, education and communications. However, the task of information extraction from video for retrieval applications is challenging, providing opportunities for innovation. This book aims first to describe the current state of video search engine technology and second to inform those with the requisite technical skills of the opportunities to contribute to the development of this field. Today's Web search engines have greatly improved the accessibility and therefore the value of the Web.


Author Notes

David Gibbon joined Bell Laboratories in 1985 and is currently a Lead Member of Technical Staff in the Video and Multimedia Services Research Department at AT&T Labs - Research. His research interests include multimedia processing for searching and browsing of video databases and real-time video processing for communications applications. David has written book chapters and encyclopedia articles as well as numerous technical papers; he has 40 US patent filings and holds 14 US patents in the areas of multimedia indexing, streaming, and video analysis; and he is a member of the ACM, and a senior member of the IEEE. David contributes to IPTV industry standards for metadata and in 2007 he was awarded the AT&T Science and Technology Medal for outstanding technical leadership and innovation in the field of Video and Multimedia Processing and Digital Content Management.

Zhu Liu joined AT&T Labs - Research in 2000, and he is currently a Principal Member of Technical Staff in the Video and Multimedia Services Research Department. His research interests include multimedia content processing, multimedia databases, pattern recognition, and machine learning. Zhu holds 7 US patents and he is the inventor of more than 20 pending patents in the areas of multimedia service and content analysis. He has published more than 40 refereed papers in international leading journals and at key conferences in the areas of multimedia. He is a member of ACM and Tau Beta Pi, and a senior member of the IEEE.


Table of Contents

Preface
1 Video Search
1.1 Introduction
1.2 Addressing the Opportunity
1.3 Classification of Web Video Sites
1.3.1 Content Originators and Traditional Broadcasters
1.3.2 Aggregators
1.3.3 Download
1.3.4 Sharing
1.3.5 Application Specific
1.3.6 Other Video Systems
1.4 Classification of Video Sources
1.4.1 Webcams / Security
1.4.2 Video Telephony / Teleconferencing
1.4.3 Industrial / Academic / Medical
1.4.4 User Generated Content
1.4.5 Public Access and Government (PEG) Content
1.4.6 Enterprise Content
1.4.7 Rushes, Raw Footage
1.4.8 News
1.4.9 Advertising
1.4.10 Episodic TV Programming
1.4.11 Feature Films
1.4.12 Content Value
1.5 Challenges of Video Search
1.5.1 Acquisition
1.5.2 Media File Formats
1.5.3 Data Transport
1.5.4 Browsing
1.5.5 Duplication
1.5.6 Ranking and Indexing
1.6 Advantages of Video Search over Text
1.6.1 Applications
1.6.2 Metadata
1.7 Metadata vs. Content
1.7.1 Content-based retrieval
1.8 Conclusion
References
2 Video Data Sources and Applications
2.1 Introduction
2.1.1 Evolution of Digital Media Metadata
2.1.2 Consumer Video Metadata
2.1.3 Metadata Loss
2.1.4 Metadata Standards
2.1.5 Dublin Core
2.1.6 MPEG-7
2.1.7 MPEG-21
2.2 Essential Media Metadata
2.2.1 Embed Global Metadata
2.2.2 Elementary Metadata
2.3 Metadata for Personal Media Collections
2.3.1 Consumer Media Libraries
2.3.2 UPnP Forum
2.3.3 MP3 ID3
2.3.4 3GP / QuickTime / MP4
2.3.5 Metadata Services
2.3.6 Content Identification
2.3.7 Recorded Television
2.4 Media Syndication: RSS Content Description
2.4.1 Content Syndication
2.4.2 Media Enclosures
2.4.3 Podcasts
2.4.4 RSS for Content Ingest
2.4.5 MediaRSS
2.5 Metadata for Broadcast Television
2.5.1 Electronic Programming Guide (EPG)
2.5.2 Extended Data Service (XDS)
2.5.3 Program and System Identifier Protocol (PSIP)
2.6 Metadata for Video on Demand
2.6.1 Introduction
2.6.2 Cable Labs
2.7 Production Metadata
2.8 Timed Text Formats
2.8.1 Introduction
2.8.2 Synchronization Precision and Resolution
2.8.3 Transcripts
2.8.4 Closed Captions
2.8.5 Synchronized Accessible Media Interchange
2.8.6 Metadata from Social Sources
2.8.7 Metadata Issues
2.9 Conclusion
References
3 Internet Video
3.1 Introduction
3.2 Digital Video
3.2.1 Aspect Ratio
3.2.2 Luminance and Chrominance Resolution
3.2.3 Video Compression
3.3 Internet Protocol Media Systems
3.3.1 Transport
3.3.2 Searching VoD vs. Live
3.3.3 IPTV
3.3.4 Rights Management
3.3.5 Redirector Files
3.3.6 Layered Encoding
3.3.7 Illustrated Audio
3.4 Media Captioning
3.5 Conclusion
References
4 Video Search Engine Systems
4.1 Introduction
4.2 Content Acquisition
4.2.1 Metadata Normalization
4.2.2 User Contributed
4.2.3 Syndicated Contribution
4.2.4 Broadcast Acquisition
4.3 Content Processing
4.3.1 Asset Management
4.4 Retrieval
4.5 User Perspectives
4.5.1 Interaction States
4.5.2 Granularity of Search Results Representation
4.6 Factors Concerning Scalability
4.6.1 Introduction
4.6.2 Acquisition
4.6.3 Processing
4.6.4 Storage
4.6.5 Retrieval
4.7 Retrieval Interfaces
4.8 Typical System Features
4.9 Conclusion
References
5 Media Processing
5.1 Introduction
5.2 Feature Extraction
5.3 Media Segmentation
5.4 Clustering, Structure Generation
5.5 Real-Time Processing
5.6 Systems Issues and Architectures
5.7 Conclusion
References
6 Video Processing
6.1 Introduction
6.2 Shot Boundary Determination
6.2.1 Feature Extraction
6.2.2 Shot Boundary Detectors
6.2.3 Fusion of Detector Results
6.2.4 Evaluation Results
6.3 Representative Image Selection
6.4 Face Detection
6.5 Face Recognition
6.6 Video Optical Character Recognition
6.7 Concept Detection
6.7.1 Color Feature
6.7.2 Texture Feature
6.7.3 Edge Feature
6.8 Video Browsing
6.9 Conclusion
References
7 Audio Processing
7.1 Introduction
7.2 Audio Signal and Its Representation
7.3 Audio Features
7.3.1 Frame-Level Features
7.3.2 Clip-Level Features
7.4 Audio Segmentation
7.4.1 Speaker Segmentation
7.4.2 Audio Scene Segmentation
7.5 Audio Content Categorization
7.5.1 Speaker Recognition
7.5.2 Audio Scene Detection
7.5.3 Music Genre Classification
7.6 Speech Recognition
7.7 Audio Query and Browsing Techniques
7.7.1 SpeechLogger
7.7.2 Query by Example
7.8 Conclusion
References
8 Text Processing
8.1 Introduction
8.2 Story Segmentation
8.2.1 Cue Phrases
8.2.2 Cosine Similarity
8.2.3 Dynamic Programming
8.2.4 Topic Classification
8.3 Named Entity Extraction
8.3.1 Rule Based NEE
8.3.2 Data Driven NEE
8.3.3 NEE Tools
8.4 Part-of-Speech Tagging
8.5 Capitalization
8.5.1 Linguistic Processing Architecture
8.5.2 Web Document Collection
8.5.3 Text Capitalization Algorithm
8.6 Information Retrieval
8.6.1 Stemming
8.6.2 Term Weighting
8.6.3 Ranking
8.7 Text Summarization
8.7.1 Keyword Extraction
8.8 Conclusion
References
9 Multimodal Processing
9.1 Introduction
9.2 Case Studies
9.2.1 Closed Caption Alignment
9.2.2 Multimodal News Story Segmentation
9.2.3 Major Cast Detection
9.3 Conclusion
References
10 Research Systems
10.1 Introduction
10.2 Academic and Industrial Research
10.3 Early Internet Deployments
10.3.1 SpeechBot
10.3.2 StreamSage
10.3.3 SingingFish
10.4 Selected Commercial Systems
10.4.1 Virage and Convera
10.4.2 Nexidia (FastTalk)
10.5 Resources: Datasets, Evaluations, Conferences
10.6 Media Monitoring Deployments
10.7 Case Study: AT&T MIRACLE
10.7.1 Introduction
10.7.2 System Architecture
10.7.3 Collections
10.7.4 Data Organization
10.7.5 Acquisition / Ingest
10.7.6 Content Processing
10.7.7 Real-time processing
10.7.8 Query Engine
10.7.9 Applications
10.7.10 Performance
10.8 Conclusion
References
11 Current Trends in Video Search
11.1 Introduction
11.2 Video Production
11.2.1 Metadata Retention
11.2.2 Multiple Distribution Channels
11.2.3 Mobisodes and Webisodes
11.3 Video Distribution
11.3.1 Streaming Protocols
11.3.2 Electronic Sell Through
11.3.3 Peer-to-peer Delivery
11.3.4 Managed Download
11.3.5 Syndication
11.4 The Video Web and User Interaction
11.4.1 Web-Based Editing
11.4.2 Media Browsing
11.4.3 Social Tagging
11.4.4 Dynamic Interfaces
11.4.5 Video Blogs (vlogs)
11.4.6 Integrated Collections
11.5 Television Technology and Consumption
11.5.1 Proliferation of Channels
11.5.2 Live to Time Shifted
11.5.3 Mobile Consumption
11.6 Trends in Media Devices
11.6.1 Increased Media Capabilities
11.6.2 Increasing Accessibility
11.6.3 DRM
11.6.4 Home Media Systems
11.7 Media Processing Research
11.8 Deployments
11.9 Conclusion
References
Glossary
Index