Item Barcode | Call Number | Material Type | Item Category 1 |
---|---|---|---|
30000010194026 | TK5105.884 G52 2008 | Open Access Book | Book |
Summary
The evolution of technology has set the stage for the rapid growth of the video Web: broadband Internet access is ubiquitous, and streaming media protocols, systems, and encoding standards are mature. In addition to Web video delivery, users can easily contribute content captured on low-cost camera phones and other consumer products. The media and entertainment industry no longer views these developments as a threat to its established business practices, but as an opportunity to provide services for more viewers in a wider range of consumption contexts. The emergence of IPTV and mobile video services offers unprecedented access to an ever-growing number of broadcast channels and provides the flexibility to deliver new, more personalized video services. Highly capable portable media players allow us to take this personalized content with us, and to consume it even in places where the network does not reach. Video search engines enable users to take advantage of these emerging video resources for a wide variety of applications including entertainment, education and communications. However, the task of information extraction from video for retrieval applications is challenging, providing opportunities for innovation. This book aims first to describe the current state of video search engine technology and second to inform those with the requisite technical skills of the opportunities to contribute to the development of this field. Today's Web search engines have greatly improved the accessibility and therefore the value of the Web.
Author Notes
David Gibbon joined Bell Laboratories in 1985 and is currently a Lead Member of Technical Staff in the Video and Multimedia Services Research Department at AT&T Labs - Research. His research interests include multimedia processing for searching and browsing of video databases and real-time video processing for communications applications. David has written book chapters and encyclopedia articles as well as numerous technical papers; he has 40 US patent filings and holds 14 US patents in the areas of multimedia indexing, streaming, and video analysis; and he is a member of the ACM and a senior member of the IEEE. David contributes to IPTV industry standards for metadata, and in 2007 he was awarded the AT&T Science and Technology Medal for outstanding technical leadership and innovation in the field of Video and Multimedia Processing and Digital Content Management.
Zhu Liu joined AT&T Labs - Research in 2000, and he is currently a Principal Member of Technical Staff in the Video and Multimedia Services Research Department. His research interests include multimedia content processing, multimedia databases, pattern recognition, and machine learning. Zhu holds 7 US patents and is the inventor of more than 20 pending patents in the areas of multimedia service and content analysis. He has published more than 40 refereed papers in leading international journals and at key conferences in the areas of multimedia. He is a member of the ACM and Tau Beta Pi, and a senior member of the IEEE.
Table of Contents
Preface | p. v |
1 Video Search | p. 1 |
1.1 Introduction | p. 1 |
1.2 Addressing the Opportunity | p. 2 |
1.3 Classification of Web Video Sites | p. 5 |
1.3.1 Content Originators and Traditional Broadcasters | p. 5 |
1.3.2 Aggregators | p. 6 |
1.3.3 Download | p. 6 |
1.3.4 Sharing | p. 6 |
1.3.5 Application Specific | p. 7 |
1.3.6 Other Video Systems | p. 7 |
1.4 Classification of Video Sources | p. 8 |
1.4.1 Webcams / Security | p. 9 |
1.4.2 Video Telephony / Teleconferencing | p. 9 |
1.4.3 Industrial / Academic / Medical | p. 9 |
1.4.4 User Generated Content | p. 10 |
1.4.5 Public Access and Government (PEG) Content | p. 10 |
1.4.6 Enterprise Content | p. 10 |
1.4.7 Rushes, Raw Footage | p. 11 |
1.4.8 News | p. 11 |
1.4.9 Advertising | p. 11 |
1.4.10 Episodic TV Programming | p. 11 |
1.4.11 Feature Films | p. 12 |
1.4.12 Content Value | p. 12 |
1.5 Challenges of Video Search | p. 13 |
1.5.1 Acquisition | p. 14 |
1.5.2 Media File Formats | p. 15 |
1.5.3 Data Transport | p. 16 |
1.5.4 Browsing | p. 16 |
1.5.5 Duplication | p. 17 |
1.5.6 Ranking and Indexing | p. 17 |
1.6 Advantages of Video Search over Text | p. 18 |
1.6.1 Applications | p. 18 |
1.6.2 Metadata | p. 19 |
1.7 Metadata vs. Content | p. 19 |
1.7.1 Content-based retrieval | p. 19 |
1.8 Conclusion | p. 20 |
References | p. 21 |
2 Video Data Sources and Applications | p. 23 |
2.1 Introduction | p. 23 |
2.1.1 Evolution of Digital Media Metadata | p. 23 |
2.1.2 Consumer Video Metadata | p. 24 |
2.1.3 Metadata Loss | p. 24 |
2.1.4 Metadata Standards | p. 25 |
2.1.5 Dublin Core | p. 26 |
2.1.6 MPEG-7 | p. 27 |
2.1.7 MPEG-21 | p. 27 |
2.2 Essential Media Metadata | p. 29 |
2.2.1 Embed Global Metadata | p. 29 |
2.2.2 Elementary Metadata | p. 29 |
2.3 Metadata for Personal Media Collections | p. 31 |
2.3.1 Consumer Media Libraries | p. 31 |
2.3.2 UPnP Forum | p. 33 |
2.3.3 MP3 ID3 | p. 33 |
2.3.4 3GP / QuickTime / MP4 | p. 34 |
2.3.5 Metadata Services | p. 34 |
2.3.6 Content Identification | p. 36 |
2.3.7 Recorded Television | p. 37 |
2.4 Media Syndication: RSS Content Description | p. 39 |
2.4.1 Content Syndication | p. 39 |
2.4.2 Media Enclosures | p. 39 |
2.4.3 Podcasts | p. 41 |
2.4.4 RSS for Content Ingest | p. 42 |
2.4.5 MediaRSS | p. 43 |
2.5 Metadata for Broadcast Television | p. 43 |
2.5.1 Electronic Programming Guide (EPG) | p. 44 |
2.5.2 Extended Data Service (XDS) | p. 46 |
2.5.3 Program and System Identifier Protocol (PSIP) | p. 47 |
2.6 Metadata for Video on Demand | p. 47 |
2.6.1 Introduction | p. 47 |
2.6.2 Cable Labs | p. 49 |
2.7 Production Metadata | p. 50 |
2.8 Timed Text Formats | p. 51 |
2.8.1 Introduction | p. 51 |
2.8.2 Synchronization Precision and Resolution | p. 52 |
2.8.3 Transcripts | p. 53 |
2.8.4 Closed Captions | p. 54 |
2.8.5 Synchronized Accessible Media Interchange | p. 55 |
2.8.6 Metadata from Social Sources | p. 55 |
2.8.7 Metadata Issues | p. 55 |
2.9 Conclusion | p. 56 |
References | p. 56 |
3 Internet Video | p. 59 |
3.1 Introduction | p. 59 |
3.2 Digital Video | p. 59 |
3.2.1 Aspect Ratio | p. 59 |
3.2.2 Luminance and Chrominance Resolution | p. 61 |
3.2.3 Video Compression | p. 62 |
3.3 Internet Protocol Media Systems | p. 66 |
3.3.1 Transport | p. 66 |
3.3.2 Searching VoD vs. Live | p. 67 |
3.3.3 IPTV | p. 68 |
3.3.4 Rights Management | p. 70 |
3.3.5 Redirector Files | p. 70 |
3.3.6 Layered Encoding | p. 73 |
3.3.7 Illustrated Audio | p. 73 |
3.4 Media Captioning | p. 74 |
3.5 Conclusion | p. 75 |
References | p. 76 |
4 Video Search Engine Systems | p. 77 |
4.1 Introduction | p. 77 |
4.2 Content Acquisition | p. 78 |
4.2.1 Metadata Normalization | p. 78 |
4.2.2 User Contributed | p. 79 |
4.2.3 Syndicated Contribution | p. 80 |
4.2.4 Broadcast Acquisition | p. 81 |
4.3 Content Processing | p. 82 |
4.3.1 Asset Management | p. 82 |
4.4 Retrieval | p. 84 |
4.5 User Perspectives | p. 85 |
4.5.1 Interaction States | p. 85 |
4.5.2 Granularity of Search Results Representation | p. 87 |
4.6 Factors Concerning Scalability | p. 88 |
4.6.1 Introduction | p. 88 |
4.6.2 Acquisition | p. 89 |
4.6.3 Processing | p. 89 |
4.6.4 Storage | p. 90 |
4.6.5 Retrieval | p. 91 |
4.7 Retrieval Interfaces | p. 92 |
4.8 Typical System Features | p. 93 |
4.9 Conclusion | p. 94 |
References | p. 94 |
5 Media Processing | p. 97 |
5.1 Introduction | p. 97 |
5.2 Feature Extraction | p. 99 |
5.3 Media Segmentation | p. 100 |
5.4 Clustering, Structure Generation | p. 101 |
5.5 Real-Time Processing | p. 103 |
5.6 Systems Issues and Architectures | p. 103 |
5.7 Conclusion | p. 104 |
References | p. 105 |
6 Video Processing | p. 107 |
6.1 Introduction | p. 107 |
6.2 Shot Boundary Determination | p. 108 |
6.2.1 Feature Extraction | p. 110 |
6.2.2 Shot Boundary Detectors | p. 111 |
6.2.3 Fusion of Detector Results | p. 117 |
6.2.4 Evaluation Results | p. 117 |
6.3 Representative Image Selection | p. 118 |
6.4 Face Detection | p. 121 |
6.5 Face Recognition | p. 126 |
6.6 Video Optical Character Recognition | p. 129 |
6.7 Concept Detection | p. 131 |
6.7.1 Color Feature | p. 133 |
6.7.2 Texture Feature | p. 133 |
6.7.3 Edge Feature | p. 135 |
6.8 Video Browsing | p. 135 |
6.9 Conclusion | p. 140 |
References | p. 141 |
7 Audio Processing | p. 145 |
7.1 Introduction | p. 145 |
7.2 Audio Signal and Its Representation | p. 146 |
7.3 Audio Features | p. 148 |
7.3.1 Frame-Level Features | p. 148 |
7.3.2 Clip-Level Features | p. 154 |
7.4 Audio Segmentation | p. 156 |
7.4.1 Speaker Segmentation | p. 157 |
7.4.2 Audio Scene Segmentation | p. 158 |
7.5 Audio Content Categorization | p. 160 |
7.5.1 Speaker Recognition | p. 160 |
7.5.2 Audio Scene Detection | p. 162 |
7.5.3 Music Genre Classification | p. 163 |
7.6 Speech Recognition | p. 164 |
7.7 Audio Query and Browsing Techniques | p. 166 |
7.7.1 SpeechLogger | p. 167 |
7.7.2 Query by Example | p. 171 |
7.8 Conclusion | p. 172 |
References | p. 173 |
8 Text Processing | p. 177 |
8.1 Introduction | p. 177 |
8.2 Story Segmentation | p. 178 |
8.2.1 Cue Phrases | p. 178 |
8.2.2 Cosine Similarity | p. 179 |
8.2.3 Dynamic Programming | p. 181 |
8.2.4 Topic Classification | p. 183 |
8.3 Named Entity Extraction | p. 183 |
8.3.1 Rule Based NEE | p. 184 |
8.3.2 Data Driven NEE | p. 185 |
8.3.3 NEE Tools | p. 186 |
8.4 Part-of-Speech Tagging | p. 187 |
8.5 Capitalization | p. 189 |
8.5.1 Linguistic Processing Architecture | p. 191 |
8.5.2 Web Document Collection | p. 191 |
8.5.3 Text Capitalization Algorithm | p. 192 |
8.6 Information Retrieval | p. 194 |
8.6.1 Stemming | p. 194 |
8.6.2 Term Weighting | p. 195 |
8.6.3 Ranking | p. 196 |
8.7 Text Summarization | p. 197 |
8.7.1 Keyword Extraction | p. 199 |
8.8 Conclusion | p. 201 |
References | p. 201 |
9 Multimodal Processing | p. 203 |
9.1 Introduction | p. 203 |
9.2 Case Studies | p. 205 |
9.2.1 Closed Caption Alignment | p. 205 |
9.2.2 Multimodal News Story Segmentation | p. 209 |
9.2.3 Major Cast Detection | p. 214 |
9.3 Conclusion | p. 217 |
References | p. 217 |
10 Research Systems | p. 221 |
10.1 Introduction | p. 221 |
10.2 Academic and Industrial Research | p. 222 |
10.3 Early Internet Deployments | p. 226 |
10.3.1 SpeechBot | p. 226 |
10.3.2 StreamSage | p. 227 |
10.3.3 SingingFish | p. 227 |
10.4 Selected Commercial Systems | p. 228 |
10.4.1 Virage and Convera | p. 228 |
10.4.2 Nexidia (FastTalk) | p. 228 |
10.5 Resources: Datasets, Evaluations, Conferences | p. 229 |
10.6 Media Monitoring Deployments | p. 231 |
10.7 Case Study: AT&T MIRACLE | p. 232 |
10.7.1 Introduction | p. 232 |
10.7.2 System Architecture | p. 232 |
10.7.3 Collections | p. 233 |
10.7.4 Data Organization | p. 235 |
10.7.5 Acquisition / Ingest | p. 236 |
10.7.6 Content Processing | p. 238 |
10.7.7 Real-time processing | p. 239 |
10.7.8 Query Engine | p. 239 |
10.7.9 Applications | p. 240 |
10.7.10 Performance | p. 240 |
10.8 Conclusion | p. 242 |
References | p. 242 |
11 Current Trends in Video Search | p. 247 |
11.1 Introduction | p. 247 |
11.2 Video Production | p. 248 |
11.2.1 Metadata Retention | p. 248 |
11.2.2 Multiple Distribution Channels | p. 248 |
11.2.3 Mobisodes and Webisodes | p. 249 |
11.3 Video Distribution | p. 249 |
11.3.1 Streaming Protocols | p. 250 |
11.3.2 Electronic Sell Through | p. 250 |
11.3.3 Peer-to-peer Delivery | p. 251 |
11.3.4 Managed Download | p. 251 |
11.3.5 Syndication | p. 252 |
11.4 The Video Web and User Interaction | p. 252 |
11.4.1 Web-Based Editing | p. 252 |
11.4.2 Media Browsing | p. 252 |
11.4.3 Social Tagging | p. 253 |
11.4.4 Dynamic Interfaces | p. 253 |
11.4.5 Video Blogs (vlogs) | p. 254 |
11.4.6 Integrated Collections | p. 254 |
11.5 Television Technology and Consumption | p. 254 |
11.5.1 Proliferation of Channels | p. 255 |
11.5.2 Live to Time Shifted | p. 255 |
11.5.3 Mobile Consumption | p. 255 |
11.6 Trends in Media Devices | p. 256 |
11.6.1 Increased Media Capabilities | p. 256 |
11.6.2 Increasing Accessibility | p. 257 |
11.6.3 DRM | p. 257 |
11.6.4 Home Media Systems | p. 257 |
11.7 Media Processing Research | p. 257 |
11.8 Deployments | p. 260 |
11.9 Conclusion | p. 261 |
References | p. 261 |
Glossary | p. 265 |
Index | p. 271 |