Video processing and communications

Useful as a reference work, this book offers a good balance between theoretical concepts and practical solutions, with more rigorous formulation of certain problems such as motion estimation, sampling, and basic coding theory. It provides an in-depth exposition of fundamental theory and techniques for video processing, including frequency domain characterization of video signals and visual perception, video sampling and format conversion, and two-dimensional and three-dimensional motion estimation. It also presents techniques important for video communications, including video coding and error control, and up-to-date coverage of recent international standards for video communications. A chapter is devoted to video streaming over the Internet and wireless networks, one of the most popular video communication applications. In addition, it discusses the processing and communication of stereoscopic and multiview video. Intended for practicing researchers and engineers.

Author Notes

Yao Wang received the B.S. and M.S. degrees in electrical engineering from Tsinghua University, Beijing, China, in 1983 and 1985, respectively, and the Ph.D. degree in electrical and computer engineering from the University of California at Santa Barbara in 1990. Since 1990, she has been with the Faculty of Electrical Engineering, Polytechnic University, Brooklyn, NY. Her research areas include video communications, multimedia signal processing, and medical imaging. She has authored and co-authored over 100 papers in journals and conference proceedings. She is a senior member of IEEE and has served as an Associate Editor for the IEEE Transactions on Circuits and Systems for Video Technology and the IEEE Transactions on Multimedia. She won the Mayor's Award of the City of New York for Excellence in Science and Technology in the Young Investigator category in 2000.

Jorn Ostermann studied electrical engineering and communications engineering at the University of Hannover and Imperial College London, respectively. He received the Dipl.-Ing. and Dr.-Ing. degrees from the University of Hannover in 1988 and 1994, respectively. He has been a staff member with Image Processing and Technology Research, AT&T Labs Research since 1996, where he is engaged in research on video coding, shape coding, multi-modal human-computer interfaces with talking avatars, standardization, and image analysis. He is a German National Foundation scholar. In 1998, he received the AT&T Standards Recognition Award and the ISO award. He is a member of the IEEE and of the IEEE Technical Committee on Multimedia Signal Processing, and chair of the IEEE CAS Visual Signal Processing and Communications (VSPC) Technical Committee.

Ya-Qin Zhang received the B.S. and M.S. degrees in electrical engineering from the University of Science and Technology of China (USTC) in 1983 and 1985, respectively, and the Ph.D. degree from George Washington University in 1989. He is currently the Managing Director of Microsoft Research in Beijing, after leaving his post as the Director of the Multimedia Technology Laboratory at the Sarnoff Corporation in Princeton, NJ (formerly the David Sarnoff Research Center, and RCA Laboratories). He has been engaged in research and commercialization of MPEG2/DTV, MPEG4/VLBR, and multimedia information technologies. He has authored and co-authored over 200 refereed papers in leading international conference proceedings and journals. He has been granted over 40 U.S. patents in digital video, Internet, multimedia, wireless and satellite communications. He was the Editor-in-Chief of the IEEE Transactions on Circuits and Systems for Video Technology from 1997 to 1999. He is a Fellow of the IEEE.

Excerpts

In the past decade or so, there have been fascinating developments in multimedia representation and communications. First of all, it has become very clear that all aspects of media are "going digital": from representation to transmission, from processing to retrieval, from studio to home. Second, there have been significant advances in digital multimedia compression and communication algorithms, which make it possible to deliver high-quality video at relatively low bit rates in today's networks. Third, the advancement in VLSI technologies has enabled sophisticated software to be implemented in a cost-effective manner. Last but not least, the establishment of half a dozen international standards by ISO/MPEG and ITU-T laid the common groundwork for different vendors and content providers.

At the same time, the explosive growth in wireless and networking technology has profoundly changed the global communications infrastructure. It is the confluence of wireless, multimedia, and networking that will fundamentally change the way people conduct business and communicate with each other. The future computing and communications infrastructure will be empowered by virtually unlimited bandwidth, full connectivity, high mobility, and rich multimedia capability.

As multimedia becomes more pervasive, the boundaries between video, graphics, computer vision, multimedia database, and computer networking start to blur, making video processing an exciting field with input from many disciplines. Today, video processing lies at the core of multimedia. Among the many technologies involved, video coding and its standardization are definitely the key enablers of these developments.

This book covers the fundamental theory and techniques for digital video processing, with a focus on video coding and communications. It is intended as a textbook for a graduate-level course on video processing, as well as a reference or self-study text for researchers and engineers. In selecting the topics to cover, we have tried to achieve a balance between providing a solid theoretical foundation and presenting complex system issues in real video systems.

SYNOPSIS

Chapter 1 gives a broad overview of video technology, from analog color TV systems to digital video. Chapter 2 delineates the analytical framework for video analysis in the frequency domain, and describes characteristics of the human visual system.

Chapters 3-12 focus on several very important sub-topics in digital video technology. Chapters 3 and 4 consider how a continuous-space video signal can be sampled to retain the maximum perceivable information within the affordable data rate, and how video can be converted from one format to another. Chapter 5 presents models for the various components involved in forming a video signal, including the camera, the illumination source, the imaged objects and the scene composition. Models for the three-dimensional (3-D) motions of the camera and objects, as well as their projections onto the two-dimensional (2-D) image plane, are discussed at length, because these models are the foundation for developing motion estimation algorithms, which are the subjects of Chapters 6 and 7.

Chapter 6 focuses on 2-D motion estimation, which is a critical component in modern video coders. It is also a necessary preprocessing step for 3-D motion estimation. We provide both the fundamental principles governing 2-D motion estimation, and practical algorithms based on different 2-D motion representations. Chapter 7 considers 3-D motion estimation, which is required for various computer vision applications, and can also help improve the efficiency of video coding.

Chapters 8-11 are devoted to the subject of video coding. Chapter 8 introduces the fundamental theory and techniques for source coding, including information theory bounds for both lossless and lossy coding, binary encoding methods, and scalar and vector quantization. Chapter 9 focuses on waveform-based methods (including transform and predictive coding), and introduces the block-based hybrid coding framework, which is the core of all international video coding standards. Chapter 10 discusses content-dependent coding, which has the potential of achieving extremely high compression ratios by making use of knowledge of scene content. Chapter 11 presents scalable coding methods, which are well-suited for video streaming and broadcasting applications, where the intended recipients have varying network connections and computing power.

Chapter 12 introduces stereoscopic and multiview video processing techniques, including disparity estimation and coding of such sequences.

Chapters 13-15 cover system-level issues in video communications. Chapter 13 introduces the H.261, H.263, MPEG-1, MPEG-2, and MPEG-4 standards for video coding, comparing their intended applications and relative performance. These standards integrate many of the coding techniques discussed in Chapters 8-11. The MPEG-7 standard for multimedia content description is also briefly described. Chapter 14 reviews techniques for combating transmission errors in video communication systems, and also describes the requirements of different video applications, and the characteristics of various networks. As an example of a practical video communication system, we end the text with a chapter devoted to video streaming over the Internet and wireless networks. Chapter 15 discusses the requirements and representative solutions for the major subcomponents of a streaming system.

SUGGESTED USE FOR INSTRUCTION AND SELF-STUDY

As prerequisites, students are assumed to have finished undergraduate courses in signals and systems, communications, probability, and preferably a course in image processing.

For a one-semester course focusing on video coding and communications, we recommend covering the two beginning chapters, followed by video modeling (Chapter 5), 2-D motion estimation (Chapter 6), video coding (Chapters 8-11), standards (Chapter 13), error control (Chapter 14) and video streaming systems (Chapter 15). On the other hand, for a course on general video processing, the first nine chapters, including the introduction (Chapter 1), frequency domain analysis (Chapter 2), sampling and sampling rate conversion (Chapters 3 and 4), video modeling (Chapter 5), motion estimation (Chapters 6 and 7), and basic video coding techniques (Chapters 8 and 9), plus selected topics from Chapters 10-13 (content-dependent coding, scalable coding, stereo, and video coding standards) may be appropriate. In either case, Chapter 8 may be skipped or only briefly reviewed if the students have finished a prior course on source coding. Chapters 7 (3-D motion estimation), 10 (content-dependent coding), 11 (scalable coding), 12 (stereo), 14 (error-control), and 15 (video streaming) may also be left for an advanced course in video, after covering the other chapters in a first course in video. In all cases, sections denoted by asterisks (*) may be skipped or left for further exploration by advanced students.

Problems are provided at the end of Chapters 1-14 for self-study or as homework assignments for classroom use. Appendix D gives answers to selected problems. The website for this book (www.prenhall.com/wang) provides MATLAB scripts used to generate some of the plots in the figures. Instructors may modify these scripts to generate similar examples. The scripts may also help students to understand the underlying operations. Sample video sequences can be downloaded from the website, so that students can evaluate the performance of different algorithms on real sequences. Some compressed sequences using standard algorithms are also included, to enable instructors to demonstrate coding artifacts at different rates by different techniques.

Excerpted from Video Processing and Communications by Yao Wang, Joern Ostermann, Ya-Qin Zhang. All rights reserved by the original copyright owners. Excerpts are provided for display purposes only and may not be reproduced, reprinted or distributed without the written permission of the publisher.

(NOTE: Each chapter concludes with Summary, Problems, and Bibliography.)

1 Video Formation, Perception, and Representation

Color Perception and Specification

Video Capture and Display

Analog Video Raster

Analog Color Television Systems

Digital Video

2 Fourier Analysis of Video Signals and Frequency Response of the Human Visual System

Multidimensional Continuous-Space Signals and Systems

Multidimensional Discrete-Space Signals and Systems

Frequency Domain Characterization of Video Signals

Frequency Response of the Human Visual System

3 Video Sampling

Basics of the Lattice Theory

Sampling over Lattices

Sampling of Video Signals

Filtering Operations in Cameras and Display Devices

4 Video Sampling Rate Conversion

Conversion of Signals Sampled on Different Lattices

Sampling Rate Conversion of Video Signals

5 Video Modeling

Camera Model

Illumination Model

Object Model

Scene Model

Two-Dimensional Motion Models

6 Two-Dimensional Motion Estimation

Optical Flow

General Methodologies

Pixel-Based Motion Estimation

Block-Matching Algorithm

Deformable Block-Matching Algorithms

Mesh-Based Motion Estimation

Global Motion Estimation

Region-Based Motion Estimation

Multiresolution Motion Estimation

Application of Motion Estimation in Video Coding

7 Three-Dimensional Motion Estimation

Feature-Based Motion Estimation

Direct Motion Estimation

Iterative Motion Estimation

8 Foundations of Video Coding

Overview of Coding Systems

Basic Notions in Probability and Information Theory

Information Theory for Source Coding

Binary Encoding

Scalar Quantization

Vector Quantization

9 Waveform-Based Video Coding

Block-Based Transform Coding

Predictive Coding

Video Coding Using Temporal Prediction and Transform Coding

10 Content-Dependent Video Coding

Two-Dimensional Shape Coding

Texture Coding for Arbitrarily Shaped Regions

Joint Shape and Texture Coding

Region-Based Video Coding

Object-Based Video Coding

Knowledge-Based Video Coding

Semantic Video Coding

Layered Coding System

11 Scalable Video Coding

Basic Modes of Scalability

Object-Based Scalability

Wavelet-Transform-Based Coding

12 Stereo and Multiview Sequence Processing

Depth Perception

Stereo Imaging Principle

Disparity Estimation

Intermediate View Synthesis

Stereo Sequence Coding

13 Video Compression Standards

Standardization

Video Telephony with H.261 and H.263

Standards for Visual Communication Systems

Consumer Video Communications with MPEG-1

Digital TV with MPEG-2

Coding of Audiovisual Objects with MPEG-4

Video Bit Stream Syntax

Multimedia Content Description Using MPEG-7

14 Error Control in Video Communications

Motivation and Overview of Approaches

Typical Video Applications and Communications Networks

Transport-Level Error Control

Error-Resilient Encoding

Decoder Error Concealment

Encoder-Decoder Interactive Error Control

Error-Resilience Tools in H.263 and MPEG-4

15 Streaming Video over the Internet and Wirele
