Title:
Phase-based speech processing
Publication Information:
Singapore : World Scientific Publishing Company, 2006
ISBN:
9789812566126
Available:
Item Barcode | Call Number | Material Type | Item Category 1 |
---|---|---|---|
30000010148672 | TK7882.S65 P42 2006 | Open Access Book | Book |
Summary
This is the first book to take a detailed look at the importance of phase in the design of speech processing systems. Phase, in comparison with amplitude, is often ignored in speech recognition applications. This book highlights some of the important ways in which the phase of speech signals can be used for sound localization, enhancement, and recognition. It also discusses state-of-the-art research in phase-based speech processing, from the basics of signal processing and recording, through single-microphone speech recognition and the processing of speech by humans, to the importance of phase in human speech recognition and multi-microphone phase-based speech processing.
Table of Contents
1 Introduction | p. 1 |
1.1 Motivation | p. 1 |
1.2 The Meaning of Phase | p. 2 |
1.3 Dual Microphone Speech Processing, or Why Two Ears Are Better Than One | p. 3 |
1.4 The Microphone From the 22nd Century - The Human Ear | p. 5 |
1.5 Why Smart Computers Are Hard To Find | p. 6 |
1.6 The Bigger Picture | p. 7 |
1.7 Book Overview | p. 7 |
2 Signal Processing Basics | p. 9 |
2.1 Continuous and Discrete Time Signals | p. 9 |
2.2 Continuous Time Fourier Transform | p. 12 |
2.2.1 Useful Mathematical Identities | p. 13 |
2.3 Sampling | p. 17 |
2.4 Spectral Analysis of Discrete Time Signals | p. 20 |
2.4.1 The Effect of Sampling on the Fourier Transform of a Signal | p. 20 |
2.4.2 The Reconstruction Theorem | p. 23 |
2.4.3 The Discrete Time Fourier Transform (DTFT) | p. 26 |
2.4.4 Sampling in the Frequency Domain | p. 28 |
2.4.5 The Discrete Fourier Transform (DFT) | p. 33 |
2.5 Windowing | p. 48 |
2.6 Delaying Discrete Time Signals by Non-Integer Amounts | p. 58 |
3 Single-Microphone Speech Processing | p. 63 |
3.1 Introduction | p. 63 |
3.2 Background | p. 63 |
3.3 The Role of Phase in Speech Enhancement | p. 64 |
3.4 The Role of Phase in Speech Recognition | p. 66 |
3.4.1 The Fundamentals of HMM Based ASR | p. 66 |
3.4.2 Performance Overview of ASR Systems | p. 71 |
3.5 Phase Estimation from Magnitude | p. 76 |
3.5.1 Signal Estimation from the Modified STFT | p. 76 |
3.5.2 Signal Estimation from the STFT Magnitude | p. 77 |
3.6 Recent Developments in Phase Utilization | p. 78 |
3.7 Summary | p. 81 |
4 Human Hearing | p. 83 |
4.1 Anatomy of the Ear | p. 83 |
4.1.1 External Ear | p. 83 |
4.1.2 Middle Ear | p. 84 |
4.1.3 Inner Ear | p. 84 |
4.2 Physiology of the Ear | p. 84 |
4.2.1 Transmission of Sound Through the Middle Ear | p. 84 |
4.2.2 Physiology of the Cochlea | p. 85 |
4.2.3 Inner Ear Performs Super Fast Fourier Transform | p. 86 |
4.2.4 The "Place" Principle | p. 86 |
4.2.5 Action Potential and Determination of Loudness | p. 86 |
4.2.6 Detection of the Change in the Loudness and the Power Law | p. 88 |
4.2.7 Threshold for Hearing and Frequency Range of Hearing | p. 88 |
4.3 Hearing in the Central Nervous System | p. 89 |
4.3.1 Parallel Processing of Sound in the Cerebral Cortex | p. 89 |
4.3.2 Importance of the Cerebral Cortex in Hearing | p. 89 |
4.4 The Importance of Phase in Human Speech Processing | p. 90 |
4.5 Experimental Setup | p. 91 |
4.6 Experimental Results | p. 92 |
4.7 Modeling the Effect of Phase | p. 94 |
4.8 Speech Recognition Experiments Incorporating Phase Restoration | p. 97 |
4.8.1 Experimental Setup | p. 97 |
4.8.2 Experimental Results | p. 97 |
4.9 Conclusions | p. 98 |
5 Multi-Microphone Phase-Based Speech Processing | p. 101 |
5.1 Introduction | p. 101 |
5.1.1 Dual Microphone Sound Model | p. 105 |
5.1.2 Frequency Dependent Nature of Phase Wrapping | p. 106 |
5.2 Delay-and-sum Beamforming | p. 109 |
5.2.1 Two-microphone Sum Beamforming | p. 109 |
5.2.2 Multi-element Sum Beamforming | p. 112 |
5.2.3 Steering the Array | p. 114 |
5.3 Sound Localization Using a Delay-and-sum Beamformer | p. 116 |
5.4 TDOA Based Sound Localization | p. 116 |
5.4.1 TDOA Estimation | p. 119 |
5.5 A Detailed Look at the Phase Error | p. 122 |
5.6 The Relationship Between Phase-Error and SNR | p. 124 |
5.7 Probabilistic Constraints on the SNRs | p. 127 |
5.8 Phase-Based Time-Varying Filters | p. 131 |
5.9 Beamforming as a Phase-Error Filter | p. 134 |
6 Concluding Remarks | p. 137 |
6.1 Summary (i.e. things you would have known if you had read the book) | p. 137 |
6.2 Directions for Future Research | p. 138 |
6.3 Where Does it End? | p. 139 |
6.4 Disclaimer | p. 140 |
Bibliography | p. 141 |