Speech and audio processing in adverse environments

Users of signal processing systems are never satisfied with the system they currently use. They are constantly asking for higher quality, faster performance, more comfort and lower prices. Researchers and developers should be appreciative of this attitude. It justifies their constant effort for improved systems. Better knowledge about biological and physical interrelations coming along with more powerful technologies are their engines on the endless road to perfect systems. This book is an impressive image of this process. After "Acoustic Echo and Noise Control" published in 2004, many new results led to "Topics in Acoustic Echo and Noise Control" edited in 2006. Today - in 2008 - even more new findings and systems could be collected in this book. Comparing the contributions in both edited volumes, progress in knowledge and technology becomes clearly visible: Blind methods and multi-input systems replace "humble" low complexity systems. The functionality of new systems is less and less limited by the processing power available under economic constraints. The editors have to thank all the authors for their contributions. They cooperated readily in our effort to unify the layout of the chapters, the terminology, and the symbols used. It was a pleasure to work with all of them. Furthermore, it is the editors' concern to thank Christoph Baumann and the Springer Publishing Company for the encouragement and help in publishing this book.

Abbreviations and Acronyms	p. 1
1 Introduction / E. Hänsler and G. Schmidt
1.1 Overview about the Book	p. 8
Part I Speech Enhancement
2 Low Delay Filter-Banks for Speech and Audio Processing / H. W. Löllmann and P. Vary
2.1 Introduction	p. 13
2.2 Analysis-Synthesis Filter-Banks	p. 15
2.2.1 General Structure	p. 15
2.2.2 Tree-Structured Filter-Banks	p. 16
2.2.3 Modulated Filter-Banks	p. 17
2.2.4 Frequency Warped Filter-Banks	p. 20
2.2.5 Low Delay Filter-Banks	p. 26
2.3 The Filter-Bank Equalizer	p. 29
2.3.1 Concept	p. 29
2.3.2 Prototype Filter Design	p. 31
2.3.3 Relation between GDFT and GDCT	p. 33
2.3.4 Realization for Different Filter Structures	p. 35
2.3.5 Polyphase Network Implementation	p. 37
2.3.6 The Non-Uniform Filter-Bank Equalizer	p. 41
2.3.7 Comparison between FBE and AS FB	p. 43
2.3.8 Algorithmic Complexity	p. 43
2.4 Further Measures for Signal Delay Reduction	p. 44
2.4.1 Concept	p. 45
2.4.2 Approximation by a Moving-Average Filter	p. 45
2.4.3 Approximation by an Auto-Regressive Filter	p. 46
2.4.4 Algorithmic Complexity	p. 47
2.4.5 Warped Filter Approximation	p. 48
2.5 Application to Noise Reduction	p. 49
2.5.1 System Configurations	p. 49
2.5.2 Instrumental Quality Measures	p. 50
2.5.3 Simulation Results for the Uniform Filter-Banks	p. 51
2.5.4 Simulation Results for the Warped Filter-Banks	p. 53
2.6 Conclusions	p. 55
References	p. 56
3 A Pre-Filter for Hands-Free Car Phone Noise Reduction: Suppression of Harmonic Engine Noise Components / H. Puder
3.1 Introduction	p. 63
3.2 Analysis of the Different Car Noise Components	p. 64
3.2.1 Wind Noise	p. 65
3.2.2 Tire Noise	p. 65
3.2.3 Engine Noise	p. 66
3.3 Engine Noise Removal Based on Notch Filters	p. 68
3.4 Compensation of Engine Harmonics with Adaptive Filters	p. 73
3.4.1 Step-Size Control	p. 75
3.4.2 Calculating the Optimal Step-Size	p. 78
3.4.3 Results of the Compensation Approach	p. 80
3.5 Evaluation and Comparison of the Results Obtained by the Notch Filter and the Compensation Approach	p. 84
3.6 Conclusions and Summary	p. 85
3.6.1 Conclusion	p. 85
3.6.2 Summary	p. 86
References	p. 87
4 Model-Based Speech Enhancement / M. Krini and G. Schmidt
4.1 Introduction	p. 89
4.2 Conventional Speech Enhancement Schemes	p. 91
4.3 Speech Enhancement Schemes Based on Nonlinearities	p. 93
4.4 Speech Enhancement Schemes Based on Speech Reconstruction	p. 97
4.4.1 Feature Extraction and Control	p. 99
4.4.2 Reconstruction of Speech Signals	p. 110
4.5 Combining the Reconstructed and the Noise Suppressed Signal	p. 124
4.5.1 Adding the Fully Reconstructed Signal	p. 125
4.5.2 Adding only the Voiced Part of the Reconstructed Signal	p. 129
4.6 Summary and Outlook	p. 133
References	p. 133
5 Bandwidth Extension of Telephony Speech / B. Iser and G. Schmidt
5.1 Introduction	p. 135
5.2 Organization of the Chapter	p. 137
5.3 Basics	p. 138
5.3.1 Human Speech Generation	p. 139
5.3.2 Source-Filter Model	p. 141
5.3.3 Parametric Representations of the Spectral Envelope	p. 143
5.3.4 Distance Measures	p. 147
5.4 Non-Model-Based Algorithms for Bandwidth Extension	p. 149
5.4.1 Oversampling with Imaging	p. 149
5.4.2 Spectral Shifting	p. 151
5.4.3 Application of Non-Linear Characteristics	p. 153
5.5 Model-Based Algorithms for Bandwidth Extension	p. 153
5.5.1 Generation of the Excitation Signal	p. 155
5.5.2 Vocal Tract Transfer Function Estimation	p. 159
5.6 Evaluation of Bandwidth Extension Algorithms	p. 176
5.6.1 Objective Distance Measures	p. 177
5.6.2 Subjective Measures	p. 180
5.7 Conclusions	p. 181
References	p. 182
6 Dereverberation and Residual Echo Suppression in Noisy Environments / E. A. P. Habets, S. Gannot, and I. Cohen
6.1 Introduction	p. 186
6.2 Problem Formulation	p. 188
6.3 OM-LSA Estimator for Multiple Interferences	p. 191
6.3.1 OM-LSA Estimator	p. 191
6.3.2 A priori SIR Estimator	p. 193
6.4 Dereverberation of Noisy Speech Signals	p. 195
6.4.1 Short Introduction to Speech Dereverberation	p. 195
6.4.2 Problem Formulation	p. 197
6.4.3 Statistical Reverberation Model	p. 199
6.4.4 Late Reverberant Spectral Variance Estimator	p. 200
6.4.5 Summary and Discussion	p. 203
6.5 Residual Echo Suppression	p. 203
6.5.1 Problem Formulation	p. 204
6.5.2 Late Residual Echo Spectral Variance Estimator	p. 206
6.5.3 Parameter Estimation	p. 208
6.5.4 Summary	p. 210
6.6 Joint Suppression of Reverberation, Residual Echo, and Noise	p. 210
6.7 Experimental Results	p. 212
6.7.1 Experimental Setup	p. 214
6.7.2 Joint Suppression of Reverberation and Noise	p. 214
6.7.3 Suppression of Residual Echo	p. 216
6.7.4 Joint Suppression of Reverberation, Residual Echo, and Noise	p. 221
6.8 Summary and Outlook	p. 223
References	p. 224
7 Low Distortion Noise Cancellers -- Revival of a Classical Technique / A. Sugiyama
7.1 Introduction	p. 229
7.2 Distortions in Widrow's Adaptive Noise Canceller	p. 230
7.2.1 Distortion by Interference	p. 230
7.2.2 Distortion by Crosstalk	p. 232
7.3 Paired Filter (PF) Structure	p. 233
7.3.1 Algorithm	p. 233
7.3.2 Evaluations	p. 235
7.4 Crosstalk Resistant ANC and Cross-Coupled Structure	p. 239
7.4.1 Crosstalk Resistant ANC	p. 240
7.4.2 Cross-Coupled Structure	p. 241
7.5 Cross-Coupled Paired Filter (CCPF) Structure	p. 242
7.5.1 Algorithm	p. 242
7.5.2 Evaluations	p. 245
7.6 Generalized Cross-Coupled Paired Filter (GCCPF) Structure	p. 247
7.6.1 Algorithm	p. 250
7.6.2 Evaluation by Recorded Signals	p. 251
7.7 Demonstration in a Personal Robot	p. 261
7.8 Conclusions	p. 261
References	p. 263
Part II Echo Cancellation
8 Nonlinear Echo Cancellation Based on Spectral Shaping / O. Hoshuyama and A. Sugiyama
8.1 Introduction	p. 267
8.2 Frequency-Domain Model of Highly Nonlinear Residual Echo	p. 268
8.2.1 Spectral Correlation Between Residual Echo and Echo Replica	p. 269
8.2.2 Model of Residual Echo Based on Spectral Correlation	p. 273
8.3 Echo Canceller Based on the New Residual Echo Model	p. 274
8.3.1 Overall Structure	p. 274
8.3.2 Estimation of Near-End Speech	p. 275
8.3.3 Spectral Gain Control	p. 276
8.4 Evaluations	p. 277
8.4.1 Objective Evaluations	p. 277
8.4.2 Subjective Evaluation	p. 279
8.5 DSP Implementation and Real-Time Evaluation	p. 280
8.6 Conclusions	p. 280
References	p. 281
Part III Signal and System Quality Evaluation
9 Telephone-Speech Quality / U. Heute
9.1 Telephone-Speech Signals	p. 287
9.1.1 Telephone Scenario	p. 287
9.1.2 Telephone-Scenario Model	p. 287
9.2 Speech-Signal Quality	p. 289
9.2.1 Intelligibility	p. 289
9.2.2 Speech-Sound Quality	p. 290
9.3 Speech-Quality Assessment	p. 292
9.3.1 Auditory Quality Assessment	p. 292
9.3.2 Aims	p. 292
9.3.3 Instrumental Quality Assessment	p. 293
9.4 Compound-System Quality Prediction	p. 293
9.4.1 The System-Planning Task	p. 293
9.4.2 ETSI Network-Planning Model (E-Model)	p. 293
9.5 Auditory Total-Quality Assessment	p. 294
9.5.1 Conversation Tests	p. 294
9.5.2 Listening Tests	p. 296
9.5.3 LOTs with Pair Comparisons	p. 296
9.5.4 Absolute-Category Rating (ACR) LOTs	p. 297
9.6 Auditory Quality-Attribute Analysis	p. 298
9.6.1 Quality Attributes	p. 298
9.6.2 Attribute-Oriented LOTs	p. 298
9.6.3 Search for Suitable Attributes	p. 302
9.6.4 Integral-Quality Estimation from Attributes	p. 305
9.7 Instrumental Total-Quality Measurement	p. 306
9.7.1 Signal Comparisons	p. 306
9.7.2 Evaluation Approaches	p. 306
9.7.3 Psychoacoustically Motivated Measures	p. 312
9.8 Instrumental Attribute-Based Quality Measurements	p. 320
9.8.1 Basic Ideas	p. 320
9.8.2 Loudness	p. 322
9.8.3 Sharpness	p. 323
9.8.4 Roughness	p. 323
9.8.5 Directness/Frequency Content (DFC)	p. 324
9.8.6 Continuity	p. 326
9.8.7 Noisiness	p. 329
9.8.8 Combined Direct and Attribute-Based Total Quality Determination	p. 331
9.9 Conclusions, Outlook, and Final Remarks	p. 331
References	p. 332
10 Evaluation of Hands-free Terminals / F. Kettler and H.-W. Gierlich
10.1 Introduction	p. 339
10.2 Quality Assessment of Hands-free Terminals	p. 340
10.3 Subjective Methods for Determining the Communicational Quality	p. 342
10.3.1 General Setup and Opinion Scales Used for Subjective Performance Evaluation	p. 343
10.3.2 Conversation Tests	p. 345
10.3.3 Double Talk Tests	p. 346
10.3.4 Talking and Listening Tests	p. 347
10.3.5 Listening-only Tests (LOT) and Third Party Listening Tests	p. 348
10.3.6 Experts Tests for Assessing Real Life Situations	p. 349
10.4 Test Environment	p. 350
10.4.1 The Acoustical Environment	p. 351
10.4.2 Background Noise Simulation Techniques	p. 351
10.4.3 Positioning of the Hands-Free Terminal	p. 352
10.4.4 Positioning of the Artificial Head	p. 352
10.4.5 Influence of the Transmission System	p. 354
10.5 Test Signals and Analysis Methods	p. 354
10.5.1 Speech and Perceptual Speech Quality Measures	p. 356
10.5.2 Speech-like Test Signals	p. 356
10.5.3 Background Noise	p. 360
10.5.4 Applications	p. 363
10.6 Result Representation	p. 365
10.6.1 Interpretation of HFT "Quality Pies"	p. 366
10.6.2 Examples	p. 368
10.7 Related Aspects	p. 368
10.7.1 The Lombard Effect	p. 368
10.7.2 Intelligibility Outside Vehicles	p. 372
References	p. 375
Part IV Multi-Channel Processing
11 Correlation-Based TDOA-Estimation for Multiple Sources in Reverberant Environments / J. Scheuing and B. Yang
11.1 Introduction	p. 381
11.2 Analysis of TDOA Ambiguities	p. 383
11.2.1 Signal Model	p. 383
11.2.2 Multipath Ambiguity	p. 384
11.2.3 Multiple Source Ambiguity	p. 384
11.2.4 Ambiguity due to Periodic Signals	p. 386
11.2.5 Principles of TDOA Disambiguation	p. 386
11.3 Estimation of Direct Path TDOAs	p. 390
11.3.1 Correlation and Extremum Positions	p. 390
11.3.2 Raster Matching	p. 392
11.4 Consistent TDOA Graphs	p. 397
11.4.1 TDOA Graph	p. 397
11.4.2 Strategies of Consistency Check	p. 398
11.4.3 Properties of TDOA Graphs	p. 399
11.4.4 Efficient Synthesis Algorithm	p. 402
11.4.5 Initialization and Termination	p. 404
11.4.6 Estimating the Number of Active Sources	p. 405
11.5 Experimental Results	p. 406
11.5.1 Localization System	p. 406
11.5.2 TDOA Estimation of a Single Signal Block	p. 408
11.5.3 Source Position Estimation	p. 412
11.5.4 Evaluation of Continuous Measurements	p. 412
11.6 Summary	p. 414
References	p. 415
12 Microphone Calibration for Multi-Channel Signal Processing / M. Buck, T. Haulick, and H.-J. Pfleiderer
12.1 Introduction	p. 417
12.2 Beamforming with Ideal Microphones	p. 418
12.2.1 Principle of Beamforming	p. 418
12.2.2 Evaluation of Beamformers	p. 421
12.2.3 Statistically Optimum Beamformers	p. 424
12.3 Microphone Mismatch and its Effect on Beamforming	p. 427
12.3.1 Model for Non-Ideal Microphone Characteristics	p. 428
12.3.2 Effect of Microphone Mismatch on Fixed Beamformers	p. 429
12.3.3 Effect of Microphone Mismatch on Adaptive Beamformers	p. 430
12.3.4 Comparison of Fixed and Adaptive Beamformers	p. 432
12.4 Calibration Techniques and their Limits for Real-World Applications	p. 432
