Title:
Mathematical models for speech technology
Personal Author:
Levinson, Stephen E.
Publication Information:
Chichester, West Sussex, England : John Wiley, 2005
ISBN:
9780470844076

Available:

Item Barcode: 30000010074515
Call Number: TK7882.S65 L484 2005
Material Type: Open Access Book
Item Category 1: Book
Status: On Order

Summary

Mathematical Models for Speech Technology presents the motivations for, intuitions behind, and basic mathematical models of natural spoken language communication. It gives a comprehensive overview of all aspects of the problem, from the physics of speech production through the hierarchy of linguistic structure to some closing observations on language and mind.

The author explores in depth the argument that these modern speech technologies are in fact the most extensive compilations of linguistic knowledge available. Throughout the book, the emphasis is on placing all of the material in a mathematically coherent and computationally tractable framework that captures linguistic structure.

It presents material that appears nowhere else and unifies the formalisms and perspectives used by linguists and engineers. Its unique features include:

- a coherent nomenclature that emphasizes the deep connections amongst the diverse mathematical models and the ways in which they capture linguistic structure, in contrast to the superficial similarities described in the existing literature;
- the historical background and origins of the theories and models;
- the connections to related disciplines, e.g. artificial intelligence, automata theory, and information theory;
- an elucidation of the current debates and their intellectual origins;
- many important little-known results and some original proofs of fundamental results, e.g. a geometric interpretation of parameter estimation techniques for stochastic models;
- the author's own perspectives on the future of the discipline.

There is a vast literature on speech recognition and synthesis; however, this book is unlike any other in the field. Although the field appears to be advancing rapidly, the fundamentals have not changed in decades. Most results appear in journal articles, from which it is difficult to integrate and evaluate recent ideas. Some of the fundamentals have been collected into textbooks, which give detailed descriptions of the techniques but little motivation or perspective. The linguistic texts are mostly descriptive and pictorial, lacking the mathematical and computational aspects. This book strikes a useful balance by covering a wide range of ideas in a common framework. It provides the basic algorithms and computational techniques together with analysis and perspective, allowing the reader to follow the latest literature intelligently and to understand state-of-the-art techniques as they evolve.


Author Notes

Stephen Levinson is the author of Mathematical Models for Speech Technology, published by Wiley.


Table of Contents

Preface  p. xi
1 Introduction  p. 1
1.1 Milestones in the history of speech technology  p. 1
1.2 Prospects for the future  p. 3
1.3 Technical synopsis  p. 4
2 Preliminaries  p. 9
2.1 The physics of speech production  p. 9
2.1.1 The human vocal apparatus  p. 9
2.1.2 Boundary conditions  p. 14
2.1.3 Non-stationarity  p. 16
2.1.4 Fluid dynamical effects  p. 16
2.2 The source-filter model  p. 17
2.3 Information-bearing features of the speech signal  p. 17
2.3.1 Fourier methods  p. 19
2.3.2 Linear prediction and the Webster equation  p. 21
2.4 Time-frequency representations  p. 23
2.5 Classification of acoustic patterns in speech  p. 27
2.5.1 Statistical decision theory  p. 28
2.5.2 Estimation of class-conditional probability density functions  p. 30
2.5.3 Information-preserving transformations  p. 39
2.5.4 Unsupervised density estimation - quantization  p. 42
2.5.5 A note on connectionism  p. 43
2.6 Temporal invariance and stationarity  p. 44
2.6.1 A variational problem  p. 45
2.6.2 A solution by dynamic programming  p. 47
2.7 Taxonomy of linguistic structure  p. 51
2.7.1 Acoustic phonetics, phonology, and phonotactics  p. 52
2.7.2 Morphology and lexical structure  p. 55
2.7.3 Prosody, syntax, and semantics  p. 55
2.7.4 Pragmatics and dialog  p. 56
3 Mathematical models of linguistic structure  p. 57
3.1 Probabilistic functions of a discrete Markov process  p. 57
3.1.1 The discrete observation hidden Markov model  p. 57
3.1.2 The continuous observation case  p. 80
3.1.3 The autoregressive observation case  p. 87
3.1.4 The semi-Markov process and correlated observations  p. 88
3.1.5 The non-stationary observation case  p. 99
3.1.6 Parameter estimation via the EM algorithm  p. 107
3.1.7 The Cave-Neuwirth and Poritz results  p. 107
3.2 Formal grammars and abstract automata  p. 109
3.2.1 The Chomsky hierarchy  p. 110
3.2.2 Stochastic grammars  p. 113
3.2.3 Equivalence of regular stochastic grammars and discrete HMMs  p. 114
3.2.4 Recognition of well-formed strings  p. 115
3.2.5 Representation of phonology and syntax  p. 116
4 Syntactic analysis  p. 119
4.1 Deterministic parsing algorithms  p. 119
4.1.1 The Dijkstra algorithm for regular languages  p. 119
4.1.2 The Cocke-Kasami-Younger algorithm for context-free languages  p. 121
4.2 Probabilistic parsing algorithms  p. 122
4.2.1 Using the Baum algorithm to parse regular languages  p. 122
4.2.2 Dynamic programming methods  p. 123
4.2.3 Probabilistic Cocke-Kasami-Younger methods  p. 130
4.2.4 Asynchronous methods  p. 130
4.3 Parsing natural language  p. 131
4.3.1 The right-linear case  p. 132
4.3.2 The Markovian case  p. 133
4.3.3 The context-free case  p. 133
5 Grammatical inference  p. 137
5.1 Exact inference and Gold's theorem  p. 137
5.2 Baum's algorithm for regular grammars  p. 137
5.3 Event counting in parse trees  p. 139
5.4 Baker's algorithm for context-free grammars  p. 140
6 Information-theoretic analysis of speech communication  p. 143
6.1 The Miller et al. experiments  p. 143
6.2 Entropy of an information source  p. 143
6.2.1 Entropy of deterministic formal languages  p. 144
6.2.2 Entropy of languages generated by stochastic grammars  p. 150
6.2.3 Epsilon representations of deterministic languages  p. 153
6.3 Recognition error rates and entropy  p. 153
6.3.1 Analytic results derived from the Fano bound  p. 154
6.3.2 Experimental results  p. 156
7 Automatic speech recognition and constructive theories of language  p. 157
7.1 Integrated architectures  p. 157
7.2 Modular architectures  p. 161
7.2.1 Acoustic-phonetic transcription  p. 161
7.2.2 Lexical access  p. 162
7.2.3 Syntax analysis  p. 165
7.3 Parameter estimation from fluent speech  p. 166
7.3.1 Use of the Baum algorithm  p. 166
7.3.2 The role of text analysis  p. 167
7.4 System performance  p. 168
7.5 Other speech technologies  p. 169
7.5.1 Articulatory speech synthesis  p. 169
7.5.2 Very low-bandwidth speech coding  p. 170
7.5.3 Automatic language identification  p. 170
7.5.4 Automatic language translation  p. 171
8 Automatic speech understanding and semantics  p. 173
8.1 Transcription and comprehension  p. 173
8.2 Limited domain semantics  p. 174
8.2.1 A semantic interpreter  p. 175
8.2.2 Error recovery  p. 182
8.3 The semantics of natural language  p. 189
8.3.1 Shallow semantics and mutual information  p. 189
8.3.2 Graphical methods  p. 190
8.3.3 Formal logical models of semantics  p. 190
8.3.4 Relationship between syntax and semantics  p. 194
8.4 System architectures  p. 195
8.5 Human and machine performance  p. 197
9 Theories of mind and language  p. 199
9.1 The challenge of automatic natural language understanding  p. 199
9.2 Metaphors for mind  p. 199
9.2.1 Wiener's cybernetics and the diachronic history  p. 201
9.2.2 The crisis in the foundations of mathematics  p. 205
9.2.3 Turing's universal machine  p. 210
9.2.4 The Church-Turing hypothesis  p. 212
9.3 The artificial intelligence program  p. 213
9.3.1 Functional equivalence and the strong theory of AI  p. 213
9.3.2 The broken promise  p. 214
9.3.3 Schorske's causes of cultural decline  p. 214
9.3.4 The ahistorical blind alley  p. 215
9.3.5 Observation, introspection and divine inspiration  p. 215
9.3.6 Resurrecting the program by unifying the synchronic and diachronic  p. 216
10 A speculation on the prospects for a science of mind  p. 219
10.1 The parable of the thermos bottle: measurements and symbols  p. 219
10.2 The four questions of science  p. 220
10.2.1 Reductionism and emergence  p. 220
10.2.2 From early intuition to quantitative reasoning  p. 221
10.2.3 Objections to mathematical realism  p. 223
10.2.4 The objection from the diversity of the sciences  p. 224
10.2.5 The objection from Cartesian duality  p. 225
10.2.6 The objection from either free will or determinism  p. 225
10.2.7 The postmodern objection  p. 226
10.2.8 Beginning the new science  p. 227
10.3 A constructive theory of mind  p. 228
10.3.1 Reinterpreting the strong theory of AI  p. 228
10.3.2 Generalizing the Turing test  p. 228
10.4 The problem of consciousness  p. 229
10.5 The role of sensorimotor function, associative memory and reinforcement learning in automatic acquisition of spoken language by an autonomous robot  p. 230
10.5.1 Embodied mind from integrated sensorimotor function  p. 231
10.5.2 Associative memory as the basis for thought  p. 231
10.5.3 Reinforcement learning via interaction with physical reality  p. 232
10.5.4 Semantics as sensorimotor memory  p. 234
10.5.5 The primacy of semantics in linguistic structure  p. 234
10.5.6 Thought as linguistic manipulation of mental representations of reality  p. 235
10.5.7 Illy the autonomous robot  p. 235
10.5.8 Software  p. 237
10.5.9 Associative memory architecture  p. 238
10.5.10 Performance  p. 238
10.5.11 Obstacles to the program  p. 239
10.6 Final thoughts: predicting the course of discovery  p. 241
Bibliography  p. 243
Index  p. 257