Summary
Mathematical Models for Speech Technology presents the motivations for, the intuitions behind, and the basic mathematical models of natural spoken language communication. A comprehensive overview is given of all aspects of the problem, from the physics of speech production through the hierarchy of linguistic structure, ending with some observations on language and mind.
The author comprehensively explores the argument that modern speech technologies are in fact the most extensive compilations of linguistic knowledge available. Throughout the book, the emphasis is on placing all the material in a mathematically coherent and computationally tractable framework that captures linguistic structure.
It presents material that appears nowhere else and unifies the formalisms and perspectives used by linguists and engineers. Its unique features include:

- a coherent nomenclature that emphasizes the deep connections amongst the diverse mathematical models and explores the methods by which they capture linguistic structure, in contrast to the superficial similarities described in the existing literature;
- the historical background and origins of the theories and models;
- the connections to related disciplines, e.g. artificial intelligence, automata theory, and information theory;
- an elucidation of the current debates and their intellectual origins;
- many important little-known results and some original proofs of fundamental results, e.g. a geometric interpretation of parameter estimation techniques for stochastic models;
- the author's own unique perspectives on the future of the discipline.
There is a vast literature on speech recognition and synthesis; however, this book is unlike any other in the field. Although the field appears to advance rapidly, its fundamentals have not changed in decades. Most results appear in journals, where it is difficult to integrate and evaluate all of the recent ideas. Some of the fundamentals have been collected into textbooks, which give detailed descriptions of the techniques but little motivation or perspective. The linguistic texts are mostly descriptive and pictorial, lacking the mathematical and computational aspects. This book strikes a useful balance by covering a wide range of ideas in a common framework. It provides all the basic algorithms and computational techniques, together with the analysis and perspective needed to read the latest literature intelligently and to understand state-of-the-art techniques as they evolve.
Author Notes
Stephen Levinson is the author of Mathematical Models for Speech Technology, published by Wiley.
Table of Contents
Preface | p. xi |
1 Introduction | p. 1 |
1.1 Milestones in the history of speech technology | p. 1 |
1.2 Prospects for the future | p. 3 |
1.3 Technical synopsis | p. 4 |
2 Preliminaries | p. 9 |
2.1 The physics of speech production | p. 9 |
2.1.1 The human vocal apparatus | p. 9 |
2.1.2 Boundary conditions | p. 14 |
2.1.3 Non-stationarity | p. 16 |
2.1.4 Fluid dynamical effects | p. 16 |
2.2 The source-filter model | p. 17 |
2.3 Information-bearing features of the speech signal | p. 17 |
2.3.1 Fourier methods | p. 19 |
2.3.2 Linear prediction and the Webster equation | p. 21 |
2.4 Time-frequency representations | p. 23 |
2.5 Classification of acoustic patterns in speech | p. 27 |
2.5.1 Statistical decision theory | p. 28 |
2.5.2 Estimation of class-conditional probability density functions | p. 30 |
2.5.3 Information-preserving transformations | p. 39 |
2.5.4 Unsupervised density estimation - quantization | p. 42 |
2.5.5 A note on connectionism | p. 43 |
2.6 Temporal invariance and stationarity | p. 44 |
2.6.1 A variational problem | p. 45 |
2.6.2 A solution by dynamic programming | p. 47 |
2.7 Taxonomy of linguistic structure | p. 51 |
2.7.1 Acoustic phonetics, phonology, and phonotactics | p. 52 |
2.7.2 Morphology and lexical structure | p. 55 |
2.7.3 Prosody, syntax, and semantics | p. 55 |
2.7.4 Pragmatics and dialog | p. 56 |
3 Mathematical models of linguistic structure | p. 57 |
3.1 Probabilistic functions of a discrete Markov process | p. 57 |
3.1.1 The discrete observation hidden Markov model | p. 57 |
3.1.2 The continuous observation case | p. 80 |
3.1.3 The autoregressive observation case | p. 87 |
3.1.4 The semi-Markov process and correlated observations | p. 88 |
3.1.5 The non-stationary observation case | p. 99 |
3.1.6 Parameter estimation via the EM algorithm | p. 107 |
3.1.7 The Cave-Neuwirth and Poritz results | p. 107 |
3.2 Formal grammars and abstract automata | p. 109 |
3.2.1 The Chomsky hierarchy | p. 110 |
3.2.2 Stochastic grammars | p. 113 |
3.2.3 Equivalence of regular stochastic grammars and discrete HMMs | p. 114 |
3.2.4 Recognition of well-formed strings | p. 115 |
3.2.5 Representation of phonology and syntax | p. 116 |
4 Syntactic analysis | p. 119 |
4.1 Deterministic parsing algorithms | p. 119 |
4.1.1 The Dijkstra algorithm for regular languages | p. 119 |
4.1.2 The Cocke-Kasami-Younger algorithm for context-free languages | p. 121 |
4.2 Probabilistic parsing algorithms | p. 122 |
4.2.1 Using the Baum algorithm to parse regular languages | p. 122 |
4.2.2 Dynamic programming methods | p. 123 |
4.2.3 Probabilistic Cocke-Kasami-Younger methods | p. 130 |
4.2.4 Asynchronous methods | p. 130 |
4.3 Parsing natural language | p. 131 |
4.3.1 The right-linear case | p. 132 |
4.3.2 The Markovian case | p. 133 |
4.3.3 The context-free case | p. 133 |
5 Grammatical inference | p. 137 |
5.1 Exact inference and Gold's theorem | p. 137 |
5.2 Baum's algorithm for regular grammars | p. 137 |
5.3 Event counting in parse trees | p. 139 |
5.4 Baker's algorithm for context-free grammars | p. 140 |
6 Information-theoretic analysis of speech communication | p. 143 |
6.1 The Miller et al. experiments | p. 143 |
6.2 Entropy of an information source | p. 143 |
6.2.1 Entropy of deterministic formal languages | p. 144 |
6.2.2 Entropy of languages generated by stochastic grammars | p. 150 |
6.2.3 Epsilon representations of deterministic languages | p. 153 |
6.3 Recognition error rates and entropy | p. 153 |
6.3.1 Analytic results derived from the Fano bound | p. 154 |
6.3.2 Experimental results | p. 156 |
7 Automatic speech recognition and constructive theories of language | p. 157 |
7.1 Integrated architectures | p. 157 |
7.2 Modular architectures | p. 161 |
7.2.1 Acoustic-phonetic transcription | p. 161 |
7.2.2 Lexical access | p. 162 |
7.2.3 Syntax analysis | p. 165 |
7.3 Parameter estimation from fluent speech | p. 166 |
7.3.1 Use of the Baum algorithm | p. 166 |
7.3.2 The role of text analysis | p. 167 |
7.4 System performance | p. 168 |
7.5 Other speech technologies | p. 169 |
7.5.1 Articulatory speech synthesis | p. 169 |
7.5.2 Very low-bandwidth speech coding | p. 170 |
7.5.3 Automatic language identification | p. 170 |
7.5.4 Automatic language translation | p. 171 |
8 Automatic speech understanding and semantics | p. 173 |
8.1 Transcription and comprehension | p. 173 |
8.2 Limited domain semantics | p. 174 |
8.2.1 A semantic interpreter | p. 175 |
8.2.2 Error recovery | p. 182 |
8.3 The semantics of natural language | p. 189 |
8.3.1 Shallow semantics and mutual information | p. 189 |
8.3.2 Graphical methods | p. 190 |
8.3.3 Formal logical models of semantics | p. 190 |
8.3.4 Relationship between syntax and semantics | p. 194 |
8.4 System architectures | p. 195 |
8.5 Human and machine performance | p. 197 |
9 Theories of mind and language | p. 199 |
9.1 The challenge of automatic natural language understanding | p. 199 |
9.2 Metaphors for mind | p. 199 |
9.2.1 Wiener's cybernetics and the diachronic history | p. 201 |
9.2.2 The crisis in the foundations of mathematics | p. 205 |
9.2.3 Turing's universal machine | p. 210 |
9.2.4 The Church-Turing hypothesis | p. 212 |
9.3 The artificial intelligence program | p. 213 |
9.3.1 Functional equivalence and the strong theory of AI | p. 213 |
9.3.2 The broken promise | p. 214 |
9.3.3 Schorske's causes of cultural decline | p. 214 |
9.3.4 The ahistorical blind alley | p. 215 |
9.3.5 Observation, introspection and divine inspiration | p. 215 |
9.3.6 Resurrecting the program by unifying the synchronic and diachronic | p. 216 |
10 A speculation on the prospects for a science of mind | p. 219 |
10.1 The parable of the thermos bottle: measurements and symbols | p. 219 |
10.2 The four questions of science | p. 220 |
10.2.1 Reductionism and emergence | p. 220 |
10.2.2 From early intuition to quantitative reasoning | p. 221 |
10.2.3 Objections to mathematical realism | p. 223 |
10.2.4 The objection from the diversity of the sciences | p. 224 |
10.2.5 The objection from Cartesian duality | p. 225 |
10.2.6 The objection from either free will or determinism | p. 225 |
10.2.7 The postmodern objection | p. 226 |
10.2.8 Beginning the new science | p. 227 |
10.3 A constructive theory of mind | p. 228 |
10.3.1 Reinterpreting the strong theory of AI | p. 228 |
10.3.2 Generalizing the Turing test | p. 228 |
10.4 The problem of consciousness | p. 229 |
10.5 The role of sensorimotor function, associative memory and reinforcement learning in automatic acquisition of spoken language by an autonomous robot | p. 230 |
10.5.1 Embodied mind from integrated sensorimotor function | p. 231 |
10.5.2 Associative memory as the basis for thought | p. 231 |
10.5.3 Reinforcement learning via interaction with physical reality | p. 232 |
10.5.4 Semantics as sensorimotor memory | p. 234 |
10.5.5 The primacy of semantics in linguistic structure | p. 234 |
10.5.6 Thought as linguistic manipulation of mental representations of reality | p. 235 |
10.5.7 Illy the autonomous robot | p. 235 |
10.5.8 Software | p. 237 |
10.5.9 Associative memory architecture | p. 238 |
10.5.10 Performance | p. 238 |
10.5.11 Obstacles to the program | p. 239 |
10.6 Final thoughts: predicting the course of discovery | p. 241 |
Bibliography | p. 243 |
Index | p. 257 |