Title:
Mathematical models for speech technology
Personal Author:
Levinson, Stephen E.
Publication Information:
Chichester, West Sussex, England : John Wiley, 2005
ISBN:
9780470844076

Available:

Item Barcode: 30000010074515
Call Number: TK7882.S65 L484 2005
Material Type: Open Access Book
Item Category 1: Book
Status: On Order

Summary

Mathematical Models for Speech Technology presents the motivations for, intuitions behind, and basic mathematical models of natural spoken language communication. It gives a comprehensive overview of all aspects of the problem, from the physics of speech production through the hierarchy of linguistic structure to some closing observations on language and mind.

The author explores in depth the argument that these modern speech technologies are in fact the most extensive compilations of linguistic knowledge available. Throughout the book, the emphasis is on placing all of the material in a mathematically coherent and computationally tractable framework that captures linguistic structure.

It presents material that appears nowhere else and unifies the formalisms and perspectives used by linguists and engineers. Its unique features include:

- a coherent nomenclature that emphasizes the deep connections amongst the diverse mathematical models and the ways in which they capture linguistic structure, in contrast to the superficial similarities described in the existing literature;
- the historical background and origins of the theories and models;
- the connections to related disciplines, e.g. artificial intelligence, automata theory, and information theory;
- an elucidation of the current debates and their intellectual origins;
- many important little-known results and some original proofs of fundamental results, e.g. a geometric interpretation of parameter estimation techniques for stochastic models;
- the author's own perspectives on the future of the discipline.

There is a vast literature on speech recognition and synthesis; however, this book is unlike any other in the field. Although the field appears to be advancing rapidly, the fundamentals have not changed in decades. Most results appear in journal articles, from which it is difficult to integrate and evaluate recent ideas. Some of the fundamentals have been collected into textbooks, which give detailed descriptions of the techniques but little motivation or perspective. The linguistic texts are mostly descriptive and pictorial, lacking the mathematical and computational aspects. This book strikes a useful balance by covering a wide range of ideas in a common framework. It provides the basic algorithms and computational techniques together with analysis and perspective, allowing the reader to follow the latest literature intelligently and to understand state-of-the-art techniques as they evolve.


Author Notes

Stephen Levinson is the author of Mathematical Models for Speech Technology, published by Wiley.


Table of Contents

Preface  p. xi
1 Introduction  p. 1
1.1 Milestones in the history of speech technology  p. 1
1.2 Prospects for the future  p. 3
1.3 Technical synopsis  p. 4
2 Preliminaries  p. 9
2.1 The physics of speech production  p. 9
2.1.1 The human vocal apparatus  p. 9
2.1.2 Boundary conditions  p. 14
2.1.3 Non-stationarity  p. 16
2.1.4 Fluid dynamical effects  p. 16
2.2 The source-filter model  p. 17
2.3 Information-bearing features of the speech signal  p. 17
2.3.1 Fourier methods  p. 19
2.3.2 Linear prediction and the Webster equation  p. 21
2.4 Time-frequency representations  p. 23
2.5 Classification of acoustic patterns in speech  p. 27
2.5.1 Statistical decision theory  p. 28
2.5.2 Estimation of class-conditional probability density functions  p. 30
2.5.3 Information-preserving transformations  p. 39
2.5.4 Unsupervised density estimation - quantization  p. 42
2.5.5 A note on connectionism  p. 43
2.6 Temporal invariance and stationarity  p. 44
2.6.1 A variational problem  p. 45
2.6.2 A solution by dynamic programming  p. 47
2.7 Taxonomy of linguistic structure  p. 51
2.7.1 Acoustic phonetics, phonology, and phonotactics  p. 52
2.7.2 Morphology and lexical structure  p. 55
2.7.3 Prosody, syntax, and semantics  p. 55
2.7.4 Pragmatics and dialog  p. 56
3 Mathematical models of linguistic structure  p. 57
3.1 Probabilistic functions of a discrete Markov process  p. 57
3.1.1 The discrete observation hidden Markov model  p. 57
3.1.2 The continuous observation case  p. 80
3.1.3 The autoregressive observation case  p. 87
3.1.4 The semi-Markov process and correlated observations  p. 88
3.1.5 The non-stationary observation case  p. 99
3.1.6 Parameter estimation via the EM algorithm  p. 107
3.1.7 The Cave-Neuwirth and Poritz results  p. 107
3.2 Formal grammars and abstract automata  p. 109
3.2.1 The Chomsky hierarchy  p. 110
3.2.2 Stochastic grammars  p. 113
3.2.3 Equivalence of regular stochastic grammars and discrete HMMs  p. 114
3.2.4 Recognition of well-formed strings  p. 115
3.2.5 Representation of phonology and syntax  p. 116
4 Syntactic analysis  p. 119
4.1 Deterministic parsing algorithms  p. 119
4.1.1 The Dijkstra algorithm for regular languages  p. 119
4.1.2 The Cocke-Kasami-Younger algorithm for context-free languages  p. 121
4.2 Probabilistic parsing algorithms  p. 122
4.2.1 Using the Baum algorithm to parse regular languages  p. 122
4.2.2 Dynamic programming methods  p. 123
4.2.3 Probabilistic Cocke-Kasami-Younger methods  p. 130
4.2.4 Asynchronous methods  p. 130
4.3 Parsing natural language  p. 131
4.3.1 The right-linear case  p. 132
4.3.2 The Markovian case  p. 133
4.3.3 The context-free case  p. 133
5 Grammatical inference  p. 137
5.1 Exact inference and Gold's theorem  p. 137
5.2 Baum's algorithm for regular grammars  p. 137
5.3 Event counting in parse trees  p. 139
5.4 Baker's algorithm for context-free grammars  p. 140
6 Information-theoretic analysis of speech communication  p. 143
6.1 The Miller et al. experiments  p. 143
6.2 Entropy of an information source  p. 143
6.2.1 Entropy of deterministic formal languages  p. 144
6.2.2 Entropy of languages generated by stochastic grammars  p. 150
6.2.3 Epsilon representations of deterministic languages  p. 153
6.3 Recognition error rates and entropy  p. 153
6.3.1 Analytic results derived from the Fano bound  p. 154
6.3.2 Experimental results  p. 156
7 Automatic speech recognition and constructive theories of language  p. 157
7.1 Integrated architectures  p. 157
7.2 Modular architectures  p. 161
7.2.1 Acoustic-phonetic transcription  p. 161
7.2.2 Lexical access  p. 162
7.2.3 Syntax analysis  p. 165
7.3 Parameter estimation from fluent speech  p. 166
7.3.1 Use of the Baum algorithm  p. 166
7.3.2 The role of text analysis  p. 167
7.4 System performance  p. 168
7.5 Other speech technologies  p. 169
7.5.1 Articulatory speech synthesis  p. 169
7.5.2 Very low-bandwidth speech coding  p. 170
7.5.3 Automatic language identification  p. 170
7.5.4 Automatic language translation  p. 171
8 Automatic speech understanding and semantics  p. 173
8.1 Transcription and comprehension  p. 173
8.2 Limited domain semantics  p. 174
8.2.1 A semantic interpreter  p. 175
8.2.2 Error recovery  p. 182
8.3 The semantics of natural language  p. 189
8.3.1 Shallow semantics and mutual information  p. 189
8.3.2 Graphical methods  p. 190
8.3.3 Formal logical models of semantics  p. 190
8.3.4 Relationship between syntax and semantics  p. 194
8.4 System architectures  p. 195
8.5 Human and machine performance  p. 197
9 Theories of mind and language  p. 199
9.1 The challenge of automatic natural language understanding  p. 199
9.2 Metaphors for mind  p. 199
9.2.1 Wiener's cybernetics and the diachronic history  p. 201
9.2.2 The crisis in the foundations of mathematics  p. 205
9.2.3 Turing's universal machine  p. 210
9.2.4 The Church-Turing hypothesis  p. 212
9.3 The artificial intelligence program  p. 213
9.3.1 Functional equivalence and the strong theory of AI  p. 213
9.3.2 The broken promise  p. 214
9.3.3 Schorske's causes of cultural decline  p. 214
9.3.4 The ahistorical blind alley  p. 215
9.3.5 Observation, introspection and divine inspiration  p. 215
9.3.6 Resurrecting the program by unifying the synchronic and diachronic  p. 216
10 A speculation on the prospects for a science of mind  p. 219
10.1 The parable of the thermos bottle: measurements and symbols  p. 219
10.2 The four questions of science  p. 220
10.2.1 Reductionism and emergence  p. 220
10.2.2 From early intuition to quantitative reasoning  p. 221
10.2.3 Objections to mathematical realism  p. 223
10.2.4 The objection from the diversity of the sciences  p. 224
10.2.5 The objection from Cartesian duality  p. 225
10.2.6 The objection from either free will or determinism  p. 225
10.2.7 The postmodern objection  p. 226
10.2.8 Beginning the new science  p. 227
10.3 A constructive theory of mind  p. 228
10.3.1 Reinterpreting the strong theory of AI  p. 228
10.3.2 Generalizing the Turing test  p. 228
10.4 The problem of consciousness  p. 229
10.5 The role of sensorimotor function, associative memory and reinforcement learning in automatic acquisition of spoken language by an autonomous robot  p. 230
10.5.1 Embodied mind from integrated sensorimotor function  p. 231
10.5.2 Associative memory as the basis for thought  p. 231
10.5.3 Reinforcement learning via interaction with physical reality  p. 232
10.5.4 Semantics as sensorimotor memory  p. 234
10.5.5 The primacy of semantics in linguistic structure  p. 234
10.5.6 Thought as linguistic manipulation of mental representations of reality  p. 235
10.5.7 Illy the autonomous robot  p. 235
10.5.8 Software  p. 237
10.5.9 Associative memory architecture  p. 238
10.5.10 Performance  p. 238
10.5.11 Obstacles to the program  p. 239
10.6 Final thoughts: predicting the course of discovery  p. 241
Bibliography  p. 243
Index  p. 257