Title:
Inference in hidden Markov models
Personal Author:
Cappé, Olivier
Series:
Springer series in statistics
Publication Information:
New York, NY: Springer, 2005
ISBN:
9780387402642
Available:

Item Barcode: 30000010126811
Call Number: QA274.7 C36 2005
Material Type: Open Access Book
Summary

This book is a comprehensive treatment of inference for hidden Markov models, covering both algorithms and statistical theory. Topics range from filtering and smoothing of the hidden Markov chain to parameter estimation, Bayesian methods, and estimation of the number of states. The book treats, in a unified way, both models with finite state spaces and models with continuous state spaces (also called state-space models); the latter require approximate simulation-based algorithms, which are also described in detail. Many examples illustrate the algorithms and the theory, and the book builds on recent developments to present a self-contained view.
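
The filtering problem mentioned above admits an exact recursive solution when the state space is finite (the forward recursion of the book's Chapter 3). As a rough illustration of what "filtering of the hidden Markov chain" computes, here is a minimal sketch in Python/NumPy; the two-state normal model and all of its parameters are invented for the example and do not come from the book.

import numpy as np

def forward_filter(init, trans, lik):
    # Normalized forward (filtering) recursion for a finite-state HMM.
    # init:  (m,) initial state distribution
    # trans: (m, m) transition matrix, trans[i, j] = P(X_{k+1} = j | X_k = i)
    # lik:   (n, m) observation likelihoods, lik[k, i] = density of y_k in state i
    n, m = lik.shape
    filt = np.zeros((n, m))
    loglik = 0.0
    pred = init                       # predictive distribution of the current state
    for k in range(n):
        joint = pred * lik[k]         # unnormalized posterior over states at time k
        c = joint.sum()               # = p(y_k | y_0, ..., y_{k-1})
        filt[k] = joint / c
        loglik += np.log(c)
        pred = filt[k] @ trans        # one-step prediction of the next state
    return filt, loglik

# Toy two-state normal HMM; every number here is invented for the demo.
rng = np.random.default_rng(0)
trans = np.array([[0.95, 0.05], [0.10, 0.90]])
means = np.array([0.0, 3.0])
states = np.zeros(200, dtype=int)
for k in range(1, 200):
    states[k] = rng.choice(2, p=trans[states[k - 1]])
y = rng.normal(means[states], 1.0)
lik = np.exp(-0.5 * (y[:, None] - means[None, :]) ** 2) / np.sqrt(2 * np.pi)
filt, loglik = forward_filter(np.array([0.5, 0.5]), trans, lik)
print(f"log-likelihood: {loglik:.2f}")

Normalizing at each step keeps the recursion numerically stable and yields the log-likelihood as a by-product, which is the quantity the book's parameter estimation chapters build on.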


Author Notes

Olivier Cappé is a researcher at the French National Center for Scientific Research (CNRS).
Éric Moulines is Professor at the École Nationale Supérieure des Télécommunications (ENST), Paris, France.
Tobias Rydén is Professor of Mathematical Statistics at Lund University, Sweden.


Excerpts

"Hidden Markov models have become a widely used class of statistical models with applications in diverse areas such as communications engineering, bioinformatics, finance and many more. This book is a comprehensive treatment of inference for hidden Markov models, including both algorithms and statistical theory. Topics range from filtering and smoothing of the hidden Markov chain to parameter estimation, Bayesian methods and estimation of the number of states." "In a unified way the book covers both models with finite state spaces, which allow for exact algorithms for filtering, estimation etc., and models with continuous state spaces (also called state-space models) requiring approximate simulation-based algorithms that are also described in detail." "This volume will suit anybody with an interest in inference for stochastic processes, and it will be useful for researchers and practitioners in areas such as statistics, signal processing, communications engineering, control theory, econometrics, finance and more. The algorithmic parts of the book do not require an advanced mathematical background, while the more theoretical parts require knowledge of probability theory at the measure-theoretical level."--BOOK JACKET.

Table of Contents

Preface p. V
Contributors p. IX
1 Introduction p. 1
1.1 What Is a Hidden Markov Model? p. 1
1.2 Beyond Hidden Markov Models p. 4
1.3 Examples p. 6
1.3.1 Finite Hidden Markov Models p. 6
1.3.2 Normal Hidden Markov Models p. 13
1.3.3 Gaussian Linear State-Space Models p. 15
1.3.4 Conditionally Gaussian Linear State-Space Models p. 17
1.3.5 General (Continuous) State-Space HMMs p. 24
1.3.6 Switching Processes with Markov Regime p. 29
1.4 Left-to-Right and Ergodic Hidden Markov Models p. 33
2 Main Definitions and Notations p. 35
2.1 Markov Chains p. 35
2.1.1 Transition Kernels p. 35
2.1.2 Homogeneous Markov Chains p. 37
2.1.3 Non-homogeneous Markov Chains p. 40
2.2 Hidden Markov Models p. 42
2.2.1 Definitions and Notations p. 42
2.2.2 Conditional Independence in Hidden Markov Models p. 44
2.2.3 Hierarchical Hidden Markov Models p. 46
Part I State Inference
3 Filtering and Smoothing Recursions p. 51
3.1 Basic Notations and Definitions p. 53
3.1.1 Likelihood p. 53
3.1.2 Smoothing p. 54
3.1.3 The Forward-Backward Decomposition p. 56
3.1.4 Implicit Conditioning (Please Read This Section!) p. 58
3.2 Forward-Backward p. 59
3.2.1 The Forward-Backward Recursions p. 59
3.2.2 Filtering and Normalized Recursion p. 61
3.3 Markovian Decompositions p. 66
3.3.1 Forward Decomposition p. 66
3.3.2 Backward Decomposition p. 70
3.4 Complements p. 74
4 Advanced Topics in Smoothing p. 77
4.1 Recursive Computation of Smoothed Functionals p. 77
4.1.1 Fixed Point Smoothing p. 78
4.1.2 Recursive Smoothers for General Functionals p. 79
4.1.3 Comparison with Forward-Backward Smoothing p. 82
4.2 Filtering and Smoothing in More General Models p. 85
4.2.1 Smoothing in Markov-switching Models p. 86
4.2.2 Smoothing in Partially Observed Markov Chains p. 86
4.2.3 Marginal Smoothing in Hierarchical HMMs p. 87
4.3 Forgetting of the Initial Condition p. 89
4.3.1 Total Variation p. 90
4.3.2 Lipschitz Contraction for Transition Kernels p. 95
4.3.3 The Doeblin Condition and Uniform Ergodicity p. 97
4.3.4 Forgetting Properties p. 100
4.3.5 Uniform Forgetting Under Strong Mixing Conditions p. 105
4.3.6 Forgetting Under Alternative Conditions p. 110
5 Applications of Smoothing p. 121
5.1 Models with Finite State Space p. 121
5.1.1 Smoothing p. 122
5.1.2 Maximum a Posteriori Sequence Estimation p. 125
5.2 Gaussian Linear State-Space Models p. 127
5.2.1 Filtering and Backward Markovian Smoothing p. 127
5.2.2 Linear Prediction Interpretation p. 131
5.2.3 The Prediction and Filtering Recursions Revisited p. 137
5.2.4 Disturbance Smoothing p. 143
5.2.5 The Backward Recursion and the Two-Filter Formula p. 148
5.2.6 Application to Marginal Filtering and Smoothing in CGLSSMs p. 155
6 Monte Carlo Methods p. 161
6.1 Basic Monte Carlo Methods p. 161
6.1.1 Monte Carlo Integration p. 162
6.1.2 Monte Carlo Simulation for HMM State Inference p. 163
6.2 A Markov Chain Monte Carlo Primer p. 166
6.2.1 The Accept-Reject Algorithm p. 166
6.2.2 Markov Chain Monte Carlo p. 170
6.2.3 Metropolis-Hastings p. 171
6.2.4 Hybrid Algorithms p. 179
6.2.5 Gibbs Sampling p. 180
6.2.6 Stopping an MCMC Algorithm p. 185
6.3 Applications to Hidden Markov Models p. 186
6.3.1 Generic Sampling Strategies p. 186
6.3.2 Gibbs Sampling in CGLSSMs p. 194
7 Sequential Monte Carlo Methods p. 209
7.1 Importance Sampling and Resampling p. 210
7.1.1 Importance Sampling p. 210
7.1.2 Sampling Importance Resampling p. 211
7.2 Sequential Importance Sampling p. 214
7.2.1 Sequential Implementation for HMMs p. 214
7.2.2 Choice of the Instrumental Kernel p. 218
7.3 Sequential Importance Sampling with Resampling p. 231
7.3.1 Weight Degeneracy p. 231
7.3.2 Resampling p. 236
7.4 Complements p. 242
7.4.1 Implementation of Multinomial Resampling p. 242
7.4.2 Alternatives to Multinomial Resampling p. 244
8 Advanced Topics in Sequential Monte Carlo p. 251
8.1 Alternatives to SISR p. 251
8.1.1 I.I.D. Sampling p. 253
8.1.2 Two-Stage Sampling p. 256
8.1.3 Interpretation with Auxiliary Variables p. 260
8.1.4 Auxiliary Accept-Reject Sampling p. 261
8.1.5 Markov Chain Monte Carlo Auxiliary Sampling p. 263
8.2 Sequential Monte Carlo in Hierarchical HMMs p. 264
8.2.1 Sequential Importance Sampling and Global Sampling p. 265
8.2.2 Optimal Sampling p. 267
8.2.3 Application to CGLSSMs p. 274
8.3 Particle Approximation of Smoothing Functionals p. 278
9 Analysis of Sequential Monte Carlo Methods p. 287
9.1 Importance Sampling p. 287
9.1.1 Unnormalized Importance Sampling p. 287
9.1.2 Deviation Inequalities p. 291
9.1.3 Self-normalized Importance Sampling Estimator p. 293
9.2 Sampling Importance Resampling p. 295
9.2.1 The Algorithm p. 295
9.2.2 Definitions and Notations p. 297
9.2.3 Weighting and Resampling p. 300
9.2.4 Application to the Single-Stage SIR Algorithm p. 307
9.3 Single-Step Analysis of SMC Methods p. 311
9.3.1 Mutation Step p. 311
9.3.2 Description of Algorithms p. 315
9.3.3 Analysis of the Mutation/Selection Algorithm p. 319
9.3.4 Analysis of the Selection/Mutation Algorithm p. 320
9.4 Sequential Monte Carlo Methods p. 321
9.4.1 SISR p. 321
9.4.2 I.I.D. Sampling p. 324
9.5 Complements p. 333
9.5.1 Weak Limit Theorems for Triangular Arrays p. 333
9.5.2 Bibliographic Notes p. 342
Part II Parameter Inference
10 Maximum Likelihood Inference, Part I: Optimization Through Exact Smoothing p. 347
10.1 Likelihood Optimization in Incomplete Data Models p. 347
10.1.1 Problem Statement and Notations p. 348
10.1.2 The Expectation-Maximization Algorithm p. 349
10.1.3 Gradient-based Methods p. 353
10.1.4 Pros and Cons of Gradient-based Methods p. 358
10.2 Application to HMMs p. 359
10.2.1 Hidden Markov Models as Missing Data Models p. 359
10.2.2 EM in HMMs p. 360
10.2.3 Computing Derivatives p. 362
10.2.4 Connection with the Sensitivity Equation Approach p. 364
10.3 The Example of Normal Hidden Markov Models p. 367
10.3.1 EM Parameter Update Formulas p. 367
10.3.2 Estimation of the Initial Distribution p. 370
10.3.3 Recursive Implementation of E-Step p. 371
10.3.4 Computation of the Score and Observed Information p. 374
10.4 The Example of Gaussian Linear State-Space Models p. 384
10.4.1 The Intermediate Quantity of EM p. 385
10.4.2 Recursive Implementation p. 387
10.5 Complements p. 389
10.5.1 Global Convergence of the EM Algorithm p. 389
10.5.2 Rate of Convergence of EM p. 392
10.5.3 Generalized EM Algorithms p. 393
10.5.4 Bibliographic Notes p. 394
11 Maximum Likelihood Inference, Part II: Monte Carlo Optimization p. 397
11.1 Methods and Algorithms p. 398
11.1.1 Monte Carlo EM p. 398
11.1.2 Simulation Schedules p. 403
11.1.3 Gradient-based Algorithms p. 408
11.1.4 Interlude: Stochastic Approximation and the Robbins-Monro Approach p. 411
11.1.5 Stochastic Gradient Algorithms p. 412
11.1.6 Stochastic Approximation EM p. 414
11.1.7 Stochastic EM p. 416
11.2 Analysis of the MCEM Algorithm p. 419
11.2.1 Convergence of Perturbed Dynamical Systems p. 420
11.2.2 Convergence of the MCEM Algorithm p. 423
11.2.3 Rate of Convergence of MCEM p. 426
11.3 Analysis of Stochastic Approximation Algorithms p. 429
11.3.1 Basic Results for Stochastic Approximation Algorithms p. 429
11.3.2 Convergence of the Stochastic Gradient Algorithm p. 431
11.3.3 Rate of Convergence of the Stochastic Gradient Algorithm p. 432
11.3.4 Convergence of the SAEM Algorithm p. 433
11.4 Complements p. 435
12 Statistical Properties of the Maximum Likelihood Estimator p. 441
12.1 A Primer on MLE Asymptotics p. 442
12.2 Stationary Approximations p. 443
12.3 Consistency p. 446
12.3.1 Construction of the Stationary Conditional Log-likelihood p. 446
12.3.2 The Contrast Function and Its Properties p. 448
12.4 Identifiability p. 450
12.4.1 Equivalence of Parameters p. 451
12.4.2 Identifiability of Mixture Densities p. 454
12.4.3 Application of Mixture Identifiability to Hidden Markov Models p. 455
12.5 Asymptotic Normality of the Score and Convergence of the Observed Information p. 457
12.5.1 The Score Function and Invoking the Fisher Identity p. 457
12.5.2 Construction of the Stationary Conditional Score p. 459
12.5.3 Weak Convergence of the Normalized Score p. 464
12.5.4 Convergence of the Normalized Observed Information p. 465
12.5.5 Asymptotics of the Maximum Likelihood Estimator p. 465
12.6 Applications to Likelihood-based Tests p. 466
12.7 Complements p. 468
13 Fully Bayesian Approaches p. 471
13.1 Parameter Estimation p. 471
13.1.1 Bayesian Inference p. 471
13.1.2 Prior Distributions for HMMs p. 475
13.1.3 Non-identifiability and Label Switching p. 478
13.1.4 MCMC Methods for Bayesian Inference p. 481
13.2 Reversible Jump Methods p. 488
13.2.1 Variable Dimension Models p. 488
13.2.2 Green's Reversible Jump Algorithm p. 490
13.2.3 Alternative Sampler Designs p. 498
13.2.4 Alternatives to Reversible Jump MCMC p. 500
13.3 Multiple Imputation Methods and Maximum a Posteriori p. 501
13.3.1 Simulated Annealing p. 502
13.3.2 The SAME Algorithm p. 503
Part III Background and Complements
14 Elements of Markov Chain Theory p. 513
14.1 Chains on Countable State Spaces p. 513
14.1.1 Irreducibility p. 513
14.1.2 Recurrence and Transience p. 514
14.1.3 Invariant Measures and Stationarity p. 517
14.1.4 Ergodicity p. 519
14.2 Chains on General State Spaces p. 520
14.2.1 Irreducibility p. 521
14.2.2 Recurrence and Transience p. 523
14.2.3 Invariant Measures and Stationarity p. 534
14.2.4 Ergodicity p. 541
14.2.5 Geometric Ergodicity and Foster-Lyapunov Conditions p. 548
14.2.6 Limit Theorems p. 552
14.3 Applications to Hidden Markov Models p. 556
14.3.1 Phi-irreducibility p. 557
14.3.2 Atoms and Small Sets p. 558
14.3.3 Recurrence and Positive Recurrence p. 560
15 An Information-Theoretic Perspective on Order Estimation p. 565
15.1 Model Order Identification: What Is It About? p. 566
15.2 Order Estimation in Perspective p. 567
15.3 Order Estimation and Composite Hypothesis Testing p. 569
15.4 Code-based Identification p. 571
15.4.1 Definitions p. 571
15.4.2 Information Divergence Rates p. 574
15.5 MDL Order Estimators in Bayesian Settings p. 576
15.6 Strongly Consistent Penalized Maximum Likelihood Estimators for HMM Order Estimation p. 577
15.7 Efficiency Issues p. 580
15.7.1 Variations on Stein's Lemma p. 581
15.7.2 Achieving Optimal Error Exponents p. 584
15.8 Consistency of the BIC Estimator in the Markov Order Estimation Problem p. 587
15.8.1 Some Martingale Tools p. 589
15.8.2 The Martingale Approach p. 591
15.8.3 The Union Bound Meets Martingale Inequalities p. 592
15.9 Complements p. 600
Part IV Appendices
A Conditioning p. 605
A.1 Probability and Topology Terminology and Notation p. 605
A.2 Conditional Expectation p. 606
A.3 Conditional Distribution p. 611
A.4 Conditional Independence p. 614
B Linear Prediction p. 617
B.1 Hilbert Spaces p. 617
B.2 The Projection Theorem p. 619
C Notations p. 621
C.1 Mathematical p. 621
C.2 Probability p. 622
C.3 Hidden Markov Models p. 622
C.4 Sequential Monte Carlo p. 624
References p. 625
Index p. 645