### Available

| Item Barcode | Call Number | Material Type |
|---|---|---|
| 30000010126811 | QA274.7 C36 2005 | Open Access Book |


### Summary

This book is a comprehensive treatment of inference for hidden Markov models, including both algorithms and statistical theory. Topics range from filtering and smoothing of the hidden Markov chain to parameter estimation, Bayesian methods and estimation of the number of states. In a unified way the book covers both models with finite state spaces and models with continuous state spaces (also called state-space models) requiring approximate simulation-based algorithms that are also described in detail. Many examples illustrate the algorithms and theory. This book builds on recent developments to present a self-contained view.
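As a toy illustration of the filtering problem the summary mentions, the following is a minimal sketch of the normalized forward recursion for a finite-state HMM. The two-state transition matrix, emission probabilities, and observation sequence are invented for the example and are not taken from the book.

```python
import numpy as np

# Illustrative two-state HMM (all numbers are made up for this sketch).
P = np.array([[0.9, 0.1],      # transition matrix: P[i, j] = P(X_{t+1}=j | X_t=i)
              [0.2, 0.8]])
B = np.array([[0.7, 0.3],      # emission probabilities: B[i, y] = P(Y_t=y | X_t=i)
              [0.1, 0.9]])
pi = np.array([0.5, 0.5])      # initial state distribution

def forward_filter(obs):
    """Normalized forward recursion.

    Returns the filtered probabilities P(X_t | Y_0..Y_t) for each t,
    together with the log-likelihood of the observation sequence.
    """
    alpha = pi * B[:, obs[0]]
    c = alpha.sum()            # normalizing constant = P(Y_0)
    alpha /= c
    loglik = np.log(c)
    filtered = [alpha]
    for y in obs[1:]:
        alpha = (alpha @ P) * B[:, y]   # predict, then correct by the likelihood
        c = alpha.sum()                 # P(Y_t | Y_0..Y_{t-1})
        alpha /= c
        loglik += np.log(c)
        filtered.append(alpha)
    return np.array(filtered), loglik

probs, ll = forward_filter([0, 0, 1, 1, 1])
print(probs[-1])  # filtered state distribution after the last observation
```

Normalizing at each step keeps the recursion numerically stable and yields the log-likelihood as a by-product, which is the quantity maximized in the parameter-estimation chapters.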

### Author Notes

Olivier Cappé is a Researcher at the French National Center for Scientific Research (CNRS).

Éric Moulines is Professor at the École Nationale Supérieure des Télécommunications (ENST), Paris, France.

Tobias Rydén is Professor of Mathematical Statistics at Lund University, Sweden.

### Excerpts

"Hidden Markov models have become a widely used class of statistical models with applications in diverse areas such as communications engineering, bioinformatics, finance and many more. This book is a comprehensive treatment of inference for hidden Markov models, including both algorithms and statistical theory. Topics range from filtering and smoothing of the hidden Markov chain to parameter estimation, Bayesian methods and estimation of the number of states."

"In a unified way the book covers both models with finite state spaces, which allow for exact algorithms for filtering, estimation etc., and models with continuous state spaces (also called state-space models) requiring approximate simulation-based algorithms that are also described in detail."

"This volume will suit anybody with an interest in inference for stochastic processes, and it will be useful for researchers and practitioners in areas such as statistics, signal processing, communications engineering, control theory, econometrics, finance and more. The algorithmic parts of the book do not require an advanced mathematical background, while the more theoretical parts require knowledge of probability theory at the measure-theoretical level." --BOOK JACKET.

### Table of Contents

Preface | p. V |

Contributors | p. IX |

1 Introduction | p. 1 |

1.1 What Is a Hidden Markov Model? | p. 1 |

1.2 Beyond Hidden Markov Models | p. 4 |

1.3 Examples | p. 6 |

1.3.1 Finite Hidden Markov Models | p. 6 |

1.3.2 Normal Hidden Markov Models | p. 13 |

1.3.3 Gaussian Linear State-Space Models | p. 15 |

1.3.4 Conditionally Gaussian Linear State-Space Models | p. 17 |

1.3.5 General (Continuous) State-Space HMMs | p. 24 |

1.3.6 Switching Processes with Markov Regime | p. 29 |

1.4 Left-to-Right and Ergodic Hidden Markov Models | p. 33 |

2 Main Definitions and Notations | p. 35 |

2.1 Markov Chains | p. 35 |

2.1.1 Transition Kernels | p. 35 |

2.1.2 Homogeneous Markov Chains | p. 37 |

2.1.3 Non-homogeneous Markov Chains | p. 40 |

2.2 Hidden Markov Models | p. 42 |

2.2.1 Definitions and Notations | p. 42 |

2.2.2 Conditional Independence in Hidden Markov Models | p. 44 |

2.2.3 Hierarchical Hidden Markov Models | p. 46 |

Part I State Inference | |

3 Filtering and Smoothing Recursions | p. 51 |

3.1 Basic Notations and Definitions | p. 53 |

3.1.1 Likelihood | p. 53 |

3.1.2 Smoothing | p. 54 |

3.1.3 The Forward-Backward Decomposition | p. 56 |

3.1.4 Implicit Conditioning (Please Read This Section!) | p. 58 |

3.2 Forward-Backward | p. 59 |

3.2.1 The Forward-Backward Recursions | p. 59 |

3.2.2 Filtering and Normalized Recursion | p. 61 |

3.3 Markovian Decompositions | p. 66 |

3.3.1 Forward Decomposition | p. 66 |

3.3.2 Backward Decomposition | p. 70 |

3.4 Complements | p. 74 |

4 Advanced Topics in Smoothing | p. 77 |

4.1 Recursive Computation of Smoothed Functionals | p. 77 |

4.1.1 Fixed Point Smoothing | p. 78 |

4.1.2 Recursive Smoothers for General Functionals | p. 79 |

4.1.3 Comparison with Forward-Backward Smoothing | p. 82 |

4.2 Filtering and Smoothing in More General Models | p. 85 |

4.2.1 Smoothing in Markov-switching Models | p. 86 |

4.2.2 Smoothing in Partially Observed Markov Chains | p. 86 |

4.2.3 Marginal Smoothing in Hierarchical HMMs | p. 87 |

4.3 Forgetting of the Initial Condition | p. 89 |

4.3.1 Total Variation | p. 90 |

4.3.2 Lipschitz Contraction for Transition Kernels | p. 95 |

4.3.3 The Doeblin Condition and Uniform Ergodicity | p. 97 |

4.3.4 Forgetting Properties | p. 100 |

4.3.5 Uniform Forgetting Under Strong Mixing Conditions | p. 105 |

4.3.6 Forgetting Under Alternative Conditions | p. 110 |

5 Applications of Smoothing | p. 121 |

5.1 Models with Finite State Space | p. 121 |

5.1.1 Smoothing | p. 122 |

5.1.2 Maximum a Posteriori Sequence Estimation | p. 125 |

5.2 Gaussian Linear State-Space Models | p. 127 |

5.2.1 Filtering and Backward Markovian Smoothing | p. 127 |

5.2.2 Linear Prediction Interpretation | p. 131 |

5.2.3 The Prediction and Filtering Recursions Revisited | p. 137 |

5.2.4 Disturbance Smoothing | p. 143 |

5.2.5 The Backward Recursion and the Two-Filter Formula | p. 148 |

5.2.6 Application to Marginal Filtering and Smoothing in CGLSSMs | p. 155 |

6 Monte Carlo Methods | p. 161 |

6.1 Basic Monte Carlo Methods | p. 161 |

6.1.1 Monte Carlo Integration | p. 162 |

6.1.2 Monte Carlo Simulation for HMM State Inference | p. 163 |

6.2 A Markov Chain Monte Carlo Primer | p. 166 |

6.2.1 The Accept-Reject Algorithm | p. 166 |

6.2.2 Markov Chain Monte Carlo | p. 170 |

6.2.3 Metropolis-Hastings | p. 171 |

6.2.4 Hybrid Algorithms | p. 179 |

6.2.5 Gibbs Sampling | p. 180 |

6.2.6 Stopping an MCMC Algorithm | p. 185 |

6.3 Applications to Hidden Markov Models | p. 186 |

6.3.1 Generic Sampling Strategies | p. 186 |

6.3.2 Gibbs Sampling in CGLSSMs | p. 194 |

7 Sequential Monte Carlo Methods | p. 209 |

7.1 Importance Sampling and Resampling | p. 210 |

7.1.1 Importance Sampling | p. 210 |

7.1.2 Sampling Importance Resampling | p. 211 |

7.2 Sequential Importance Sampling | p. 214 |

7.2.1 Sequential Implementation for HMMs | p. 214 |

7.2.2 Choice of the Instrumental Kernel | p. 218 |

7.3 Sequential Importance Sampling with Resampling | p. 231 |

7.3.1 Weight Degeneracy | p. 231 |

7.3.2 Resampling | p. 236 |

7.4 Complements | p. 242 |

7.4.1 Implementation of Multinomial Resampling | p. 242 |

7.4.2 Alternatives to Multinomial Resampling | p. 244 |

8 Advanced Topics in Sequential Monte Carlo | p. 251 |

8.1 Alternatives to SISR | p. 251 |

8.1.1 I.I.D. Sampling | p. 253 |

8.1.2 Two-Stage Sampling | p. 256 |

8.1.3 Interpretation with Auxiliary Variables | p. 260 |

8.1.4 Auxiliary Accept-Reject Sampling | p. 261 |

8.1.5 Markov Chain Monte Carlo Auxiliary Sampling | p. 263 |

8.2 Sequential Monte Carlo in Hierarchical HMMs | p. 264 |

8.2.1 Sequential Importance Sampling and Global Sampling | p. 265 |

8.2.2 Optimal Sampling | p. 267 |

8.2.3 Application to CGLSSMs | p. 274 |

8.3 Particle Approximation of Smoothing Functionals | p. 278 |

9 Analysis of Sequential Monte Carlo Methods | p. 287 |

9.1 Importance Sampling | p. 287 |

9.1.1 Unnormalized Importance Sampling | p. 287 |

9.1.2 Deviation Inequalities | p. 291 |

9.1.3 Self-normalized Importance Sampling Estimator | p. 293 |

9.2 Sampling Importance Resampling | p. 295 |

9.2.1 The Algorithm | p. 295 |

9.2.2 Definitions and Notations | p. 297 |

9.2.3 Weighting and Resampling | p. 300 |

9.2.4 Application to the Single-Stage SIR Algorithm | p. 307 |

9.3 Single-Step Analysis of SMC Methods | p. 311 |

9.3.1 Mutation Step | p. 311 |

9.3.2 Description of Algorithms | p. 315 |

9.3.3 Analysis of the Mutation/Selection Algorithm | p. 319 |

9.3.4 Analysis of the Selection/Mutation Algorithm | p. 320 |

9.4 Sequential Monte Carlo Methods | p. 321 |

9.4.1 SISR | p. 321 |

9.4.2 I.I.D. Sampling | p. 324 |

9.5 Complements | p. 333 |

9.5.1 Weak Limit Theorems for Triangular Arrays | p. 333 |

9.5.2 Bibliographic Notes | p. 342 |

Part II Parameter Inference | |

10 Maximum Likelihood Inference, Part I: Optimization Through Exact Smoothing | p. 347 |

10.1 Likelihood Optimization in Incomplete Data Models | p. 347 |

10.1.1 Problem Statement and Notations | p. 348 |

10.1.2 The Expectation-Maximization Algorithm | p. 349 |

10.1.3 Gradient-based Methods | p. 353 |

10.1.4 Pros and Cons of Gradient-based Methods | p. 358 |

10.2 Application to HMMs | p. 359 |

10.2.1 Hidden Markov Models as Missing Data Models | p. 359 |

10.2.2 EM in HMMs | p. 360 |

10.2.3 Computing Derivatives | p. 362 |

10.2.4 Connection with the Sensitivity Equation Approach | p. 364 |

10.3 The Example of Normal Hidden Markov Models | p. 367 |

10.3.1 EM Parameter Update Formulas | p. 367 |

10.3.2 Estimation of the Initial Distribution | p. 370 |

10.3.3 Recursive Implementation of E-Step | p. 371 |

10.3.4 Computation of the Score and Observed Information | p. 374 |

10.4 The Example of Gaussian Linear State-Space Models | p. 384 |

10.4.1 The Intermediate Quantity of EM | p. 385 |

10.4.2 Recursive Implementation | p. 387 |

10.5 Complements | p. 389 |

10.5.1 Global Convergence of the EM Algorithm | p. 389 |

10.5.2 Rate of Convergence of EM | p. 392 |

10.5.3 Generalized EM Algorithms | p. 393 |

10.5.4 Bibliographic Notes | p. 394 |

11 Maximum Likelihood Inference, Part II: Monte Carlo Optimization | p. 397 |

11.1 Methods and Algorithms | p. 398 |

11.1.1 Monte Carlo EM | p. 398 |

11.1.2 Simulation Schedules | p. 403 |

11.1.3 Gradient-based Algorithms | p. 408 |

11.1.4 Interlude: Stochastic Approximation and the Robbins-Monro Approach | p. 411 |

11.1.5 Stochastic Gradient Algorithms | p. 412 |

11.1.6 Stochastic Approximation EM | p. 414 |

11.1.7 Stochastic EM | p. 416 |

11.2 Analysis of the MCEM Algorithm | p. 419 |

11.2.1 Convergence of Perturbed Dynamical Systems | p. 420 |

11.2.2 Convergence of the MCEM Algorithm | p. 423 |

11.2.3 Rate of Convergence of MCEM | p. 426 |

11.3 Analysis of Stochastic Approximation Algorithms | p. 429 |

11.3.1 Basic Results for Stochastic Approximation Algorithms | p. 429 |

11.3.2 Convergence of the Stochastic Gradient Algorithm | p. 431 |

11.3.3 Rate of Convergence of the Stochastic Gradient Algorithm | p. 432 |

11.3.4 Convergence of the SAEM Algorithm | p. 433 |

11.4 Complements | p. 435 |

12 Statistical Properties of the Maximum Likelihood Estimator | p. 441 |

12.1 A Primer on MLE Asymptotics | p. 442 |

12.2 Stationary Approximations | p. 443 |

12.3 Consistency | p. 446 |

12.3.1 Construction of the Stationary Conditional Log-likelihood | p. 446 |

12.3.2 The Contrast Function and Its Properties | p. 448 |

12.4 Identifiability | p. 450 |

12.4.1 Equivalence of Parameters | p. 451 |

12.4.2 Identifiability of Mixture Densities | p. 454 |

12.4.3 Application of Mixture Identifiability to Hidden Markov Models | p. 455 |

12.5 Asymptotic Normality of the Score and Convergence of the Observed Information | p. 457 |

12.5.1 The Score Function and Invoking the Fisher Identity | p. 457 |

12.5.2 Construction of the Stationary Conditional Score | p. 459 |

12.5.3 Weak Convergence of the Normalized Score | p. 464 |

12.5.4 Convergence of the Normalized Observed Information | p. 465 |

12.5.5 Asymptotics of the Maximum Likelihood Estimator | p. 465 |

12.6 Applications to Likelihood-based Tests | p. 466 |

12.7 Complements | p. 468 |

13 Fully Bayesian Approaches | p. 471 |

13.1 Parameter Estimation | p. 471 |

13.1.1 Bayesian Inference | p. 471 |

13.1.2 Prior Distributions for HMMs | p. 475 |

13.1.3 Non-identifiability and Label Switching | p. 478 |

13.1.4 MCMC Methods for Bayesian Inference | p. 481 |

13.2 Reversible Jump Methods | p. 488 |

13.2.1 Variable Dimension Models | p. 488 |

13.2.2 Green's Reversible Jump Algorithm | p. 490 |

13.2.3 Alternative Sampler Designs | p. 498 |

13.2.4 Alternatives to Reversible Jump MCMC | p. 500 |

13.3 Multiple Imputations Methods and Maximum a Posteriori | p. 501 |

13.3.1 Simulated Annealing | p. 502 |

13.3.2 The SAME Algorithm | p. 503 |

Part III Background and Complements | |

14 Elements of Markov Chain Theory | p. 513 |

14.1 Chains on Countable State Spaces | p. 513 |

14.1.1 Irreducibility | p. 513 |

14.1.2 Recurrence and Transience | p. 514 |

14.1.3 Invariant Measures and Stationarity | p. 517 |

14.1.4 Ergodicity | p. 519 |

14.2 Chains on General State Spaces | p. 520 |

14.2.1 Irreducibility | p. 521 |

14.2.2 Recurrence and Transience | p. 523 |

14.2.3 Invariant Measures and Stationarity | p. 534 |

14.2.4 Ergodicity | p. 541 |

14.2.5 Geometric Ergodicity and Foster-Lyapunov Conditions | p. 548 |

14.2.6 Limit Theorems | p. 552 |

14.3 Applications to Hidden Markov Models | p. 556 |

14.3.1 Phi-irreducibility | p. 557 |

14.3.2 Atoms and Small Sets | p. 558 |

14.3.3 Recurrence and Positive Recurrence | p. 560 |

15 An Information-Theoretic Perspective on Order Estimation | p. 565 |

15.1 Model Order Identification: What Is It About? | p. 566 |

15.2 Order Estimation in Perspective | p. 567 |

15.3 Order Estimation and Composite Hypothesis Testing | p. 569 |

15.4 Code-based Identification | p. 571 |

15.4.1 Definitions | p. 571 |

15.4.2 Information Divergence Rates | p. 574 |

15.5 MDL Order Estimators in Bayesian Settings | p. 576 |

15.6 Strongly Consistent Penalized Maximum Likelihood Estimators for HMM Order Estimation | p. 577 |

15.7 Efficiency Issues | p. 580 |

15.7.1 Variations on Stein's Lemma | p. 581 |

15.7.2 Achieving Optimal Error Exponents | p. 584 |

15.8 Consistency of the BIC Estimator in the Markov Order Estimation Problem | p. 587 |

15.8.1 Some Martingale Tools | p. 589 |

15.8.2 The Martingale Approach | p. 591 |

15.8.3 The Union Bound Meets Martingale Inequalities | p. 592 |

15.9 Complements | p. 600 |

Part IV Appendices | |

A Conditioning | p. 605 |

A.1 Probability and Topology Terminology and Notation | p. 605 |

A.2 Conditional Expectation | p. 606 |

A.3 Conditional Distribution | p. 611 |

A.4 Conditional Independence | p. 614 |

B Linear Prediction | p. 617 |

B.1 Hilbert Spaces | p. 617 |

B.2 The Projection Theorem | p. 619 |

C Notations | p. 621 |

C.1 Mathematical | p. 621 |

C.2 Probability | p. 622 |

C.3 Hidden Markov Models | p. 622 |

C.4 Sequential Monte Carlo | p. 624 |

References | p. 625 |

Index | p. 645 |