Summary
The past decade has seen powerful new computational tools for modeling that combine a Bayesian approach with recent Monte Carlo simulation techniques based on Markov chains. This book is the first to offer a systematic presentation of the Bayesian perspective on finite mixture modeling. It is designed to show how finite mixture and Markov switching models are formulated, what structures they imply for the data, what their potential uses are, and how they are estimated. Presenting its concepts informally without sacrificing mathematical correctness, it will serve a wide readership, including statisticians as well as biologists, economists, engineers, and financial and market researchers.
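To give a flavor of the book's subject, the following is a minimal, hypothetical sketch (not taken from the book) of a finite mixture of normal distributions: the mixture density as a weighted sum of component densities, and simulation that first draws a component label and then the observation from that component. All function names and parameter values are illustrative.

```python
import math
import random

def mixture_density(y, weights, means, sds):
    """Density of a finite mixture of normals: sum_k w_k * N(y; mu_k, sd_k^2)."""
    return sum(
        w * math.exp(-0.5 * ((y - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
        for w, m, s in zip(weights, means, sds)
    )

def sample_mixture(n, weights, means, sds, seed=0):
    """Draw n observations: sample a component label, then the observation."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        k = rng.choices(range(len(weights)), weights=weights)[0]
        draws.append(rng.gauss(means[k], sds[k]))
    return draws

# Illustrative two-component mixture: 30% N(-2, 1) and 70% N(3, 1).
ys = sample_mixture(1000, [0.3, 0.7], [-2.0, 3.0], [1.0, 1.0])
```

The two-stage sampling scheme mirrors the latent-allocation view of mixtures used throughout the book: each observation carries a hidden component label, and treating those labels as missing data is what makes the MCMC approaches of Chapter 3 work.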
Author Notes
Sylvia Frühwirth-Schnatter is Professor of Applied Statistics and Econometrics at the Department of Applied Statistics of the Johannes Kepler University in Linz, Austria.
Table of Contents
1 Finite Mixture Modeling | p. 1 |
1.1 Introduction | p. 1 |
1.2 Finite Mixture Distributions | p. 3 |
1.2.1 Basic Definitions | p. 3 |
1.2.2 Some Descriptive Features of Finite Mixture Distributions | p. 5 |
1.2.3 Diagnosing Similarity of Mixture Components | p. 9 |
1.2.4 Moments of a Finite Mixture Distribution | p. 10 |
1.2.5 Statistical Modeling Based on Finite Mixture Distributions | p. 11 |
1.3 Identifiability of a Finite Mixture Distribution | p. 14 |
1.3.1 Nonidentifiability Due to Invariance to Relabeling the Components | p. 15 |
1.3.2 Nonidentifiability Due to Potential Overfitting | p. 17 |
1.3.3 Formal Identifiability Constraints | p. 19 |
1.3.4 Generic Identifiability | p. 21 |
2 Statistical Inference for a Finite Mixture Model with Known Number of Components | p. 25 |
2.1 Introduction | p. 25 |
2.2 Classification for Known Component Parameters | p. 26 |
2.2.1 Bayes' Rule for Classifying a Single Observation | p. 26 |
2.2.2 The Bayes' Classifier for a Whole Data Set | p. 27 |
2.3 Parameter Estimation for Known Allocation | p. 29 |
2.3.1 The Complete-Data Likelihood Function | p. 29 |
2.3.2 Complete-Data Maximum Likelihood Estimation | p. 30 |
2.3.3 Complete-Data Bayesian Estimation of the Component Parameters | p. 31 |
2.3.4 Complete-Data Bayesian Estimation of the Weights | p. 35 |
2.4 Parameter Estimation When the Allocations Are Unknown | p. 41 |
2.4.1 Method of Moments | p. 42 |
2.4.2 The Mixture Likelihood Function | p. 43 |
2.4.3 A Helicopter Tour of the Mixture Likelihood Surface for Two Examples | p. 44 |
2.4.4 Maximum Likelihood Estimation | p. 49 |
2.4.5 Bayesian Parameter Estimation | p. 53 |
2.4.6 Distance-Based Methods | p. 54 |
2.4.7 Comparing Various Estimation Methods | p. 54 |
3 Practical Bayesian Inference for a Finite Mixture Model with Known Number of Components | p. 57 |
3.1 Introduction | p. 57 |
3.2 Choosing the Prior for the Parameters of a Mixture Model | p. 58 |
3.2.1 Objective and Subjective Priors | p. 58 |
3.2.2 Improper Priors May Cause Improper Mixture Posteriors | p. 59 |
3.2.3 Conditionally Conjugate Priors | p. 60 |
3.2.4 Hierarchical Priors and Partially Proper Priors | p. 61 |
3.2.5 Other Priors | p. 62 |
3.2.6 Invariant Prior Distributions | p. 62 |
3.3 Some Properties of the Mixture Posterior Density | p. 63 |
3.3.1 Invariance of the Posterior Distribution | p. 63 |
3.3.2 Invariance of Seemingly Component-Specific Functionals | p. 64 |
3.3.3 The Marginal Posterior Distribution of the Allocations | p. 65 |
3.3.4 Invariance of the Posterior Distribution of the Allocations | p. 67 |
3.4 Classification Without Parameter Estimation | p. 68 |
3.4.1 Single-Move Gibbs Sampling | p. 69 |
3.4.2 The Metropolis-Hastings Algorithm | p. 72 |
3.5 Parameter Estimation Through Data Augmentation and MCMC | p. 73 |
3.5.1 Treating Mixture Models as a Missing Data Problem | p. 73 |
3.5.2 Data Augmentation and MCMC for a Mixture of Poisson Distributions | p. 74 |
3.5.3 Data Augmentation and MCMC for General Mixtures | p. 76 |
3.5.4 MCMC Sampling Under Improper Priors | p. 78 |
3.5.5 Label Switching | p. 78 |
3.5.6 Permutation MCMC Sampling | p. 81 |
3.6 Other Monte Carlo Methods Useful for Mixture Models | p. 83 |
3.6.1 A Metropolis-Hastings Algorithm for the Parameters | p. 83 |
3.6.2 Importance Sampling for the Allocations | p. 84 |
3.6.3 Perfect Sampling | p. 85 |
3.7 Bayesian Inference for Finite Mixture Models Using Posterior Draws | p. 85 |
3.7.1 Sampling Representations of the Mixture Posterior Density | p. 85 |
3.7.2 Using Posterior Draws for Bayesian Inference | p. 87 |
3.7.3 Predictive Density Estimation | p. 89 |
3.7.4 Individual Parameter Inference | p. 91 |
3.7.5 Inference on the Hyperparameter of a Hierarchical Prior | p. 92 |
3.7.6 Inference on Component Parameters | p. 92 |
3.7.7 Model Identification | p. 94 |
4 Statistical Inference for Finite Mixture Models Under Model Specification Uncertainty | p. 99 |
4.1 Introduction | p. 99 |
4.2 Parameter Estimation Under Model Specification Uncertainty | p. 100 |
4.2.1 Maximum Likelihood Estimation Under Model Specification Uncertainty | p. 100 |
4.2.2 Practical Bayesian Parameter Estimation for Overfitting Finite Mixture Models | p. 103 |
4.2.3 Potential Overfitting | p. 105 |
4.3 Informal Methods for Identifying the Number of Components | p. 107 |
4.3.1 Mode Hunting in the Mixture Posterior | p. 108 |
4.3.2 Mode Hunting in the Sample Histogram | p. 109 |
4.3.3 Diagnosing Mixtures Through the Method of Moments | p. 110 |
4.3.4 Diagnosing Mixtures Through Predictive Methods | p. 112 |
4.3.5 Further Approaches | p. 114 |
4.4 Likelihood-Based Methods | p. 114 |
4.4.1 The Likelihood Ratio Statistic | p. 114 |
4.4.2 AIC, BIC, and the Schwarz Criterion | p. 116 |
4.4.3 Further Approaches | p. 117 |
4.5 Bayesian Inference Under Model Uncertainty | p. 117 |
4.5.1 Trans-Dimensional Bayesian Inference | p. 117 |
4.5.2 Marginal Likelihoods | p. 118 |
4.5.3 Bayes Factors for Model Comparison | p. 119 |
4.5.4 Formal Bayesian Model Selection | p. 121 |
4.5.5 Choosing Priors for Model Selection | p. 122 |
4.5.6 Further Approaches | p. 123 |
5 Computational Tools for Bayesian Inference for Finite Mixture Models Under Model Specification Uncertainty | p. 125 |
5.1 Introduction | p. 125 |
5.2 Trans-Dimensional Markov Chain Monte Carlo Methods | p. 125 |
5.2.1 Product-Space MCMC | p. 126 |
5.2.2 Reversible Jump MCMC | p. 129 |
5.2.3 Birth and Death MCMC Methods | p. 137 |
5.3 Marginal Likelihoods for Finite Mixture Models | p. 139 |
5.3.1 Defining the Marginal Likelihood | p. 139 |
5.3.2 Choosing Priors for Selecting the Number of Components | p. 141 |
5.3.3 Computation of the Marginal Likelihood for Mixture Models | p. 143 |
5.4 Simulation-Based Approximations of the Marginal Likelihood | p. 143 |
5.4.1 Some Background on Monte Carlo Integration | p. 143 |
5.4.2 Sampling-Based Approximations for Mixture Models | p. 144 |
5.4.3 Importance Sampling | p. 146 |
5.4.4 Reciprocal Importance Sampling | p. 147 |
5.4.5 Harmonic Mean Estimator | p. 148 |
5.4.6 Bridge Sampling Technique | p. 150 |
5.4.7 Comparison of Different Simulation-Based Estimators | p. 154 |
5.4.8 Dealing with Hierarchical Priors | p. 159 |
5.5 Approximations to the Marginal Likelihood Based on Density Ratios | p. 159 |
5.5.1 The Posterior Density Ratio | p. 159 |
5.5.2 Chib's Estimator | p. 160 |
5.5.3 Laplace Approximation | p. 164 |
5.6 Reversible Jump MCMC Versus Marginal Likelihoods? | p. 165 |
6 Finite Mixture Models with Normal Components | p. 169 |
6.1 Finite Mixtures of Normal Distributions | p. 169 |
6.1.1 Model Formulation | p. 169 |
6.1.2 Parameter Estimation for Mixtures of Normals | p. 171 |
6.1.3 The Kiefer-Wolfowitz Example | p. 174 |
6.1.4 Applications of Mixture of Normal Distributions | p. 176 |
6.2 Bayesian Estimation of Univariate Mixtures of Normals | p. 177 |
6.2.1 Bayesian Inference When the Allocations Are Known | p. 177 |
6.2.2 Standard Prior Distributions | p. 179 |
6.2.3 The Influence of the Prior on the Variance Ratio | p. 179 |
6.2.4 Bayesian Estimation Using MCMC | p. 180 |
6.2.5 MCMC Estimation Under Standard Improper Priors | p. 182 |
6.2.6 Introducing Prior Dependence Among the Components | p. 185 |
6.2.7 Further Sampling-Based Approaches | p. 187 |
6.2.8 Application to the Fishery Data | p. 188 |
6.3 Bayesian Estimation of Multivariate Mixtures of Normals | p. 190 |
6.3.1 Bayesian Inference When the Allocations Are Known | p. 190 |
6.3.2 Prior Distributions | p. 192 |
6.3.3 Bayesian Parameter Estimation Using MCMC | p. 193 |
6.3.4 Application to Fisher's Iris Data | p. 195 |
6.4 Further Issues | p. 195 |
6.4.1 Parsimonious Finite Normal Mixtures | p. 195 |
6.4.2 Model Selection Problems for Mixtures of Normals | p. 199 |
7 Data Analysis Based on Finite Mixtures | p. 203 |
7.1 Model-Based Clustering | p. 203 |
7.1.1 Some Background on Cluster Analysis | p. 203 |
7.1.2 Model-Based Clustering Using Finite Mixture Models | p. 204 |
7.1.3 The Classification Likelihood and the Bayesian MAP Approach | p. 207 |
7.1.4 Choosing Clustering Criteria and the Number of Components | p. 210 |
7.1.5 Model Choice for the Fishery Data | p. 216 |
7.1.6 Model Choice for Fisher's Iris Data | p. 218 |
7.1.7 Bayesian Clustering Based on Loss Functions | p. 220 |
7.1.8 Clustering for Fisher's Iris Data | p. 224 |
7.2 Outlier Modeling | p. 224 |
7.2.1 Outlier Modeling Using Finite Mixtures | p. 224 |
7.2.2 Bayesian Inference for Outlier Models Based on Finite Mixtures | p. 225 |
7.2.3 Outlier Modeling of Darwin's Data | p. 226 |
7.2.4 Clustering Under Outliers and Noise | p. 227 |
7.3 Robust Finite Mixtures Based on the Student-t Distribution | p. 230 |
7.3.1 Parameter Estimation | p. 230 |
7.3.2 Dealing with Unknown Number of Components | p. 233 |
7.4 Further Issues | p. 233 |
7.4.1 Clustering High-Dimensional Data | p. 233 |
7.4.2 Discriminant Analysis | p. 235 |
7.4.3 Combining Classified and Unclassified Observations | p. 236 |
7.4.4 Density Estimation Using Finite Mixtures | p. 237 |
7.4.5 Finite Mixtures as an Auxiliary Computational Tool in Bayesian Analysis | p. 238 |
8 Finite Mixtures of Regression Models | p. 241 |
8.1 Introduction | p. 241 |
8.2 Finite Mixture of Multiple Regression Models | p. 242 |
8.2.1 Model Definition | p. 242 |
8.2.2 Identifiability | p. 243 |
8.2.3 Statistical Modeling Based on Finite Mixture of Regression Models | p. 246 |
8.2.4 Outliers in a Regression Model | p. 249 |
8.3 Statistical Inference for Finite Mixtures of Multiple Regression Models | p. 249 |
8.3.1 Maximum Likelihood Estimation | p. 249 |
8.3.2 Bayesian Inference When the Allocations Are Known | p. 250 |
8.3.3 Choosing Prior Distributions | p. 252 |
8.3.4 Bayesian Inference When the Allocations Are Unknown | p. 253 |
8.3.5 Bayesian Inference Using Posterior Draws | p. 254 |
8.3.6 Dealing with Model Specification Uncertainty | p. 255 |
8.4 Mixed-Effects Finite Mixtures of Regression Models | p. 256 |
8.4.1 Model Definition | p. 256 |
8.4.2 Choosing Priors for Bayesian Estimation | p. 256 |
8.4.3 Bayesian Parameter Estimation When the Allocations Are Known | p. 257 |
8.4.4 Bayesian Parameter Estimation When the Allocations Are Unknown | p. 258 |
8.5 Finite Mixture Models for Repeated Measurements | p. 259 |
8.5.1 Pooling Information Across Similar Units | p. 260 |
8.5.2 Finite Mixtures of Random-Effects Models | p. 260 |
8.5.3 Choosing the Prior for Bayesian Estimation | p. 265 |
8.5.4 Bayesian Parameter Estimation When the Allocations Are Known | p. 265 |
8.5.5 Practical Bayesian Estimation Using MCMC | p. 267 |
8.5.6 Dealing with Model Specification Uncertainty | p. 269 |
8.5.7 Application to the Marketing Data | p. 270 |
8.6 Further Issues | p. 273 |
8.6.1 Regression Modeling Based on Multivariate Mixtures of Normals | p. 273 |
8.6.2 Modeling the Weight Distribution | p. 274 |
8.6.3 Mixtures-of-Experts Models | p. 274 |
9 Finite Mixture Models with Nonnormal Components | p. 277 |
9.1 Finite Mixtures of Exponential Distributions | p. 277 |
9.1.1 Model Formulation and Parameter Estimation | p. 277 |
9.1.2 Bayesian Inference | p. 278 |
9.2 Finite Mixtures of Poisson Distributions | p. 279 |
9.2.1 Model Formulation and Estimation | p. 279 |
9.2.2 Capturing Overdispersion in Count Data | p. 280 |
9.2.3 Modeling Excess Zeros | p. 282 |
9.2.4 Application to the Eye Tracking Data | p. 283 |
9.3 Finite Mixture Models for Binary and Categorical Data | p. 286 |
9.3.1 Finite Mixtures of Binomial Distributions | p. 286 |
9.3.2 Finite Mixtures of Multinomial Distributions | p. 288 |
9.4 Finite Mixtures of Generalized Linear Models | p. 289 |
9.4.1 Finite Mixture Regression Models for Count Data | p. 290 |
9.4.2 Finite Mixtures of Logit and Probit Regression Models | p. 292 |
9.4.3 Parameter Estimation for Finite Mixtures of GLMs | p. 293 |
9.4.4 Model Selection for Finite Mixtures of GLMs | p. 294 |
9.5 Finite Mixture Models for Multivariate Binary and Categorical Data | p. 294 |
9.5.1 The Basic Latent Class Model | p. 295 |
9.5.2 Identification and Parameter Estimation | p. 296 |
9.5.3 Extensions of the Basic Latent Class Model | p. 297 |
9.6 Further Issues | p. 298 |
9.6.1 Finite Mixture Modeling of Mixed-Mode Data | p. 298 |
9.6.2 Finite Mixtures of GLMs with Random Effects | p. 299 |
10 Finite Markov Mixture Modeling | p. 301 |
10.1 Introduction | p. 301 |
10.2 Finite Markov Mixture Distributions | p. 301 |
10.2.1 Basic Definitions | p. 302 |
10.2.2 Irreducible Aperiodic Markov Chains | p. 304 |
10.2.3 Moments of a Markov Mixture Distribution | p. 308 |
10.2.4 The Autocorrelation Function of a Process Generated by a Markov Mixture Distribution | p. 310 |
10.2.5 The Autocorrelation Function of the Squared Process | p. 311 |
10.2.6 The Standard Finite Mixture Distribution as a Limiting Case | p. 312 |
10.2.7 Identifiability of a Finite Markov Mixture Distribution | p. 313 |
10.3 Statistical Modeling Based on Finite Markov Mixture Distributions | p. 314 |
10.3.1 The Basic Markov Switching Model | p. 314 |
10.3.2 The Markov Switching Regression Model | p. 315 |
10.3.3 Nonergodic Markov Chains | p. 316 |
10.3.4 Relaxing the Assumptions of the Basic Markov Switching Model | p. 316 |
11 Statistical Inference for Markov Switching Models | p. 319 |
11.1 Introduction | p. 319 |
11.2 State Estimation for Known Parameters | p. 319 |
11.2.1 Statistical Inference About the States | p. 320 |
11.2.2 Filtered State Probabilities | p. 320 |
11.2.3 Filtering for Special Cases | p. 323 |
11.2.4 Smoothing the States | p. 324 |
11.2.5 Filtering and Smoothing for More General Models | p. 326 |
11.3 Parameter Estimation for Known States | p. 327 |
11.3.1 The Complete-Data Likelihood Function | p. 327 |
11.3.2 Complete-Data Bayesian Parameter Estimation | p. 329 |
11.3.3 Complete-Data Bayesian Estimation of the Transition Matrix | p. 329 |
11.4 Parameter Estimation When the States Are Unknown | p. 330 |
11.4.1 The Markov Mixture Likelihood Function | p. 330 |
11.4.2 Maximum Likelihood Estimation | p. 333 |
11.4.3 Bayesian Estimation | p. 334 |
11.4.4 Alternative Estimation Methods | p. 334 |
11.5 Bayesian Parameter Estimation with Known Number of States | p. 335 |
11.5.1 Choosing the Prior for the Parameters of a Markov Mixture Model | p. 335 |
11.5.2 Some Properties of the Posterior Distribution of a Markov Switching Model | p. 336 |
11.5.3 Parameter Estimation Through Data Augmentation and MCMC | p. 337 |
11.5.4 Permutation MCMC Sampling | p. 340 |
11.5.5 Sampling the Unknown Transition Matrix | p. 340 |
11.5.6 Sampling Posterior Paths of the Hidden Markov Chain | p. 342 |
11.5.7 Other Sampling-Based Approaches | p. 345 |
11.5.8 Bayesian Inference Using Posterior Draws | p. 345 |
11.6 Statistical Inference Under Model Specification Uncertainty | p. 346 |
11.6.1 Diagnosing Markov Switching Models | p. 346 |
11.6.2 Likelihood-Based Methods | p. 346 |
11.6.3 Marginal Likelihoods for Markov Switching Models | p. 347 |
11.6.4 Model Space MCMC | p. 348 |
11.6.5 Further Issues | p. 348 |
11.7 Modeling Overdispersion and Autocorrelation in Time Series of Count Data | p. 348 |
11.7.1 Motivating Example | p. 348 |
11.7.2 Capturing Overdispersion and Autocorrelation Using Poisson Markov Mixture Models | p. 349 |
11.7.3 Application to the Lamb Data | p. 351 |
12 Nonlinear Time Series Analysis Based on Markov Switching Models | p. 357 |
12.1 Introduction | p. 357 |
12.2 The Markov Switching Autoregressive Model | p. 358 |
12.2.1 Motivating Example | p. 358 |
12.2.2 Model Definition | p. 360 |
12.2.3 Features of the MSAR Model | p. 362 |
12.2.4 Markov Switching Models for Nonstationary Time Series | p. 363 |
12.2.5 Parameter Estimation and Model Selection | p. 365 |
12.2.6 Application to Business Cycle Analysis of the U.S. GDP Data | p. 365 |
12.3 Markov Switching Dynamic Regression Models | p. 371 |
12.3.1 Model Definition | p. 371 |
12.3.2 Bayesian Estimation | p. 371 |
12.4 Prediction of Time Series Based on Markov Switching Models | p. 372 |
12.4.1 Flexible Predictive Distributions | p. 372 |
12.4.2 Forecasting of Markov Switching Models via Sampling-Based Methods | p. 374 |
12.5 Markov Switching Conditional Heteroscedasticity | p. 375 |
12.5.1 Motivating Example | p. 375 |
12.5.2 Capturing Features of Financial Time Series Through Markov Switching Models | p. 377 |
12.5.3 Switching ARCH Models | p. 378 |
12.5.4 Statistical Inference for Switching ARCH Models | p. 380 |
12.5.5 Switching GARCH Models | p. 383 |
12.6 Some Extensions | p. 384 |
12.6.1 Time-Varying Transition Matrices | p. 384 |
12.6.2 Markov Switching Models for Longitudinal and Panel Data | p. 385 |
12.6.3 Markov Switching Models for Multivariate Time Series | p. 386 |
13 Switching State Space Models | p. 389 |
13.1 State Space Modeling | p. 389 |
13.1.1 The Local Level Model with and Without Switching | p. 389 |
13.1.2 The Linear Gaussian State Space Form | p. 391 |
13.1.3 Multiprocess Models | p. 393 |
13.1.4 Switching Linear Gaussian State Space Models | p. 393 |
13.1.5 The General State Space Form | p. 394 |
13.2 Nonlinear Time Series Analysis Based on Switching State Space Models | p. 396 |
13.2.1 ARMA Models with and Without Switching | p. 396 |
13.2.2 Unobserved Component Time Series Models | p. 397 |
13.2.3 Capturing Sudden Changes in Time Series | p. 398 |
13.2.4 Switching Dynamic Factor Models | p. 400 |
13.2.5 Switching State Space Models as a Semi-Parametric Smoothing Device | p. 401 |
13.3 Filtering for Switching Linear Gaussian State Space Models | p. 401 |
13.3.1 The Filtering Problem | p. 402 |
13.3.2 Bayesian Inference for a General Linear Regression Model | p. 402 |
13.3.3 Filtering for the Linear Gaussian State Space Model | p. 404 |
13.3.4 Filtering for Multiprocess Models | p. 406 |
13.3.5 Approximate Filtering for Switching Linear Gaussian State Space Models | p. 406 |
13.4 Parameter Estimation for Switching State Space Models | p. 410 |
13.4.1 The Likelihood Function of a State Space Model | p. 411 |
13.4.2 Maximum Likelihood Estimation | p. 412 |
13.4.3 Bayesian Inference | p. 412 |
13.5 Practical Bayesian Estimation Using MCMC | p. 415 |
13.5.1 Various Data Augmentation Schemes | p. 416 |
13.5.2 Sampling the Continuous State Process from the Smoother Density | p. 417 |
13.5.3 Sampling the Discrete States for a Switching State Space Model | p. 420 |
13.6 Further Issues | p. 421 |
13.6.1 Model Specification Uncertainty in Switching State Space Modeling | p. 421 |
13.6.2 Auxiliary Mixture Sampling for Nonlinear and Nonnormal State Space Models | p. 422 |
13.7 Illustrative Application to Modeling Exchange Rate Data | p. 423 |
A Appendix | p. 431 |
A.1 Summary of Probability Distributions | p. 431 |
A.1.1 The Beta Distribution | p. 431 |
A.1.2 The Binomial Distribution | p. 432 |
A.1.3 The Dirichlet Distribution | p. 432 |
A.1.4 The Exponential Distribution | p. 433 |
A.1.5 The F-Distribution | p. 433 |
A.1.6 The Gamma Distribution | p. 434 |
A.1.7 The Geometric Distribution | p. 435 |
A.1.8 The Multinomial Distribution | p. 435 |
A.1.9 The Negative Binomial Distribution | p. 435 |
A.1.10 The Normal Distribution | p. 436 |
A.1.11 The Poisson Distribution | p. 437 |
A.1.12 The Student-t Distribution | p. 437 |
A.1.13 The Uniform Distribution | p. 438 |
A.1.14 The Wishart Distribution | p. 438 |
A.2 Software | p. 439 |
References | p. 441 |
Index | p. 481 |