Item Barcode | Call Number | Material Type | Item Category | Status |
---|---|---|---|---|
30000004301143 | QA279.5 B394 2002 | Open Access Book | Book | On Order |
Summary
Nonlinear Bayesian modelling is a relatively new field, but one that has seen a recent explosion of interest. Nonlinear models offer more flexibility than those with linear assumptions, and increases in computational power have made them much easier to implement. Bayesian methods allow prior information to be incorporated into a model, so that the user can make coherent inferences. Bayesian Methods for Nonlinear Classification and Regression is the first book to bring together, in a consistent statistical framework, the ideas of nonlinear modelling and Bayesian methods.
* Focuses on the problems of classification and regression using flexible, data-driven approaches.
* Demonstrates how Bayesian ideas can be used to improve existing statistical methods.
* Includes coverage of Bayesian additive models, decision trees, nearest-neighbour methods, wavelets, regression splines, and neural networks.
* Emphasises sound implementation of nonlinear models.
* Discusses medical, spatial, and economic applications.
* Includes problems at the end of most of the chapters.
* Supported by a web site featuring implementation code and data sets.
Primarily of interest to researchers of nonlinear statistical modelling, the book will also be suitable for graduate students of statistics. It will benefit researchers involved in regression and classification modelling from electrical engineering, economics, machine learning, and computer science.
Author Notes
David G. T. Denison, Christopher C. Holmes, Bani K. Mallick, and Adrian F. M. Smith are the authors of Bayesian Methods for Nonlinear Classification and Regression, published by Wiley.
Table of Contents
Preface | p. xi |
Acknowledgements | p. xiii |
1 Introduction | p. 1 |
1.1 Regression and Classification | p. 1 |
1.2 Bayesian Nonlinear Methods | p. 4 |
1.2.1 Approximating functions | p. 4 |
1.2.2 The 'best' model | p. 4 |
1.2.3 Bayesian methods | p. 5 |
1.3 Outline of the Book | p. 5 |
2 Bayesian Modelling | p. 9 |
2.1 Introduction | p. 9 |
2.2 Data Modelling | p. 9 |
2.2.1 The representation theorem for classification | p. 9 |
2.2.2 The general representation theorem | p. 10 |
2.2.3 Bayes' Theorem | p. 11 |
2.2.4 Modelling with predictors | p. 12 |
2.3 Basics of Regression Modelling | p. 14 |
2.3.1 The regression problem | p. 14 |
2.3.2 Basis function models for the regression function | p. 14 |
2.4 The Bayesian Linear Model | p. 15 |
2.4.1 The priors | p. 16 |
2.4.2 The likelihood | p. 17 |
2.4.3 The posterior | p. 17 |
2.5 Model Comparison | p. 18 |
2.5.1 Bayes factors | p. 19 |
2.5.2 Occam's razor | p. 20 |
2.5.3 Lindley's paradox | p. 22 |
2.6 Model Selection | p. 24 |
2.6.1 Searching for models | p. 25 |
2.7 Model Averaging | p. 28 |
2.7.1 Predictive inference | p. 28 |
2.7.2 Problems with model selection | p. 30 |
2.7.3 Other work on model averaging | p. 31 |
2.8 Posterior Sampling | p. 31 |
2.8.1 The Gibbs sampler | p. 33 |
2.8.2 The Metropolis-Hastings algorithm | p. 34 |
2.8.3 The reversible jump algorithm | p. 36 |
2.8.4 Hybrid sampling | p. 39 |
2.8.5 Convergence | p. 40 |
2.9 Further Reading | p. 41 |
2.10 Problems | p. 42 |
3 Curve Fitting | p. 45 |
3.1 Introduction | p. 45 |
3.2 Curve Fitting Using Step Functions | p. 46 |
3.2.1 Example: Nile discharge data | p. 46 |
3.3 Curve Fitting with Splines | p. 51 |
3.3.1 Metropolis-Hastings sampler | p. 53 |
3.3.2 Gibbs sampling | p. 56 |
3.3.3 Example: Great Barrier Reef data | p. 57 |
3.3.4 Monitoring convergence of the sampler | p. 60 |
3.3.5 Default curve fitting | p. 63 |
3.4 Curve Fitting Using Wavelets | p. 66 |
3.4.1 Wavelet shrinkage | p. 69 |
3.4.2 Bayesian wavelets | p. 70 |
3.5 Prior Elicitation | p. 72 |
3.5.1 The model prior | p. 73 |
3.5.2 Prior on the model parameters | p. 78 |
3.5.3 The prior on the coefficients | p. 79 |
3.5.4 The prior on the regression variance | p. 82 |
3.6 Robust Curve Fitting | p. 82 |
3.6.1 Modelling with a heavy-tailed error distribution | p. 83 |
3.6.2 Outlier detection models | p. 86 |
3.7 Discussion | p. 88 |
3.8 Further Reading | p. 89 |
3.9 Problems | p. 91 |
4 Surface Fitting | p. 95 |
4.1 Introduction | p. 95 |
4.2 Additive Models | p. 95 |
4.2.1 Introduction to additive modelling | p. 95 |
4.2.2 Ozone data example | p. 98 |
4.2.3 Further reading on Bayesian additive models | p. 99 |
4.3 Higher-Order Splines | p. 100 |
4.3.1 Truncated linear splines | p. 100 |
4.4 High-Dimensional Regression | p. 102 |
4.4.1 Extending to higher dimension | p. 102 |
4.4.2 The BWISE model | p. 103 |
4.4.3 The BMARS model | p. 103 |
4.4.4 Piecewise linear models | p. 110 |
4.4.5 Neural network models | p. 115 |
4.5 Time Series Analysis | p. 119 |
4.5.1 The BAYSTAR model | p. 121 |
4.5.2 Example: Wolf's sunspots data | p. 122 |
4.5.3 Chaotic time series | p. 124 |
4.6 Further Reading | p. 126 |
4.7 Problems | p. 126 |
5 Classification Using Generalised Nonlinear Models | p. 129 |
5.1 Introduction | p. 129 |
5.2 Nonlinear Models for Classification | p. 130 |
5.2.1 Classification | p. 130 |
5.2.2 Auxiliary variables method for classification | p. 132 |
5.3 Bayesian MARS for Classification | p. 136 |
5.3.1 Multiclass classification | p. 137 |
5.4 Count Data | p. 138 |
5.4.1 Example: Rongelap Island dataset | p. 140 |
5.5 The Generalised Linear Model Framework | p. 141 |
5.5.1 Bayesian generalised linear models | p. 144 |
5.5.2 Log-concavity | p. 144 |
5.6 Further Reading | p. 145 |
5.7 Problems | p. 146 |
6 Bayesian Tree Models | p. 149 |
6.1 Introduction | p. 149 |
6.1.1 Motivation for trees | p. 150 |
6.1.2 Binary-tree structure | p. 150 |
6.2 Bayesian Trees | p. 152 |
6.2.1 The random tree structure | p. 152 |
6.2.2 Classification trees | p. 153 |
6.2.3 Regression trees | p. 155 |
6.2.4 Prior on trees | p. 156 |
6.3 Simple Trees | p. 158 |
6.3.1 Stumps | p. 159 |
6.3.2 A Bayesian splitting criterion | p. 160 |
6.4 Searching for Large Trees | p. 161 |
6.4.1 The sampling algorithm | p. 161 |
6.4.2 Problems with sampling | p. 164 |
6.4.3 Improving the generated 'sample' | p. 165 |
6.5 Classification Using Bayesian Trees | p. 166 |
6.5.1 The Pima Indian dataset | p. 166 |
6.5.2 Selecting trees from the sample | p. 167 |
6.5.3 Summarising the output | p. 167 |
6.5.4 Identifying good trees | p. 169 |
6.6 Discussion | p. 170 |
6.7 Further Reading | p. 174 |
6.8 Problems | p. 175 |
7 Partition Models | p. 177 |
7.1 Introduction | p. 177 |
7.2 One-Dimensional Partition Models | p. 179 |
7.2.1 Changepoint models | p. 182 |
7.3 Multidimensional Partition Models | p. 184 |
7.3.1 Tessellations | p. 184 |
7.3.2 Marginal likelihoods for partition models | p. 186 |
7.3.3 Prior on the model structure | p. 187 |
7.3.4 Computational strategy | p. 188 |
7.4 Classification with Partition Models | p. 188 |
7.4.1 Speech recognition dataset | p. 188 |
7.5 Disease Mapping with Partition Models | p. 191 |
7.5.1 Introduction | p. 191 |
7.5.2 The disease mapping problem | p. 192 |
7.5.3 The binomial model for disease risk | p. 192 |
7.5.4 The Poisson model for disease risk | p. 193 |
7.5.5 Example: leukaemia incidence data | p. 193 |
7.5.6 Convergence assessment | p. 195 |
7.5.7 Posterior inference for the leukaemia data | p. 197 |
7.6 Discussion | p. 199 |
7.7 Further Reading | p. 203 |
7.8 Problems | p. 206 |
8 Nearest-Neighbour Models | p. 209 |
8.1 Introduction | p. 209 |
8.2 Nearest-Neighbour Classification | p. 209 |
8.3 Probabilistic Nearest Neighbour | p. 211 |
8.3.1 Formulation | p. 211 |
8.3.2 Implementation | p. 213 |
8.4 Examples | p. 214 |
8.4.1 Ripley's simulated data | p. 214 |
8.4.2 Arm tremor data | p. 216 |
8.4.3 Lansing Woods data | p. 217 |
8.5 Discussion | p. 219 |
8.6 Further Reading | p. 220 |
9 Multiple Response Models | p. 221 |
9.1 Introduction | p. 221 |
9.2 The Multiple Response Model | p. 221 |
9.3 Conjugate Multivariate Linear Regression | p. 222 |
9.4 Seemingly Unrelated Regressions | p. 223 |
9.4.1 Prior on the basis function matrix | p. 226 |
9.5 Computational Details | p. 227 |
9.5.1 Updating the parameter vector θ | p. 227 |
9.6 Examples | p. 228 |
9.6.1 Vector autoregressive processes | p. 229 |
9.6.2 Multiple curve fitting | p. 230 |
9.7 Discussion | p. 234 |
Appendix A Probability Distributions | p. 237 |
Appendix B Inferential Processes | p. 239 |
B.1 The Linear Model | p. 240 |
B.2 Multivariate Linear Model | p. 241 |
B.3 Exponential-Gamma Model | p. 242 |
B.4 The Multinomial-Dirichlet Model | p. 243 |
B.5 Poisson-Gamma Model | p. 244 |
B.6 Uniform-Pareto Model | p. 245 |
References | p. 247 |
Index | p. 265 |
Author Index | p. 271 |