Availability
Item Barcode | Call Number | Material Type | Item Category | Status |
---|---|---|---|---|
30000010344074 | TA342 R57 2015 | Open Access Book | Book | On Order |
Summary
Sparse models are particularly useful in scientific applications, such as biomarker discovery in genetic or neuroimaging data, where the interpretability of a predictive model is essential. Sparsity can also dramatically improve the cost efficiency of signal processing.
Sparse Modeling: Theory, Algorithms, and Applications provides an introduction to the growing field of sparse modeling, including application examples, problem formulations that yield sparse solutions, algorithms for finding such solutions, and recent theoretical results on sparse recovery. The book gets you up to speed on the latest sparsity-related developments and will motivate you to continue learning about the field.
The authors first present motivating examples and a high-level survey of key recent developments in sparse modeling. The book then describes optimization problems involving commonly used sparsity-enforcing tools, presents essential theoretical results, and discusses several state-of-the-art algorithms for finding sparse solutions.
The authors go on to address a variety of sparse recovery problems that extend the basic formulation to more sophisticated forms of structured sparsity and to different loss functions. They also examine a particular class of sparse graphical models and cover dictionary learning and sparse matrix factorizations.
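The formulations described above center on sparsity-enforcing tools such as the l1-norm penalty (the LASSO, covered in Chapter 5 of the table of contents below). As a flavor of the noisy sparse recovery problem the book formulates, here is a minimal sketch of l1-regularized least squares; the scikit-learn dependency, problem sizes, and regularization weight are illustrative assumptions, not code from the book.

```python
# Minimal sketch of noisy sparse recovery via the LASSO
# (l1-regularized least squares). Library choice (scikit-learn)
# and all parameter values are illustrative assumptions.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

n, m, k = 100, 400, 10                          # observations, unknowns, true nonzeros
A = rng.standard_normal((n, m)) / np.sqrt(n)    # random Gaussian design matrix
x_true = np.zeros(m)
support = rng.choice(m, size=k, replace=False)
x_true[support] = rng.standard_normal(k)        # k-sparse ground-truth signal
y = A @ x_true + 0.01 * rng.standard_normal(n)  # noisy linear observations

# LASSO objective: min_x (1/2n) * ||y - Ax||_2^2 + alpha * ||x||_1
lasso = Lasso(alpha=0.01, fit_intercept=False, max_iter=10000)
lasso.fit(A, y)
x_hat = lasso.coef_

print("true support:     ", np.sort(support))
print("recovered support:", np.flatnonzero(np.abs(x_hat) > 1e-3))
```

The l1 penalty drives most coefficients exactly to zero, which is why the recovered support is sparse even though the system y = Ax is underdetermined (n < m).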
Author Notes
Irina Rish, Genady Grabarnik
Table of Contents
List of Figures | p. xi |
Preface | p. xvii |
1 Introduction | p. 1 |
1.1 Motivating Examples | p. 4 |
1.1.1 Computer Network Diagnosis | p. 4 |
1.1.2 Neuroimaging Analysis | p. 5 |
1.1.3 Compressed Sensing | p. 8 |
1.2 Sparse Recovery in a Nutshell | p. 9 |
1.3 Statistical Learning versus Compressed Sensing | p. 11 |
1.4 Summary and Bibliographical Notes | p. 12 |
2 Sparse Recovery: Problem Formulations | p. 15 |
2.1 Noiseless Sparse Recovery | p. 16 |
2.2 Approximations | p. 18 |
2.3 Convexity: Brief Review | p. 19 |
2.4 Relaxations of (P0) Problem | p. 20 |
2.5 The Effect of lq-Regularizer on Solution Sparsity | p. 21 |
2.6 l1-norm Minimization as Linear Programming | p. 22 |
2.7 Noisy Sparse Recovery | p. 23 |
2.8 A Statistical View of Sparse Recovery | p. 27 |
2.9 Beyond LASSO: Other Loss Functions and Regularizers | p. 30 |
2.10 Summary and Bibliographical Notes | p. 33 |
3 Theoretical Results (Deterministic Part) | p. 35 |
3.1 The Sampling Theorem | p. 36 |
3.2 Surprising Empirical Results | p. 36 |
3.3 Signal Recovery from Incomplete Frequency Information | p. 39 |
3.4 Mutual Coherence | p. 40 |
3.5 Spark and Uniqueness of (P0) Solution | p. 42 |
3.6 Null Space Property and Uniqueness of (P1) Solution | p. 45 |
3.7 Restricted Isometry Property (RIP) | p. 46 |
3.8 Square Root Bottleneck for the Worst-Case Exact Recovery | p. 47 |
3.9 Exact Recovery Based on RIP | p. 48 |
3.10 Summary and Bibliographical Notes | p. 52 |
4 Theoretical Results (Probabilistic Part) | p. 53 |
4.1 When Does RIP Hold? | p. 54 |
4.2 Johnson-Lindenstrauss Lemma and RIP for Subgaussian Random Matrices | p. 54 |
4.2.1 Proof of the Johnson-Lindenstrauss Concentration Inequality | p. 55 |
4.2.2 RIP for Matrices with Subgaussian Random Entries | p. 56 |
4.3 Random Matrices Satisfying RIP | p. 59 |
4.3.1 Eigenvalues and RIP | p. 60 |
4.3.2 Random Vectors, Isotropic Random Vectors | p. 60 |
4.4 RIP for Matrices with Independent Bounded Rows and Matrices with Random Rows of Fourier Transform | p. 61 |
4.4.1 Proof of URI | p. 64 |
4.4.2 Tail Bound for the Uniform Law of Large Numbers (ULLN) | p. 67 |
4.5 Summary and Bibliographical Notes | p. 69 |
5 Algorithms for Sparse Recovery Problems | p. 71 |
5.1 Univariate Thresholding is Optimal for Orthogonal Designs | p. 72 |
5.1.1 l0-norm Minimization | p. 73 |
5.1.2 l1-norm Minimization | p. 74 |
5.2 Algorithms for l0-norm Minimization | p. 76 |
5.2.1 An Overview of Greedy Methods | p. 79 |
5.3 Algorithms for l1-norm Minimization (LASSO) | p. 82 |
5.3.1 Least Angle Regression for LASSO (LARS) | p. 82 |
5.3.2 Coordinate Descent | p. 86 |
5.3.3 Proximal Methods | p. 87 |
5.4 Summary and Bibliographical Notes | p. 92 |
6 Beyond LASSO: Structured Sparsity | p. 95 |
6.1 The Elastic Net | p. 96 |
6.1.1 The Elastic Net in Practice: Neuroimaging Applications | p. 100 |
6.2 Fused LASSO | p. 107 |
6.3 Group LASSO: l1/l2 Penalty | p. 109 |
6.4 Simultaneous LASSO: l1/l∞ Penalty | p. 110 |
6.5 Generalizations | p. 111 |
6.5.1 Block l1/lq-Norms and Beyond | p. 111 |
6.5.2 Overlapping Groups | p. 112 |
6.6 Applications | p. 114 |
6.6.1 Temporal Causal Modeling | p. 114 |
6.6.2 Generalized Additive Models | p. 115 |
6.6.3 Multiple Kernel Learning | p. 115 |
6.6.4 Multi-Task Learning | p. 117 |
6.7 Summary and Bibliographical Notes | p. 118 |
7 Beyond LASSO: Other Loss Functions | p. 121 |
7.1 Sparse Recovery from Noisy Observations | p. 122 |
7.2 Exponential Family, GLMs, and Bregman Divergences | p. 123 |
7.2.1 Exponential Family | p. 124 |
7.2.2 Generalized Linear Models (GLMs) | p. 125 |
7.2.3 Bregman Divergence | p. 126 |
7.3 Sparse Recovery with GLM Regression | p. 128 |
7.4 Summary and Bibliographical Notes | p. 136 |
8 Sparse Graphical Models | p. 139 |
8.1 Background | p. 140 |
8.2 Markov Networks | p. 141 |
8.2.1 Markov Network Properties: A Closer Look | p. 142 |
8.2.2 Gaussian MRFs | p. 144 |
8.3 Learning and Inference in Markov Networks | p. 145 |
8.3.1 Learning | p. 145 |
8.3.2 Inference | p. 146 |
8.3.3 Example: Neuroimaging Applications | p. 147 |
8.4 Learning Sparse Gaussian MRFs | p. 151 |
8.4.1 Sparse Inverse Covariance Selection Problem | p. 152 |
8.4.2 Optimization Approaches | p. 153 |
8.4.3 Selecting Regularization Parameter | p. 160 |
8.5 Summary and Bibliographical Notes | p. 165 |
9 Sparse Matrix Factorization: Dictionary Learning and Beyond | p. 167 |
9.1 Dictionary Learning | p. 168 |
9.1.1 Problem Formulation | p. 169 |
9.1.2 Algorithms for Dictionary Learning | p. 170 |
9.2 Sparse PCA | p. 174 |
9.2.1 Background | p. 174 |
9.2.2 Sparse PCA: Synthesis View | p. 176 |
9.2.3 Sparse PCA: Analysis View | p. 178 |
9.3 Sparse NMF for Blind Source Separation | p. 179 |
9.4 Summary and Bibliographical Notes | p. 182 |
Epilogue | p. 185 |
Appendix: Mathematical Background | p. 187 |
A.1 Norms, Matrices, and Eigenvalues | p. 187 |
A.1.1 Short Summary of Eigentheory | p. 188 |
A.2 Discrete Fourier Transform | p. 190 |
A.2.1 The Discrete Whittaker-Nyquist-Kotelnikov-Shannon Sampling Theorem | p. 191 |
A.3 Complexity of l0-norm Minimization | p. 192 |
A.4 Subgaussian Random Variables | p. 192 |
A.5 Random Variables and Symmetrization in R^n | p. 197 |
A.6 Subgaussian Processes | p. 199 |
A.7 Dudley Entropy Inequality | p. 200 |
A.8 Large Deviation for the Bounded Random Operators | p. 202 |
Bibliography | p. 203 |
Index | p. 227 |