Preface
1 Introduction
2 The Lasso for Linear Models
2.1 Introduction
2.2 The Lasso Estimator
2.3 Cross-Validation and Inference
2.4 Computation of the Lasso Solution
2.4.1 Single Predictor: Soft Thresholding
2.4.2 Multiple Predictors: Cyclic Coordinate Descent
2.4.3 Soft-Thresholding and Orthogonal Bases
2.5 Degrees of Freedom
2.6 Uniqueness of the Lasso Solutions
2.7 A Glimpse at the Theory
2.8 The Nonnegative Garrote
2.9 Penalties and Bayes Estimates
2.10 Some Perspective
Exercises
3 Generalized Linear Models
3.1 Introduction
3.2 Logistic Regression
3.2.1 Example: Document Classification
3.2.2 Algorithms
3.3 Multiclass Logistic Regression
3.3.1 Example: Handwritten Digits
3.3.2 Algorithms
3.3.3 Grouped-Lasso Multinomial
3.4 Log-Linear Models and the Poisson GLM
3.4.1 Example: Distribution Smoothing
3.5 Cox Proportional Hazards Models
3.5.1 Cross-Validation
3.5.2 Pre-Validation
3.6 Support Vector Machines
3.6.1 Logistic Regression with Separable Data
3.7 Computational Details and glmnet
Bibliographic Notes
Exercises
4 Generalizations of the Lasso Penalty
4.1 Introduction
4.2 The Elastic Net
4.3 The Group Lasso
4.3.1 Computation for the Group Lasso
4.3.2 Sparse Group Lasso
4.3.3 The Overlap Group Lasso
4.4 Sparse Additive Models and the Group Lasso
4.4.1 Additive Models and Backfitting
4.4.2 Sparse Additive Models and Backfitting
4.4.3 Approaches Using Optimization and the Group Lasso
4.4.4 Multiple Penalization for Sparse Additive Models
4.5 The Fused Lasso
4.5.1 Fitting the Fused Lasso
4.5.1.1 Reparametrization
4.5.1.2 A Path Algorithm
4.5.1.3 A Dual Path Algorithm
4.5.1.4 Dynamic Programming for the Fused Lasso
4.5.2 Trend Filtering
4.5.3 Nearly Isotonic Regression
4.6 Nonconvex Penalties
Bibliographic Notes
Exercises
5 Optimization Methods
5.1 Introduction
5.2 Convex Optimality Conditions
5.2.1 Optimality for Differentiable Problems
5.2.2 Nondifferentiable Functions and Subgradients
5.3 Gradient Descent
5.3.1 Unconstrained Gradient Descent
5.3.2 Projected Gradient Methods
5.3.3 Proximal Gradient Methods
5.3.4 Accelerated Gradient Methods
5.4 Coordinate Descent
5.4.1 Separability and Coordinate Descent
5.4.2 Linear Regression and the Lasso
5.4.3 Logistic Regression and Generalized Linear Models
5.5 A Simulation Study
5.6 Least Angle Regression
5.7 Alternating Direction Method of Multipliers
5.8 Minorization-Maximization Algorithms
5.9 Biconvexity and Alternating Minimization
5.10 Screening Rules
Bibliographic Notes
Appendix
Exercises
6 Statistical Inference
6.1 The Bayesian Lasso
6.2 The Bootstrap
6.3 Post-Selection Inference for the Lasso
6.3.1 The Covariance Test
6.3.2 A General Scheme for Post-Selection Inference
6.3.2.1 Fixed-λ Inference for the Lasso
6.3.2.2 The Spacing Test for LAR
6.3.3 What Hypothesis Is Being Tested?
6.3.4 Back to Forward Stepwise Regression
6.4 Inference via a Debiased Lasso
6.5 Other Proposals for Post-Selection Inference
Bibliographic Notes
Exercises
7 Matrix Decompositions, Approximations, and Completion
7.1 Introduction
7.2 The Singular Value Decomposition
7.3 Missing Data and Matrix Completion
7.3.1 The Netflix Movie Challenge
7.3.2 Matrix Completion Using Nuclear Norm
7.3.3 Theoretical Results for Matrix Completion
7.3.4 Maximum Margin Factorization and Related Methods
7.4 Reduced-Rank Regression
7.5 A General Matrix Regression Framework
7.6 Penalized Matrix Decomposition
7.7 Additive Matrix Decomposition
Bibliographic Notes
Exercises
8 Sparse Multivariate Methods
8.1 Introduction
8.2 Sparse Principal Components