Regularized least-squares regression using lasso or elastic net algorithms
B = lasso(X,Y)
[B,FitInfo] = lasso(X,Y)
[B,FitInfo] = lasso(X,Y,Name,Value)
Numeric matrix with n rows and p columns. Each row represents one observation, and each column represents one predictor (variable).
Numeric vector of length n, where n is the number of rows of X. Y(i) is the response to row i of X.
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
Scalar value from 0 to 1 (excluding 0) representing the weight of lasso (L1) versus ridge (L2) optimization. Alpha = 1 represents lasso regression, Alpha close to 0 approaches ridge regression, and other values represent elastic net optimization. See Definitions.
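For example, a fit that weights the L1 and L2 penalties equally (X and Y as described above):

B = lasso(X,Y,'Alpha',0.5); % elastic net, equal lasso/ridge weighting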
Method lasso uses to estimate mean squared error:
K, a positive integer: lasso uses K-fold cross validation.
cvp, a cvpartition object: lasso uses the cross-validation method expressed in cvp.
'resubstitution': lasso uses X and Y to fit the model and to estimate the mean squared error, without cross validation.
Default: 'resubstitution'
Maximum number of nonzero coefficients in the model. lasso returns results only for Lambda values that satisfy this criterion.
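For example, to keep only fits with at most three nonzero coefficients:

B = lasso(X,Y,'DFmax',3); % results only for Lambda values giving <= 3 nonzero coefficients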
Vector of nonnegative Lambda values. See Definitions.
Default: Geometric sequence of NumLambda values, the largest just sufficient to produce B = 0
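For example, to supply your own penalty sequence instead of the default geometric one (the values here are illustrative):

lambda = logspace(-3,0,20); % 20 Lambda values from 1e-3 to 1
B = lasso(X,Y,'Lambda',lambda);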
Positive scalar, the ratio of the smallest to the largest Lambda value when you do not set Lambda.
If you set LambdaRatio = 0, lasso generates a default sequence of Lambda values, and replaces the smallest one with 0.
Positive integer, the number of Monte Carlo repetitions for cross validation.
Positive integer, the number of Lambda values lasso uses when you do not set Lambda. lasso can return fewer than NumLambda fits if the residual error of the fits drops below a threshold fraction of the variance of Y.
Structure that specifies whether to cross validate in parallel, and specifies the random stream or streams. Create the Options structure with statset. Option fields:
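For example, a minimal sketch of parallel ten-fold cross validation (assumes the Parallel Computing Toolbox is available and that your release accepts a logical UseParallel value):

opts = statset('UseParallel',true); % enable parallel computation
[B,FitInfo] = lasso(X,Y,'CV',10,'Options',opts);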
Cell array of strings representing names of the predictor variables, in the order in which they appear in X.
Convergence threshold for the coordinate descent algorithm (see Friedman, Tibshirani, and Hastie [3]). The algorithm terminates when successive estimates of the coefficient vector differ in the L2 norm by a relative amount less than RelTol.
Boolean value specifying whether lasso scales X before fitting the models.
Observation weights, a nonnegative vector of length n, where n is the number of rows of X. lasso scales Weights to sum to 1.
Default: 1/n * ones(n,1)
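For example, with n = 100 observations, to give the first 50 twice the relative weight of the rest (lasso rescales the weights to sum to 1):

w = [2*ones(50,1); ones(50,1)]; % relative observation weights
B = lasso(X,Y,'Weights',w);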
Structure containing information about the model fits.
If you set the CV name-value pair to cross validate, the FitInfo structure contains additional fields.
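For example, after a cross-validated fit you can read off the Lambda value that minimizes the cross-validated mean squared error:

[B,FitInfo] = lasso(X,Y,'CV',10);
FitInfo.LambdaMinMSE % Lambda giving minimal cross-validated MSE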
Construct a data set with redundant predictors, and identify those predictors using lasso.
Create a matrix X of 100 five-dimensional normal variables and a response vector Y from just two components of X, with small added noise.
X = randn(100,5);
r = [0;2;0;-3;0];             % only two nonzero coefficients
Y = X*r + randn(100,1)*.1;    % small added noise
Construct the default lasso fit.
B = lasso(X,Y);
Find the coefficient vector for the 25th value in the Lambda sequence of B.

B(:,25)

ans =

         0
    1.6093
         0
   -2.5865
         0
lasso identifies and removes the redundant predictors.
Visually examine the cross-validated error of various levels of regularization.
Load the acetylene data and prepare the data with interactions for fitting.
load acetylene
Xs = [x1 x2 x3];
X = x2fx(Xs,'interaction');
X(:,1) = []; % No constant term
Construct the lasso fit using ten-fold cross validation. Include the FitInfo output so you can plot the result.
[B,FitInfo] = lasso(X,y,'CV',10);
Plot the cross-validated fits.
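For example, lassoPlot can display the cross-validated mean squared error as a function of Lambda:

lassoPlot(B,FitInfo,'PlotType','CV');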
For a given value of λ, a nonnegative parameter, lasso solves the problem

$$\min_{\beta_0,\,\beta}\left(\frac{1}{2N}\sum_{i=1}^{N}\bigl(y_i-\beta_0-x_i^{T}\beta\bigr)^2+\lambda\sum_{j=1}^{p}\lvert\beta_j\rvert\right)$$

where:
N is the number of observations.
yi is the response at observation i.
xi is data, a vector of p values at observation i.
λ is a nonnegative regularization parameter corresponding to one value of Lambda.
The parameters β0 and β are a scalar and a p-vector, respectively.
As λ increases, the number of nonzero components of β decreases.
The lasso problem involves the L1 norm of β, as contrasted with the elastic net algorithm.
For an α strictly between 0 and 1, and a nonnegative λ, elastic net solves the problem

$$\min_{\beta_0,\,\beta}\left(\frac{1}{2N}\sum_{i=1}^{N}\bigl(y_i-\beta_0-x_i^{T}\beta\bigr)^2+\lambda P_{\alpha}(\beta)\right)$$

where

$$P_{\alpha}(\beta)=\frac{1-\alpha}{2}\lVert\beta\rVert_2^2+\alpha\lVert\beta\rVert_1=\sum_{j=1}^{p}\left(\frac{1-\alpha}{2}\beta_j^2+\alpha\lvert\beta_j\rvert\right)$$
Elastic net is the same as lasso when α = 1. As α shrinks toward 0, elastic net approaches ridge regression. For other values of α, the penalty term Pα(β) interpolates between the L1 norm of β and the squared L2 norm of β.
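As a quick numerical illustration (not part of the lasso interface), you can evaluate the penalty Pα(β) directly to see how α blends the two norms:

alpha = 0.5; % equal blend of L1 and L2
beta = [0; 2; 0; -3; 0];
P = (1-alpha)/2*sum(beta.^2) + alpha*sum(abs(beta)) % returns 5.75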
[1] Tibshirani, R. "Regression Shrinkage and Selection via the Lasso." Journal of the Royal Statistical Society, Series B, Vol. 58, No. 1, 1996, pp. 267–288.
[2] Zou, H., and T. Hastie. "Regularization and Variable Selection via the Elastic Net." Journal of the Royal Statistical Society, Series B, Vol. 67, No. 2, 2005, pp. 301–320.
[3] Friedman, J., R. Tibshirani, and T. Hastie. "Regularization Paths for Generalized Linear Models via Coordinate Descent." Journal of Statistical Software, Vol. 33, No. 1, 2010. http://www.jstatsoft.org/v33/i01
[4] Hastie, T., R. Tibshirani, and J. Friedman. The Elements of Statistical Learning. 2nd edition. Springer, New York, 2008.