Tuesday, August 31, 2010

Linear Regression in MATLAB

Fitting a least-squares linear regression is easily accomplished in MATLAB using the backslash operator: '\'. In linear algebra, matrices may be multiplied like this:

output = input * coefficients

The backslash in MATLAB allows the programmer to effectively "divide" the output by the input to recover the linear coefficients; for an over-determined system it does so in the least-squares sense. This process will be illustrated by the following examples:


Simple Linear Regression

First, some data with a roughly linear relationship is needed:


>> X = [1 2 4 5 7 9 11 13 14 16]'; Y = [101 105 109 112 117 116 122 123 129 130]';


"Divide" using MATLAB's backslash operator to regress without an intercept:


>> B = X \ Y

B =

10.8900


Append a column of ones before dividing to include an intercept:


>> B = [ones(length(X),1) X] \ Y

B =

101.3021
1.8412


In this case, the first number is the intercept and the second is the coefficient.
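One way to sanity-check the fit (this snippet is only a sketch, not part of the original recipe) is to compute the fitted values from the intercept and slope and plot them over the raw data:


>> Yhat = B(1) + B(2) * X;          % fitted values from the intercept and slope
>> plot(X, Y, 'o', X, Yhat, '-')    % overlay the fitted line on the data points
>> legend('Data', 'Fit')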


Multiple Linear Regression

The following generates a matrix of 1000 observations of 5 random input variables:


>> X = rand(1e3,5);


Next, the true coefficients are defined (which wouldn't be known in a real problem). As is conventional, the intercept term is the first element of the coefficient vector. The problem at hand is to approximate these coefficients, knowing only the input and output data:


>> BTrue = [-1 2 -3 4 -5 6]';


Multiply the matrices to get the output data:


>> Y = BTrue(1) + X * BTrue(2:end);


As before, append a column of ones and use the backslash operator:


>> B = [ones(size(X,1),1) X] \ Y

B =

-1.0000
2.0000
-3.0000
4.0000
-5.0000
6.0000


Again, the first element in the coefficient vector is the intercept. Note that, oh so conveniently, the discovered coefficients match the designed ones exactly, since this data set is completely noise-free.
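Real data is rarely this clean. As a rough sketch (the noise level of 0.1 below is arbitrary, not from the original example), adding a little Gaussian noise to the output shows the recovered coefficients drifting away from BTrue:


>> YNoisy = Y + 0.1 * randn(size(Y));          % contaminate the output with noise
>> BNoisy = [ones(size(X,1),1) X] \ YNoisy     % estimates now only approximate BTrue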


Model Recall

Executing linear models is a simple matter of matrix multiplication, but there is an efficiency issue. One might append a column of ones and simply perform the complete matrix multiplication, thus:


>> Z = [ones(size(X,1),1) X] * B;


The above process is inefficient, though, since it builds an entire column of ones just to carry the intercept. It is better to multiply the input data matrix by the non-intercept coefficients and add the intercept term separately:


>> Z = B(1) + X * B(2:end);
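Both forms of recall give the same predictions; a quick way to convince oneself of this (again, just a sketch) is to compute them side by side and compare:


>> Z1 = [ones(size(X,1),1) X] * B;     % recall with an explicit column of ones
>> Z2 = B(1) + X * B(2:end);           % recall without building the ones column
>> max(abs(Z1 - Z2))                   % should be zero, or on the order of eps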



Regression in the Statistics Toolbox

The MATLAB Statistics Toolbox includes several linear regression functions. Among others, there are:

regress: least-squares linear regression and diagnostics (see the example below)
stepwisefit: stepwise linear regression
robustfit: robust (non-least-squares) linear regression and diagnostics

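For instance, regress expects the column of ones to be included in the design matrix, so a basic call might look like the following (a sketch of typical usage; see the Toolbox documentation for the full set of outputs):


>> [B, BInt, R, RInt, Stats] = regress(Y, [ones(size(X,1),1) X]);
>> Stats(1)    % R-squared for the fit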

See help stats for more information.


See also:

The May-03-2007 posting, Weighted Regression in MATLAB.

The Oct-23-2007 posting, L-1 Linear Regression.

The Mar-15-2009 posting, Logistic Regression.

