Monday, 19 May 2014

Linear Regression

Here's another type of regression: linear regression.

What does linear regression mean?

In statistics, linear regression is an approach that models the relationship between two variables by fitting a linear equation to observed data. One variable is called the explanatory variable, and the other is the dependent variable. Before fitting a linear equation, we should first check whether there is a relationship between the two variables at all.
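One common way to check for a linear relationship before fitting a line is the Pearson correlation coefficient r, which ranges from -1 to 1. The post does not name a specific check, so treat this as a minimal sketch of one option. The function name `pearson_r` and the data values are illustrative assumptions, not the table from this post.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of X and Y around their means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Variation of X and of Y around their own means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Illustrative values only (assumed, not the post's data table)
xs = [1, 2, 3, 4, 5]
ys = [1.00, 2.00, 1.30, 3.75, 2.25]

r = pearson_r(xs, ys)
print(round(r, 3))
```

With these illustrative values r comes out at about 0.63, a moderate positive relationship. A value near 0 would suggest that a straight line is unlikely to describe the data well.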

What is the formula of a linear regression line?

A linear regression line has an equation of the form Y = a + bX, where X is the explanatory variable and Y is the dependent variable. The slope of the line is b, and a is the intercept (the value of Y when X = 0). When there is only one explanatory variable, the method is called simple linear regression. With more than one explanatory variable, it is called multiple linear regression.
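To make the roles of a and b concrete, here is a minimal sketch of how they can be estimated from data. It assumes the usual least-squares criterion, where b is the co-variation of X and Y divided by the variation of X, and a is chosen so the line passes through the point of means. The function name `fit_line` and the data values are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of X and Y divided by variation of X
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    # Intercept: forces the line through (mean_x, mean_y)
    a = mean_y - b * mean_x
    return a, b

# Illustrative values only (assumed, not the post's data table)
xs = [1, 2, 3, 4, 5]
ys = [1.00, 2.00, 1.30, 3.75, 2.25]

a, b = fit_line(xs, ys)
print(f"Y = {a:.3f} + {b:.3f}X")
```

For these values the slope is about 0.425 and the intercept about 0.785, so each one-unit increase in X raises the predicted Y by roughly 0.425.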


Example 1: 

In simple linear regression, we predict scores on one variable from the scores on a second variable. The variable we are predicting is called the criterion variable and is referred to as Y. The variable we are basing our predictions on is called the predictor variable and is referred to as X. When there is only one predictor variable, the prediction method is called simple regression. In simple linear regression, the topic of this section, the predictions of Y when plotted as a function of X form a straight line.

Here's the table of data that we're going to plot.


Here are the results we got.

You can see that there is a positive relationship between X and Y. If you were going to predict Y from X, the higher the value of X, the higher your prediction of Y.

Linear regression consists of finding the best-fitting straight line through the points. The best-fitting line is called a regression line. The black diagonal line in the following picture is the regression line and consists of the predicted score on Y for each possible value of X. The vertical lines from the points to the regression line represent the errors of prediction. 
As you can see, the red point is very near the regression line, so its error of prediction is small. By contrast, the yellow point is much higher than the regression line, so its error of prediction is large.
In general, the farther a point lies from the regression line, whether above or below it, the larger its error of prediction; the closer it lies to the line, the smaller the error.
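The errors of prediction described above can be computed directly: each error is the observed Y minus the Y predicted by the line. The sketch below uses assumed values throughout: five illustrative points, and the line Y = 0.785 + 0.425X, which is the least-squares line for those points. None of these numbers come from the post's table or picture.

```python
# Illustrative values only (assumed, not the post's data table)
xs = [1, 2, 3, 4, 5]
ys = [1.00, 2.00, 1.30, 3.75, 2.25]

# Assumed regression line Y = a + bX (least-squares fit for the points above)
a, b = 0.785, 0.425

# Predicted Y for each X, i.e. the height of the line at that X
predicted = [a + b * x for x in xs]

# Error of prediction: observed minus predicted (the vertical distance to the line)
errors = [y - p for y, p in zip(ys, predicted)]

# Sum of squared errors: the quantity that a least-squares line minimizes
sse = sum(e ** 2 for e in errors)

for x, y, p, e in zip(xs, ys, predicted, errors):
    print(f"X={x}  Y={y:.2f}  predicted={p:.3f}  error={e:+.3f}")
print(f"sum of squared errors = {sse:.5f}")
```

Points with a large positive or negative error sit far from the line, and points with an error near zero sit close to it. Squaring the errors before summing keeps positive and negative errors from cancelling out.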
