Why should the residuals be normally distributed?

Why should the residuals be normally distributed?

In order to make valid inferences from your regression, the residuals of the regression should follow a normal distribution. The residuals are simply the error terms, or the differences between the observed value of the dependent variable and the predicted value.

What do the residuals tell us?

A residual is a measure of how well a line fits an individual data point. This vertical distance is known as a residual. For data points above the line, the residual is positive, and for data points below the line, the residual is negative. The closer a data point’s residual is to 0, the better the fit.

How are the residuals of a regression model symmetric?

For ordinary linear regression distribution of the residuals of a regression model should be symmetric about 0 since they should have a N ( 0, σ 2 ( I − H)) distribution. Even the price is non-negative, the residuals still could be symmetric around 0 since the structural component of the model i.e X b can be anything (include negative values).

What to do if your residual is not symmetric about 0?

However, if your residual is not symmetric about 0 then it suggests you might not use ordinary linear regression. You may consider generalized linear regressions. The key is residuals not the value of price (i.e only be positive). Thanks for contributing an answer to Cross Validated!

When to not use the ordinary least regression?

So using the ordinary least regression might not cause any bias. However, if your residual is not symmetric about 0 then it suggests you might not use ordinary linear regression. You may consider generalized linear regressions. The key is residuals not the value of price (i.e only be positive).

How are the residuals in a histogram normally distributed?

The following histogram of residuals suggests that the residuals (and hence the error terms) are normally distributed: The normal probability plot of the residuals is approximately linear supporting the condition that the error terms are normally distributed.