Friday, July 16, 2010

How do you use forward selection to pick which statistical model is best in multiple linear regression?

I am trying to find the most efficient model when there are 5 regressor variables and they are not all equally collinear.

If you're using software like SAS, forward selection is an option you can select in building a regression model.
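For anyone without SAS, here is a minimal sketch of the idea in Python with NumPy. It is not what SAS does internally: the entry rule (keep adding the variable that raises adjusted R-squared the most, stop when the gain falls below `min_gain`) is an assumption chosen for illustration, and the function names and the synthetic data are made up.

```python
import numpy as np

def adjusted_r2(X, y, cols):
    """Fit OLS with an intercept on the given columns; return adjusted R-squared."""
    n = len(y)
    A = np.column_stack([np.ones(n)] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    p = len(cols)
    return 1.0 - (ss_res / (n - p - 1)) / (ss_tot / (n - 1))

def forward_select(X, y, min_gain=0.01):
    """Greedy forward selection: add the best remaining column until the gain is small."""
    selected, remaining = [], list(range(X.shape[1]))
    best = 0.0  # adjusted R-squared of the intercept-only model
    while remaining:
        scores = {c: adjusted_r2(X, y, selected + [c]) for c in remaining}
        c_best = max(scores, key=scores.get)
        if scores[c_best] - best < min_gain:
            break  # no remaining variable improves the fit enough
        selected.append(c_best)
        remaining.remove(c_best)
        best = scores[c_best]
    return selected, best

# Made-up data: 5 regressors, y depends on columns 0 and 2 only,
# and column 1 is nearly collinear with column 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 1] = X[:, 0] + 0.05 * rng.normal(size=200)
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(size=200)

selected, score = forward_select(X, y)
print("selected columns:", selected, "adjusted R^2:", round(score, 3))
```

Because columns 0 and 1 carry almost the same information, only one of the pair should get picked, and which one can depend on the sample.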





If two variables are collinear, then often only one of them will come out significant when both are in the model. Try running the model twice, each time with only one of the two variables in it, and see which version has the higher R-squared.
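A rough sketch of that comparison in Python with NumPy, in case it helps. The data are simulated (here `x2` is deliberately built as a noisy copy of `x1`, and `y` is generated from `x1` alone), so the variable names and numbers are assumptions for illustration only.

```python
import numpy as np

def r_squared(x, y):
    """R-squared of a simple linear regression of y on x (with intercept)."""
    A = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1.0 - float(resid @ resid) / float(((y - y.mean()) ** 2).sum())

rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
x2 = x1 + 0.7 * rng.normal(size=200)   # noisy copy of x1, so the two are collinear
y = 2.0 * x1 + rng.normal(size=200)    # y is generated from x1 only

# Fit the model twice, once with each variable on its own.
r2_x1, r2_x2 = r_squared(x1, y), r_squared(x2, y)
keep = "x1" if r2_x1 >= r2_x2 else "x2"
print(f"R^2 with x1 only: {r2_x1:.3f}; with x2 only: {r2_x2:.3f}; keep {keep}")
```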
Reply: Some stats software will do it for you. I really like Minitab.





I would suggest that you use a stepwise method to find the "best" model. In this case you add and remove variables (forward selection combined with backward elimination) to find the "best" model. Minitab will do this as well.





I don't like the forward selection method. There are just too many different tracks you can end up taking.





If you don't want to do a stepwise procedure, use backward elimination, so that you start with the largest possible model with all the individual factors and all interactions. Start by removing the highest-level interactions if the p-values indicate you can. Remember that if you have a significant interaction, then its main effects must remain in the model even if there is no evidence that a main effect itself is significant. It is because of this fact, and because of models I've seen in my work where a main effect is not significant but an interaction is, that I don't like forward selection.
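The procedure described above can be sketched roughly as follows in Python with NumPy. Several things here are assumptions made for illustration: the helper names, the simulated data, the restriction to two-way interactions, and the use of a cutoff of |t| < 2 as a rough stand-in for "p-value above about 0.05" (a real analysis would use proper p-values from a stats package).

```python
import numpy as np

def t_stats(A, y):
    """OLS t-statistics for each column of the design matrix A."""
    n, k = A.shape
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sigma2 = float(resid @ resid) / (n - k)
    cov = sigma2 * np.linalg.inv(A.T @ A)
    return beta / np.sqrt(np.diag(cov))

def backward_eliminate(columns, y, t_min=2.0):
    """columns: dict mapping a term name such as 'a' or 'a:b' to its data vector."""
    terms = list(columns)
    while terms:
        A = np.column_stack([np.ones(len(y))] + [columns[t] for t in terms])
        t = dict(zip(terms, np.abs(t_stats(A, y)[1:])))  # skip the intercept
        # A main effect is protected while any interaction containing it remains.
        protected = {m for term in terms if ":" in term for m in term.split(":")}
        candidates = [term for term in terms
                      if term not in protected and t[term] < t_min]
        if not candidates:
            break
        # Drop interactions before main effects, weakest |t| first.
        worst = max(candidates, key=lambda term: (term.count(":"), -t[term]))
        terms.remove(worst)
    return terms

# Simulated data: the a:b interaction matters, but a and b on their own do not.
rng = np.random.default_rng(2)
a, b, c = rng.normal(size=(3, 200))
y = 2.0 * a * b + 1.5 * c + rng.normal(size=200)

full = {"a": a, "b": b, "c": c, "a:b": a * b, "a:c": a * c, "b:c": b * c}
kept = backward_eliminate(full, y)
print("terms kept:", kept)
```

In this setup `a` and `b` stay in the model only because `a:b` does, which is the hierarchy rule from the paragraph above.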
