Which is the best way to measure feature interaction?
The R package gbm implements gradient boosted models and the H-statistic. The H-statistic is not the only way to measure interactions: Variable Interaction Networks (VIN) by Hooker (2004) is an approach that decomposes the prediction function into main effects and feature interactions.
When does feature interaction occur in a prediction model?
Feature interaction occurs when the effect of one feature depends on the value of another feature. In that case the prediction cannot be expressed as the sum of the individual feature effects. The saying attributed to Aristotle, “The whole is greater than the sum of its parts”, applies in the presence of interactions.
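A minimal sketch of this idea, using two made-up prediction functions (the names and coefficients are hypothetical, chosen only for illustration): in the additive model the effect of x1 is the same whatever x2 is, while in the model with a product term it changes with x2.

```python
def additive_model(x1, x2):
    # No interaction: the prediction is a sum of one-feature effects.
    return 2.0 * x1 + 3.0 * x2

def interacting_model(x1, x2):
    # Interaction: the product term makes the effect of x1 depend on x2.
    return 2.0 * x1 + 3.0 * x2 + 4.0 * x1 * x2

def effect_of_x1(model, x2):
    # Change in the prediction when x1 goes from 0 to 1 with x2 held fixed.
    return model(1.0, x2) - model(0.0, x2)

x2_values = (0.0, 1.0, 2.0)
additive_effects = [effect_of_x1(additive_model, x2) for x2 in x2_values]
interacting_effects = [effect_of_x1(interacting_model, x2) for x2 in x2_values]

print(additive_effects)     # the same effect at every x2
print(interacting_effects)  # the effect grows with x2
```

Because the effect of x1 in the second model is not a single number, no sum of a "x1 part" and a "x2 part" can reproduce its predictions.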
How are interactions tested in a regression equation?
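An interaction is tested by adding a term to the model in which the two predictor variables are multiplied. The regression equation then looks like this: Height = B0 + B1*Bacteria + B2*Sun + B3*Bacteria*Sun. Adding an interaction term to a model drastically changes the interpretation of all the coefficients: B1 is no longer the overall effect of Bacteria but its effect when Sun = 0, and B3 is how much that effect changes for each one-unit increase in Sun.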
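As a rough sketch, the equation above can be fitted by adding the product of the two predictors as an extra column in the design matrix. The data here are synthetic and the "true" coefficients are invented for the example; NumPy's least-squares solver stands in for whatever regression software is actually used.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic predictors (illustrative values, not real measurements).
bacteria = rng.uniform(0.0, 10.0, n)
sun = rng.uniform(0.0, 10.0, n)

# Invented "true" model: Height = B0 + B1*Bacteria + B2*Sun + B3*Bacteria*Sun + noise
height = 1.0 + 0.5 * bacteria + 2.0 * sun + 0.3 * bacteria * sun + rng.normal(0.0, 0.05, n)

# Design matrix: intercept, two main effects, and the interaction column.
X = np.column_stack([np.ones(n), bacteria, sun, bacteria * sun])
coef, *_ = np.linalg.lstsq(X, height, rcond=None)
b0, b1, b2, b3 = coef

def bacteria_slope_at(sun_value):
    # With an interaction term, the slope of Bacteria depends on Sun.
    return b1 + b3 * sun_value

print("estimated coefficients:", coef)
print("Bacteria slope at Sun = 0: ", bacteria_slope_at(0.0))
print("Bacteria slope at Sun = 10:", bacteria_slope_at(10.0))
```

In practice one would also look at the standard error or p-value of B3 to judge whether the interaction is statistically significant; this sketch only shows the estimation step.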
How does the interaction statistic work in real life?
The interaction statistic works under the assumption that we can shuffle features independently. If the features correlate strongly, the assumption is violated and we integrate over feature combinations that are very unlikely in reality. That is the same problem that partial dependence plots have.
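The following sketch shows where that assumption enters. It assumes the usual definition of Friedman's pairwise H-statistic (squared), built from centered partial dependence functions; the text above does not spell out the formula, so treat this as one plausible reading. The function names, the toy models, and the data are all hypothetical.

```python
import numpy as np

def partial_dependence(predict, X, features, values):
    # Force the chosen features to fixed values in EVERY row, then average.
    # Pairing fixed values with all observed rows is the step that creates
    # unrealistic feature combinations when features are strongly correlated.
    X_mod = X.copy()
    X_mod[:, features] = values
    return predict(X_mod).mean()

def h_squared(predict, X, j, k):
    n = X.shape[0]
    pd_j = np.array([partial_dependence(predict, X, [j], X[i, [j]]) for i in range(n)])
    pd_k = np.array([partial_dependence(predict, X, [k], X[i, [k]]) for i in range(n)])
    pd_jk = np.array([partial_dependence(predict, X, [j, k], X[i, [j, k]]) for i in range(n)])
    # Center each partial dependence function at zero.
    pd_j -= pd_j.mean()
    pd_k -= pd_k.mean()
    pd_jk -= pd_jk.mean()
    # Share of the joint effect's variance not explained by the two main effects.
    return np.sum((pd_jk - pd_j - pd_k) ** 2) / np.sum(pd_jk ** 2)

def additive(X):
    # Toy model without interaction.
    return 2.0 * X[:, 0] + 3.0 * X[:, 1]

def product(X):
    # Toy model that is a pure interaction of features 0 and 1.
    return X[:, 0] * X[:, 1]

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))  # independent features, so the assumption holds here

print("additive model:", h_squared(additive, X, 0, 1))
print("product model: ", h_squared(product, X, 0, 1))
```

The value is close to 0 for the additive toy model and close to 1 for the product model. The cost grows quadratically with the number of rows, so real implementations usually work on a sample of the data.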
Can a prediction be expressed as the sum of the feature effects?
Only if the features do not interact. When features interact, the effect of one feature depends on the value of the other feature, so the prediction cannot be written as a sum of separate feature effects.
How can I tell how good my predictive model is?
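A residual is the difference between the actual value and the value the model predicts, so the residuals tell us how well the model fits the data. No one can observe the true error term for each observation in the population, but statisticians know that it exists. Determining the true values of the intercept, slope, and error terms is difficult, so they are estimated from a sample. Ordinary Least Squares (OLS) does this by choosing the intercept and slope that minimize the sum of squared residuals.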
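To make residuals and least squares concrete, here is a small sketch with made-up numbers. It fits a straight line with NumPy's least-squares solver, computes the residuals, and compares the residual sum of squares with that of an arbitrary alternative line (the alternative is a hypothetical guess, included only for comparison).

```python
import numpy as np

# Made-up sample: one predictor x and one response y.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Design matrix with an intercept column.
X = np.column_stack([np.ones_like(x), x])

# OLS: choose intercept and slope that minimize the sum of squared residuals.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, slope = coef

# Residual = actual value minus fitted value.
residuals = y - X @ coef
rss_ols = float(np.sum(residuals ** 2))

# Any other line has a residual sum of squares at least as large.
guess_residuals = y - (0.0 + 2.0 * x)
rss_guess = float(np.sum(guess_residuals ** 2))

print("intercept:", intercept, "slope:", slope)
print("residuals:", residuals)
print("RSS (OLS):", rss_ols, "RSS (guess):", rss_guess)
```

Small residuals with no visible pattern suggest a good fit; large or systematically patterned residuals suggest the model is missing something, such as an interaction term.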