Estimating performance when you have fewer variables
Using FTMlm objects to estimate the model performance
FTMlm objects allow you to estimate model performance (R²) for both
the full model and a model that uses only a subset of its variables. Both
estimates are obtained with the rsq function.
# Load the mtcars dataset
data(mtcars)
# Fit a linear model to the mtcars dataset, predicting mpg from cyl, hp, and wt
lm_model <- lm(mpg ~ cyl + hp + wt, data = mtcars)
# Create FTM object using createFTM
ftmlm_model <- createFTM(lm_model) # same as using createFromLm
# Calculate the model performance
rsq(ftmlm_model)
#> [1] 0.84315
# Check the linear model R2
summary(lm_model)$r.squared
#> [1] 0.84315
Notice that the FTMlm model reports an R² value that matches the linear model it was created from. This R² value was calculated using the parameters stored within the object.
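To illustrate why stored summary parameters can be enough, here is a minimal base R sketch that computes R² from a covariance matrix alone, without refitting on the raw data. It is an assumption that this resembles what happens inside the object; the internals of FTMlm are not described here, and `rsq_from_cov` is a hypothetical helper, not part of the package.

```r
# Sketch: R-squared from a covariance matrix alone (base R only).
# `rsq_from_cov` is a hypothetical helper, not part of the FTM package.
rsq_from_cov <- function(S, response, predictors) {
  S_xx <- S[predictors, predictors, drop = FALSE]  # predictor covariances
  s_xy <- S[predictors, response, drop = FALSE]    # predictor-response covariances
  s_yy <- S[response, response]                    # response variance
  # Variance explained by the predictors divided by the total variance
  as.numeric(t(s_xy) %*% solve(S_xx, s_xy) / s_yy)
}

data(mtcars)
S <- cov(mtcars[, c("mpg", "cyl", "hp", "wt")])

# Full model: should agree with summary(lm(mpg ~ cyl + hp + wt, data = mtcars))$r.squared
rsq_from_cov(S, "mpg", c("cyl", "hp", "wt"))

# Subset of predictors: only the matching rows and columns of S are used
rsq_from_cov(S, "mpg", c("cyl", "wt"))
```

Because an ordinary least squares fit with an intercept depends on the data only through means and covariances, dropping a predictor amounts to dropping its row and column from the covariance matrix, which is why a subset model can be evaluated exactly without the original data.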
Estimating model performance when using a subset of variables
# Calculate the model performance for a model using just cyl + wt
rsq(ftmlm_model, select = c("cyl", "wt"))
#> [1] 0.8302274
# Verify we get the same performance as a de novo model
summary(lm(mpg ~ cyl + wt, data = mtcars))$r.squared
#> [1] 0.8302274
# Calculate the model performance for a model using just hp
rsq(ftmlm_model, select = c("hp"))
#> [1] 0.6024373
# Verify we get the same performance as a de novo model
summary(lm(mpg ~ hp, data = mtcars))$r.squared
#> [1] 0.6024373
Notes
Ridge penalties
With a ridge penalty, the estimated model performance is no longer
exact. The estimate is often quite good, but it may slightly under- or
overestimate the true performance. If this is of concern, compare
the model performance at the optimal s with the performance at
s = 0. If the two are similar, consider using the estimate
from the model without the ridge penalty.
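The comparison described above can be sketched in base R. This is only an illustration under stated assumptions: `ridge_rsq` is a hypothetical helper, its penalty `s` is applied to standardized predictors and may not be on the same scale as the package's s, and the value 0.1 is arbitrary rather than an optimal penalty.

```r
# Sketch: in-sample R-squared of a ridge fit, computed from a correlation matrix.
# `ridge_rsq` and its penalty `s` are hypothetical; this is not the FTM package API.
ridge_rsq <- function(R, response, predictors, s = 0) {
  R_xx <- R[predictors, predictors, drop = FALSE]  # predictor correlations
  r_xy <- R[predictors, response, drop = FALSE]    # predictor-response correlations
  # Ridge coefficients on the standardized scale
  b <- solve(R_xx + s * diag(length(predictors)), r_xy)
  # 1 - RSS/TSS on the standardized scale
  as.numeric(2 * t(b) %*% r_xy - t(b) %*% R_xx %*% b)
}

data(mtcars)
R <- cor(mtcars[, c("mpg", "cyl", "hp", "wt")])
vars <- c("cyl", "hp", "wt")

rsq_penalized   <- ridge_rsq(R, "mpg", vars, s = 0.1)  # arbitrary penalty
rsq_unpenalized <- ridge_rsq(R, "mpg", vars, s = 0)    # reduces to ordinary least squares

# A small gap suggests the unpenalized estimate is a reasonable stand-in
abs(rsq_penalized - rsq_unpenalized)
```

With `s = 0` the helper reproduces the ordinary least squares R², so the difference between the two values gives a rough sense of how much the penalty changes the in-sample fit.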