Predicting in new datasets
Using FTM objects to predict outcomes in new datasets
FTM objects can be passed to the predict function to predict
outcomes in new data, just as the original model can.
# Create a "new" dataset
new_data <- mtcars[1:10, c("hp", "wt", "cyl")]
# predict using the FTM from lm model
predict(ftmlm_model, newdata = new_data)
#> mpg
#> Mazda RX4 22.82043
#> Mazda RX4 Wag 22.01285
#> Datsun 710 25.96040
#> Hornet 4 Drive 20.93608
#> Hornet Sportabout 17.16780
#> Valiant 20.25036
#> Duster 360 15.49342
#> Merc 240D 23.76431
#> Merc 230 23.29574
#> Merc 280 19.98901
# predict using the FTM from glm model
predict(ftmglm_model, newdata = new_data)
#> am
#> Mazda RX4 2.24194025
#> Mazda RX4 Wag -0.09117492
#> Datsun 710 3.45752720
#> Hornet 4 Drive -3.20199515
#> Hornet Sportabout -2.16697131
#> Valiant -5.60657399
#> Duster 360 -1.07498527
#> Merc 240D -5.51285476
#> Merc 230 -4.07135061
#> Merc 280 -4.83693440
# predict using the FTM from glmnet model
predict(ftmglmnet_model, newdata = new_data)
#> am
#> Mazda RX4 1.0705100
#> Mazda RX4 Wag -0.1690128
#> Datsun 710 2.1983970
#> Hornet 4 Drive -1.8217099
#> Hornet Sportabout -1.6522074
#> Valiant -3.1097932
#> Duster 360 -0.9237527
#> Merc 240D -2.6330120
#> Merc 230 -1.7972606
#> Merc 280 -2.6627667
FTM objects allow predictions even with missing variables
The major benefit of FTM objects is that they can still produce predictions when the new dataset lacks some of the variables used in the model. This flexibility comes from the structure of the FTM, which is achieved by reweighting the beta coefficients.
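To give some intuition for what reweighting beta coefficients can mean, here is a minimal sketch for ordinary least squares using base R only. It is an illustration under stated assumptions, not the FTM implementation: the formula mpg ~ hp + wt + cyl is assumed from the columns used in this article, and the names full, aux, reweighted, and reduced are hypothetical. The idea is to regress the missing variable on the available ones and fold its coefficient into theirs.

```r
# Illustration only: one way to "reweight" OLS coefficients when a
# predictor is unavailable. This is NOT taken from the FTM source code.

# Full model with all three predictors; "cyl" will be the missing one.
full <- lm(mpg ~ hp + wt + cyl, data = mtcars)

# Auxiliary regression: how the missing predictor relates to the rest.
aux <- lm(cyl ~ hp + wt, data = mtcars)

# Fold the "cyl" coefficient into the intercept, hp, and wt coefficients.
b <- coef(full)
reweighted <- b[c("(Intercept)", "hp", "wt")] + b[["cyl"]] * coef(aux)

# For OLS fitted on the same data, this equals a refit without "cyl".
reduced <- lm(mpg ~ hp + wt, data = mtcars)
all.equal(reweighted, coef(reduced))

# Predict for rows that lack "cyl" using the reweighted coefficients.
new_data <- mtcars[1:10, c("hp", "wt")]
pred <- drop(cbind(1, as.matrix(new_data)) %*% reweighted)
```

For least squares the folded coefficients coincide with those of the reduced model, so no refit is needed at prediction time. Whether the FTM uses this exact calculation for glm or glmnet fits is not stated here.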
# Create a "new" dataset with a missing variable ("cyl")
new_data <- mtcars[1:10, c("hp", "wt")]
# predict using the FTM from lm model
predict(ftmlm_model, newdata = new_data)
#> mpg
#> Mazda RX4 23.57233
#> Mazda RX4 Wag 22.58348
#> Datsun 710 25.27582
#> Hornet 4 Drive 21.26502
#> Hornet Sportabout 18.32727
#> Valiant 20.47382
#> Duster 360 15.59904
#> Merc 240D 22.88707
#> Merc 230 21.99367
#> Merc 280 19.97946
# predict using the FTM from glm model
predict(ftmglm_model, newdata = new_data)
#> am
#> Mazda RX4 1.6494977
#> Mazda RX4 Wag -0.3693964
#> Datsun 710 3.4227907
#> Hornet 4 Drive -3.0612551
#> Hornet Sportabout -2.5413401
#> Valiant -5.1779994
#> Duster 360 -1.0922657
#> Merc 240D -4.5627401
#> Merc 230 -3.0777025
#> Merc 280 -4.3823738
# predict using the FTM from glmnet model
predict(ftmglmnet_model, newdata = new_data)
#> am
#> Mazda RX4 1.0705100
#> Mazda RX4 Wag -0.1690128
#> Datsun 710 2.1983970
#> Hornet 4 Drive -1.8217099
#> Hornet Sportabout -1.6522074
#> Valiant -3.1097932
#> Duster 360 -0.9237527
#> Merc 240D -2.6330120
#> Merc 230 -1.7972606
#> Merc 280 -2.6627667
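The glm and glmnet predictions for am shown above fall outside the range 0 to 1, which suggests they are on the linear-predictor scale rather than probabilities. Assuming the underlying models are binomial with a logit link (an assumption, since the model-fitting code is not shown in this article), a minimal sketch of converting such values to probabilities is given below. The two input values are rounded from the glm output above, and the names log_odds, prob, and predicted_class are hypothetical.

```r
# Assumption: predictions are log-odds from a binomial model with logit link.
# Values rounded from the glm output above, for illustration only.
log_odds <- c("Mazda RX4" = 2.242, "Valiant" = -5.607)

# Inverse logit: 1 / (1 + exp(-x)), available in base R as plogis().
prob <- plogis(log_odds)

# Classify as am = 1 (manual) when the probability exceeds 0.5.
predicted_class <- as.integer(prob > 0.5)
```

A positive log-odds value maps to a probability above 0.5 and a negative one to a probability below 0.5, so the sign of the prediction already indicates the predicted class at that threshold.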