This package provides code for computing the IMV. You can install this package via the following command in R:
devtools::install_github("ben-domingue/imv", ref="main")
The IMV (InterModel Vigorish) is a metric for understanding the predictive differences between two models for binary outcomes. You can read more about the IMV below:
Below we are going to show some examples of how the IMV can be used in logistic regression (with the glm() function in R) and item response theory (IRT) models. To interpret these results, it will help to offer some intuition about IMV values. Given the construction of the IMV as expected profits from a gambling scenario, we can compare values to games of chance. For example, for every $1 a casino bets on a blackjack hand, they expect to take in $0.01 in profit. This translates to an IMV of 0.01. We can also use the fact that coins are more likely to land with the same side facing up as the coin started prior to the toss (see here). If both parties bet $1 on a fair coin toss, you would expect to make $0.019 per toss if you had access to knowledge of the coin's original state; this again translates to an IMV of 0.019.
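To make the gambling intuition concrete, here is a small sketch of the coin arithmetic in base R. It assumes an even-money $1 bet and takes the IMV of one weighted coin relative to another to be (w1 - w0)/w0. The helper names expected_profit and coin_imv are ours, for illustration only, and are not functions from this package.

```r
# Illustrative helpers (hypothetical names, not functions from the imv package)

# Expected profit per $1 staked at even odds when you win with probability w
expected_profit <- function(w) w * 1 + (1 - w) * (-1)

# IMV of a coin with weight w1 relative to a baseline coin with weight w0
coin_imv <- function(w0, w1) (w1 - w0) / w0

# Win probability implied by an IMV of 0.019 relative to a fair coin (w0 = 0.5)
w <- 0.5 * (1 + 0.019)

w                    # 0.5095
expected_profit(w)   # 0.019, up to floating point error
coin_imv(0.5, w)     # 0.019, up to floating point error
```

Against a fair-coin baseline the two quantities coincide, which is why an IMV can be read as profit per dollar staked.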
In Table 1 here, we also offer IMVs from a large number of prediction examples. We describe a few:
- In a recent prediction competition, the most predictive model of whether someone would be evicted had an IMV of 0.005 relative to a simple benchmark model. Similarly, the most predictive model of layoffs had an IMV of 0.01 relative to the standard benchmark.
- Having information on grip and gait relative to just age, sex, and education resulted in an IMV of 0.29 in predicting death by age 90.
- Knowledge of sex and ticket class resulted in an IMV of 0.35 in predicting death on the Titanic.
One of the advantages of the IMV is that it is portable, so the values that we derive below can be directly compared to the values just discussed.
We'll first consider some basic examples using the standard glm() function in R. We'll begin by simulating a binary outcome y based on values of some independent variable x:
y<-rbinom(length(x),1,1/(1+exp(-x))) #y is 0/1 with probability depending on x
We'll then use glm() to analyze the relationship between x and y; we'll call this model m.
Let's now generate a new set of outcomes new.y based on the same observations of x. We can use m to make predictions pr about new.y:
new.y<-rbinom(length(x),1,1/(1+exp(-x))) #these are new outcomes but with same dependence on x
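The steps so far can be put together as one runnable sketch in base R. The seed and the way x is generated (1000 standard normal draws) are assumptions made for illustration, so the exact numbers you get will depend on them.

```r
set.seed(1)                                      # assumed seed, for reproducibility only
x <- rnorm(1000)                                 # assumed: 1000 standard normal draws
y <- rbinom(length(x), 1, 1 / (1 + exp(-x)))     # y is 0/1 with probability depending on x
df <- data.frame(x = x, y = y)

# Logistic regression of y on x
m <- glm(y ~ x, data = df, family = binomial)

# New outcomes with the same dependence on x
new.y <- rbinom(length(x), 1, 1 / (1 + exp(-x)))

# Predicted probabilities from m at the same values of x
pr <- predict(m, type = "response")

head(cbind(new.y, pr))
```

Here pr holds one predicted probability per observation, and mean(y) is the single probability that an intercept-only baseline would assign to everyone.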
We finally come to the calculation of the IMV. Let's calculate the IMV for predictions pr compared to those based on mean(y):
imv.binary(new.y,mean(y),pr) #imv for new set of outcomes
How should we interpret this value of 0.44? In Table 1 here we provide a range of values from empirical studies that this value could be compared to.
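For intuition about what goes into this number, here is a minimal base-R sketch of the IMV calculation as we understand it from the published definition. Each set of predictions is summarized by the weight w of a coin whose expected log-likelihood, w*log(w) + (1-w)*log(1-w), matches the mean log-likelihood of the predictions; the IMV is then (w1 - w0)/w0. The names coin_weight and imv_sketch are hypothetical, and imv.binary() may differ in its details.

```r
# Hypothetical helper: weight w in [0.5, 1) of the coin whose expected
# log-likelihood matches the mean log-likelihood of predictions p for outcomes y.
# Assumes all predictions are strictly between 0 and 1.
coin_weight <- function(y, p) {
  p  <- rep_len(p, length(y))                     # allow a single baseline probability
  ll <- mean(y * log(p) + (1 - y) * log(1 - p))   # mean log-likelihood
  if (ll <= log(0.5)) return(0.5)                 # assumption: floor at a fair coin
  f <- function(w) w * log(w) + (1 - w) * log(1 - w) - ll
  uniroot(f, interval = c(0.5, 1 - 1e-12), tol = 1e-12)$root
}

# Hypothetical helper: IMV of predictions p1 relative to baseline predictions p0
imv_sketch <- function(y, p0, p1) {
  w0 <- coin_weight(y, p0)
  w1 <- coin_weight(y, p1)
  (w1 - w0) / w0
}

# Toy check: 7 ones out of 10, baseline 0.5 versus a prediction of 0.7
y.toy <- c(1, 1, 1, 1, 1, 1, 1, 0, 0, 0)
imv_sketch(y.toy, 0.5, 0.7)   # approximately 0.4, i.e. (0.7 - 0.5) / 0.5
```

In the toy check the predictions match the observed proportion exactly, so the equivalent coin weights are simply 0.5 and 0.7.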
Alternatively, we can compute a cross-fold IMV based on just the model m:
imv.glm(m) #imv for full model versus null model across folds
Note that we get 5 IMV values, one for each fold. We can use the average to summarize this information.
mean(.Last.value) #average imv across folds
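To show what "one value per fold" means, here is a base-R sketch of 5-fold cross-validation for a full model versus a null model. It is an illustration under assumptions and not the internals of imv.glm(): the simulated data, the random fold assignment, and the use of the training-fold mean as the null prediction are all our choices. It stops at the out-of-fold mean log-likelihoods, which are the quantities the IMV compares.

```r
set.seed(1)                                   # assumed seed, for reproducibility only
n <- 1000                                     # assumed sample size
x <- rnorm(n)
y <- rbinom(n, 1, 1 / (1 + exp(-x)))
dat <- data.frame(x = x, y = y)

nfold <- 5
fold <- sample(rep(1:nfold, length.out = n))  # random fold label for each row

# Mean log-likelihood of 0/1 outcomes y under predicted probabilities p
mean_ll <- function(y, p) mean(y * log(p) + (1 - y) * log(1 - p))

res <- sapply(1:nfold, function(k) {
  train <- dat[fold != k, ]
  test  <- dat[fold == k, ]
  full  <- glm(y ~ x, data = train, family = binomial)
  p1 <- predict(full, newdata = test, type = "response")  # full-model predictions
  p0 <- rep(mean(train$y), nrow(test))                    # null model: training mean
  c(null = mean_ll(test$y, p0), full = mean_ll(test$y, p1))
})

res            # 2 x 5 matrix: one column per held-out fold
rowMeans(res)  # average out-of-fold log-likelihood for each model
```

Each column is scored only on observations the model did not see during fitting, so the comparison reflects out-of-sample prediction.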
Finally, we can use the IMV to quantify the role of an individual predictor. We'll do that here by adding a predictor z that is unassociated with the outcomes y:
df$z<-rnorm(nrow(df)) #we are adding a predictor to our data frame. note that it is not associated with y!
These values are, not surprisingly, very near zero!
We can also look at an empirical example focused on diabetes. In these data, we can look at the role of glucose tolerance in predicting diabetes. Glucose tolerance is sometimes used as a screener for diabetes, so it should be no surprise that it is a valuable predictor.
data("PimaIndiansDiabetes", package = "mlbench")
mean(imv.glm(m,var.nm='glucose')) #taking the mean of the IMVs computed for each fold
A value of 0.081 is similar to, for example, the degree to which symptoms were predictive of a COVID diagnosis in the early months of the pandemic (see value of 0.092 in Table 1 here).
We'll now turn to analysis of item response outcomes with the IMV. We'll use some functionality of the mirt package. Let's start by looking at cross-validated predictions from a simple 1PL/Rasch model as compared to predictions based on out-of-sample item means:
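library(mirt) #provides mirt(), expand.table(), and the LSAT7 data
resp <- expand.table(LSAT7) #expand the tabulated LSAT7 response patterns to one row per person
mod1 <- mirt(resp, 1,'Rasch')
imv.mirt(mod1) #compared to item-level means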
As we'd expect, the 1PL/Rasch predictions, which vary between people for the same item, are quite valuable relative to those that vary only between items.
Let's now turn to a second example using data from the IRW. Here we'll compare predictions from the 1PL to those from the 2PL (after imposing weak priors on the discrimination parameters for the 2PL).
library(irw) #see:
dataset <- redivis::user("datapages")$dataset("item_response_warehouse")
df <- dataset$table("kim2023")$to_data_frame()
if (all(items %in% 1:length(items))) {
PRIOR = (1-",ni,", a1, lnorm, 0.0, 1.0)",sep="")
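The PRIOR line is part of a model-syntax string. As a sketch, here is how such a string can be assembled in base R. The value ni = 10 is a hypothetical number of items, and the commented lines show how we would expect the string to be passed to mirt; that part is an assumption and is not run here.

```r
ni <- 10  # hypothetical number of items; in practice, the number of item columns

# One factor F measured by items 1..ni, with a lognormal(0, 1) prior
# on each item's slope (discrimination) parameter a1
s <- paste("F = 1-", ni, "\n",
           "PRIOR = (1-", ni, ", a1, lnorm, 0.0, 1.0)", sep = "")
cat(s, "\n")

# Assumed usage with the mirt package (needs mirt and a response matrix resp):
# model <- mirt::mirt.model(s)
# mod2  <- mirt::mirt(resp, model, itemtype = "2PL")
```

A weak prior of this kind keeps the 2PL discriminations positive and away from extreme values without strongly constraining them.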
You can compare these values to what we get in simulation (Figure 1) and to results from other empirical data (Figure 4) here.