Clean-up for 0.3.0 release
tripartio committed Feb 13, 2024
1 parent 8f05bcd commit 5ec9dd6
Showing 2 changed files with 10 additions and 16 deletions.
14 changes: 6 additions & 8 deletions vignettes/ale-intro.Rmd
@@ -126,17 +126,16 @@ By default, most core functions in the `{ale}` package use parallel processing.

To access the plot for a specific variable, we can call it by its variable name as an element of the `plots` element. These are `ggplot` objects, so they are easy to manipulate. For example, to access and print the `carat` ALE plot, we simply call `ale_gam_diamonds$plots$carat` :

-```{r print carat, fig.width=3.5, fig.width=4}
+```{r print-carat, fig.width=3.5, fig.width=4}
# Print a plot by entering its reference
ale_gam_diamonds$plots$carat
```

To iterate the list and plot all the ALE plots, we provide here some demonstration code using the `patchwork` package, which arranges multiple plots in a common plot grid with `patchwork::wrap_plots()`. We pass the list of plots as the first argument and specify two plots per row with the `ncol` argument.

-```{r print ale_simple, fig.width=7, fig.height=11}
+```{r print-ale_simple, fig.width=7, fig.height=11}
# Print all plots
-ale_gam_diamonds$plots |>
-  patchwork::wrap_plots(ncol = 2)
+patchwork::wrap_plots(ale_gam_diamonds$plots, ncol = 2)
```
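The change in this hunk swaps a piped call for a direct one. A minimal sketch of why the two forms should give the same grid, assuming `ggplot2` and `patchwork` are installed and R is at least 4.1 for the native pipe; `demo_plots` is a hypothetical stand-in built from `mtcars`, not the `ale_gam_diamonds$plots` object from the vignette:

```r
library(ggplot2)
library(patchwork)

# Hypothetical stand-in for ale_gam_diamonds$plots: a named list of ggplot objects
vars <- c("wt", "hp", "disp", "qsec")
demo_plots <- lapply(vars, function(v) {
  ggplot(mtcars, aes(x = .data[[v]], y = mpg)) +
    geom_point() +
    labs(x = v)
})
names(demo_plots) <- vars

# Direct call: the list of plots is the first argument
grid_direct <- wrap_plots(demo_plots, ncol = 2)

# Piped call: the native pipe inserts the list as the first argument
grid_piped <- demo_plots |> wrap_plots(ncol = 2)

# Each element is an ordinary ggplot object, so it can be modified by name
wt_plot <- demo_plots$wt + theme_minimal()

# Printing the patchwork object draws the 2-column grid
grid_direct
```

Both calls build a patchwork object from the same list, so the choice between them is stylistic.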

## Bootstrapped ALE
@@ -164,8 +163,7 @@ ale_gam_diamonds_boot <- ale(
)
# Bootstrapping produces confidence intervals
-ale_gam_diamonds_boot$plots |>
-  patchwork::wrap_plots(ncol = 2)
+patchwork::wrap_plots(ale_gam_diamonds_boot$plots, ncol = 2)
```

In this case, the bootstrapped results are mostly similar to the single (non-bootstrapped) ALE result. In principle, we should always bootstrap the results and trust only bootstrapped results. The most unusual result is that values of `x_length` (the length of the diamond) from 6.2 mm or so and higher are associated with lower diamond prices. When we compare this with the `y_width` value (width of the diamond), we suspect that when both the length and width (that is, the size) of a diamond become increasingly large, the price increases so much more rapidly with the width than with the length that the width has an inordinately high effect that is tempered by a decreased effect of the length at those high values. This would be worth further exploration for real analysis, but here we are just introducing the key features of the package.
@@ -187,7 +185,7 @@ Like the `ale()` function, the `ale_ixn()` returns a list with one element per i

Again, we provide here some demonstration code to plot all the ALE plots. It is a little more complex this time because of the two levels of interacting variables in the output data, so we use the `purrr` package to iterate the list structure. `purrr::walk()` takes a list as its first argument and then we specify an anonymous function for what we want to do with each element of the list. We specify the anonymous function as `\(.x1) {...}` where `.x1` in our case represents each individual element of `ale_ixn_gam_diamonds$plots` in turn, that is, a sublist of plots with which the x1 variable interacts. We print the plots of all the x1 interactions as a combined grid of plots with `patchwork::wrap_plots()`, as before.

-```{r print all ale_ixn, fig.width=7, fig.height=7}
+```{r print-all-ale_ixn, fig.width=7, fig.height=7}
# Print all interaction plots
ale_ixn_gam_diamonds$plots |>
# extract list of x1 ALE outputs
@@ -201,7 +199,7 @@ ale_ixn_gam_diamonds$plots |>

Because we are printing all plots together with the same `patchwork::wrap_plots()` statement, some of them might appear vertically distorted because each plot is forced to be of the same height. For more fine-tuned presentation, we would need to refer to a specific plot. For example, we can print the interaction plot between carat and depth by referring to it thus: `ale_ixn_gam_diamonds$plots$carat$depth`.

-```{r print specific ixn, fig.width=5, fig.height=3}
+```{r print-specific-ixn, fig.width=5, fig.height=3}
ale_ixn_gam_diamonds$plots$carat$depth
```
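The vignette's own `purrr::walk()` chunk is cut off by the hunk boundary, so the following is only a rough sketch of the pattern the prose describes, not the author's code. It assumes `ggplot2`, `patchwork` and `purrr` are installed; `make_plot()` and `demo_ixn_plots` are hypothetical stand-ins for the two-level `ale_ixn_gam_diamonds$plots` list:

```r
library(ggplot2)
library(patchwork)
library(purrr)

# Hypothetical helper: one scatter plot per pair of variables
make_plot <- function(x1, x2) {
  ggplot(mtcars, aes(x = .data[[x1]], y = .data[[x2]])) +
    geom_point() +
    labs(title = paste(x1, "x", x2))
}

# Hypothetical two-level list: first level is x1, second level is x2
demo_ixn_plots <- list(
  wt = list(hp = make_plot("wt", "hp"), disp = make_plot("wt", "disp")),
  hp = list(disp = make_plot("hp", "disp"))
)

# For each x1 sublist, print all of its interaction plots as one grid
demo_ixn_plots |>
  walk(\(.x1) {
    print(wrap_plots(.x1, ncol = 2))
  })

# A single interaction plot is reached by naming both levels
one_plot <- demo_ixn_plots$wt$disp
one_plot
```

`walk()` is used rather than `map()` because only the side effect of printing matters here; it returns its input invisibly.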

12 changes: 4 additions & 8 deletions vignettes/articles/ale-ALEPlot.Rmd
@@ -160,8 +160,7 @@ Since the plots are saved as a list, they can easily be printed out all at once:

```{r ale nnet one-way plots, fig.width=7, fig.height=5}
# Print plots
-nn_ale$plots |>
-  patchwork::wrap_plots()
+patchwork::wrap_plots(nn_ale$plots)
```

The `{ale}` package plots have various features that enhance interpretability:
@@ -177,8 +176,7 @@ It might not be clear that the previous plots display exactly the same data as t
# Zero-centred ALE
nn_ale <- ale(DAT, nnet.DAT, pred_type = "raw", relative_y = 'zero')
-nn_ale$plots |>
-  patchwork::wrap_plots()
+patchwork::wrap_plots(nn_ale$plots)
```

With these zero-centred plots, the full range of y values and the rug plots give some context that aids interpretation. (If the rugs look slightly different, it is because they are randomly jittered to avoid overplotting.)
@@ -302,8 +300,7 @@ gbm_ale_link <- url('https://github.com/tripartio/ale/raw/main/download/gbm_ale_
readRDS()
# Print plots
-gbm_ale_link$plots |>
-  patchwork::wrap_plots(ncol = 2)
+patchwork::wrap_plots(gbm_ale_link$plots, ncol = 2)
```

Now we generate ALE data for all two-way interactions and then plot them. Again, note the interaction between `age` and `hours_per_week`. The interaction is minimal except for the extremely high cases of hours per week.
@@ -374,8 +371,7 @@ gbm_ale_prob <- url('https://github.com/tripartio/ale/raw/main/download/gbm_ale_
readRDS()
# Print plots
-gbm_ale_prob$plots |>
-  patchwork::wrap_plots(ncol = 2)
+patchwork::wrap_plots(gbm_ale_prob$plots, ncol = 2)
```

Finally, we again generate two-way interactions, this time based on probabilities instead of on log odds. However, probabilities might not be the best choice for indicating interactions because, as we see from the rugs in the one-way ALE plots, the GBM model heavily concentrates its probabilities in the extremes near 0 and 1. Thus, the plots' suggestions of strong interactions are likely exaggerated. In this case, the log odds ALEs shown above are probably more relevant.