`R/plot.R`

`plot.ShapDependence.Rd`

Plot Shapley value-based variable dependence curves with ggplot2, optionally restricted to selected target variable(s). It can also show the interaction between a related variable and the selected variable(s).

```
# S3 method for ShapDependence
plot(
  x,
  target_var = NA,
  related_var = NA,
  sample_prop = 0.3,
  sample_bin = 100,
  smooth_line = TRUE,
  seed = 123,
  ...
)
```

- x (`ShapDependence`) The variable dependence object to plot, typically the return value of `shap_dependence`.
- target_var (`vector` of `character`) The target variable(s) to plot. If it is `NA` (the default), all variables will be plotted.
- related_var (`character`) The related variable to plot together with the target variables. If it is `NA` (the default), no related variable will be plotted.
- sample_prop (`numeric`) The proportion of points to sample for plotting. It will be ignored if the number of points is less than 1000. The default is `0.3`.
- sample_bin (`integer`) The number of bins to use for stratified sampling. It will be ignored if the number of points is less than 1000. The default is 100.
- smooth_line (`logical`) Whether to fit the smooth line or not. The default is `TRUE`.
- seed (`integer`) The seed for sampling. It will be ignored if the number of points is less than 1000. The default is 123.
- ... Other arguments passed on to `geom_smooth`, mainly `method` and `formula`, to fit the smooth line. Note that the same arguments are used for all target variables. To use different arguments for different variables, plot the variables one at a time.
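To illustrate what `method` and `formula` control, here is a minimal sketch that calls `geom_smooth` directly on synthetic data. The data frame `toy` and its columns are made up for illustration and are not produced by this package; only the `geom_smooth` arguments are the point.

```r
library(ggplot2)

# Synthetic stand-in for (variable value, Shapley value) pairs; hypothetical data
set.seed(123)
toy <- data.frame(x = runif(200, 0, 10))
toy$y <- 0.5 * (toy$x - 5)^2 + rnorm(200)

# `method` selects the fitting function, `formula` the model it fits
p <- ggplot(toy, aes(x = x, y = y)) +
  geom_point(alpha = 0.4) +
  geom_smooth(method = "lm", formula = y ~ poly(x, 2))
p
```

Assuming the arguments are forwarded unchanged, the corresponding call on a `ShapDependence` object would presumably look like `plot(var_dependence, target_var = "bio1", method = "lm", formula = y ~ poly(x, 2))`, repeated with different settings for each variable that needs its own fit.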

A `ggplot2` figure of the dependence curves.

If there are more than 1000 samples, stratified sampling is used to thin the sample pool, and only the resulting subset is plotted. The user can set the proportion to sample (`sample_prop`) and the number of bins for the stratified sampling (`sample_bin`).
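The thinning step can be sketched in base R as follows. This is an illustration only, not the package's actual implementation: the helper `stratified_thin` is hypothetical, and binning on equal-width intervals of the variable is an assumption about how the strata are formed.

```r
# Hypothetical helper, not part of itsdm: thin points by sampling a fixed
# proportion of rows within equal-width bins of one column.
stratified_thin <- function(df, x_col, sample_prop = 0.3, sample_bin = 100,
                            seed = 123, min_points = 1000) {
  # Small data sets are returned unchanged, following the documented threshold
  if (nrow(df) <= min_points) return(df)
  set.seed(seed)  # note: this resets the global RNG state
  # Assign each row to one of `sample_bin` equal-width bins (assumed strata)
  bins <- cut(df[[x_col]], breaks = sample_bin, labels = FALSE)
  # Within each non-empty bin, keep about `sample_prop` of the rows, at least one
  keep <- unlist(lapply(split(seq_len(nrow(df)), bins), function(idx) {
    n_keep <- max(1, round(length(idx) * sample_prop))
    idx[sample.int(length(idx), n_keep)]
  }))
  df[sort(keep), , drop = FALSE]
}

# Synthetic points standing in for variable values and their Shapley values
set.seed(1)
pts <- data.frame(x = runif(5000), shap = rnorm(5000))
thinned <- stratified_thin(pts, "x")
nrow(thinned)
```

Sampling within bins rather than over the whole pool keeps sparse regions of the variable's range represented, so the thinned plot should preserve the overall shape of the curve.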

```
# \donttest{
# Using a pseudo presence-only occurrence dataset of
# virtual species provided in this package
library(dplyr)
library(sf)
library(stars)
library(itsdm)

# Prepare data
data("occ_virtual_species")
obs_df <- occ_virtual_species %>% filter(usage == "train")
eval_df <- occ_virtual_species %>% filter(usage == "eval")
x_col <- "x"
y_col <- "y"
obs_col <- "observation"

# Format the observations
obs_train_eval <- format_observation(
  obs_df = obs_df, eval_df = eval_df,
  x_col = x_col, y_col = y_col, obs_col = obs_col,
  obs_type = "presence_only")

# Load the environmental variables and keep four bands
env_vars <- system.file(
  'extdata/bioclim_tanzania_10min.tif',
  package = 'itsdm') %>% read_stars() %>%
  slice('band', c(1, 5, 12, 16))

# Fit the model in imperfect_presence mode
mod <- isotree_po(
  obs_mode = "imperfect_presence",
  obs = obs_train_eval$obs,
  obs_ind_eval = obs_train_eval$eval,
  variables = env_vars, ntrees = 20,
  sample_size = 0.8, ndim = 2L,
  seed = 123L, response = FALSE,
  spatial_response = FALSE,
  check_variable = FALSE)

# Compute Shapley value-based variable dependence
var_dependence <- shap_dependence(
  model = mod$model,
  var_occ = mod$vars_train,
  variables = mod$variables)

# Plot bio1 together with the related variable bio12
plot(var_dependence, target_var = 'bio1', related_var = 'bio12')
# }
```