`R/variable_contrib.R`

`variable_contrib.Rd`

Evaluate variable contribution for targeted observations according to SHapley Additive exPlanations (SHAP).

```
variable_contrib(
  model,
  var_occ,
  var_occ_analysis,
  shap_nsim = 100,
  visualize = FALSE,
  seed = 10,
  pfun = .pfun_shap
)
```

- model
  (`isolation_forest` or other model) The SDM. It can be the item `model` of the `POIsotree` object made by function `isotree_po`. It can also be another user-fitted model, as long as `pfun` works on it.
- var_occ
  (`data.frame`, `tibble`) The `data.frame`-style table that includes values of environmental variables at occurrence locations.
- var_occ_analysis
  (`data.frame`, `tibble`) The `data.frame`-style table that includes values of environmental variables at the occurrence locations to analyze. It can be `var_occ`, a subset of it, or any new dataset.
- shap_nsim
  (`integer`) The number of Monte Carlo repetitions used in the SHAP method to estimate each Shapley value. See the documentation of function `explain` in package `fastshap` for details.
- visualize
  (`logical`) If `TRUE`, plot the response curves. The default is `FALSE`.
- seed
  (`integer`) The seed for any random process. The default is `10L`.
- pfun
  (`function`) The predict function, which requires two arguments, `object` and `newdata`. It is only required when `model` is not an `isolation_forest`. The default is the wrapper function designed for the iForest model in `itsdm`.
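As a minimal sketch of the `pfun` contract described above, a custom predict function for a model other than `isolation_forest` takes `object` and `newdata` and, presumably, returns one numeric prediction per row. The name `pfun_glm` and the logistic regression on the built-in `mtcars` data are illustrative assumptions, not part of `itsdm`.

```r
# Hypothetical example: a logistic regression standing in for a user-fitted model.
fit <- stats::glm(am ~ wt + hp, data = mtcars, family = stats::binomial())

# Custom predict function with the two required arguments.
# It returns a plain numeric vector of predicted probabilities.
pfun_glm <- function(object, newdata) {
  as.numeric(stats::predict(object, newdata = newdata, type = "response"))
}

# Quick check on a few rows of predictor values.
preds <- pfun_glm(fit, mtcars[1:5, c("wt", "hp")])
print(preds)
```

A function like this could then be supplied as `pfun = pfun_glm` together with `model = fit`, assuming `var_occ` and `var_occ_analysis` hold the same predictor columns the model was fitted on.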

(`VariableContribution`) A list of

- shapley_values (`data.frame`) A table of Shapley values of each variable for all observations
- feature_values (`tibble`) A table of values of each variable for all observations
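One common way to read a table shaped like `shapley_values` is to rank variables by their mean absolute Shapley value. The sketch below uses a small made-up `data.frame` in place of real output; the column names `bio1`, `bio5`, `bio12` and all numbers are invented for illustration.

```r
# Made-up stand-in for the shapley_values element:
# one row per observation, one column per variable.
shapley_values <- data.frame(
  bio1  = c(0.10, -0.20, 0.30),
  bio5  = c(-0.05, 0.05, 0.00),
  bio12 = c(0.40, 0.20, -0.30)
)

# Mean absolute Shapley value per variable, largest first.
importance <- sort(colMeans(abs(shapley_values)), decreasing = TRUE)
print(importance)
```

Taking absolute values first keeps positive and negative contributions from cancelling out, so the ranking reflects how strongly each variable moves the prediction rather than in which direction.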

`plot.VariableContribution`, `explain` in `fastshap`
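To make the role of `shap_nsim` more concrete, here is a rough sketch of a sampling estimator for a single Shapley value, in the style of Štrumbelj and Kononenko. It is not the `fastshap` implementation: `mc_shapley` and its arguments are hypothetical names, and a linear model on the built-in `mtcars` data stands in for an SDM.

```r
# Hypothetical helper: Monte Carlo estimate of the Shapley value of one
# feature for a single observation `x` (a one-row data.frame).
mc_shapley <- function(object, X, x, feature, nsim, pfun) {
  p <- ncol(X)
  j <- match(feature, names(X))
  diffs <- numeric(nsim)
  for (i in seq_len(nsim)) {
    w <- X[sample.int(nrow(X), 1), , drop = FALSE]  # random background row
    perm <- sample.int(p)                           # random feature order
    before <- perm[seq_len(which(perm == j) - 1)]   # features preceding j
    hybrid <- w
    for (k in before) hybrid[[k]] <- x[[k]]         # those come from x
    without_j <- hybrid                             # feature j still from w
    with_j <- hybrid
    with_j[[j]] <- x[[j]]                           # feature j from x
    diffs[i] <- pfun(object, with_j) - pfun(object, without_j)
  }
  mean(diffs)  # average marginal contribution over nsim draws
}

set.seed(10)
X <- mtcars[, c("wt", "hp", "disp")]
fit <- stats::lm(mpg ~ wt + hp + disp, data = mtcars)
pfun_lm <- function(object, newdata) {
  as.numeric(stats::predict(object, newdata = newdata))
}
x <- X[1, , drop = FALSE]

phi_wt <- mc_shapley(fit, X, x, "wt", nsim = 500, pfun = pfun_lm)

# For a linear model the exact value is coef * (x_j - mean(X_j)),
# so the estimate should land close to this reference.
exact_wt <- unname(coef(fit)["wt"]) * (x$wt - mean(X$wt))
print(c(estimate = phi_wt, exact = exact_wt))
```

Increasing `nsim` reduces the sampling noise of the estimate at the cost of more calls to the predict function, which is the trade-off `shap_nsim` controls.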

```
# Using a pseudo presence-only occurrence dataset of
# virtual species provided in this package
library(dplyr)
library(sf)
library(stars)
library(itsdm)
# Prepare data
data("occ_virtual_species")
obs_df <- occ_virtual_species %>% filter(usage == "train")
eval_df <- occ_virtual_species %>% filter(usage == "eval")
x_col <- "x"
y_col <- "y"
obs_col <- "observation"
# Format the observations
obs_train_eval <- format_observation(
  obs_df = obs_df, eval_df = eval_df,
  x_col = x_col, y_col = y_col, obs_col = obs_col,
  obs_type = "presence_only")

env_vars <- system.file(
  'extdata/bioclim_tanzania_10min.tif',
  package = 'itsdm') %>% read_stars() %>%
  slice('band', c(1, 5, 12))

# With imperfect_presence mode,
mod <- isotree_po(
  obs_mode = "imperfect_presence",
  obs = obs_train_eval$obs,
  obs_ind_eval = obs_train_eval$eval,
  variables = env_vars, ntrees = 5,
  sample_size = 0.8, ndim = 1L,
  seed = 123L, response = FALSE,
  spatial_response = FALSE,
  check_variable = FALSE)

var_contribution <- variable_contrib(
  model = mod$model,
  var_occ = mod$vars_train,
  var_occ_analysis = mod$vars_train %>% slice(1:2))

if (FALSE) {
  plot(var_contribution,
    num_features = 3,
    plot_each_obs = TRUE)

  # Plot together
  plot(var_contribution)
}
```