Function to compute estimates on external dataset
Source: R/apply_OTR.R
compute_estimates_external.Rd
Function to compute estimates on external dataset
Usage
compute_estimates_external(
df,
Y_name,
A_name,
W_list,
Z_list,
CATE_preds = NULL,
d_pred = NULL,
threshold = 0,
nuisance_models = NULL,
sl.library.outcome = NULL,
sl.library.treatment = NULL,
sl.library.missingness = NULL,
k_folds = 5,
ps_trunc_level = 0.01,
outcome_type = "gaussian",
id_name = NULL
)
Arguments
- df
dataframe containing the external dataset to apply the rule(s) to; must contain the Z_list variables, named the same as in the pre-trained rule(s)
- Y_name
name of outcome variable in df
- A_name
name of treatment variable in df
- W_list
character vector containing names of covariates in the dataframe to be used for fitting nuisance models
- Z_list
character vector containing names of variables in df used to fit CATE model (variables used in treatment rule; must be same names as used in pre-fit CATE model(s))
- CATE_preds
dataframe with CATE predictions from each CATE model and truncation flags for extreme predictions
- d_pred
vector of binary treatment decisions (supply this if you already have decisions and only want to estimate effects)
- threshold
character vector of decision thresholds applied to the CATE to determine the OTR. Values should be positive if `Y_name` is a desirable outcome and negative if `Y_name` is an undesirable outcome. For a threshold of 0, use +0 for a desirable outcome and -0 for an undesirable outcome.
- nuisance_models
list of objects of class `Nuisance` containing outcome, treatment, and missingness SuperLearner models (only include if using pre-fit nuisance models)
- sl.library.outcome
character vector of SuperLearner libraries to use to fit the outcome models (if fitting nuisance internally)
- sl.library.treatment
character vector of SuperLearner libraries to use to fit the treatment models (if fitting nuisance internally)
- sl.library.missingness
character vector of SuperLearner libraries to use to fit the missingness models (if fitting nuisance internally)
- k_folds
integer number of folds to use for nuisance model cross-validation (must be specified if nuisance models are fit within this function)
- ps_trunc_level
numeric level below which propensity scores will be truncated (to avoid errors in computing AIPTW)
- outcome_type
type of outcome Y: continuous (outcome_type = "gaussian") or binary (outcome_type = "binomial"); only needed if not providing pre-fit nuisance models
- id_name
name of participant ID variable if present in data
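The `threshold` and `ps_trunc_level` arguments can be pictured with a small base R sketch. This is only one reading of the argument descriptions above, not the package's implementation: `decide_from_threshold()` and `truncate_ps()` are hypothetical helpers, and the direction of each comparison (treat when the CATE is above a positive threshold, or below a negative one) is an assumption.

```r
# Hypothetical helper (not part of the package): turn CATE predictions into
# binary treatment decisions using a signed character threshold.
decide_from_threshold <- function(cate, threshold) {
  stopifnot(is.character(threshold), length(threshold) == 1)
  cutoff <- as.numeric(threshold)
  # The sign is read from the text, so "-0" and "+0" stay distinguishable.
  undesirable <- startsWith(threshold, "-")
  if (undesirable) {
    as.integer(cate < cutoff)  # assumed: treat when treatment lowers Y enough
  } else {
    as.integer(cate > cutoff)  # assumed: treat when treatment raises Y enough
  }
}

# Hypothetical helper: raise propensity scores below ps_trunc_level to that
# level, so inverse-probability weights in AIPTW stay bounded.
truncate_ps <- function(ps, ps_trunc_level = 0.01) {
  pmax(ps, ps_trunc_level)
}

cate <- c(-0.20, -0.01, 0.03, 0.15)
d_desirable   <- decide_from_threshold(cate, "+0.05")
d_undesirable <- decide_from_threshold(cate, "-0")
ps_trunc      <- truncate_ps(c(0.001, 0.20, 0.60))
```

Writing the zero threshold as "+0" or "-0" is what lets the sign carry the outcome direction even when the cutoff itself is 0.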
Value
List of "Results" objects, one for each threshold. Each Results object contains:
- aggregated_results
Results averaged across the nuisance models
- k_fold_results
Results for each nuisance model
- decision_df
Dataframe with the CATE prediction from every individual CATE_model, a flag for CATE truncation, the average CATE prediction used for the treatment decision, and the treatment decision
- k_non_na
Equal to the number of k_folds used for the nuisance models. Can be ignored: it is a placeholder that keeps the same format as the Results object from the internal compute estimates function
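A call might look roughly like the sketch below, which simulates a small dataset and supplies existing decisions through `d_pred`. The simulated data, the rule used to build `d_pred`, and the choice of `SL.mean` and `SL.glm` (standard SuperLearner wrappers) are illustrative assumptions. It is also an assumption that supplying `d_pred` without `CATE_preds` is sufficient and that the returned Results objects can be accessed with `$`. The call is guarded so that the block only runs it when the function is available in the session.

```r
set.seed(1)
n <- 200

# Simulated external dataset (illustrative only)
df <- data.frame(
  W1 = rnorm(n),
  W2 = rbinom(n, 1, 0.5),
  A  = rbinom(n, 1, 0.5)
)
df$Y <- 1 + 0.5 * df$W1 + df$A * (0.3 + 0.4 * df$W2) + rnorm(n)

# Existing treatment decisions: an illustrative rule that treats when W2 == 1
d_pred <- as.integer(df$W2 == 1)

# Only attempt the call if the function has been loaded from its package
if (exists("compute_estimates_external", mode = "function")) {
  results <- compute_estimates_external(
    df = df,
    Y_name = "Y",
    A_name = "A",
    W_list = c("W1", "W2"),
    Z_list = c("W2"),
    d_pred = d_pred,
    sl.library.outcome = c("SL.mean", "SL.glm"),
    sl.library.treatment = c("SL.mean", "SL.glm"),
    sl.library.missingness = c("SL.mean", "SL.glm"),
    k_folds = 5,
    ps_trunc_level = 0.01,
    outcome_type = "gaussian"
  )

  # One Results object per threshold; inspect the first
  res <- results[[1]]
  print(res$aggregated_results)  # averaged over nuisance models
  print(head(res$decision_df))   # per-participant predictions and decisions
}
```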