
Abstract

Optimal treatment rules (OTRs) are decision-making policies that assign treatment based on an individual's observed characteristics in a way that maximizes their expected outcome. While many methods exist for the creation of such rules, we propose a novel framework that creates treatment recommendations conditional on a subset of available information and constrains these recommendations to individuals with a clinically relevant conditional average treatment effect (CATE). We make use of a nested cross-validation procedure with Super Learning to estimate a doubly robust (DR)-learner and recommend treatment to patients whose predicted treatment effect meets a pre-set threshold. We then use an augmented inverse probability of treatment weighted (AIPTW) estimator to estimate several quantities describing outcomes under the OTR: (1) the proportion treated, (2) the overall Average Treatment effect under the Rule ($ATR_d$), (3) the Average Treatment effect in the subgroup Recommended Treatment under the rule ($ATRT_d$), (4) the Average Treatment effect in the subgroup Not Recommended Treatment under the rule ($ATNRT_d$), and (5) the difference between treatment recommendation subgroups ($ATRT_d - ATNRT_d$). The doubly robust properties of the DR-learner and the AIPTW estimator ensure accurate estimation of the CATE and the estimands of interest as long as an appropriate combination of the nuisance parameters used to build them is consistently estimated. This vignette provides a brief introduction to the drotr package and demonstrates its use on a simulated dataset.

Introduction

Optimal treatment rules (OTRs) are policies that assign treatment based on an individual's observed characteristics to maximize the counterfactual outcome of individual patients. OTRs are of increasing interest in fields like healthcare, given that treatments can have varying effects across population subgroups. While there are many ways to define OTRs, one commonly used estimand is the Conditional Average Treatment Effect (CATE). The CATE measures the expected difference in counterfactual outcomes between treatments given a subset of covariates. CATE estimation often relies on assumptions that, if violated, can lead to inaccurate results. Additionally, recommending treatment to all individuals with a CATE indicating they benefit from treatment may not be practical due to challenges such as resource limitations or population-level consequences of overprescribing.

We overcome these challenges through a doubly robust method for learning OTRs and estimating average treatment effects under the rules. Our OTRs are constrained to assign treatment to individuals with a clinically relevant treatment effect size. We use a doubly robust (DR)-learner to estimate the CATE, ensuring accurate estimation if at least some combination of nuisance parameters is consistently estimated (Kennedy 2023). A Super Learner ensemble is used to optimally combine candidate models for treatment rule estimation in a way that minimizes cross-validated risk and uncovers treatment effect heterogeneity more effectively than traditional approaches (Luedtke and van der Laan 2016; Montoya et al. 2023). Given treatment decisions made under the OTR, we identify the following quantities to describe outcomes under the OTR: (1) the proportion treated, (2) the Average Treatment effect under the Rule ($ATR_d$), (3) the Average Treatment effect in the subgroup Recommended Treatment under the rule ($ATRT_d$), (4) the Average Treatment effect in the subgroup Not Recommended Treatment under the rule ($ATNRT_d$), and (5) the difference between these subgroups ($ATRT_d - ATNRT_d$). We estimate these quantities using the augmented inverse probability of treatment weighted (AIPTW) estimator, which shares the doubly robust property of the CATE estimator (Kurz 2022). Expected outcomes under different OTRs can then be compared to identify which covariates are most informative and thus should be focused on in policymaking. Our method is unbiased and achieves 95% confidence interval coverage of data-adaptive parameters.

Methods

Notation

Let $W \in \mathcal{W}$ denote the full set of covariates in the data. We then consider a subset of these covariates, $Z \subseteq W$, used to create our OTR. Let $A \in \{0,1\}$ be a binary treatment decision, where $A = 1$ if an observation is assigned treatment and $A = 0$ otherwise. Lastly, let $Y \in \mathcal{Y}$ be a real-valued outcome variable. We denote missingness in the outcome $Y$ using $\Delta \in \{0,1\}$, where $\Delta = 0$ if the outcome is missing and $\Delta = 1$ if the outcome is observed. We observe an independent and identically distributed (i.i.d.) sample of observations $O_i = (W_i, A_i, \Delta_i, \Delta_i Y_i) \sim P_0 \in \mathcal{P}$, where $P_0$ is an unknown distribution in model $\mathcal{P}$.

We introduce the counterfactual outcome $Y(a, \Delta = 1) \sim P_{0, a, \Delta = 1}$ for every treatment assignment $a \in \mathcal{A}$. The counterfactual outcome represents the outcome that would have been observed if, perhaps contrary to actual assignments, each observation was assigned treatment $a$ and no outcomes were missing.

Parameters of Interest

The counterfactual outcomes can be used to quantify the effect of treatment in subgroups of patients with characteristics $Z$ through the Conditional Average Treatment Effect (CATE). We define the CATE as $\psi(z) = E[Y(a = 1, \Delta = 1) \mid Z = z] - E[Y(a = 0, \Delta = 1) \mid Z = z]$. The CATE is the expected difference in counterfactual outcomes under treatment versus placebo in the subgroup of patients with covariate values $Z = z$. If the outcome is binary, $\psi(z)$ represents the causal risk difference in the sub-population with covariate values $Z = z$, while if the outcome is continuous, $\psi(z)$ represents the causal difference in expected outcome for the same subpopulation.

Consider a decision rule $d$ that assigns treatment to individuals based on their specific covariate values. We wish to consider an optimal treatment rule that assigns treatment to all individuals who would benefit and to none who would be harmed. To do this, we use the CATE to build our rule such that $d(Z) = I(\psi(Z) < 0)$ if $Y$ is an undesirable outcome and $d(Z) = I(\psi(Z) > 0)$ if $Y$ is a desirable outcome. We may further constrain these OTRs to only recommend treatment to individuals with a CATE of magnitude at least $t$:

$$ d(Z) = \begin{cases} 0 & \psi(Z) < t \\ 1 & \text{otherwise.} \end{cases} $$

By constraining our OTR, we are able to identify subgroups that would experience the greatest benefit under the OTR, balancing the benefits of treatment against potential resource restrictions or limitations.
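For a beneficial outcome, the constrained rule simply compares each individual's estimated CATE to the threshold. A minimal sketch, where psi_hat and t are hypothetical placeholder values rather than anything produced by the package:

# hypothetical CATE estimates for five individuals (beneficial outcome,
# so larger values of psi_hat favor treatment)
psi_hat <- c(-0.02, 0.01, 0.06, 0.12, 0.04)
t <- 0.05
d <- as.numeric(psi_hat >= t)   # 1 = recommend treatment, 0 = do not
d
#> [1] 0 0 1 1 0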

Effect Estimands

We may use various effect estimands to quantify the impact of our constrained OTR on the target population. We focus on the following:

  1. Average Treatment effect among those Recommended Treatment by rule $d$ ($ATRT_d$): $E[Y(1) - Y(0) \mid d(Z) = 1]$
  2. Average Treatment effect among those Not Recommended Treatment by rule $d$ ($ATNRT_d$): $E[Y(1) - Y(0) \mid d(Z) = 0]$
  3. Proportion Treated: $E[d(Z) = 1]$
  4. Average Treatment effect under Rule $d$ ($ATR_d$): $E[Y(d) - Y(0)]$
  5. Difference between subgroups ($ATRT_d - ATNRT_d$): $E[Y(1) - Y(0) \mid d(Z) = 1] - E[Y(1) - Y(0) \mid d(Z) = 0]$
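To make these estimands concrete, the sketch below computes all five quantities in an oracle setting where both counterfactual outcomes are known for every individual (never the case in practice; y1, y0, and d are hypothetical illustrative vectors):

# oracle counterfactual outcomes and a fixed rule d(Z) for six individuals
y1 <- c(0.3, 0.1, 0.4, 0.0, 0.2, 0.5)   # Y(1)
y0 <- c(0.1, 0.1, 0.1, 0.1, 0.1, 0.1)   # Y(0)
d  <- c(1, 0, 1, 0, 0, 1)               # treatment recommendations under the rule

ATRT  <- mean((y1 - y0)[d == 1])        # effect among those recommended treatment
ATNRT <- mean((y1 - y0)[d == 0])        # effect among those not recommended treatment
p     <- mean(d == 1)                   # proportion treated under the rule
ATR   <- ATRT * p                       # E[Y(d) - Y(0)]: benefit only accrues where d = 1
c(ATRT = ATRT, ATNRT = ATNRT, p = p, ATR = ATR, difference = ATRT - ATNRT)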

Estimation

Our approach makes use of nested cross-validation to estimate the CATE. We partition the data into $K$ splits of approximately equal size to define the outer folds. The $k$-th partition serves as the validation sample, while the remaining $K - 1$ partitions serve as the training sample. Each training sample is partitioned further into $V$ splits to define the inner folds, with the $v$-th split serving as the inner validation sample and the remaining $V - 1$ splits serving as the inner training sample. We fix $V = 3$ to ensure adequate sample size within each inner validation fold. This nested cross-validation procedure helps to minimize the risk of overfitting when estimating the CATE.
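The package constructs these folds internally; the sketch below only illustrates the bookkeeping, using hypothetical sizes and fold labels:

set.seed(1)
n <- 100                                         # illustrative sample size
K <- 10                                          # outer folds
V <- 3                                           # inner folds
outer_fold <- sample(rep(1:K, length.out = n))   # outer fold label per observation

# inner folds are formed within the k-th outer training sample, e.g. for k = 1:
train_idx  <- which(outer_fold != 1)
inner_fold <- sample(rep(1:V, length.out = length(train_idx)))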

Given a training sample, we follow the doubly robust (DR)-learner approach outlined in Kennedy (2023). Three 'nuisance' models are used to build a pseudo-outcome: (1) an outcome model $\mu(a,w) = E[Y \mid \Delta = 1, A = a, W = w]$, (2) a treatment model $\pi(a \mid w) = P(A = a \mid W = w)$, and (3) a missingness model $\gamma(1 \mid a, w) = P(\Delta = 1 \mid A = a, W = w)$. Super Learner is used to fit regressions for each of the nuisance models. Given the nuisance regression estimates, we define the pseudo-outcome

$$ \hat{D}_{k,v}^0(O_i) = \frac{(2A_i - 1)\,\Delta_i}{\hat{\pi}_{k,v}^0(A_i \mid W_i)\, \hat{\gamma}_{k,v}^0(1 \mid A_i, W_i)} \left\{ Y_i - \hat{\mu}_{k,v}^0(A_i, W_i) \right\} + \left\{ \hat{\mu}_{k,v}^0(1, W_i) - \hat{\mu}_{k,v}^0(0, W_i) \right\} $$

for each observation $i$ in the $v$-th validation sample nested in the $k$-th training sample. We then fit a regression of the pseudo-outcome $\hat{D}_{k,v}^0(O_i)$ on $Z$ using data from the $v$-th validation sample, denoting the fitted regression $\hat{\psi}_{k,v}^1$. The process repeats for each of the $V$ inner cross-validation folds, resulting in CATE estimates $(\hat{\psi}_{k,v}^1 : v = 1, \dots, V)$, which are averaged to obtain an estimate of the CATE, $\hat{\psi}_{k}^0 = \frac{1}{V}\sum_{v=1}^V \hat{\psi}_{k,v}^1$. We define our optimal treatment rule within the $k$-th fold as

$$ \hat{d}_{k}^0(Z) = \begin{cases} 0 & \text{if } \hat{\psi}_{k}^0(Z) < t \\ 1 & \text{otherwise.} \end{cases} $$
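The sketch below walks through these steps on simulated data. It is purely illustrative, not the package internals: the nuisance values are hypothetical plug-ins (the package cross-fits them with Super Learner), and a simple linear model stands in for the Super Learner regression of the pseudo-outcome on $Z$.

# illustrative sketch only
set.seed(1)
n     <- 500
Z1    <- rnorm(n)                       # a single CATE covariate
A     <- rbinom(n, 1, 0.5)              # 1:1 randomized treatment
Delta <- rbinom(n, 1, 0.96)             # ~4% missing outcomes
Y     <- 0.5 * Z1 + 0.1 * A * Z1 + rnorm(n)
Y[Delta == 0] <- 0                      # only Delta * Y is observed

# hypothetical nuisance predictions (Super Learner fits in the package)
mu1    <- 0.6 * Z1                      # mu_hat(1, W)
mu0    <- 0.5 * Z1                      # mu_hat(0, W)
muA    <- ifelse(A == 1, mu1, mu0)      # mu_hat(A, W)
pi_A   <- rep(0.5, n)                   # pi_hat(A | W) under 1:1 randomization
gamma1 <- rep(0.96, n)                  # gamma_hat(1 | A, W)

# DR-learner pseudo-outcome, second-stage regression on Z, and thresholded rule
pseudo   <- (2 * A - 1) * Delta / (pi_A * gamma1) * (Y - muA) + (mu1 - mu0)
cate_fit <- lm(pseudo ~ Z1)             # stands in for the Super Learner CATE model
psi_hat  <- predict(cate_fit)
d_hat    <- as.numeric(psi_hat >= 0.05) # rule with threshold t = 0.05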

We can now use the decision rule learned from the $k$-th training sample to estimate our effect estimands of interest. Let us identify the following quantities:

  • The average counterfactual outcome under treatment $a$ among those with treatment recommendation $a'$: $E[Y(a) \mid \hat{d}_{k}^0(Z) = a']$, identified via $\theta_k(a \mid a') = E[\, E(Y \mid A = a, \Delta = 1, W) \mid \hat{d}_k^0(Z) = a'\,]$
  • The proportion treated under the rule: $p_k = P(\hat{d}_k^0(Z) = 1)$

Using this notation, $\theta_k(1 \mid 1) - \theta_k(0 \mid 1)$ represents the $ATRT_d$, $\theta_k(1 \mid 0) - \theta_k(0 \mid 0)$ represents the $ATNRT_d$, and $\{\theta_k(1 \mid 1) - \theta_k(0 \mid 1)\}\, p_k$ represents the $ATR_d$. We use augmented inverse probability of treatment weighted (AIPTW) estimates of $\theta_k(a \mid a')$; their forms are given below. Let $\mathcal{I}_k^1$ denote the set of indices of observations in the $k$-th validation sample and let $n_k^1$ denote the cardinality of $\mathcal{I}_k^1$. To estimate $p_k$, we use $\hat{p}_k^1 = (n_k^1)^{-1} \sum_{i \in \mathcal{I}_k^1} I(\hat{d}_k^0(Z_i) = 1)$. To estimate $\theta_k(a \mid a')$, we define, for each $i \in \mathcal{I}_k^1$,

$$ T_i(a \mid a') = \frac{I(\hat{d}_k^0(Z_i) = a')}{(n_k^1)^{-1} \sum_{j \in \mathcal{I}_k^1} I(\hat{d}_k^0(Z_j) = a')} \left[ \frac{I(A_i = a,\, \Delta_i = 1)}{\hat{\pi}_k^0(a \mid W_i)\, \hat{\gamma}_k^0(1 \mid a, W_i)} \left\{ Y_i - \hat{\mu}_{k}^0(a, W_i) \right\} + \hat{\mu}^0_k(a, W_i) \right] . $$

An AIPTW estimate of $\theta_k(a \mid a')$ is then

$$ \hat{\theta}_k^1(a \mid a') = \frac{1}{n_{k}^1} \sum_{i \in \mathcal{I}_k^1} T_i(a \mid a') , $$

and the asymptotic variance of $n^{1/2}\, \hat{\theta}_k^1(a \mid a')$ can be estimated using

$$ \hat{\sigma}^2_k = \frac{1}{n_{k}^1} \sum_{i \in \mathcal{I}_k^1} \left\{ T_i(a \mid a') - \hat{\theta}_k^1(a \mid a') \right\}^2 . $$

The delta method for influence functions can be used to compute estimates of the asymptotic variance of scaled estimates of the effect parameters.
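Continuing the simulated sketch above, the AIPTW estimate of $\theta_k(a \mid a')$ and its standard error could be computed along the lines below (again purely illustrative; the simulated nuisance values stand in for the cross-fitted Super Learner estimates).

# continuing the simulated sketch above
theta_aiptw <- function(a, a_prime) {
  mu_a <- if (a == 1) mu1 else mu0                                # mu_hat(a, W)
  aug  <- (A == a) * Delta / (0.5 * gamma1) * (Y - mu_a) + mu_a   # 0.5 = pi_hat(a | W)
  T_i  <- (d_hat == a_prime) / mean(d_hat == a_prime) * aug
  list(est = mean(T_i), se = sqrt(mean((T_i - mean(T_i))^2) / length(T_i)))
}

ATRT  <- theta_aiptw(1, 1)$est - theta_aiptw(0, 1)$est   # ATRT_d
ATNRT <- theta_aiptw(1, 0)$est - theta_aiptw(0, 0)$est   # ATNRT_d
p_hat <- mean(d_hat == 1)                                # proportion treated
ATR   <- ATRT * p_hat                                    # ATR_d

Standard errors for $ATR_d$ and for the subgroup difference additionally require the delta method applied to the estimated influence functions, as noted above.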

We estimate the parameters in each outer cross-validation fold, then average them across folds to obtain overall estimates. A flowchart depicting the overall estimation procedure can be seen in the figure below.

Usage

First, load the package:
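library(drotr)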

The following example uses a toy data set (abcd_data) that comes with the package. abcd_data is a simulated dataset based on real data from the AntiBiotics for Children with severe Diarrhea (ABCD) trial. The dataset contains 40 covariates ($W$) and $n = 6692$ observations.

data(abcd_data)
head(abcd_data)
#>   pid an_grp_01 rotavirus_new norovirus_new adenovirus_new astrovirus_new
#> 1   1         0             0      0.000000       0.000000              0
#> 2   2         0             0      0.000000       0.000000              0
#> 3   3         0             0      0.000000       0.000000              0
#> 4   4         0             0      4.301341       1.413699              0
#> 5   5         1             0      0.000000       0.000000              0
#> 6   6         1             0      0.000000       4.613438              0
#>   sapovirus_new st_etec_new shigella_new campylobacter_new tepec_new
#> 1      5.809444   0.0000000     1.975475         0.9374672  0.000000
#> 2      0.000000   1.0992149     0.000000         0.0000000  3.197217
#> 3      0.000000   1.2516375     0.000000         0.0000000  0.000000
#> 4      0.000000   0.0000000     0.000000         0.0000000  0.000000
#> 5      0.000000   0.2018044     0.000000         3.0005416  0.000000
#> 6      0.000000   0.0000000     0.000000         0.0000000  0.000000
#>   v_cholerae_new salmonella_new cryptosporidium_new rotavirus_bin norovirus_bin
#> 1              0              0                   0             0             0
#> 2              0              0                   0             0             0
#> 3              0              0                   0             0             0
#> 4              0              0                   0             0             1
#> 5              0              0                   0             0             0
#> 6              0              0                   0             0             0
#>   adenovirus_bin astrovirus_bin sapovirus_bin st_etec_bin shigella_bin
#> 1              0              0             1           0            1
#> 2              0              0             0           1            0
#> 3              0              0             0           1            0
#> 4              1              0             0           0            0
#> 5              0              0             0           1            0
#> 6              1              0             0           0            0
#>   campylobacter_bin tepec_bin v_cholerae_bin salmonella_bin cryptosporidium_bin
#> 1                 1         0              0              0                   0
#> 2                 0         1              0              0                   0
#> 3                 0         0              0              0                   0
#> 4                 0         0              0              0                   0
#> 5                 1         0              0              0                   0
#> 6                 0         0              0              0                   0
#>   dy1_scrn_vomitall dy1_scrn_lstools dy1_scrn_sstools dy1_scrn_diardays
#> 1               Yes                9                1                 0
#> 2               Yes                7                0                 1
#> 3               Yes                7                0                 5
#> 4               Yes                7                0                 2
#> 5               Yes                1                0                 1
#> 6               Yes                5                0                 2
#>      dy1_scrn_dehydr     site dy1_ant_sex agemchild     an_ses_quintile
#> 1   Some dehydration Pakistan      Female  5.801993 4th quintile of SES
#> 2   Some dehydration Pakistan      Female 14.681659 5th quintile of SES
#> 3 Severe dehydration Tanzania      Female  7.558416 3rd quintile of SES
#> 4   Some dehydration   Malawi        Male 13.743810 4th quintile of SES
#> 5     No dehydration     Mali      Female  9.848812 3rd quintile of SES
#> 6     No dehydration    Kenya        Male  9.799398 1st quintile of SES
#>   an_tothhlt5  month_en rotaseason  avemuac  lfazscore  wfazscore  wflzscore
#> 1           3      July          0 12.43363  0.9887319 -0.6014197 -0.8977421
#> 2           1      June          1 11.37052 -3.3641961 -3.8697283 -2.2103208
#> 3           2 September          1 13.50581 -0.7048566 -1.6076714 -1.4870581
#> 4           4   January          0 12.63497 -1.9954476 -2.1567078 -0.7570579
#> 5           2   January          1 13.94833 -0.1411175 -0.7486380 -0.8417239
#> 6           3     March          0 14.19664 -1.3049425 -0.6480911  0.5913560
#>        lazd90
#> 1 -0.24423622
#> 2 -3.24702621
#> 3 -0.05373623
#> 4 -2.17004791
#> 5  0.17351546
#> 6 -2.21432910

We aim to simulate the effect of azithromycin ($A$ = an_grp_01), an antibiotic commonly used against diarrheal pathogens, on linear growth 90 days after trial enrollment ($Y$ = lazd90). Approximately 4% of outcomes $Y$ are missing at random. We want to estimate an OTR based on a subset of covariates related to host characteristics measured at baseline ($Z$ = avemuac, wfazscore, wflzscore, lfazscore, dy1_ant_sex, agemchild, an_ses_quintile). We use a treatment threshold of $t = 0.05$: patients are recommended treatment only if their expected benefit in length-for-age z-score at day 90 of the trial is 0.05 or more compared to not receiving treatment.

Note that in this example the outcome is beneficial: CATEs exceeding the threshold are desirable and indicate a large benefit from treatment. If the outcome were undesirable, we would instead want individuals whose CATEs are more negative than the threshold to be recommended treatment. The package adjusts for this automatically when a negative threshold is set. However, if using a threshold of zero, make sure to enter "+0" or "-0" as a character string to indicate the direction.
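For example (illustrative values only):

t_beneficial <- 0.05    # beneficial outcome: treat if predicted benefit is at least 0.05
t_harmful    <- -0.05   # undesirable outcome: treat if predicted CATE is -0.05 or lower
t_zero       <- "+0"    # zero threshold: direction must be supplied as a character string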

Inputs

Let’s set the parameters described above to be passed into the function:

Y <- "lazd90"

A <- "an_grp_01"

W <- c("rotavirus_new", "rotavirus_bin", "norovirus_new", "norovirus_bin", "adenovirus_new",
      "adenovirus_bin", "sapovirus_new","sapovirus_bin", "astrovirus_new", "astrovirus_bin",
      "st_etec_new", "st_etec_bin", "shigella_new", "shigella_bin", "campylobacter_new",
      "campylobacter_bin", "tepec_new", "tepec_bin", "v_cholerae_new", "v_cholerae_bin",
      "salmonella_new", "salmonella_bin", "cryptosporidium_new", "cryptosporidium_bin",
      "dy1_scrn_vomitall", "dy1_scrn_lstools", "dy1_scrn_sstools", "dy1_scrn_diardays",
      "dy1_scrn_dehydr", "avemuac", "wfazscore", "lfazscore", "wflzscore", "site",
      "dy1_ant_sex", "agemchild", "an_ses_quintile", "an_tothhlt5", "month_en", "rotaseason")

Z <- c("avemuac", "wfazscore", "wflzscore", "lfazscore", "dy1_ant_sex", "agemchild", "an_ses_quintile")

t <- 0.05

Next, we need to choose which libraries we are interested in using to fit our nuisance models. The SuperLearner package comes with wrappers for common prediction algorithms (see SuperLearner::listWrappers() for a full list of available wrappers). We will choose several candidates for the outcome model that should be able to capture its potentially complex form.

sl.library.outcome <- c("SL.glm", "SL.ranger", "SL.earth")

We know the ABCD trial was a randomized controlled trial, so SL.mean should be sufficient to estimate the treatment probability under 1:1 randomization.

sl.library.treatment <- c("SL.mean")

Additionally, users may wish to write their own wrappers for SuperLearner based on biological plausibility. In the simulated dataset, we know the data are missing at random. However, one might hypothesize that missingness depends on site and month of enrollment (month_en). We can specify a missingness model using these covariates as follows:

SL.missing.1 <- function(Y, X, newX, family, ...){
  sl.missing.1_fit <- glm(Y ~ site + month_en,
                          data = X,
                          family = family)
  # get predictions on newX
  pred <- predict(
    sl.missing.1_fit, newdata = newX, type = 'response'
  )
  # format the output as named list
  fit <- list(fitted_model.missing.1 = sl.missing.1_fit)
  out <- list(fit = fit, pred = pred)
  # give the object a class
  class(out$fit) <- "SL.missing.1"
  # return the output
  return(out)
}
predict.SL.missing.1 <- function(object, newdata, ...){
  pred <- predict(object$fitted_model.missing.1, newdata = newdata, type="response")
  return(pred)
}

We will include this new wrapper in our list of missingness models:

sl.library.missingness <- c("SL.mean", "SL.missing.1")

Lastly, we must specify CATE models that are sufficiently complex to capture the true treatment effect.

sl.library.CATE <- c("SL.glm", "SL.ranger", "SL.earth")

Estimate OTR

To run the analysis, we call estimate_OTR() with the parameters specified above. In addition, we specify the name of the id variable in the dataset (pid), the number of folds used in the outer cross-validation (k_folds = 10), and indicate that the outcome is continuous (gaussian). (Note: the run time of the chunk below is approximately 4 minutes with the libraries specified above.)

set.seed(12345)

results_host <- estimate_OTR(df = abcd_data,
                             Y_name = Y, 
                             A_name = A, 
                             W_list = W, 
                             Z_list = Z, 
                             id_name = "pid",
                             sl.library.outcome = sl.library.outcome,
                             sl.library.treatment = sl.library.treatment,
                             sl.library.missingness = sl.library.missingness,
                             sl.library.CATE = sl.library.CATE,
                             threshold = t,
                             k_folds = 10,
                             outcome_type = "gaussian")

Alternatively, we may choose to break this up into two steps: (1) nuisance model estimation and (2) estimation of effects. This can be helpful for splitting up time-consuming analyses that reuse the same set of nuisance models. To do this, start by calling the learn_nuisance() function:

nuisance_output <- learn_nuisance(df = abcd_data,
                                  Y_name = Y,
                                  A_name = A, 
                                  W_list = W,
                                  sl.library.outcome = sl.library.outcome,
                                  sl.library.treatment = sl.library.treatment,
                                  sl.library.missingness = sl.library.missingness,
                                  k_folds = 10,
                                  outcome_type = "gaussian")
#> Loading required package: nnls
#> Loading required namespace: earth
#> Loading required namespace: ranger

This will return an object of class nuisance which contains model fits (nuisance_models), outer validation fold assignments and pseudo-outcomes (k_fold_assign_and_CATE), and inner validation fold assignments (validRows). These can be passed into estimate_OTR to avoid unnecessary re-fitting of models.

nuisance_models <- nuisance_output$nuisance_models
k_fold_assign_and_CATE <- nuisance_output$k_fold_assign_and_CATE
validRows <- nuisance_output$validRows

results_host <- estimate_OTR(df = abcd_data,
                             Y_name = Y, 
                             A_name = A, 
                             W_list = W, 
                             Z_list = Z, 
                             id_name = "pid",
                             sl.library.CATE = sl.library.CATE,
                             nuisance_models = nuisance_models,
                             k_fold_assign_and_CATE = k_fold_assign_and_CATE,
                             validRows = validRows,
                             threshold = t,
                             k_folds = 10,
                             outcome_type = "gaussian")

Once we have obtained results, we can print them as a table to the console using print():

print(results_host)
#>                                Results for  threshold =  0.05  Aggregated Across k =  10  folds 
#> --------------------------------------------------------------------------------------------------------------------------------------- 
#>                                                             Estimate            Standard Error      95% CI: Lower       95% CI: Upper       
#> --------------------------------------------------------------------------------------------------------------------------------------- 
#> E[Y(1) | d(Z) = 1]                                          -1.5367             0.0426              -1.6202             -1.4533             
#> E[Y(0) | d(Z) = 1]                                          -1.5572             0.0431              -1.6417             -1.4728             
#> E[d(Z) = 1]                                                 0.3753              0.0057              0.3642              0.3864              
#> E[Y(1) - Y(0) | d(Z) = 1]                                   0.0205              0.0397              -0.0573             0.0983              
#> E[Y(1) - Y(0) | d(Z) = 0]                                   0.052               0.0182              0.0164              0.0877              
#> E[Y(d) - Y(0)]                                              0.0062              0.0086              -0.0107             0.0231              
#> E[Y(1) - Y(0) | d(Z) = 1] - E[Y(1) - Y(0) | d(Z) = 0]       -0.0315             0.0437              -0.1172             0.0541              
#> 
#> Covariates used in decision rule:  avemuac, wfazscore, wflzscore, lfazscore, dy1_ant_sex, agemchild, an_ses_quintile

We can interpret the results as follows:

$E[Y(1) \mid d(Z) = 1]$: The expected length-for-age z-score at day 90 of the trial for children who are recommended treatment under the host OTR and receive treatment is -1.54 (95% CI: -1.62 to -1.45).

$E[Y(0) \mid d(Z) = 1]$: The expected length-for-age z-score at day 90 of the trial for children who are recommended treatment under the host OTR but do not receive treatment is -1.56 (95% CI: -1.64 to -1.47).

$E[d(Z) = 1]$: The percentage of children recommended treatment under the host OTR is 37.53% (95% CI: 36.42% to 38.64%).

$E[Y(1) - Y(0) \mid d(Z) = 1]$ / $ATRT_d$: The average treatment effect among those recommended treatment under the host OTR is 0.02 (95% CI: -0.06 to 0.10). Children who are recommended treatment under the rule and receive it are expected to have a length-for-age z-score that is 0.02 higher than if they had not received treatment.

$E[Y(1) - Y(0) \mid d(Z) = 0]$ / $ATNRT_d$: The average treatment effect among those not recommended treatment under the host OTR is 0.05 (95% CI: 0.02 to 0.09). Children who are not recommended treatment under the rule yet still receive it are expected to have a length-for-age z-score that is 0.05 higher than if they had not received treatment.

$E[Y(d) - Y(0)]$ / $ATR_d$: The overall average treatment effect of the rule is 0.01 (95% CI: -0.01 to 0.02). At a population level, implementation of the treatment rule is estimated to improve length-for-age z-scores at day 90 of the trial by 0.01.

$E[Y(1) - Y(0) \mid d(Z) = 1] - E[Y(1) - Y(0) \mid d(Z) = 0]$: The difference in treatment effect between the subgroup recommended treatment and the subgroup not recommended treatment is -0.03 (95% CI: -0.12 to 0.05).

One may wish to examine other output in the returned full_otr_results object (here, results_host). The full_otr_results object contains the following elements:

  • results (class otr_results)
    This component contains results for various thresholds, each represented as a separate Results object.

    • threshold = t1 (Results object)
      • aggregated_results: Summary results across k CATE models
      • k_fold_results: Results and influence functions from each of the k-folds
      • decision_df: Data frame containing CATE prediction and decision for each observation
      • k_non_na: Folds that did not result in proportion treated = 1 or 0 (leading to NA effect estimates)
    • threshold = t2 (Results object)
      • Same structure as threshold = t1 if additional threshold(s) specified
    • ...
      • Additional thresholds can be added in the same manner.
  • nuisance_models
    A list of nuisance models used in the analysis.

    • fold 1 (Nuisance object)
      • outcome_model: Model for outcome.
      • treatment_model: Model for the treatment.
      • missingness_model: Model for missing data.
    • fold 2 (Nuisance object)
      • Structure similar to fold 1.
    • ...
      • Up to K nuisance models
  • CATE_models
    A list of CATE models generated during the analysis.

    • fold 1: CATE model results for fold 1.
    • fold 2: CATE model results for fold 2.
    • ...: Up to k CATE models
  • Z_list
    List of covariates used for CATE model

Note: results objects may become very large depending on the libraries used in model fitting. One could opt to save the otr_results object rather than the full full_otr_results object to conserve memory while maintaining output printing functionality.
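For example, a minimal sketch (assuming the component is accessed as results_host$results, per the structure above, and using a hypothetical file path):

# save only the lighter-weight otr_results component
otr_res <- results_host$results
saveRDS(otr_res, file = "results_host_otr.rds")
# reload later with: otr_res <- readRDS("results_host_otr.rds")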

Compare OTRs

We may be interested in comparing results across different rules and making inference about the differences. Let's consider another OTR based on Shigella pathogen quantity.


Z <- c("shigella_new", "shigella_bin")

results_shigella <- estimate_OTR(df = abcd_data,
                             Y_name = Y, 
                             A_name = A, 
                             W_list = W, 
                             Z_list = Z, 
                             id_name = "pid",
                             sl.library.outcome = sl.library.outcome,
                             sl.library.treatment = sl.library.treatment,
                             sl.library.missingness = sl.library.missingness,
                             sl.library.CATE = sl.library.CATE,
                             threshold = t,
                             k_folds = 10,
                             outcome_type = "gaussian")

print(results_shigella)
#>                                Results for  threshold =  0.05  Aggregated Across k =  10  folds 
#> --------------------------------------------------------------------------------------------------------------------------------------- 
#>                                                             Estimate            Standard Error      95% CI: Lower       95% CI: Upper       
#> --------------------------------------------------------------------------------------------------------------------------------------- 
#> E[Y(1) | d(Z) = 1]                                          -1.5163             0.0397              -1.594              -1.4386             
#> E[Y(0) | d(Z) = 1]                                          -1.7533             0.0388              -1.8292             -1.6773             
#> E[d(Z) = 1]                                                 0.1905              0.0048              0.1811              0.1999              
#> E[Y(1) - Y(0) | d(Z) = 1]                                   0.237               0.0328              0.1727              0.3013              
#> E[Y(1) - Y(0) | d(Z) = 0]                                   -0.005              0.0156              -0.0357             0.0256              
#> E[Y(d) - Y(0)]                                              0.0448              0.0066              0.032               0.0577              
#> E[Y(1) - Y(0) | d(Z) = 1] - E[Y(1) - Y(0) | d(Z) = 0]       0.242               0.0363              0.1708              0.3132              
#> 
#> Covariates used in decision rule:  shigella_new, shigella_bin

We can compare the average treatment effect among those recommended treatment under the rule ($ATRT_d$, the subgroup effect) across rules by calling the compare.otr_results() function.

compare.otr_results(res_rule1 = results_host,
                    res_rule2 = results_shigella,
                    threshold = t, 
                    rule1_comp = "se",
                    rule2_comp = "se")
#> 
#>  Subgroup Effect E[Y(d) - Y(0) | d(Z) = 1]  for rule 1 at threshold =  0.05 
#>  vs 
#>  Subgroup Effect E[Y(d) - Y(0) | d(Z) = 1]  for rule 2 at threshold =  0.05 
#> --------------------------------------------------------------------------------------------------------- 
#>                               Estimate            Standard Error      95% CI: Lower       95% CI: Upper       
#> --------------------------------------------------------------------------------------------------------- 
#> Rule 1 - Rule 2               -0.2165             0.0514              -0.3172             -0.1158             
#> 
#>  Rule 1: Z =  avemuac, wfazscore, wflzscore, lfazscore, dy1_ant_sex, agemchild, an_ses_quintile
#>  Rule 2: Z =  shigella_new, shigella_bin

The difference in $ATRT_d$ between the host and Shigella rules is -0.22 (95% CI: -0.32 to -0.12). The average treatment effect in the subgroup recommended treatment under the Shigella-based rule is 0.22 z-score units larger than in the subgroup recommended treatment under the host-based rule. A rule based on Shigella may therefore be more effective at improving length-for-age z-score at day 90 than a rule based on host characteristics at a threshold of 0.05.

Average results across seeds

One may be interested in running the analysis multiple times for a given rule with different seeds and averaging the results to account for random variability in model fitting. The average_across_seeds() function may be used to average the results.


results_list <- vector(mode = "list", length = 3)

for(seed in 1:3){
  
  set.seed(seed)
  
  results_shigella <- estimate_OTR(df = abcd_data,
                                   Y_name = Y, 
                                   A_name = A, 
                                   W_list = W, 
                                   Z_list = Z, 
                                   id_name = "pid",
                                   sl.library.outcome = sl.library.outcome,
                                   sl.library.treatment = sl.library.treatment,
                                   sl.library.missingness = sl.library.missingness,
                                   sl.library.CATE = sl.library.CATE,
                                   threshold = t,
                                   k_folds = 10,
                                   outcome_type = "gaussian")
  
  results_list[[seed]] <- results_shigella
  
}

average_across_seeds(results_list = results_list, threshold = t)
#>                                         Average results across n =  3  seeds for threshold  0.05 
#> --------------------------------------------------------------------------------------------------------------------------------------- 
#>                                                             Estimate            Standard Error      95% CI: Lower       95% CI: Upper       
#> --------------------------------------------------------------------------------------------------------------------------------------- 
#> E[Y(1) | d(Z) = 1]                                          -1.515              0.0397              -1.5929             -1.4372             
#> E[Y(0) | d(Z) = 1]                                          -1.7506             0.039               -1.8271             -1.6742             
#> E[d(Z) = 1]                                                 0.1908              0.0048              0.1814              0.2002              
#> E[Y(1) - Y(0) | d(Z) = 1]                                   0.2356              0.0328              0.1713              0.2999              
#> E[Y(1) - Y(0) | d(Z) = 0]                                   -0.0062             0.0156              -0.0369             0.0244              
#> E[Y(d) - Y(0)]                                              0.0451              0.0066              0.0323              0.058               
#> E[Y(1) - Y(0) | d(Z) = 1] - E[Y(1) - Y(0) | d(Z) = 0]       0.2418              0.0363              0.1706              0.313               
#> 
#> Covariates used in decision rule:  shigella_new, shigella_bin

Session info

sessionInfo()
#> R version 4.5.1 (2025-06-13)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.2 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
#> 
#> locale:
#>  [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
#>  [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
#>  [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
#> [10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   
#> 
#> time zone: UTC
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] nnls_1.6    drotr_0.1.0
#> 
#> loaded via a namespace (and not attached):
#>  [1] cli_3.6.5           knitr_1.50          rlang_1.1.6        
#>  [4] xfun_0.52           Formula_1.2-5       textshaping_1.0.1  
#>  [7] SuperLearner_2.0-29 jsonlite_2.0.0      ranger_0.17.0      
#> [10] htmltools_0.5.8.1   ragg_1.4.0          sass_0.4.10        
#> [13] rmarkdown_2.29      grid_4.5.1          evaluate_1.0.4     
#> [16] jquerylib_0.1.4     earth_5.3.4         fastmap_1.2.0      
#> [19] yaml_2.3.10         foreach_1.5.2       lifecycle_1.0.4    
#> [22] compiler_4.5.1      codetools_0.2-20    fs_1.6.6           
#> [25] Rcpp_1.1.0          lattice_0.22-7      systemfonts_1.2.3  
#> [28] digest_0.6.37       R6_2.6.1            splines_4.5.1      
#> [31] gam_1.22-5          Matrix_1.7-3        bslib_0.9.0        
#> [34] plotmo_3.6.4        tools_4.5.1         plotrix_3.8-4      
#> [37] iterators_1.0.14    pkgdown_2.1.3       cachem_1.1.0       
#> [40] desc_1.4.3
Kennedy, Edward H. 2023. “Towards Optimal Doubly Robust Estimation of Heterogeneous Causal Effects.” arXiv. https://doi.org/10.48550/arXiv.2004.14497.
Kurz, C. F. 2022. “Augmented Inverse Probability Weighting and the Double Robustness Property.” Medical Decision Making: An International Journal of the Society for Medical Decision Making 42 (2): 156–67. https://doi.org/10.1177/0272989X211027181.
Luedtke, Alexander R., and Mark J. van der Laan. 2016. “Super-Learning of an Optimal Dynamic Treatment Rule.” The International Journal of Biostatistics 12 (1): 305. https://doi.org/10.1515/ijb-2015-0052.
Montoya, Lina M., Mark J. van der Laan, Alexander R. Luedtke, Jennifer L. Skeem, Jeremy R. Coyle, and Maya L. Petersen. 2023. “The Optimal Dynamic Treatment Rule Superlearner: Considerations, Performance, and Application to Criminal Justice Interventions.” The International Journal of Biostatistics 19 (1): 217–38. https://doi.org/10.1515/ijb-2020-0127.