This package assumes that a hierarchical testing procedure for the three-arm gold-standard non-inferiority design is applied. The first test aims to establish assay sensitivity of the trial. It is a test of superiority of the experimental treatment (T) against the placebo treatment (P). If assay sensitivity is successfully established, the treatment is tested for non-inferiority against the control treatment (C). Individual observations are assumed to be normally distributed, where higher values correspond to better treatment effects. Testing is assumed to be done via Z test statistics.
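
The hierarchical procedure can be illustrated with a small sketch in base R. This is not the package's implementation: `hierarchical_z_test` is a hypothetical helper, the common standard deviation `sigma` is assumed to be known, and `Delta` is taken to be the non-inferiority margin.

```r
# Hypothetical helper (not part of the package): hierarchical Z tests for a
# three-arm gold-standard trial, assuming a known common standard deviation.
hierarchical_z_test <- function(x_T, x_P, x_C, sigma, Delta, alpha = 0.025) {
  crit <- qnorm(1 - alpha)
  # Step 1: superiority of T over P (assay sensitivity)
  z_TP <- (mean(x_T) - mean(x_P)) /
    (sigma * sqrt(1 / length(x_T) + 1 / length(x_P)))
  # Step 2: non-inferiority of T versus C with margin Delta
  z_TC <- (mean(x_T) - mean(x_C) + Delta) /
    (sigma * sqrt(1 / length(x_T) + 1 / length(x_C)))
  reject_TP <- z_TP > crit
  # T versus C can only be rejected if step 1 was successful
  reject_TC <- reject_TP && z_TC > crit
  list(z_TP = z_TP, z_TC = z_TC, reject_TP = reject_TP, reject_TC = reject_TC)
}

# Simulated data with illustrative group sizes and effects
set.seed(1)
hierarchical_z_test(
  x_T = rnorm(413, mean = 0.4),
  x_P = rnorm(125, mean = 0),
  x_C = rnorm(404, mean = 0.4),
  sigma = 1, Delta = 0.2
)
```

Because higher values are better, both tests are one-sided and reject for large values of the Z statistic.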

We highly recommend reading our open-access article (Meis et al., 2023), which explains the theoretical background of this package.

To showcase the capabilities of this package, we reproduce some results from the paper below.

The results will not agree exactly with those in the paper, because the calculations in the paper used much lower error tolerances and more function evaluations.

To get closer to the results from the paper, you can supply the following options, though this will significantly increase computation times:

```
mvnorm_algorithm = mvtnorm::Miwa(
  # steps = 128,
  steps = 4097,
  checkCorr = FALSE,
  maxval = 1000
),
nloptr_opts = list(
  algorithm = "NLOPT_LN_SBPLX",
  # xtol_abs = 1e-3,
  # xtol_rel = 1e-2,
  # maxeval = 2000,
  xtol_abs = 1e-10,
  xtol_rel = 1e-9,
  maxeval = 2000,
  print_level = 0
)
```

You may also want to set `print_progress = TRUE` when running code interactively to see the progress of the optimization.

The designs in Table 2 of the paper are optimized to minimize the expected sample size under the alternative hypothesis.

This is (approximately) the first line in Table 2 from the paper:

```
tab1_D1 <- optimize_design_onestage(
  alpha = .025,
  beta = .2,
  alternative_TP = .4,
  alternative_TC = 0,
  Delta = .2,
  print_progress = FALSE
)
tab1_D1
#> Sample sizes (stage 1): T: 413, P: 125, C: 404
#> Efficacy boundaries (stage 1): Z_TP_e: 1.95996, Z_TC_e: 1.95996
#> Maximum overall sample size: 942
#> Placebo penalty at optimum (kappa * nP): 0.0
#> Objective function value: 942.0
#> Type I error for TP testing: 2.5%
#> Type I error for TC testing: 2.5%
#> Power: 80.2%
```

This is (approximately) the second line in Table 2 from the paper:

```
optimize_design_twostage(
  cP1 = tab1_D1$stagec[[1]]$P, # The allocation ratios are enforced to be
  cC1 = tab1_D1$stagec[[1]]$C, # the same as in the optimal single-stage design.
  cT2 = 1,
  cP2 = tab1_D1$stagec[[1]]$P,
  cC2 = tab1_D1$stagec[[1]]$C,
  bTP1f = -Inf, # These two boundary conditions enforce no futility stops.
  bTC1f = -Inf,
  beta = 0.2,
  alternative_TP = 0.4,
  alternative_TC = 0,
  Delta = 0.2,
  print_progress = FALSE
)
#> Sample sizes (stage 1): T: 224, P: 68, C: 219
#> Sample sizes (stage 2): T: 224, P: 68, C: 219
#> Efficacy boundaries (stage 1): Z_TP_e: 2.10510, Z_TC_e: 2.27093
#> Futility boundaries (stage 1): Z_TP_f: -Inf, Z_TC_f: -Inf
#> Efficacy boundaries (stage 2): Z_TP_e: 2.27188, Z_TC_e: 2.10568
#> Inverse normal combination test weights (TP): w1: 0.70711, w2: 0.70711
#> Inverse normal combination test weights (TC): w1: 0.70711, w2: 0.70711
#> Maximum overall sample size: 1022
#> Expected sample size (H1): 801.2
#> Expected sample size (H0): 1020.3
#> Expected placebo group sample size (H1): 82.8
#> Expected placebo group sample size (H0): 134.8
#> Objective function value: 801.2
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.50%
#> Probability of futility stop (H1): 0.00%
#> Probability of futility stop (H0): 0.00%
#> Minimum conditional power: 0.00%
#> Power: 80.20%
#> Futility boundaries: nonbinding
#> Futility testing method: always both futility tests
```
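
The output above reports inverse normal combination test weights. As a minimal sketch of how such a test combines the stage-wise Z statistics of one comparison (the helper name and the stage-wise values `z1` and `z2` are hypothetical, and the package may compute its weights differently):

```r
# Minimal sketch of an inverse normal combination test for one comparison.
# z1, z2: stage-wise Z statistics; w1, w2: pre-specified weights whose
# squares sum to one, so the combined statistic is standard normal under
# the null hypothesis.
inverse_normal_combination <- function(z1, z2, w1, w2) {
  stopifnot(isTRUE(all.equal(w1^2 + w2^2, 1, tolerance = 1e-4)))
  w1 * z1 + w2 * z2
}

# With equal stage sizes, the printed weights equal sqrt(1/2) = 0.70711
w <- sqrt(1 / 2)
z_combined <- inverse_normal_combination(z1 = 1.5, z2 = 1.9, w1 = w, w2 = w)

# Compare with the stage 2 efficacy boundary Z_TP_e printed above
z_combined > 2.27188
```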

This is (approximately) the third line in Table 2 from the paper:

```
optimize_design_twostage(
  bTP1f = -Inf, # These two boundary conditions enforce no futility stops.
  bTC1f = -Inf,
  beta = 0.2,
  alternative_TP = 0.4,
  alternative_TC = 0,
  Delta = 0.2,
  print_progress = FALSE
)
#> Sample sizes (stage 1): T: 230, P: 90, C: 224
#> Sample sizes (stage 2): T: 202, P: 106, C: 191
#> Efficacy boundaries (stage 1): Z_TP_e: 2.04997, Z_TC_e: 2.27978
#> Futility boundaries (stage 1): Z_TP_f: -Inf, Z_TC_f: -Inf
#> Efficacy boundaries (stage 2): Z_TP_e: 2.39960, Z_TC_e: 2.09141
#> Inverse normal combination test weights (TP): w1: 0.69161, w2: 0.72227
#> Inverse normal combination test weights (TC): w1: 0.73218, w2: 0.68111
#> Maximum overall sample size: 1043
#> Expected sample size (H1): 787.2
#> Expected sample size (H0): 1040.3
#> Expected placebo group sample size (H1): 103.1
#> Expected placebo group sample size (H0): 193.9
#> Objective function value: 787.2
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.50%
#> Probability of futility stop (H1): 0.00%
#> Probability of futility stop (H0): 0.00%
#> Minimum conditional power: 0.00%
#> Power: 80.06%
#> Futility boundaries: nonbinding
#> Futility testing method: always both futility tests
```

This is (approximately) the fourth line in Table 2 from the paper:

```
optimize_design_twostage(
  beta = 0.2,
  alternative_TP = 0.4,
  alternative_TC = 0,
  Delta = 0.2,
  print_progress = FALSE,
  binding_futility = FALSE
)
#> Sample sizes (stage 1): T: 238, P: 84, C: 241
#> Sample sizes (stage 2): T: 201, P: 122, C: 185
#> Efficacy boundaries (stage 1): Z_TP_e: 2.03084, Z_TC_e: 2.27784
#> Futility boundaries (stage 1): Z_TP_f: -0.29297, Z_TC_f: 0.57221
#> Efficacy boundaries (stage 2): Z_TP_e: 2.47898, Z_TC_e: 2.08790
#> Inverse normal combination test weights (TP): w1: 0.66534, w2: 0.74654
#> Inverse normal combination test weights (TC): w1: 0.74431, w2: 0.66783
#> Maximum overall sample size: 1071
#> Expected sample size (H1): 775.4
#> Expected sample size (H0): 672.8
#> Expected placebo group sample size (H1): 97.9
#> Expected placebo group sample size (H0): 109.3
#> Objective function value: 775.4
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.43%
#> Probability of futility stop (H1): 5.33%
#> Probability of futility stop (H0): 77.96%
#> Minimum conditional power: 19.62%
#> Power: 80.01%
#> Futility boundaries: nonbinding
#> Note: Results are presented as if futility boundaries were strictly obeyed.
#> Futility testing method: always both futility tests
```

This is (approximately) the fifth line in Table 2 from the paper:

```
optimize_design_twostage(
  beta = 0.2,
  alternative_TP = 0.4,
  alternative_TC = 0,
  Delta = 0.2,
  print_progress = FALSE,
  binding_futility = TRUE
)
#> Sample sizes (stage 1): T: 229, P: 90, C: 231
#> Sample sizes (stage 2): T: 217, P: 107, C: 199
#> Efficacy boundaries (stage 1): Z_TP_e: 2.04659, Z_TC_e: 2.29485
#> Futility boundaries (stage 1): Z_TP_f: 0.23336, Z_TC_f: 0.75795
#> Efficacy boundaries (stage 2): Z_TP_e: 2.40505, Z_TC_e: 2.04331
#> Inverse normal combination test weights (TP): w1: 0.68710, w2: 0.72656
#> Inverse normal combination test weights (TC): w1: 0.72466, w2: 0.68911
#> Maximum overall sample size: 1073
#> Expected sample size (H1): 768.5
#> Expected sample size (H0): 619.9
#> Expected placebo group sample size (H1): 100.2
#> Expected placebo group sample size (H0): 103.5
#> Objective function value: 768.5
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.50%
#> Probability of futility stop (H1): 8.33%
#> Probability of futility stop (H0): 86.28%
#> Minimum conditional power: 34.17%
#> Power: 80.16%
#> Futility boundaries: binding
#> Futility testing method: always both futility tests
```

Next, we will optimize a design under a combination of the null and alternative hypotheses.

This is (approximately) the third line in Table 3 from the paper:

```
optimize_design_twostage(
  beta = 0.2,
  alternative_TP = 0.4,
  alternative_TC = 0,
  Delta = 0.2,
  print_progress = FALSE,
  binding_futility = TRUE,
  lambda = 0.9
)
#> Sample sizes (stage 1): T: 227, P: 89, C: 231
#> Sample sizes (stage 2): T: 230, P: 98, C: 213
#> Efficacy boundaries (stage 1): Z_TP_e: 2.05198, Z_TC_e: 2.26340
#> Futility boundaries (stage 1): Z_TP_f: 0.85517, Z_TC_f: 0.77016
#> Efficacy boundaries (stage 2): Z_TP_e: 2.34293, Z_TC_e: 2.06018
#> Inverse normal combination test weights (TP): w1: 0.69370, w2: 0.72026
#> Inverse normal combination test weights (TC): w1: 0.71238, w2: 0.70180
#> Maximum overall sample size: 1088
#> Expected sample size (H1): 771.1
#> Expected sample size (H0): 587.6
#> Expected placebo group sample size (H1): 98.2
#> Expected placebo group sample size (H0): 95.6
#> Objective function value: 758.0
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.50%
#> Probability of futility stop (H1): 9.18%
#> Probability of futility stop (H0): 92.17%
#> Minimum conditional power: 43.67%
#> Power: 80.15%
#> Futility boundaries: binding
#> Futility testing method: always both futility tests
```

Now we will optimize a design under the alternative while putting an extra penalty on the placebo group sample size.

This is (approximately) the fourth line in Table 2 from the paper:

```
optimize_design_twostage(
  beta = 0.2,
  alternative_TP = 0.4,
  alternative_TC = 0,
  Delta = 0.2,
  print_progress = FALSE,
  binding_futility = TRUE,
  kappa = 0.5
)
#> Sample sizes (stage 1): T: 239, P: 75, C: 237
#> Sample sizes (stage 2): T: 211, P: 114, C: 204
#> Efficacy boundaries (stage 1): Z_TP_e: 2.03405, Z_TC_e: 2.25340
#> Futility boundaries (stage 1): Z_TP_f: 0.01742, Z_TC_f: 0.80964
#> Efficacy boundaries (stage 2): Z_TP_e: 2.46906, Z_TC_e: 2.06256
#> Inverse normal combination test weights (TP): w1: 0.65529, w2: 0.75538
#> Inverse normal combination test weights (TC): w1: 0.73076, w2: 0.68263
#> Maximum overall sample size: 1080
#> Expected sample size (H1): 767.5
#> Expected sample size (H0): 624.9
#> Expected placebo group sample size (H1): 89.9
#> Expected placebo group sample size (H0): 90.1
#> Objective function value: 812.4
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.50%
#> Probability of futility stop (H1): 8.59%
#> Probability of futility stop (H0): 85.70%
#> Minimum conditional power: 31.96%
#> Power: 80.09%
#> Futility boundaries: binding
#> Futility testing method: always both futility tests
```
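
The printed objective function value of the `kappa = 0.5` design above can be checked by hand. The formula below is an assumption inferred from the printed numbers, not taken from the package source: the expected sample size under H1 plus `kappa` times the expected placebo group sample size under H1.

```r
# Values copied from the printed output above
asn_h1  <- 767.5  # Expected sample size (H1)
asnp_h1 <- 89.9   # Expected placebo group sample size (H1)
kappa   <- 0.5

# Assumed form of the penalized objective, inferred from the output only
penalized_objective <- asn_h1 + kappa * asnp_h1
penalized_objective
```

This gives 812.45, which agrees with the printed value of 812.4 up to rounding of the displayed inputs.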

Next, we will optimize a design under a combination of the null and alternative hypotheses while including a penalty on the placebo group sample size.

This is (approximately) the seventh line in Table 2 from the paper:

```
optimize_design_twostage(
  beta = 0.2,
  alternative_TP = 0.4,
  alternative_TC = 0,
  Delta = 0.2,
  print_progress = FALSE,
  binding_futility = TRUE,
  lambda = .9,
  kappa = 1
)
#> Sample sizes (stage 1): T: 235, P: 71, C: 236
#> Sample sizes (stage 2): T: 222, P: 88, C: 224
#> Efficacy boundaries (stage 1): Z_TP_e: 2.05815, Z_TC_e: 2.26759
#> Futility boundaries (stage 1): Z_TP_f: 0.75006, Z_TC_f: 0.78151
#> Efficacy boundaries (stage 2): Z_TP_e: 2.34303, Z_TC_e: 2.05618
#> Inverse normal combination test weights (TP): w1: 0.67865, w2: 0.73446
#> Inverse normal combination test weights (TC): w1: 0.71693, w2: 0.69714
#> Maximum overall sample size: 1076
#> Expected sample size (H1): 776.6
#> Expected sample size (H0): 584.6
#> Expected placebo group sample size (H1): 83.8
#> Expected placebo group sample size (H0): 77.4
#> Objective function value: 846.6
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.50%
#> Probability of futility stop (H1): 9.26%
#> Probability of futility stop (H0): 91.74%
#> Minimum conditional power: 40.58%
#> Power: 80.07%
#> Futility boundaries: binding
#> Futility testing method: always both futility tests
```

```
optimize_design_twostage(
  beta = 0.2,
  alternative_TP = 0.4,
  alternative_TC = 0,
  Delta = 0.2,
  print_progress = FALSE,
  eta = 1
)
#> Sample sizes (stage 1): T: 224, P: 84, C: 248
#> Sample sizes (stage 2): T: 190, P: 55, C: 167
#> Efficacy boundaries (stage 1): Z_TP_e: 2.25324, Z_TC_e: 2.52099
#> Futility boundaries (stage 1): Z_TP_f: -0.27777, Z_TC_f: -0.06567
#> Efficacy boundaries (stage 2): Z_TP_e: 2.09262, Z_TC_e: 2.00438
#> Inverse normal combination test weights (TP): w1: 0.76715, w2: 0.64146
#> Inverse normal combination test weights (TC): w1: 0.75346, w2: 0.65750
#> Maximum overall sample size: 968
#> Expected sample size (H1): 800.9
#> Expected sample size (H0): 711.8
#> Expected placebo group sample size (H1): 94.2
#> Expected placebo group sample size (H0): 104.3
#> Objective function value: 1768.9
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.49%
#> Probability of futility stop (H1): 1.30%
#> Probability of futility stop (H0): 62.00%
#> Minimum conditional power: 4.54%
#> Power: 80.13%
#> Futility boundaries: nonbinding
#> Note: Results are presented as if futility boundaries were strictly obeyed.
#> Futility testing method: always both futility tests
```
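
In the `eta = 1` run above, the printed objective function value is consistent with adding `eta` times the maximum overall sample size to the expected sample size under H1. This reading is an inference from the printed numbers, not documented behavior:

```r
# Values copied from the printed output above
asn_h1 <- 800.9  # Expected sample size (H1)
n_max  <- 968    # Maximum overall sample size
eta    <- 1

# Assumed form of the objective, inferred from the output only
objective_with_eta <- asn_h1 + eta * n_max
objective_with_eta
```

The result, 1768.9, matches the printed objective function value.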

```
optimize_design_twostage(
  cT2 = 1, # These three boundary conditions enforce a
  cP2 = quote(cP1), # between-stage allocation ratio of one.
  cC2 = quote(cC1), # The quote() command is necessary for this to work.
  bTP1f = -Inf, # These two boundary conditions enforce no futility stops.
  bTC1f = -Inf,
  beta = 0.2,
  alternative_TP = 0.4,
  alternative_TC = 0,
  Delta = 0.2,
  print_progress = FALSE
)
#> Sample sizes (stage 1): T: 217, P: 87, C: 212
#> Sample sizes (stage 2): T: 217, P: 87, C: 212
#> Efficacy boundaries (stage 1): Z_TP_e: 2.06549, Z_TC_e: 2.28000
#> Futility boundaries (stage 1): Z_TP_f: -Inf, Z_TC_f: -Inf
#> Efficacy boundaries (stage 2): Z_TP_e: 2.34934, Z_TC_e: 2.10025
#> Inverse normal combination test weights (TP): w1: 0.70711, w2: 0.70711
#> Inverse normal combination test weights (TC): w1: 0.70711, w2: 0.70711
#> Maximum overall sample size: 1032
#> Expected sample size (H1): 789.8
#> Expected sample size (H0): 1029.7
#> Expected placebo group sample size (H1): 99.0
#> Expected placebo group sample size (H0): 172.3
#> Objective function value: 789.8
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.50%
#> Probability of futility stop (H1): 0.00%
#> Probability of futility stop (H0): 0.00%
#> Minimum conditional power: 0.00%
#> Power: 80.09%
#> Futility boundaries: nonbinding
#> Futility testing method: always both futility tests
```

You can replace the default objective function by any quoted
expression. In the following example, we optimize the design parameters
to minimize the expected squared sample size under the alternative
hypothesis. These expressions can make use of internal objects created
in the objective evaluation methods; check out the source code of
`optimize_design_twostage` in the `optimization_methods.R` file for more
information. `ASN`, `ASNP`, `n` and `final_state_probs` could be useful
objects for crafting a custom objective function.

```
optimize_design_twostage(
  beta = 0.2,
  alternative_TP = 0.4,
  alternative_TC = 0,
  Delta = 0.2,
  print_progress = FALSE,
  objective = quote(
    (final_state_probs[["H1"]][["TP1E_TC1E"]] + final_state_probs[["H1"]][["TP1F_TC1F"]]) *
      (n[[1]][["T"]] + n[[1]][["P"]] + n[[1]][["C"]])^2 +
      (final_state_probs[["H1"]][["TP1E_TC12E"]] + final_state_probs[["H1"]][["TP1E_TC12F"]]) *
      (n[[1]][["T"]] + n[[1]][["P"]] + n[[1]][["C"]] + n[[2]][["T"]] + n[[2]][["C"]])^2 +
      (final_state_probs[["H1"]][["TP12F_TC1"]] + final_state_probs[["H1"]][["TP12E_TC12E"]] +
        final_state_probs[["H1"]][["TP12E_TC12F"]]) *
      (n[[1]][["T"]] + n[[1]][["P"]] + n[[1]][["C"]] + n[[2]][["T"]] + n[[2]][["P"]] + n[[2]][["C"]])^2
  )
)
#> Sample sizes (stage 1): T: 265, P: 86, C: 250
#> Sample sizes (stage 2): T: 157, P: 106, C: 178
#> Efficacy boundaries (stage 1): Z_TP_e: 2.04615, Z_TC_e: 2.29343
#> Futility boundaries (stage 1): Z_TP_f: 0.05568, Z_TC_f: 0.61221
#> Efficacy boundaries (stage 2): Z_TP_e: 2.40364, Z_TC_e: 2.06634
#> Inverse normal combination test weights (TP): w1: 0.70160, w2: 0.71257
#> Inverse normal combination test weights (TC): w1: 0.77851, w2: 0.62763
#> Maximum overall sample size: 1042
#> Expected sample size (H1): 776.9
#> Expected sample size (H0): 676.6
#> Expected placebo group sample size (H1): 97.0
#> Expected placebo group sample size (H0): 103.3
#> Objective function value: 636459.1
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.45%
#> Probability of futility stop (H1): 4.93%
#> Probability of futility stop (H0): 82.46%
#> Minimum conditional power: 14.48%
#> Power: 80.06%
#> Futility boundaries: nonbinding
#> Note: Results are presented as if futility boundaries were strictly obeyed.
#> Futility testing method: always both futility tests
```
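
Why the objective has to be wrapped in `quote()` can be illustrated in base R. The sketch below uses a stand-in list `toy_env` in place of the package's internal objects. The indexing of `n` follows the example above, but whether the package evaluates the expression with `eval()` in exactly this way is an assumption.

```r
# Stand-in values for illustration only; inside the package these objects
# are created during objective evaluation.
toy_env <- list(
  n = list(
    list(T = 265, P = 86, C = 250),  # stage 1 sample sizes
    list(T = 157, P = 106, C = 178)  # stage 2 sample sizes
  )
)

# A quoted expression is stored unevaluated, so it may refer to objects
# that do not exist yet when it is written down ...
objective <- quote(
  (n[[1]][["T"]] + n[[1]][["P"]] + n[[1]][["C"]] +
     n[[2]][["T"]] + n[[2]][["P"]] + n[[2]][["C"]])^2
)

# ... and is only evaluated once those objects are available.
squared_max_n <- eval(objective, toy_env)
squared_max_n
```

With these stand-in values the result is 1042^2 = 1085764, the squared maximum overall sample size of the design printed above.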

Meis, J, Pilz, M, Herrmann, C, Bokelmann, B, Rauch, G, Kieser, M.
Optimization of the two-stage group sequential three-arm gold-standard
design for non-inferiority trials. *Statistics in Medicine.*
2023; 42(4): 536–558. doi:10.1002/sim.9630.