---
title: "Package MKinfer"
author: "Matthias Kohl"
date: "`r Sys.Date()`"
output: 
  rmarkdown::html_vignette:
    toc: true
bibliography: MKinfer.bib
vignette: >
  %\VignetteIndexEntry{Package MKinfer}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{utf8}
---


## Introduction  

Package MKinfer includes a collection of functions for the computation of 
various confidence intervals [@Altman2000; @Hedderich2018] including bootstrapped 
versions [@Davison1997] as well as Hsu [@Hedderich2018], permutation [@Janssen1997], 
bootstrap [@Davison1997], intersection-union [@Sozu2015] and multiple imputation 
[@Barnard1999] t-test; furthermore, computation of intersection-union z-test 
as well as multiple imputation Wilcoxon tests. Graphical visualizations: volcano plot, 
Bland-Altman plots [@Bland1986; @Shieh2018], mean difference plot [@Boehning2008], 
plot of test statistic for permutation and bootstrap tests as well as objects 
of class htest.

We first load the package.

```{r}
library(MKinfer)
```


## Confidence Intervals

### Binomial Proportion
There are several functions for computing confidence intervals. We can compute
12 different confidence intervals for binomial proportions [@DasGupta2001]; e.g.

```{r}
## default: "wilson"
binomCI(x = 12, n = 50)
## Clopper-Pearson interval
binomCI(x = 12, n = 50, method = "clopper-pearson")
## identical to 
binom.test(x = 12, n = 50)$conf.int
```

For all intervals implemented see the help page of function binomCI. One can
also compute bootstrap intervals via function boot.ci of package boot [@Davison1997] 
as well as one-sided intervals.


### Difference of Two Binomial Proportions
There are several functions for computing confidence intervals. We can compute
different confidence intervals for the difference of two binomial proportions
(independent [@Newcombe1998a] and paired case [@Newcombe1998b]); e.g.

```{r}
## default: wilson 
binomDiffCI(a = 5, b = 0, c = 51, d = 29)
## default: wilson with continuity correction
binomDiffCI(a = 212, b = 144, c = 256, d = 707, paired = TRUE)
```

For all intervals implemented see the help page of function binomDiffCI. One
can also compute boostrap intervals via function boot.ci of package 
boot [@Davison1997]  as well as one-sided intervals.


### Mean and SD
We can compute confidence intervals for mean and SD [@Altman2000, @Davison1997].

```{r}
x <- rnorm(50, mean = 2, sd = 3)
## mean and SD unknown
normCI(x)
meanCI(x)
sdCI(x)
## SD known
normCI(x, sd = 3)
## mean known
normCI(x, mean = 2, alternative = "less")
## bootstrap
meanCI(x, boot = TRUE)
```


### Difference in Means
We can compute confidence interval for the difference of means [@Altman2000; 
@Hedderich2018; @Davison1997].

```{r}
x <- rnorm(20)
y <- rnorm(20, sd = 2)
## paired
normDiffCI(x, y, paired = TRUE)
## compare
normCI(x-y)
## bootstrap
normDiffCI(x, y, paired = TRUE, boot = TRUE)

## unpaired
y <- rnorm(10, mean = 1, sd = 2)
## classical
normDiffCI(x, y, method = "classical")
## Welch (default as in case of function t.test)
normDiffCI(x, y, method = "welch")
## Hsu
normDiffCI(x, y, method = "hsu")
## bootstrap: assuming equal variances
normDiffCI(x, y, method = "classical", boot = TRUE, bootci.type = "bca")
## bootstrap: assuming unequal variances
normDiffCI(x, y, method = "welch", boot = TRUE, bootci.type = "bca")
```

In case of unequal variances and unequal sample sizes per group the classical
confidence interval may have a bad coverage (too long or too short), as is 
indicated by the small Monte-Carlo simulation study below.

```{r}
M <- 100
CIhsu <- CIwelch <- CIclass <- matrix(NA, nrow = M, ncol = 2)
for(i in 1:M){
  x <- rnorm(10)
  y <- rnorm(30, sd = 0.1)
  CIclass[i,] <- normDiffCI(x, y, method = "classical")$conf.int
  CIwelch[i,] <- normDiffCI(x, y, method = "welch")$conf.int
  CIhsu[i,] <- normDiffCI(x, y, method = "hsu")$conf.int
}
## coverage probabilies
## classical
sum(CIclass[,1] < 0 & 0 < CIclass[,2])/M
## Welch
sum(CIwelch[,1] < 0 & 0 < CIwelch[,2])/M
## Hsu
sum(CIhsu[,1] < 0 & 0 < CIhsu[,2])/M
```


### Coefficient of Variation
We provide 12 different confidence intervals for the (classical) coefficient 
of variation [@Gulhar2012; @Davison1997]; e.g.

```{r}
x <- rnorm(100, mean = 10, sd = 2) # CV = 0.2
## default: "miller"
cvCI(x)
## Gulhar et al. (2012)
cvCI(x, method = "gulhar")
## bootstrap
cvCI(x, method = "boot")
```

For all intervals implemented see the help page of function cvCI.


### Quantiles, Median and MAD
We start with the computation of confidence intervals for quantiles [@Hedderich2018; @Davison1997].

```{r}
x <- rexp(100, rate = 0.5)
## exact
quantileCI(x = x, prob = 0.95)
## asymptotic
quantileCI(x = x, prob = 0.95, method = "asymptotic")
## boot
quantileCI(x = x, prob = 0.95, method = "boot")
```

Next, we consider the median.

```{r}
## exact
medianCI(x = x)
## asymptotic
medianCI(x = x, method = "asymptotic")
## boot
medianCI(x = x, method = "boot")
```

Finally, we take a look at MAD (median absolute deviation) where by default 
the standardized MAD is used (see function mad).

```{r}
## exact
madCI(x = x)
## aysymptotic
madCI(x = x, method = "asymptotic")
## boot
madCI(x = x, method = "boot")
## unstandardized
madCI(x = x, constant = 1)
```


## Hsu Two-Sample t-Test
The Hsu two-sample t-test is an alternative to the Welch two-sample t-test using
a different formula for computing the degrees of freedom of the respective 
t-distribution [@Hedderich2018]. The following code is taken and adapted from 
the help page of the t.test function.

```{r}
t.test(1:10, y = c(7:20))      # P = .00001855
t.test(1:10, y = c(7:20, 200)) # P = .1245    -- NOT significant anymore
hsu.t.test(1:10, y = c(7:20))
hsu.t.test(1:10, y = c(7:20, 200))

## Traditional interface
with(sleep, t.test(extra[group == 1], extra[group == 2]))
with(sleep, hsu.t.test(extra[group == 1], extra[group == 2]))
## Formula interface
t.test(extra ~ group, data = sleep)
hsu.t.test(extra ~ group, data = sleep)
```

We compare the result of the Welch t-test and Hsu t-test using function h0plot.

```{r, fig.width=7, fig.height=7}
h0plot(t.test(extra ~ group, data = sleep))
h0plot(hsu.t.test(extra ~ group, data = sleep))
```


## Bootstrap t-Test
One and two sample bootstrap t-tests with equal or unequal variances in the
two sample case [@Efron1993].

```{r}
boot.t.test(1:10, y = c(7:20)) # without bootstrap: P = .00001855
boot.t.test(1:10, y = c(7:20, 200)) # without bootstrap: P = .1245

## Traditional interface
with(sleep, boot.t.test(extra[group == 1], extra[group == 2]))
## Formula interface
boot.t.test(extra ~ group, data = sleep)
```

Using function h0plot the distribution of the bootstrap results of the test
statistic can be plotted.

```{r, fig.width=7, fig.height=7}
h0plot(boot.t.test(extra ~ group, data = sleep, bootStat = TRUE))
```


## Permutation t-Test
One and two sample permutation t-tests with equal [@Efron1993] or unequal 
variances [@Janssen1997] in the two sample case.

```{r}
perm.t.test(1:10, y = c(7:20)) # without permutation: P = .00001855
## permutation confidence interval sensitive to outlier!
perm.t.test(1:10, y = c(7:20, 200)) # without permutation: P = .1245

## Traditional interface
with(sleep, perm.t.test(extra[group == 1], extra[group == 2]))
## Formula interface
res <- perm.t.test(extra ~ group, data = sleep)
res
```

Using function h0plot the distribution of the permutation results of the test
statistic can be plotted.

```{r, fig.width=7, fig.height=7}
h0plot(perm.t.test(extra ~ group, data = sleep, permStat = TRUE))
```

In case of skewed distributions, one may use function p2ses to compute an
alternative standardized effect size (SES) as proposed by Botta-Dukat [@BottaDukat2018].

```{r}
res <- perm.t.test(extra ~ group, data = sleep)
p2ses(res$p.value)
```


## Intersection-Union z- and t-Test
We compute the multivariate intersection-union z- and t-test.

```{r}
library(mvtnorm)
## effect size
delta <- c(0.25, 0.5)
## covariance matrix
Sigma <- matrix(c(1, 0.75, 0.75, 1), ncol = 2)
## sample size
n <- 50
## generate random data from multivariate normal distributions
X <- rmvnorm(n=n, mean = delta, sigma = Sigma)
Y <- rmvnorm(n=n, mean = rep(0, length(delta)), sigma = Sigma)
## perform multivariate z-test
mpe.z.test(X = X, Y = Y, Sigma = Sigma)
## perform multivariate t-test
mpe.t.test(X = X, Y = Y)
```


## Multiple Imputation t-Test
Function mi.t.test can be used to compute a multiple imputation t-test by applying
the approch of Rubin [@Rubin1987] in combination with the adjustment of 
Barnard and Rubin [@Barnard1999].

```{r}
## Generate some data
set.seed(123)
x <- rnorm(25, mean = 1)
x[sample(1:25, 5)] <- NA
y <- rnorm(20, mean = -1)
y[sample(1:20, 4)] <- NA
pair <- c(rnorm(25, mean = 1), rnorm(20, mean = -1))
g <- factor(c(rep("yes", 25), rep("no", 20)))
D <- data.frame(ID = 1:45, response = c(x, y), pair = pair, group = g)

## Use Amelia to impute missing values
library(Amelia)
res <- amelia(D, m = 10, p2s = 0, idvars = "ID", noms = "group")

## Per protocol analysis (Welch two-sample t-test)
t.test(response ~ group, data = D)
## Intention to treat analysis (Multiple Imputation Welch two-sample t-test)
mi.t.test(res, x = "response", y = "group")

## Per protocol analysis (Two-sample t-test)
t.test(response ~ group, data = D, var.equal = TRUE)
## Intention to treat analysis (Multiple Imputation two-sample t-test)
mi.t.test(res, x = "response", y = "group", var.equal = TRUE)

## Specifying alternatives
mi.t.test(res, x = "response", y = "group", alternative = "less")
mi.t.test(res, x = "response", y = "group", alternative = "greater")

## One sample test
t.test(D$response[D$group == "yes"])
mi.t.test(res, x = "response", subset = D$group == "yes")
mi.t.test(res, x = "response", mu = -1, subset = D$group == "yes",
          alternative = "less")
mi.t.test(res, x = "response", mu = -1, subset = D$group == "yes",
          alternative = "greater")

## paired test
t.test(D$response, D$pair, paired = TRUE)
mi.t.test(res, x = "response", y = "pair", paired = TRUE)
```

Function mi.t.test also works with package mice.

```{r}
library(mice)
res.mice <- mice(D, m = 10, print = FALSE)
mi.t.test(res.mice, x = "response", y = "group")
```


## Multiple Imputation Wilcoxon Tests
Function mi.wilcox.test can be used to compute multiple imputation Wilcoxon tests 
by applying the median p rule (MPR) approach; see Section 13.3 in [@Heymans2019].

```{r}
## Per protocol analysis (Exact Wilcoxon rank sum test)
library(exactRankTests)
wilcox.exact(response ~ group, data = D, conf.int = TRUE)
## Intention to treat analysis (Multiple Imputation Exact Wilcoxon rank sum test)
mi.wilcox.test(res, x = "response", y = "group")

## Specifying alternatives
mi.wilcox.test(res, x = "response", y = "group", alternative = "less")
mi.wilcox.test(res, x = "response", y = "group", alternative = "greater")

## One sample test
wilcox.exact(D$response[D$group == "yes"], conf.int = TRUE)
mi.wilcox.test(res, x = "response", subset = D$group == "yes")
mi.wilcox.test(res, x = "response", mu = -1, subset = D$group == "yes",
               alternative = "less")
mi.wilcox.test(res, x = "response", mu = -1, subset = D$group == "yes",
               alternative = "greater")

## paired test
wilcox.exact(D$response, D$pair, paired = TRUE, conf.int = TRUE)
mi.wilcox.test(res, x = "response", y = "pair", paired = TRUE)
```

Function mi.wilcox.test also works with package mice.

```{r}
mi.wilcox.test(res.mice, x = "response", y = "group")
```


## Visualization of Mean Difference
The function mdplot can be used to visualize the mean difference (MD) 
of two groups, assuming two normal distributions. In addition, the sensitivity 
and specificity based on the optimal cutoff (Youden's J statistic) can be
added [@Boehning2008].

```{r, fig.width=7, fig.height=7}
## small effect size
mdplot(delta = 0.2)
## medium effect size
mdplot(delta = 0.5)
## large effect size
mdplot(delta = 0.8)
## z-factor = 0  (z-factor = 1 - 3*2*sd/delta)
mdplot(delta = 6)
## z-factor = 0.5  (z-factor = 1 - 3*2*sd/delta)
mdplot(delta = 12)

## unequal variances
mdplot(delta = 0.8, sd1 = 1, sd2 = 2)
mdplot(delta = 0.8, sd1 = 2, sd2 = 1)
```

Function md2sens can be used to compute sensitivity and specificity for given
MD and SDs.

```{r, fig.width=7, fig.height=7}
library(ggplot2)
## (standardized) mean difference to sensitivity/specificity
## equal variances
delta <- seq(from = 0.0, to = 6, by = 0.05)
res <- sapply(delta, md2sens)
DF <- data.frame(SMD = delta, sensitivity = res[1,], 
                 specificity = res[2,])
ggplot(DF, aes(x = SMD, y = sensitivity)) +
  geom_line() + ylim(0.5, 1.0) + xlab("(standardized) mean difference") +
  ylab("sensitivity = specificity") + ggtitle("SD1 = SD2 = 1")

## unequal variances
delta <- seq(from = 0.0, to = 6, by = 0.05)
res <- sapply(delta, md2sens, sd1 = 1, sd2 = 2)
DF <- data.frame(MD = delta, performance = c(res[1,], res[2,]),
                 measure = c(rep("sensitivity", length(delta)),
                             rep("specificity", length(delta))))
ggplot(DF, aes(x = MD, y = performance, color = measure)) +
  geom_line() + ylim(0, 1.0) + xlab("mean difference") +
  scale_color_manual(values = c("darkblue", "darkred")) +
  ggtitle("SD1 = 1, SD2 = 2")
```

Function md2zfactor can be used to compute the z-factor [@Zhang1999] for given 
MD and SDs.

```{r, fig.width=7, fig.height=7}
## (standardized) mean difference to sensitivity/specificity
## equal variances
library(ggplot2)
delta <- seq(from = 2, to = 18, by = 0.05)
res <- sapply(delta, md2zfactor)
DF <- data.frame(SMD = delta, zfactor = res)
ggplot(DF, aes(x = SMD, y = zfactor)) +
  geom_line() + xlab("(standardized) mean difference") +
  ylab("z-factor") + ggtitle("SD1 = SD2 = 1") + 
  geom_hline(yintercept = 1, linetype = "dotted")

## unequal variances
delta <- seq(from = 2.5, to = 20, by = 0.05)
res <- sapply(delta, md2zfactor, sd1 = 1, sd2 = 2)
DF <- data.frame(MD = delta, zfactor = res)
ggplot(DF, aes(x = MD, y = zfactor)) +
  geom_line() + xlab("mean difference") +
  ylab("z-factor") + ggtitle("SD1 = 1, SD2 = 2") +
  geom_hline(yintercept = 1, linetype = "dotted")
```


## Repeated Measures One-Way ANOVA
We provide a simple wrapper function that allows to compute a classical
repeated measures one-way ANOVA as well as a respective mixed effects model.
In addition, the non-parametric Friedman and Quade tests can be computed.

```{r}
set.seed(123)
outcome <- c(rnorm(10), rnorm(10, mean = 1.5), rnorm(10, mean = 1))
timepoints <- factor(rep(1:3, each = 10))
patients <- factor(rep(1:10, times = 3))
rm.oneway.test(outcome, timepoints, patients)
rm.oneway.test(outcome, timepoints, patients, method = "lme")
rm.oneway.test(outcome, timepoints, patients, method = "quade")
rm.oneway.test(outcome, timepoints, patients, method = "friedman")
```

We visualize the results using function h0plot.

```{r, fig.width=7, fig.height=7}
h0plot(rm.oneway.test(outcome, timepoints, patients))
h0plot(rm.oneway.test(outcome, timepoints, patients, method = "lme"))
h0plot(rm.oneway.test(outcome, timepoints, patients, method = "quade"))
h0plot(rm.oneway.test(outcome, timepoints, patients, method = "friedman"),
       qtail = 1e-4)
```


## Volcano Plots
Volcano plots can be used to visualize the results when many tests have been
applied. They show a measure of effect size in combination with (adjusted)
p values.

```{r, fig.width=7, fig.height=7}
## Generate some data
x <- matrix(rnorm(1000, mean = 10), nrow = 10)
g1 <- rep("control", 10)
y1 <- matrix(rnorm(500, mean = 11.75), nrow = 10)
y2 <- matrix(rnorm(500, mean = 9.75, sd = 3), nrow = 10)
g2 <- rep("treatment", 10)
group <- factor(c(g1, g2))
Data <- rbind(x, cbind(y1, y2))
## compute Hsu t-test
pvals <- apply(Data, 2, function(x, group) hsu.t.test(x ~ group)$p.value,
               group = group)
## compute log-fold change
logfc <- function(x, group){
  res <- tapply(x, group, mean)
  log2(res[1]/res[2])
}
lfcs <- apply(Data, 2, logfc, group = group)
volcano(lfcs, p.adjust(pvals, method = "fdr"), 
        effect.low = -0.25, effect.high = 0.25, 
        xlab = "log-fold change", ylab = "-log10(adj. p value)")
```


## Bland-Altman plots
Bland-Altman plots are used to visually compare two methods of measurement. We
can generate classical Bland-Altman plots [@Bland1986; @Shieh2018].

```{r, fig.width=7, fig.height=7}
data("fingsys")
baplot(fingsys$fingsys, fingsys$armsys, 
       title = "Approximative Confidence Intervals", 
       type = "parametric", ci.type = "approximate")
baplot(fingsys$fingsys, fingsys$armsys, 
       title = "Exact Confidence Intervals", 
       type = "parametric", ci.type = "exact")
baplot(fingsys$fingsys, fingsys$armsys, 
       title = "Bootstrap Confidence Intervals", 
       type = "parametric", ci.type = "boot", R = 999)
```

In the following nonparametric Bland-Altman plots, the median is used instead
of the mean in combination with empirical quantiles for the lower and upper
limit of agreement.

```{r, fig.width=7, fig.height=7}
data("fingsys")
baplot(fingsys$fingsys, fingsys$armsys, 
       title = "Approximative Confidence Intervals", 
       type = "nonparametric", ci.type = "approximate")
baplot(fingsys$fingsys, fingsys$armsys, 
       title = "Exact Confidence Intervals", 
       type = "nonparametric", ci.type = "exact")
baplot(fingsys$fingsys, fingsys$armsys, 
       title = "Bootstrap Confidence Intervals", 
       type = "nonparametric", ci.type = "boot", R = 999)
```


## Imputation of Standard Deviations for Changes from Baseline
The function imputeSD can be used to impute standard deviations for changes
from baseline adopting the approach of the Cochrane handbook 
[@Cochrane2019, Section 6.5.2.8].

```{r}
SD1 <- c(0.149, 0.022, 0.036, 0.085, 0.125, NA, 0.139, 0.124, 0.038)
SD2 <- c(NA, 0.039, 0.038, 0.087, 0.125, NA, 0.135, 0.126, 0.038)
SDchange <- c(NA, NA, NA, 0.026, 0.058, NA, NA, NA, NA)
imputeSD(SD1, SD2, SDchange)
```

Correlations can also be provided via argument corr. This option may particularly 
be useful, if no complete data is available.

```{r}
SDchange2 <- rep(NA, 9)
imputeSD(SD1, SD2, SDchange2, corr = c(0.85, 0.9, 0.95))
```


## Pairwise Comparisons
Function pairwise.fun enables the application of arbitrary functions
for pairwise comparisons.

```{r}
pairwise.wilcox.test(airquality$Ozone, airquality$Month, 
                     p.adjust.method = "none")
## To avoid the warnings
pairwise.fun(airquality$Ozone, airquality$Month, 
             fun = function(x, y) wilcox.exact(x, y)$p.value)
```

The function is also the basis for our functions pairwise.wilcox.exact and
pairwise.ext.t.test, which in contrast to the standard functions pairwise.wilcox.test
and pairwise.t.test generate a much more detailed output. In addition, 
pairwise.wilcox.exact is based on wilcox.exact instead of wilcox.test.

```{r}
pairwise.wilcox.exact(airquality$Ozone, airquality$Month)
```

Furthermore, function pairwise.ext.t.test also enables pairwise Student, Welch,
Hsu, bootstrap and permutation t tests.

```{r}
pairwise.t.test(airquality$Ozone, airquality$Month, pool.sd = FALSE)
pairwise.ext.t.test(airquality$Ozone, airquality$Month)
pairwise.ext.t.test(airquality$Ozone, airquality$Month,
                    method = "hsu.t.test")
pairwise.ext.t.test(airquality$Ozone, airquality$Month, 
                    method = "boot.t.test")
pairwise.ext.t.test(airquality$Ozone, airquality$Month, 
                    method = "perm.t.test")
```


## sessionInfo
```{r}
sessionInfo()
```


## References