# Introduction

The main objective of R-package **VCA** is to perform variance component analyses (VCA). A VCA assesses how the variability of a dependent variable is structured, taking into account its association with one or multiple random-effects variables. The proportions of the total variability attributed to these random-effects variables are called *variance components* (VC). Thus, VCA is the procedure of estimating how much each VC contributes to the total variability of the dependent variable. Moreover, methods are provided for estimating confidence intervals (CI) of VCs, along with graphical tools for better understanding the data and for detecting outliers. Also included, but usually of less importance in the field of VCA, are the estimation of fixed effects and least squares means (LS means) as well as tests of linear hypotheses of fixed effects/LS means of linear mixed models (LMMs).

VCs can be estimated in random models (*random effects* or *variance component models*) and LMMs (*linear mixed-effects models*) by applying either *analysis of variance* (ANOVA)-type estimation or *Restricted Maximum Likelihood* (REML). Experiments analyzed with such models frequently occur in performance evaluations of diagnostic tests or analyzers (devices), where various types of measurement (im)precision are quantified (see e.g. the *CLSI EP05-A3* guideline). In this setting it is important to point out that precision and imprecision both refer to the variability of measurements and differ only in their point of view, i.e. the larger the precision of a measuring method, the smaller its imprecision and vice versa.

In the course of the discussion of R-package **VCA**, several examples will be given to help the user understand the application of the most important functions. For all of the examples, simulated data sets coming with the R-package will be used. One of these, **VCAdata1**, comprises 2520 observations of 6 variables: 3 devices (*device*), 3 lots (*lot*), 10 samples (*sample*), 21 days (*day*) and 2 runs within day (*run*) with 2 replicates per run, plus the measured value (*y*). Not all factor combinations occur: as the variability chart further below shows, each combination of device and lot covers 7 measuring days, giving \(3 \times 3 \times 7 \times 2 \times 2 = 252\) observations per sample. For each run, two measurements (*y*) were performed under conditions which are as constant as possible. One commonly speaks of *repeatability* measurement conditions.

```
## 'data.frame': 2520 obs. of 6 variables:
## $ lot : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
## $ sample: Factor w/ 10 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ day : Factor w/ 21 levels "1","2","3","4",..: 1 1 1 1 2 2 2 2 3 3 ...
## $ run : Factor w/ 2 levels "1","2": 1 1 2 2 1 1 2 2 1 1 ...
## $ y : num 2.11 2.09 2.73 2.72 2.87 ...
## $ device: Factor w/ 3 levels "1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
```

For all other data sets used in this document please refer to the description that can be found in R-package **VCA**.

# Visualization of Variability via Function *varPlot*

Visualization can often help in understanding new or unknown data. When performing a *VCA* it is highly recommended to first look at a variability chart in order to better understand the major sources of variability and to get a rough idea of the total variability to expect. Moreover, the variability chart can help to spot abnormalities such as outliers, i.e. extreme values that lie at an abnormal distance from the remaining values and are therefore likely to have a negative effect on the analysis and possibly lead to invalid results (see sections *Outlier Detection* and *Checking for Normality and Extreme Values Using R-Package STB*).

## Default Settings

**VCA** provides a function called **varPlot** for generating such a variability chart. To create a variability chart by *varPlot()*, it is necessary to state the model formula as well as the data set as function parameters. In this first example, a variability chart for a random model of the form y~(device+lot)/day/run will be plotted. All effects in a *VCA* are modeled as random since the interest lies in their distribution, i.e. their contribution to the total variability (variance). *Run* is nested within *day*, and *day* is, according to the model formula, nested within combinations of *lot* and *device*. Please note that in *varPlot()* the real model is not relevant but rather the order of the variables which determines the layout of the table depicted at the bottom of the variability chart. The implementation of the variability chart cannot distinguish between nested and crossed terms. The data that is used for drawing the variability chart is **datS5**, a subset of *VCAdata1* consisting only of 252 observations from sample number 5 from a total of 10 samples:

Given the parameters **form** and **Data** only, *varPlot()* creates the variability chart in a plain design:
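The data preparation and the plain call described above might look as follows. This is only a minimal sketch: it assumes that package **VCA** is installed and that *VCAdata1* can be loaded via `data()`. The surrounding `requireNamespace()` check is an addition that lets the block run without error where the package is missing.

```r
# Sketch only: the VCA-specific part runs when the package is available
if (requireNamespace("VCA", quietly = TRUE)) {
  library(VCA)
  data(VCAdata1)
  # keep only the observations of sample 5 (252 of 2520 rows)
  datS5 <- subset(VCAdata1, sample == 5)
  # plain variability chart: only model formula and data set are given
  varPlot(form = y ~ (device + lot)/day/run, Data = datS5)
}
```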

The variability chart shows all 252 observations in *datS5* according to the nesting structure of the model formula. Again, this equals 3 devices (bottom line) with 3 lots each (second from bottom), 7 measuring days (third from bottom) per lot and 2 runs per day. Since each run consists of two replicates, these two values are drawn and connected by a vertical line highlighting the complete range of replicated measurements. Whenever measurements are performed under identical conditions (replicated), i.e. no other sources of variability have an influence, the variance of residual error will be inferred from these differences. In the setting of in-vitro diagnostics one speaks of *repeatability* variance meaning the pure assay (im)precision.

## Advanced Settings

It is possible to specify the display of further graphical elements in the variability chart using various style parameters to improve the overview and facilitate information extraction. Almost all parameters correspond to list-type objects. This allows a very detailed specification of the graphical appearance, e.g. in the parameter *MeanLine* one can select the variables whose mean-values are to be highlighted as horizontal lines, together with their colors (*col*), line-width (*lwd*), line-type (*lty*) etc. In fact, all function arguments accepted by the R-function *lines* can be specified, since the list *MeanLine* is passed on to *lines()* with some modifications taking place beforehand.

In the next example, horizontal lines for mean-values (*MeanLine*) are drawn for the intercept as well as for the factors *device* and *lot*. Via list-element *col* the respective colors are set to white, blue and magenta. List-element *lwd* changes the line width of the horizontal mean lines. Style parameter settings are assigned to the respective graphical elements/variables in the order in which the elements are named. Furthermore, the three levels of variable *lot* will be highlighted in different shades of gray using function argument *BG*, which also needs to be specified as a list:

# Outlier Detection

## Outlier Detection by Visual Inspection

Looking at the variability chart, one pair of measurements stands out because it deviates remarkably far from its within-run mean, i.e. from the small red cross on the line connecting the values vertically. These two potential outliers are the replicates of run 2 on day 4, measured in lot 2 with device 1. Extreme values might violate the normality assumption made for all random variates of a linear mixed-effects model (i.e. random effects and residuals). Thereby, outliers can affect the validity of the model and its results. The following commented R-code is used to generate the variability chart with the potential outliers highlighted:

```
# indicate outliers by new variable
datS5$out <- 0
datS5$out[which(datS5$device==1 & datS5$lot==2 & datS5$day==4 & datS5$run==2)] <- 1
varPlot(y~(device+lot)/day/run, datS5,
        # plots horizontal lines for sub-sets
        MeanLine=list(var=c("int", "device", "lot"),
                      col=c("white", "blue", "magenta"),
                      lwd=c(3,3,3)),
        # colors the background according to a variable's levels
        # 'col.table=TRUE' indicates the variable in the table
        BG=list(var="lot", col=paste0("gray", c(70,80,90)),
                col.table=TRUE),
        # tailoring the appearance of measurements (points)
        Points=list(pch=list(var="out", pch=c(21, 24)),
                    col=list(var="out", col=c("black", "red")),
                    bg= list(var="out", bg=c("white", "yellow")),
                    cex=list(var="out", cex=c(1, 1.5))),
        # specification of the text within the table below
        VarLab=list(list(cex=1.75), list(cex=1.5),
                    list(cex=1, srt=90), list()),
        # variable names right of the table (since 'side=4')
        VCnam=list(cex=1.25, side=4),
        # use 33% of the height of the upper part for the table
        htab=.33)
```

The function argument *Points* can also be used as list of arguments passed forward to function *points()*. But it can be used in a more detailed manner letting the user specify list-elements *pch*, *col*, *bg*, and *cex* as lists themselves. By doing so one can incorporate additional information indicated by different plotting symbols, color, and size of the symbols. Element *bg* is only useful in case of setting *pch* equal to 21-25. Then, *col* is interpreted as the color of the border and *bg* as the background color of these symbols (see also the last example of the varPlot help).

## Outlier Detection Using Studentized Residuals

Studentization of residuals and/or random effects transforms these random variates to have mean zero and standard deviation (SD) equal to 1 using their element-wise variance estimates for standardization. Outliers amongst studentized conditional residuals correspond to outliers on the replicate level, i.e. where all measuring conditions are kept as constant as possible (here: *device*, *lot*, *day* and *run*). Studentized conditional residuals can be assessed and plotted by the **VCA**-function **plotRandVar**. *plotRandVar()* requires a fitted model as input. In addition, the parameter *term* specifies the type of residuals whereas the parameter *mode* specifies the transformation to be applied to the residuals.

There are two options available for fitting a variance component model. One can use either ANOVA-type estimation via function **anovaVCA** or REML-estimation via function **remlVCA**. For balanced data and in situations where ANOVA-estimation does not produce negative variance-estimates, both methods generate identical results. Otherwise, the two approaches to estimating random effects, and therefore VCs, are likely to differ, although this difference is usually small. We are using ANOVA-type estimation here, but all statements are also valid for REML-estimation. Note that there is a function **fitVCA** wrapping the calls to anovaVCA and remlVCA.

Fitting the model to the data returns the object *fitS5* of class **VCA**:

```
fitS5 <- anovaVCA(y~(device+lot)/day/run, datS5)
# if varPlot was called before, better shut down the old graphics device
# graphical parameters were not reset to allow the user to add further
# information to the variability chart, which would not be possible otherwise
# (dev.off() fails if no device is open, hence the check)
if(dev.cur() > 1) dev.off()
plotRandVar(fitS5, term="cond", mode="student")
```

Residuals that deviate by more than 3 times the standard deviation can be considered *extreme*, since after studentization the residuals have mean 0 and SD equal to 1. Thus, under normality only about 0.3% of all observations are expected to lie outside of the \(\pm 3 \times SD\) interval. There are two values that come into question, as can be seen in the plot drawn by *plotRandVar()*:
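As a quick plausibility check of the 0.3% figure, the tail probability of the standard normal distribution can be evaluated directly in base R. The sketch treats the studentized residuals as approximately independent standard normal values, which is a simplifying assumption.

```r
# two-sided probability of a standard normal value lying outside +/- 3
p.out <- 2 * pnorm(-3)
p.out
# expected number of such values among the 252 residuals of datS5
n.exp <- 252 * p.out
n.exp
```

With fewer than one such value expected among 252 residuals, two values beyond the limits are worth a closer look.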

However, since there are too many values, the labelling of the x-axis has become unclear. Setting the function’s parameter **pick** to TRUE allows selecting specific residuals by clicking on them within the graphics device window:

```
plotRandVar(fitS5, term="cond", mode="student", pick=TRUE)
abline(h=c(-3, 3), lty=2, col="red", lwd=2)
mtext(side=4, at=c(-3, 3), col="red", line=.25, las=1, text=c(-3, 3))
```

For every value selected in the plot the respective row name within the data set will be displayed next to the value:

*Please note*: After having selected the desired data points, the picking procedure has to be stopped manually by right-clicking on the graphics device output and selecting “Stop” in order to be able to execute further commands within the R-console.

To double-check whether the manually picked data points really are the replicate measurements initially identified by inspecting the variability chart, simply select these observations using their row names:

```
## lot sample day run y device out
## 1191 2 5 4 2 17.11 1 1
## 1192 2 5 4 2 19.03 1 1
```

In fact, these observations are the ones initially suspected. The values should be considered for exclusion from the data set to prevent invalid results. See *Checking for Normality and Extreme Values Using R-Package STB* for further methods that help determine whether these observations really have to be excluded in view of the normality assumption.

## An Outlier Detection Algorithm

Extreme values on the replicate-level are supposed to be detected by means of the Grubbs-test (CLSI EP05-A3). A modified version of *Grubbs*-test may be defined by means of the median and the **MD68**-statistic. The median can be seen as the robust version of the mean which is known to be sensitive to outliers, thus itself biased in case real outlying observations exist. The *MD68* can be seen as the robust version of the SD, defined as follows:

```
# Function computing the MD68 using SAS PCTLDEF5 quantile definition.
# x (numeric) values from which the MD68 shall be computed
# na.rm (logical) TRUE = missing values will be excluded automatically
md68 <- function(x, na.rm=FALSE)
{
    stopifnot(is.numeric(x))
    Med  <- median(x, na.rm=na.rm)
    Diff <- abs(x-Med)
    # type=2 corresponds to SAS PCTLDEF5; na.rm is passed on so that
    # missing values do not cause an error when na.rm=TRUE
    MD68 <- quantile(Diff, probs=.68, type=2, na.rm=na.rm)
    MD68
}
```
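To illustrate why the *MD68* is regarded as a robust counterpart of the SD, the following sketch compares both statistics on a small made-up vector before and after replacing one value by a gross outlier. The function is restated, with `na.rm` also passed on to `quantile()`, only to keep the block self-contained. The two data vectors are hypothetical.

```r
# md68() restated so that this example runs on its own
md68 <- function(x, na.rm = FALSE) {
  stopifnot(is.numeric(x))
  Diff <- abs(x - median(x, na.rm = na.rm))
  quantile(Diff, probs = .68, type = 2, na.rm = na.rm)
}

x.clean <- c(1, 2, 3, 4, 5)
x.outl  <- c(1, 2, 3, 4, 100)   # last value replaced by a gross outlier

c(SD = sd(x.clean), MD68 = unname(md68(x.clean)))
c(SD = sd(x.outl),  MD68 = unname(md68(x.outl)))
```

In this toy example the MD68 equals 2 in both cases, whereas the SD rises from about 1.58 to about 43.6.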

A critical value of \(2.75 \times MD68\) was found to work reasonably well for this robust, *MD68*-based outlier detection algorithm. Specifically, the deviation of an observation from the median of its replicate group may not exceed \(2.75 \times MD68\). The replicate groups in our running example are always defined within *run*, the smallest grouping-variable for which constant measuring conditions can be assumed. The outlier-detection algorithm should be applied within groups in which so-called intermediate precision measuring conditions exist. In the example this refers to a single lot on a single device, i.e. nine such groups exist here (3 devices x 3 lots). For each of these nine groups the group-specific *MD68*-estimate should be computed, and the observation-specific deviation from the median of the replicate group should be compared to the critical value of \(2.75 \times MD68\).

```
# identify groups of intermediate precision (IP) measuring conditions
datS5$IPgroup <- paste(datS5$device, datS5$lot, sep="_")
# uniquely identify replicate groups within IP-groups
datS5$RepGroup <- paste(datS5$day, datS5$run, sep="_")
# Define a function performing the MD68-based outlier-algorithm.
# The data.frame will be returned with three additional variables,
# "diff" (absolute differences from the replicate-group median),
# "threshold" (critical value 2.75 x MD68) and
# "outlier" (TRUE = outlier, FALSE = no outlier).
#
# obj (data.frame) with at least two variables
# resp (character) name of the numeric response variable
# RepGroup (character) name of the replicate-grouping variable
md68OutlierDetection <- function(obj=NULL, resp=NULL, RepGroup=NULL)
{
    stopifnot(is.data.frame(obj))
    cn <- colnames(obj)
    stopifnot(is.character(resp) && resp %in% cn && is.numeric(obj[,resp]))
    stopifnot(is.character(RepGroup) && RepGroup %in% cn)
    MD68 <- md68(obj[, resp])
    Crit <- MD68*2.75
    # ave() applies the function per replicate group and returns the result
    # in the original row order (unlist(tapply(...)) would return it sorted
    # by group label, which need not match the order of the rows)
    obj$diff <- ave(obj[,resp], obj[,RepGroup],
                    FUN=function(x) abs(x - median(x)))
    obj$threshold <- Crit
    obj$outlier <- obj$diff > Crit
    obj
}
```

To apply this *MD68*-based outlier detection algorithm to the data, IP-group by IP-group, one can use a simple for-loop. The result is data.frame out which has three additional variables (*threshold*, *diff*, and *outlier*).

```
IPgroups <- unique(datS5$IPgroup)
for(i in seq_along(IPgroups))
{
    tmpData <- subset(datS5, IPgroup == IPgroups[i])
    if(i == 1)
        out <- md68OutlierDetection(tmpData, resp="y", RepGroup="RepGroup")
    else
        out <- rbind(out, md68OutlierDetection(tmpData, resp="y", RepGroup="RepGroup"))
}
head(out)
```

```
## lot sample day run y device out IPgroup RepGroup diff threshold
## 337 1 5 1 1 17.46 1 0 1_1 1_1 0.410 1.252
## 338 1 5 1 1 18.28 1 0 1_1 1_1 0.410 1.252
## 339 1 5 1 2 17.89 1 0 1_1 1_2 0.145 1.252
## 340 1 5 1 2 18.18 1 0 1_1 1_2 0.145 1.252
## 341 1 5 2 1 18.01 1 0 1_1 2_1 0.170 1.252
## 342 1 5 2 1 17.67 1 0 1_1 2_1 0.170 1.252
## outlier
## 337 FALSE
## 338 FALSE
## 339 FALSE
## 340 FALSE
## 341 FALSE
## 342 FALSE
```

`## [1] FALSE`

The value *FALSE* shown above, as returned by an expression such as `any(out$outlier)`, states that no element of variable *outlier* takes the value *TRUE*, i.e. no outlier was identified. As the **R** source-code shows, the *MD68* is computed from all observations of an IP-group, so for these data the outlier detection algorithm suffers from the relatively large between-day and between-run variability. Both contribute, in addition to the measurement imprecision, to the total variance within IP-group 1_2, which is reflected in a rather large threshold value of \(2.75 \times MD68 = 1.624\). Consequently, the two previously identified observations *1191* and *1192*, which deviate substantially from each other under repeatability conditions, are not detected. The extreme value detection outlined above is thus rather conservative, i.e. replicates have to differ a lot in order to be termed extreme.
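The comparison for the two suspicious replicates can be retraced by hand. The sketch below uses only numbers quoted in this document, i.e. the two measured values of rows 1191 and 1192 and the threshold of 1.624 for IP-group 1_2.

```r
# the two replicates of run 2 on day 4 (lot 2, device 1), rows 1191 and 1192
y.rep <- c(17.11, 19.03)
# critical value 2.75 x MD68 quoted for IP-group 1_2
Crit  <- 1.624
# absolute deviation of each replicate from the replicate-group median
d <- abs(y.rep - median(y.rep))
d
d > Crit
```

With only two replicates per run, each deviation from the median is half the difference between the replicates (here 0.96), so the two values would have to differ by more than twice the threshold to be flagged.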

## Extreme Values on all Levels

Outliers may occur on all levels of an LMM. Levels other than replicates should be assessed using studentized random effects. This is possible by specifying the desired random effects term via the *term* parameter of *plotRandVar()*. The exact term names can be obtained from the ANOVA-table shown when printing the *VCA*-object, here *fitS5*:

```
##
##
## Result Variance Component Analysis:
## -----------------------------------
##
## Name DF SS MS VC %Total SD CV[%]
## 1 total 15.1786 0.6145 100 0.7839 4.4889
## 2 device 2 24.8695 12.4348 0.1379 22.4425 0.3714 2.1266
## 3 lot 2 27.2843 13.6421 0.1523 24.7816 0.3902 2.2346
## 4 device:lot:day 58 49.3208 0.8504 0.1403 22.8281 0.3745 2.1448
## 5 device:lot:day:run 63 18.2223 0.2892 0.1052 17.1217 0.3244 1.8574
## 6 error 126 9.9308 0.0788 0.0788 12.8261 0.2807 1.6076
##
## Mean: 17.46 (N = 252)
##
## Experimental Design: balanced | Method: ANOVA
```
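The VC column of this table can be retraced from the mean squares (MS). The following sketch uses the rounded MS values printed above and divisors that follow from the balanced design of *datS5* (2 replicates per run, 2 runs per day, 7 days per device-lot combination). It is meant as an illustration of ANOVA-type estimation for this particular design, not as the package's internal algorithm.

```r
# mean squares (MS) taken from the ANOVA-table above
MS <- c(device = 12.4348, lot = 13.6421, day = 0.8504,
        run = 0.2892, error = 0.0788)

n.run <- 2            # observations per run (replicates)
n.day <- 2 * n.run    # observations per day
n.dev <- 3 * 7 * n.day  # observations per device (and per lot): 84

# each VC = (own MS - MS of the next lower term) / number of observations
VC <- c(device = (MS[["device"]] - MS[["day"]]) / n.dev,
        lot    = (MS[["lot"]]    - MS[["day"]]) / n.dev,
        day    = (MS[["day"]]    - MS[["run"]]) / n.day,
        run    = (MS[["run"]]    - MS[["error"]]) / n.run,
        error  =  MS[["error"]])

round(VC, 4)                    # compare with column 'VC'
round(sum(VC), 4)               # total variance
round(100 * VC / sum(VC), 2)    # compare with column '%Total'
round(sqrt(VC), 4)              # compare with column 'SD'
```

Small deviations in the last digit are to be expected because the MS values were rounded to four decimals.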

As an example, consider the studentized random effects of “device:lot:day”. This reveals the between-day variability:

The same can be done for the studentized random effects of “device:lot:day:run”, which show the between-run (within-day) variability:

**Please note:** If there are too few factor levels (here e.g. for *device* and *lot* with 3 levels each), the assessment of studentized random effects is only of limited use.

```
## DF SS MS VC %Total SD CV[%]
## total 15.18 NA NA 0.61450 100.00 0.7839 4.489
## device 2.00 24.870 12.43476 0.13791 22.44 0.3714 2.127
## lot 2.00 27.284 13.64213 0.15228 24.78 0.3902 2.235
## device:lot:day 58.00 49.321 0.85036 0.14028 22.83 0.3745 2.145
## device:lot:day:run 63.00 18.222 0.28924 0.10521 17.12 0.3244 1.857
## error 126.00 9.931 0.07882 0.07882 12.83 0.2807 1.608
```

# Checking for Normality and Extreme Values Using R-Package **STB**

As indicated before in section *Outlier Detection*, it is assumed that random factors are stochastically independent and that random effects are normally distributed. However, extreme values, i.e. outliers, can violate this normality assumption and therefore distort the estimates. The rule of thumb is to exclude as few observations as possible and as many as necessary, i.e. generally no more than two outliers that violate the normality assumption.

A reasonable means to visually check for observations or extreme values that violate this assumption is to draw a **Q-Q plot** (quantile-quantile plot). In this context, the Q-Q plot computes the theoretically expected random variates (i.e. random effects or residuals) given a normal distribution. This results in a straight diagonal line. Then the observed random variate values are drawn against the expected values. If the observed values evidently deviate from the straight diagonal line, the data most likely are not normally distributed. An additional aid in visually determining if observations violate the normal assumption is to add tolerance bands to the Q-Q plot.
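The construction of such a Q-Q plot can be sketched in base R. Note the assumptions: the data are simulated stand-ins for studentized residuals, and the dashed limits are simple *pointwise* 95% limits obtained by simulation. They only illustrate the idea of a band around the diagonal and are not the simultaneous tolerance band computed by R-package **STB**.

```r
set.seed(1)
n <- 100                      # number of observed values (hypothetical data)
x <- sort(rnorm(n))           # stand-in for studentized residuals
# theoretically expected standard normal quantiles of the n ordered values
q.theo <- qnorm(ppoints(n))

# simulate N samples of size n and sort each; columns = simulated samples
N   <- 2000
sim <- replicate(N, sort(rnorm(n)))
# pointwise 95% limits per order statistic (NOT a simultaneous band)
lim <- apply(sim, 1, quantile, probs = c(0.025, 0.975))

plot(q.theo, x, xlab = "Theoretical Quantiles", ylab = "Observed Values")
abline(0, 1)                  # straight line expected under normality
lines(q.theo, lim[1, ], lty = 2, col = "red")
lines(q.theo, lim[2, ], lty = 2, col = "red")
```

A simultaneous band has to be wider than these pointwise limits, because it is meant to cover all order statistics at once with the stated probability.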

R-package **STB** (https://CRAN.R-project.org/package=STB) provides function **stb.VCA** which simulates *N* data sets incorporating the variance-covariance structure of a fitted object of class *VCA* and constructs a 100(1-alpha)% simultaneous tolerance band (STB) for the simulated data. The *STB* is based on random variates extracted from the *N* simulated data sets. The construction of the tolerance band requires a reasonably large number of simulations to obtain a sufficiently exact STB (see Schuetzenmeister and Piepho 2012). In the following example, simultaneous tolerance bands for studentized (*mode*) conditional (*term*) residuals extracted from 5000 simulated data sets (**N**) are created using method *stb* for objects of class *VCA*. To do so, an LMM fit for *datS5* is required. By way of example, all model terms are assumed to be random:

```
set.seed(23)
fitS5.LMM <- anovaMM(y~((device)+(lot))/(day)/(run), datS5)
STB.res <- stb(fitS5.LMM, term="cond", mode="student", N=5000)
```

One can clearly see that there is one observation in the top right corner lying outside of the tolerance band. Applying the generic R-function **plot** to the object *STB.res* created above and setting its parameter *pick=TRUE* will allow for manual selection of observations by clicking on them within the graphics device window - just as shown in the example above when discussing the variability chart:

Since observation 1192 violates the normality assumption, it is removed from the original data set and the STB is calculated again. First, a *VCA*-object has to be generated by fitting a model, here using function **anovaMM**, which is designed to fit linear mixed models. Terms intended to be modeled as random have to be enclosed in parentheses:

```
datS5.reduced <- datS5[!rownames(datS5) == "1192",]
fitS5.reduced <- anovaMM(y~((device)+(lot))/(day)/(run), datS5.reduced)
```

As it turns out, removing only the observation 1192 already results in a QQ-plot that obviously does not show any indications for the violation of the normality assumption anymore. To double check another plot of studentized conditional residuals is created from the fitted model of the reduced data set:

In fact, observation 1191 of the initially (see section *Outlier Detection*) suspected value pair does not have to be excluded from the data set anymore.

Excluding too many observations potentially leads to inaccurate estimates since information is lost. On the other hand, not excluding observations that should be removed from the data because they violate the normality assumption leads to invalid estimates. Once again, the following ANOVA table shows the estimates that result from fitting the original full data set *datS5*, still containing observations 1191 and 1192. Pay special attention to the error’s contribution to the total variability (see *%Total* column):

```
## DF SS MS VC %Total SD CV[%]
## total 15.18 NA NA 0.61450 100.00 0.7839 4.489
## device 2.00 24.870 12.43476 0.13791 22.44 0.3714 2.127
## lot 2.00 27.284 13.64213 0.15228 24.78 0.3902 2.235
## device:lot:day 58.00 49.321 0.85036 0.14028 22.83 0.3745 2.145
## device:lot:day:run 63.00 18.222 0.28924 0.10521 17.12 0.3244 1.857
## error 126.00 9.931 0.07882 0.07882 12.83 0.2807 1.608
```

Removing only observation 1192, since it caused the violation of the normality assumption, results in the following estimates. Note how the *error %Total* reduced from 12.83% before to 10.64% now:

```
## DF SS MS VC %Total SD CV[%]
## total 14.41 NA NA 0.6081 100.00 0.7798 4.467
## device 2.00 23.952 11.9759 0.1330 21.87 0.3647 2.089
## lot 2.00 28.693 14.3464 0.1613 26.53 0.4017 2.301
## device:lot:day 58.00 49.211 0.8485 0.1443 23.72 0.3798 2.176
## device:lot:day:run 63.00 17.212 0.2732 0.1048 17.24 0.3237 1.854
## error 125.00 8.088 0.0647 0.0647 10.64 0.2544 1.457
```

# Commonly Used VCA-Models

In this section we want to give an overview on the most commonly used VCA-models. The *CLSI EP05-A3* guideline describes multiple experimental settings, e.g. the single-site evaluation study and the multi-site evaluation study. Here, we show how these experiments can be analyzed using R-package **VCA**.

## \(20 \times 2 \times 2\) Single Site Evaluation

The single site experiment recommended in the CLSI EP05-A3 consists of 20 days, 2 runs per day with 2 replicates per run (\(20 \times 2 \times 2 = 80\)).

```
# Function converts a color-string into RGB-code
# col (character) string specifying an R-color
# alpha (numeric) degree of transparency in [0, 1], 0=fully transparent, 1=opaque
asRGB <- function(col, alpha)
    rgb(t(col2rgb(col))/255, alpha=alpha)
data(dataEP05A2_3)
varPlot(y~day/run, dataEP05A2_3,
        # controls horizontal mean lines
        MeanLine=list(var=c("int", "day"), col=c("gray75", "blue"), lwd=c(2,2)),
        # controls how points (concentrations) are plotted, here using semi-transparency
        # to see overlaid points
        Points=list(pch=16, col=asRGB("black", .5), cex=1.25),
        # controls how replicate-means are plotted
        Mean=list(col="magenta", cex=1.25, lwd=2),
        # controls how the title is shown
        Title=list(main="20 x 2 x 2 Single-Site Evaluation", cex.main=1.75),
        # controls plotting of levels per VC, if as many lists as there are VCs are
        # specified, each VC can be specified individually
        VarLab=list(list(cex=1.5), list(cex=1.25)),
        # controls how names of VCs are plotted
        VCnam=list(font=2, cex=1.5),
        # controls appearance of the Y-axis label
        YLabel=list(text="Concentration [mg/dL]", las=0, line=3, font=2, cex=1.25),
        # Y-axis labels rotated
        las=1)
```

The **VCA** package comes with three example data sets of this type (*dataEP05A2_1*, *dataEP05A2_2*, *dataEP05A2_3*). Here, run is nested within day, since all 20 days are independent of each other. Thus, run No. 1 on a given day does not have anything in common with run No. 1 on another day. This can be expressed using the nesting-operator ‘/’, i.e. by the model formula y~day/run. After the model has been fitted to the data, some additional inferential statistics are usually of interest, such as confidence intervals (CI) for VCs and/or \(\chi^2\)-tests for claims of *repeatability* or total imprecision. This can be addressed using function **VCAinference**.

```
##
##
## Result Variance Component Analysis:
## -----------------------------------
##
## Name DF SS MS VC %Total SD CV[%]
## 1 total 50.129098 35.546313 100 5.962073 3.967511
## 2 day 19 1506.145222 79.270801 12.231099 34.40891 3.497299 2.327307
## 3 day:run 20 606.928138 30.346407 7.031193 19.780372 2.65164 1.764556
## 4 error 40 651.360839 16.284021 16.284021 45.810718 4.035346 2.685355
##
## Mean: 150.3 (N = 80)
##
## Experimental Design: balanced | Method: ANOVA
```

```
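# fit.SS3 is the fitted model printed above; it is assumed here to have been
# obtained by a call like: fit.SS3 <- anovaVCA(y~day/run, dataEP05A2_3)
# estimate 95% confidence intervals, request CI for
# all variance components via 'VarVC=TRUE'
inf.SS3 <- VCAinference(fit.SS3, VarVC=TRUE)
inf.SS3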
```

```
##
##
##
## Inference from (V)ariance (C)omponent (A)nalysis
## ------------------------------------------------
##
## > VCA Result:
## -------------
##
## Name DF SS MS VC %Total SD CV[%] Var(VC)
## 1 total 50.1291 35.5463 100 5.9621 3.9675
## 2 day 19 1506.1452 79.2708 12.2311 34.4089 3.4973 2.3273 47.0968
## 3 day:run 20 606.9281 30.3464 7.0312 19.7804 2.6516 1.7646 26.3372
## 4 error 40 651.3608 16.284 16.284 45.8107 4.0353 2.6854 13.2585
##
## Mean: 150.3 (N = 80)
##
## Experimental Design: balanced | Method: ANOVA
##
##
## > VC:
## -----
## Estimate CI LCL CI UCL One-Sided LCL One-Sided UCL
## total 35.5463 24.8957 54.8935 26.338 51.0984
## day 12.2311 0* 25.6818 0.9429 23.5193
## day:run 7.0312 0* 17.0897 0* 15.4726
## error 16.284 10.9764 26.659 11.6818 24.571
##
## > SD:
## -----
## Estimate CI LCL CI UCL One-Sided LCL One-Sided UCL
## total 5.9621 4.9896 7.409 5.1321 7.1483
## day 3.4973 0* 5.0677 0.9711 4.8497
## day:run 2.6516 0* 4.134 0* 3.9335
## error 4.0353 3.3131 5.1632 3.4179 4.9569
##
## > CV[%]:
## --------
## Estimate CI LCL CI UCL One-Sided LCL One-Sided UCL
## total 3.9675 3.3203 4.9304 3.4152 4.7569
## day 2.3273 0* 3.3724 0.6462 3.2273
## day:run 1.7646 0* 2.751 0* 2.6176
## error 2.6854 2.2047 3.4359 2.2744 3.2986
##
##
## 95% Confidence Level | * CI-limits constrained to be >= 0
## SAS PROC MIXED method used for computing CIs
```
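Several entries of this output can be retraced with a few lines of base R. The sketch below takes the MS, DF and Var(VC) values from the tables above and assumes, in line with the note on the SAS PROC MIXED method, that \(\chi^2\)-based limits are used for *total* (with Satterthwaite degrees of freedom) and *error*, and Wald-type limits for the other VCs. The helper `chisqCI()` is a hypothetical name introduced here for illustration.

```r
# values taken from the 'VCA Result' table above
MS <- c(day = 79.270801, run = 30.346407, error = 16.284021)
DF <- c(day = 19, run = 20, error = 40)

# total variance as linear combination of MS (2 runs x 2 replicates per day)
cf  <- c(day = 1/4, run = 1/4, error = 1/2)
tot <- sum(cf * MS)
# Satterthwaite approximation of the degrees of freedom of 'total'
df.tot <- tot^2 / sum((cf * MS)^2 / DF)

# two-sided chi-squared confidence interval for a variance (hypothetical helper)
chisqCI <- function(vc, df, alpha = 0.05)
  c(LCL = df * vc / qchisq(1 - alpha/2, df),
    UCL = df * vc / qchisq(alpha/2, df))

ci.total <- chisqCI(tot, df.tot)
ci.error <- chisqCI(MS[["error"]], DF[["error"]])

# Wald-type CI for 'day' using Var(VC) = 47.0968, limits constrained to >= 0
vc.day <- (MS[["day"]] - MS[["run"]]) / 4
ci.day <- pmax(vc.day + c(LCL = -1, UCL = 1) * qnorm(0.975) * sqrt(47.0968), 0)

round(c(total = tot, DF = df.tot), 4)
round(rbind(total = ci.total, day = ci.day, error = ci.error), 4)
```

The negative lower Wald limit for *day* is set to zero, which corresponds to the entries marked with an asterisk in the output.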

By default, function *anovaVCA* sets all negative variance estimates equal to zero, since negative values for variances are not defined. Negative estimates are a known problem of ANOVA-estimation for variance components. There are several reasons which might cause them, e.g. the specified model does not match the structure of the data (wrong model), there might be outlying observations violating model assumptions (normality), or the variability might just be too large. These explanations are given in the *SAS* documentation of *PROC VARCOMP*, in section Negative Variance Component Estimates.

## \(3 \times 5 \times 1 \times 5\) Multi-Site Evaluation

The next model we would like to look at is used in the \(3 \times 5 \times 1 \times 5\) multi-site evaluation study. Here, 3 devices (sites, laboratories, ...) are used and each sample is measured on 5 days in a single run with 5 replicates, resulting in 75 observations overall. R-package **VCA** comes with 3 such data sets (dataEP05A3_MS_1, dataEP05A3_MS_2, dataEP05A3_MS_3). This model assumes a single reagent-lot to be used on all three sites (devices, labs, ...). A nice-looking variability chart for such data can be generated as follows:

```
data(dataEP05A3_MS_1)
varPlot(y~site/day, dataEP05A3_MS_1,
        BG=list(var="site", col=paste0("gray", c(100, 80, 60))),
        Points=list(pch=16, col=asRGB("black", .5), cex=1.25),
        MeanLine=list(var=c("int", "site"), col=c("black", "orange"), lwd=c(2,2)),
        Mean=list(col="cyan", cex=1.25, lwd=2), las=1,
        YLabel=list(text="Concentration [mg/dL]", las=0, line=3, font=2, cex=1.25),
        Title=list(main="Multi-Site Evaluation on dataEP05A3_MS_1", cex.main=1.75),
        VCnam=list(font=2, cex=1.5),
        VarLab=list(list(cex=1.5, font=2), list(cex=1.25, font=2)))
```

The model itself can be fitted to the data using the following code (now using *REML*).

```
# fit 3 x 5 x 1 x 5 model to data
fit.MS1 <- fitVCA(y~site/day, dataEP05A3_MS_1, method="REML")
fit.MS1
```

```
##
##
## Result Variance Component Analysis:
## -----------------------------------
##
## Name DF VC %Total SD CV[%] Var(VC)
## 1 total 16.889879 3.635232 100 1.906629 7.566293 1.564832
## 2 site 1.488641 1.017401 27.987245 1.008663 4.002794 1.390671
## 3 site:day 2.105837 0.345882 9.514705 0.588117 2.333892 0.113621
## 4 error 60 2.271949 62.49805 1.507299 5.981587 0.172058
##
## Mean: 25.2 (N = 75)
##
## Experimental Design: balanced | Method: REML
```

## \(3 \times 5 \times 2 \times 3\) Multi-Site Evaluation

The *CLSI EP05-A3* guideline describes a second type of multi-site evaluation which additionally takes into account the variability between runs. This \(3 \times 5 \times 2 \times 3\) model requires 15 additional observations compared to the \(3 \times 5 \times 1 \times 5\) model: there are 2 runs per day with 3 replicates each, resulting in 90 observations overall. Since there is no such data set contained in R-package **VCA**, we will simulate such data, plot it and fit the respective model.

```
# simulate data for the 3 x 5 x 2 x 3 model
set.seed(23)
dat.MS2 <- data.frame(
    y = 50 +
        # 3 random effects for sites
        rep(rnorm(3, 0, 2.5), rep(30, 3)) +
        # 15 random effects for days
        rep(rnorm(15, 0, 2), rep(6, 15)) +
        # 30 random effects for runs
        rep(rnorm(30, 0, 1), rep(3, 30)) +
        # residual error (repeatability)
        rnorm(90, 0, 1.5),
    site = gl(3, 30, labels=paste0("Site_", 1:3)),
    day  = gl(5, 6, 90),
    run  = gl(2, 3, 90)
)
```
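Before plotting, it can be useful to confirm that the factor layout really corresponds to a balanced \(3 \times 5 \times 2 \times 3\) design. A small base R sketch that rebuilds only the design factors from the simulation above (the object names `design` and `cells` are illustrative):

```r
# Rebuild only the design factors used in the simulation above
design <- data.frame(
    site = gl(3, 30, labels = paste0("Site_", 1:3)),
    day  = gl(5, 6, 90),
    run  = gl(2, 3, 90)
)

# Number of replicates in every site x day x run cell
cells <- table(design$site, design$day, design$run)

nrow(design)      # total number of observations
length(cells)     # number of site/day/run cells
range(cells)      # identical minimum and maximum indicate a balanced design
```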

Now visualize this data set.

```
varPlot(y~site/day/run, dat.MS2,
BG=list(var="site", col=paste0("gray", c(100, 80, 60))),
Points=list(pch=16, col=asRGB("black", .5), cex=1.25),
MeanLine=list( var=c("int", "site", "day"),
col=c("black", "orange", "blue"),
lwd=c(2,2,2)),
Mean=list(col="cyan", cex=1.25, lwd=2), las=1,
YLabel=list(text="Concentration [mg/dL]", las=0, line=3, font=2, cex=1.25),
Title=list(main="3 x 5 x 2 x 3 Multi-Site Evaluation", cex.main=1.75),
VCnam=list(font=2, cex=1.5),
# controls for which variable vertical lines are added between levels
# and how these are plotted
VLine=list(var="day", col="gray75"),
VarLab=list(list(cex=1.5), list(cex=1.25), list(cex=1.25)))
```

And finally fit the respective VCA-model using ANOVA-type estimation.

```
# fit 3 x 5 x 2 x 3 model to data (ANOVA is default)
fit.MS2 <- fitVCA(y~site/day/run, dat.MS2)
print(fit.MS2, digits=4)
```

```
##
##
## Result Variance Component Analysis:
## -----------------------------------
##
## Name DF SS MS VC %Total SD CV[%]
## 1 total 8.2851 7.2503 100 2.6926 5.2611
## 2 site 2 207.2225 103.6113 2.956 40.7713 1.7193 3.3594
## 3 site:day 12 179.1583 14.9299 1.837 25.3373 1.3554 2.6483
## 4 site:day:run 15 58.615 3.9077 0.7252 10.0026 0.8516 1.6639
## 5 error 60 103.9207 1.732 1.732 23.8888 1.3161 2.5714
##
## Mean: 51.18 (N = 90)
##
## Experimental Design: balanced | Method: ANOVA
```
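For a balanced, fully-nested design the ANOVA-type estimates can be reproduced by hand by equating the mean squares to their expected values. The following base R sketch assumes the standard expected mean squares of a balanced nested random model and uses the rounded mean squares printed above. The object names are illustrative.

```r
# Mean squares copied from the ANOVA table above
ms <- c(site = 103.6113, day = 14.9299, run = 3.9077, error = 1.732)

# Number of observations per level in the balanced 3 x 5 x 2 x 3 design
n.rep <- 3             # replicates per run
n.run <- 2 * n.rep     # observations per day
n.day <- 5 * n.run     # observations per site

# Each VC is the difference of adjacent mean squares divided by the
# number of observations per level of the respective factor
vc <- c(
    site  = unname(ms["site"] - ms["day"])   / n.day,
    day   = unname(ms["day"]  - ms["run"])   / n.run,
    run   = unname(ms["run"]  - ms["error"]) / n.rep,
    error = unname(ms["error"])
)

round(vc, 4)
round(sum(vc), 4)      # total variance
```

Up to rounding of the inputs, these values correspond to column *VC* of the table above.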

## Multi-Site Multi-Lot Evaluation

The last model we want to address is the multi-lot reproducibility model. This refers to an experimental setup where one of the two multi-site models is used with the additional variable reagent-lot. In in-vitro diagnostics, reagent-lot is a factor for which only a limited number of levels is available. Therefore, the typical multi-lot reproducibility design consists of 3 different reagent-lots used on 3 different sites. Different allocation schemes exist for assigning the 3 lots to the 3 sites. The best, but also the least parsimonious, design has all lots tested on all sites. The following, among others, are valid lab-lot assignment schemes that allow the VC for reagent-lot to be estimated:

```
##              Site_1 Site_2 Site_3
## ReagentLot_1 X      X      X
## ReagentLot_2 X      X      X
## ReagentLot_3 X      X      X
```

```
## Site_1 Site_2 Site_3
## ReagentLot_1 X X
## ReagentLot_2 X X
## ReagentLot_3 X X
```

```
## Site_1 Site_2 Site_3
## ReagentLot_1 X X
## ReagentLot_2 X X
## ReagentLot_3 X X
```

The original example data set VCAdata1 is very similar to the experimental designs described here. The subset datS5 refers to an experimental design with 3 sites (devices), 3 reagent-lots tested on each site, 7 days per site-lot combination, and 2 runs per day with 2 replicates per run.

In previous examples in this section we could always use a fully-nested model, because the testing days in different laboratories are independent of each other, despite the fact that they might be encoded by integers 1 to 5. The same holds for runs: 2 runs performed on 2 different days are independent of each other even though both might be encoded using the same factor level. This independence does not hold for reagent-lots. The same lots are used in all laboratories, each lot showing lot-specific performance, i.e. each lot will have a lot-specific bias compared to the unknown true concentration of a sample. This lot-specific bias directly translates to random effects for variable lot, and their variation is modeled as following a normal distribution. At least this is assumed when estimation takes place. To reflect this in the model formula, *site* and *lot* have to be modeled as main effects, and all other terms are nested within combinations of all main effects. This can be done using the following expressions.
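The fitting call is not shown with the output that follows. A minimal sketch that is consistent with the term names in that output is given here. It assumes that `datS5` is the subset of `VCAdata1` belonging to sample 5 (an assumption based on its name) and it uses the object name `fit.MSML` referred to later in the text. The helper object `form.MSML` is introduced for illustration only.

```r
# Formula with device and lot as main effects, everything else nested
# within their combinations (hypothetical helper object)
form.MSML <- y ~ (device + lot)/day/run

# Base R expands the formula into the terms shown in the VCA table
attr(terms(form.MSML), "term.labels")

# Fit the model only if package VCA is available
if (requireNamespace("VCA", quietly = TRUE)) {
    library(VCA)
    data(VCAdata1)
    datS5 <- subset(VCAdata1, sample == 5)   # assumed definition of datS5
    fit.MSML <- fitVCA(y ~ (device + lot)/day/run, datS5)
    print(fit.MSML, digits = 4)
}
```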

```
##
##
## Result Variance Component Analysis:
## -----------------------------------
##
## Name DF SS MS VC %Total SD CV[%]
## 1 total 15.1786 0.6145 100 0.7839 4.4889
## 2 device 2 24.8695 12.4348 0.1379 22.4425 0.3714 2.1266
## 3 lot 2 27.2843 13.6421 0.1523 24.7816 0.3902 2.2346
## 4 device:lot:day 58 49.3208 0.8504 0.1403 22.8281 0.3745 2.1448
## 5 device:lot:day:run 63 18.2223 0.2892 0.1052 17.1217 0.3244 1.8574
## 6 error 126 9.9309 0.0788 0.0788 12.8261 0.2807 1.6076
##
## Mean: 17.46 (N = 252)
##
## Experimental Design: balanced | Method: ANOVA
```

If we now use the wrong (fully-nested) model formulation, the result is different.
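Again, the call is not shown with the output. Judging from the term names in the table that follows, the misspecified model nests *lot* within *device*. A sketch under the same assumptions as before (`datS5` taken to be sample 5 of `VCAdata1`, `form.MSML2` being an illustrative helper object):

```r
# Fully-nested formula, which wrongly treats lots as nested within devices
form.MSML2 <- y ~ device/lot/day/run

# Term expansion in base R: lot only appears as "device:lot"
attr(terms(form.MSML2), "term.labels")

# Fit the model only if package VCA is available
if (requireNamespace("VCA", quietly = TRUE)) {
    library(VCA)
    data(VCAdata1)
    datS5 <- subset(VCAdata1, sample == 5)   # assumed definition of datS5
    fit.MSML2 <- fitVCA(y ~ device/lot/day/run, datS5)
    print(fit.MSML2, digits = 4)
}
```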

```
##
##
## Result Variance Component Analysis:
## -----------------------------------
##
## Name DF SS MS VC %Total SD CV[%]
## 1 total 21.0635 0.5637 100 0.7508 4.2995
## 2 device 2 24.8695 12.4348 0.0748 13.2669 0.2735 1.5661
## 3 device:lot 6 36.9139 6.1523 0.1935 34.3199 0.4399 2.5188
## 4 device:lot:day 54 39.6911 0.735 0.1114 19.7688 0.3338 1.9117
## 5 device:lot:day:run 63 18.2223 0.2892 0.1052 18.6634 0.3244 1.8574
## 6 error 126 9.9309 0.0788 0.0788 13.981 0.2807 1.6076
##
## Mean: 17.46 (N = 252)
##
## Experimental Design: balanced | Method: ANOVA
```
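The DF columns of both tables can be reproduced by counting factor-level combinations. The following base R sketch assumes the design described for datS5 (3 devices, 3 lots, 7 days per device-lot combination, 2 runs per day, 2 replicates per run). The model formulas given in the comments are inferred from the term names of the two tables.

```r
# Numbers of levels in the design
n.dev <- 3; n.lot <- 3; n.day <- 7; n.run <- 2; n.rep <- 2

n.cell.day <- n.dev * n.lot * n.day    # device-lot-day cells
n.cell.run <- n.cell.day * n.run       # runs
n.obs      <- n.cell.run * n.rep       # observations

# Crossed main effects, inferred formula: y ~ (device + lot)/day/run
df.crossed <- c(
    device               = n.dev - 1,
    lot                  = n.lot - 1,
    "device:lot:day"     = n.cell.day - 1 - (n.dev - 1) - (n.lot - 1),
    "device:lot:day:run" = n.cell.run - n.cell.day,
    error                = n.obs - n.cell.run
)

# Fully nested, inferred formula: y ~ device/lot/day/run
df.nested <- c(
    device               = n.dev - 1,
    "device:lot"         = n.dev * n.lot - n.dev,
    "device:lot:day"     = n.cell.day - n.dev * n.lot,
    "device:lot:day:run" = n.cell.run - n.cell.day,
    error                = n.obs - n.cell.run
)

df.crossed
df.nested
```

Both sets of DF sum to the number of observations minus one, so the two models partition the same total differently.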

Above, the correct model specification (fit.MSML) results in 2 degrees of freedom (DF) each for the factor variables *device* and *lot*. This is intuitively right, since both variables have 3 levels. Using the wrong model (fit.MSML2) leads to 6 DF for the lot term *device:lot* (\(3 \times 3 - 3 = 6\)), which should raise concerns, since the DF attributed to lots exceed the number of lots, which is not possible. This incorrect number of DF has a direct impact on the width of the confidence intervals:

```
inf.MSML <- VCAinference(fit.MSML, VarVC=TRUE)
inf.MSML2 <- VCAinference(fit.MSML2, VarVC=TRUE)
# print CI for CV, other options are "all", "VC", "SD", and "VCA"
print(inf.MSML, what="CV", digits=2)
print(inf.MSML2, what="CV", digits=2)
```

```
##
##
##
## Inference from (V)ariance (C)omponent (A)nalysis
## ------------------------------------------------
##
##
## > CV[%]:
## --------
## Estimate CI LCL CI UCL One-Sided LCL One-Sided UCL
## total 4.49 3.32 6.93 3.48 6.44
## device 2.13 0* 3.75 0* 3.54
## lot 2.23 0* 3.93 0* 3.71
## device:lot:day 2.14 1.39 2.7 1.54 2.62
## device:lot:day:run 1.86 1.33 2.27 1.43 2.21
## error 1.61 1.43 1.83 1.46 1.79
##
##
## 95% Confidence Level | * CI-limits constrained to be >= 0
## SAS PROC MIXED method used for computing CIs
```

```
##
##
##
## Inference from (V)ariance (C)omponent (A)nalysis
## ------------------------------------------------
##
##
## > CV[%]:
## --------
## Estimate CI LCL CI UCL One-Sided LCL One-Sided UCL
## total 4.3 3.31 6.14 3.45 5.78
## device 1.57 0* 3.51 0* 3.28
## device:lot 2.52 0* 3.81 0* 3.63
## device:lot:day 1.91 1.11 2.46 1.27 2.38
## device:lot:day:run 1.86 1.33 2.27 1.43 2.21
## error 1.61 1.43 1.83 1.46 1.79
##
##
## 95% Confidence Level | * CI-limits constrained to be >= 0
## SAS PROC MIXED method used for computing CIs
```

The 95% CIs for the wrong model (fit.MSML2) tend to be narrower, which makes the variance component estimates look more precise than they are. This apparent precision is spurious, since the model is misspecified.
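The difference in CI widths can be quantified directly from the two printed outputs. The sketch below uses base R and only those terms that appear under the same name in both models (*lot* and *device:lot* are not directly comparable). Lower limits printed as `0*` are entered as 0, and all object names are illustrative.

```r
# Two-sided 95% CI limits for CV[%], copied from the outputs above
ci.correct <- rbind(
    total                = c(3.32, 6.93),
    device               = c(0,    3.75),
    "device:lot:day"     = c(1.39, 2.70),
    "device:lot:day:run" = c(1.33, 2.27),
    error                = c(1.43, 1.83)
)
ci.wrong <- rbind(
    total                = c(3.31, 6.14),
    device               = c(0,    3.51),
    "device:lot:day"     = c(1.11, 2.46),
    "device:lot:day:run" = c(1.33, 2.27),
    error                = c(1.43, 1.83)
)

# CI width = upper limit - lower limit
width <- cbind(correct = ci.correct[, 2] - ci.correct[, 1],
               wrong   = ci.wrong[, 2]   - ci.wrong[, 1])
round(width, 2)
```

In these numbers the intervals for *total* and *device* are narrower under the wrong model, while those for *device:lot:day:run* and *error* are unchanged.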

# References

Schuetzenmeister, Andre, and Hans-Peter Piepho. 2012. “Residual Analysis of Linear Mixed Models Using a Simulation Approach.” *Computational Statistics and Data Analysis* 56: 1405–16.