| Version: | 1.0.0 |
| Title: | 'SAS' Linear Model |
| Description: | This is a core implementation of 'SAS' procedures for linear models - GLM, REG, ANOVA, TTEST, FREQ, and UNIVARIATE. Some R packages provide Type II and Type III SS. However, the results of nested and complex designs are often different from those of 'SAS'. Different results do not necessarily mean incorrectness. However, many want the same results as 'SAS'. This package aims to achieve that. Reference: Littell RC, Stroup WW, Freund RJ (2002, ISBN:0-471-22174-0). |
| Depends: | R (≥ 3.5.0), mvtnorm |
| Imports: | methods |
| Suggests: | MASS, knitr, rmarkdown |
| Author: | Kyun-Seop Bae [aut, cre] |
| Maintainer: | Kyun-Seop Bae <k@acr.kr> |
| Copyright: | 2020-, Kyun-Seop Bae |
| License: | GPL-3 |
| URL: | https://cran.r-project.org/package=sasLM |
| NeedsCompilation: | no |
| Packaged: | 2026-06-14 20:31:06 UTC; Kyun-SeopBae |
| Repository: | CRAN |
| Date/Publication: | 2026-06-15 05:10:03 UTC |
'SAS' Linear Model
Description
This is a core implementation of 'SAS' procedures for linear models - GLM, REG, and ANOVA. Some packages provide type II and type III SS. However, the results of nested and complex designs are often different from those of 'SAS'. A different result does not necessarily mean incorrectness. However, many want the same results as 'SAS'. This package aims to achieve that. Reference: Littell RC, Stroup WW, Freund RJ (2002, ISBN:0-471-22174-0).
Details
This will serve those who want SAS PROC GLM, REG, and ANOVA in R.
Author(s)
Kyun-Seop Bae k@acr.kr
Examples
## SAS PROC GLM Script for Typical Bioequivalence Data
# PROC GLM DATA=BEdata;
# CLASS SEQ SUBJ PRD TRT;
# MODEL LNCMAX = SEQ SUBJ(SEQ) PRD TRT;
# RANDOM SUBJ(SEQ)/TEST;
# LSMEANS TRT / DIFF=CONTROL("R") CL ALPHA=0.1;
# ODS OUTPUT LSMeanDiffCL=LSMD;
# DATA LSMD; SET LSMD;
# PE = EXP(DIFFERENCE);
# LL = EXP(LowerCL);
# UL = EXP(UpperCL);
# PROC PRINT DATA=LSMD; RUN;
##
## SAS PROC GLM equivalent
BEdata = af(BEdata, c("SEQ", "SUBJ", "PRD", "TRT")) # Columns as factor
formula1 = log(CMAX) ~ SEQ/SUBJ + PRD + TRT # Model
GLM(formula1, BEdata) # ANOVA tables of Type I, II, III SS
RanTest(formula1, BEdata, Random="SUBJ") # Hypothesis test with SUBJ as random
ci0 = CIest(formula1, BEdata, "TRT", c(-1, 1), 0.90) # 90% CI
exp(ci0[, c("Estimate", "Lower CL", "Upper CL")]) # 90% CI of GMR
## 'nlme' or SAS PROC MIXED is preferred for an unbalanced case
## SAS PROC MIXED equivalent
# require(nlme)
# Result = lme(log(CMAX) ~ SEQ + PRD + TRT, random=~1|SUBJ, data=BEdata)
# summary(Result)
# VarCorr(Result)
# ci = intervals(Result, 0.90) ; ci
# exp(ci$fixed["TRTT",])
##
An Example Dataset of a Bioequivalence Study
Description
Contains Cmax data from a real bioequivalence study.
Usage
BEdata
Format
A data frame with 91 observations on the following 6 variables.
ADMAdmission or Hospitalization Group Code: 1, 2, or 3
SEQGroup or Sequence character code: 'RT' or 'TR'
PRDPeriod numeric value: 1 or 2
TRTTreatment or Drug code: 'R' or 'T'
SUBJSubject ID
CMAXCmax values
Details
This contains real data from a 2x2 bioequivalence study, which has three different hospitalization groups. See Bae KS, Kang SH. Bioequivalence data analysis for the case of separate hospitalization. Transl Clin Pharmacol. 2017;25(2):93-100. doi.org/10.12793/tcp.2017.25.2.93
Analysis BY variable
Description
Functions such as GLM, REG, and aov1 can be run by levels of a variable.
Usage
BY(FUN, Formula, Data, By, ...)
Arguments
FUN |
Function name to be called, such as GLM or REG |
Formula |
a conventional formula for a linear model. |
Data |
a |
By |
a variable name in the |
... |
arguments to be passed to |
Details
This mimics the BY clause of SAS procedures.
Value
a list of FUN function outputs. The names of the list are the levels of the By variable.
Author(s)
Kyun-Seop Bae k@acr.kr
Examples
BY(GLM, uptake ~ Treatment + as.factor(conc), CO2, By="Type")
BY(REG, uptake ~ conc, CO2, By="Type")
Internal Functions
Description
Internal functions
Details
These are not to be called by the user.
Confidence Interval Estimation
Description
Get the point estimate and its confidence interval with a given contrast and alpha value using the t distribution.
Usage
CIest(Formula, Data, Term, Contrast, conf.level=0.95)
Arguments
Formula |
a conventional formula for a linear model |
Data |
a |
Term |
a factor name to be estimated |
Contrast |
a level vector. Levels are alphabetically ordered by default. |
conf.level |
confidence level of the confidence interval |
Details
Get the point estimate and its confidence interval with a given contrast and alpha value using the t distribution.
Value
Estimate |
point estimate of the input linear contrast |
Lower CL |
lower confidence limit |
Upper CL |
upper confidence limit |
Std. Error |
standard error of the point estimate |
t value |
value for the t distribution |
Df |
degrees of freedom |
Pr(>|t|) |
probability of a larger absolute t value from the t distribution with the residual degrees of freedom |
Author(s)
Kyun-Seop Bae k@acr.kr
Examples
CIest(log(CMAX) ~ SEQ/SUBJ + PRD + TRT, BEdata, "TRT", c(-1, 1), 0.90) # 90% CI
F Test with a Set of Contrasts
Description
Do an F test with a given set of contrasts.
Usage
CONTR(L, Formula, Data, mu=0)
Arguments
L |
contrast matrix. Each row is a contrast. |
Formula |
a conventional formula for a linear model |
Data |
a |
mu |
a vector of mu for the hypothesis L. The length should be equal to the row count of L. |
Details
It performs an F test with a given set of contrasts (a matrix). It is similar to the CONTRAST clause of SAS PROC GLM. This can test the hypothesis that the linear combination (function)'s mean vector is mu.
Value
Returns the sum of squares and its F value and p-value.
Df |
degrees of freedom |
Sum Sq |
sum of squares for the set of contrasts |
Mean Sq |
mean square |
F value |
F value for the F distribution |
Pr(>F) |
probability of a larger F value |
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Examples
CONTR(t(c(0, -1, 1)), uptake ~ Type, CO2) # sum of square
GLM(uptake ~ Type, CO2) # compare with the above
Coefficient of Variation in percentage
Description
Coefficient of variation in percentage.
Usage
CV(y)
Arguments
y |
a numeric vector |
Details
It removes NA.
Value
Coefficient of variation in percentage.
Author(s)
Kyun-Seop Bae k@acr.kr
Examples
CV(mtcars$mpg)
Collinearity Diagnostics
Description
Collinearity diagnostics with tolerance, VIF, eigenvalue, condition index, and variance proportions
Usage
Coll(Formula, Data)
Arguments
Formula |
formula of the model |
Data |
input data as a matrix or a |
Details
Sometimes collinearity diagnostics after multiple linear regression are necessary.
Value
Tol |
tolerance of independent variables |
VIF |
variance inflation factor of independent variables |
Eigenvalue |
eigenvalue of Z'Z (crossproduct) of standardized independent variables |
Cond. Index |
condition index |
Proportions of variances |
under the names of coefficients |
Author(s)
Kyun-Seop Bae k@acr.kr
Examples
Coll(mpg ~ disp + hp + drat + wt + qsec, mtcars)
Correlation test of multiple numeric columns
Description
Testing correlation between numeric columns of data with the Pearson method.
Usage
Cor.test(Data, conf.level=0.95)
Arguments
Data |
a matrix or a |
conf.level |
confidence level |
Details
It uses all numeric columns of the input data. It uses "pairwise.complete.obs" rows.
Value
Row names show which columns are used for the test.
Estimate |
point estimate of correlation |
Lower CL |
lower confidence limit |
Upper CL |
upper confidence limit |
t value |
t value of the t distribution |
Df |
degrees of freedom |
Pr(>|t|) |
probability with the t distribution |
Author(s)
Kyun-Seop Bae k@acr.kr
Examples
Cor.test(mtcars)
Cumulative Alpha with various z's and ti
Description
Cumulative alpha values for repeated hypothesis tests with changing bound z-values and times of test (ti).
Usage
CumAlpha(z, side=2, ti=NULL, c0=NULL, Seed=5)
Arguments
z |
vector of upper z-value bounds for the repeated hypothesis test |
side |
1=one-sided test, 2=two-sided test |
ti |
vector of times (or information amount) of test. All values should be in [0, 1] and sorted. If not specified, equal intervals are assumed. |
c0 |
correlation matrix. If not specified, Brownian motion is assumed. |
Seed |
seed value for the |
Details
It calculates cumulative alpha-values for the repeated hypothesis test with a vector of upper bound z-values. If the times of test are not specified, linear (proportional) increase of information amount and Brownian motion of z-values are assumed, i.e. the correlation is sqrt(t_i/t_j).
Value
The result is a matrix.
ti |
time of test |
cum.alpha |
cumulative alpha values |
Author(s)
Kyun-Seop Bae k@acr.kr
References
Reboussin DM, DeMets DL, Kim K, Lan KKG. Computations for group sequential boundaries using the Lan-DeMets function method. Controlled Clinical Trials. 2000;21:190-207.
Examples
CumAlpha(z=rep(qnorm(1 - 0.05/2), 10)) # two-side Z-test with alpha=0.05 for ten times
Plot Pairwise Differences
Description
Plot pairwise differences by a common method.
Usage
Diffogram(Formula, Data, Term, conf.level=0.95, adj="lsd", ...)
Arguments
Formula |
a conventional formula for a linear model |
Data |
a |
Term |
a factor name to be estimated |
conf.level |
confidence level of the confidence interval |
adj |
"lsd", "tukey", "scheffe", "bon", or "duncan" to adjust p-value and confidence limit |
... |
arguments to be passed to |
Details
This usually shows the shortest interval. It corresponds to the PDIFF option of SAS PROC GLM. For the adjustment method "dunnett", see the PDIFF function.
Value
no return value, but a plot on the current device
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Examples
Diffogram(uptake ~ Type*Treatment + as.factor(conc), CO2, "as.factor(conc)")
Example Datasets
Description
Contains data frames to be used for textbooks.
Details
This contains datasets for textbooks.
Drift defined by Lan and DeMets for Group Sequential Design
Description
Calculate the drift value with given upper bounds (z-values), times of test, and power.
Usage
Drift(bi, ti=NULL, Power=0.9)
Arguments
bi |
upper bound z-values |
ti |
times of test. These should be in the range of [0, 1]. If omitted, equal intervals are assumed. |
Power |
target power at the final test |
Details
It calculates the drift value with given upper bound z-values, times of test, and power. If the times of test are not given, equal intervals are assumed. mvtnorm::pmvt (with noncentrality) is better than pmvnorm in calculating power and sample size. However, Lan-DeMets used the multivariate normal rather than the multivariate noncentral t distribution. This function follows Lan-DeMets for consistency with previous results.
Value
Drift value for the given condition
Author(s)
Kyun-Seop Bae k@acr.kr
References
Reboussin DM, DeMets DL, Kim K, Lan KKG. Computations for group sequential boundaries using the Lan-DeMets function method. Controlled Clinical Trials. 2000;21:190-207.
Examples
Drift(seqBound(ti=(1:5)/5)[, "up.bound"])
Expected Mean Square Formula
Description
Calculates a formula table for the expected mean square of the given contrast. The default is for Type III SS.
Usage
EMS(Formula, Data, Type=3, eps=1e-8)
Arguments
Formula |
a conventional formula for a linear model |
Data |
a |
Type |
type of sum of squares. The default is 3. Type 4 is not supported yet. |
eps |
A value less than this is considered zero. |
Details
This is necessary for further hypothesis tests of nesting factors.
Value
A coefficient matrix for Type III expected mean square
Author(s)
Kyun-Seop Bae k@acr.kr
Examples
f1 = log(CMAX) ~ SEQ/SUBJ + PRD + TRT
EMS(f1, BEdata)
EMS(f1, BEdata, Type=1)
EMS(f1, BEdata, Type=2)
Estimate Linear Function
Description
Estimates Linear Function with a formula and a dataset.
Usage
ESTM(L, Formula, Data, conf.level=0.95)
Arguments
L |
a matrix of linear function rows to be tested |
Formula |
a conventional formula for a linear model |
Data |
a |
conf.level |
confidence level of the confidence limit |
Details
It tests rows of linear functions. A linear function means a linear combination of estimated coefficients. It is similar to the ESTIMATE statement of SAS PROC GLM. This is a convenient version of the est function.
Value
Estimate |
point estimate of the input linear contrast |
Lower CL |
lower confidence limit |
Upper CL |
upper confidence limit |
Std. Error |
standard error of the point estimate |
t value |
value for the t distribution |
Df |
degrees of freedom |
Pr(>|t|) |
probability of a larger absolute t value from the t distribution with the residual degrees of freedom |
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Examples
ESTM(t(c(0, -1, 1)), uptake ~ Type, CO2) # Quevec - Mississippi
Exit Probability with Cumulative Z-test in Group Sequential Design
Description
Exit probabilities with the given drift, upper bounds, and times of test.
Usage
ExitP(Theta, bi, ti=NULL)
Arguments
Theta |
drift value defined by Lan-DeMets. See the reference. |
bi |
upper bound z-values |
ti |
times of test. These should be in the range of [0, 1]. If omitted, even intervals are assumed. |
Details
It calculates exit probabilities and cumulative exit probabilities with the given drift, upper z-bounds, and times of test. If the times of test are not given, even intervals are assumed. mvtnorm::pmvt (with noncentrality) is better than pmvnorm in calculating power and sample size. However, Lan-DeMets used the multivariate normal rather than the multivariate noncentral t distribution. This function follows Lan-DeMets for consistency with previous results.
Value
The result is a matrix.
ti |
time of test |
bi |
upper z-bound |
cum.alpha |
cumulative alpha-value |
Author(s)
Kyun-Seop Bae k@acr.kr
References
Reboussin DM, DeMets DL, Kim K, Lan KKG. Computations for group sequential boundaries using the Lan-DeMets function method. Controlled Clinical Trials. 2000;21:190-207.
Examples
b0 = seqBound(ti=(1:5)/5)[, "up.bound"]
ExitP(Theta = Drift(b0), bi = b0)
Generalized inverse matrix of type 2 for linear regression
Description
A generalized inverse is usually not unique. Some programs use this algorithm to get a unique generalized inverse matrix.
Usage
G2SWEEP(A, Augmented=FALSE, eps=1e-08)
Arguments
A |
a matrix to be inverted. If |
Augmented |
If this is |
eps |
A value less than this is considered zero. |
Details
The generalized inverse of g2-type is used by some software to do linear regression. See 'SAS Technical Report R106, The Sweep Operator: Its Importance in Statistical Computing' by J. H. Goodnight for details.
Value
when Augmented=FALSE |
ordinary g2 inverse |
when Augmented=TRUE |
g2 inverse and beta hats in the last column and the last row, and the sum of squares error (SSE) in the last cell |
attribute "rank" |
the rank of the input matrix |
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Examples
f1 = uptake ~ Type + Treatment # formula
x = ModelMatrix(f1, CO2) # Model matrix and relevant information
y = model.frame(f1, CO2)[, 1] # observation vector
nc = ncol(x$X) # number of columns of model matrix
XpY = crossprod(x$X, y)
aXpX = rbind(cbind(crossprod(x$X), XpY), cbind(t(XpY), crossprod(y)))
ag2 = G2SWEEP(aXpX, Augmented=TRUE)
b = ag2[1:nc, (nc + 1)] ; b # Beta hat
iXpX = ag2[1:nc, 1:nc] ; iXpX # g2 inverse of X'X
SSE = ag2[(nc + 1), (nc + 1)] ; SSE # Sum of Square Error
DFr = nrow(x$X) - attr(ag2, "rank") ; DFr # Degree of freedom for the residual
# Compare the below with the above
REG(f1, CO2)
aov1(f1, CO2)
General Linear Model similar to SAS PROC GLM
Description
GLM is the main function of this package.
Usage
GLM(Formula, Data, BETA=FALSE, EMEAN=FALSE, Resid=FALSE, conf.level=0.95,
Weights=1)
Arguments
Formula |
a conventional formula for a linear model. |
Data |
a |
BETA |
if |
EMEAN |
if |
Resid |
if |
conf.level |
confidence level for the confidence limit of the least square mean |
Weights |
weights for the weighted least squares. This should be a scalar or a vector of the same length as the number of rows of |
Details
It performs the core function of SAS PROC GLM. Least square means for the interaction term of three variables are not supported yet.
Value
The result is comparable to that of SAS PROC GLM.
ANOVA |
ANOVA table for the model |
Fitness |
Some measures of goodness of fit such as R-square and CV |
Type I |
Type I sum of squares table |
Type II |
Type II sum of squares table |
Type III |
Type III sum of squares table |
Parameter |
Parameter table with standard error, t value, p value. |
Expected Mean |
Least square (or expected) mean table with confidence limits. This is returned only with the EMEAN=TRUE option. |
Fitted |
Fitted values or y hat in the original scale, as SAS OUTPUT P= does, even with Weights. This is returned only with the Resid=TRUE option. |
Residual |
Residuals in the original scale, as SAS OUTPUT R= does, even with Weights. This is returned only with the Resid=TRUE option. |
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
REG, aov1, aov2, aov3, LSM, PDIFF
Examples
GLM(uptake ~ Type*Treatment + conc, CO2[-1,]) # Making data unbalanced
GLM(uptake ~ Type*Treatment + conc, CO2[-1,], BETA=TRUE)
GLM(uptake ~ Type*Treatment + conc, CO2[-1,], EMEAN=TRUE)
GLM(uptake ~ Type*Treatment + conc, CO2[-1,], Resid=TRUE)
GLM(uptake ~ Type*Treatment + conc, CO2[-1,], BETA=TRUE, EMEAN=TRUE)
GLM(uptake ~ Type*Treatment + conc, CO2[-1,], BETA=TRUE, EMEAN=TRUE, Resid=TRUE)
Kurtosis
Description
Kurtosis with a conventional formula.
Usage
Kurtosis(y)
Arguments
y |
a vector of numerics |
Details
It removes NA in the input vector.
Value
Estimate of kurtosis
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Standard Error of Kurtosis
Description
Standard error of the estimated kurtosis with a conventional formula.
Usage
KurtosisSE(y)
Arguments
y |
a vector of numerics |
Details
It removes NA in the input vector.
Value
Standard error of the estimated kurtosis
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Lower Confidence Limit
Description
The estimate of the lower bound of the confidence limit using the t-distribution
Usage
LCL(y, conf.level=0.95)
Arguments
y |
a vector of numerics |
conf.level |
confidence level |
Details
It removes NA in the input vector.
Value
The estimate of the lower bound of the confidence limit using the t-distribution
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Least Squares Means
Description
Estimates least squares means using the g2 inverse.
Usage
LSM(Formula, Data, Term, conf.level=0.95, adj="lsd", hideNonEst=TRUE,
PLOT=FALSE, descend=FALSE, ...)
Arguments
Formula |
a conventional formula for a linear model |
Data |
a |
Term |
a term name to be returned. If there is only one independent variable, this can be omitted. |
conf.level |
confidence level for the confidence limit |
adj |
adjustment method for grouping; "lsd" (default), "tukey", "bon", "duncan", and "scheffe" are available. This does not affect the SE, Lower CL, and Upper CL of the output table. |
hideNonEst |
logical. whether to hide non-estimable values |
PLOT |
logical. whether to plot LSMs and their confidence intervals |
descend |
logical. This specifies whether the plotting order is ascending or descending. |
... |
arguments to be passed to |
Details
It is equivalent to the LSMEANS statement of SAS PROC GLM. The result of the second example below may be different from emmeans. This is because SAS and this function calculate the mean of the transformed continuous variable. However, emmeans calculates the average before the transformation. An interaction of three variables is not supported yet. For the "dunnett" adjustment method, see the PDIFF function.
Value
Returns a table of expectations, t values and p-values.
Group |
group character. This appears when the model is one-way ANOVA or when the |
LSmean |
point estimate of the least squares mean |
LowerCL |
lower confidence limit at the given confidence level by the "lsd" method |
UpperCL |
upper confidence limit at the given confidence level by the "lsd" method |
SE |
standard error of the point estimate |
Df |
degrees of freedom of the point estimate |
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Examples
LSM(uptake ~ Type, CO2[-1,])
LSM(uptake ~ Type - 1, CO2[-1,])
LSM(uptake ~ Type*Treatment + conc, CO2[-1,])
LSM(uptake ~ Type*Treatment + conc - 1, CO2[-1,])
LSM(log(uptake) ~ Type*Treatment + log(conc), CO2[-1,])
LSM(log(uptake) ~ Type*Treatment + log(conc) - 1, CO2[-1,])
LSM(log(uptake) ~ Type*Treatment + as.factor(conc), CO2[-1,])
LSM(log(uptake) ~ Type*Treatment + as.factor(conc) - 1, CO2[-1,])
LSM(log(CMAX) ~ SEQ/SUBJ + PRD + TRT, BEdata)
LSM(log(CMAX) ~ SEQ/SUBJ + PRD + TRT - 1, BEdata)
Max without NA
Description
Maximum without NA values.
Usage
Max(y)
Arguments
y |
a vector of numerics |
Details
It removes NA values from the input vector.
Value
maximum value
Author(s)
Kyun-Seop Bae k@acr.kr
Mean without NA
Description
Mean without NA values.
Usage
Mean(y)
Arguments
y |
a vector of numerics |
Details
It removes NA values from the input vector.
Value
mean value
Author(s)
Kyun-Seop Bae k@acr.kr
Median without NA
Description
Median without NA values.
Usage
Median(y)
Arguments
y |
a vector of numerics |
Details
It removes NA values from the input vector.
Value
median value
Author(s)
Kyun-Seop Bae k@acr.kr
Min without NA
Description
Minimum without NA values.
Usage
Min(y)
Arguments
y |
a vector of numerics |
Details
It removes NA values from the input vector.
Value
minimum value
Author(s)
Kyun-Seop Bae k@acr.kr
Model Matrix
Description
This model matrix is similar to model.matrix, but it does not omit unnecessary columns.
Usage
ModelMatrix(Formula, Data, KeepOrder=FALSE, XpX=FALSE)
Arguments
Formula |
a conventional formula for a linear model |
Data |
a |
KeepOrder |
If |
XpX |
If |
Details
It makes the model (design) matrix for GLM.
Value
Model matrix and attributes similar to the output of model.matrix.
X |
design matrix, i.e. model matrix |
XpX |
cross-product of the design matrix, X'X |
terms |
detailed information about terms such as formula and labels |
termsIndices |
term indices |
assign |
assignment of columns for each term in order, a different way of expressing term indices |
Author(s)
Kyun-Seop Bae k@acr.kr
Number of observations
Description
Number of observations excluding NA values.
Usage
N(y)
Arguments
y |
a vector of numerics |
Details
It removes NA values from the input vector.
Value
Count of the observations
Author(s)
Kyun-Seop Bae k@acr.kr
O'Brien-Fleming bounds for cumulative Z-test in Group Sequential Design
Description
Sequential O'Brien-Fleming upper bounds for the cumulative Z-test on accumulating data. Z values are correlated. This is usually used for group sequential design.
Usage
OBFBound(K, alpha=0.05, side=2, ti=NULL, c0=NULL)
Arguments
K |
count of tests, including the final one |
alpha |
goal alpha value for the last test at time 0. |
side |
1=one-sided test, 2=two-sided test |
ti |
times for test. These should be in [0, 1]. If not specified, equal intervals are assumed. |
c0 |
correlation matrix. If not specified, Brownian motion is assumed. |
Details
It calculates O'Brien-Fleming upper z-bounds and cumulative alpha-values for the repeated test in group sequential design.
Value
The result is a matrix.
ti |
time of test |
z |
O'Brien-Fleming upper z-bound |
cum.alpha |
cumulative alpha-value |
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Examples
OBFBound(K=2)
OBFBound(K=3)
OBFBound(K=4)
OBFBound(K=5)
Odds Ratio of two groups
Description
Odds ratio between two groups
Usage
OR(y1, n1, y2, n2, conf.level=0.95)
Arguments
y1 |
positive event count of the test (the first) group |
n1 |
total count of the test (the first) group |
y2 |
positive event count of the control (the second) group |
n2 |
total count of the control (the second) group |
conf.level |
confidence level |
Details
It calculates the odds ratio of two groups. No continuity correction is done here. If you need the percent scale, multiply the output by 100.
Value
The result is a data.frame.
odd1 |
odds from the first group, y1/(n1 - y1) |
odd2 |
odds from the second group, y2/(n2 - y2) |
OR |
odds ratio, odd1/odd2 |
SElog |
standard error of log(OR) |
lower |
lower confidence limit of OR |
upper |
upper confidence limit of OR |
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
RD, RR, RDmn1, RRmn1, ORmn1, RDmn, RRmn, ORmn
Examples
OR(104, 11037, 189, 11034) # no continuity correction
Odds Ratio of two groups with strata by the CMH method
Description
Odds ratio and its confidence interval of two groups with stratification by the Cochran-Mantel-Haenszel method
Usage
ORcmh(d0, conf.level=0.95)
Arguments
d0 |
A |
conf.level |
confidence level |
Details
It calculates the odds ratio and its confidence interval of two groups. This can be used for meta-analysis also.
Value
The following output will be returned for each stratum and the common value.
odd1 |
odds from the first group, y1/(n1 - y1) |
odd2 |
odds from the second group, y2/(n2 - y2) |
OR |
odds ratio, odd1/odd2. The point estimate of the common OR is calculated with the MH weights. |
SElog |
standard error of log(OR) |
lower |
lower confidence limit of OR |
upper |
upper confidence limit of OR |
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
RDmn1, RRmn1, ORmn1, RDmn, RRmn, ORmn, RDinv, RRinv, ORinv
Examples
d1 = matrix(c(25, 339, 28, 335, 23, 370, 40, 364), nrow=2, byrow=TRUE)
colnames(d1) = c("y1", "n1", "y2", "n2")
ORcmh(d1)
Odds Ratio of two groups with strata by the inverse variance method
Description
Odds ratio and its confidence interval of two groups with stratification by the inverse variance method
Usage
ORinv(d0, conf.level=0.95)
Arguments
d0 |
A |
conf.level |
confidence level |
Details
It calculates the odds ratio and its confidence interval of two groups by the inverse variance method. This supports stratification. This can be used for meta-analysis also.
Value
The following output will be returned for each stratum and the common value.
odd1 |
odds from the first group, y1/(n1 - y1) |
odd2 |
odds from the second group, y2/(n2 - y2) |
OR |
odds ratio, odd1/odd2. The point estimate of the common OR is calculated with the inverse variance weights. |
SElog |
standard error of log(OR) |
lower |
lower confidence limit of OR |
upper |
upper confidence limit of OR |
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
RDmn1, RRmn1, ORmn1, RDmn, RRmn, ORmn, RDinv, RRinv, ORcmh
Examples
d1 = matrix(c(25, 339, 28, 335, 23, 370, 40, 364), nrow=2, byrow=TRUE)
colnames(d1) = c("y1", "n1", "y2", "n2")
ORinv(d1)
Odds Ratio and Score CI of two groups with strata by the MN method
Description
Odds ratio and its score confidence interval of two groups with stratification by the Miettinen and Nurminen method
Usage
ORmn(d0, conf.level=0.95, eps=1e-8)
Arguments
d0 |
A |
conf.level |
confidence level |
eps |
absolute value less than eps is regarded as negligible |
Details
It calculates the common odds ratio and its score confidence interval of two groups with stratification. The confidence interval is asymmetric, and there is no standard error in the output. For the stratified case, the inverse variance weighted score statistic with the bias correction is used, following Laud. The common odds ratio point estimate is the zero of the weighted score statistic, and the confidence bounds are found with the uniroot function. The result agrees with ratesci::scoreci(contrast="OR", stratified=TRUE, skew=FALSE) to at least 7 significant digits. For a single stratum, it returns the classical Miettinen-Nurminen interval of ORmn1. This can be used for meta-analysis also.
Value
The following output will be returned for each stratum and common value. There is no standard error.
odd1 |
odds from the first group, y1/(n1 - y1). For the common value, it is calculated with the weights at the point estimate. |
odd2 |
odds from the second group, y2/(n2 - y2). For the common value, it is calculated with the weights at the point estimate. |
OR |
odds ratio of the stratum. The common OR is the zero of the inverse variance weighted score statistic. |
lower |
lower confidence limit of OR |
upper |
upper confidence limit of OR |
Author(s)
Kyun-Seop Bae k@acr.kr
References
Miettinen O, Nurminen M. Comparative analysis of two rates. Stat Med 1985;4:213-26
Laud PJ. Equal-tailed confidence intervals for comparison of rates. Pharmaceutical Statistics 2017;16:334-348
See Also
RDmn1, RRmn1, ORmn1, RDmn, RRmn, RDinv, RRinv, ORinv, ORcmh
Examples
d1 = matrix(c(25, 339, 28, 335, 23, 370, 40, 364), nrow=2, byrow=TRUE)
colnames(d1) = c("y1", "n1", "y2", "n2")
ORmn(d1)
d2 = data.frame(y1=c(4, 2, 10), n1=c(20, 20, 20), y2=c(8, 11, 2), n2=c(20, 20, 20))
ORmn(d2)
Odds Ratio and Score CI of two groups without strata by the MN method
Description
Odds ratio and its score confidence interval of two groups without stratification
Usage
ORmn1(y1, n1, y2, n2, conf.level=0.95, eps=1e-8)
Arguments
y1 |
positive event count of the test (the first) group |
n1 |
total count of the test (the first) group |
y2 |
positive event count of the control (the second) group |
n2 |
total count of the control (the second) group |
conf.level |
confidence level |
eps |
absolute value less than eps is regarded as negligible |
Details
It calculates the odds ratio and its score confidence interval of the two groups. The confidence interval is asymmetric, and there is no standard error in the output. This does not support stratification. This implementation uses the uniroot function, which usually gives at least 5 significant digits. In contrast, the PropCIs::orscoreci function uses an incremental or decremental search by a factor of 1.001, which gives less than 3 significant digits.
Value
There is no standard error.
odd1 |
odds from the first group, y1/(n1 - y1) |
odd2 |
odds from the second group, y2/(n2 - y2) |
OR |
odds ratio, odd1/odd2 |
lower |
lower confidence limit of OR |
upper |
upper confidence limit of OR |
Author(s)
Kyun-Seop Bae k@acr.kr
References
Miettinen O, Nurminen M. Comparative analysis of two rates. Stat Med 1985;4:213-26
See Also
RDmn1, RRmn1, RDmn, RRmn, ORmn
Examples
ORmn1(104, 11037, 189, 11034)
Pairwise Difference
Description
Estimates pairwise differences by a common method.
Usage
PDIFF(Formula, Data, Term, conf.level=0.95, adj="lsd", ref, PLOT=FALSE,
reverse=FALSE, ...)
Arguments
Formula |
a conventional formula for a linear model |
Data |
a |
Term |
a factor name to be estimated |
conf.level |
confidence level of the confidence interval |
adj |
"lsd", "tukey", "scheffe", "bon", "duncan", or "dunnett" to adjust the p-value and confidence limit |
ref |
reference or control level for the Dunnett test. If missing, the first level is used, as SAS does. |
PLOT |
whether to plot the diffogram |
reverse |
reverse A - B to B - A |
... |
arguments to be passed to |
Details
It corresponds to the PDIFF option of SAS PROC GLM.
Value
Returns a table of expectations, t values and p-values. Output columns may vary according to the adjustment option.
Estimate |
point estimate of the input linear contrast |
Lower CL |
lower confidence limit |
Upper CL |
upper confidence limit |
Std. Error |
standard error of the point estimate |
t value |
value for the t distribution |
Df |
degrees of freedom |
Pr(>|t|) |
probability of a larger absolute t value from the t distribution with the residual's degrees of freedom |
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Examples
PDIFF(uptake ~ Type*Treatment + as.factor(conc), CO2, "as.factor(conc)")
PDIFF(uptake ~ Type*Treatment + as.factor(conc), CO2, "as.factor(conc)", adj="tukey")
Partial Correlation test of multiple columns
Description
Testing partial correlation between many columns of data with the Pearson method.
Usage
Pcor.test(Data, x, y)
Arguments
Data |
a numeric matrix or |
x |
names of columns to be tested |
y |
names of control columns |
Details
It performs multiple partial correlation tests. It uses "complete.obs" rows of the x and y columns.
Value
Row names show which columns are used for the test.
Estimate |
point estimate of correlation |
Df |
degrees of freedom |
t value |
t value of the t distribution |
Pr(>|t|) |
probability with the t distribution |
Author(s)
Kyun-Seop Bae k@acr.kr
Examples
Pcor.test(mtcars, c("mpg", "hp", "qsec"), c("drat", "wt"))
Pocock (fixed) Bound for the cumulative Z-test with a final target alpha-value
Description
Cumulative alpha values for the cumulative hypothesis test with a fixed upper bound z-value in group sequential design.
Usage
PocockBound(K=2, alpha=0.05, side=2)
Arguments
K |
total number of tests |
alpha |
alpha value at the final test |
side |
1=one-sided test, 2=two-sided test |
Details
Pocock suggested a fixed upper bound z-value for the cumulative hypothesis test in group sequential designs.
Value
a fixed upper bound z-value for the K times repeated hypothesis test with a final alpha-value. Attributes are:
ti |
time of test. Equal intervals are assumed. |
cum.alpha |
cumulative alpha value |
Author(s)
Kyun-Seop Bae k@acr.kr
References
Reboussin DM, DeMets DL, Kim K, Lan KKG. Computations for group sequential boundaries using the Lan-DeMets function method. Controlled Clinical Trials. 2000;21:190-207.
See Also
Examples
PocockBound(K=2) # Z-value of upper bound for the two-stage design
PocockBound(K=3) # Z-value of upper bound for the two-stage design
PocockBound(K=4) # Z-value of upper bound for the two-stage design
PocockBound(K=5) # Z-value of upper bound for the two-stage design
Interquartile Range
Description
Interquartile range (Q3 - Q1) with a conventional formula.
Usage
QuartileRange(y, Type=2)
Arguments
y |
a vector of numerics |
Type |
a type specifier to be passed to the |
Details
It removes NA in the input vector. Type 2 is the SAS default, while Type 6 is the SPSS default.
Value
The value of the interquartile range
Author(s)
Kyun-Seop Bae k@acr.kr
Risk Difference between two groups
Description
Risk (proportion) difference between two groups
Usage
RD(y1, n1, y2, n2, conf.level=0.95)
Arguments
y1 |
positive event count of the test (the first) group |
n1 |
total count of the test (the first) group |
y2 |
positive event count of the control (the second) group |
n2 |
total count of the control (the second) group |
conf.level |
confidence level |
Details
It calculates the risk difference between the two groups. No continuity correction here. If you need the percent scale, multiply the output by 100.
Value
The result is a data.frame.
p1 |
proportion from the first group |
p2 |
proportion from the second group |
RD |
risk difference, p1 - p2 |
SE |
standard error of RD |
lower |
lower confidence limit of RD |
upper |
upper confidence limit of RD |
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
RR, OR, RDmn1, RRmn1, ORmn1, RDmn, RRmn, ORmn
Examples
RD(104, 11037, 189, 11034) # no continuity correction
Risk Difference between two groups with strata by inverse variance method
Description
Risk difference and its confidence interval between two groups with stratification by the inverse variance method
Usage
RDinv(d0, conf.level=0.95)
Arguments
d0 |
A |
conf.level |
confidence level |
Details
It calculates the risk difference and its confidence interval between two groups by the inverse variance method. The common risk difference is given by both the fixed effect model and the DerSimonian-Laird random effects model, with Cochran's Q test for heterogeneity. If you need the percent scale, multiply the output by 100. This can be used for meta-analysis also.
Value
RDs |
risk difference and its confidence interval of each stratum |
Heterogeneity |
Cochran's Q statistic for heterogeneity across the strata and its p-value |
tau2 |
between-strata variance estimated by the method of moments |
Fixed |
common risk difference, its standard error, and its confidence interval by the fixed effect model with the inverse variance weights |
Random |
common risk difference, its standard error, and its confidence interval by the DerSimonian-Laird random effects model |
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
RDmn1, RRmn1, ORmn1, RDmn, RRmn, ORmn, RRinv, ORinv, ORcmh
Examples
d1 = matrix(c(25, 339, 28, 335, 23, 370, 40, 364), nrow=2, byrow=TRUE)
colnames(d1) = c("y1", "n1", "y2", "n2")
RDinv(d1)
Risk Difference and Score CI between two groups with strata by the MN method
Description
Risk difference and its score confidence interval between two groups with stratification by the Miettinen and Nurminen method
Usage
RDmn(d0, conf.level=0.95, eps=1e-8)
Arguments
d0 |
A |
conf.level |
confidence level |
eps |
an absolute value less than eps is regarded as negligible |
Details
It calculates the risk difference and its score confidence interval between the two groups. The confidence interval is asymmetric, and there is no standard error in the output. If you need the percent scale, multiply the output by 100. This supports stratification. This implementation uses the uniroot function, which usually gives at least 5 significant digits. This can be used for meta-analysis also.
Value
The following output will be returned for each stratum and the common value. There is no standard error.
p1 |
proportion from the first group, y1/n1 |
p2 |
proportion from the second group, y2/n2 |
RD |
risk difference, p1 - p2. The point estimate of the common RD is calculated with the MN weight. |
lower |
lower confidence limit of RD |
upper |
upper confidence limit of RD |
Author(s)
Kyun-Seop Bae k@acr.kr
References
Miettinen O, Nurminen M. Comparative analysis of two rates. Stat Med 1985;4:213-26
See Also
RDmn1, RRmn1, ORmn1, RRmn, ORmn, RDinv, RRinv, ORinv, ORcmh
Examples
d1 = matrix(c(25, 339, 28, 335, 23, 370, 40, 364), nrow=2, byrow=TRUE)
colnames(d1) = c("y1", "n1", "y2", "n2")
RDmn(d1)
d2 = data.frame(y1=c(4, 2, 10), n1=c(20, 20, 20), y2=c(8, 11, 2), n2=c(20, 20, 20))
RDmn(d2)
Risk Difference and Score CI between two groups without strata by the MN method
Description
Risk difference and its score confidence interval between two groups without stratification
Usage
RDmn1(y1, n1, y2, n2, conf.level=0.95, eps=1e-8)
Arguments
y1 |
positive event count of the test (the first) group |
n1 |
total count of the test (the first) group. The maximum allowable value is 1e8. |
y2 |
positive event count of the control (the second) group |
n2 |
total count of the control (the second) group. The maximum allowable value is 1e8. |
conf.level |
confidence level |
eps |
an absolute value less than eps is regarded as negligible |
Details
It calculates the risk difference and its score confidence interval between the two groups. The confidence interval is asymmetric, and there is no standard error in the output. If you need the percent scale, multiply the output by 100. This does not support stratification. This implementation uses the uniroot function, which usually gives at least 5 significant digits.
Value
There is no standard error.
p1 |
proportion from the first group, y1/n1 |
p2 |
proportion from the second group, y2/n2 |
RD |
risk difference, p1 - p2 |
lower |
lower confidence limit of RD |
upper |
upper confidence limit of RD |
Author(s)
Kyun-Seop Bae k@acr.kr
References
Miettinen O, Nurminen M. Comparative analysis of two rates. Stat Med 1985;4:213-26
See Also
RRmn1, ORmn1, RDmn, RRmn, ORmn
Examples
RDmn1(104, 11037, 189, 11034)
Regression of Linear Least Square, similar to SAS PROC REG
Description
REG is similar to SAS PROC REG.
Usage
REG(Formula, Data, conf.level=0.95, HC=FALSE, Resid=FALSE, Weights=1,
summarize=TRUE)
Arguments
Formula |
a conventional formula for a linear model |
Data |
a |
conf.level |
confidence level for the confidence limit |
HC |
heteroscedasticity-related output is required, such as HC0, HC3, and White's first and second moment specification test |
Resid |
if |
Weights |
weights for each observation, usually the inverse of each variance. This should be a scalar or a vector of the same length as the number of rows of |
summarize |
If this is |
Details
It performs the core function of SAS PROC REG.
Value
The result is comparable to that of SAS PROC REG.
The first part is the ANOVA table.
The second part is measures of fitness.
The third part is the estimates of coefficients.
Estimate |
point estimate of parameters, coefficients |
Estimable |
estimability: 1=TRUE, 0=FALSE. This appears only when at least one inestimability occurs. |
Std. Error |
standard error of the point estimate |
Lower CL |
lower confidence limit with conf.level |
Upper CL |
upper confidence limit with conf.level |
Df |
degrees of freedom |
t value |
value for the t distribution |
Pr(>|t|) |
probability of a larger absolute t value from the t distribution with the residual's degrees of freedom |
The above result is repeated using HC0 and HC3, followed by White's first and second moment specification test, if the HC option is specified. The t values and their p values with HC1 and HC2 are between those of HC0 and HC3.
Fitted |
Fitted value or y hat in the original scale as SAS OUTPUT P= does, even with Weights. This is returned only with the Resid=TRUE option. |
Residual |
Residuals in the original scale as SAS OUTPUT R= does, even with Weights. This is returned only with the Resid=TRUE option. |
If summarize=FALSE, REG returns;
coefficients |
beta coefficients |
g2 |
g2 inverse |
rank |
rank of the model matrix |
DFr |
degrees of freedom for the residual |
SSE |
sum of squared errors |
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Examples
REG(uptake ~ Plant + Type + Treatment + conc, CO2)
REG(uptake ~ conc, CO2, HC=TRUE)
REG(uptake ~ conc, CO2, Resid=TRUE)
REG(uptake ~ conc, CO2, HC=TRUE, Resid=TRUE)
REG(uptake ~ conc, CO2, summarize=FALSE)
Relative Risk of the two groups
Description
Relative Risk between the two groups
Usage
RR(y1, n1, y2, n2, conf.level=0.95)
Arguments
y1 |
positive event count of the test (the first) group |
n1 |
total count of the test (the first) group |
y2 |
positive event count of the control (the second) group |
n2 |
total count of the control (the second) group |
conf.level |
confidence level |
Details
It calculates the relative risk of the two groups. No continuity correction here. If you need the percent scale, multiply the output by 100.
Value
The result is a data.frame.
p1 |
proportion from the first group |
p2 |
proportion from the second group |
RR |
relative risk, p1/p2 |
SElog |
standard error of log(RR) |
lower |
lower confidence limit of RR |
upper |
upper confidence limit of RR |
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
RD, OR, RDmn1, RRmn1, ORmn1, RDmn, RRmn, ORmn
Examples
RR(104, 11037, 189, 11034) # no continuity correction
Relative Risk of two groups with strata by inverse variance method
Description
Relative risk and its confidence interval of two groups with stratification by the inverse variance method
Usage
RRinv(d0, conf.level=0.95)
Arguments
d0 |
A |
conf.level |
confidence level |
Details
It calculates the relative risk and its confidence interval of two groups by the inverse variance method on the log scale. The common relative risk is given by both the fixed effect model and the DerSimonian-Laird random effects model, with Cochran's Q test for heterogeneity. This can be used for meta-analysis also.
Value
RRs |
relative risk, its confidence interval, and the percent weights (pwi for the fixed effect model, pwsi for the random effects model) of each stratum |
Heterogeneity |
Cochran's Q statistic for heterogeneity across the strata and its p-value |
tau2 |
between-strata variance estimated by the method of moments |
Fixed |
common relative risk and its confidence interval by the fixed effect model with the inverse variance weights |
Random |
common relative risk and its confidence interval by the DerSimonian-Laird random effects model |
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
RDmn1, RRmn1, ORmn1, RDmn, RRmn, ORmn, RDinv, ORinv, ORcmh
Examples
d1 = matrix(c(25, 339, 28, 335, 23, 370, 40, 364), nrow=2, byrow=TRUE)
colnames(d1) = c("y1", "n1", "y2", "n2")
RRinv(d1)
Relative Risk and Score CI of two groups with strata by the MN method
Description
Relative risk and its score confidence interval of two groups with stratification by the Miettinen and Nurminen method
Usage
RRmn(d0, conf.level=0.95, eps=1e-8)
Arguments
d0 |
A |
conf.level |
confidence level |
eps |
an absolute value less than eps is regarded as negligible |
Details
It calculates the relative risk and its score confidence interval of the two groups. The confidence interval is asymmetric, and there is no standard error in the output. This supports stratification. This implementation uses the uniroot function, which usually gives at least 5 significant digits, whereas the PropCIs::riskscoreci function uses a cubic equation approximation which gives only about 2 significant digits. This can be used for meta-analysis also.
Value
The following output will be returned for each stratum and the common value. There is no standard error.
p1 |
proportion from the first group, y1/n1 |
p2 |
proportion from the second group, y2/n2 |
RR |
relative risk, p1/p2. The point estimate of the common RR is calculated with the MN weight. |
lower |
lower confidence limit of RR |
upper |
upper confidence limit of RR |
Author(s)
Kyun-Seop Bae k@acr.kr
References
Miettinen O, Nurminen M. Comparative analysis of two rates. Stat Med 1985;4:213-26
See Also
RDmn1, RRmn1, ORmn1, RDmn, ORmn, RDinv, RRinv, ORinv, ORcmh
Examples
d1 = matrix(c(25, 339, 28, 335, 23, 370, 40, 364), nrow=2, byrow=TRUE)
colnames(d1) = c("y1", "n1", "y2", "n2")
RRmn(d1)
d2 = data.frame(y1=c(4, 2, 10), n1=c(20, 20, 20), y2=c(8, 11, 2), n2=c(20, 20, 20))
RRmn(d2)
Relative Risk and Score CI of two groups without strata by the MN method
Description
Relative risk and its score confidence interval of the two groups without stratification
Usage
RRmn1(y1, n1, y2, n2, conf.level=0.95, eps=1e-8)
Arguments
y1 |
positive event count of the test (the first) group |
n1 |
total count of the test (the first) group |
y2 |
positive event count of the control (the second) group |
n2 |
total count of the control (the second) group |
conf.level |
confidence level |
eps |
an absolute value less than eps is regarded as negligible |
Details
It calculates the relative risk and its score confidence interval of the two groups. The confidence interval is asymmetric, and there is no standard error in the output. This does not support stratification. This implementation uses the uniroot function, which usually gives at least 5 significant digits, whereas the PropCIs::riskscoreci function uses a cubic equation approximation which gives only about 2 significant digits.
Value
There is no standard error.
p1 |
proportion from the first group, y1/n1 |
p2 |
proportion from the second group, y2/n2 |
RR |
relative risk, p1/p2 |
lower |
lower confidence limit of RR |
upper |
upper confidence limit of RR |
Author(s)
Kyun-Seop Bae k@acr.kr
References
Miettinen O, Nurminen M. Comparative analysis of two rates. Stat Med 1985;4:213-26
See Also
RDmn1, ORmn1, RDmn, RRmn, ORmn
Examples
RRmn1(104, 11037, 189, 11034)
Test with Random Effects
Description
Hypothesis test with a specified type of SS using random effects as error terms. This corresponds to SAS PROC GLM's RANDOM /TEST statement.
Usage
RanTest(Formula, Data, Random="", Type=3, eps=1e-8)
Arguments
Formula |
a conventional formula for a linear model |
Data |
a |
Random |
a vector of random effects. All should be specified as primary terms, not as interaction terms. All interaction terms with a random factor are regarded as random effects. |
Type |
Sum of squares type to be used as contrast |
eps |
A value less than this is considered as zero. |
Details
Type can be from 1 to 3. All interaction terms with a random factor are regarded as random effects. Here the error term should not be the MSE.
Value
Returns ANOVA and E(MS) tables with the specified type of SS.
Author(s)
Kyun-Seop Bae k@acr.kr
Examples
RanTest(log(CMAX) ~ SEQ/SUBJ + PRD + TRT, BEdata, Random="SUBJ")
fBE = log(CMAX) ~ ADM/SEQ/SUBJ + PRD + TRT
RanTest(fBE, BEdata, Random=c("ADM", "SUBJ"))
RanTest(fBE, BEdata, Random=c("ADM", "SUBJ"), Type=2)
RanTest(fBE, BEdata, Random=c("ADM", "SUBJ"), Type=1)
Range
Description
The range, maximum - minimum, as a scalar value.
Usage
Range(y)
Arguments
y |
a vector of numerics |
Details
It removes NA in the input vector.
Value
A scalar value of the range
Author(s)
Kyun-Seop Bae k@acr.kr
Standard Deviation
Description
Standard deviation of a sample.
Usage
SD(y)
Arguments
y |
a vector of numerics |
Details
It removes NA in the input vector. The length of the vector should be larger than 1.
Value
Sample standard deviation
Author(s)
Kyun-Seop Bae k@acr.kr
Standard Error of the Sample Mean
Description
The estimate of the standard error of the sample mean
Usage
SEM(y)
Arguments
y |
a vector of numerics |
Details
It removes NA in the input vector.
Value
The estimate of the standard error of the sample mean
Author(s)
Kyun-Seop Bae k@acr.kr
F Test with Slice
Description
Performs an F test with a given slice term.
Usage
SLICE(Formula, Data, Term, By)
Arguments
Formula |
a conventional formula for a linear model |
Data |
a |
Term |
a factor name (not an interaction) to calculate the sum of squares and do an F test with least square means |
By |
a factor name to be used for slicing |
Details
It performs an F test with a given slice term. It is similar to the SLICE option of SAS PROC GLM.
Value
Returns the sum of squares and its F value and p-value. Row names are the levels of the slice term.
Df |
degrees of freedom |
Sum Sq |
sum of squares for the set of contrasts |
Mean Sq |
mean square |
F value |
F value for the F distribution |
Pr(>F) |
probability of a larger F value |
Author(s)
Kyun-Seop Bae k@acr.kr
Examples
SLICE(uptake ~ Type*Treatment, CO2, "Type", "Treatment")
SLICE(uptake ~ Type*Treatment, CO2, "Treatment", "Type")
Sum of Square
Description
Sum of squares with ANOVA.
Usage
SS(x, rx, L, eps=1e-8)
Arguments
x |
a result of |
rx |
a result of |
L |
linear hypothesis, a full matrix matching the information in |
eps |
Values less than this are considered as zero. |
Details
It calculates the sum of squares and completes the ANOVA table.
Value
ANOVA table |
a classical ANOVA table without the residual (Error) part. |
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Score Confidence Interval for a Proportion or a Binomial Distribution
Description
Score confidence interval of a proportion in one group
Usage
ScoreCI(y, n, conf.level=0.95)
Arguments
y |
positive event count of a group |
n |
total count of a group |
conf.level |
confidence level |
Details
It calculates the score confidence interval of a proportion in one group. The confidence interval is asymmetric, and there is no standard error in the output. If you need the percent scale, multiply the output by 100.
Value
The result is a data.frame. There is no standard error.
PE |
point estimate of the proportion |
Lower |
lower confidence limit of the proportion |
Upper |
upper confidence limit of the proportion |
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Examples
ScoreCI(104, 11037)
Skewness
Description
Skewness with a conventional formula.
Usage
Skewness(y)
Arguments
y |
a vector of numerics |
Details
It removes NA in the input vector.
Value
Estimate of skewness
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Standard Error of Skewness
Description
Standard error of the skewness with a conventional formula.
Usage
SkewnessSE(y)
Arguments
y |
a vector of numerics |
Details
It removes NA in the input vector.
Value
Standard error of the estimated skewness
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Type III Expected Mean Square Formula
Description
Calculates a formula table for the expected mean square of Type III SS.
Usage
T3MS(Formula, Data, L0, eps=1e-8)
Arguments
Formula |
a conventional formula for a linear model |
Data |
a |
L0 |
a matrix of row linear contrasts; if missing, |
eps |
Values less than this are considered as zero. |
Details
This is necessary for further hypothesis tests of nesting factors.
Value
A coefficient matrix for Type III expected mean square
Author(s)
Kyun-Seop Bae k@acr.kr
Examples
T3MS(log(CMAX) ~ SEQ/SUBJ + PRD + TRT, BEdata)
Test Type III SS using an error term other than MSE
Description
Hypothesis test of Type III SS using an error term other than MSE. This corresponds to SAS PROC GLM's RANDOM /TEST clause.
Usage
T3test(Formula, Data, H="", E="", eps=1e-8)
Arguments
Formula |
a conventional formula for a linear model |
Data |
a |
H |
Hypothesis term |
E |
Error term |
eps |
Values less than this are considered as zero. |
Details
It tests a factor of Type III SS using some other term as an error term. Here the error term should not be MSE.
Value
Returns one or more ANOVA table(s) of Type III SS.
Author(s)
Kyun-Seop Bae k@acr.kr
Examples
T3test(log(CMAX) ~ SEQ/SUBJ + PRD + TRT, BEdata, E=c("SEQ:SUBJ"))
T3test(log(CMAX) ~ SEQ/SUBJ + PRD + TRT, BEdata, H="SEQ", E=c("SEQ:SUBJ"))
Independent two groups t-test comparable to PROC TTEST
Description
This is comparable to SAS PROC TTEST.
Usage
TTEST(x, y, conf.level=0.95)
Arguments
x |
a vector of data from the first (test, active, experimental) group |
y |
a vector of data from the second (reference, control, placebo) group |
conf.level |
confidence level |
Details
Be cautious when choosing the row to use in the output.
Value
The output format is comparable to that of SAS PROC TTEST.
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Examples
TTEST(mtcars[mtcars$am==1, "mpg"], mtcars[mtcars$am==0, "mpg"])
Upper Confidence Limit
Description
The estimate of the upper bound of the confidence limit using the t-distribution
Usage
UCL(y, conf.level=0.95)
Arguments
y |
a vector of numerics |
conf.level |
confidence level |
Details
It removes NA in the input vector.
Value
The estimate of the upper bound of the confidence limit using the t-distribution
Author(s)
Kyun-Seop Bae k@acr.kr
Univariate Descriptive Statistics
Description
Returns descriptive statistics of a numeric vector.
Usage
UNIV(y, conf.level = 0.95)
Arguments
y |
a numeric vector |
conf.level |
confidence level for confidence limit |
Details
A convenient and comprehensive function for descriptive statistics. NA is removed during the calculation. This is similar to SAS PROC UNIVARIATE.
Value
nAll |
count of all elements in the input vector |
nNA |
count of |
nFinite |
count of finite numbers |
Mean |
mean excluding |
Variance |
variance excluding |
SD |
standard deviation excluding |
CV |
coefficient of variation in percent |
SEM |
standard error of the sample mean, the sample standard deviation divided by the square root of nFinite |
LowerCL |
lower confidence limit of mean |
UpperCL |
upper confidence limit of mean |
TrimmedMean |
trimmed mean with a trimming proportion of 1 - confidence level |
Min |
minimum value |
Q1 |
first quartile value with quantile type 2, the SAS default |
Median |
median value |
Q3 |
third quartile value with quantile type 2, the SAS default |
Max |
maximum value |
Range |
range of finite numbers, maximum - minimum |
IQR |
interquartile range with quantile type 2, the SAS default |
MAD |
median absolute deviation |
VarLL |
lower confidence limit of variance |
VarUL |
upper confidence limit of variance |
Skewness |
skewness |
SkewnessSE |
standard error of skewness |
Kurtosis |
kurtosis |
KurtosisSE |
standard error of kurtosis |
GeometricMean |
geometric mean, calculated only when all given values are positive. |
GeometricCV |
geometric coefficient of variation in percent, calculated only when all given values are positive. |
Author(s)
Kyun-Seop Bae k@acr.kr
Examples
UNIV(lh)
White's Model Specification Test
Description
This is shown in SAS PROC REG as the Test of First and Second Moment Specification.
Usage
WhiteTest(rx)
Arguments
rx |
a result of |
Details
This is also called White's general test for heteroskedasticity.
Value
Returns a direct test result by the more complex Theorem 2, not by the simpler Corollary 1.
Author(s)
Kyun-Seop Bae k@acr.kr
References
White H. A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity. Econometrica 1980;48(4):817-838.
Examples
WhiteTest(lm(mpg ~ disp, mtcars))
Convert some columns of a data.frame to factors
Description
Conveniently convert some columns of a data.frame into factors.
Usage
af(DataFrame, Cols)
Arguments
DataFrame |
a |
Cols |
column names or indices to be converted |
Details
It performs conversion of some columns in a data.frame into factors conveniently.
Value
Returns a data.frame with converted columns.
Author(s)
Kyun-Seop Bae k@acr.kr
ANOVA with Type I SS
Description
ANOVA with Type I SS.
Usage
aov1(Formula, Data, BETA=FALSE, Resid=FALSE)
Arguments
Formula |
a conventional formula for a linear model. |
Data |
a |
BETA |
if |
Resid |
if |
Details
It performs the core function of SAS PROC GLM, and returns Type I SS. This also accepts continuous independent variables.
Value
The result table is comparable to that of SAS PROC ANOVA.
Df |
degrees of freedom |
Sum Sq |
sum of squares for the set of contrasts |
Mean Sq |
mean square |
F value |
F value for the F distribution |
Pr(>F) |
probability of a larger F value |
The next returns are optional.
Parameter |
Parameter table with standard error, t value, p value. |
Fitted |
Fitted values or y hat. This is returned only with the |
Residual |
Weighted residuals. This is returned only with the |
Author(s)
Kyun-Seop Bae k@acr.kr
Examples
aov1(uptake ~ Plant + Type + Treatment + conc, CO2)
aov1(uptake ~ Plant + Type + Treatment + conc, CO2, BETA=TRUE)
aov1(uptake ~ Plant + Type + Treatment + conc, CO2, Resid=TRUE)
aov1(uptake ~ Plant + Type + Treatment + conc, CO2, BETA=TRUE, Resid=TRUE)
ANOVA with Type II SS
Description
ANOVA with Type II SS.
Usage
aov2(Formula, Data, BETA=FALSE, Resid=FALSE)
Arguments
Formula |
a conventional formula for a linear model. |
Data |
a |
BETA |
if |
Resid |
if |
Details
It performs the core function of SAS PROC GLM, and returns Type II SS. This also accepts continuous independent variables.
Value
The result table is comparable to that of SAS PROC ANOVA.
Df |
degrees of freedom |
Sum Sq |
sum of squares for the set of contrasts |
Mean Sq |
mean square |
F value |
F value for the F distribution |
Pr(>F) |
probability of a larger F value |
The next returns are optional.
Parameter |
Parameter table with standard error, t value, p value. |
Fitted |
Fitted values or y hat. This is returned only with the |
Residual |
Weighted residuals. This is returned only with the |
Author(s)
Kyun-Seop Bae k@acr.kr
Examples
aov2(uptake ~ Plant + Type + Treatment + conc, CO2)
aov2(uptake ~ Plant + Type + Treatment + conc, CO2, BETA=TRUE)
aov2(uptake ~ Plant + Type + Treatment + conc, CO2, Resid=TRUE)
aov2(uptake ~ Plant + Type + Treatment + conc, CO2, BETA=TRUE, Resid=TRUE)
aov2(uptake ~ Type, CO2)
aov2(uptake ~ Type - 1, CO2)
ANOVA with Type III SS
Description
ANOVA with Type III SS.
Usage
aov3(Formula, Data, BETA=FALSE, Resid=FALSE)
Arguments
Formula |
a conventional formula for a linear model. |
Data |
a |
BETA |
if |
Resid |
if |
Details
It performs the core function of SAS PROC GLM, and returns Type III SS. This also accepts continuous independent variables.
Value
The result table is comparable to that of SAS PROC ANOVA.
Df |
degrees of freedom |
Sum Sq |
sum of squares for the set of contrasts |
Mean Sq |
mean square |
F value |
F value for the F distribution |
Pr(>F) |
probability of a larger F value |
The next returns are optional.
Parameter |
Parameter table with standard error, t value, p value. |
Fitted |
Fitted values or y hat. This is returned only with the |
Residual |
Weighted residuals. This is returned only with the |
Author(s)
Kyun-Seop Bae k@acr.kr
Examples
aov3(uptake ~ Plant + Type + Treatment + conc, CO2)
aov3(uptake ~ Plant + Type + Treatment + conc, CO2, BETA=TRUE)
aov3(uptake ~ Plant + Type + Treatment + conc, CO2, Resid=TRUE)
aov3(uptake ~ Plant + Type + Treatment + conc, CO2, BETA=TRUE, Resid=TRUE)
An example dataset for meta-analysis - aspirin in coronary heart disease
Description
The data is from 'Canner PL. An overview of six clinical trials of aspirin in coronary heart disease. Stat Med. 1987'
Usage
aspirinCHD
Format
A data frame with 6 rows.
y1death event count of aspirin group
n1total subjects of the aspirin group
y2death event count of placebo group
n2total subjects of the placebo group
Details
This data is for educational purposes.
References
Canner PL. An overview of six clinical trials of aspirin in coronary heart disease. Stat Med. 1987;6:255-263.
Beautify the output of knitr::kable
Description
Trailing zeros after integers are somewhat annoying. This removes them from the vector of strings.
Usage
bk(ktab, rpltag=c("n", "N"), dig=10)
Arguments
ktab |
an output of |
rpltag |
tag string of replacement rows. This is usually "n", which means the sample count. |
dig |
maximum digits of decimals in the |
Details
This is convenient if used with tsum0, tsum1, tsum2, or tsum3. This requires knitr::kable.
Value
A new processed vector of strings. The class is still knitr_kable.
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Examples
## OUTPUT example
# t0 = tsum0(CO2, "uptake", c("mean", "median", "sd", "length", "min", "max"))
# bk(kable(t0)) # requires knitr package
#
# | | x|
# |:------|--------:|
# |mean | 27.21310|
# |median | 28.30000|
# |sd | 10.81441|
# |n | 84 |
# |min | 7.70000|
# |max | 45.50000|
# t1 = tsum(uptake ~ Treatment, CO2,
# e=c("mean", "median", "sd", "min", "max", "length"),
# ou=c("chilled", "nonchilled"),
# repl=list(c("median", "length"), c("med", "N")))
#
# bk(kable(t1, digits=3)) # requires knitr package
#
# | | chilled| nonchilled| Combined|
# |:----|-------:|----------:|--------:|
# |mean | 23.783| 30.643| 27.213|
# |med | 19.700| 31.300| 28.300|
# |sd | 10.884| 9.705| 10.814|
# |min | 7.700| 10.600| 7.700|
# |max | 42.400| 45.500| 45.500|
# |N | 42 | 42 | 84 |
Sum of Square with a Given Contrast Set
Description
Calculates the sum of squares of a contrast from an lfit result.
Usage
cSS(K, rx, mu=0, eps=1e-8)
Arguments
K |
contrast matrix. Each row is a contrast. |
rx |
a result of the |
mu |
a vector of mu for the hypothesis K. The length should be equal to the row count of K. |
eps |
Values less than this are considered zero. |
Details
It calculates the sum of squares with a given contrast matrix and an lfit result. It corresponds to SAS PROC GLM CONTRAST. This can test the hypothesis that the linear combination (function)'s mean vector is mu.
Value
Returns the sum of squares and its F value and p-value.
Df |
degrees of freedom |
Sum Sq |
sum of squares for the set of contrasts |
Mean Sq |
mean square |
F value |
F value for the F distribution |
Pr(>F) |
probability of a larger F value |
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Examples
rx = REG(uptake ~ Type, CO2, summarize=FALSE)
cSS(t(c(0, -1, 1)), rx) # sum of square
GLM(uptake ~ Type, CO2) # compare with the above
Correlation test by Fisher's Z transformation
Description
Testing correlation between two numeric vectors by Fisher's Z transformation.
Usage
corFisher(x, y, conf.level=0.95, rho=0)
Arguments
x |
the first input numeric vector |
y |
the second input numeric vector |
conf.level |
confidence level |
rho |
population correlation rho under the null hypothesis |
Details
This accepts only two numeric vectors.
Value
N |
sample size, length of input vectors |
r |
sample correlation |
Fisher.z |
Fisher's z |
bias |
bias to correct |
rho.hat |
point estimate of population rho |
conf.level |
confidence level for the confidence interval |
lower |
lower limit of confidence interval |
upper |
upper limit of confidence interval |
rho0 |
population correlation rho under the null hypothesis |
p.value |
p value under the null hypothesis |
Author(s)
Kyun-Seop Bae k@acr.kr
References
Fisher RA. Statistical Methods for Research Workers. 14e. 1973
Examples
corFisher(mtcars$disp, mtcars$hp, rho=0.6)
Get a Contrast Matrix for Type I SS
Description
Makes a contrast matrix for Type I SS using the forward Doolittle method.
Usage
e1(XpX, eps=1e-8)
Arguments
XpX |
the crossproduct of a design or model matrix. This should have appropriate column names. |
eps |
A value less than this is considered zero. |
Details
It makes a contrast matrix for Type I SS. If zapsmall is used, the result becomes less accurate.
Value
A contrast matrix for Type I SS.
Author(s)
Kyun-Seop Bae k@acr.kr
Examples
x = ModelMatrix(uptake ~ Plant + Type + Treatment + conc, CO2)
round(e1(crossprod(x$X)), 12)
Get a Contrast Matrix for Type II SS
Description
Makes a contrast matrix for Type II SS.
Usage
e2(x, eps=1e-8)
Arguments
x |
an output of |
eps |
A value less than this is considered zero. |
Details
It makes a contrast matrix for Type II SS. If zapsmall is used, the result becomes less accurate.
Value
A contrast matrix for Type II SS.
Author(s)
Kyun-Seop Bae k@acr.kr
Examples
round(e2(ModelMatrix(uptake ~ Plant + Type + Treatment + conc, CO2)), 12)
round(e2(ModelMatrix(uptake ~ Type, CO2)), 12)
round(e2(ModelMatrix(uptake ~ Type - 1, CO2)), 12)
Get a Contrast Matrix for Type III SS
Description
Makes a contrast matrix for Type III SS.
Usage
e3(x, eps=1e-8)
Arguments
x |
an output of |
eps |
A value less than this is considered zero. |
Details
It makes a contrast matrix for Type III SS. If zapsmall is used, the result becomes less accurate.
Value
A contrast matrix for Type III SS.
Author(s)
Kyun-Seop Bae k@acr.kr
Examples
round(e3(ModelMatrix(uptake ~ Plant + Type + Treatment + conc, CO2)), 12)
Estimate Linear Functions
Description
Estimates Linear Functions with a given GLM result.
Usage
est(L, X, rx, conf.level=0.95, adj="lsd", paired=FALSE)
Arguments
L |
a matrix of linear contrast rows to be tested |
X |
a model (design) matrix from |
rx |
a result of the |
conf.level |
confidence level of the confidence limit |
adj |
adjustment method for grouping. This supports "tukey", "bon", "scheffe", "duncan", and "dunnett". This only affects grouping, not the confidence interval. |
paired |
If this is |
Details
It tests rows of linear functions. A linear function means a linear combination of estimated coefficients. It corresponds to the ESTIMATE statement of SAS PROC GLM. The same sample size per group is assumed for the Tukey adjustment.
Value
Estimate |
point estimate of the input linear contrast |
Lower CL |
lower confidence limit by the "lsd" method |
Upper CL |
upper confidence limit by the "lsd" method |
Std. Error |
standard error of the point estimate |
t value |
value for the t distribution, for methods other than "scheffe" |
F value |
value for the F distribution, for the "scheffe" method only |
Df |
degrees of freedom of the residuals |
Pr(>|t|) |
probability of a larger absolute t value from the t distribution with the residual degrees of freedom, for methods other than "scheffe" |
Pr(>F) |
probability of a larger F value from the F distribution with the residual degrees of freedom, for the "scheffe" method only |
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Examples
x = ModelMatrix(uptake ~ Type, CO2)
rx = REG(uptake ~ Type, CO2, summarize=FALSE)
est(t(c(0, -1, 1)), x$X, rx) # Quebec - Mississippi
t.test(uptake ~ Type, CO2) # compare with the above
Estimability Check
Description
Checks the estimability of row vectors of coefficients.
Usage
estmb(L, X, g2, eps=1e-8)
Arguments
L |
row vectors of coefficients |
X |
a model (design) matrix from |
g2 |
g2 generalized inverse of |
eps |
An absolute value less than this is considered to be zero. |
Details
It checks the estimability of L, row vectors of coefficients. This corresponds to the ESTIMATE statement of SAS PROC GLM. See <Kennedy Jr. WJ, Gentle JE. Statistical Computing. 1980> p361 or <Golub GH, Styan GP. Numerical Computations for Univariate Linear Models. 1971>.
Value
a vector of logical values indicating which rows are estimable (as TRUE)
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Generalized type 2 inverse matrix, g2 inverse
Description
A generalized inverse is usually not unique. Some programs use this algorithm to get a unique generalized inverse matrix. This uses the SWEEP operator and works for non-square matrices also.
Usage
g2inv(A, eps=1e-08)
Arguments
A |
a matrix to be inverted |
eps |
A value less than this is considered zero. |
Details
See 'SAS Technical Report R106, The Sweep Operator: Its Importance in Statistical Computing' by J. H. Goodnight for details.
Value
g2 inverse
Author(s)
Kyun-Seop Bae k@acr.kr
References
Searle SR, Khuri AI. Matrix Algebra Useful for Statistics. 2e. John Wiley and Sons Inc. 2017.
See Also
Examples
A = matrix(c(1, 2, 4, 3, 3, -1, 2, -2, 5, -4, 0, -7), byrow=TRUE, ncol=4) ; A
g2inv(A)
Geometric Coefficient of Variation in percentage
Description
Geometric coefficient of variation in percentage.
Usage
geoCV(y)
Arguments
y |
a numeric vector |
Details
It removes NA. This is sqrt(exp(var(log(y))) - 1)*100.
Value
Geometric coefficient of variation in percentage.
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Examples
geoCV(mtcars$mpg)
Geometric Mean without NA
Description
Geometric mean without NA values.
Usage
geoMean(y)
Arguments
y |
a vector of numerics |
Details
It removes NA in the input vector.
Value
geometric mean value
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Examples
geoMean(mtcars$mpg)
Is it a correlation matrix?
Description
Tests if the input matrix is a correlation matrix or not.
Usage
is.cor(m, eps=1e-16)
Arguments
m |
a presumed correlation matrix |
eps |
epsilon value. An absolute value less than this is considered zero. |
Details
A diagonal component does not need to be exactly 1, but it should be close to 1.
Value
TRUE or FALSE
Author(s)
Kyun-Seop Bae k@acr.kr
Linear Fit
Description
Fits a least squares linear model.
Usage
lfit(x, y, eps=1e-8)
Arguments
x |
a result of |
y |
a column vector of the response (dependent) variable |
eps |
A value less than this is considered zero. |
Details
A minimal version of the least squares fit of a linear model
Value
coefficients |
beta coefficients |
g2 |
g2 inverse |
rank |
rank of the model matrix |
DFr |
degrees of freedom for the residual |
SSE |
sum of squares error |
SST |
sum of squares total |
DFr2 |
degrees of freedom of the residual for the beta coefficient |
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Examples
f1 = uptake ~ Type*Treatment + conc
x = ModelMatrix(f1, CO2)
y = model.frame(f1, CO2)[,1]
lfit(x, y)
Linear Regression with g2 inverse
Description
Coefficients are calculated with the g2 inverse. The output is similar to summary(lm()).
Usage
lr(Formula, Data, eps=1e-8)
Arguments
Formula |
a conventional formula for a linear model |
Data |
a |
eps |
A value less than this is considered zero. |
Details
It uses G2SWEEP to get the g2 inverse. The result is similar to summary(lm()) without options.
Value
The result is comparable to that of SAS PROC REG.
Estimate |
point estimate of parameters, coefficients |
Std. Error |
standard error of the point estimate |
t value |
value for the t distribution |
Pr(>|t|) |
probability of a larger absolute t value from the t distribution with the residual degrees of freedom |
Author(s)
Kyun-Seop Bae k@acr.kr
Examples
lr(uptake ~ Plant + Type + Treatment + conc, CO2)
lr(uptake ~ Plant + Type + Treatment + conc - 1, CO2)
lr(uptake ~ Type, CO2)
lr(uptake ~ Type - 1, CO2)
Simple Linear Regressions with Each Independent Variable
Description
Usually, the first step in multiple linear regression is to perform simple linear regressions with each single independent variable.
Usage
lr0(Formula, Data)
Arguments
Formula |
a conventional formula for a linear model. The intercept will always be added. |
Data |
a |
Details
It performs simple linear regression for each independent variable.
Value
Each row means one simple linear regression with that row name as the only independent variable.
Intercept |
estimate of the intercept |
SE(Intercept) |
standard error of the intercept |
Slope |
estimate of the slope |
SE(Slope) |
standard error of the slope |
Rsq |
R-squared for the simple linear model |
Pr(>F) |
p-value of the slope or the model |
Author(s)
Kyun-Seop Bae k@acr.kr
Examples
lr0(uptake ~ Plant + Type + Treatment + conc, CO2)
lr0(mpg ~ ., mtcars)
Independent two groups t-test similar to PROC TTEST with summarized input
Description
This is comparable to SAS PROC TTEST, except that it uses summarized input (sufficient statistics).
Usage
mtest(m1, s1, n1, m0, s0, n0, conf.level=0.95)
Arguments
m1 |
mean of the first (test, active, experimental) group |
s1 |
sample standard deviation of the first group |
n1 |
sample size of the first group |
m0 |
mean of the second (reference, control, placebo) group |
s0 |
sample standard deviation of the second group |
n0 |
sample size of the second group |
conf.level |
confidence level |
Details
This uses summarized input. This also produces confidence intervals of means and variances by group.
Value
The output format is comparable to that of SAS PROC TTEST.
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Examples
mtest(5.4, 10.5, 3529, 5.1, 8.9, 5190) # NEJM 388;15 p1386
Plot Confidence and Prediction Bands for Simple Linear Regression
Description
It plots bands of the confidence interval and prediction interval for simple linear regression.
Usage
pB(Formula, Data, Resol=300, conf.level=0.95, lx, ly, ...)
Arguments
Formula |
a formula |
Data |
a |
Resol |
resolution for the output |
conf.level |
confidence level |
lx |
x position of the legend |
ly |
y position of the legend |
... |
arguments to be passed to |
Details
It plots. Discard the return values. If lx or ly is missing, the legend position is calculated automatically.
Value
Ignore the return values.
Author(s)
Kyun-Seop Bae k@acr.kr
Examples
pB(hp ~ disp, mtcars)
pB(mpg ~ disp, mtcars)
Diagnostic Plot for Regression
Description
Four standard diagnostic plots for regression.
Usage
pD(rx, Title=NULL)
Arguments
rx |
a result of |
Title |
title to be printed on the plot |
Details
The most frequently used diagnostic plots are 'observed vs. fitted', 'standardized residual vs. fitted', 'distribution plot of standardized residuals', and 'Q-Q plot of standardized residuals'.
Value
Four diagnostic plots on a page.
Author(s)
Kyun-Seop Bae k@acr.kr
Examples
pD(lm(uptake ~ Plant + Type + Treatment + conc, CO2), "Diagnostic Plot")
Residual Diagnostic Plot for Regression
Description
Nine residual diagnostic plots.
Usage
pResD(rx, Title=NULL)
Arguments
rx |
a result of |
Title |
title to be printed on the plot |
Details
SAS-style residual diagnostic plots.
Value
Nine residual diagnostic plots on a page.
Author(s)
Kyun-Seop Bae k@acr.kr
Examples
pResD(lm(uptake ~ Plant + Type + Treatment + conc, CO2), "Residual Diagnostic Plot")
Regression of Conventional Way with Rich Diagnostics
Description
regD provides rich diagnostics such as student residual, leverage (hat), Cook's D, studentized deleted residual, DFFITS, and DFBETAS.
Usage
regD(Formula, Data)
Arguments
Formula |
a conventional formula for a linear model |
Data |
a |
Details
It performs the conventional regression analysis. This does not use the g2 inverse; therefore, it cannot handle a singular matrix. If the model (design) matrix is not of full rank, use REG or fewer parameters.
Value
Coefficients |
conventional coefficients summary with Wald statistics |
Diagnostics |
Diagnostics table for detecting outliers or influential/leverage points. This includes the fitted value (Predicted), residual (Residual), standard error of the residual (SE_Resid), studentized residual (Student_Res), hat (Leverage), Cook's D, studentized deleted residual (RStudent), DFFITS, and COVRATIO. |
DFBETAS |
Column names are the names of coefficients. Each row shows how much each coefficient is affected by deleting the corresponding observation. |
Author(s)
Kyun-Seop Bae k@acr.kr
Examples
regD(uptake ~ conc, CO2)
Satterthwaite Approximation of Variance and Degrees of Freedom
Description
Calculates the pooled variance and degrees of freedom using the Satterthwaite equation.
Usage
satt(vars, dfs, ws=c(1, 1))
Arguments
vars |
a vector of variances |
dfs |
a vector of degrees of freedom |
ws |
a vector of weights |
Details
The input can contain more than two variances.
Value
Variance |
approximated variance |
Df |
degrees of freedom |
Author(s)
Kyun-Seop Bae k@acr.kr
Sequential bounds for cumulative Z-test in Group Sequential Design
Description
Sequential upper bounds for cumulative Z-test on accumulating data. Z values are correlated. This is usually used for a group sequential design.
Usage
seqBound(ti, alpha = 0.05, side = 2, t2 = NULL, asf = 1)
Arguments
ti |
times for the tests. These should be in [0, 1]. |
alpha |
goal alpha value for the last test at time 1. |
side |
1=one-sided test, 2=two-sided test |
t2 |
fractions of the information amount. These should be in [0, 1]. If not available, ti will be used instead. |
asf |
alpha spending function. 1=O'Brien-Fleming type (approximate, not exact), 2=Pocock type (approximate, not exact), 3=alpha*ti, 4=alpha*ti^1.5, 5=alpha*ti^2 |
Details
It calculates upper z-bounds and cumulative alpha-values for the repeated tests in a group sequential design. The correlation is assumed to be sqrt(t_i/t_j).
Use PocockBound and OBFBound for more exact bounds.
Value
The result is a matrix.
time |
time of test |
up.bound |
upper z-bound |
cum.alpha |
cumulative alpha-value |
Author(s)
Kyun-Seop Bae k@acr.kr
References
Reboussin DM, DeMets DL, Kim K, Lan KKG. Computations for group sequential boundaries using the Lan-DeMets function method. Controlled Clinical Trials. 2000;21:190-207.
See Also
Examples
seqBound(ti=(1:5)/5)
seqBound(ti=(1:5)/5, asf=2)
Confidence interval with the last Z-value for a group sequential design
Description
Confidence interval with given upper bounds, times of tests, the last Z-value, and confidence level.
Usage
seqCI(bi, ti, Zval, conf.level=0.95)
Arguments
bi |
upper bound z-values |
ti |
times for the tests. These should be in [0, 1]. |
Zval |
the last z-value from the observed data. This is not necessarily the planned final Z-value. |
conf.level |
confidence level |
Details
It calculates the confidence interval with given upper bounds, times of tests, the last Z-value, and confidence level. It assumes a two-sided test. mvtnorm::pmvt (with noncentrality) is better than pmvnorm in calculating power, sample size, and confidence interval. However, Lan-DeMets used the multivariate normal rather than the multivariate noncentral t distribution. This function follows Lan-DeMets for consistency with previous results. For the theoretical background, see the reference.
Value
confidence interval of the Z-value for the given confidence level.
Author(s)
Kyun-Seop Bae k@acr.kr
References
Reboussin DM, DeMets DL, Kim K, Lan KKG. Computations for group sequential boundaries using the Lan-DeMets function method. Controlled Clinical Trials. 2000;21:190-207.
Examples
seqCI(bi = c(2.53, 2.61, 2.57, 2.47, 2.43, 2.38),
ti = c(.2292, .3333, .4375, .5833, .7083, .8333), Zval=2.82)
Independent two means test similar to t.test with summarized input
Description
This produces essentially the same result as t.test, except using summarized input (sufficient statistics).
Usage
tmtest(m1, s1, n1, m0, s0, n0, conf.level=0.95, nullHypo=0, var.equal=FALSE)
Arguments
m1 |
mean of the first (test, active, experimental) group |
s1 |
sample standard deviation of the first group |
n1 |
sample size of the first group |
m0 |
mean of the second (reference, control, placebo) group |
s0 |
sample standard deviation of the second group |
n0 |
sample size of the second group |
conf.level |
confidence level |
nullHypo |
value for the difference of means under the null hypothesis |
var.equal |
assumption on the variance equality |
Details
The default is the Welch t-test with the Satterthwaite approximation.
Value
The output format is very similar to that of t.test.
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Examples
tmtest(5.4, 10.5, 3529, 5.1, 8.9, 5190) # NEJM 388;15 p1386
tmtest(5.4, 10.5, 3529, 5.1, 8.9, 5190, var.equal=TRUE)
Trimmed Mean
Description
Trimmed mean wrapping the mean function.
Usage
trimmedMean(y, Trim=0.05)
Arguments
y |
a vector of numerics |
Trim |
trimming proportion. Default is 0.05 |
Details
It removes NA in the input vector.
Value
The value of the trimmed mean
Author(s)
Kyun-Seop Bae k@acr.kr
Table Summary
Description
Summarize a continuous dependent variable with or without independent variables.
Usage
tsum(Formula=NULL, Data=NULL, ColNames=NULL, MaxLevel=30, ...)
Arguments
Formula |
a conventional formula |
Data |
a |
ColNames |
If there is no |
MaxLevel |
An independent variable with more levels than this will not be handled. |
... |
arguments to be passed to |
Details
A convenient summarization function for a continuous variable. This is a wrapper function for tsum0, tsum1, tsum2, or tsum3.
Value
A data.frame of descriptive summarization values.
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Examples
tsum(lh)
t(tsum(CO2))
t(tsum(uptake ~ Treatment, CO2))
tsum(uptake ~ Type + Treatment, CO2)
print(tsum(uptake ~ conc + Type + Treatment, CO2), digits=3)
Table Summary with 0 independent (x) variables
Description
Summarize a continuous dependent (y) variable without any independent (x) variable.
Usage
tsum0(d, y, e=c("Mean", "SD", "N"), repl=list(c("length"), c("n")))
Arguments
d |
a |
y |
y variable name, a continuous variable |
e |
a vector of summary function names |
repl |
a list of strings to replace after summarization. The length of the list should be 2, and both elements should have the same length. |
Details
A convenient summarization function for a continuous variable.
Value
A vector of summarized values
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Examples
tsum0(CO2, "uptake")
tsum0(CO2, "uptake", repl=list(c("mean", "length"), c("Mean", "n")))
Table Summary with 1 independent (x) variable
Description
Summarize a continuous dependent (y) variable with one independent (x) variable.
Usage
tsum1(d, y, u, e=c("Mean", "SD", "N"), ou="", repl=list(c("length"), ("n")))
Arguments
d |
a |
y |
y variable name, a continuous variable |
u |
x variable name, upper side variable |
e |
a vector of summary function names |
ou |
order of levels of the upper side x variable |
repl |
a list of strings to replace after summarization. The length of the list should be 2, and both elements should have the same length. |
Details
A convenient summarization function for a continuous variable with one x variable.
Value
A data.frame of summarized values. Row names are from e names. Column names are from the levels of the x variable.
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Examples
tsum1(CO2, "uptake", "Treatment")
tsum1(CO2, "uptake", "Treatment",
e=c("mean", "median", "sd", "min", "max", "length"),
ou=c("chilled", "nonchilled"),
repl=list(c("median", "length"), c("med", "n")))
Table Summary with 2 independent (x) variables
Description
Summarize a continuous dependent (y) variable with two independent (x) variables.
Usage
tsum2(d, y, l, u, e=c("Mean", "SD", "N"), h=NULL, ol="", ou="", rm.dup=TRUE,
repl=list(c("length"), c("n")))
Arguments
d |
a |
y |
y variable name, a continuous variable |
l |
x variable name to be shown on the left side |
u |
x variable name to be shown on the upper side |
e |
a vector of summary function names |
h |
a vector of summary function names for the horizontal subgroup. If |
ol |
order of levels of the left side x variable |
ou |
order of levels of the upper side x variable |
rm.dup |
if |
repl |
a list of strings to replace after summarization. The length of the list should be 2, and both elements should have the same length. |
Details
A convenient summarization function for a continuous variable with two x variables; one on the left side, the other on the upper side.
Value
A data.frame of summarized values. Column names are from the levels of u. Row names are basically from the levels of l.
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Examples
tsum2(CO2, "uptake", "Type", "Treatment")
tsum2(CO2, "uptake", "Type", "conc")
tsum2(CO2, "uptake", "Type", "Treatment",
e=c("mean", "median", "sd", "min", "max", "length"),
ou=c("chilled", "nonchilled"),
repl=list(c("median", "length"), c("med", "n")))
Table Summary with 3 independent (x) variables
Description
Summarize a continuous dependent (y) variable with three independent (x) variables.
Usage
tsum3(d, y, l, u, e=c("Mean", "SD", "N"), h=NULL, ol1="", ol2="", ou="",
rm.dup=TRUE, repl=list(c("length"), c("n")))
Arguments
d |
a |
y |
y variable name, a continuous variable |
l |
a vector of two x variable names to be shown on the left side. The length should be 2. |
u |
x variable name to be shown on the upper side |
e |
a vector of summary function names |
h |
a list of two vectors of summary function names for the first and second horizontal subgroups. If |
ol1 |
order of levels of the 1st left side x variable |
ol2 |
order of levels of the 2nd left side x variable |
ou |
order of levels of the upper side x variable |
rm.dup |
if |
repl |
a list of strings to replace after summarization. The length of the list should be 2, and both elements should have the same length. |
Details
A convenient summarization function for a continuous variable with three x variables; two on the left side, the other on the upper side.
Value
A data.frame of summarized values. Column names are from the levels of u. Row names are basically from the levels of l.
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Examples
tsum3(CO2, "uptake", c("Type", "Treatment"), "conc")
tsum3(CO2, "uptake", c("Type", "Treatment"), "conc",
e=c("mean", "median", "sd", "min", "max", "length"),
h=list(c("mean", "sd", "length"), c("mean", "length")),
ol2=c("chilled", "nonchilled"),
repl=list(c("median", "length"), c("med", "n")))
F-Test for the ratio of two groups' variances
Description
F-test for the ratio of two groups' variances. This is similar to var.test except using the summarized input.
Usage
vtest(v1, n1, v0, n0, ratio=1, conf.level=0.95)
Arguments
v1 |
sample variance of the first (test, active, experimental) group |
n1 |
sample size of the first group |
v0 |
sample variance of the second (reference, control, placebo) group |
n0 |
sample size of the second group |
ratio |
value for the ratio of variances under the null hypothesis |
conf.level |
confidence level |
Details
For the confidence interval of one group, use the UNIV function.
Value
The output format is very similar to that of var.test.
Author(s)
Kyun-Seop Bae k@acr.kr
Examples
vtest(10.5^2, 3529, 8.9^2, 5190) # NEJM 388;15 p1386
vtest(2.3^2, 13, 1.5^2, 11, conf.level=0.9) # Red book p240
Test for the difference of two groups' means
Description
This is similar to the two groups t-test, but using the standard normal (Z) distribution.
Usage
ztest(m1, s1, n1, m0, s0, n0, conf.level=0.95, nullHypo=0)
Arguments
m1 |
mean of the first (test, active, experimental) group |
s1 |
known standard deviation of the first group |
n1 |
sample size of the first group |
m0 |
mean of the second (reference, control, placebo) group |
s0 |
known standard deviation of the second group |
n0 |
sample size of the second group |
conf.level |
confidence level |
nullHypo |
value for the difference of means under the null hypothesis |
Details
Use this only for known standard deviations (or variances) or very large sample sizes per group.
Value
The output format is very similar to that of t.test.
Author(s)
Kyun-Seop Bae k@acr.kr
See Also
Examples
ztest(5.4, 10.5, 3529, 5.1, 8.9, 5190) # NEJM 388;15 p1386