---
title: "Homogeneous Weibull Series Systems: Shared Shape Parameter"
output:
    rmarkdown::html_vignette:
        toc: true
vignette: >
  %\VignetteIndexEntry{Homogeneous Weibull Series Systems: Shared Shape Parameter}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
header-includes:
- \renewcommand{\v}[1]{\boldsymbol{#1}}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  echo = TRUE,
  comment = "#>")

# Set to TRUE to regenerate long-running simulation results
run_long <- FALSE

library(maskedcauses)
old_opts <- options(digits = 4)
```


Theory {#theory}
=================

## Component lifetime model

Consider a series system with $m$ components where each component lifetime
follows a Weibull distribution with a **common shape parameter** $k$ but
individual scale parameters $\beta_1, \ldots, \beta_m$.
The parameter vector is
$$
  \theta = (k, \beta_1, \ldots, \beta_m) \in \mathbb{R}^{m+1}_{>0}.
$$

The $j$-th component has hazard function
$$
  h_j(t \mid k, \beta_j) = \frac{k}{\beta_j}
    \left(\frac{t}{\beta_j}\right)^{k-1}, \quad t > 0,
$$
survival function
$$
  R_j(t \mid k, \beta_j) = \exp\!\left(-\left(\frac{t}{\beta_j}\right)^k\right),
$$
and pdf
$$
  f_j(t \mid k, \beta_j) = h_j(t) \, R_j(t)
    = \frac{k}{\beta_j}\left(\frac{t}{\beta_j}\right)^{k-1}
      \exp\!\left(-\left(\frac{t}{\beta_j}\right)^k\right).
$$

The shape parameter $k$ controls the failure rate behaviour:

- $k < 1$: decreasing failure rate (DFR) -- infant mortality, burn-in
- $k = 1$: constant failure rate (CFR) -- exponential distribution
- $k > 1$: increasing failure rate (IFR) -- wear-out, aging

## System lifetime

The system lifetime $T = \min(T_1, \ldots, T_m)$ has reliability
$$
  R_{\text{sys}}(t \mid \theta) = \prod_{j=1}^m R_j(t)
    = \exp\!\left(-\sum_{j=1}^m \left(\frac{t}{\beta_j}\right)^k\right).
$$

Because all components share the same shape $k$, the exponent simplifies:
$$
  \sum_{j=1}^m \left(\frac{t}{\beta_j}\right)^k
    = t^k \sum_{j=1}^m \beta_j^{-k}
    = \left(\frac{t}{\beta_{\text{sys}}}\right)^k,
$$
where
$$
  \beta_{\text{sys}} = \left(\sum_{j=1}^m \beta_j^{-k}\right)^{-1/k}.
$$
Thus, the system lifetime is itself Weibull with shape $k$ and scale
$\beta_{\text{sys}}$:
$$
  T \sim \operatorname{Weibull}(k, \beta_{\text{sys}}).
$$
This closure under the minimum operation is the defining structural advantage
of the homogeneous model and is computed by `wei_series_system_scale(k, scales)`.

## Conditional cause probability

The conditional probability that component $j$ caused the system failure at
time $t$ is
$$
  P(K = j \mid T = t, \theta) = \frac{h_j(t)}{h_{\text{sys}}(t)}
    = \frac{\beta_j^{-k} \cdot (k / \beta_j)(t/\beta_j)^{k-1}}
           {\sum_l \beta_l^{-k} \cdot (k / \beta_l)(t/\beta_l)^{k-1}}.
$$
The $t^{k-1}$ and $k$ factors cancel, yielding
$$
  P(K = j \mid T = t, \theta) = \frac{\beta_j^{-k}}{\sum_{l=1}^m \beta_l^{-k}}
    \eqqcolon w_j.
$$
Remarkably, this does **not** depend on the failure time $t$ -- the conditional
cause probability is constant, just as in the exponential case. This occurs
because every component hazard shares the same power-law time dependence
$t^{k-1}$, which cancels in the ratio. We define the **cause weights**
$$
  w_j = \frac{\beta_j^{-k}}{\sum_{l=1}^m \beta_l^{-k}}.
$$

## Marginal cause probability

Since $P(K = j \mid T = t, \theta)$ is independent of $t$,
the marginal probability integrates trivially:
$$
  P(K = j \mid \theta) = \mathbb{E}_T[P(K = j \mid T, \theta)] = w_j
    = \frac{\beta_j^{-k}}{\sum_{l=1}^m \beta_l^{-k}}.
$$
Components with smaller scale parameters (shorter expected lifetimes) are more
likely to be the cause of system failure.

## Connection to the exponential model

When $k = 1$, the Weibull distribution reduces to the exponential with rate
$\lambda_j = 1/\beta_j$. In this case:

- The system scale becomes
  $\beta_{\text{sys}} = (1/\sum_j \beta_j^{-1})$ and the system rate is
  $\lambda_{\text{sys}} = \sum_j \lambda_j$.
- The cause weights become $w_j = \lambda_j / \lambda_{\text{sys}}$.
- All likelihood contributions reduce to the `exp_series_md_c1_c2_c3` forms.

We verify this identity numerically in the [Weibull(k=1) = Exponential Identity](#exponential-identity) section below.


Worked Example {#worked-example}
================================

We construct a 3-component homogeneous Weibull series system with increasing
failure rate ($k = 1.5$):

```{r worked-example-params}
theta <- c(k = 1.5, beta1 = 100, beta2 = 150, beta3 = 200)
k <- theta[1]
scales <- theta[-1]
m <- length(scales)

beta_sys <- wei_series_system_scale(k, scales)
cat("System scale (beta_sys):", round(beta_sys, 2), "\n")
cat("System mean lifetime:", round(beta_sys * gamma(1 + 1/k), 2), "\n")
```

## Component hazards

The `component_hazard()` generic returns a closure for the $j$-th component
hazard. We overlay all three to visualize how failure intensity changes over
time:

```{r hazard-plot, fig.width=7, fig.height=4.5}
model <- wei_series_homogeneous_md_c1_c2_c3()
t_grid <- seq(0.1, 200, length.out = 300)

cols <- c("#E41A1C", "#377EB8", "#4DAF4A")
plot(NULL, xlim = c(0, 200), ylim = c(0, 0.06),
     xlab = "Time t", ylab = "Hazard h_j(t)",
     main = "Component Hazard Functions")
for (j in seq_len(m)) {
  h_j <- component_hazard(model, j)
  lines(t_grid, h_j(t_grid, theta), col = cols[j], lwd = 2)
}
legend("topleft", paste0("Component ", 1:m, " (beta=", scales, ")"),
       col = cols, lwd = 2, cex = 0.9)
```

All three hazard curves are increasing ($k > 1$), but component 1 (smallest
scale) has the steepest rate of increase and dominates at all times.

## Cause probabilities

```{r cause-probs}
# Analytical cause weights: w_j = beta_j^{-k} / sum(beta_l^{-k})
w <- scales^(-k) / sum(scales^(-k))
names(w) <- paste0("Component ", 1:m)
cat("Cause weights (w_j):\n")
print(round(w, 4))
```

The `conditional_cause_probability()` generic confirms these are time-invariant:

```{r cond-cause-plot, fig.width=7, fig.height=4}
ccp_fn <- conditional_cause_probability(
  wei_series_homogeneous_md_c1_c2_c3(scales = scales)
)
probs <- ccp_fn(t_grid, theta)

plot(NULL, xlim = c(0, 200), ylim = c(0, 0.7),
     xlab = "Time t", ylab = "P(K = j | T = t)",
     main = "Conditional Cause Probability vs Time")
for (j in seq_len(m)) {
  lines(t_grid, probs[, j], col = cols[j], lwd = 2)
}
legend("right", paste0("Component ", 1:m),
       col = cols, lwd = 2, cex = 0.9)
abline(h = w, col = cols, lty = 3)
```

The conditional cause probabilities are flat lines, confirming that the
homogeneous shape creates time-invariant cause attributions -- a key structural
simplification shared with the exponential model.

## Data generation with periodic inspection

```{r gen-periodic-data}
gen <- rdata(model)

set.seed(2024)
df <- gen(theta, n = 500, p = 0.3,
          observe = observe_periodic(delta = 20, tau = 250))

cat("Observation types:\n")
print(table(df$omega))
cat("\nFirst 6 rows:\n")
print(head(df), row.names = FALSE)
```


Likelihood Contributions {#likelihood-contributions}
=====================================================

The log-likelihood under conditions C1, C2, C3 decomposes into individual
observation contributions. The homogeneous shape is critical: because the
system lifetime is Weibull($k$, $\beta_{\text{sys}}$), left- and
interval-censored terms have **closed-form** expressions involving only the
system Weibull CDF and cause weights $w_c$.

Let $c_i$ denote the candidate set for observation $i$ and define
$w_{c_i} = \sum_{j \in c_i} \beta_j^{-k} / \sum_l \beta_l^{-k}$.

## Exact observation ($\omega_i = \text{exact}$)

The system failed at observed time $t_i$:
$$
  \ell_i = \log\!\left(\sum_{j \in c_i} h_j(t_i)\right)
    - \sum_{j=1}^m \left(\frac{t_i}{\beta_j}\right)^k.
$$

## Right-censored ($\omega_i = \text{right}$)

The system survived past $t_i$ (no candidate set information):
$$
  \ell_i = -\sum_{j=1}^m \left(\frac{t_i}{\beta_j}\right)^k
    = -\left(\frac{t_i}{\beta_{\text{sys}}}\right)^k.
$$

## Left-censored ($\omega_i = \text{left}$)

The system failed before inspection time $\tau_i$:
$$
  \ell_i = \log w_{c_i} + \log\!\left(
    1 - \exp\!\left(-\left(\frac{\tau_i}{\beta_{\text{sys}}}\right)^k\right)
  \right).
$$
This is $\log w_{c_i} + \log F_{\text{sys}}(\tau_i)$, where $F_{\text{sys}}$
is the system Weibull CDF -- a closed form that does **not** require numerical
integration. The $w_{c_i}$ term arises because, under homogeneous shapes, the
cause attribution and system lifetime factor cleanly.

## Interval-censored ($\omega_i = \text{interval}$)

The system failed in $(a_i, b_i)$:
$$
  \ell_i = \log w_{c_i}
    - \left(\frac{a_i}{\beta_{\text{sys}}}\right)^k
    + \log\!\left(1 - \exp\!\left(
      -\left(\left(\frac{b_i}{\beta_{\text{sys}}}\right)^k
       - \left(\frac{a_i}{\beta_{\text{sys}}}\right)^k\right)
    \right)\right).
$$
This is $\log w_{c_i} + \log(R_{\text{sys}}(a_i) - R_{\text{sys}}(b_i))$,
again in closed form.

## Why homogeneous shapes enable closed forms

In the heterogeneous Weibull model (`wei_series_md_c1_c2_c3`), the system
survival function $R_{\text{sys}}(t) = \prod_j \exp(-(t/\beta_j)^{k_j})$ does
not reduce to a single Weibull, so the left- and interval-censored
contributions require numerical integration via `cubature`. The homogeneous
constraint $k_j = k$ for all $j$ collapses the product into a single Weibull
CDF evaluation, and the cause weights $w_{c_i}$ separate from the time
dependence. This makes the homogeneous model substantially faster and more
numerically stable for censored data.


MLE Fitting {#mle-fitting}
===========================

We fit the model to the periodically inspected data generated above using the
`fit()` generic, which returns a solver based on `optim`:

```{r mle-fit, warning=FALSE}
solver <- fit(model)
theta0 <- c(1.2, 110, 140, 180)  # Initial guess near true values

estimate <- solver(df, par = theta0, method = "Nelder-Mead")
print(estimate)
```

```{r mle-recovery}
cat("True parameters: ", round(theta, 2), "\n")
cat("MLE estimates:   ", round(estimate$par, 2), "\n")
cat("Std errors:      ", round(sqrt(diag(estimate$vcov)), 2), "\n")
cat("Relative error:  ",
    round(100 * abs(estimate$par - theta) / theta, 1), "%\n")
```

## Score and Hessian computation

The score function uses a **hybrid approach**: analytical gradients for exact
and right-censored observations, and `numDeriv::grad` for left- and
interval-censored observations. The Hessian is fully numerical, computed as the
Jacobian of the score via `numDeriv::jacobian`.

```{r score-hessian-verify}
ll_fn <- loglik(model)
scr_fn <- score(model)
hess_fn <- hess_loglik(model)

# Score at MLE should be near zero
scr_mle <- scr_fn(df, estimate$par)
cat("Score at MLE:", round(scr_mle, 4), "\n")

# Verify score against numerical gradient
scr_num <- numDeriv::grad(function(th) ll_fn(df, th), estimate$par)
cat("Max |analytical - numerical| score:",
    formatC(max(abs(scr_mle - scr_num)), format = "e", digits = 2), "\n")

# Hessian eigenvalues (should be negative for concavity)
H <- hess_fn(df, estimate$par)
cat("Hessian eigenvalues:", round(eigen(H)$values, 2), "\n")
```


Monte Carlo Simulation Study {#monte-carlo}
=============================================

We compare estimation performance across three shape regimes:
decreasing failure rate ($k = 0.7$), constant ($k = 1.0$), and increasing
($k = 1.5$). Each scenario uses $m = 3$ components, $n = 500$ observations,
Bernoulli masking with $p = 0.3$, and periodic inspection with approximately
25% right-censoring.

```{r load-precomputed, include=FALSE, eval=!run_long}
results <- readRDS("precomputed_weibull_homogeneous.rds")
```

```{r mc-setup, cache=TRUE, eval=run_long}
set.seed(2024)

B <- 100          # Monte Carlo replications
n_mc <- 500       # Sample size per replication
p_mc <- 0.3       # Masking probability
alpha <- 0.05     # CI level

# Three shape regimes
scenarios <- list(
  DFR = c(k = 0.7, beta1 = 100, beta2 = 150, beta3 = 200),
  CFR = c(k = 1.0, beta1 = 100, beta2 = 150, beta3 = 200),
  IFR = c(k = 1.5, beta1 = 100, beta2 = 150, beta3 = 200)
)

results <- list()
```

```{r mc-run, cache=TRUE, eval=run_long, warning=FALSE, message=FALSE}
model_mc <- wei_series_homogeneous_md_c1_c2_c3()
gen_mc <- rdata(model_mc)
solver_mc <- fit(model_mc)

for (sc_name in names(scenarios)) {
  th <- scenarios[[sc_name]]
  k_sc <- th[1]
  scales_sc <- th[-1]
  m_sc <- length(scales_sc)
  npars <- m_sc + 1

  # Choose tau to give ~25% right-censoring
  beta_sys_sc <- wei_series_system_scale(k_sc, scales_sc)
  tau_sc <- qweibull(0.75, shape = k_sc, scale = beta_sys_sc)
  delta_sc <- tau_sc / 10  # ~10 inspection intervals

  estimates <- matrix(NA, nrow = B, ncol = npars)
  se_est <- matrix(NA, nrow = B, ncol = npars)
  ci_lower <- matrix(NA, nrow = B, ncol = npars)
  ci_upper <- matrix(NA, nrow = B, ncol = npars)
  converged <- logical(B)
  cens_fracs <- numeric(B)

  for (b in seq_len(B)) {
    df_b <- gen_mc(th, n = n_mc, p = p_mc,
                   observe = observe_periodic(delta = delta_sc, tau = tau_sc))
    cens_fracs[b] <- mean(df_b$omega == "right")

    tryCatch({
      est_b <- solver_mc(df_b, par = c(1, rep(120, m_sc)),
                         method = "L-BFGS-B",
                         lower = rep(1e-6, npars))
      estimates[b, ] <- est_b$par
      se_est[b, ] <- sqrt(diag(est_b$vcov))
      z <- qnorm(1 - alpha / 2)
      ci_lower[b, ] <- est_b$par - z * se_est[b, ]
      ci_upper[b, ] <- est_b$par + z * se_est[b, ]
      converged[b] <- est_b$converged
    }, error = function(e) {
      converged[b] <<- FALSE
    })
  }

  results[[sc_name]] <- list(
    theta = th, estimates = estimates, se_est = se_est,
    ci_lower = ci_lower, ci_upper = ci_upper,
    converged = converged, cens_fracs = cens_fracs,
    tau = tau_sc, delta = delta_sc
  )
}
```

```{r mc-save, include=FALSE, eval=run_long}
saveRDS(results, "precomputed_weibull_homogeneous.rds")
```

### Bias and MSE by shape regime

```{r mc-results-table}
summary_rows <- list()

for (sc_name in names(results)) {
  res <- results[[sc_name]]
  th <- res$theta
  valid <- res$converged & !is.na(res$estimates[, 1])
  est_v <- res$estimates[valid, , drop = FALSE]

  bias <- colMeans(est_v) - th
  variance <- apply(est_v, 2, var)
  mse <- bias^2 + variance
  pnames <- c("k", paste0("beta", seq_along(th[-1])))

  for (j in seq_along(th)) {
    summary_rows[[length(summary_rows) + 1]] <- data.frame(
      Regime = sc_name,
      Parameter = pnames[j],
      True = th[j],
      Mean_Est = mean(est_v[, j]),
      Bias = bias[j],
      RMSE = sqrt(mse[j]),
      Rel_Bias_Pct = 100 * bias[j] / th[j],
      stringsAsFactors = FALSE
    )
  }
}

mc_table <- do.call(rbind, summary_rows)

knitr::kable(mc_table, digits = 3, row.names = FALSE,
             caption = "Monte Carlo Results by Shape Regime (B=100, n=500)",
             col.names = c("Regime", "Parameter", "True", "Mean Est.",
                          "Bias", "RMSE", "Rel. Bias %"))
```

### Confidence interval coverage

```{r mc-coverage}
coverage_rows <- list()

for (sc_name in names(results)) {
  res <- results[[sc_name]]
  th <- res$theta
  valid <- res$converged & !is.na(res$ci_lower[, 1])
  pnames <- c("k", paste0("beta", seq_along(th[-1])))

  for (j in seq_along(th)) {
    valid_j <- valid & !is.na(res$ci_lower[, j]) & !is.na(res$ci_upper[, j])
    covered <- (res$ci_lower[valid_j, j] <= th[j]) &
               (th[j] <= res$ci_upper[valid_j, j])
    width <- mean(res$ci_upper[valid_j, j] - res$ci_lower[valid_j, j])

    coverage_rows[[length(coverage_rows) + 1]] <- data.frame(
      Regime = sc_name,
      Parameter = pnames[j],
      Coverage = mean(covered),
      Mean_Width = width,
      stringsAsFactors = FALSE
    )
  }
}

cov_table <- do.call(rbind, coverage_rows)

knitr::kable(cov_table, digits = 3, row.names = FALSE,
             caption = "95% Wald CI Coverage by Shape Regime",
             col.names = c("Regime", "Parameter", "Coverage", "Mean Width"))
```

### Censoring rates

```{r mc-censoring-rates}
for (sc_name in names(results)) {
  res <- results[[sc_name]]
  conv_rate <- mean(res$converged)
  cens_rate <- mean(res$cens_fracs[res$converged])
  cat(sprintf("%s (k=%.1f): convergence=%.1f%%, mean censoring=%.1f%%\n",
              sc_name, res$theta[1], 100 * conv_rate, 100 * cens_rate))
}
```

### Sampling distribution visualization

```{r mc-sampling-dist, fig.width=8, fig.height=7}
oldpar <- par(mfrow = c(3, 4), mar = c(4, 3, 2, 1), oma = c(0, 0, 2, 0))
on.exit(par(oldpar))
pnames <- c("k", "beta1", "beta2", "beta3")

for (sc_name in names(results)) {
  res <- results[[sc_name]]
  th <- res$theta
  valid <- res$converged & !is.na(res$estimates[, 1])
  est_v <- res$estimates[valid, , drop = FALSE]

  for (j in seq_along(th)) {
    hist(est_v[, j], breaks = 20, probability = TRUE,
         main = paste0(sc_name, ": ", pnames[j]),
         xlab = pnames[j],
         col = "lightblue", border = "white", cex.main = 0.9)
    abline(v = th[j], col = "red", lwd = 2, lty = 2)
    abline(v = mean(est_v[, j]), col = "blue", lwd = 2)
  }
}

mtext("Sampling Distributions by Regime", outer = TRUE, cex = 1.2)
```

### Interpretation

The Monte Carlo study reveals several patterns:

- **Shape estimation accuracy varies by regime.** In absolute terms, the DFR
  regime ($k = 0.7$) has the smallest shape RMSE, while IFR ($k = 1.5$) has
  the best *relative* precision (RMSE/$k$). IFR has the widest absolute
  confidence intervals simply because the parameter value is largest; the
  relative CI width (width/$k$) is actually smallest for IFR. The steeper
  curvature of the IFR hazard function provides stronger signal about the
  shape parameter relative to its magnitude.

- **Scale estimation is robust across regimes.** Relative bias in the scale
  parameters is below 5% in all three regimes. Coverage is near the nominal
  95% for most parameters, though a few (e.g., CFR $\beta_2$) show
  undercoverage around 90%, which is borderline significant at $B = 100$
  replications. In the DFR regime, scale parameters exhibit positive bias
  that increases with the true scale value.

- **Periodic inspection adds interval-censoring.** Unlike the exponential
  vignette which used only exact + right observations, this study uses periodic
  inspection. The closed-form interval-censored contributions (unique to the
  homogeneous model) keep computation fast despite the additional complexity.


Weibull($k=1$) = Exponential Identity {#exponential-identity}
==============================================================

When $k = 1$, the homogeneous Weibull model reduces to the exponential model.
We verify this numerically by comparing log-likelihoods on the same data.

```{r exp-identity}
# Parameters: k=1 with scales equivalent to rates (1/beta_j)
exp_rates <- c(0.01, 0.008, 0.005)
wei_scales <- 1 / exp_rates  # beta = 1/lambda
wei_theta <- c(1, wei_scales)

# Generate data under exponential model
exp_model <- exp_series_md_c1_c2_c3()
exp_gen <- rdata(exp_model)

set.seed(42)
df_test <- exp_gen(exp_rates, n = 200, tau = 300, p = 0.3)

# Evaluate both log-likelihoods
ll_exp <- loglik(exp_model)
ll_wei <- loglik(model)

val_exp <- ll_exp(df_test, exp_rates)
val_wei <- ll_wei(df_test, wei_theta)

cat("Exponential loglik:", round(val_exp, 6), "\n")
cat("Weibull(k=1) loglik:", round(val_wei, 6), "\n")
cat("Absolute difference:", formatC(abs(val_exp - val_wei),
                                     format = "e", digits = 2), "\n")
```

The two log-likelihoods agree to machine precision, confirming that the
homogeneous Weibull model is a proper generalization of the exponential series
model. This also serves as a consistency check on the implementation.

```{r exp-identity-score}
# Score comparison
s_exp <- score(exp_model)(df_test, exp_rates)
s_wei <- score(model)(df_test, wei_theta)

# The exponential score is d/d(lambda_j), the Weibull score includes d/dk
# and d/d(beta_j). Since lambda_j = 1/beta_j, by the chain rule:
#   d(ell)/d(lambda_j) = d(ell)/d(beta_j) * d(beta_j)/d(lambda_j)
#                      = d(ell)/d(beta_j) * (-beta_j^2)
# So: d(ell)/d(lambda_j) = -beta_j^2 * d(ell)/d(beta_j)
s_wei_transformed <- -wei_scales^2 * s_wei[-1]

cat("Exponential score:        ", round(s_exp, 4), "\n")
cat("Weibull(k=1) scale score: ", round(s_wei_transformed, 4), "\n")
cat("Max |difference|:",
    formatC(max(abs(s_exp - s_wei_transformed)), format = "e", digits = 2), "\n")
```

The transformed Weibull scale scores match the exponential rate scores to
machine precision, confirming that the two parameterizations yield identical
inference at $k = 1$. The shape score $\partial\ell/\partial k$ is nonzero at
the true parameter (the score is zero only at the MLE, not at the DGP truth
for any finite sample).

```{r cleanup, include=FALSE}
options(old_opts)
```