---
title: "Masked Series Systems"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Masked Series Systems}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
```

## Competing Risks with Masked Failure Causes

A series system of $m$ components fails when any component fails.
We observe the system failure time $t$, but we may not know *which*
component caused the failure. Instead, we observe a **candidate set**
$C \subseteq \{1, \ldots, m\}$ of components that could have been
responsible.

Under standard masking conditions (C1: the true cause is always in
the candidate set; C2: masking is symmetric across candidates; C3:
masking is independent of system parameters), the log-likelihood
contribution for an exact observation with candidate set $C_i$ under
independent exponential components with rates
$\lambda_1, \ldots, \lambda_m$ is:

$$
\ell_i = \log\!\Bigl(\sum_{j \in C_i} \lambda_j\Bigr)
       - \Bigl(\sum_{j=1}^{m} \lambda_j\Bigr) \, t_i
$$

Right-censored observations (system survived past $t$) contribute
only the survival term:

$$
\ell_i = -\Bigl(\sum_{j=1}^{m} \lambda_j\Bigr) \, t_i
$$

These are just two different log-likelihood functions sharing the same
parameters, which is exactly what `likelihood.contr` composes.

## Building the Model

```{r model}
library(likelihood.contr)
library(likelihood.model)

# m = 3 components. Candidate set columns: x1, x2, x3 (logical).
m <- 3

masked_exact <- contr_fn(
  loglik = function(df, par, ...) {
    C <- as.matrix(df[, paste0("x", seq_len(m))])
    lambda_c <- rowSums(sweep(C, 2, par, `*`))
    lambda_sys <- sum(par)
    sum(log(lambda_c) - lambda_sys * df$t)
  },
  score = function(df, par, ...) {
    C <- as.matrix(df[, paste0("x", seq_len(m))])
    lambda_c <- rowSums(sweep(C, 2, par, `*`))
    # d/d(lambda_j): sum(C[i,j] / lambda_c[i]) - n * t_bar
    colSums(C / lambda_c) - sum(df$t)
  }
)

masked_right <- contr_fn(
  loglik = function(df, par, ...) {
    -sum(par) * sum(df$t)
  },
  score = function(df, par, ...) {
    rep(-sum(df$t), m)
  }
)

model <- likelihood_contr(
  obs_type = "omega",
  exact = masked_exact,
  right = masked_right,
  assumptions = c(
    "independent exponential components",
    "series system",
    "C1: true cause always in candidate set",
    "C2: symmetric masking",
    "C3: masking independent of parameters"
  )
)
model
```

## Simulating Masked Data

```{r simulate}
set.seed(42)
n <- 300
true_rates <- c(1.0, 0.5, 0.3)
censor_time <- 2.0
mask_prob <- 0.4  # probability a non-failed component enters candidate set

# Generate component lifetimes and system lifetime
comp_times <- matrix(rexp(n * m, rate = rep(true_rates, each = n)), n, m)
sys_times <- apply(comp_times, 1, min)
failed_comp <- apply(comp_times, 1, which.min)

# Apply right-censoring
obs_times <- pmin(sys_times, censor_time)
omega <- ifelse(sys_times <= censor_time, "exact", "right")

# Generate candidate sets satisfying C1/C2/C3
C <- matrix(FALSE, n, m)
for (i in seq_len(n)) {
  if (omega[i] == "exact") {
    C[i, failed_comp[i]] <- TRUE                        # C1
    others <- setdiff(seq_len(m), failed_comp[i])
    C[i, others] <- runif(length(others)) < mask_prob   # C2/C3
  }
  # right-censored: candidate set stays empty
}

df <- data.frame(t = obs_times, omega = omega, C)
colnames(df)[3:(m + 2)] <- paste0("x", seq_len(m))

cat("Exact:", sum(omega == "exact"),
    " Right-censored:", sum(omega == "right"), "\n")
head(df)
```

## Fitting

```{r fit}
result <- fit(model)(df, par = c(0.5, 0.5, 0.5))
summary(result)
```

```{r compare}
cat("True rates: ", paste(true_rates, collapse = ", "), "\n")
cat("Estimated:  ", paste(round(coef(result), 3), collapse = ", "), "\n")
```

## Why This Works

The key insight is that `likelihood.contr` does not care what the
log-likelihood *means*, only that each observation type contributes a
function `f(df, par) -> scalar`. The masked series system model is just
two contributions (exact with candidate sets, right-censored without)
sharing the same rate parameters. The candidate set information lives in
the data frame columns, and each contribution function reads what it
needs.

This same pattern extends to left-censored and interval-censored
masked data, Weibull components, or any parametric family where you can
write the log-likelihood contribution as a function of the data and
parameters.
