| SOCR ≫ | DSPA ≫ | DSPA2 Topics ≫ |
This notebook contains a self-contained simulation pipeline for
validating the regularization commutator framework in linear and
logistic regression. This V6 Rmd version addresses several
issues with previous versions, expands DSPA_Appendix_25_StatisticalCommutator_v1,
and reflects R1 commutator paper expansion.
The main V6 improvements include:
The naive influence commutator is treated as identically degenerate because \(H_\lambda^{-1}A + H_\lambda^{-1}(\lambda P)=I\), which is never used as a tuning objective.
The implemented Type I and Type II commutators are
\[C_I(\lambda) = [A, H_\lambda^{-1}P], \qquad C_{II}(\lambda) = [P, H_\lambda^{-1}A].\]
The balanced score is maximized directly
(V6 no longer minimizes the reciprocal of the
score).
The effective degrees-of-freedom balance is adjusted for rank-deficient penalties. If \(r=\dim\ker(P)\), then
\[B_r(\lambda)=\{\operatorname{df}(\lambda)-r\}\{p-\operatorname{df}(\lambda)\}, \qquad \operatorname{df}(\lambda)=\operatorname{tr}(H_\lambda^{-1}A).\]
Oracle, cross-validation, and test evaluation are all computed on a grid that includes \(\lambda=0\), and failed or flat objectives fall back to \(\lambda=0\), not to the middle of the grid.
Logistic regression uses a stable penalized IRLS algorithm with a proper penalized-objective line search.
Summary tables use a corrected median/MAD aggregation routine, avoiding the previous interleaving bug that mislabeled median and MAD columns.
The sparse ridge case \(P=I\) is retained as a negative control. Since \(P\) commutes with every \(A\), both commutator scores should be flat or numerically zero.
# -----------------------------------------------------------------------------
# Helpers
# -----------------------------------------------------------------------------
`%||%` <- function(x, y) if (is.null(x)) y else x
as_numeric_matrix <- function(x) {
if (inherits(x, "Matrix")) as.matrix(x) else x
}
symmetrize <- function(M) {
Matrix::forceSymmetric((M + t(M)) / 2)
}
# -----------------------------------------------------------------------------
# Numerically stable solve for symmetric positive definite / semidefinite systems.
# Escalates diagonal jitter before falling back to a Moore-Penrose inverse.
# -----------------------------------------------------------------------------
stable_solve_spd <- function(M, b = NULL, jitter = 1e-8, max_tries = 6) {
M <- Matrix::forceSymmetric(M)
p <- nrow(M)
diag_scale <- mean(abs(Matrix::diag(M)))
if (!is.finite(diag_scale) || diag_scale <= 0) diag_scale <- 1
for (k in 0:max_tries) {
jit <- jitter * (10^k) * diag_scale
Mj <- M + jit * Matrix::Diagonal(n = p)
cholM <- tryCatch(
Matrix::Cholesky(Mj, LDL = FALSE, perm = TRUE),
error = function(e) NULL
)
if (!is.null(cholM)) {
if (is.null(b)) return(Matrix::solve(cholM, Matrix::Diagonal(n = p)))
return(Matrix::solve(cholM, b))
}
}
Minv <- MASS::ginv(as.matrix(M))
if (is.null(b)) return(Minv)
Minv %*% as.matrix(b)
}
# -----------------------------------------------------------------------------
# Penalty nullity and adjusted balance factor.
# -----------------------------------------------------------------------------
penalty_nullity <- function(P, tol = 1e-8) {
ev <- eigen(as.matrix(Matrix::forceSymmetric(P)), symmetric = TRUE,
only.values = TRUE)$values
scale <- max(1, max(abs(ev), na.rm = TRUE))
sum(ev <= tol * scale)
}
balance_factor <- function(eff_df, p, nullity = 0) {
if (!is.finite(eff_df)) return(NA_real_)
nullity <- max(0, min(p - 1, as.integer(round(nullity))))
df_clip <- min(p, max(nullity, Re(eff_df)))
(df_clip - nullity) * (p - df_clip)
}
scale_penalty_to_A <- function(P, A, eps = 1e-12) {
P <- Matrix::forceSymmetric(P)
scale <- as.numeric(Matrix::norm(A, "F")) / max(eps, as.numeric(Matrix::norm(P, "F")))
P * scale
}
# -----------------------------------------------------------------------------
# Safe grid selection.
# Returns a status flag instead of silently choosing a meaningless grid point.
# -----------------------------------------------------------------------------
select_lambda_grid <- function(vals, grid_vals, maximize = FALSE,
fallback = 0, tie_tol = 1e-10,
flat_tol = 1e-12) {
stopifnot(length(vals) == length(grid_vals))
ok <- is.finite(vals) & is.finite(grid_vals)
if (!any(ok)) {
return(list(lambda = fallback, value = NA_real_, index = NA_integer_,
status = "all_nonfinite"))
}
v <- vals[ok]
g <- grid_vals[ok]
original_idx <- which(ok)
spread <- max(v) - min(v)
scale <- max(1, max(abs(v)))
if (!is.finite(spread) || spread <= flat_tol * scale) {
return(list(lambda = fallback, value = v[1], index = original_idx[1],
status = "flat_objective"))
}
best <- if (maximize) max(v) else min(v)
tol_abs <- tie_tol * max(1, abs(best))
tie_idx <- if (maximize) which(v >= best - tol_abs) else which(v <= best + tol_abs)
j <- tie_idx[1]
list(lambda = g[j], value = v[j], index = original_idx[j], status = "ok")
}
lambda_index <- function(lambda, grid_vals, tol = 1e-12) {
d <- abs(grid_vals - lambda)
j <- which.min(d)
if (!length(j) || !is.finite(d[j]) || d[j] > tol * max(1, abs(lambda))) {
return(NA_integer_)
}
j
}
metric_at_lambda <- function(metric, lambda, grid_vals) {
j <- lambda_index(lambda, grid_vals)
if (is.na(j)) return(NA_real_)
metric[j]
}# -----------------------------------------------------------------------------
# Penalty matrices
# -----------------------------------------------------------------------------
make_diff_penalty <- function(p, order = 1) {
if (p < 2) stop("p must be at least 2")
D <- Matrix::Matrix(0, nrow = p - 1, ncol = p, sparse = TRUE)
for (i in seq_len(p - 1)) {
D[i, i] <- -1
D[i, i + 1] <- 1
}
P <- Matrix::crossprod(D)
if (order <= 1) return(Matrix::forceSymmetric(P))
for (k in 2:order) P <- Matrix::crossprod(D %*% P)
Matrix::forceSymmetric(P)
}
make_complete_graph_laplacian <- function(groups, p) {
P <- Matrix::Matrix(0, nrow = p, ncol = p, sparse = FALSE)
for (idx in groups) {
m <- length(idx)
if (m <= 1) next
L <- diag(m) * m - matrix(1, m, m)
P[idx, idx] <- P[idx, idx] + L
}
Matrix::forceSymmetric(P)
}
make_groups <- function(p, n_groups) {
split(seq_len(p), rep(seq_len(n_groups), length.out = p))
}
# -----------------------------------------------------------------------------
# Simulation scenarios.
# Sparse with P=I is intentionally a negative control for the commutator score.
# -----------------------------------------------------------------------------
generate_dataset <- function(scenario = "toeplitz", n = 200, p = 50,
type = c("linear", "logistic"),
signal_scale = 1, seed = NULL,
linear_noise_sd = 1,
logistic_max_eta_sd = 2.5,
block_penalty = c("laplacian", "covariance")) {
type <- match.arg(type)
block_penalty <- match.arg(block_penalty)
if (!is.null(seed)) set.seed(seed)
scenario <- tolower(scenario)
if (scenario == "toeplitz") {
rho <- 0.5
Sigma <- toeplitz(rho^(0:(p - 1)))
X <- MASS::mvrnorm(n, mu = rep(0, p), Sigma = Sigma)
beta_true <- numeric(p)
beta_true[5:min(10, p)] <- 2
if (p >= 25) beta_true[20:25] <- -1.5
if (p >= 40) beta_true[35:40] <- 1
P <- make_diff_penalty(p, order = 1)
} else if (scenario == "block") {
n_blocks <- 5
groups <- make_groups(p, n_blocks)
Sigma <- matrix(0.2, p, p)
for (idx in groups) Sigma[idx, idx] <- 0.7
diag(Sigma) <- 1
X <- MASS::mvrnorm(n, mu = rep(0, p), Sigma = Sigma)
beta_true <- numeric(p)
active <- sample(seq_len(p), min(10, p))
beta_true[active] <- rnorm(length(active), mean = 2, sd = 0.5)
P <- if (block_penalty == "laplacian") {
make_complete_graph_laplacian(groups, p)
} else {
Matrix::forceSymmetric(Matrix::Matrix(Sigma, sparse = FALSE))
}
} else if (scenario == "sparse") {
X <- matrix(rnorm(n * p), n, p)
beta_true <- numeric(p)
active <- sample(seq_len(p), min(5, p))
beta_true[active] <- runif(length(active), 2, 4) *
sample(c(-1, 1), length(active), replace = TRUE)
P <- Matrix::Diagonal(p)
} else if (scenario == "grouped") {
n_groups <- 10
groups <- make_groups(p, n_groups)
X <- matrix(0, n, p)
for (idx in groups) {
factor <- rnorm(n)
X[, idx] <- matrix(factor, n, length(idx)) +
matrix(rnorm(n * length(idx), sd = 0.5), n, length(idx))
}
beta_true <- numeric(p)
active_groups <- sample(seq_along(groups), min(3, length(groups)))
for (g in active_groups) {
idx <- groups[[g]]
beta_true[idx] <- rnorm(length(idx), mean = 2, sd = 0.5)
}
P <- make_complete_graph_laplacian(groups, p)
} else {
stop("Unknown scenario: ", scenario)
}
beta_true <- beta_true * signal_scale
if (type == "logistic" && is.finite(logistic_max_eta_sd)) {
eta0 <- as.vector(X %*% beta_true)
s_eta <- stats::sd(eta0)
if (is.finite(s_eta) && s_eta > logistic_max_eta_sd && s_eta > 0) {
beta_true <- beta_true * (logistic_max_eta_sd / s_eta)
}
}
if (type == "linear") {
y <- as.vector(X %*% beta_true + rnorm(n, sd = linear_noise_sd))
} else {
prob <- stats::plogis(as.vector(X %*% beta_true))
y <- stats::rbinom(n, size = 1, prob = prob)
}
list(X = X, y = y, beta_true = beta_true, P = P,
scenario = scenario, type = type)
}
preprocess_train_test <- function(X_train, X_test, y_train, y_test, type) {
x_center <- colMeans(X_train)
x_scale <- apply(X_train, 2, stats::sd)
x_scale[!is.finite(x_scale) | x_scale < 1e-8] <- 1
X_train_s <- sweep(sweep(X_train, 2, x_center, "-"), 2, x_scale, "/")
X_test_s <- sweep(sweep(X_test, 2, x_center, "-"), 2, x_scale, "/")
if (type == "linear") {
y_center <- mean(y_train)
y_train_s <- y_train - y_center
y_test_s <- y_test - y_center
} else {
y_center <- 0
y_train_s <- as.integer(y_train)
y_test_s <- as.integer(y_test)
}
list(X_train = X_train_s, X_test = X_test_s,
y_train = y_train_s, y_test = y_test_s,
x_center = x_center, x_scale = x_scale, y_center = y_center)
}# -----------------------------------------------------------------------------
# Linear model fitting
# -----------------------------------------------------------------------------
fit_linear_penalized <- function(X, y, P, lambda, jitter = 1e-8) {
n <- nrow(X)
A <- Matrix::crossprod(X) / n
b <- Matrix::crossprod(X, y) / n
beta <- stable_solve_spd(A + lambda * P, b, jitter = jitter)
as.vector(beta)
}
# -----------------------------------------------------------------------------
# AUC and log-loss
# -----------------------------------------------------------------------------
auc_fast <- function(y_true, p_hat) {
y_true <- as.integer(y_true)
p_hat <- as.vector(p_hat)
n_pos <- sum(y_true == 1L)
n_neg <- sum(y_true == 0L)
if (n_pos == 0L || n_neg == 0L) return(NA_real_)
r <- rank(p_hat, ties.method = "average")
(sum(r[y_true == 1L]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}
binomial_logloss <- function(y, prob, eps = 1e-12) {
prob <- pmin(pmax(as.vector(prob), eps), 1 - eps)
-mean(y * log(prob) + (1 - y) * log(1 - prob))
}
# -----------------------------------------------------------------------------
# Stable penalized logistic IRLS.
# Objective: mean negative log-likelihood + lambda/2 * beta'P beta.
# -----------------------------------------------------------------------------
logistic_penalized_objective <- function(beta, X, y, P, lambda) {
eta <- as.vector(X %*% beta)
nll <- mean(pmax(eta, 0) - y * eta + log1p(exp(-abs(eta))))
pen <- 0.5 * lambda * as.numeric(Matrix::crossprod(beta, P %*% beta))
nll + pen
}
irls_penalized_logistic <- function(X, y, P, lambda, beta_init = NULL,
max_iters = 100, tol = 1e-7,
clip_eta = 30, min_w = 1e-6,
jitter = 1e-8, max_half_steps = 25) {
n <- nrow(X)
p <- ncol(X)
beta <- if (is.null(beta_init)) rep(0, p) else as.vector(beta_init)
obj <- logistic_penalized_objective(beta, X, y, P, lambda)
converged <- FALSE
last_step_factor <- NA_real_
it_used <- 0L
for (it in seq_len(max_iters)) {
it_used <- it
eta <- as.vector(X %*% beta)
eta_clip <- pmax(pmin(eta, clip_eta), -clip_eta)
mu <- stats::plogis(eta_clip)
Wv <- pmax(mu * (1 - mu), min_w)
z <- eta + (y - mu) / Wv
W <- Matrix::Diagonal(n = n, x = Wv)
A <- Matrix::crossprod(X, W %*% X) / n
b <- Matrix::crossprod(X, W %*% z) / n
H <- A + lambda * P
beta_full <- tryCatch(
as.vector(stable_solve_spd(H, b, jitter = jitter)),
error = function(e) rep(NA_real_, p)
)
if (!all(is.finite(beta_full))) break
direction <- beta_full - beta
if (max(abs(direction)) < tol * (1 + max(abs(beta)))) {
beta <- beta_full
converged <- TRUE
break
}
accepted <- FALSE
step_factor <- 1
beta_try <- beta
obj_try <- obj
for (hs in 0:max_half_steps) {
beta_try <- beta + step_factor * direction
obj_try <- logistic_penalized_objective(beta_try, X, y, P, lambda)
if (is.finite(obj_try) && obj_try <= obj + 1e-10) {
accepted <- TRUE
break
}
step_factor <- step_factor / 2
}
last_step_factor <- step_factor
if (!accepted) break
if (max(abs(beta_try - beta)) < tol * (1 + max(abs(beta)))) {
beta <- beta_try
obj <- obj_try
converged <- TRUE
break
}
beta <- beta_try
obj <- obj_try
}
list(beta = beta, converged = converged, iters = it_used,
objective = obj, last_step_factor = last_step_factor)
}# -----------------------------------------------------------------------------
# Compute Type I and Type II scores from A and P.
# -----------------------------------------------------------------------------
commutator_from_curvature <- function(A, P, lambda, nullity = NULL,
jitter = 1e-8) {
p <- ncol(A)
if (is.null(nullity)) nullity <- penalty_nullity(P)
H <- A + lambda * P
HinvA <- stable_solve_spd(H, A, jitter = jitter)
HinvP <- stable_solve_spd(H, P, jitter = jitter)
eff_df <- Re(sum(Matrix::diag(HinvA)))
balance <- balance_factor(eff_df, p, nullity = nullity)
C_I <- A %*% HinvP - HinvP %*% A
C_II <- P %*% HinvA - HinvA %*% P
mis_I <- as.numeric(Matrix::norm(C_I, "F")^2)
mis_II <- as.numeric(Matrix::norm(C_II, "F")^2)
list(score_I = mis_I * balance,
score_II = mis_II * balance,
mis_I = mis_I,
mis_II = mis_II,
eff_df = eff_df,
balance = balance,
nullity = nullity)
}
compute_commutator_scores_linear <- function(lambda_grid_eval, X, P,
nullity = NULL, jitter = 1e-8) {
A <- Matrix::crossprod(X) / nrow(X)
if (is.null(nullity)) nullity <- penalty_nullity(P)
out <- data.frame(
lambda = lambda_grid_eval,
score_I = NA_real_, score_II = NA_real_,
mis_I = NA_real_, mis_II = NA_real_,
eff_df = NA_real_, balance = NA_real_,
converged = TRUE,
stringsAsFactors = FALSE
)
for (j in seq_along(lambda_grid_eval)) {
res <- tryCatch(
commutator_from_curvature(A, P, lambda_grid_eval[j], nullity = nullity,
jitter = jitter),
error = function(e) NULL
)
if (!is.null(res)) {
out$score_I[j] <- res$score_I
out$score_II[j] <- res$score_II
out$mis_I[j] <- res$mis_I
out$mis_II[j] <- res$mis_II
out$eff_df[j] <- res$eff_df
out$balance[j] <- res$balance
}
}
out
}
compute_commutator_scores_logistic <- function(lambda_grid_eval, X, y, P,
nullity = NULL, jitter = 1e-8) {
if (is.null(nullity)) nullity <- penalty_nullity(P)
out <- data.frame(
lambda = lambda_grid_eval,
score_I = NA_real_, score_II = NA_real_,
mis_I = NA_real_, mis_II = NA_real_,
eff_df = NA_real_, balance = NA_real_,
converged = FALSE,
iters = NA_integer_,
stringsAsFactors = FALSE
)
ord <- order(lambda_grid_eval, decreasing = TRUE)
beta_ws <- NULL
for (j in ord) {
lam <- lambda_grid_eval[j]
fit <- tryCatch(
irls_penalized_logistic(X, y, P, lam, beta_init = beta_ws, jitter = jitter),
error = function(e) list(beta = rep(NA_real_, ncol(X)), converged = FALSE,
iters = NA_integer_)
)
out$converged[j] <- isTRUE(fit$converged)
out$iters[j] <- fit$iters %||% NA_integer_
if (!all(is.finite(fit$beta))) next
beta_ws <- fit$beta
eta <- as.vector(X %*% fit$beta)
mu <- stats::plogis(pmax(pmin(eta, 30), -30))
Wv <- pmax(mu * (1 - mu), 1e-6)
A <- Matrix::crossprod(X, Matrix::Diagonal(n = nrow(X), x = Wv) %*% X) / nrow(X)
res <- tryCatch(
commutator_from_curvature(A, P, lam, nullity = nullity, jitter = jitter),
error = function(e) NULL
)
if (!is.null(res)) {
out$score_I[j] <- res$score_I
out$score_II[j] <- res$score_II
out$mis_I[j] <- res$mis_I
out$mis_II[j] <- res$mis_II
out$eff_df[j] <- res$eff_df
out$balance[j] <- res$balance
}
}
out
}# -----------------------------------------------------------------------------
# Test-set metric scans. These are used both to define the oracle and to evaluate
# the selected lambda values, ensuring oracle dominance by construction.
# -----------------------------------------------------------------------------
scan_test_metrics_linear <- function(lambda_grid_eval, X_train, y_train,
X_test, y_test, P, jitter = 1e-8) {
n <- nrow(X_train)
A <- Matrix::crossprod(X_train) / n
b <- Matrix::crossprod(X_train, y_train) / n
mse <- rep(NA_real_, length(lambda_grid_eval))
for (j in seq_along(lambda_grid_eval)) {
lam <- lambda_grid_eval[j]
beta <- tryCatch(
as.vector(stable_solve_spd(A + lam * P, b, jitter = jitter)),
error = function(e) rep(NA_real_, ncol(X_train))
)
if (all(is.finite(beta))) {
mse[j] <- mean((y_test - as.vector(X_test %*% beta))^2)
}
}
data.frame(lambda = lambda_grid_eval, MSE = mse, stringsAsFactors = FALSE)
}
scan_test_metrics_logistic <- function(lambda_grid_eval, X_train, y_train,
X_test, y_test, P, jitter = 1e-8) {
auc <- rep(NA_real_, length(lambda_grid_eval))
logloss <- rep(NA_real_, length(lambda_grid_eval))
converged <- rep(FALSE, length(lambda_grid_eval))
iters <- rep(NA_integer_, length(lambda_grid_eval))
ord <- order(lambda_grid_eval, decreasing = TRUE)
beta_ws <- NULL
for (j in ord) {
lam <- lambda_grid_eval[j]
fit <- tryCatch(
irls_penalized_logistic(X_train, y_train, P, lam,
beta_init = beta_ws, jitter = jitter),
error = function(e) list(beta = rep(NA_real_, ncol(X_train)),
converged = FALSE, iters = NA_integer_)
)
converged[j] <- isTRUE(fit$converged)
iters[j] <- fit$iters %||% NA_integer_
if (!all(is.finite(fit$beta))) next
beta_ws <- fit$beta
prob <- stats::plogis(pmax(pmin(as.vector(X_test %*% fit$beta), 30), -30))
auc[j] <- auc_fast(y_test, prob)
logloss[j] <- binomial_logloss(y_test, prob)
}
data.frame(lambda = lambda_grid_eval, AUC = auc, LogLoss = logloss,
converged = converged, iters = iters, stringsAsFactors = FALSE)
}
# -----------------------------------------------------------------------------
# Cross-validation objectives.
# Linear CV minimizes validation MSE.
# Logistic CV minimizes validation log-loss.
# -----------------------------------------------------------------------------
make_folds <- function(n, n_folds = 5) {
sample(rep(seq_len(n_folds), length.out = n))
}
cv_linear <- function(lambda_grid_eval, X, y, P, n_folds = 5, jitter = 1e-8) {
folds <- make_folds(nrow(X), n_folds)
scores <- rep(NA_real_, length(lambda_grid_eval))
for (j in seq_along(lambda_grid_eval)) {
lam <- lambda_grid_eval[j]
fold_err <- rep(NA_real_, n_folds)
for (f in seq_len(n_folds)) {
idx_tr <- which(folds != f)
idx_val <- which(folds == f)
Xtr <- X[idx_tr, , drop = FALSE]
ytr <- y[idx_tr]
Xval <- X[idx_val, , drop = FALSE]
yval <- y[idx_val]
beta <- tryCatch(
fit_linear_penalized(Xtr, ytr, P, lam, jitter = jitter),
error = function(e) rep(NA_real_, ncol(X))
)
if (all(is.finite(beta))) {
fold_err[f] <- mean((yval - as.vector(Xval %*% beta))^2)
}
}
scores[j] <- mean(fold_err, na.rm = TRUE)
}
scores
}
cv_logistic <- function(lambda_grid_eval, X, y, P, n_folds = 5, jitter = 1e-8) {
folds <- make_folds(nrow(X), n_folds)
fold_scores <- matrix(NA_real_, nrow = length(lambda_grid_eval), ncol = n_folds)
for (f in seq_len(n_folds)) {
idx_tr <- which(folds != f)
idx_val <- which(folds == f)
Xtr <- X[idx_tr, , drop = FALSE]
ytr <- y[idx_tr]
Xval <- X[idx_val, , drop = FALSE]
yval <- y[idx_val]
ord <- order(lambda_grid_eval, decreasing = TRUE)
beta_ws <- NULL
for (j in ord) {
lam <- lambda_grid_eval[j]
fit <- tryCatch(
irls_penalized_logistic(Xtr, ytr, P, lam,
beta_init = beta_ws, jitter = jitter),
error = function(e) list(beta = rep(NA_real_, ncol(X)), converged = FALSE)
)
if (!all(is.finite(fit$beta))) next
beta_ws <- fit$beta
prob <- stats::plogis(pmax(pmin(as.vector(Xval %*% fit$beta), 30), -30))
fold_scores[j, f] <- binomial_logloss(yval, prob)
}
}
rowMeans(fold_scores, na.rm = TRUE)
}run_one_replication <- function(scenario, rep_id, type = c("linear", "logistic"),
n = 200, p = 50, lambda_grid_eval,
n_folds = 5, seed_base = 1,
linear_signal_scale = 1,
logistic_signal_scale = 0.5,
logistic_max_eta_sd = 2.5,
linear_noise_sd = 1,
block_penalty = "laplacian") {
type <- match.arg(type)
signal_scale <- if (type == "linear") linear_signal_scale else logistic_signal_scale
data_seed <- seed_base + rep_id * 1000 + match(scenario, scenarios) * 100000 +
ifelse(type == "linear", 0, 5000000)
dat <- generate_dataset(
scenario = scenario, n = n, p = p, type = type,
signal_scale = signal_scale, seed = data_seed,
linear_noise_sd = linear_noise_sd,
logistic_max_eta_sd = logistic_max_eta_sd,
block_penalty = block_penalty
)
set.seed(data_seed + 17)
n_test <- round(0.30 * n)
idx_test <- sample(seq_len(n), n_test)
idx_train <- setdiff(seq_len(n), idx_test)
prep <- preprocess_train_test(
dat$X[idx_train, , drop = FALSE], dat$X[idx_test, , drop = FALSE],
dat$y[idx_train], dat$y[idx_test], type = type
)
X_train <- prep$X_train
X_test <- prep$X_test
y_train <- prep$y_train
y_test <- prep$y_test
A_train <- Matrix::crossprod(X_train) / nrow(X_train)
P <- scale_penalty_to_A(dat$P, A_train)
nullity_P <- penalty_nullity(P)
# Test scan: used for oracle and all selected-method evaluations.
if (type == "linear") {
test_scan <- scan_test_metrics_linear(lambda_grid_eval, X_train, y_train,
X_test, y_test, P)
oracle_sel <- select_lambda_grid(test_scan$MSE, lambda_grid_eval,
maximize = FALSE, fallback = 0)
cv_scores <- cv_linear(lambda_grid_eval, X_train, y_train, P, n_folds = n_folds)
cv_sel <- select_lambda_grid(cv_scores, lambda_grid_eval,
maximize = FALSE, fallback = 0)
comm_scan <- compute_commutator_scores_linear(lambda_grid_eval, X_train, P,
nullity = nullity_P)
metric_vec <- test_scan$MSE
metric_name <- "MSE"
} else {
test_scan <- scan_test_metrics_logistic(lambda_grid_eval, X_train, y_train,
X_test, y_test, P)
oracle_sel <- select_lambda_grid(test_scan$AUC, lambda_grid_eval,
maximize = TRUE, fallback = 0)
cv_scores <- cv_logistic(lambda_grid_eval, X_train, y_train, P, n_folds = n_folds)
cv_sel <- select_lambda_grid(cv_scores, lambda_grid_eval,
maximize = FALSE, fallback = 0)
comm_scan <- compute_commutator_scores_logistic(lambda_grid_eval, X_train, y_train, P,
nullity = nullity_P)
metric_vec <- test_scan$AUC
metric_name <- "AUC"
}
comm_I_sel <- select_lambda_grid(comm_scan$score_I, lambda_grid_eval,
maximize = TRUE, fallback = 0)
comm_II_sel <- select_lambda_grid(comm_scan$score_II, lambda_grid_eval,
maximize = TRUE, fallback = 0)
lambda_unreg <- 0
lambda_oracle <- oracle_sel$lambda
lambda_cv <- cv_sel$lambda
lambda_comm_I <- comm_I_sel$lambda
lambda_comm_II <- comm_II_sel$lambda
metric_unreg <- metric_at_lambda(metric_vec, lambda_unreg, lambda_grid_eval)
metric_oracle <- metric_at_lambda(metric_vec, lambda_oracle, lambda_grid_eval)
metric_cv <- metric_at_lambda(metric_vec, lambda_cv, lambda_grid_eval)
metric_comm_I <- metric_at_lambda(metric_vec, lambda_comm_I, lambda_grid_eval)
metric_comm_II <- metric_at_lambda(metric_vec, lambda_comm_II, lambda_grid_eval)
method_metrics <- c(metric_unreg, metric_cv, metric_comm_I, metric_comm_II)
oracle_ok <- if (type == "linear") {
is.finite(metric_oracle) && metric_oracle <= min(method_metrics, na.rm = TRUE) + 1e-8
} else {
is.finite(metric_oracle) && metric_oracle >= max(method_metrics, na.rm = TRUE) - 1e-8
}
data.frame(
Model = type,
Scenario = scenario,
Rep = rep_id,
NTrain = nrow(X_train),
NTest = nrow(X_test),
PDim = ncol(X_train),
PenaltyNullity = nullity_P,
Lambda_Unreg = lambda_unreg,
Lambda_Oracle = lambda_oracle,
Lambda_CV = lambda_cv,
Lambda_CommI = lambda_comm_I,
Lambda_CommII = lambda_comm_II,
Status_Oracle = oracle_sel$status,
Status_CV = cv_sel$status,
Status_CommI = comm_I_sel$status,
Status_CommII = comm_II_sel$status,
Objective_Oracle = oracle_sel$value,
Objective_CV = cv_sel$value,
Score_CommI = comm_I_sel$value,
Score_CommII = comm_II_sel$value,
MetricName = metric_name,
Metric_Unreg = metric_unreg,
Metric_Oracle = metric_oracle,
Metric_CV = metric_cv,
Metric_CommI = metric_comm_I,
Metric_CommII = metric_comm_II,
OracleDominanceOK = oracle_ok,
stringsAsFactors = FALSE
)
}
run_experiments <- function(type = c("linear", "logistic"), scenarios,
n_reps, n, p, lambda_grid_eval, n_folds,
seed_base, use_parallel = TRUE) {
type <- match.arg(type)
task_grid <- expand.grid(Scenario = scenarios, Rep = seq_len(n_reps),
stringsAsFactors = FALSE)
runner <- function(i) {
run_one_replication(
scenario = task_grid$Scenario[i], rep_id = task_grid$Rep[i], type = type,
n = n, p = p, lambda_grid_eval = lambda_grid_eval, n_folds = n_folds,
seed_base = seed_base,
linear_signal_scale = params$linear_signal_scale,
logistic_signal_scale = params$logistic_signal_scale,
logistic_max_eta_sd = params$logistic_max_eta_sd,
linear_noise_sd = params$linear_noise_sd,
block_penalty = params$block_penalty
)
}
message("Running ", nrow(task_grid), " ", type, " tasks.")
if (isTRUE(use_parallel) && has_future) {
res <- future.apply::future_lapply(
seq_len(nrow(task_grid)), runner,
future.seed = TRUE,
future.packages = c("Matrix", "MASS", "stats")
)
} else {
res <- lapply(seq_len(nrow(task_grid)), runner)
}
do.call(rbind, res)
}The default settings run \(100\)
replications per scenario for both models. For a quick smoke test, knit
with a smaller params$n_reps, for example \(5\) or \(10\).
robust_by_scenario <- function(df, raw_cols) {
pieces <- lapply(split(df, df$Scenario), function(d) {
out <- list(Scenario = d$Scenario[1], N = nrow(d))
for (cl in raw_cols) {
out[[paste0(cl, ".median")]] <- median(d[[cl]], na.rm = TRUE)
out[[paste0(cl, ".mad")]] <- stats::mad(d[[cl]], constant = 1, na.rm = TRUE)
}
as.data.frame(out, check.names = FALSE)
})
res <- do.call(rbind, pieces)
rownames(res) <- NULL
res
}
check_oracle_dominance <- function(df, type = c("linear", "logistic"), tol = 1e-8) {
type <- match.arg(type)
method_cols <- intersect(c("Metric_Unreg", "Metric_CV", "Metric_CommI", "Metric_CommII"), names(df))
if (!length(method_cols) || !"Metric_Oracle" %in% names(df)) return(integer(0))
M <- as.matrix(df[, method_cols, drop = FALSE])
if (type == "linear") {
best_method <- apply(M, 1, min, na.rm = TRUE)
which(df$Metric_Oracle > best_method + tol)
} else {
best_method <- apply(M, 1, max, na.rm = TRUE)
which(df$Metric_Oracle < best_method - tol)
}
}
selection_status_summary <- function(df) {
status_cols <- intersect(c("Status_Oracle", "Status_CV", "Status_CommI", "Status_CommII"), names(df))
pieces <- list()
for (sc in unique(df$Scenario)) {
d <- df[df$Scenario == sc, , drop = FALSE]
for (cl in status_cols) {
method <- sub("^Status_", "", cl)
tab <- table(factor(d[[cl]], levels = c("ok", "flat_objective", "all_nonfinite")))
pieces[[paste(sc, method, sep = "_")]] <- data.frame(
Scenario = sc,
Method = method,
OK = unname(tab["ok"]),
Flat = unname(tab["flat_objective"]),
AllNonfinite = unname(tab["all_nonfinite"]),
stringsAsFactors = FALSE
)
}
}
res <- do.call(rbind, pieces)
rownames(res) <- NULL
res
}
analyze_results <- function(df, type = c("linear", "logistic"),
extreme_mse_factor = 10) {
type <- match.arg(type)
n_before <- nrow(df)
df <- df[is.finite(df$Metric_Oracle), , drop = FALSE]
n_after <- nrow(df)
cat(sprintf("Rows with finite oracle: %d / %d (%.1f%%)\n",
n_after, n_before, 100 * n_after / max(1, n_before)))
if (n_after == 0) stop("No finite oracle results.")
bad_oracle <- check_oracle_dominance(df, type = type)
cat(sprintf("Oracle-dominance violations: %d\n", length(bad_oracle)))
if (type == "linear") {
med_oracle <- median(df$Metric_Oracle, na.rm = TRUE)
df$Extreme <- df$Metric_Oracle > extreme_mse_factor * med_oracle
} else {
df$Extreme <- FALSE
}
cat(sprintf("Extreme rows flagged: %d\n", sum(df$Extreme, na.rm = TRUE)))
df_clean <- df[!df$Extreme, , drop = FALSE]
metric_cols <- c("Metric_Unreg", "Metric_Oracle", "Metric_CV", "Metric_CommI", "Metric_CommII")
robust <- robust_by_scenario(df_clean, metric_cols)
if (type == "linear") {
df_clean$Rel_Unreg <- df_clean$Metric_Unreg / df_clean$Metric_Oracle
df_clean$Rel_CV <- df_clean$Metric_CV / df_clean$Metric_Oracle
df_clean$Rel_CommI <- df_clean$Metric_CommI / df_clean$Metric_Oracle
df_clean$Rel_CommII <- df_clean$Metric_CommII / df_clean$Metric_Oracle
relative <- robust_by_scenario(df_clean,
c("Rel_Unreg", "Rel_CV", "Rel_CommI", "Rel_CommII"))
} else {
df_clean$Rel_Unreg <- df_clean$Metric_Unreg / df_clean$Metric_Oracle
df_clean$Rel_CV <- df_clean$Metric_CV / df_clean$Metric_Oracle
df_clean$Rel_CommI <- df_clean$Metric_CommI / df_clean$Metric_Oracle
df_clean$Rel_CommII <- df_clean$Metric_CommII / df_clean$Metric_Oracle
df_clean$Gap_Unreg <- df_clean$Metric_Oracle - df_clean$Metric_Unreg
df_clean$Gap_CV <- df_clean$Metric_Oracle - df_clean$Metric_CV
df_clean$Gap_CommI <- df_clean$Metric_Oracle - df_clean$Metric_CommI
df_clean$Gap_CommII <- df_clean$Metric_Oracle - df_clean$Metric_CommII
relative <- robust_by_scenario(
df_clean,
c("Rel_Unreg", "Rel_CV", "Rel_CommI", "Rel_CommII",
"Gap_Unreg", "Gap_CV", "Gap_CommI", "Gap_CommII")
)
}
lambda_cols <- c("Lambda_Unreg", "Lambda_Oracle", "Lambda_CV", "Lambda_CommI", "Lambda_CommII")
lambda_summary <- robust_by_scenario(df_clean, lambda_cols)
tests <- list()
for (sc in unique(df_clean$Scenario)) {
idx <- which(df_clean$Scenario == sc)
for (comm in c("CommI", "CommII")) {
mcol <- paste0("Metric_", comm)
diff <- if (type == "linear") {
log(pmax(df_clean[[mcol]][idx], 1e-12)) -
log(pmax(df_clean$Metric_CV[idx], 1e-12))
} else {
df_clean[[mcol]][idx] - df_clean$Metric_CV[idx]
}
ok <- is.finite(diff)
if (sum(ok) >= 2 && stats::sd(diff[ok]) > 0) {
tt <- stats::t.test(diff[ok])
tests[[paste(sc, comm, sep = "_")]] <- data.frame(
Scenario = sc, Method = comm,
t = unname(tt$statistic), p = tt$p.value, n = sum(ok),
stringsAsFactors = FALSE
)
} else {
tests[[paste(sc, comm, sep = "_")]] <- data.frame(
Scenario = sc, Method = comm,
t = NA_real_, p = NA_real_, n = sum(ok),
stringsAsFactors = FALSE
)
}
}
}
test_df <- do.call(rbind, tests)
rownames(test_df) <- NULL
list(
robust = robust,
relative = relative,
lambda = lambda_summary,
tests = test_df,
status = selection_status_summary(df_clean),
df_full = df,
df_clean = df_clean,
n_extreme = sum(df$Extreme, na.rm = TRUE),
oracle_violations = bad_oracle
)
}
fmt_med_mad <- function(med, mad, digits = 3) {
sprintf(paste0("%.", digits, "f (%.", digits, "f)"), med, mad)
}
metric_table <- function(analysis, digits = 3) {
rob <- analysis$robust
data.frame(
Scenario = rob$Scenario,
N = rob$N,
Unregularized = fmt_med_mad(rob$Metric_Unreg.median, rob$Metric_Unreg.mad, digits),
Oracle = fmt_med_mad(rob$Metric_Oracle.median, rob$Metric_Oracle.mad, digits),
CV = fmt_med_mad(rob$Metric_CV.median, rob$Metric_CV.mad, digits),
CommI = fmt_med_mad(rob$Metric_CommI.median, rob$Metric_CommI.mad, digits),
CommII = fmt_med_mad(rob$Metric_CommII.median, rob$Metric_CommII.mad, digits),
check.names = FALSE
)
}
relative_table <- function(analysis, type = c("linear", "logistic"), digits = 3) {
type <- match.arg(type)
rel <- analysis$relative
if (type == "linear") {
data.frame(
Scenario = rel$Scenario,
N = rel$N,
Unregularized = fmt_med_mad(rel$Rel_Unreg.median, rel$Rel_Unreg.mad, digits),
CV = fmt_med_mad(rel$Rel_CV.median, rel$Rel_CV.mad, digits),
CommI = fmt_med_mad(rel$Rel_CommI.median, rel$Rel_CommI.mad, digits),
CommII = fmt_med_mad(rel$Rel_CommII.median, rel$Rel_CommII.mad, digits),
check.names = FALSE
)
} else {
data.frame(
Scenario = rel$Scenario,
N = rel$N,
`AUC ratio: Unreg` = fmt_med_mad(rel$Rel_Unreg.median, rel$Rel_Unreg.mad, digits),
`AUC ratio: CV` = fmt_med_mad(rel$Rel_CV.median, rel$Rel_CV.mad, digits),
`AUC ratio: CommI` = fmt_med_mad(rel$Rel_CommI.median, rel$Rel_CommI.mad, digits),
`AUC ratio: CommII` = fmt_med_mad(rel$Rel_CommII.median, rel$Rel_CommII.mad, digits),
`AUC gap: Unreg` = fmt_med_mad(rel$Gap_Unreg.median, rel$Gap_Unreg.mad, digits),
`AUC gap: CV` = fmt_med_mad(rel$Gap_CV.median, rel$Gap_CV.mad, digits),
`AUC gap: CommI` = fmt_med_mad(rel$Gap_CommI.median, rel$Gap_CommI.mad, digits),
`AUC gap: CommII` = fmt_med_mad(rel$Gap_CommII.median, rel$Gap_CommII.mad, digits),
check.names = FALSE
)
}
}
latex_ready_table <- function(x, caption, label) {
knitr::kable(x, format = "latex", booktabs = TRUE, caption = caption,
label = label, escape = FALSE)
}Rows with finite oracle: 400 / 400 (100.0%) Oracle-dominance violations: 0 Extreme rows flagged: 0
| Scenario | N | Unregularized | Oracle | CV | CommI | CommII |
|---|---|---|---|---|---|---|
| block | 100 | 1.521 (0.190) | 1.431 (0.173) | 1.470 (0.176) | 1.745 (0.250) | 6.593 (0.771) |
| grouped | 100 | 1.568 (0.175) | 1.389 (0.153) | 1.421 (0.159) | 1.425 (0.158) | 1.584 (0.223) |
| sparse | 100 | 1.545 (0.214) | 1.513 (0.226) | 1.532 (0.222) | 1.545 (0.214) | 1.545 (0.214) |
| toeplitz | 100 | 1.500 (0.206) | 1.394 (0.156) | 1.429 (0.192) | 1.419 (0.173) | 2.355 (0.357) |
Values near 1 are oracle-level; values larger than 1 are worse than oracle.
| Scenario | N | Unregularized | CV | CommI | CommII |
|---|---|---|---|---|---|
| block | 100 | 1.045 (0.038) | 1.008 (0.008) | 1.158 (0.094) | 4.492 (0.851) |
| grouped | 100 | 1.151 (0.091) | 1.015 (0.015) | 1.024 (0.023) | 1.140 (0.090) |
| sparse | 100 | 1.008 (0.008) | 1.017 (0.016) | 1.008 (0.008) | 1.008 (0.008) |
| toeplitz | 100 | 1.063 (0.053) | 1.018 (0.018) | 1.012 (0.011) | 1.650 (0.241) |
| Scenario | Method | OK | Flat | AllNonfinite |
|---|---|---|---|---|
| toeplitz | Oracle | 100 | 0 | 0 |
| toeplitz | CV | 100 | 0 | 0 |
| toeplitz | CommI | 100 | 0 | 0 |
| toeplitz | CommII | 100 | 0 | 0 |
| block | Oracle | 100 | 0 | 0 |
| block | CV | 100 | 0 | 0 |
| block | CommI | 100 | 0 | 0 |
| block | CommII | 100 | 0 | 0 |
| sparse | Oracle | 100 | 0 | 0 |
| sparse | CV | 100 | 0 | 0 |
| sparse | CommI | 0 | 100 | 0 |
| sparse | CommII | 0 | 100 | 0 |
| grouped | Oracle | 100 | 0 | 0 |
| grouped | CV | 100 | 0 | 0 |
| grouped | CommI | 100 | 0 | 0 |
| grouped | CommII | 100 | 0 | 0 |
cat("Positive t-statistics mean the commutator has larger log-MSE than CV; negative values favor the commutator.\n\n")Positive t-statistics mean the commutator has larger log-MSE than CV; negative values favor the commutator.
| Scenario | Method | t | p | n |
|---|---|---|---|---|
| toeplitz | CommI | -4.7863 | 0.0000 | 100 |
| toeplitz | CommII | 29.7105 | 0.0000 | 100 |
| block | CommI | 10.8873 | 0.0000 | 100 |
| block | CommII | 56.5521 | 0.0000 | 100 |
| sparse | CommI | 0.4357 | 0.6640 | 100 |
| sparse | CommII | 0.4357 | 0.6640 | 100 |
| grouped | CommI | 1.1651 | 0.2468 | 100 |
| grouped | CommII | 10.5282 | 0.0000 | 100 |
Rows with finite oracle: 400 / 400 (100.0%) Oracle-dominance violations: 0 Extreme rows flagged: 0
| Scenario | N | Unregularized | Oracle | CV | CommI | CommII |
|---|---|---|---|---|---|---|
| block | 100 | 0.720 (0.039) | 0.888 (0.029) | 0.878 (0.030) | 0.834 (0.033) | 0.875 (0.032) |
| grouped | 100 | 0.710 (0.043) | 0.880 (0.030) | 0.873 (0.034) | 0.834 (0.036) | 0.869 (0.032) |
| sparse | 100 | 0.718 (0.055) | 0.815 (0.042) | 0.806 (0.041) | 0.718 (0.055) | 0.718 (0.055) |
| toeplitz | 100 | 0.735 (0.045) | 0.881 (0.026) | 0.870 (0.027) | 0.812 (0.040) | 0.856 (0.031) |
cat("AUC ratios near 1 and gaps near 0 are oracle-level. AUC ratios below 1 are worse than oracle.\n\n")AUC ratios near 1 and gaps near 0 are oracle-level. AUC ratios below 1 are worse than oracle.
| Scenario | N | AUC ratio: Unreg | AUC ratio: CV | AUC ratio: CommI | AUC ratio: CommII | AUC gap: Unreg | AUC gap: CV | AUC gap: CommI | AUC gap: CommII |
|---|---|---|---|---|---|---|---|---|---|
| block | 100 | 0.822 (0.048) | 0.993 (0.005) | 0.955 (0.028) | 0.992 (0.006) | 0.156 (0.044) | 0.006 (0.004) | 0.041 (0.025) | 0.007 (0.005) |
| grouped | 100 | 0.812 (0.048) | 0.996 (0.004) | 0.952 (0.022) | 0.988 (0.007) | 0.169 (0.042) | 0.003 (0.003) | 0.044 (0.018) | 0.010 (0.006) |
| sparse | 100 | 0.899 (0.043) | 0.987 (0.009) | 0.899 (0.043) | 0.899 (0.043) | 0.083 (0.035) | 0.011 (0.007) | 0.083 (0.035) | 0.083 (0.035) |
| toeplitz | 100 | 0.833 (0.053) | 0.990 (0.006) | 0.934 (0.034) | 0.974 (0.016) | 0.145 (0.045) | 0.009 (0.006) | 0.057 (0.027) | 0.022 (0.014) |
| Scenario | Method | OK | Flat | AllNonfinite |
|---|---|---|---|---|
| toeplitz | Oracle | 100 | 0 | 0 |
| toeplitz | CV | 100 | 0 | 0 |
| toeplitz | CommI | 100 | 0 | 0 |
| toeplitz | CommII | 100 | 0 | 0 |
| block | Oracle | 100 | 0 | 0 |
| block | CV | 100 | 0 | 0 |
| block | CommI | 100 | 0 | 0 |
| block | CommII | 100 | 0 | 0 |
| sparse | Oracle | 100 | 0 | 0 |
| sparse | CV | 100 | 0 | 0 |
| sparse | CommI | 0 | 100 | 0 |
| sparse | CommII | 0 | 100 | 0 |
| grouped | Oracle | 100 | 0 | 0 |
| grouped | CV | 100 | 0 | 0 |
| grouped | CommI | 100 | 0 | 0 |
| grouped | CommII | 100 | 0 | 0 |
cat("Positive t-statistics mean the commutator has higher AUC than CV; negative values favor CV.\n\n")Positive t-statistics mean the commutator has higher AUC than CV; negative values favor CV.
| Scenario | Method | t | p | n |
|---|---|---|---|---|
| toeplitz | CommI | -12.0405 | 0.0000 | 100 |
| toeplitz | CommII | -6.0473 | 0.0000 | 100 |
| block | CommI | -10.0030 | 0.0000 | 100 |
| block | CommII | -0.2519 | 0.8016 | 100 |
| sparse | CommI | -14.6176 | 0.0000 | 100 |
| sparse | CommII | -14.6176 | 0.0000 | 100 |
| grouped | CommI | -11.3909 | 0.0000 | 100 |
| grouped | CommII | -4.8783 | 0.0000 | 100 |
if (exists("linear_analysis")) {
cat("\n\n## Linear diagnostics\n\n")
cat("Oracle dominance violations:", length(linear_analysis$oracle_violations), "\n\n")
if (length(linear_analysis$oracle_violations) > 0) {
print(head(linear_analysis$df_full[linear_analysis$oracle_violations, ], 10))
}
flat_sparse <- subset(linear_analysis$status, Scenario == "sparse" & grepl("Comm", Method))
cat("Sparse P=I negative-control commutator statuses:\n\n")
print(knitr::kable(flat_sparse, align = "c"))
}Oracle dominance violations: 0
Sparse P=I negative-control commutator statuses:
| Scenario | Method | OK | Flat | AllNonfinite | |
|---|---|---|---|---|---|
| 11 | sparse | CommI | 0 | 100 | 0 |
| 12 | sparse | CommII | 0 | 100 | 0 |
if (exists("logistic_analysis")) {
cat("\n\n## Logistic diagnostics\n\n")
cat("Oracle dominance violations:", length(logistic_analysis$oracle_violations), "\n\n")
if (length(logistic_analysis$oracle_violations) > 0) {
print(head(logistic_analysis$df_full[logistic_analysis$oracle_violations, ], 10))
}
flat_sparse <- subset(logistic_analysis$status, Scenario == "sparse" & grepl("Comm", Method))
cat("Sparse P=I negative-control commutator statuses:\n\n")
print(knitr::kable(flat_sparse, align = "c"))
}Oracle dominance violations: 0
Sparse P=I negative-control commutator statuses:
| Scenario | Method | OK | Flat | AllNonfinite | |
|---|---|---|---|---|---|
| 11 | sparse | CommI | 0 | 100 | 0 |
| 12 | sparse | CommII | 0 | 100 | 0 |
Interpretation checks:
if (has_ggplot2) {
metric_long <- function(df) {
metric_cols <- c("Metric_Unreg", "Metric_Oracle", "Metric_CV", "Metric_CommI", "Metric_CommII")
M <- as.matrix(df[, metric_cols, drop = FALSE])
data.frame(
Model = rep(df$Model, each = length(metric_cols)),
Scenario = rep(df$Scenario, each = length(metric_cols)),
Method = rep(c("Unreg", "Oracle", "CV", "CommI", "CommII"), times = nrow(df)),
Metric = as.vector(t(M)),
stringsAsFactors = FALSE
)
}
# Optional: Define a consistent, colorblind-friendly palette
method_palette <- c(
"Unreg" = "#999999", # Gray
"Oracle" = "#0072B2", # Blue
"CV" = "#009E73", # Green
"CommI" = "#D55E00", # Orange
"CommII" = "#CC79A7" # Pink
)
if (exists("linear_analysis")) {
p_linear <- ggplot2::ggplot(metric_long(linear_analysis$df_clean),
ggplot2::aes(x = Method, y = Metric, fill = Method)) +
ggplot2::geom_boxplot(outlier.alpha = 0.25, outlier.size = 0.5) +
ggplot2::scale_fill_manual(values = method_palette, guide = "none") + # Hide legend (redundant with x-axis)
ggplot2::facet_wrap(~ Scenario, scales = "free_y") +
ggplot2::labs(title = "Linear model test MSE", y = "MSE", x = NULL) +
ggplot2::theme_bw() +
ggplot2::theme(
strip.background = ggplot2::element_rect(fill = "gray90"),
panel.grid.minor = ggplot2::element_blank()
)
print(p_linear)
}
if (exists("logistic_analysis")) {
p_logistic <- ggplot2::ggplot(metric_long(logistic_analysis$df_clean),
ggplot2::aes(x = Method, y = Metric, fill = Method)) +
ggplot2::geom_boxplot(outlier.alpha = 0.25, outlier.size = 0.5) +
ggplot2::scale_fill_manual(values = method_palette, guide = "none") +
ggplot2::facet_wrap(~ Scenario, scales = "free_y") +
ggplot2::labs(title = "Logistic model test AUC", y = "AUC", x = NULL) +
ggplot2::theme_bw() +
ggplot2::theme(
strip.background = ggplot2::element_rect(fill = "gray90"),
panel.grid.minor = ggplot2::element_blank()
)
print(p_logistic)
}
} else {
cat("Package ggplot2 is not installed; skipping plots.\n")
}if (isTRUE(params$save_results)) {
dir.create(params$output_dir, showWarnings = FALSE, recursive = TRUE)
prefix <- file.path(params$output_dir, params$output_prefix)
export_obj <- list(
params = params,
lambda_grid_eval = lambda_grid_eval,
linear_df = if (exists("linear_df")) linear_df else NULL,
logistic_df = if (exists("logistic_df")) logistic_df else NULL,
linear_analysis = if (exists("linear_analysis")) linear_analysis else NULL,
logistic_analysis = if (exists("logistic_analysis")) logistic_analysis else NULL
)
saveRDS(export_obj, paste0(prefix, ".rds"))
if (exists("linear_df")) write.csv(linear_df, paste0(prefix, "_linear_raw.csv"), row.names = FALSE)
if (exists("logistic_df")) write.csv(logistic_df, paste0(prefix, "_logistic_raw.csv"), row.names = FALSE)
tex_lines <- character(0)
if (exists("linear_analysis")) {
tex_lines <- c(
tex_lines,
"% Linear model tables",
capture.output(latex_ready_table(
metric_table(linear_analysis),
caption = "Linear-model test MSE by method, reported as median (MAD).",
label = "tab:linear_metric_revised"
)),
"",
capture.output(latex_ready_table(
relative_table(linear_analysis, type = "linear"),
caption = "Linear-model MSE ratio to oracle, reported as median (MAD). Values above 1 are worse than oracle.",
label = "tab:linear_relative_revised"
)),
""
)
}
if (exists("logistic_analysis")) {
tex_lines <- c(
tex_lines,
"% Logistic model tables",
capture.output(latex_ready_table(
metric_table(logistic_analysis),
caption = "Logistic-model test AUC by method, reported as median (MAD).",
label = "tab:logistic_metric_revised"
)),
"",
capture.output(latex_ready_table(
relative_table(logistic_analysis, type = "logistic"),
caption = "Logistic-model AUC ratio and AUC gap to oracle, reported as median (MAD). Ratios near 1 and gaps near 0 are best.",
label = "tab:logistic_relative_revised"
)),
""
)
}
writeLines(tex_lines, paste0(prefix, "_latex_tables.tex"))
cat("Saved results to: ", normalizePath(params$output_dir), "\n\n", sep = "")
cat("Files written with prefix: ", params$output_prefix, "\n", sep = "")
}Saved results to: D:.dir_Books\2025_BiasVariance_2025_New_Commutator_2026_V2_Commutator_results
Files written with prefix: commutator_validation_revised
The results above provide empirical validation of the commutator framework as a geometric diagnostic and low-cost tuning heuristic, complementing classical cross-validation selection.
The results pass the main internal checks: both linear and logistic experiments have \(400 / 400\) finite oracle rows, zero oracle-dominance violations, and zero extreme rows. The sparse (P=I) case also behaves exactly as a negative control: both Type I and Type II commutator selectors are flat in all 100 sparse replications, which is the expected outcome when (A) and (P) commute.
The simulation results are scientifically coherent. For linear regression, CV is consistently close to the oracle. Type I is competitive or better than CV in Toeplitz, statistically indistinguishable from CV in Grouped and Sparse, and worse in Block. Type II is unstable in the rank-deficient structured-penalty scenarios, especially Block and Toeplitz, while the Sparse equality is not a success but the expected flat (\(P=I\)) negative-control behavior.
For logistic regression, CV remains closest to the oracle overall. Type II is very close to CV in Block and reasonably close in Grouped/Toeplitz, but Type I is systematically worse than CV. In Sparse, both commutator methods collapse to the unregularized baseline because the commutator objective is flat for (\(P=I\)). This is theoretically correct and reflects a negative control setting, rather that a tuning failure.
The following decision rules support meaningful interpretation of the empirical results.
The oracle-dominance count for each model. If it is nonzero, inspect those rows before using any performance tables.
The sparse ridge scenario serves as a negative
control. If the commutator selectors return
flat_objective, that is expected because \(P=I\) commutes with every curvature matrix
\(A\).
For linear models, compare methods using MSE ratios to oracle. A ratio close to \(1\) means oracle-level performance, while values \(> 1\) indicate excess test error.
For logistic models, compare methods using both AUC ratios and AUC gaps to oracle. Ratios close to \(1\) and gaps close to \(0\) indicate oracle-level performance.
The paired tests are only as secondary evidence. Practical effect sizes in the median/MAD tables should drive the interpretation.
A conservative conclusion compatible with this validation design is:
The pipeline simulation results support Type I and Type II commutator scores as geometry-aware diagnostics for regularization tuning when the curvature and penalty operators do not commute. In commuting cases, such as spherical ridge penalties, the commutator score is correctly uninformative. Rank-deficient structured penalties require a nullity-adjusted balance factor, and Type II orderings should be interpreted with conditioning diagnostics. Therefore, the commutator selector is best viewed as a computationally-light complement to cross-validation rather than a universal replacement.