1 Introduction

This BPAD section in Chapter 3 is focused on Tumor Growth Simulation and Logistic Growth Modeling. It demonstrates cancer data simulation and modeling of tumor growth under different treatment scenarios (radiation therapy and chemotherapy). Readers are encouraged to review the SOCR - DSPA sections on Dynamical Systems and Stochastic Differential Equation Modeling (in R-Julia) and Cancer Treatment Stochastic Differential Equation (SDE) Modeling, as well as the BPAD section on Exponential Growth and Decay.

The data-driven simulation and modeling accounts for:

Local Control (LC): A measure of tumor response to treatment,
Toxicity (RP2): Radiation pneumonitis Grade 2+ toxicity, a side effect of radiation therapy, and
Additional Biological Variables: Biomarkers, imaging features, and patient-specific factors.

The model allows for forward prediction of outcomes given user-specified time-periods.

2 Experimental settings

2.1 Parameter Configuration

Below we show the core parameter settings, which can be modified to explore different conditions and treatment regimens.

# Set random seed for reproducibility
set.seed(12345)

# Simulation parameters
n_patients <- 50      # Number of patients to simulate
days <- 90            # Days of observation
prediction_days <- 30 # Days to predict forward

# Treatment parameters
radiation_doses <- c("Low" = 60, "Medium" = 70, "High" = 80) # Gy
chemo_regimens <- c("None" = 0, "Standard" = 1, "Intensive" = 2)

# Logistic growth parameters - base values
K_base <- 100        # Carrying capacity (maximum tumor size in mm)
r_base <- 0.08       # Intrinsic growth rate
N0_base <- 10        # Initial tumor size (mm)

# Treatment effect parameters
# How much each treatment reduces the growth rate
radiation_effect <- c("Low" = 0.3, "Medium" = 0.5, "High" = 0.7)
chemo_effect <- c("None" = 0, "Standard" = 0.4, "Intensive" = 0.6)

# Toxicity parameters
toxicity_radiation <- c("Low" = 0.1, "Medium" = 0.25, "High" = 0.4)
toxicity_chemo <- c("None" = 0, "Standard" = 0.15, "Intensive" = 0.35)

# Patient-specific biological factors (influence on growth and toxicity)
biomarker_names <- c("Immune_Score", "Hypoxia", "Proliferation_Index", 
                     "Vascular_Density", "PD_L1_Expression")
imaging_names <- c("SUV_max", "Tumor_Heterogeneity", "Metabolic_Volume")
clinical_names <- c("Age", "Performance_Status", "Smoking_Pack_Years", "Comorbidity_Score")

# Factor effects on carrying capacity, growth rate, and toxicity
bio_K_effect <- c(0.2, -0.15, 0.25, 0.1, -0.2)      # Effects on carrying capacity
bio_r_effect <- c(-0.1, 0.2, 0.15, 0.1, -0.15)      # Effects on growth rate
bio_tox_effect <- c(-0.2, 0.15, 0.1, 0.2, 0.1)      # Effects on toxicity

img_K_effect <- c(0.15, 0.2, 0.25)                  # Effects on carrying capacity
img_r_effect <- c(0.1, 0.15, 0.2)                   # Effects on growth rate
img_tox_effect <- c(0.15, 0.1, 0.2)                 # Effects on toxicity

clin_K_effect <- c(0.1, -0.2, 0.15, 0.2)            # Effects on carrying capacity
clin_r_effect <- c(0.05, -0.15, 0.1, 0.15)          # Effects on growth rate
clin_tox_effect <- c(0.15, 0.2, 0.25, 0.3)          # Effects on toxicity

2.2 Simulation Functions

Here we define the core finctions for simulating realsitic cancer datasets.

# Function to generate patient-specific biological factors
generate_patient_factors <- function() {
  # Generate standardized biomarkers (mean 0, sd 1)
  biomarkers <- rnorm(length(biomarker_names), mean = 0, sd = 1)
  names(biomarkers) <- biomarker_names
  
  # Generate imaging features (mean 0, sd 1)
  imaging <- rnorm(length(imaging_names), mean = 0, sd = 1)
  names(imaging) <- imaging_names
  
  # Generate clinical factors (appropriate distributions)
  clinical <- c(
    Age = rnorm(1, mean = 65, sd = 10),                          # Age ~ N(65, 10)
    Performance_Status = sample(0:2, 1, prob = c(0.3, 0.5, 0.2)), # 0-2 scale
    Smoking_Pack_Years = max(0, rnorm(1, mean = 30, sd = 15)),     # Pack years (>=0)
    Comorbidity_Score = sample(0:4, 1, prob = c(0.3, 0.3, 0.2, 0.1, 0.1)) # 0-4 scale
  )
  
  # Normalize clinical factors
  clinical_norm <- c(
    Age = (clinical["Age"] - 65) / 10,
    Performance_Status = clinical["Performance_Status"] / 2,
    Smoking_Pack_Years = (clinical["Smoking_Pack_Years"] - 30) / 15,
    Comorbidity_Score = clinical["Comorbidity_Score"] / 4
  )
  
  # Combined factors (original values for reporting, normalized for calculations)
  return(list(
    biomarkers = biomarkers,
    imaging = imaging,
    clinical = clinical,
    clinical_norm = clinical_norm
  ))
}

# Function to calculate patient-specific growth parameters
calculate_growth_parameters <- function(factors, rad_dose_level, chemo_regimen_level) {
  # Base values
  K <- K_base
  r <- r_base
  N0 <- N0_base
  
  # Apply biological factor effects - biomarkers
  for (i in 1:length(biomarker_names)) {
    K <- K * (1 + bio_K_effect[i] * factors$biomarkers[i])
    r <- r * (1 + bio_r_effect[i] * factors$biomarkers[i])
  }
  
  # Apply biological factor effects - imaging
  for (i in 1:length(imaging_names)) {
    K <- K * (1 + img_K_effect[i] * factors$imaging[i])
    r <- r * (1 + img_r_effect[i] * factors$imaging[i])
  }
  
  # Apply biological factor effects - clinical
  for (i in 1:length(clinical_names)) {
    K <- K * (1 + clin_K_effect[i] * factors$clinical_norm[i])
    r <- r * (1 + clin_r_effect[i] * factors$clinical_norm[i])
  }
  
  # Apply treatment effects
  r_reduced <- r * (1 - radiation_effect[rad_dose_level] - chemo_effect[chemo_regimen_level])
  
  # Ensure logical ranges
  K <- max(K, 1.2 * N0)  # Carrying capacity should be larger than initial size
  r_reduced <- max(r_reduced, 0.001)  # Growth rate should be positive but can be very small
  
  return(list(K = K, r = r_reduced, N0 = N0))
}

# Function to calculate toxicity
calculate_toxicity <- function(factors, rad_dose_level, chemo_regimen_level) {
  # Base toxicity from treatments
  tox_base <- toxicity_radiation[rad_dose_level] + toxicity_chemo[chemo_regimen_level]
  
  # Apply biological factor effects - biomarkers
  for (i in 1:length(biomarker_names)) {
    tox_base <- tox_base * (1 + bio_tox_effect[i] * factors$biomarkers[i])
  }
  
  # Apply biological factor effects - imaging
  for (i in 1:length(imaging_names)) {
    tox_base <- tox_base * (1 + img_tox_effect[i] * factors$imaging[i])
  }
  
  # Apply biological factor effects - clinical
  for (i in 1:length(clinical_names)) {
    tox_base <- tox_base * (1 + clin_tox_effect[i] * factors$clinical_norm[i])
  }
  
  # Add random variation and bound between 0 and 1
  toxicity <- pmin(1, pmax(0, tox_base + rnorm(1, 0, 0.05)))
  
  # Convert to binary outcome with probability
  toxicity_bin <- rbinom(1, 1, toxicity)
  
  return(list(
    toxicity_prob = toxicity,
    toxicity_outcome = toxicity_bin
  ))
}

# Logistic growth function
logistic_growth <- function(t, K, r, N0) {
  return(K / (1 + ((K / N0) - 1) * exp(-r * t)))
}

# Add noise to the growth curve to simulate measurement error and biological variation
add_noise <- function(value, cv = 0.05) {
  return(value * (1 + rnorm(1, 0, cv)))
}

# Function to generate tumor growth data for one patient
generate_patient_data <- function(patient_id, days, rad_dose_level, chemo_regimen_level) {
  # Generate patient factors
  factors <- generate_patient_factors()
  
  # Calculate growth parameters
  params <- calculate_growth_parameters(factors, rad_dose_level, chemo_regimen_level)
  
  # Calculate toxicity
  toxicity <- calculate_toxicity(factors, rad_dose_level, chemo_regimen_level)
  
  # Generate tumor size at each time point
  data <- data.frame(
    patient_id = patient_id,
    day = 0:days,
    radiation_dose = rad_dose_level,
    radiation_dose_value = radiation_doses[rad_dose_level],
    chemo_regimen = chemo_regimen_level,
    chemo_regimen_value = chemo_regimens[chemo_regimen_level]
  )
  
  # Add predicted tumor size based on logistic growth
  data$tumor_size_pred <- sapply(data$day, function(t) logistic_growth(t, params$K, params$r, params$N0))
  
  # Add noise to create observed tumor size
  data$tumor_size <- sapply(data$tumor_size_pred, add_noise)
  
  # Add growth parameters
  data$K <- params$K
  data$r <- params$r
  data$N0 <- params$N0
  
  # Add toxicity data
  data$toxicity_prob <- toxicity$toxicity_prob
  data$toxicity_outcome <- toxicity$toxicity_outcome
  
  # Add all biological factors
  for (name in biomarker_names) {
    data[[name]] <- factors$biomarkers[name]
  }
  
  for (name in imaging_names) {
    data[[name]] <- factors$imaging[name]
  }
  
  for (name in clinical_names) {
    data[[name]] <- factors$clinical[name]
  }
  
  # Calculate Local Control (LC) - inverse relationship with tumor size relative to baseline
  data$local_control <- with(data, 100 * (1 - (tumor_size / tumor_size[1]) / (K / tumor_size[1])))
  
  # Ensure LC is between 0 and 100
  data$local_control <- pmin(100, pmax(0, data$local_control))
  
  return(data)
}

2.3 Generate Simulated Patient Data

Next we simulate cancer patient datasets.

# Assign treatments to patients
patient_treatments <- data.frame(
  patient_id = 1:n_patients,
  rad_dose_level = sample(names(radiation_doses), n_patients, replace = TRUE),
  chemo_regimen_level = sample(names(chemo_regimens), n_patients, replace = TRUE)
)

# Generate data for all patients
patient_data_list <- map2(
  patient_treatments$patient_id,
  patient_treatments$rad_dose_level,
  function(id, rad) {
    map(
      patient_treatments$chemo_regimen_level[patient_treatments$patient_id == id],
      function(chemo) {
        generate_patient_data(id, days, rad, chemo)
      }
    )
  }
) %>% flatten()

# Combine all patient data
all_patient_data <- bind_rows(patient_data_list)

# Display summary of the data
summary_data <- all_patient_data %>%
  filter(day == days) %>%
  group_by(radiation_dose, chemo_regimen) %>%
  summarize(
    n_patients = n(),
    mean_tumor_size = mean(tumor_size),
    mean_local_control = mean(local_control),
    toxicity_rate = mean(toxicity_outcome),
    .groups = 'drop'
  )

knitr::kable(summary_data, caption = "Summary of Patient Outcomes at End of Observation Period")

Summary of Patient Outcomes at End of Observation Period
radiation_dose	chemo_regimen	n_patients	mean_tumor_size	mean_local_control	toxicity_rate
High	Intensive	7	10.64339	70.77827	0.7142857
High	None	4	40.97723	51.81287	0.5000000
High	Standard	6	10.96377	89.20703	0.5000000
Low	Intensive	4	17.68211	69.86646	0.5000000
Low	None	5	59.98504	17.98194	0.0000000
Low	Standard	5	57.71604	47.91041	0.6000000
Medium	Intensive	10	11.16218	88.53072	0.6000000
Medium	None	4	50.70305	17.38160	0.2500000
Medium	Standard	5	14.93581	75.98805	0.6000000

2.4 Visualize Tumor Growth Curves

Let’s show the tumor growth/decay curves for different scenarios.

# Plot average tumor growth by treatment combination
avg_growth <- all_patient_data %>%
  group_by(day, radiation_dose, chemo_regimen) %>%
  summarize(
    mean_size = mean(tumor_size),
    lower_ci = quantile(tumor_size, 0.25),
    upper_ci = quantile(tumor_size, 0.75),
    .groups = 'drop'
  )

ggplot(avg_growth, aes(x = day, y = mean_size, color = radiation_dose, linetype = chemo_regimen)) +
  geom_line(size = 1) +
  geom_ribbon(aes(ymin = lower_ci, ymax = upper_ci, fill = radiation_dose), alpha = 0.2, colour = NA) +
  labs(
    title = "Average Tumor Growth Curves by Treatment",
    x = "Day",
    y = "Tumor Size (mm)",
    color = "Radiation Dose",
    linetype = "Chemotherapy Regimen",
    fill = "Radiation Dose"
  ) +
  theme_minimal() +
  facet_grid(chemo_regimen ~ radiation_dose)

2.5 Visualize Local Control Metrics

To track the growth and decay patterns we use various tumor local control metrics.

# Plot average local control by treatment combination
avg_lc <- all_patient_data %>%
  group_by(day, radiation_dose, chemo_regimen) %>%
  summarize(
    mean_lc = mean(local_control),
    lower_ci = quantile(local_control, 0.25),
    upper_ci = quantile(local_control, 0.75),
    .groups = 'drop'
  )

ggplot(avg_lc, aes(x = day, y = mean_lc, color = radiation_dose, linetype = chemo_regimen)) +
  geom_line(size = 1) +
  geom_ribbon(aes(ymin = lower_ci, ymax = upper_ci, fill = radiation_dose), alpha = 0.2, colour = NA) +
  labs(
    title = "Average Local Control by Treatment",
    x = "Day",
    y = "Local Control (%)",
    color = "Radiation Dose",
    linetype = "Chemotherapy Regimen",
    fill = "Radiation Dose"
  ) +
  theme_minimal() +
  facet_grid(chemo_regimen ~ radiation_dose)

2.6 Toxicity Analysis

Side-effects of cancer treatments represent one of hte major clinical outcomes. Here we provide intervention toxicity analysis.

# Create toxicity summary
toxicity_summary <- all_patient_data %>%
  filter(day == days) %>%
  group_by(radiation_dose, chemo_regimen) %>%
  summarize(
    toxicity_rate = mean(toxicity_outcome),
    toxicity_prob_mean = mean(toxicity_prob),
    n_patients = n(),
    .groups = 'drop'
  )

# Plot toxicity rates
ggplot(toxicity_summary, aes(x = radiation_dose, y = toxicity_rate, fill = chemo_regimen)) +
  geom_bar(stat = "identity", position = position_dodge()) +
  geom_text(aes(label = paste0(round(toxicity_rate*100, 1), "%")), 
            position = position_dodge(width = 0.9), vjust = -0.5) +
  labs(
    title = "Toxicity Rates (RP2+) by Treatment Combination",
    x = "Radiation Dose",
    y = "Toxicity Rate",
    fill = "Chemotherapy Regimen"
  ) +
  theme_minimal() +
  ylim(0, 1)

# Correlation between toxicity and biological factors
toxicity_corr_data <- all_patient_data %>%
  filter(day == 0) %>%  # Just need one row per patient for time-invariant factors
  select(patient_id, radiation_dose, chemo_regimen, toxicity_outcome, toxicity_prob, 
         all_of(c(biomarker_names, imaging_names, clinical_names)))

# Calculate correlation with toxicity probability
corr_factors <- cor(toxicity_corr_data[,c("toxicity_prob", biomarker_names, imaging_names, "Age", "Performance_Status", "Smoking_Pack_Years", "Comorbidity_Score")])
corr_toxicity <- corr_factors["toxicity_prob", ]
corr_toxicity <- corr_toxicity[order(abs(corr_toxicity), decreasing = TRUE)]
corr_toxicity <- corr_toxicity[-1]  # Remove self-correlation

# Plot top correlations
corr_data <- data.frame(
  factor = names(corr_toxicity),
  correlation = as.numeric(corr_toxicity)
)

ggplot(corr_data, aes(x = reorder(factor, correlation), y = correlation)) +
  geom_bar(stat = "identity", aes(fill = correlation > 0)) +
  coord_flip() +
  labs(
    title = "Correlation of Biological Factors with Toxicity",
    x = "Biological Factor",
    y = "Correlation Coefficient"
  ) +
  theme_minimal() +
  theme(legend.position = "none")

3 Model-based inference

3.1 Fit Logistic Growth Models

# Function to fit logistic growth model to a single patient's data
fit_logistic_model <- function(patient_df) {
  # Get patient ID
  patient_id <- unique(patient_df$patient_id)
  
  # Extract day and tumor size
  day <- patient_df$day
  tumor_size <- patient_df$tumor_size
  
  # Initial parameter estimates
  K_init <- max(tumor_size) * 1.2
  r_init <- 0.1
  N0_init <- tumor_size[1]
  
  # Fit logistic model using non-linear least squares
  tryCatch({
    model <- nlsLM(tumor_size ~ K / (1 + ((K / N0) - 1) * exp(-r * day)),
                  data = patient_df,
                  start = list(K = K_init, r = r_init, N0 = N0_init),
                  lower = c(K = max(tumor_size), r = 0.001, N0 = 0.1),
                  upper = c(K = K_init * 5, r = 1, N0 = N0_init * 2),
                  control = nls.lm.control(maxiter = 100))
    
    # Extract parameters
    params <- coef(model)
    
    # Calculate predictions
    predictions <- logistic_growth(day, params["K"], params["r"], params["N0"])
    
    # Calculate goodness of fit
    residuals <- tumor_size - predictions
    sse <- sum(residuals^2)
    sst <- sum((tumor_size - mean(tumor_size))^2)
    r_squared <- 1 - sse/sst
    rmse <- sqrt(mean(residuals^2))
    
    # Create result object
    result <- list(
      patient_id = patient_id,
      success = TRUE,
      params = params,
      predictions = predictions,
      r_squared = r_squared,
      rmse = rmse
    )
    
    return(result)
    
  }, error = function(e) {
    # If model fitting fails, return failure result
    return(list(
      patient_id = patient_id,
      success = FALSE,
      params = c(K = NA, r = NA, N0 = NA),
      predictions = rep(NA, length(day)),
      r_squared = NA,
      rmse = NA
    ))
  })
}

# Split data by patient and fit models
patient_ids <- unique(all_patient_data$patient_id)
fitted_models <- list()

for (id in patient_ids) {
  patient_df <- all_patient_data %>% filter(patient_id == id)
  fitted_models[[as.character(id)]] <- fit_logistic_model(patient_df)
}

# Create a summary of fitted parameters
fitted_params <- data.frame(
  patient_id = sapply(fitted_models, function(x) x$patient_id),
  success = sapply(fitted_models, function(x) x$success),
  K = sapply(fitted_models, function(x) x$params["K"]),
  r = sapply(fitted_models, function(x) x$params["r"]),
  N0 = sapply(fitted_models, function(x) x$params["N0"]),
  r_squared = sapply(fitted_models, function(x) x$r_squared),
  rmse = sapply(fitted_models, function(x) x$rmse)
)

# Join with treatment information
fitted_params <- fitted_params %>%
  left_join(patient_treatments, by = "patient_id")

# Summary statistics for fitted parameters
fitted_summary <- fitted_params %>%
  filter(success) %>%
  group_by(rad_dose_level, chemo_regimen_level) %>%
  summarize(
    n = n(),
    K_mean = mean(K),
    K_sd = sd(K),
    r_mean = mean(r),
    r_sd = sd(r),
    N0_mean = mean(N0),
    N0_sd = sd(N0),
    r_squared_mean = mean(r_squared),
    rmse_mean = mean(rmse),
    .groups = 'drop'
  )

knitr::kable(fitted_summary, caption = "Summary of Fitted Logistic Growth Parameters by Treatment")

Summary of Fitted Logistic Growth Parameters by Treatment
rad_dose_level	chemo_regimen_level	n	K_mean	K_sd	r_mean	r_sd	N0_mean	N0_sd	r_squared_mean	rmse_mean
High	Intensive	7	30.16941	21.35528	0.0037955	0.0046139	9.887918	0.1061149	0.1799725	0.5046302
High	None	4	115.70060	102.82598	0.0235361	0.0059824	9.859138	0.0701250	0.9627808	1.2102087
High	Standard	6	51.32164	30.69427	0.0038341	0.0045693	9.964039	0.1051384	0.2354409	0.5006287
Low	Intensive	4	59.23714	65.36024	0.0082962	0.0042297	9.975837	0.1275491	0.7684536	0.6546079
Low	None	5	74.83986	40.11595	0.0404914	0.0137116	9.981190	0.2643829	0.9818601	1.9382326
Low	Standard	5	115.07837	71.66973	0.0252970	0.0046306	9.935568	0.1830232	0.9841783	1.4398626
Medium	Intensive	10	40.75052	28.08037	0.0036011	0.0034367	9.917112	0.0844268	0.2351274	0.5029957
Medium	None	4	62.92985	6.23567	0.0393629	0.0184632	10.153344	0.1781188	0.9773829	1.9148449
Medium	Standard	5	51.39369	29.11994	0.0072258	0.0041750	9.926430	0.2519186	0.7972476	0.6066814

3.2 Visualize Model Fits

# Function to create data frame with observed and predicted values
create_fit_data <- function(patient_id) {
  # Get data for this patient
  patient_df <- all_patient_data %>% filter(patient_id == patient_id)
  model_result <- fitted_models[[as.character(patient_id)]]
  
  if (model_result$success) {
    # Extract unique days for this patient
    days_unique <- unique(patient_df$day)
    
    # Make sure we have predictions matching the number of unique days
    if (length(model_result$predictions) == length(days_unique)) {
      # Create a lookup table mapping days to predictions
      pred_lookup <- data.frame(
        day = days_unique,
        predicted = model_result$predictions
      )
      
      # Join predictions to patient data
      fit_df <- patient_df %>%
        left_join(pred_lookup, by = "day")
    } else {
      # If lengths don't match, something went wrong - add NA predictions
      fit_df <- patient_df %>%
        mutate(predicted = NA)
    }
  } else {
    # If model failed, add NA predictions
    fit_df <- patient_df %>%
      mutate(predicted = NA)
  }
  
  return(fit_df)
}

# Sample a few patients to visualize
sample_patients <- sample(patient_ids[sapply(fitted_models, function(x) x$success)], 6)
sample_fits <- lapply(sample_patients, create_fit_data)
sample_fits_df <- bind_rows(sample_fits)

# Create plot of observed vs predicted for sampled patients
ggplot(sample_fits_df, aes(x = day)) +
  geom_point(aes(y = tumor_size), alpha = 0.6) +
  geom_line(aes(y = predicted), color = "red", size = 1) +
  facet_wrap(~ patient_id, scales = "free_y") +
  labs(
    title = "Observed vs Predicted Tumor Size for Sample Patients",
    x = "Day",
    y = "Tumor Size (mm)",
    subtitle = "Points: Observed, Red Line: Logistic Growth Model"
  ) +
  theme_minimal()

4 Forward Prediction

# Function to predict forward for a specific patient
predict_forward <- function(patient_id, days_forward) {
  # Get fitted model
  model_result <- fitted_models[[as.character(patient_id)]]
  
  if (!model_result$success) {
    return(NULL)
  }
  
  # Get current data
  patient_df <- all_patient_data %>% filter(patient_id == patient_id)
  last_day <- max(patient_df$day)
  
  # Extract parameters
  params <- model_result$params
  
  # Create prediction days
  pred_days <- (last_day + 1):(last_day + days_forward)
  
  # Create prediction data frame
  pred_df <- data.frame(
    patient_id = patient_id,
    day = pred_days,
    radiation_dose = unique(patient_df$radiation_dose),
    chemo_regimen = unique(patient_df$chemo_regimen),
    predicted_size = logistic_growth(pred_days, params["K"], params["r"], params["N0"])
  )
  
  # Add local control prediction
  pred_df$predicted_lc <- 100 * (1 - (pred_df$predicted_size / patient_df$tumor_size[1]) / 
                                  (params["K"] / patient_df$tumor_size[1]))
  pred_df$predicted_lc <- pmin(100, pmax(0, pred_df$predicted_lc))
  
  return(pred_df)
}

# Make predictions for all patients
all_predictions <- list()
for (id in patient_ids) {
  pred <- predict_forward(id, prediction_days)
  if (!is.null(pred)) {
    all_predictions[[length(all_predictions) + 1]] <- pred
  }
}

# Combine predictions
if (length(all_predictions) > 0) {
  all_predictions_df <- bind_rows(all_predictions)
  
  # Create summary by treatment combination
  pred_summary <- all_predictions_df %>%
    group_by(day, radiation_dose, chemo_regimen) %>%
    summarize(
      mean_size = mean(predicted_size),
      lower_ci = quantile(predicted_size, 0.25),
      upper_ci = quantile(predicted_size, 0.75),
      mean_lc = mean(predicted_lc),
      lc_lower_ci = quantile(predicted_lc, 0.25),
      lc_upper_ci = quantile(predicted_lc, 0.75),
      .groups = 'drop'
    )
  
  # Join with historical data for continuous plotting
  historical_summary <- all_patient_data %>%
    group_by(day, radiation_dose, chemo_regimen) %>%
    summarize(
      mean_size = mean(tumor_size),
      lower_ci = quantile(tumor_size, 0.25),
      upper_ci = quantile(tumor_size, 0.75),
      mean_lc = mean(local_control),
      lc_lower_ci = quantile(local_control, 0.25),
      lc_upper_ci = quantile(local_control, 0.75),
      .groups = 'drop'
    ) %>%
    mutate(data_type = "Historical")
  
  pred_summary <- pred_summary %>%
    mutate(data_type = "Predicted")
  
  combined_summary <- bind_rows(historical_summary, pred_summary)
  
  # Create future prediction plots
  p1 <- ggplot(combined_summary, aes(x = day, y = mean_size, color = radiation_dose, 
                                    linetype = data_type)) +
    geom_line(size = 1) +
    geom_ribbon(aes(ymin = lower_ci, ymax = upper_ci, fill = radiation_dose), 
                alpha = 0.2, colour = NA) +
    labs(
      title = "Tumor Size Prediction by Treatment",
      x = "Day",
      y = "Tumor Size (mm)",
      color = "Radiation Dose",
      linetype = "Data Type",
      fill = "Radiation Dose"
    ) +
    theme_minimal() +
    facet_grid(chemo_regimen ~ radiation_dose) +
    geom_vline(xintercept = days, linetype = "dashed", color = "gray50")
  
  p2 <- ggplot(combined_summary, aes(x = day, y = mean_lc, color = radiation_dose, 
                                    linetype = data_type)) +
    geom_line(size = 1) +
    geom_ribbon(aes(ymin = lc_lower_ci, ymax = lc_upper_ci, fill = radiation_dose), 
                alpha = 0.2, colour = NA) +
    labs(
      title = "Local Control Prediction by Treatment",
      x = "Day",
      y = "Local Control (%)",
      color = "Radiation Dose",
      linetype = "Data Type",
      fill = "Radiation Dose"
    ) +
    theme_minimal() +
    facet_grid(chemo_regimen ~ radiation_dose) +
    geom_vline(xintercept = days, linetype = "dashed", color = "gray50")
  
  print(p1)
  print(p2)
}

5 Interactive Visualization

# Create interactive plot for tumor growth
if (length(all_predictions) > 0) {
  # Prepare data
  combined_data <- bind_rows(
    all_patient_data %>% 
      select(patient_id, day, radiation_dose, chemo_regimen, size = tumor_size) %>%
      mutate(data_type = "Historical"),
    all_predictions_df %>% 
      select(patient_id, day, radiation_dose, chemo_regimen, size = predicted_size) %>%
      mutate(data_type = "Predicted")
  )
  
  # Sample patients for clearer visualization
  sample_ids <- sample(unique(combined_data$patient_id), min(10, length(unique(combined_data$patient_id))))
  
  sample_data <- combined_data %>%
    filter(patient_id %in% sample_ids)
  
  # Create interactive plot
  p <- plot_ly(sample_data, x = ~day, y = ~size, color = ~radiation_dose, 
               linetype = ~data_type, type = 'scatter', mode = 'lines+markers',
               text = ~paste("Patient:", patient_id, "<br>Day:", day, 
                             "<br>Tumor size:", round(size, 1), "mm",
                             "<br>Radiation:", radiation_dose,
                             "<br>Chemo:", chemo_regimen)) %>%
    layout(title = "Interactive Tumor Growth Prediction",
           xaxis = list(title = "Day"),
           yaxis = list(title = "Tumor Size (mm)"),
           shapes = list(
             list(type = "line", x0 = days, x1 = days, 
                  y0 = 0, y1 = 1, yref = "paper",
                  line = list(color = "gray", dash = "dash"))
           ))
  
  p
}

6 Biological Factor Analysis

Using additional patient biomedical data may lead to a more holistic modeling and better forecasting.

# Analyze how biological factors affect outcomes
# Create a data frame with one row per patient
patient_summary <- all_patient_data %>%
  filter(day == days) %>%
  select(patient_id, radiation_dose, chemo_regimen, 
         final_tumor_size = tumor_size, 
         local_control, toxicity_outcome,
         all_of(c(biomarker_names, imaging_names, clinical_names)))

# Calculate correlation with tumor size and local control
bio_correlations <- cor(patient_summary[, c("final_tumor_size", "local_control", 
                                          biomarker_names, imaging_names,
                                          "Age", "Performance_Status", 
                                          "Smoking_Pack_Years", "Comorbidity_Score")])

# Extract correlations
size_corr <- bio_correlations["final_tumor_size", ]
size_corr <- size_corr[order(abs(size_corr), decreasing = TRUE)]
size_corr <- size_corr[-1]  # Remove self-correlation

lc_corr <- bio_correlations["local_control", ]
lc_corr <- lc_corr[order(abs(lc_corr), decreasing = TRUE)]
lc_corr <- lc_corr[-1]  # Remove self-correlation

# Create data frames for plotting
size_corr_df <- data.frame(
  factor = names(size_corr),
  correlation = as.numeric(size_corr)
)

lc_corr_df <- data.frame(
  factor = names(lc_corr),
  correlation = as.numeric(lc_corr)
)

# Plot correlations
p1 <- ggplot(size_corr_df[1:10,], aes(x = reorder(factor, correlation), y = correlation)) +
  geom_bar(stat = "identity", aes(fill = correlation > 0)) +
  coord_flip() +
  labs(
    title = "Top 10 Factors Correlated with Final Tumor Size",
    x = "Biological Factor",
    y = "Correlation Coefficient"
  ) +
  theme_minimal() +
  theme(legend.position = "none")

p2 <- ggplot(lc_corr_df[1:10,], aes(x = reorder(factor, correlation), y = correlation)) +
  geom_bar(stat = "identity", aes(fill = correlation > 0)) +
  coord_flip() +
  labs(
    title = "Top 10 Factors Correlated with Local Control",
    x = "Biological Factor",
    y = "Correlation Coefficient"
  ) +
  theme_minimal() +
  theme(legend.position = "none")

print(p1)

print(p2)

# Examine key factors more closely
key_factors <- c("Immune_Score", "Hypoxia", "Proliferation_Index", "Age", "Performance_Status")

# Create plots for key factors vs outcomes
for (factor in key_factors) {
  p <- ggplot(patient_summary, aes_string(x = factor, y = "local_control", color = "radiation_dose")) +
    geom_point(aes(size = ifelse(toxicity_outcome == 1, 3, 1))) +
    geom_smooth(method = "loess", se = FALSE) +
    facet_wrap(~ chemo_regimen) +
    labs(
      title = paste("Effect of", factor, "on Local Control"),
      x = factor,
      y = "Local Control (%)",
      color = "Radiation Dose",
      size = "Toxicity"
    ) +
    theme_minimal()
  
  print(p)
}

6.1 User Parameter Interface for Custom Prediction

# Function to predict outcome for a new patient with user-specified parameters
predict_new_patient <- function(rad_dose_level, chemo_regimen_level, bio_factors, days_to_predict) {
  # Convert factors to required format
  factors <- list(
    biomarkers = bio_factors$biomarkers,
    imaging = bio_factors$imaging,
    clinical = bio_factors$clinical,
    clinical_norm = bio_factors$clinical_norm
  )
  
  # Calculate patient-specific parameters
  params <- calculate_growth_parameters(factors, rad_dose_level, chemo_regimen_level)
  
  # Calculate toxicity
  toxicity <- calculate_toxicity(factors, rad_dose_level, chemo_regimen_level)
  
  # Create prediction days
  pred_days <- 0:days_to_predict
  
  # Generate predictions
  tumor_size <- sapply(pred_days, function(t) logistic_growth(t, params$K, params$r, params$N0))
  
  # Calculate local control
  local_control <- 100 * (1 - (tumor_size / tumor_size[1]) / (params$K / tumor_size[1]))
  local_control <- pmin(100, pmax(0, local_control))
  
  # Create prediction data frame
  pred_df <- data.frame(
    day = pred_days,
    tumor_size = tumor_size,
    local_control = local_control,
    toxicity_prob = toxicity$toxicity_prob,
    rad_dose = rad_dose_level,
    chemo_regimen = chemo_regimen_level
  )
  
  return(list(
    predictions = pred_df,
    params = params,
    toxicity = toxicity
  ))
}

# Example: Set patient parameters 
# You can modify these values for custom predictions
example_bio_factors <- list(
  biomarkers = c(
    Immune_Score = 0.5,
    Hypoxia = -0.2, 
    Proliferation_Index = 0.3,
    Vascular_Density = 0.1,
    PD_L1_Expression = 0.7
  ),
  imaging = c(
    SUV_max = 0.4,
    Tumor_Heterogeneity = 0.2,
    Metabolic_Volume = 0.3
  ),
  clinical = c(
    Age = 65,
    Performance_Status = 1,
    Smoking_Pack_Years = 20,
    Comorbidity_Score = 2
  ),
  clinical_norm = c(
    Age = (65 - 65) / 10,
    Performance_Status = 1 / 2,
    Smoking_Pack_Years = (20 - 30) / 15,
    Comorbidity_Score = 2 / 4
  )
)

# Make prediction for example patient
example_pred <- predict_new_patient("High", "Intensive", example_bio_factors, 90)

# Plot prediction
ggplot(example_pred$predictions, aes(x = day)) +
  geom_line(aes(y = tumor_size, color = "Tumor Size")) +
  geom_line(aes(y = local_control, color = "Local Control")) +
  labs(
    title = "Predicted Outcomes for Example Patient",
    subtitle = paste0("Treatment: ", example_pred$predictions$rad_dose[1], " Radiation + ", 
                      example_pred$predictions$chemo_regimen[1], " Chemotherapy"),
    x = "Day",
    y = "Value",
    color = "Metric"
  ) +
  scale_y_continuous(sec.axis = sec_axis(~., name = "Local Control (%)")) +
  theme_minimal() +
  geom_hline(yintercept = example_pred$toxicity$toxicity_prob * 100, linetype = "dashed", color = "red") +
  geom_text(aes(x = max(day), y = example_pred$toxicity$toxicity_prob * 100 + 5), 
            label = paste0("Toxicity Risk: ", round(example_pred$toxicity$toxicity_prob * 100, 1), "%"),
            hjust = 1, color = "red")

7 Sensitivity Analysis

# Perform sensitivity analysis on key parameters
# Set baseline parameters
baseline_factors <- example_bio_factors

# Parameters to vary
sensitivity_params <- list(
  Immune_Score = seq(-2, 2, by = 0.5),
  Hypoxia = seq(-2, 2, by = 0.5),
  Age = seq(40, 90, by = 10)
)

# Treatment combinations to test
treatments <- list(
  list(rad = "High", chemo = "Intensive"),
  list(rad = "Medium", chemo = "Standard"),
  list(rad = "Low", chemo = "None")
)

# Function to run sensitivity analysis
run_sensitivity <- function(param_name, param_values, baseline_factors, treatments) {
  results <- data.frame()
  
  for (val in param_values) {
    for (t in treatments) {
      # Create modified factors
      mod_factors <- baseline_factors
      
      if (param_name %in% names(mod_factors$biomarkers)) {
        mod_factors$biomarkers[param_name] <- val
      } else if (param_name %in% names(mod_factors$imaging)) {
        mod_factors$imaging[param_name] <- val
      } else if (param_name %in% names(mod_factors$clinical)) {
        mod_factors$clinical[param_name] <- val
        
        # Update normalized value as well
        if (param_name == "Age") {
          mod_factors$clinical_norm["Age"] <- (val - 65) / 10
        } else if (param_name == "Performance_Status") {
          mod_factors$clinical_norm["Performance_Status"] <- val / 2
        } else if (param_name == "Smoking_Pack_Years") {
          mod_factors$clinical_norm["Smoking_Pack_Years"] <- (val - 30) / 15
        } else if (param_name == "Comorbidity_Score") {
          mod_factors$clinical_norm["Comorbidity_Score"] <- val / 4
        }
      }
      
      # Make prediction
      pred <- predict_new_patient(t$rad, t$chemo, mod_factors, 90)
      
      # Extract final values
      final_values <- pred$predictions[pred$predictions$day == 90, ]
      
      # Add to results
      results <- rbind(results, data.frame(
        param_name = param_name,
        param_value = val,
        rad_dose = t$rad,
        chemo_regimen = t$chemo,
        final_tumor_size = final_values$tumor_size,
        final_lc = final_values$local_control,
        toxicity_prob = pred$toxicity$toxicity_prob
      ))
    }
  }
  
  return(results)
}

# Run sensitivity analyses
sensitivity_results <- list()
for (param in names(sensitivity_params)) {
  sensitivity_results[[param]] <- run_sensitivity(param, sensitivity_params[[param]], 
                                                baseline_factors, treatments)
}

# Combine results
all_sensitivity <- do.call(rbind, sensitivity_results)

# Create plots
# Plot for tumor size
ggplot(all_sensitivity, aes(x = param_value, y = final_tumor_size, 
                           color = interaction(rad_dose, chemo_regimen))) +
  geom_line() +
  geom_point() +
  facet_wrap(~ param_name, scales = "free_x") +
  labs(
    title = "Sensitivity Analysis: Effect on Final Tumor Size",
    x = "Parameter Value",
    y = "Final Tumor Size (mm)",
    color = "Treatment"
  ) +
  theme_minimal()

# Plot for local control
ggplot(all_sensitivity, aes(x = param_value, y = final_lc, 
                           color = interaction(rad_dose, chemo_regimen))) +
  geom_line() +
  geom_point() +
  facet_wrap(~ param_name, scales = "free_x") +
  labs(
    title = "Sensitivity Analysis: Effect on Local Control",
    x = "Parameter Value",
    y = "Local Control (%)",
    color = "Treatment"
  ) +
  theme_minimal()

# Plot for toxicity
ggplot(all_sensitivity, aes(x = param_value, y = toxicity_prob * 100, 
                           color = interaction(rad_dose, chemo_regimen))) +
  geom_line() +
  geom_point() +
  facet_wrap(~ param_name, scales = "free_x") +
  labs(
    title = "Sensitivity Analysis: Effect on Toxicity Risk",
    x = "Parameter Value",
    y = "Toxicity Risk (%)",
    color = "Treatment"
  ) +
  theme_minimal()

8 Treatment Optimization

# Function to find optimal treatment for a patient based on their biological factors
optimize_treatment <- function(bio_factors, target_lc = 90, max_toxicity = 0.3) {
  # All treatment combinations
  rad_levels <- names(radiation_doses)
  chemo_levels <- names(chemo_regimens)
  
  all_treatments <- expand.grid(rad = rad_levels, chemo = chemo_levels)
  
  # Evaluate all treatments
  results <- data.frame()
  
  for (i in 1:nrow(all_treatments)) {
    rad <- as.character(all_treatments$rad[i])
    chemo <- as.character(all_treatments$chemo[i])
    
    # Make prediction
    pred <- predict_new_patient(rad, chemo, bio_factors, 90)
    
    # Extract day 90 values
    day90 <- pred$predictions[pred$predictions$day == 90, ]
    
    # Add to results
    results <- rbind(results, data.frame(
      rad_dose = rad,
      chemo_regimen = chemo,
      final_tumor_size = day90$tumor_size,
      final_lc = day90$local_control,
      toxicity_prob = pred$toxicity$toxicity_prob,
      meets_lc = day90$local_control >= target_lc,
      meets_tox = pred$toxicity$toxicity_prob <= max_toxicity
    ))
  }
  
  # Calculate a score based on objectives
  results$score <- with(results, 
                        ifelse(meets_lc & meets_tox, 
                               final_lc - (toxicity_prob * 100), 
                               ifelse(meets_lc, 
                                      final_lc - ((toxicity_prob - max_toxicity) * 500),
                                      ifelse(meets_tox,
                                             final_lc - target_lc - 50,
                                             final_lc - target_lc - ((toxicity_prob - max_toxicity) * 500)))))
  
  # Sort by score
  results <- results[order(-results$score), ]
  
  return(results)
}

# Example: optimize treatment for our example patient
opt_results <- optimize_treatment(example_bio_factors, target_lc = 90, max_toxicity = 0.3)

# Display results table
knitr::kable(opt_results, caption = "Treatment Options Ranked by Score")

Treatment Options Ranked by Score
	rad_dose	chemo_regimen	final_tumor_size	final_lc	toxicity_prob	meets_lc	meets_tox	score
2	Medium	None	80.52673	27.913077	0.3331864	FALSE	FALSE	-78.68011
4	Low	Standard	45.94685	58.868726	0.2176747	FALSE	TRUE	-81.13127
8	Medium	Intensive	10.85027	90.286919	0.6686898	TRUE	FALSE	-94.05800
5	Medium	Standard	17.75897	84.102304	0.4905389	FALSE	FALSE	-101.16717
6	High	Standard	10.85027	90.286919	0.7300220	TRUE	FALSE	-124.72408
1	Low	None	101.11513	9.482494	0.1587520	FALSE	TRUE	-130.51751
7	Low	Intensive	17.75897	84.102304	0.5810302	FALSE	FALSE	-146.41277
3	High	None	45.94685	58.868726	0.5615132	FALSE	FALSE	-161.88789
9	High	Intensive	10.85027	90.286919	0.9166898	TRUE	FALSE	-218.05799

# Visualize the top treatment option
top_treatment <- opt_results[1, ]
top_pred <- predict_new_patient(top_treatment$rad_dose, top_treatment$chemo_regimen, 
                              example_bio_factors, 90)

ggplot(top_pred$predictions, aes(x = day)) +
  geom_line(aes(y = tumor_size, color = "Tumor Size")) +
  geom_line(aes(y = local_control, color = "Local Control")) +
  labs(
    title = "Predicted Outcomes for Optimal Treatment",
    subtitle = paste0("Treatment: ", top_treatment$rad_dose, " Radiation + ", 
                      top_treatment$chemo_regimen, " Chemotherapy",
                      "\nLocal Control: ", round(top_treatment$final_lc, 1), 
                      "%, Toxicity Risk: ", round(top_treatment$toxicity_prob * 100, 1), "%"),
    x = "Day",
    y = "Value",
    color = "Metric"
  ) +
  scale_y_continuous(sec.axis = sec_axis(~., name = "Local Control (%)")) +
  theme_minimal() +
  geom_hline(yintercept = top_treatment$toxicity_prob * 100, linetype = "dashed", color = "red") +
  geom_text(aes(x = max(day), y = top_treatment$toxicity_prob * 100 + 5), 
            label = paste0("Toxicity Risk: ", round(top_treatment$toxicity_prob * 100, 1), "%"),
            hjust = 1, color = "red")

9 Summary

This document has demonstrated a comprehensive simulation of tumor growth under different treatment scenarios, accounting for patient-specific biological factors. The logistic growth model provides a useful framework for predicting treatment outcomes and can be used to optimize treatment strategies for individual patients.

This BPAD section in Chapter 3 is focused on Tumor Growth Simulation and Logistic Growth Modeling accounting for patient-specific biological factors. We show cancer data simulation and modeling of tumor growth under different treatment scenarios (radiation therapy and chemotherapy). There are additional resources that provide complementary materials, e.g., SOCR - DSPA sections on Dynamical Systems and Stochastic Differential Equation Modeling (in R-Julia), Cancer Treatment Stochastic Differential Equation (SDE) Modeling, and the BPAD section on Exponential Growth and Decay.

Specifically, using simulated data, we show that:

Higher radiation doses and more intensive chemotherapy generally lead to better local control but increase toxicity risk.
Patient-specific biological factors significantly influence both treatment efficacy and toxicity.
The logistic growth model allows for reasonably accurate predictions of tumor response to treatment.
Treatment optimization can help identify the best balance between efficacy and toxicity for individual patients.

This model-based approach may be extended to incorporating more complex biological interactions and time-dependent treatment effects. Including fractionation schedules for radiation therapy or incorporating immune response dynamics may also be useful for validating the model with observed clinical data.

Biomedical Physics with Applications to Disease (BPAD)

Chapter 3: Exponential Growth and Decay - Tumor Growth Simulation and Logistic Growth Modeling

BPAD Team + SOCR

March 16, 2025