SOCR ≫ DSPA ≫ DSPA2 Topics ≫

This DSPA Appendix shows examples of dynamic stochastic differential equation (SDE) models of cancer-treatment. The main objective is to develop an advanced generative stochastic differential equation (SDE) model proposing a new approach of cancer-treatment prediction. Ideally, the model should balance the relations between the primary pair of clinical outcome measures - intensity of the cancer-treatment intervention (by surgery, chemotherapy and radiation, manifesting as tumor local-control, \(LC\)) and side-effects (\(RP2\)), radiation pneumonitis grade 2 or higher, a measure of significant side effect in radiation therapy, see DOI: 10.1259/bjr.20220239.

Review the earlier Volterra-Lotka preditor-pray model and Hodgkin-Huxley (HH) model of neuronal neuron action potential propagation, which provide an oversimplified motivation for building a more advanced SDE cancer-model, which includes many controllable parameters accounting for the univariate distributions of dozens of observable clinical variables (biomarkers) and multivariate relations. The model (hyper)parameters can be tuned by Bayesian optimization using very sparsely sampled longitudinal clinical cancer-treatment data that includes the temporal dynamics of \(LC\) and \(RP2\) clinical outcomes from the point of initial diagnosis, through cancer-treatment intervention using alternative treatment strategies, to the final outcome stages. We’ll include a simulation of the proposed model and visualize SDE model instantiations that shows the time dynamics of \(LC\) and \(RP2\).

This advanced stochastic differential equation (SDE) model to predict and simulate cancer-treatment outcomes will track

  1. Local Control (LC) of the tumor (a proxy for treatment intensity/efficacy),
  2. Toxicity or side effects (RP2),
  3. Additional clinical/biological variables (e.g., biomarkers, imaging signals, performance status), and
  4. Patient survival or other endpoints.

Such a generative SDE framework can be viewed either as a Lotka-Volterra extension, where \(LC\) and \(RP2\) interact like predator-prey, or as an Hodgkin-Huxley-like system (with gating-type variables controlling “firing” or “flare-up” episodes of toxicity vs. tumor response). The key is to incorporate both deterministic dynamics (e.g., standard dose-response models) and stochastic components, e.g., random fluctuations reflecting inter-patient heterogeneity, measurement noise, or temporal variability in a patient’s biology.

The following R chunk includes knitr-specific parallel configuration specification.

library(knitr)

library(knitr)
knitr::opts_chunk$set(
  cache = FALSE,        # Disable caching for parallel code
  autodep = FALSE,      # Disable automatic dependency detection
  message = FALSE,      # Suppress messages
  warning = FALSE       # Suppress warnings
)

0.1 SDE Model Design

Let’s denote the state variables by \[\mathbf{X}(t) = \begin{pmatrix} LC(t) \\ RP2(t) \\ \mathbf{B}(t) \end{pmatrix},\] where \(LC(t)\) measures local tumor control (e.g., size reduction, imaging readouts), \(RP2(t)\) measures radiation- or drug-related side effects or toxicity, and \(\mathbf{B}(t)\) is a vector of relevant biomarkers or other continuous clinical variables (e.g., lab values, immunological markers), potentially of dimension 10-30 depending on data availability.

The treatment inputs can be denoted by \(\mathbf{u}(t)\) the treatment inputs (e.g., dose schedules for surgery/chemotherapy/radiation, or immunotherapy cycles). These external controls can drive the tumor and/or side-effect trajectories.

Next, we need to formulate the time evolution of the SDE model

\[d \mathbf{X}(t) = \mathbf{f}\bigl(\mathbf{X}(t), \mathbf{u}(t), \boldsymbol{\theta}\bigr)\,dt \;+\; \mathbf{g}\bigl(\mathbf{X}(t), \mathbf{u}(t), \boldsymbol{\theta}\bigr)\,d\mathbf{W}(t),\] where \(\mathbf{f}(\cdot)\) is the drift (deterministic) part, modeling how tumor control and toxicity progress on average, \(\mathbf{g}(\cdot)\) encodes the diffusion (noise) part, allowing for random fluctuations in responses and side effects, \(\boldsymbol{\theta}\) is a parameter set controlling tumor growth rates, clearance rates, synergy between multiple therapies, and so on, and \(\mathbf{W}(t)\) is a vector Wiener process (Brownian motion).

Using intuition from the Lotka-Volterra model, we may treat \(LC\) as a “prey” variable (grows unless suppressed) and \(RP2\) as a “predator” (side effects escalate and feed off continuing treatment). Let’s consider a possible drift in the form

\[\begin{aligned} dLC &= \Bigl[\alpha\,LC - \beta\,LC \cdot RP2 \;-\;\phi_{\text{decay}}\,LC \Bigr] dt \;+\; \sigma_{LC}(LC,RP2)\, dW_1,\\[6pt] dRP2 &= \Bigl[\gamma\,LC \cdot RP2 - \delta\,RP2 \;-\;\psi_{\text{treatment}}\, \mathbf{u}(t)\,RP2 \Bigr] dt \;+\; \sigma_{RP2}(LC,RP2)\, dW_2,\\[6pt] d\mathbf{B} &= \mathbf{F}_{\mathbf{B}}(\mathbf{B}, LC, RP2, \mathbf{u})\,dt \;+\; \boldsymbol{\Sigma}_{\mathbf{B}}(\mathbf{B},LC,RP2)\,d\mathbf{W}_{\mathbf{B}}, \end{aligned}\] where \(\alpha, \beta, \gamma, \delta, \dots\) are parameters, and \(\sigma_{LC}, \sigma_{RP2}, \boldsymbol{\Sigma}_{\mathbf{B}}\) are noise intensities (possibly state-dependent).

Similarly to the Hodgkin-Huxley model, \(LC\) is analogous to a “membrane voltage,” while \(RP2\) (and possibly some gating variables) evolves more slowly, controlling when toxicity “fires.” The drift terms would mimic the classic HH model formulation:

\[\begin{aligned} d LC &= \text{(nonlinear function of LC, gating vars, u(t))}\,dt \;+\; \ldots,\\ d RP2 &= \text{(slower gating dynamics, driven by LC transitions)}\,dt \;+\; \ldots \end{aligned}.\]

Such a SDE model captures bursts of toxicity or short-lived flare-ups.

The (hyper)parameter vector estimation can be achieved by using Bayesian optimization. Because clinical data are often limited and noisy, we can use Bayesian methods (e.g., approximate Bayesian computation in Julia, Stan, or specialized packages) or utilize Bayesian optimization to tune the parameter vector \(\boldsymbol{\theta}\).

In clinical setting, naturally sparse sampling (few time points per patient) is handled by fitting population-level priors, then refining posteriors for subgroups or individuals. We can define a likelihood linking the SDE outputs to observed data (LC, toxicities, biomarkers). Markov chain Monte Carlo (MCMC) or gradient-based Bayesian approaches can then infer the posterior distribution for \(\boldsymbol{\theta}\).

1 R Implementation

1.1 Simplified 2D SDE Modeling

We’ll start with a simpler first-generation SDE 2D model, using the snssde2d() function for the \(LC\)-\(RP2\) subsystem (as higher-dimensional SDEs require specialized handling in R). We’ll focus just on \(LC\) and \(RP2\) dynamics, with treatment as an external parameter.

The state variables are \[\mathbf{X}(t) = \begin{pmatrix} LC(t) \\ RP2(t) \end{pmatrix}. \]

The 2D SDE model system is \[\begin{aligned} dLC &= \left[\alpha LC \left(1 - \frac{LC}{K}\right) - \beta LC \cdot RP2 - \phi_{\text{treat}} u(t) \right] dt + \sigma_{LC} LC \, dW_1, \\ dRP2 &= \left[\gamma LC \cdot RP2 - \delta RP2 + \psi_{\text{treat}} u(t) \right] dt + \sigma_{RP2} \sqrt{RP2} \, dW_2. \end{aligned}\]

library(Sim.DiffProc)
library(plotly)

# Parameters
alphaP <- 0.3    # Tumor growth rate
KP <- 1.5        # Tumor carrying capacity
betaP <- 0.02    # Toxicity suppression rate
gammaP <- 0.015  # Toxicity activation rate
deltaP <- 0.4    # Toxicity clearance rate
phi_treat <- 0.1 # Treatment efficacy on LC
psi_treat <- 0.05 # Treatment-induced toxicity
sigma_LC <- 0.05 # Noise for LC
sigma_RP2 <- 0.1 # Noise for RP2

# Treatment schedule (example: weekly cycles)
u_t <- function(t) {
  ifelse(t %% 7 < 2, 1.0, 0.0)  # Treatment active 2 days/week
}

# Define drift and diffusion for snssde2d()
fx <- expression(
  alphaP * x * (1 - x/KP) - betaP * x * y - phi_treat * 1.0, # u(t) simplified to 1 when active
  gammaP * x * y - deltaP * y + psi_treat * 1.0
)

gx <- expression(
  sigma_LC * x,
  sigma_RP2 * sqrt(y)
)

# Simulate
set.seed(123)
mod <- snssde2d(
  drift = fx,
  diffusion = gx,
  x0 = c(x = 0.8, y = 0.1),
  t0 = 0,
  T = 60,
  N = 500,
  M = 50
)

# Visualize
fig <- plot_ly() 
for (i in 1:5) {  # Plot first 5 paths
  fig <- fig %>%
    add_trace(x = mod$time, y = mod$X[, i], name = paste("LC", i),
              type = "scatter", mode = "lines", line = list(color = "blue")) %>%
    add_trace(x = mod$time, y = mod$Y[, i], name = paste("RP2", i),
              line = list(color = "red"), yaxis = "y2")
}
fig <- fig %>% layout(
  title = "LC (Tumor Control) vs. RP2 (Toxicity)",
  yaxis2 = list(overlaying = "y", side = "right", title = "RP2"),
  xaxis = list(title = "Time (Days)")
)
fig

To match the existing R package capabilities, in this simplified 2D model the state space only covers \(LC\) and \(RP2\), but excludes the vector of biomarkers \(B\). We’ll revisit this later. The treatment schedule uses treatment cycles directly in the drift terms; phi_treat * 1.0 assumes full dose when active. For time-dependent u(t), we should precompute its values over the time grid and reference them in the drift expressions. An extension accounting for observable vectors of biomarkers, which are omitted in the 2D SDE model, may require simulating each biomarker as a 1D SDE or use alternative packages like pomp.

Below we show the protocol for R-based parameter estimation using rBayesianOptimization.

library(rBayesianOptimization)

# Simulate observed data (ground truth)
observed_data <- data.frame(
  time = seq(from=mod$t0, to=mod$T, by=mod$Dt),
  LC = rowMeans(mod$X),
  RP2 = rowMeans(mod$Y)
)

# Objective function to minimize MSE
objective_func <- function(alphaP, betaP, gammaP) {
  sim_mod <- snssde2d(
    drift = expression(
      alphaP * x * (1 - x/1.5) - betaP * x * y - 0.1,
      gammaP * x * y - 0.4 * y + 0.05
    ),
    diffusion = expression(0.05*x, 0.1*sqrt(y)),
    x0 = c(0.8, 0.1),
    t0 = 0,
    T = 60,
    N = 500
  )
  mse <- mean((sim_mod$X - observed_data$LC)^2 + (sim_mod$Y - observed_data$RP2)^2)
  return(list(Score = -mse))
}

# Bayesian optimization
opt_result <- BayesianOptimization(
  objective_func,
  bounds = list(
    alphaP = c(0.1, 0.5),
    betaP = c(0.01, 0.05),
    gammaP = c(0.01, 0.03)
  ),
  init_points = 5,
  n_iter = 20
)
## elapsed = 0.02   Round = 1   alphaP = 0.340964   betaP = 0.04482155  gammaP = 0.02975993 Value = -0.01000155 
## elapsed = 0.02   Round = 2   alphaP = 0.1091407  betaP = 0.01837854  gammaP = 0.02581208 Value = -0.01493705 
## elapsed = 0.00   Round = 3   alphaP = 0.428225   betaP = 0.03315608  gammaP = 0.01017334 Value = -0.005088571 
## elapsed = 0.00   Round = 4   alphaP = 0.1146278  betaP = 0.02775184  gammaP = 0.02165458 Value = -0.005338672 
## elapsed = 0.00   Round = 5   alphaP = 0.1940198  betaP = 0.0246884   gammaP = 0.01649646 Value = -0.007238786 
## elapsed = 0.02   Round = 6   alphaP = 0.4813248  betaP = 0.03072383  gammaP = 0.02149576 Value = -0.1544111 
## elapsed = 0.01   Round = 7   alphaP = 0.1550263  betaP = 0.03071254  gammaP = 0.02586016 Value = -0.009795527 
## elapsed = 0.00   Round = 8   alphaP = 0.1000 betaP = 0.0500  gammaP = 0.0100 Value = -0.04291667 
## elapsed = 0.02   Round = 9   alphaP = 0.3880017  betaP = 0.02188004  gammaP = 0.01164873 Value = -0.01688399 
## elapsed = 0.00   Round = 10  alphaP = 0.1963528  betaP = 0.01123368  gammaP = 0.0300 Value = -0.007565522 
## elapsed = 0.00   Round = 11  alphaP = 0.282257   betaP = 0.04799122  gammaP = 0.01532491 Value = -0.009646454 
## elapsed = 0.00   Round = 12  alphaP = 0.245288   betaP = 0.04974854  gammaP = 0.02067502 Value = -0.02056674 
## elapsed = 0.00   Round = 13  alphaP = 0.2989052  betaP = 0.0500  gammaP = 0.0100 Value = -0.01580561 
## elapsed = 0.00   Round = 14  alphaP = 0.1000 betaP = 0.0500  gammaP = 0.0300 Value = -0.02294954 
## elapsed = 0.02   Round = 15  alphaP = 0.1000 betaP = 0.0500  gammaP = 0.01689369 Value = -0.02679663 
## elapsed = 0.00   Round = 16  alphaP = 0.5000 betaP = 0.0100  gammaP = 0.0300 Value = -0.006871026 
## elapsed = 0.00   Round = 17  alphaP = 0.5000 betaP = 0.03723532  gammaP = 0.0100 Value = -0.02225222 
## elapsed = 0.01   Round = 18  alphaP = 0.1854054  betaP = 0.02100875  gammaP = 0.01888987 Value = -0.009016901 
## elapsed = 0.00   Round = 19  alphaP = 0.2725012  betaP = 0.02431722  gammaP = 0.0274043  Value = -0.01257695 
## elapsed = 0.02   Round = 20  alphaP = 0.2398215  betaP = 0.03820745  gammaP = 0.01359014 Value = -0.01843583 
## elapsed = 0.02   Round = 21  alphaP = 0.5000 betaP = 0.03124543  gammaP = 0.02835901 Value = -0.01552359 
## elapsed = 0.02   Round = 22  alphaP = 0.1661688  betaP = 0.03703741  gammaP = 0.02313721 Value = -0.01509109 
## elapsed = 0.00   Round = 23  alphaP = 0.4314345  betaP = 0.01653852  gammaP = 0.02955281 Value = -0.009896026 
## elapsed = 0.01   Round = 24  alphaP = 0.1721756  betaP = 0.02145591  gammaP = 0.02817582 Value = -0.01390439 
## elapsed = 0.00   Round = 25  alphaP = 0.1000 betaP = 0.02164099  gammaP = 0.01373757 Value = -0.01346684 
## 
##  Best Parameters Found: 
## Round = 3    alphaP = 0.428225   betaP = 0.03315608  gammaP = 0.01017334 Value = -0.005088571
print(opt_result$Best_Par)
##     alphaP      betaP     gammaP 
## 0.42822499 0.03315608 0.01017334
# Set the 3 parameters to the BO estimates and plot 50 SDE Model solutions
alphaP <- opt_result$Best_Par[[1]]    # Tumor growth rate
betaP  <- opt_result$Best_Par[[2]]    # Toxicity suppression rate
gammaP <- opt_result$Best_Par[[3]]  # Toxicity activation rate

# Define drift and diffusion for snssde2d()
fx <- expression(
  alphaP * x * (1 - x/KP) - betaP * x * y - phi_treat * 1.0, # u(t) simplified to 1 when active
  gammaP * x * y - deltaP * y + psi_treat * 1.0
)

gx <- expression(
  sigma_LC * x,
  sigma_RP2 * sqrt(y)
)

# Simulate
set.seed(123)
mod <- snssde2d(
  drift = fx,
  diffusion = gx,
  x0 = c(x = 0.8, y = 0.1),
  t0 = 0,
  T = 60,
  N = 500,
  M = 50
)

# Visualize
fig <- plot_ly() 
for (i in 6:10){    # 1:mod$M) {  # Plot just 5 paths or all of them
  fig <- fig %>%
    add_trace(x = mod$time, y = mod$X[, i], name = paste("LC", i),
              type = "scatter", mode = "lines", line = list(color = "blue")) %>%
    add_trace(x = mod$time, y = mod$Y[, i], name = paste("RP2", i),
              line = list(color = "red"), yaxis = "y2")
}
fig <- fig %>% layout(
  title = "LC (Tumor Control) vs. RP2 (Toxicity)",
  yaxis2 = list(overlaying = "y", side = "right", title = "RP2"),
  xaxis = list(title = "Time (Days)")
)
fig

For higher dimensional SDE models with biomarkers, use sde.sim() from the sde package or switch to Julia’s DifferentialEquations.jl. In these situations, the time-dependent parameters should precompute the treatment schedules as vectors and reference them in drift/diffusion terms using approxfun(). The parameter estimate validation may require holding-out data or synthetic benchmarking.

1.2 General High-Dimensional (\(nD\)) SDE Modeling

Let’s build a stochastic differential equation (SDE) model that captures the dynamic interplay between tumor local control (LC), treatment-related side effects (RP2), and biomarkers (\(B\)). The model extends the Lotka-Volterra framework with treatment inputs and biomarker dynamics. Consider the triple of state variables (outcomes + inputs)

\[\mathbf{X}(t) = \begin{pmatrix} LC(t) \\ RP2(t) \\ \mathbf{B}(t) \end{pmatrix}, \quad \text{where the hyperparameter vector } \mathbf{B}(t) \in \mathbb{R}^k \text{ (e.g., } k=10\text{ biomarkers)}.\]

The analytical representation of the SDE system is \[\begin{aligned} dLC &= \left[\alpha LC \left(1 - \frac{LC}{K}\right) - \beta LC \cdot RP2 - \phi_{\text{treat}} u(t) LC \right] dt + \sigma_{LC} LC \, dW_1, \\ dRP2 &= \left[\gamma LC \cdot RP2 - \delta RP2 + \psi_{\text{treat}} u(t) \right] dt + \sigma_{RP2} \sqrt{RP2} \, dW_2, \\ dB_i &= \left[\mu_i(B_i) + \eta_i LC - \nu_i RP2 \right] dt + \sigma_{B_i} \, dW_{B_i}, \quad i=1,\dots,k, \end{aligned}\] which depends on the following parameters

  • \(\alpha, K\): Tumor growth rate and carrying capacity.
  • \(\beta\): Toxicity-driven tumor suppression.
  • \(\phi_{\text{treat}}\): Treatment efficacy on LC.
  • \(\gamma, \delta\): Toxicity activation and clearance.
  • \(\psi_{\text{treat}}\): Treatment-induced toxicity.
  • \(\mu_i, \eta_i, \nu_i\): Biomarker drift terms.
  • \(\sigma\): Noise intensities.

Here’s a comprehensive solution using Julia’s DifferentialEquations.jl package for high-dimensional SDE modeling. For specificity, we’ll implement a \(5D\) system (\(LC\), \(RP2\) and a \(3\)-component biomarkers vector). This workflow requires Julia \(V\ge 1.9\) from julialang.org.

# R-block of code
# The first time - install these libs
#
# install.packages("JuliaCall")
# install.packages("JuliaConnectoR")
# library(JuliaCall)
# install_julia()  # This is needed just once to install Julia on the system!!!
# install.packages("XRJulia")  # Install the package if not already installed

# Install required R packages
# install.packages(c("JuliaCall", "plotly", "reticulate"))
library(JuliaCall)

Sys.setenv(JULIA_NUM_THREADS = 11)
julia <- julia_setup()  # This will start Julia with 10 threads

# findJulia()  # This will locate the Julia installation
# julia <- JuliaSession()  # Start Julia and assign the session to the `julia` object

# Check that the julia object is properly initialized by running a simple Julia command:
julia_command("1 + 1")  #  # Should return 2
## 2
message(paste0("The (Julia) result of this command, \'julia_command(\"1 + 1\")\', is ", julia_command("1 + 1"), "!\n"))
# Verify the number of threads in Julia:
julia_command("using Base.Threads; Threads.nthreads() = 10")  # Manually set threads
# Should return 10
message(paste0("The Number of Cores Julia starts with is: ", julia_eval("Threads.nthreads()"), "!\n"))  
# julia_command("using Base.Threads; Threads.nthreads()")  # Should return 10
# julia_command("ENV[\"JULIA_NUM_THREADS\"]")  # Should return "10"
# JuliaConnectoR::stopJulia()     # Terminate the Julia session 
#### JUST ONE TIME!!!!
##   **1.3 Julia Packages (Run in Julia REPL)**
# using Pkg
# Pkg.add(["DifferentialEquations", "ModelingToolkit", "SciMLBase", "Plots", "CSV", "HDF5"])

Side-note on Computational Inefficiencies: In this case, the \(5\) state variables include \(2\) outcomes and \(3\) biomarkers. Of course, there could be additional observables included in a more complex SDE model, which depends on a mixture of biospecimen, demographic, genetic, phenotypic and imaging descriptors. In principle, we should consider pre-processing the high-dimensional biomarkers including

  • Introduce a hierarchical clustering of biomarkers by type, e.g., clinical, biospecimen, ’omics (gene-transcriptome), phenotypic, imaging, text, cellular, etc., and
  • Using DSPA linear and non-linear dimentionality reduction transform the complex web of biomarkers into two to three salient-features within each biomarker type,
  • Normalize the salient-features within and between biomarker-types.

Such pre-processing will boost the energy of the vital (ensemble) biomarkers within each type, and will have the added benefit of ensuring the computational feasibility of solving the complex SDE modeling in reasonable time (working with \(\approx 5\) to \(25\) salient biomarkers). In this simulation example, we use a state space composed of

\[\mathbf{X}(t) = \begin{pmatrix} LC(t) \\ RP2(t) \\ B_1(t) \\ B_2(t) \\ B_3(t) \end{pmatrix}\]

anddefine the 5D SDE model

\[\begin{aligned} dLC &= \left[\alpha LC \left(1 - \frac{LC}{K}\right) - \beta LC \cdot RP2 - \phi u(t) \right]dt + \sigma_{LC} LC \, dW_1 \\ dRP2 &= \left[\gamma LC \cdot RP2 - \delta RP2 + \psi u(t) \right]dt + \sigma_{RP2} \sqrt{RP2} \, dW_2 \\ dB_1 &= \left[\mu_1(B_{1,\text{eq}} - B_1) + \eta_1 LC \right]dt + \sigma_{B1} \, dW_3 \\ dB_2 &= \left[\mu_2(B_{2,\text{eq}} - B_2) - \eta_2 RP2 \right]dt + \sigma_{B2} \, dW_4 \\ dB_3 &= \left[\mu_3(B_{3,\text{eq}} - B_3) + \eta_3 LC \cdot RP2 \right]dt + \sigma_{B3} \, dW_5 \end{aligned} .\]

# sde_analysis.R
library(JuliaCall)
library(plotly)
# # For HDF5 file handling
# if (!require("BiocManager", quietly = TRUE))
#     install.packages("BiocManager")
# BiocManager::install("rhdf5")
library(rhdf5)

# Initialize Julia
julia_setup(installJulia = TRUE)
julia_library("HDF5")
# Set the working directory
setwd("C:/Users/IvoD/Desktop/Ivo.dir/Research/UMichigan/Publications_Books/2023/DSPA_Springer_2nd_Edition_2023/Rmd_HTML/appendix/")

# Run Julia simulation
julia_source("./DSPA_Appendix19_SDE_JuliaChunk1/DSPA_Appendix19_SDE_JuliaChunk1.jl")

# Load HDF5 with explicit dimensions
h5_data <- h5read("./DSPA_Appendix19_SDE_JuliaChunk1/cancer_SDE_trajectories.h5", "/")
time_points <- h5_data$time_points
trajectory_ids <- 1:(dim(h5_data$trajectories)[3])   # 100 trajectories
dims <- as.integer(h5_data$dims)  # [variables=5, time=601, trajectories=100]

# Create individual surfaces
fig <- plot_ly(showscale = FALSE) %>%
  add_surface(
    x = time_points, y = trajectory_ids, z = h5_data$trajectories[1,,],
    name = "LC", colorscale = "Viridis", visible = TRUE
  ) %>%
  add_surface(
    x = time_points, y = trajectory_ids, z = h5_data$trajectories[2,,],
    name = "RP2", colorscale = "Hot", visible = FALSE
  ) %>%
  add_surface(
    x = time_points, y = trajectory_ids, z = h5_data$trajectories[3,,],
    name = "B1", colorscale = "Greens", visible = FALSE
  ) %>%
  add_surface(
    x = time_points, y = trajectory_ids, z = h5_data$trajectories[4,,],
    name = "B2", colorscale = "Reds", visible = FALSE
  ) %>%
  add_surface(
    x = time_points, y = trajectory_ids, z = h5_data$trajectories[5,,],
    name = "B3", colorscale = "Blues", visible = FALSE
  )

# Create dropdown menu with PROPERLY STRUCTURED BUTTONS
fig <- fig %>% layout(
  title = "Cancer Treatment Dynamics - 3D Trajectory Analysis\n [variables=5, timepoints=601, trajectories=100]",
  scene = list(
    xaxis = list(title = "Time (days)"),
    yaxis = list(title = "Trajectory ID"),
    zaxis = list(title = "Value"),
    camera = list(eye = list(x = 1.8, y = -1.8, z = 0.8))
  ),
  updatemenus = list(
    list(
      type = "dropdown",
      x = 1.2,
      y = 0.8,
      buttons = list(
        list(method = "restyle",
             args = list("visible", list(TRUE, FALSE, FALSE, FALSE, FALSE)),
             label = "LC"),
        list(method = "restyle",
             args = list("visible", list(FALSE, TRUE, FALSE, FALSE, FALSE)),
             label = "RP2"),
        list(method = "restyle",
             args = list("visible", list(FALSE, FALSE, TRUE, FALSE, FALSE)),
             label = "B1"),
        list(method = "restyle",
             args = list("visible", list(FALSE, FALSE, FALSE, TRUE, FALSE)),
             label = "B2"),
        list(method = "restyle",
             args = list("visible", list(FALSE, FALSE, FALSE, FALSE, TRUE)),
             label = "B3"),
        list(method = "restyle",
             args = list("visible", list(TRUE, TRUE, TRUE, TRUE, TRUE)),
             label = "All")
      )
    )
  )
)

# Show composite 3D plot with all SDE-generated trajectories -- across time and instantiations
fig

For hyper-parameter estimation, we can use Bayesian Optimization (BO) in Julia where we define the function sde_cost(params).

Below, we run the Bayesian optimization in R invoking the Julia function sde_cost(params).

library(rBayesianOptimization)
library(plotly)
library(dplyr)

# Load observed data (ASSUME: trajectory 1 as the ground truth)
trajectories <- h5_data$trajectories  # [variables=5, time=601, trajectories=100]
observed_data <- trajectories[, , 1]
# Pass observed data to Julia
julia_assign("r_observed_data", observed_data)

# Define optimization bounds
bounds <- list(
  alphaP = c(0.1, 0.5),
  betaP = c(0.01, 0.05),
  gammaP = c(0.01, 0.03),
  deltaP = c(0.2, 0.6)
)

# Optimization wrapper that passes the observed data
obj_func <- function(alphaP, betaP, gammaP, deltaP) {
  # Clamp parameters to valid ranges
  alphaP <- pmax(0.1, pmin(alphaP, 0.5))
  betaP <- pmax(0.01, pmin(betaP, 0.05))
  gammaP <- pmax(0.01, pmin(gammaP, 0.03))
  deltaP <- pmax(0.2, pmin(deltaP, 0.6))
  
  # cost <- julia_eval(sprintf("sde_cost([%f, %f, %f, %f], r_observed_data)", 
  #                           alphaP, betaP, gammaP, deltaP))
  tryCatch({
    cost <- julia_eval(sprintf("
      try
        sde_cost([%.6f, %.6f, %.6f, %.6f], r_observed_data)
      catch e
        Inf  # Return infinite cost for failed simulations
      end
    ", alphaP, betaP, gammaP, deltaP))
    
    list(Score = -cost)  # Negative for maximization
    
  }, error = function(e) {
    list(Score = -1e6)  # Large penalty for R-level errors
  })
  
  return(list(Score = -cost))  # Negative for maximization
}

# Run optimization
set.seed(123)
opt_result <- BayesianOptimization(
  obj_func,
  bounds = bounds,
  init_points = 10,
  n_iter = 50,
  acq = "ucb",
  kappa = 2.576,  # Higher exploration
  eps = 0.01,     # Reduced random jump
  verbose = TRUE
)
## elapsed = 1.72   Round = 1   alphaP = 0.215031   betaP = 0.04827333  gammaP = 0.02779079 deltaP = 0.5852097  Value = -76.90529 
## elapsed = 0.00   Round = 2   alphaP = 0.4153221  betaP = 0.02813337  gammaP = 0.02385607 deltaP = 0.5609196  Value = -21.14316 
## elapsed = 0.00   Round = 3   alphaP = 0.2635908  betaP = 0.03710283  gammaP = 0.02281014 deltaP = 0.4762821  Value = -45.38897 
## elapsed = 0.00   Round = 4   alphaP = 0.453207   betaP = 0.03290534  gammaP = 0.0298854  deltaP = 0.518187   Value = -18.99103 
## elapsed = 0.00   Round = 5   alphaP = 0.4761869  betaP = 0.01411699  gammaP = 0.02311412 deltaP = 0.2098455  Value = -23.00838 
## elapsed = 0.11   Round = 6   alphaP = 0.1182226  betaP = 0.045993    gammaP = 0.02417061 deltaP = 0.3911184  Value = -186919.3649 
## elapsed = 0.00   Round = 7   alphaP = 0.3112422  betaP = 0.01984351  gammaP = 0.02088132 deltaP = 0.5033838  Value = -23.31306 
## elapsed = 0.00   Round = 8   alphaP = 0.4569676  betaP = 0.01168238  gammaP = 0.02188284 deltaP = 0.2865632  Value = -29.35837 
## elapsed = 0.00   Round = 9   alphaP = 0.320574   betaP = 0.02311683  gammaP = 0.01578319 deltaP = 0.3272724  Value = -17.17768 
## elapsed = 0.00   Round = 10  alphaP = 0.2826459  betaP = 0.04818015  gammaP = 0.01294227 deltaP = 0.2926503  Value = -15.23521 
## elapsed = 0.00   Round = 11  alphaP = 0.5000 betaP = 0.0142415   gammaP = 0.0100 deltaP = 0.2077148  Value = -30.83737 
## elapsed = 0.00   Round = 12  alphaP = 0.2308818  betaP = 0.04683643  gammaP = 0.0100 deltaP = 0.2060597  Value = -68.11251 
## elapsed = 0.00   Round = 13  alphaP = 0.371171   betaP = 0.01051178  gammaP = 0.0300 deltaP = 0.5748111  Value = -21.96428 
## elapsed = 0.00   Round = 14  alphaP = 0.3461219  betaP = 0.04715575  gammaP = 0.02986649 deltaP = 0.2224682  Value = -22.87684 
## elapsed = 0.00   Round = 15  alphaP = 0.2208822  betaP = 0.0130287   gammaP = 0.02873902 deltaP = 0.3309868  Value = -48.77285 
## elapsed = 0.00   Round = 16  alphaP = 0.491584   betaP = 0.04879006  gammaP = 0.02967303 deltaP = 0.5988503  Value = -31.58045 
## elapsed = 0.00   Round = 17  alphaP = 0.4701203  betaP = 0.01557945  gammaP = 0.02550863 deltaP = 0.273801   Value = -30.84854 
## elapsed = 0.00   Round = 18  alphaP = 0.4570762  betaP = 0.04019579  gammaP = 0.02489288 deltaP = 0.447653   Value = -16.88655 
## elapsed = 0.00   Round = 19  alphaP = 0.4612464  betaP = 0.01701273  gammaP = 0.02550336 deltaP = 0.5285232  Value = -27.03785 
## elapsed = 0.00   Round = 20  alphaP = 0.3978904  betaP = 0.04734292  gammaP = 0.0198141  deltaP = 0.2229915  Value = -19.96604 
## elapsed = 0.00   Round = 21  alphaP = 0.4439146  betaP = 0.01228579  gammaP = 0.01140796 deltaP = 0.2449235  Value = -22.92839 
## elapsed = 0.00   Round = 22  alphaP = 0.4629101  betaP = 0.0100  gammaP = 0.01002825 deltaP = 0.2000 Value = -22.36557 
## elapsed = 0.02   Round = 23  alphaP = 0.4609483  betaP = 0.03425636  gammaP = 0.02519469 deltaP = 0.5935028  Value = -21.29726 
## elapsed = 0.00   Round = 24  alphaP = 0.4700 betaP = 0.04686789  gammaP = 0.0100 deltaP = 0.5807658  Value = -22.79267 
## elapsed = 0.00   Round = 25  alphaP = 0.4486144  betaP = 0.0200918   gammaP = 0.01726894 deltaP = 0.4609431  Value = -19.72584 
## elapsed = 0.00   Round = 26  alphaP = 0.4341278  betaP = 0.03676414  gammaP = 0.02499253 deltaP = 0.2732177  Value = -16.13289 
## elapsed = 0.00   Round = 27  alphaP = 0.4390514  betaP = 0.04399499  gammaP = 0.01291989 deltaP = 0.3598945  Value = -15.80081 
## elapsed = 0.02   Round = 28  alphaP = 0.4480798  betaP = 0.03872926  gammaP = 0.02300004 deltaP = 0.5862674  Value = -20.06049 
## elapsed = 0.00   Round = 29  alphaP = 0.4342713  betaP = 0.03085571  gammaP = 0.02724047 deltaP = 0.4000158  Value = -30.23223 
## elapsed = 0.00   Round = 30  alphaP = 0.442048   betaP = 0.04065702  gammaP = 0.0175245  deltaP = 0.4509451  Value = -20.32392 
## elapsed = 0.00   Round = 31  alphaP = 0.3901216  betaP = 0.02522547  gammaP = 0.01357014 deltaP = 0.2274649  Value = -24.37757 
## elapsed = 0.00   Round = 32  alphaP = 0.3401031  betaP = 0.0102083   gammaP = 0.01702817 deltaP = 0.5250943  Value = -18.34195 
## elapsed = 0.00   Round = 33  alphaP = 0.4486582  betaP = 0.01425387  gammaP = 0.02695614 deltaP = 0.3405165  Value = -25.06427 
## elapsed = 0.00   Round = 34  alphaP = 0.4637497  betaP = 0.0500  gammaP = 0.02389084 deltaP = 0.6000 Value = -24.42161 
## elapsed = 0.00   Round = 35  alphaP = 0.3287772  betaP = 0.04630848  gammaP = 0.0257891  deltaP = 0.575814   Value = -30.10731 
## elapsed = 0.00   Round = 36  alphaP = 0.4463788  betaP = 0.0215876   gammaP = 0.01759989 deltaP = 0.4094292  Value = -20.44675 
## elapsed = 0.00   Round = 37  alphaP = 0.4341074  betaP = 0.01657833  gammaP = 0.01893487 deltaP = 0.4797959  Value = -24.93446 
## elapsed = 0.00   Round = 38  alphaP = 0.4635344  betaP = 0.03432802  gammaP = 0.01469921 deltaP = 0.3686987  Value = -28.7696 
## elapsed = 0.00   Round = 39  alphaP = 0.4307495  betaP = 0.02935503  gammaP = 0.01400023 deltaP = 0.3273029  Value = -24.18917 
## elapsed = 0.00   Round = 40  alphaP = 0.4412624  betaP = 0.01587121  gammaP = 0.02570147 deltaP = 0.3946027  Value = -29.50909 
## elapsed = 0.02   Round = 41  alphaP = 0.4515956  betaP = 0.03044777  gammaP = 0.02933891 deltaP = 0.2490016  Value = -20.22391 
## elapsed = 0.00   Round = 42  alphaP = 0.4551599  betaP = 0.02449809  gammaP = 0.0103355  deltaP = 0.5203634  Value = -22.12629 
## elapsed = 0.00   Round = 43  alphaP = 0.2982001  betaP = 0.04470715  gammaP = 0.02013553 deltaP = 0.3982358  Value = -24.94021 
## elapsed = 0.00   Round = 44  alphaP = 0.4099627  betaP = 0.03765237  gammaP = 0.01334057 deltaP = 0.4184433  Value = -19.82845 
## elapsed = 0.02   Round = 45  alphaP = 0.4089671  betaP = 0.02181249  gammaP = 0.01320336 deltaP = 0.2365289  Value = -21.91951 
## elapsed = 0.00   Round = 46  alphaP = 0.4552448  betaP = 0.04184342  gammaP = 0.01521734 deltaP = 0.2000 Value = -18.84434 
## elapsed = 0.00   Round = 47  alphaP = 0.44394    betaP = 0.04904042  gammaP = 0.02673455 deltaP = 0.2910557  Value = -17.44532 
## elapsed = 0.00   Round = 48  alphaP = 0.2945785  betaP = 0.02464845  gammaP = 0.01757752 deltaP = 0.2457871  Value = -16.51645 
## elapsed = 0.00   Round = 49  alphaP = 0.4491682  betaP = 0.04109014  gammaP = 0.02795872 deltaP = 0.2364012  Value = -26.24999 
## elapsed = 0.00   Round = 50  alphaP = 0.3028382  betaP = 0.03860143  gammaP = 0.02532854 deltaP = 0.5166078  Value = -17.91474 
## elapsed = 0.00   Round = 51  alphaP = 0.4507086  betaP = 0.03981822  gammaP = 0.01923333 deltaP = 0.2805891  Value = -19.67192 
## elapsed = 0.00   Round = 52  alphaP = 0.4192304  betaP = 0.02395849  gammaP = 0.01354981 deltaP = 0.5662541  Value = -22.49306 
## elapsed = 0.00   Round = 53  alphaP = 0.4282272  betaP = 0.04034959  gammaP = 0.01258217 deltaP = 0.2648013  Value = -17.24989 
## elapsed = 0.00   Round = 54  alphaP = 0.4045001  betaP = 0.04534912  gammaP = 0.02241467 deltaP = 0.5665305  Value = -18.55051 
## elapsed = 0.00   Round = 55  alphaP = 0.3994378  betaP = 0.02486266  gammaP = 0.01358426 deltaP = 0.5194098  Value = -15.36117 
## elapsed = 0.00   Round = 56  alphaP = 0.4055585  betaP = 0.02218931  gammaP = 0.02515179 deltaP = 0.3654861  Value = -24.71103 
## elapsed = 0.00   Round = 57  alphaP = 0.4509694  betaP = 0.04658857  gammaP = 0.01285525 deltaP = 0.3506319  Value = -18.54859 
## elapsed = 0.00   Round = 58  alphaP = 0.4397619  betaP = 0.02411225  gammaP = 0.01759789 deltaP = 0.4989453  Value = -27.95577 
## elapsed = 0.00   Round = 59  alphaP = 0.2936202  betaP = 0.01289876  gammaP = 0.02019857 deltaP = 0.5709312  Value = -22.82341 
## elapsed = 0.00   Round = 60  alphaP = 0.41364    betaP = 0.02681897  gammaP = 0.01304214 deltaP = 0.4906822  Value = -22.52042 
## 
##  Best Parameters Found: 
## Round = 10   alphaP = 0.2826459  betaP = 0.04818015  gammaP = 0.01294227 deltaP = 0.2926503  Value = -15.23521
# Print and plot results
message("Best parameter set:")
message(opt_result$Best_Par)
message(paste("Best score:", opt_result$Best_Value))

# Plot convergence
# Interactive visualizations of Bayesian Optimization results using plotly

# Function to create interactive visualizations for optimization results
visualize_optimization <- function(opt_result) {
  # Extract optimization history
  history <- as.data.frame(opt_result$History)
  history$Iteration <- 1:nrow(history)
  
  # 1. Optimization Progress Plot
  progress_plot <- plot_ly() %>%
    add_trace(
      data = history,
      x = ~Iteration,
      y = ~Value,
      type = 'scatter',
      mode = 'lines+markers',
      name = 'Objective Value',
      marker = list(size = 8, color = 'rgba(0, 0, 255, 0.8)'),
      line = list(width = 2, color = 'rgba(0, 0, 255, 0.8)')
    ) %>%
    add_trace(
      x = which.max(history$Value),
      y = max(history$Value),
      type = 'scatter',
      mode = 'markers',
      marker = list(size = 12, color = 'red', symbol = 'star'),
      name = 'Best Result'
    ) %>%
    layout(
      title = 'Bayesian Optimization Progress',
      xaxis = list(title = 'Iteration'),
      yaxis = list(title = 'Objective Value'),
      hovermode = 'closest'
    )
  
  # 2. Parameter Convergence Plots (faceted)
  param_names <- c("alphaP", "betaP", "gammaP", "deltaP")
  param_data <- history %>%
    select(Iteration, all_of(param_names), Value) %>%
    tidyr::pivot_longer(
      cols = all_of(param_names),
      names_to = "Parameter",
      values_to = "ParamValue"
    )
  
  # Set hover text showing objective value
  param_data$hover_text <- paste(
    "Iteration:", param_data$Iteration, "<br>",
    "Parameter:", param_data$Parameter, "<br>",
    "Value:", round(param_data$ParamValue, 4), "<br>",
    "Objective:", round(param_data$Value, 4)
  )
  
  parameter_plot <- plot_ly() %>%
    add_trace(
      data = param_data,
      x = ~Iteration,
      y = ~ParamValue,
      color = ~Parameter,
      text = ~hover_text,
      hoverinfo = 'text',
      type = 'scatter',
      mode = 'lines+markers'
    ) %>%
    layout(
      title = 'Parameter Convergence',
      xaxis = list(title = 'Iteration'),
      yaxis = list(title = 'Parameter Value'),
      legend = list(title = list(text = 'Parameter')),
      hovermode = 'closest'
    )
  
  # 3. 3D Parameter Space Exploration
  # Choose 3 parameters to visualize
  params_3d <- param_names[1:3]
  best_idx <- which.max(history$Value)
  
  space_exploration <- plot_ly() %>%
    add_trace(
      data = history,
      x = ~get(params_3d[1]),
      y = ~get(params_3d[2]),
      z = ~get(params_3d[3]),
      color = ~Value,
      colorscale = 'Viridis',
      type = 'scatter3d',
      mode = 'markers',
      marker = list(
        size = 6,
        opacity = 0.8
      ),
      text = ~paste(
        "Iteration:", Iteration, "<br>",
        params_3d[1], ":", round(get(params_3d[1]), 4), "<br>",
        params_3d[2], ":", round(get(params_3d[2]), 4), "<br>",
        params_3d[3], ":", round(get(params_3d[3]), 4), "<br>",
        "Value:", round(Value, 4)
      ),
      hoverinfo = 'text'
    ) %>%
    # Add best point
    add_trace(
      data = history[best_idx, ],
      x = ~get(params_3d[1]),
      y = ~get(params_3d[2]),
      z = ~get(params_3d[3]),
      type = 'scatter3d',
      mode = 'markers',
      marker = list(
        color = 'red',
        size = 10,
        symbol = 'diamond'
      ),
      name = 'Best Result'
    ) %>%
    layout(
      title = paste('Parameter Space Exploration:', 
                    paste(params_3d, collapse=" vs ")),
      scene = list(
        xaxis = list(title = params_3d[1]),
        yaxis = list(title = params_3d[2]),
        zaxis = list(title = params_3d[3])
      ),
      coloraxis = list(colorbar = list(title = 'Objective Value'))
    )
  
  # 4. Parameter Pairs Correlation Matrix
  # Create a matrix of scatter plots for parameter correlations
  param_pairs <- list()
  pair_count <- 1
  
  for (i in 1:(length(param_names)-1)) {
    for (j in (i+1):length(param_names)) {
      p1 <- param_names[i]
      p2 <- param_names[j]
      
      pair_plot <- plot_ly() %>%
        add_trace(
          data = history,
          x = ~get(p1),
          y = ~get(p2),
          type = 'scatter',
          mode = 'markers',
          marker = list(
            size = 8,
            color = ~Value,
            colorscale = 'Viridis',
            colorbar = list(title = 'Objective Value'),
            opacity = 0.8
          ),
          text = ~paste(
            "Iteration:", Iteration, "<br>",
            p1, ":", round(get(p1), 4), "<br>",
            p2, ":", round(get(p2), 4), "<br>",
            "Value:", round(Value, 4)
          ),
          hoverinfo = 'text'
        ) %>%
        # Add best point
        add_trace(
          data = history[best_idx, ],
          x = ~get(p1),
          y = ~get(p2),
          type = 'scatter',
          mode = 'markers',
          marker = list(
            color = 'red',
            size = 12,
            symbol = 'star'
          ),
          name = 'Best Result'
        ) %>%
        layout(
          title = paste('Parameter Correlation:', p1, 'vs', p2),
          xaxis = list(title = p1),
          yaxis = list(title = p2),
          showlegend = FALSE
        )
      
      param_pairs[[pair_count]] <- pair_plot
      pair_count <- pair_count + 1
    }
  }
  
  # Create the individual plots as separate objects for easy access
  plots <- list(
    progress = progress_plot,
    parameters = parameter_plot,
    space_3d = space_exploration,
    pairs = param_pairs
  )
  
  # Return all plots
  return(plots)
}

# Usage:
plots <- visualize_optimization(opt_result)

# Display individual plots
plots$progress
plots$parameters
plots$space_3d
# Display parameter pair correlations
plots$pairs[[1]]  # First pair
plots$pairs[[2]]  # Second pair
plots$pairs[[3]]  # Third pair
# etc.

# Alternatively, create a dashboard with subplot
parameter_pairs_subplot <- subplot(plots$pairs, nrows = 2, margin = 0.05) %>%
  layout(title = "Parameter Pair Correlations", showlegend = FALSE)
parameter_pairs_subplot 
# Create dashboard with all visualizations
dashboard <- subplot(plots$progress, plots$parameters, plots$space_3d, parameter_pairs_subplot,
                     nrows = 2, margin = 0.08) %>%
  layout(title = "Bayesian Optimization Dashboard", showlegend = TRUE)
# dashboard

This end-to-end solution provides a template for advanced cancer treatment modeling with biomarkers, leveraging the strengths of both R (visualization, optimization) and Julia (high-performance differential equations).

2 Multi-Core Parallel Processing Optimization of \(nD\) SDE Modeling

Consider again the stochastic differential equation (SDE) model that captures the dynamic interplay between tumor local control (\(LC\)), treatment-related side effects (\(RP2\)), and biomarkers (\(B\)). The model extends the Lotka-Volterra framework with treatment inputs and biomarker dynamics. This time, we will use multi-core parallel processing optimization of \(nD\) SDE modeling. For simplicity of the hands-on demonstration, again we still use \(2+3\) state variables (outcomes + inputs)

\[\mathbf{X}(t) = \begin{pmatrix} LC(t) \\ RP2(t) \\ \mathbf{B}(t) \end{pmatrix}, \quad \text{where } \mathbf{B}(t) \in \mathbb{R}^k \text{ (e.g., } k=10\text{ biomarkers)}.\]

The analytical representation of the SDE system is \[\begin{aligned} dLC &= \left[\alpha LC \left(1 - \frac{LC}{K}\right) - \beta LC \cdot RP2 - \phi_{\text{treat}} u(t) LC \right] dt + \sigma_{LC} LC \, dW_1, \\ dRP2 &= \left[\gamma LC \cdot RP2 - \delta RP2 + \psi_{\text{treat}} u(t) \right] dt + \sigma_{RP2} \sqrt{RP2} \, dW_2, \\ dB_i &= \left[\mu_i(B_i) + \eta_i LC - \nu_i RP2 \right] dt + \sigma_{B_i} \, dW_{B_i}, \quad i=1,\dots,k, \end{aligned}\] which depends on the following parameters

  • \(\alpha, K\): Tumor growth rate and carrying capacity.
  • \(\beta\): Toxicity-driven tumor suppression.
  • \(\phi_{\text{treat}}\): Treatment efficacy on LC.
  • \(\gamma, \delta\): Toxicity activation and clearance.
  • \(\psi_{\text{treat}}\): Treatment-induced toxicity.
  • \(\mu_i, \eta_i, \nu_i\): Biomarker drift terms.
  • \(\sigma\): Noise intensities.

We can optimize the SDE cost function further by using parallel processing across multiple cores, e.g., \(10\) cores, by updating the R code to work with the parallelized Julia functions. The parallelized approach provides several significant performance benefits when optimizing the SDE model:

  1. Multi-core Evaluation: By using \(10\) cores, we can evaluate parameter sets approximately \(10\times\) faster than the sequential approach.

  2. Ensemble Evaluation: Each parameter set is evaluated across multiple stochastic trajectories, improving robustness of the cost calculation.

  3. Batch Processing: The code supports evaluating multiple parameter sets simultaneously, which works well with batch Bayesian optimization algorithms.

  4. Load Balancing: The pmap function distributes work evenly across workers, ensuring efficient resource utilization.

  5. Worker Setup:

-   The code adds workers at initialization
-   Required packages and variables are shared across workers
-   The base problem (`prob`) is distributed to all workers
  1. Enhanced Cost Function:
-   Uses ensemble simulation for stochastic systems
-   Calculates costs across multiple trajectories for robustness
-   Uses median cost to reduce sensitivity to outliers
-   Improved error handling for worker processes
  1. Performance Considerations:
-   Parameter constraints are applied on each worker
-   Correlation calculations use try-catch for stability
-   Finite filtering prevents NaN/Inf from corrupting results

Note that the underying Julia parallel code is in a separate file, DSPA_Appendix19_SDE_parallel.jl, which is sourced in R. We can modify the init_points and n_iter parameters to control the evaluation runs with parallelization. The code handles both single-parameter evaluation (for standard Bayesian optimization) and batch evaluation (for advanced optimizers). By monitoring the memory usage, we can run multiple SDEs simultaneously that are memory-intensive. This approach provides a significant speedup for the the SDE model optimization process while maintaining the quality of the results. The actual multi-core parallelization is handled in Julia directly, not R. Here, we use the Julia source file DSPA_Appendix19_SDE_parallel.jl.

library(rBayesianOptimization)
library(JuliaCall)
library(rhdf5)
library(GPfit)
library(plotly)
library(reshape2)

Sys.setenv(JULIA_NUM_THREADS = 11)
# Setup Julia  #   julia_setup(verbose = TRUE)
julia <- julia_setup() 

julia_source("./DSPA_Appendix19_SDE_JuliaChunk1/DSPA_Appendix19_SDE_parallel.jl")

# Load and pass data
h5_data <- h5read("./DSPA_Appendix19_SDE_JuliaChunk1/cancer_SDE_trajectories.h5", "trajectories")
observed_data <- h5_data[,,1]
julia_assign("observed_data_shared", observed_data)

# Define bounds
bounds <- list(
  alphaP = c(0.15, 0.45),
  betaP = c(0.012, 0.048),
  gammaP = c(0.012, 0.028),
  deltaP = c(0.25, 0.55)
)

# Create log file
# log_file <- "./DSPA_Appendix19_SDE_JuliaChunk1/optimization_log.csv"
# cat("Timestamp,Worker,alphaP,betaP,gammaP,deltaP,Cost\n", file = log_file)

# Helper for improved BO stability, add hard constraints before passing parameters to Julia
clamp <- function(x, lo, hi) {
  pmax(lo, pmin(x, hi))
}

# Helper function for default bounds
getDefaultBounds <- function() {
  list(
    alphaP = c(0.15, 0.45),
    betaP = c(0.012, 0.048),
    gammaP = c(0.012, 0.028),
    deltaP = c(0.25, 0.55)
  )
}

# Objective function with noise handling
obj_func <- function(alphaP, betaP, gammaP, deltaP) {
  # Hard parameter clamping with diagnostics
  params <- pmax(
    c(alphaP = 0.15, betaP = 0.012, gammaP = 0.012, deltaP = 0.25),
    pmin(
      c(alphaP, betaP, gammaP, deltaP),
      c(alphaP = 0.45, betaP = 0.048, gammaP = 0.028, deltaP = 0.55)
    )
  )
  
  # Parameter validity check
  if(any(is.na(params)) || any(!is.finite(params))) {
    warning("Invalid parameters detected: ", paste(params, collapse = ", "))
    return(list(Score = -95.0))
  }
  
  tryCatch({
    # Add small noise for GP stability
    jittered_params <- params * runif(4, 0.999, 1.001)
    
    cost <- julia_eval(sprintf(
      "parallel_sde_cost([%f, %f, %f, %f], observed_data_shared)",
      jittered_params[1], jittered_params[2], 
      jittered_params[3], jittered_params[4]
    ))
    
    # Ensure finite cost
    if(!is.finite(cost)) {
      warning("Non-finite cost: ", cost)
      cost <- 95.0
    }
    
    return(list(Score = -cost))
    
  }, error = function(e) {
    return(list(Score = -95.0))  # Hard cap at maximum penalty
  })
}

# Run optimization with adjusted parameters
set.seed(123)

# Step 1: Simplified Bayesian Optimization Framework
robust_bayes_opt <- function(FUN, bounds, init_points = 20, n_iter = 40) {
  # Parameter dimension
  d <- length(bounds)
  
  # Validate input structure
  if (!is.list(bounds) || is.null(names(bounds))) {
    stop("Bounds must be a named list of numeric vectors")
  }
  
  param_names <- names(bounds)
  num_params <- length(param_names)
  
  # Initialize parameter matrix with column names
  X <- matrix(nrow = 0, ncol = num_params)
  colnames(X) <- param_names
  y <- numeric(0)
  
  # Generate initial samples with bounds validation
  init_samples <- lapply(param_names, function(p) {
    if (bounds[[p]][1] >= bounds[[p]][2]) {
      stop(paste("Invalid bounds for parameter", p))
    }
    runif(init_points, bounds[[p]][1], bounds[[p]][2])
  })
  
  X_init <- do.call(cbind, init_samples)
  colnames(X_init) <- param_names
  
  # Evaluate initial points
  for (i in 1:init_points) {
    params <- as.list(X_init[i, , drop = FALSE])
    res <- do.call(FUN, params)
    X <- rbind(X, X_init[i, , drop = FALSE])
    y <- c(y, res$Score)
  }
  
  # Bayesian optimization loop
  for (iter in 1:n_iter) {
    # Scale parameters using proper bounds
    X_scaled <- t(apply(X, 1, function(row) {
      sapply(param_names, function(p) {
        (row[p] - bounds[[p]][1]) / diff(bounds[[p]])
      })
    }))
    
    # Fit GP model with validated control parameters
    # GP model fitting with proper control vector
    gp <- GPfit::GP_fit(
      X = X_scaled,
      Y = y,
      corr = list(type = "exponential", power = 1.95),
      control = c(200 * d, 80 * d, 2 * d),
      nug_thres = 20,
      trace = FALSE,
      maxit = 100
    )
    
    # Acquisition function with type validation
    utility <- function(x) {
      x_scaled <- sapply(param_names, function(p) {
        (x[p] - bounds[[p]][1]) / diff(bounds[[p]])
      })
      pred <- predict(gp, matrix(x_scaled, nrow = 1))
      mu <- pred$Y_hat
      sigma <- sqrt(pred$MSE)
      mu + 1.96 * sigma
    }
    
    # Optimize with explicit type conversion
    next_point <- optim(
      par = sapply(bounds, function(rng) runif(1, rng[1], rng[2])),
      fn = utility,
      method = "L-BFGS-B",
      lower = as.numeric(sapply(bounds, "[", 1)),
      upper = as.numeric(sapply(bounds, "[", 2)),
      control = list(fnscale = -1)
    )$par
    
    # Validate and store results
    if (any(is.na(next_point))) {
      warning("NA parameters generated at iteration ", iter)
      next
    }
    
    res <- do.call(FUN, as.list(next_point))
    X <- rbind(X, next_point)
    y <- c(y, res$Score)
  }
  
  # Return results with validation
  if (length(y) == 0) stop("No valid evaluations completed")
  best_idx <- which.max(y)
  list(
    Best_Par = X[best_idx, ],
    Best_Value = y[best_idx],
    History = data.frame(X, Score = y)
  )
}

# Step 2: Parameter Stabilization Wrapper
safe_objective <- function(alphaP, betaP, gammaP, deltaP) {
  # Hard parameter clamping
  params <- list(
    alphaP = clamp(alphaP, 0.15, 0.45),
    betaP = clamp(betaP, 0.012, 0.048),
    gammaP = clamp(gammaP, 0.012, 0.028),
    deltaP = clamp(deltaP, 0.25, 0.55)
  )
  
  # Numeric stabilization
  params <- lapply(params, function(p) {
    sign(p) * min(abs(p), 10)  # Bound magnitude
  })
  
  tryCatch({
    cost <- julia_eval(sprintf(
      "parallel_sde_cost([%f, %f, %f, %f], observed_data_shared)",
      params$alphaP, params$betaP, params$gammaP, params$deltaP
    ))
    
    list(Score = -cost, Pred = 0)
  }, error = function(e) {
    list(Score = -95.0, Pred = 0)  # Fail-safe value
  })
}

clamp <- function(x, lo, hi) pmax(lo, pmin(x, hi))

# Define parameter bounds
param_bounds <- list(
  alphaP = c(0.15, 0.45),
  betaP = c(0.012, 0.048),
  gammaP = c(0.012, 0.028),
  deltaP = c(0.25, 0.55)
)

# Run stabilized optimization
set.seed(123)
opt_result <- robust_bayes_opt(
  FUN = safe_objective,
  bounds = param_bounds,
  init_points = 20,
  n_iter = 40
)

# Analyze results
print(opt_result$Best_Par)
##    alphaP     betaP    gammaP    deltaP 
## 0.3254983 0.0120000 0.0120000 0.4264078
plot(opt_result$History$Score, type='l', 
     xlab="Iteration", ylab="Objective Value")

# 4. Save results
saveRDS(opt_result, "./DSPA_Appendix19_SDE_JuliaChunk1/sde_optimization_results.rds")

# Print results
print("Best parameter set:")
## [1] "Best parameter set:"
print(opt_result$Best_Par)
##    alphaP     betaP    gammaP    deltaP 
## 0.3254983 0.0120000 0.0120000 0.4264078
print(paste("Best score:", opt_result$Best_Value))
## [1] "Best score: -102.4"
# Create visualization
library(ggplot2)
library(reshape2)

# Convert history to data frame
history_df <- data.frame(Iteration = 1:length(opt_result$History$Score),
                         Cost = -opt_result$History$Score,
                         alphaP = opt_result$History$alphaP,
                         betaP = opt_result$History$betaP,
                         gammaP = opt_result$History$gammaP,
                         deltaP = opt_result$History$deltaP)

# Plot parameter trajectories
param_df <- melt(history_df[,c("Iteration", "alphaP", "betaP", "gammaP", "deltaP")],
                 id.vars = "Iteration")

# # Plot optimization progress
# ggplot(history_df, aes(x = Iteration, y = Cost)) +
#   geom_point() +
#   geom_line(alpha = 0.5) +
#   geom_smooth(method = "loess", se = TRUE) +
#   theme_minimal() +
#   labs(title = "Optimization Progress",
#        y = "Cost (lower is better)")

# ggplot(param_df, aes(x = Iteration, y = value, color = variable)) +
#   geom_line() +
#   geom_point(size = 1) +
#   facet_wrap(~variable, scales = "free_y") +
#   theme_minimal() +
#   labs(title = "Parameter Convergence",
#        y = "Parameter Value",
#        color = "Parameter")

# Convert history to data frame
history_df <- data.frame(
  Iteration = 1:length(opt_result$History$Score),
  Cost = -opt_result$History$Score,
  alphaP = opt_result$History$alphaP,
  betaP = opt_result$History$betaP,
  gammaP = opt_result$History$gammaP,
  deltaP = opt_result$History$deltaP
)

# Calculate LOESS smoothing for cost
loess_fit <- loess(Cost ~ Iteration, data = history_df, span = 0.75)
smooth_curve <- predict(loess_fit)
se <- predict(loess_fit, se = TRUE)$se.fit

# 1. Interactive Cost Progress Plot
cost_plot <- plot_ly() %>%
  add_trace(data = history_df, x = ~Iteration, y = ~Cost, type = 'scatter',
            mode = 'markers+lines', name = 'Actual Cost',
            marker = list(size = 8, color = 'rgba(0,100,255,0.7)'),
            line = list(color = 'rgba(0,100,255,0.3)')) %>%
  add_trace(x = ~Iteration, y = smooth_curve, type = 'scatter', mode = 'lines',
            name='Trend', line=list(color = 'red', width=3), hoverinfo = 'skip') %>%
  add_trace(x = ~Iteration, y = smooth_curve + 2*se, type = 'scatter',
            mode = 'lines', name = 'Confidence', 
            line = list(color = 'rgba(255,0,0,0.2)'), showlegend = FALSE) %>%
  add_trace(x = ~Iteration, y = smooth_curve - 2*se, type = 'scatter',
            mode = 'lines', fill = 'tonexty', fillcolor = 'rgba(255,0,0,0.1)',
            line = list(color = 'rgba(255,0,0,0.2)'),
            name = 'Confidence Interval', showlegend = FALSE) %>%
  layout(title = list(text = "Optimization Progress", font = list(size = 20)),
         xaxis = list(title = "Iteration"), yaxis = list(title = "Cost (lower is better)"),
         hovermode = 'x unified', showlegend = TRUE)

# 2. Interactive Parameter Trajectories Plot
param_df <- melt(history_df[,c("Iteration", "alphaP", "betaP", "gammaP", "deltaP")],
                 id.vars = "Iteration")

# Calculate parameter-specific LOESS smoothing
param_smooth <- function(data, param) {
  fit <- loess(value ~ Iteration, data = data[data$variable == param,], span = 0.75)
  predict(fit)
}

param_plot <- plot_ly() %>%
  layout(grid = list(rows = 2, columns = 2), showlegend = FALSE)

# Colors for parameters
param_colors <- c(alphaP = "#1f77b4", betaP = "#ff7f0e", 
                 gammaP = "#2ca02c", deltaP = "#d62728")

# Create subplot for each parameter
params <- c("alphaP", "betaP", "gammaP", "deltaP")
for(i in seq_along(params)) {
  param <- params[i]
  param_data <- param_df[param_df$variable == param,]
  smooth_values <- param_smooth(param_df, param)
  
  param_plot <- param_plot %>%
    add_trace(
      data = param_data,
      x = ~Iteration,
      y = ~value,
      type = 'scatter',
      mode = 'markers+lines',
      name = param,
      marker = list(size = 6, color = param_colors[param]),
      line = list(color = paste0(param_colors[param], "50")),
      xaxis = paste0("x", i),
      yaxis = paste0("y", i)
    ) %>%
    add_trace(
      x = param_data$Iteration,
      y = smooth_values,
      type = 'scatter',
      mode = 'lines',
      line = list(color = param_colors[param], width = 3),
      name = paste(param, "trend"),
      xaxis = paste0("x", i),
      yaxis = paste0("y", i)
    )
}

param_plot <- param_plot %>%
  layout(
    title = list(
      text = "Parameter Convergence",
      font = list(size = 20)
    ),
    xaxis = list(title = "Iteration", domain = c(0, 0.45)),
    xaxis2 = list(title = "Iteration", domain = c(0.55, 1)),
    xaxis3 = list(title = "Iteration", domain = c(0, 0.45)),
    xaxis4 = list(title = "Iteration", domain = c(0.55, 1)),
    yaxis = list(title = "alphaP", domain = c(0.55, 1)),
    yaxis2 = list(title = "betaP", domain = c(0.55, 1)),
    yaxis3 = list(title = "gammaP", domain = c(0, 0.45)),
    yaxis4 = list(title = "deltaP", domain = c(0, 0.45))
  )

# Create subplot combining both visualizations
subplot_all <- subplot(
  cost_plot, param_plot,
  nrows = 2,
  heights = c(0.4, 0.6),
  margin = 0.1
) %>%
  layout(
    title = list(
      text = "Bayesian Optimization Results",
      font = list(size = 24)
    ),
    showlegend = TRUE
  )

# Display plots
subplot_all

To accelerate the Bayesian optimization using parallel processing, utilize Julia’s built-in parallelism for the SDE simulations and R’s parallel package for concurrent parameter evaluations.

### **1. Julia Parallel Setup (in R)**
library(parallel)
library(JuliaCall)

nCores <- 9
# Initialize Julia with nCores=9 cores
julia_setup(installJulia = TRUE)
julia_command("using Distributed; addprocs(9)")  # Use 9 workers

# Define parallel SDE evaluation in Julia
julia_command('
@everywhere function parallel_sde_cost_batch(params_batch, observed_data)
    pmap(params -> sde_cost_worker(params, observed_data), params_batch)
end
')

### **2. Parallelized R Objective Function**
safe_objective_parallel <- function(param_batch) {
  # Convert batch to Julia array
  julia_assign("param_batch_jl", param_batch)
  
  # Evaluate batch in parallel
  costs <- julia_eval("parallel_sde_cost_batch(param_batch_jl, observed_data_shared)")
  
  # Return scores
  lapply(costs, function(c) list(Score = -c, Pred = 0))
}

# Create cluster
cl <- makeCluster(nCores+1)
clusterExport(cl, c("julia_eval", "julia_assign"))

### **3. Modified Bayesian Optimization**
robust_bayes_opt_parallel <- function(FUN, bounds, init_points=20, n_iter=40, batch_size=10) {
  # Parameter setup
  param_names <- names(bounds)
  num_params <- length(param_names)
  
  # Initialize with parallel batches
  init_batches <- ceiling(init_points / batch_size)
  X <- matrix(nrow=0, ncol=num_params)
  y <- numeric(0)
  
  # Parallel initial evaluations
  for(b in 1:init_batches) {
    batch <- Matrix_runif(n = batch_size, 
                         lower = sapply(bounds, "[", 1),
                         upper = sapply(bounds, "[", 2))
    
    results <- parLapply(cl, split(batch, 1:batch_size), FUN)
    
    X <- rbind(X, batch)
    y <- c(y, sapply(results, "[[", "Score"))
  }
  
  # Bayesian optimization with batch parallelization
  for(iter in 1:ceiling(n_iter/batch_size)) {
    # Fit GP model
    gp <- GPfit::GP_fit(X = X, Y = y, ...)
    
    # Generate batch of candidate points
    candidates <- Matrix_runif(n = batch_size, ...)
    
    # Parallel evaluation
    batch_results <- parLapply(cl, split(candidates, 1:batch_size), FUN)
    
    # Update history
    X <- rbind(X, candidates)
    y <- c(y, sapply(batch_results, "[[", "Score"))
  }
  
  # Return best parameters
  best_idx <- which.max(y)
  list(Best_Par = X[best_idx,], Best_Value = y[best_idx])
}

### **4. Execution**
# Run parallel optimization
opt_result <- robust_bayes_opt_parallel(
  FUN = safe_objective_parallel,
  bounds = list(
    alphaP = c(0.15, 0.45),
    betaP = c(0.012, 0.048),
    gammaP = c(0.012, 0.028),
    deltaP = c(0.25, 0.55)
  ),
  init_points = 20,
  n_iter = 40,
  batch_size = 10
)

# Cleanup
stopCluster(cl)

2.1 Bayesian Optimization vs. Stochastic Gradient Descent Optimization

Bayesian Optimization and Stochastic Gradient Descent (SGD) are different optimization schemes, which can potentially be combined. Bayesian Optimization uses a probabilistic surrogate model (Gaussian Process) to approximate the objective function and explores the parameter space by balancing state-space exploration and exploitation. It wrks well for expensive-to-evaluate functions (like SDE modeling) and doesn’t require gradients of the objective function, hence, it generally needs fewer function evaluations than SGD. On the other hand, Stochastic Gradient Descent requires calculation of gradients of the objective function to iteratively ppdate the parameter estimates in the direction of steepest descent. It works well for large-scale optimization with noisy gradients and usually needs many function evaluations, but could get stuck in local minima.

For SDE model optimization, we could potentially create a hybrid approach that uses Bayesian Optimization to find promising regions of the parameter space and then apply SGD to refine the best solutions found by Bayesian Optimization. then, combining the global exploration capability of Bayesian Optimization with the local refinement of SGD may yield improved performance. However, computing numerical gradients requires additional function evaluations and the SGD phase might significantly increase total computation time. Also, SGD might be sensitive to the choice of learning rate and momentum and the SDE system might not be smooth enough for reliable gradient estimation.

In these experiments, the model dynamics include \(LC\), which follows logistic growth with treatment suppression and toxicity-driven death, \(RP2\) increases with tumor presence and treatment, decays naturally, and biomarkers that depend linearly on \(LC\) and \(RP2\) and have intrinsic drift. The cancer intervention, treatment schedule, uses intermittent chemotherapy cycles (u_t(t)). And the (hyper)parameter estimation utilizes Bayesian optimization to tunes parameters to match (smaller) observed data. This framework provides a flexible foundation for integrating clinical data and refining dynamics using domain knowledge.

SOCR Resource Visitor number Web Analytics SOCR Email