This DSPA2 module represents Part 3 of the DSPA2 Chapter 14 (Deep Learning, Neural Networks). Learners are encouraged to first complete Part 1 of the DSPA2 Chapter 14 (Deep Learning, Neural Networks), Part 2 of the DSPA2 Chapter 14 (Deep Learning, Neural Networks), and Part 3 of the DSPA2 Chapter 14 (Deep Learning, Neural Networks) prior to continuing with this Part 4 of the DSPA2 Chapter 14 (Deep Learning, Neural Networks), which covers the Torch and Tensorflow Image Pre-processing Image Classification Pipelines.

1 `Torch` Deep Convolutional Neural Network (CNN)

The U-Net: Convolutional Networks for Biomedical Image Segmentation, shown on the image below, is an example of a DCNN. U-Net Architecture The U-shaped CNN (U-Net) represents successive convolutional layers with max-pooling. During the auto-encoding (left-down-hill branch) the U-Net reduces image resolution (downsampling), whereas during the subsequent decoding phase (right-uphill branch) upsamples the images to arrive at an output of the same size as the original input. The information analysis (encoding) and synthesis (decoding) facilitate the labeling of each output image pixel by feeding information in each decoding layer from the corresponding encoding layer with matching resolution in the downsizing encoding layer.

Each upsampling (decoding) step concatenates the output from the previous layer with that from its counterpart in the compression (encoding) step. The final decoding output is a mask of the same size as the original image, derived by a $1\times 1$-convolution, which does not require a dense layer at the end as the output convolutional layer represents a single filter. Below we show how to load, train, and use a U-Net for transfer learning in 2D image segmentation. Note that this model has over $3M$ trainable parameters. You can see an R example of a Unet model for input-output tensors of shape=c(128,128), see lines 73-183.

# If necessary, download the U-Net package, before you load it into R
# remotes::install_github("r-tensorflow/unet")
library(tfdatasets)
library(tfds)
library(tfhub)
library(tfruns)
library(torch)
# torch::install_torch()

# remotes::install_github("r-tensorflow/unet")
library(unet)
library(tibble)

# The u-Net call takes additional parameters, e.g., number of downsizing blocks, number of filters to start with,
# number of classes to identify; # ?unet provides details. For instance, we can specify the shape
# of the input images we will be segmenting tumors for: 256*256 3-channel RGB images.
model <- unet::unet(input_shape = c(256, 256, 3))

# to print the model as text output, run:
# model
# Results: # Trainable params: 31,031,745

1.1 Data Import

Let’s first download and load in the Brain Tumor Imaging dataset. These data come from a 2019 study on Association of genomic subtypes of lower-grade gliomas with shape features automatically extracted by a deep learning algorithm. The 2D brain MR images are paired with 2D tumor masks, which are trivial for controls and non-trivial for patients, using The Cancer Imaging Archive (TCIA). The data represent 110 patients with lower-grade glioma and include fluid-attenuated inversion recovery (FLAIR) MRI scans. There are 3-channels of the MRI data; pre-contrast, FLAIR, and post-contrast. The corresponding tumor masks were obtained by manual-delineations on the FLAIR images by board-certified radiologists.

# If you need to start a clean fresh run, remove all old files first! Be careful with this! set eval=T in all R-blocks!
##### First check > list.files("/data/")
##### do.call(file.remove, list(list.files("/data", full.names = TRUE)))
##### unlink("/data/*", recursive=TRUE, force=TRUE)

library(httr)
pathToZip <- tempfile()
pathToZip<-paste0(pathToZip,".zip")
#
#
url <- "https://umich.instructure.com/files/21813670/download?download_frd=1"
response <- GET(url)

content_type <- http_type(response)
print(content_type)


if (content_type == "application/zip" || content_type == "application/x-zip-compressed") {
  content <- content(response, "raw")
  writeBin(content, pathToZip)
} else {
  stop("Unexpected content type received.")
}

# download.file("https://umich.instructure.com/files/21813670/download?download_frd=1", pathToZip, mode = "wb")
zip::unzip(pathToZip, files=NULL, exdir = paste0(getwd(),'/data'))

library(tibble)
library(rsample)

train_dir <- file.path(getwd(),"data","data")
valid_dir <- file.path(getwd(),"data","mri_valid")

library(magick)   # Needed for TIFF --> PNG image conversion and other image processing tasks

Create the necessary directories to store the training and validation imaging data (brain MRIs and tumor masks).

# check if ReadMe file is accessible
# file.rename("/data/ReadMe_TCGA_MRI_Segmentation_Data_Phenotypes.txt", train_dir)

# Import the meta-data
# meta_data <- read.csv(paste0(getwd(),"//data//TCGA_MRI_Segmentation_Data_Phenotypes.csv"))


file_path <- file.path(getwd(), "data", "TCGA_MRI_Segmentation_Data_Phenotypes.csv")

# Read the CSV file
meta_data <- read.csv(file_path)

# note that these are relative file/directory names. To see the complete local path
# tempdir(); getwd()

# Create a validation folder
dir.create(valid_dir)

# Check all n=110 patients are accessible
patients <- list.dirs(train_dir, recursive = FALSE)
length(patients)

# Randomly select 20 Patients for validation, remaining 90=110-20 are for training the DNN model
valid_indices <- sample(1:length(patients), 20)
valid_indices
patients[valid_indices] # prints the actual folders where the validation participants' data is

# Extract and Relocate the Validation cases (separate them from training data)
for (i in valid_indices) {
  dir.create(file.path(valid_dir, basename(patients[i])))
  for (f in list.files(patients[i])) {
    file.rename(file.path(train_dir, basename(patients[i]), f), file.path(valid_dir, basename(patients[i]), f))
  }
  unlink(file.path(train_dir, basename(patients[i])), recursive = TRUE) # clean
}

# Confirm that only 80 patients are left in the standard data folder
# list all training data imaging files: list.dirs(train_dir, recursive = FALSE)
length(list.dirs(train_dir, recursive = FALSE))

# and 30-60 validation cases are in the validation folder
length(list.dirs(valid_dir, recursive = FALSE))

# and check validation data
length(list.files(valid_dir, recursive = T)) # [1] 1268

Define data-frames containing the file-names for all training and validation data.

# Identify the TRAINING and VALIDATION data objects (raw images + tumor masks) as filenames
data_train <- tibble(
  img = grep(list.files(train_dir, full.names = TRUE, pattern = "tif", recursive = TRUE),
        pattern = 'mask', invert = TRUE, value = TRUE),
  mask = grep(list.files(train_dir, full.names = TRUE, pattern = "tif", recursive = TRUE),
        pattern = 'mask', value = TRUE)
)
data_valid <- tibble(
  img = grep(list.files(valid_dir, full.names = TRUE, pattern = "tif", recursive = TRUE),
        pattern = 'mask', invert = TRUE, value = TRUE),
  mask = grep(list.files(valid_dir, full.names = TRUE, pattern = "tif", recursive = TRUE),
        pattern = 'mask', value = TRUE)
)

(Optionally) convert all 2D TIFF images to PNG RGB format! This may be necessary to ensure the input images are 3-channels, and are correctly interpreted as tensorflow objects.

print(grepl("\\.tif$", data_train$img))

# If all training + testing data are in one folder, split them by:
#  data <- initial_split(data_train, prop = 0.8)

# convert all Training Data: TIFF images and masks to PNG format (for easier TF processing downstream)
files_img_tif <- data_train$img[grepl("\\.tif$", data_train$img), drop = TRUE]
data_train_img_png <- lapply(files_img_tif,
      function(x) {
        # image_write(image_read(x), path = gsub(".tif$", ".png", x), format = "png")
        a = image_convert(image_read(x),  format = "png")
        image_write(a, path = gsub(".tif$", ".png", x), format = "png")
    }
  )

files_mask_tif <- data_train$mask[grepl("\\.tif$", data_train$mask), drop = TRUE]
data_train_mask_png <- lapply(files_mask_tif,
     function(x) {
       # image_write(image_read(x), path = gsub(".tif$", ".png", x), format = "png")
       a = image_convert(image_read(x),  format = "png")
       image_write(a, path = gsub(".tif$", ".png", x), format = "png")
    }
  )

# Similarly convert all Validation Data
# convert all TIFF images and masks to PNG format (for easier TF processing downstream)
files_valid_img_tif <- data_valid$img[grepl("\\.tif$", data_valid$img), drop = TRUE]
data_valid_img_png <- lapply(files_valid_img_tif,
      function(x) {
        # image_write(image_read(x), path = gsub(".tif$", ".png", x), format = "png")
        a = image_convert(image_read(x),  format = "png")
        image_write(a, path = gsub(".tif$", ".png", x), format = "png")
    }
  )

files_valid_mask_tif <- data_valid$mask[grepl("\\.tif$", data_valid$mask), drop = TRUE]
data_valid_mask_png <- lapply(files_valid_mask_tif,
     function(x) {
       # image_write(image_read(x), path = gsub(".tif$", ".png", x), format = "png")
       a = image_convert(image_read(x),  format = "png")
       image_write(a, path = gsub(".tif$", ".png", x), format = "png")
    }
  )

# Check that the TIF --> PNG conversion worked, inspect one case
head(list.files("/data/data/TCGA_HT_A61A_20000127"))
# data_valid  # check root directory

# Inspect some of the images/masks
# image_info(image_read(data_train_img_png[[3]]))
# image_write(image_read(data_train$img[3]), format = "tiff")
# image_write(image_read(data_train$img[3]), path = paste0(data_train$img[3], ".png"), format = "png")
# a <- image_read(paste0(data_train$img[3], ".png"))

# list.files(train_dir)
# To clean previous file references
# # delete a directory -- must add recursive = TRUE
# unlink("/data", recursive = TRUE); # Clean space # gc(full=T)

Derive a binary class label - cancer (for non-trivial tumor masks) or control (for empty tumor masks).

# Compute a new binary outcome variable 1=Brain Tumor (mask has at least 1 white pixel), 0=Normal Brain, no white pixels in the mask
pos_neg_diagnosis <- sapply(data_train$mask,
     function(x) {   value = max(imager::magick2cimg(image_read(x)))
         ifelse (value > 0, 1, 0)  }
  )
table(pos_neg_diagnosis)   #; head(data_train)

## pos_neg_diagnosis
##    0    1 
## 2164 1153

# pos_neg_diagnosis
#    0    1
# 2046 1103

# Add the normal vs. cancer label to training and testing datasets
data_train$label <- pos_neg_diagnosis

pos_neg_diagnosis_valid <- sapply(data_valid$mask,
     function(x) {   value = max(imager::magick2cimg(image_read(x)))
         ifelse (value > 0, 1, 0)  }
  )
table(pos_neg_diagnosis_valid)

## pos_neg_diagnosis_valid
##   0   1 
## 392 220

data_valid$label <- pos_neg_diagnosis_valid
# head(data_valid)

1.2 Torch-based Transfer Learning

Next we will ingest the 3-channel (RGB) imaging data and the corresponding tumor masks (binary images) for each participant. The method torch::dataset() allows specifying initialize() and .getitem() methods for complex computable data objects. The first method initialize() creates the archive of imaging and mask file names that can be utilized by the second method .getitem() for iterating over all cases. The method .getitem() returns ordered input-output pairs and performs weighted sampling, with prevalence to large lesion images, which is useful for accounting for DNN training with imbalanced classes.

The training sets can be enhanced by data augmentation – a process expanding the set of training images and masks via operations such as flipping, resizing, and rotating based on certain specifications.

Below we use PyTorch to define a brain_dataset method providing a larger augmented training dataset, new size length(train_ds) ~ 2K, and a larger validation set, new size length(valid_ds)~1K. In practice, we can use any alternative transfer-learning strategy including pytorch, tensorflow, theano, etc.

Note that unet training takes significant computational time; training 20-epochs took a total of 600 compute hours, which translates into a couple of days of computing on a 20-core server. We have provided several precomputed/pre-trained *.pt models on Canvas.

Below, we define a function brain_dataset() required for iterative retrieval of pytorch datasets. All such datasets need to have a method called initialize() instantiating the inventory of imaging and mask file names that will be used by the second method, .getitem(), to ingest the imaging data from these files, return input-image plus mask-target pairs, and perform data-augmentation. The parameter random_sampling=TRUE, .getitem() controls the weighted sample loading of image-mask pairs with larger in size tumors. This option is used with the training set to counter any class-label imbalances.

The training sets, but not the validation sets, use of data augmentation to ensure DCNN-invariance to specific spatiotemporal and intensity transformations. During imaging-data augmentation, the training images/masks may be flipped, resized, and rotated with specifiable probabilities for each type of augmentation transformation.

# install.packages(torch)
library(torch)
library(torchvision)

# data wrangling
library(tidyverse)
library(zeallot)   # needed for the piping function "%<-%" in "brain_dataset()"

# image processing and visualization
library(magick)
#library(cowplot)

# dataset loading
library(pins)
library(zip)

torch_manual_seed(1234)
set.seed(1234)



train_dir <- file.path(getwd(),"data","data")
valid_dir <- file.path(getwd(),"data","mri_valid")


brain_dataset <- dataset(
  name = "brain_dataset",
  # 1. Initialize
  initialize = function(img_dir, augmentation_params = NULL, random_sampling = FALSE) {
    self$images <- tibble(img = grep(list.files(img_dir, full.names = TRUE,
                                                pattern = "tif", recursive = TRUE),
                                     pattern = 'mask', invert = TRUE, value = TRUE),
                          mask = grep(list.files(img_dir, full.names = TRUE,
                                                 pattern = "tif", recursive = TRUE),
                                      pattern = 'mask',value = TRUE)
                          )
    self$slice_weights <- self$calc_slice_weights(self$images$mask)
    self$augmentation_params <- augmentation_params
    self$random_sampling <- random_sampling
  },

  # 2. Load and transform images from files into TF object elements (tensors)
  .getitem = function(i) {index <- if (self$random_sampling == TRUE) sample(1:self$.length(), 1, prob = self$slice_weights)
    else i
    img <- self$images$img[index] %>% image_read() %>% transform_to_tensor()
    mask <- self$images$mask[index] %>% image_read() %>% transform_to_tensor() %>% transform_rgb_to_grayscale() %>%
      torch_unsqueeze(1)
    img <- self$min_max_scale(img)

    if (!is.null(self$augmentation_params)) {
      scale_param <- self$augmentation_params[1]
      c(img, mask) %<-% self$resize(img, mask, scale_param)

      rot_param <- self$augmentation_params[2]
      c(img, mask) %<-% self$rotate(img, mask, rot_param)

      flip_param <- self$augmentation_params[3]
      c(img, mask) %<-% self$flip(img, mask, flip_param)
    }
    list(img = img, mask = mask)
  },

  # 3. Save the total number of imaging files
  .length = function() { nrow(self$images)  },

  # 4. Estimate 2D image weights: Bigger tumor-masks correspond to higher weights
  calc_slice_weights = function(masks) {
    weights <- map_dbl(masks, function(m) {
      img <- as.integer(magick::image_data(image_read(m), channels = "gray"))
      sum(img / 255)
    })

    sum_weights <- sum(weights)
    num_weights <- length(weights)

    weights <- weights %>% map_dbl(function(w) {
      w <- (w + sum_weights * 0.1 / num_weights) / (sum_weights * 1.1)
    })
    weights
  },

  # 5. Estimate the image intensity range (min, max)
  min_max_scale = function(x) {
    min = x$min()$item()
    max = x$max()$item()
    x$clamp_(min = min, max = max)
    x$add_(-min)$div_(max - min + 1e-5)
    x
  },

  # 6. Image tensor shape resizing (when necessary)
  resize = function(img, mask, scale_param) {
    img_size <- dim(img)[2]
    rnd_scale <- runif(1, 1 - scale_param, 1 + scale_param)
    img <- transform_resize(img, size = rnd_scale * img_size)
    mask <- transform_resize(mask, size = rnd_scale * img_size)
    diff <- dim(img)[2] - img_size
    if (diff > 0) {
      top <- ceiling(diff / 2)
      left <- ceiling(diff / 2)
      img <- transform_crop(img, top, left, img_size, img_size)
      mask <- transform_crop(mask, top, left, img_size, img_size)
    } else {
      img <- transform_pad(img,padding = -c(ceiling(diff/2),floor(diff/2),ceiling(diff/2),floor(diff/2)))
      mask <- transform_pad(mask, padding = -c(ceiling(diff/2), floor(diff/2),ceiling(diff/2),floor(diff/2)))
    }
    list(img, mask)
  },

  # 7. Rotation (if/when augmentation is requested)
  rotate = function(img, mask, rot_param) {
    rnd_rot <- runif(1, 1 - rot_param, 1 + rot_param)
    img <- transform_rotate(img, angle = rnd_rot)
    mask <- transform_rotate(mask, angle = rnd_rot)
    list(img, mask)
  },

  # 8. Flipping (if/when augmentation is requested)
  flip = function(img, mask, flip_param) {
    rnd_flip <- runif(1)
    if (rnd_flip > flip_param) {
      img <- transform_hflip(img)
      mask <- transform_hflip(mask)
    }
    list(img, mask)
  }
)

Next, using the brain_dataset() method, we actually generate the (training and validation) imaging datasets as computable objects using the raw filenames in the training and validation data-frames defined above.

train_ds <- brain_dataset(
  train_dir,
  augmentation_params = c(0.05, 15, 0.5),
  random_sampling = TRUE
)

length(train_ds)

## [1] 3317

# ~3K

valid_ds <- brain_dataset(
  valid_dir,
  augmentation_params = NULL,
  random_sampling = FALSE
)
length(valid_ds)

## [1] 612

# ~700

Let’s visualize one testing and one validation image-mask pairs using plot_ly().

# if (!require("BiocManager", quietly = TRUE))
#     install.packages("BiocManager")
#
# BiocManager::install("EBImage")
library(ggplot2)
library(patchwork) # for combining ggplots side‑by‑side
library (plotly)
rasterPlotly <- function (image, name="", hovermode = NULL) {
  myPlot <- plot_ly(type="image", z=image, name=name, hoverlabel=name, text=name,
            hovertext=name, hoverinfo="name+x+y") %>%
    layout(hovermode = hovermode, xaxis = list(hoverformat = '.1f'), yaxis = list(hoverformat = '.1f'))
  return(myPlot)
}

## Training Case 20
img_and_mask <- train_ds[20]
img <- img_and_mask[[1]]  # 3-channel image
img <- img_and_mask[[1]]$permute(c(2, 3, 1)) %>% as.array()  # 3-channel image
mask <- img_and_mask[[2]]$squeeze() %>% as.array() # tumor mask
# mask <- EBImage::rgbImage(255*mask, 255*mask, 255*mask)  # manually generate a gray scale mask image
# plot_ly(z=255*bird_images[i+ (j-1)*10,,,], type="image", showscale=FALSE)

# p1 <- rasterPlotly(image=255*img, name = "RGB Image", hovermode = "y unified")
# p2 <- rasterPlotly(mask, name = "Tumor Mask", hovermode = "y unified")
# subplot(p1,p2, shareY = TRUE) %>% layout(title="Training Case 20: 3-Channel Image (Left) & Tumor Mask (Right)")
# 
# # Validation Case 18
# img_and_mask <- valid_ds[18]
# img <- img_and_mask[[1]]
# mask <- img_and_mask[[2]]
# img <- img_and_mask[[1]]$permute(c(2, 3, 1)) %>% as.array()  # 3-channel image
# mask <- img_and_mask[[2]]$squeeze() %>% as.array() # tumor mask
# mask <- EBImage::rgbImage(255*mask, 255*mask, 255*mask)  # manually generate a gray scale mask image
# 
# p1 <- rasterPlotly(image=255*img, name = "RGB Image", hovermode = "y unified")
# p2 <- rasterPlotly(mask, name = "Tumor Mask", hovermode = "y unified")
# subplot(p1,p2, shareY = TRUE) %>% layout(title="Validation Case 18: 3-Channel Image (Left) & Tumor Mask (Right)")

# Convert to data frames for ggplot
df_img <- expand.grid(x = 1:ncol(img), y = 1:nrow(img))
df_img$value <- as.vector(img[,,1])   # use first channel for grayscale display
df_img$type <- "RGB Image"

df_mask <- expand.grid(x = 1:ncol(mask), y = 1:nrow(mask))
df_mask$value <- as.vector(mask)
df_mask$type <- "Tumor Mask"

# Combine and plot
bind_rows(df_img, df_mask) %>%
  ggplot(aes(x = x, y = y, fill = value)) +
  geom_raster() +
  facet_wrap(~ type, nrow = 1) +
  scale_y_reverse() +
  scale_fill_gradient(low = "black", high = "white") +
  coord_fixed() +
  theme_void() +
  theme(legend.position = "none",
        strip.text = element_text(size = 12, face = "bold")) +
  labs(title = "Training Case 20: 3‑Channel Image (Left) & Tumor Mask (Right)")

### Training Case 18
img_and_mask <- valid_ds[18]
img <- img_and_mask[[1]]$permute(c(2,3,1)) %>% as.array()
mask <- img_and_mask[[2]]$squeeze() %>% as.array()
# mask <- EBImage::rgbImage(255*mask, 255*mask, 255*mask)  # manually generate a gray scale mask image

df_img <- expand.grid(x = 1:ncol(img), y = 1:nrow(img))
df_img$value <- as.vector(img[,,1])
df_img$type <- "RGB Image"

df_mask <- expand.grid(x = 1:ncol(mask), y = 1:nrow(mask))
df_mask$value <- as.vector(mask)
df_mask$type <- "Tumor Mask"

bind_rows(df_img, df_mask) %>%
  ggplot(aes(x = x, y = y, fill = value)) +
  geom_raster() +
  facet_wrap(~ type, nrow = 1) +
  scale_y_reverse() +
  scale_fill_gradient(low = "black", high = "white") +
  coord_fixed() +
  theme_void() +
  theme(legend.position = "none",
        strip.text = element_text(size = 12, face = "bold")) +
  labs(title = "Validation Case 18: 3‑Channel Image (Left) & Tumor Mask (Right)")

Demonstrate training-data image-augmentation by using one image to generate 7 rows of 4 augmented images.

# # OLD PLOT_LY version
# N <- 7*4
# img_and_mask <- valid_ds[77]
# img <- img_and_mask[[1]]
# mask <- img_and_mask[[2]]
# 
# imgs <- map (1:N, function(i) {
#   # spatial-scale factor
#   c(img, mask) %<-% train_ds$resize(img, mask, 0.25)
#   c(img, mask) %<-% train_ds$flip(img, mask, 0.3)
#   c(img, mask) %<-% train_ds$rotate(img, mask, 45)
# 
#   img %>%
#     transform_rgb_to_grayscale() %>%
#     as.array() %>%
#     plot_ly(z=., type="heatmap", showscale = FALSE, name=paste0("AugmImg=", i)) %>%
#     layout(showlegend=FALSE,  hovermode = "y unified") #,
#           # yaxis = list(scaleratio = 1, scaleanchor = 'x'))
# })
# 
# # imgs[[2]]
# imgs %>%
#   subplot(nrows = 7, shareX = TRUE, shareY = TRUE, which_layout=1) %>%
#   layout(title="Validation Case 77: Random Rotation/Flop/Scale Image Augmentation")
library(ggplot2)
library(dplyr)
library(purrr)
library(tidyr)

N <- 7 * 4
img_and_mask <- valid_ds[77]
img <- img_and_mask[[1]]
mask <- img_and_mask[[2]]

all_imgs <- map_dfr(1:N, function(i) {
  c(img, mask) %<-% train_ds$resize(img, mask, 0.25)
  c(img, mask) %<-% train_ds$flip(img, mask, 0.3)
  c(img, mask) %<-% train_ds$rotate(img, mask, 45)

  mat <- img %>%
    transform_rgb_to_grayscale() %>%
    as.array()

  expand.grid(x = 1:ncol(mat), y = 1:nrow(mat)) %>%
    mutate(value = as.vector(mat),
           aug_id = paste0("AugmImg=", i))
})

ggplot(all_imgs, aes(x = x, y = y, fill = value)) +
  geom_raster() +
  facet_wrap(~ aug_id, nrow = 7) +
  scale_y_reverse() +
  scale_fill_gradient(low = "black", high = "white") +
  coord_fixed() +
  theme_void() +
  theme(legend.position = "none",
        plot.title = element_text(hjust = 0.5)) +
  labs(title = "Validation Case 77: Random Rotation/Flip/Scale Image Augmentation")

Instantiate training/validation data loaders using torch::dataloader(), which combines a data-set and a sampler-iterator into a single/multi-process iterator over the entire dataset.

batch_size <- 25
# train_dl <- dataloader(train_ds, batch_size)
# valid_dl <- dataloader(valid_ds, batch_size)

train_dl <- train_ds %>% dataloader(batch_size = batch_size, shuffle = TRUE)
valid_dl <- valid_ds %>% dataloader(batch_size = batch_size, shuffle = FALSE)

Next, we specify the Unet model indicating the number of “down” or encoding (analysis) depth for shrinking the input images and incrementing the number of filters, as well as specify how do we go “up” again during the decoding (synthesis) phase.

# forward(), keeps track of layer outputs seen going “down,” to be added back in going “up.”
unet <- nn_module(name="unet",
  initialize = function(channels_in = 3, n_classes = 1, depth = 5, n_filters = 6) {
    self$down_path <- nn_module_list()
    prev_channels <- channels_in
    for (i in 1:depth) {
      self$down_path$append(down_block(prev_channels, 2^(n_filters+i-1)))
      prev_channels <- 2^(n_filters+i-1)
    }

    self$up_path <- nn_module_list()

    for (i in ((depth - 1):1)) {
      self$up_path$append(up_block(prev_channels, 2^(n_filters+i-1)))
      prev_channels <- 2^(n_filters+i-1)
    }

    self$last = nn_conv2d(prev_channels, n_classes, kernel_size = 1)
  },

  forward = function(x) {
    blocks <- list()
    for (i in 1:length(self$down_path)) {
      x <- self$down_path[[i]](x)
      if (i != length(self$down_path)) {
        blocks <- c(blocks, x)
        x <- nnf_max_pool2d(x, 2)
      }
    }

    for (i in 1:length(self$up_path)) {
      x <- self$up_path[[i]](x, blocks[[length(blocks)-i+1]]$to(device=device))
    }

    torch_sigmoid(self$last(x))
  }
)

# unet utilizes `down_block` and `up_block`
# down_block delegates to its own workhorse, conv_block, and up_block “bridges” the UNET

down_block <- nn_module(
  classname="down_block",
  initialize = function(in_size, out_size) {
    self$conv_block <- conv_block(in_size, out_size)
  },

  forward = function(x) {   self$conv_block(x)  }
)

up_block <- nn_module(
  classname="up_block",
  initialize = function(in_size, out_size) {
    self$up = nn_conv_transpose2d(in_size, out_size, kernel_size=2, stride=2)
    self$conv_block = conv_block(in_size, out_size)
  },

  forward = function(x, bridge) {
    up <- self$up(x)
    torch_cat(list(up, bridge), 2) %>% self$conv_block()
  }
)

conv_block <- nn_module(
  classname="conv_block",
  initialize = function(in_size, out_size) {
    self$conv_block <- nn_sequential(
      nn_conv2d(in_size, out_size, kernel_size = 3, padding = 1),
      nn_relu(),
      nn_dropout(0.6),
      nn_conv2d(out_size, out_size, kernel_size = 3, padding = 1),
      nn_relu()
    )
  },

  forward = function(x){
    self$conv_block(x)
  }
)

Before we start the unet model training, we need to specify CPU/GPU device and optimization scheme.

# Initialize the model with appropriate CPU/GPU
device <- torch_device(if(cuda_is_available()) "cuda" else "cpu")
device <-"cpu"
model <- unet(depth = 5)$to(device = device)

# OPTIMIZATION

# DCNN model training using cross_entropy and dice_loss.
calc_dice_loss <- function(y_pred, y_true) {
  smooth <- 1
  y_pred <- y_pred$view(-1)
  y_true <- y_true$view(-1)
  intersection <- (y_pred * y_true)$sum()
  1 - ((2*intersection+smooth)/(y_pred$sum()+y_true$sum()+smooth))
}

dice_weight <- 0.3
#     learning_rate = 0.1
optimizer <- optim_sgd(model$parameters, lr = 0.1, momentum = 0.9)

On a multi-core machine, training the UNET model will require ~2-hrs per epoch. This step can be skipped if you load in a previously pre-trained model (model) that is saved (torch_save(model, "/path/model.pt")) and available for import (torch_load("/path/model.ptt")), see several *.pt model here.

num_epochs <- 1  # for knitting, otherwise increase to 5+

scheduler <- lr_one_cycle(optimizer, max_lr = 0.1, steps_per_epoch = length(train_dl), epochs = num_epochs)

# TRAINING
train_batch <- function(b) {
  optimizer$zero_grad()
  output <- model(b[[1]]$to(device = device))
  target <- b[[2]]$to(device = device)

  bce_loss <- nnf_binary_cross_entropy(output, target)
  dice_loss <- calc_dice_loss(output, target)
  loss <-  dice_weight*dice_loss + (1-dice_weight)*bce_loss

  loss$backward()
  optimizer$step()
  scheduler$step()

  list(bce_loss$item(), dice_loss$item(), loss$item())
}

valid_batch <- function(b) {
  output <- model(b[[1]]$to(device = device))
  target <- b[[2]]$to(device = device)

  bce_loss <- nnf_binary_cross_entropy(output, target)
  dice_loss <- calc_dice_loss(output, target)
  loss <-  dice_weight * dice_loss + (1-dice_weight)*bce_loss

  list(bce_loss$item(), dice_loss$item(), loss$item())
}

Start the Unet training process (this is suspended here to enable real-time knitting of the HTML doc).

for (epoch in 1:num_epochs) {
  model$train()
  train_bce <- c()
  train_dice <- c()
  train_loss <- c()

  coro::loop(for (b in train_dl) {
    c(bce_loss, dice_loss, loss) %<-% train_batch(b)
    train_bce <- c(train_bce, bce_loss)
    train_dice <- c(train_dice, dice_loss)
    train_loss <- c(train_loss, loss)
  })

  torch_save(model, paste0(getwd(),"/model_", epoch, ".pt"))

  cat(sprintf("\nEpoch %d, training: loss:%3f, bce: %3f, dice: %3f\n",
              epoch, mean(train_loss), mean(train_bce), mean(train_dice)))

  model$eval()
  valid_bce <- c()
  valid_dice <- c()
  valid_loss <- c()

  i <- 0
  coro::loop(for (b in valid_dl) {

    i <<- i + 1
    c(bce_loss, dice_loss, loss) %<-% valid_batch(b)
    valid_bce <- c(valid_bce, bce_loss)
    valid_dice <- c(valid_dice, dice_loss)
    valid_loss <- c(valid_loss, loss)

  })

  cat(sprintf("\n Epoch %d, validation: loss: %3f, bce: %3f, dice: %3f\n",
              epoch, mean(valid_loss), mean(valid_bce), mean(valid_dice)))
}

# note that DCNN model mask prediction will be based on the model complexity!

# Epoch 1, training: loss:0.495716, bce: 0.307674, dice: 0.934480
# Epoch 1, validation: loss: 0.336556, bce: 0.070743, dice: 0.956785
#
# Epoch 2, training: loss:0.290850, bce: 0.114878, dice: 0.701452
# Epoch 2, validation: loss: 0.213417, bce: 0.046690, dice: 0.602447
#
# Epoch 3, training: loss:0.204054, bce: 0.102973, dice: 0.439909
# Epoch 3, validation: loss: 0.218209, bce: 0.051383, dice: 0.607468
#
# Epoch 4, training: loss:0.199831, bce: 0.101564, dice: 0.429123
# Epoch 4, validation: loss: 0.243201, bce: 0.068757, dice: 0.650236
#
# Epoch 5, training: loss:0.201025, bce: 0.101525, dice: 0.433192
# Epoch 5, validation: loss: 0.258278, bce: 0.077821, dice: 0.679345
#
# Epoch 6, training: loss:0.188058, bce: 0.094279, dice: 0.406877
# Epoch 6, validation: loss: 0.206198, bce: 0.038769, dice: 0.596865
#
# Epoch 7, training: loss:0.187285, bce: 0.094894, dice: 0.402866
# Epoch 7, validation: loss: 0.216684, bce: 0.050536, dice: 0.604361
#
# Epoch 8, training: loss:0.184097, bce: 0.093609, dice: 0.395237
# Epoch 8, validation: loss: 0.235675, bce: 0.060247, dice: 0.645006
#
# Epoch 9, training: loss:0.179406, bce: 0.090458, dice: 0.386951
# Epoch 9, validation: loss: 0.238004, bce: 0.055283, dice: 0.664353
#
# Epoch 10, training: loss:0.175083, bce: 0.089144, dice: 0.375608
# Epoch 10, validation: loss: 0.258206, bce: 0.080033, dice: 0.673942
#
# Epoch 11, training: loss:0.175821, bce: 0.089795, dice: 0.376549
# Epoch 11, validation: loss: 0.213895, bce: 0.037936, dice: 0.624464
#
# Epoch 12, training: loss:0.174915, bce: 0.087026, dice: 0.379987
# Epoch 12, validation: loss: 0.264673, bce: 0.083884, dice: 0.686514
#
# Epoch 13, training: loss:0.171973, bce: 0.087210, dice: 0.369754
# Epoch 13, validation: loss: 0.236396, bce: 0.065432, dice: 0.635311
#
# Epoch 14, training: loss:0.167253, bce: 0.084660, dice: 0.359970
# Epoch 14, validation: loss: 0.210641, bce: 0.047418, dice: 0.591495
#
# Epoch 15, training: loss:0.170835, bce: 0.083572, dice: 0.374449
# Epoch 15, validation: loss: 0.246905, bce: 0.067056, dice: 0.666552
#
# Epoch 16, training: loss:0.167938, bce: 0.083285, dice: 0.365460
# Epoch 16, validation: loss: 0.228558, bce: 0.057135, dice: 0.628546
#
# Epoch 17, training: loss:0.161383, bce: 0.080367, dice: 0.350421
# Epoch 17, validation: loss: 0.221012, bce: 0.055709, dice: 0.606719
#
# Epoch 18, training: loss:0.158193, bce: 0.078068, dice: 0.345150
# Epoch 18, validation: loss: 0.222881, bce: 0.052606, dice: 0.620188
#
# Epoch 19, training: loss:0.158190, bce: 0.078865, dice: 0.343280
# Epoch 19, validation: loss: 0.223159, bce: 0.054643, dice: 0.616361
#
# Epoch 20, training: loss:0.161496, bce: 0.079807, dice: 0.352102
# Epoch 20, validation: loss: 0.221487, bce: 0.051793, dice: 0.617438

# > proc.time()  # In Seconds! About 2,154,523 sec ~ 598 hours
#      user    system   elapsed
# 2154523.7  776791.3  141279.2

Evaluate the trained DCNN model using model_6.pt and a batch of $n=10$ validation datasets (2D brain imaging scans).

# EVALUATION
# without random sampling, we'd mainly see lesion-free patches
######  Old interactive plot_ly() version
# N <- 10 # number of cases to predict the masks for
# 
# eval_ds <- brain_dataset(valid_dir, augmentation_params = NULL, random_sampling = TRUE)
# eval_dl <- dataloader(eval_ds, batch_size = N)
# 
# batch <- eval_dl %>% dataloader_make_iter() %>% dataloader_next()
# 
# # Load a previously save torch model
# # epoch = 1
# # torch_load(model, paste0("model_", epoch, ".pt"))
# 
# # torch_save(model, "C:/Users/Dinov/Desktop/model_epoch_1.pt")
# 
# ##### Requirements for loading a pre-trained model .....
# # load torch packages above
# # The pre-computed model files are available on Canvas:
# # https://umich.instructure.com/courses/38100/files/folder/Case_Studies/36_TCGA_MRI_Segmentation_Data_Phenotypes
# #  localModelsFolder <- "C:/Users/IvoD/Desktop/Ivo.dir/Research/UMichigan/Publications_Books/2023/DSPA_Springer_2nd_Edition_2023/Rmd_HTML/appendix/"
# 
# localModelsFolder <-getwd()
# 
# model <- torch_load(paste0(localModelsFolder, "/appendix/model_6.pt"))
# # train_dir <- "/data/data"
# # valid_dir <- "/data/mri_valid"
# # brain_dataset ...
# # train_ds; valid_ds
# # eval_ds <- brain_dataset(valid_dir,augmentation_params=NULL,random_sampling=TRUE)
# # eval_dl <- dataloader(eval_ds, batch_size = 8)
# # batch <- eval_dl %>% dataloader_make_iter() %>% dataloader_next()
# # device <- torch_device(if(cuda_is_available()) "cuda" else "cpu")
# # calc_dice_loss
# library(plotly)
# 
# imgsTruePred <- map (1:N, function(i) {
#   # Get the 3 images
#   img <- batch[[1]][i, .., drop = FALSE]                            # Image to predict the tumor mask for
#   inferred_mask <- model(img$to(device = device))                   # Predicted MASK
#   true_mask <- batch[[2]][i, .., drop = FALSE]$to(device = device)  # True manual mask delineation
# 
#   # compute/report the BCE/Dice performance metrics
#   bce <- nnf_binary_cross_entropy(inferred_mask, true_mask)$to(device = "cpu") %>% as.numeric()
#   dc <- calc_dice_loss(inferred_mask, true_mask)$to(device = "cpu") %>% as.numeric()
#   cat(sprintf("\nSample %d, bce: %3f, dice: %3f\n", i, bce, dc))
# 
#   # extract the inferred predicted mask as a 2D image/array of probability values per voxel!
#   inferred_mask <- inferred_mask$to(device = "cpu") %>% as.array() %>% .[1, 1, , ]
#   # Binarize the probability tumor prediction to binary mask
#   inferred_mask <- ifelse(inferred_mask > 0.48, 1, 0)  # In a real run, use "inferred_mask > 0.5, 1, 0"
#   # hist(as.matrix(inferred_mask[1,1,,]))
# 
#   imgs <- img[1, 1, ,] %>% as.array() %>% as.array() %>%
#     plot_ly(z=., type="heatmap", showscale = FALSE, name="Image")  %>%
#       layout(showlegend=FALSE,  yaxis = list(scaleanchor = "x", scaleratio = 1))    # hovermode = "y unified"
#   masks <- true_mask$to(device = "cpu")[1, 1, ,] %>% as.array() %>% as.array() %>%
#     plot_ly(z=., type="heatmap", showscale = FALSE, name="True Mask") %>%
#       layout(showlegend=FALSE,  yaxis = list(scaleanchor = "x", scaleratio = 1))    # hovermode = "y unified",
#   predMasks <- inferred_mask %>% as.array() %>%
#     plot_ly(z=., type="heatmap", showscale = FALSE, name="Unet-Derived Mask") %>%
#         layout(showlegend=FALSE,  yaxis = list(scaleanchor = "x", scaleratio = 1))  # hovermode = "y unified",
#   rowSubPlots <- subplot(imgs, masks, predMasks, nrows = 1, shareY = TRUE) %>%
#             # , widths = c(0.3, 0.3, 0.3))
#     layout(hovermode = "y unified") #, yaxis = list(scaleanchor = "x", scaleratio = 1))
# })

library(ggplot2)
library(dplyr)
library(purrr)
library(tidyr)

# Evaluate on N validation cases (same as original)
N <- 10
eval_ds <- brain_dataset(valid_dir, augmentation_params = NULL, random_sampling = TRUE)
eval_dl <- dataloader(eval_ds, batch_size = N)
batch <- eval_dl %>% dataloader_make_iter() %>% dataloader_next()

# Load pre-trained torch model (adjust path as needed)
model <- torch_load(paste0(getwd(), "/appendix/model_6.pt"))

# We'll collect all plots into one data frame
all_cases <- map_dfr(1:N, function(i) {
  # Get image, true mask, predicted mask
  img_tensor <- batch[[1]][i, .., drop = FALSE]          # shape (1,3,256,256) or similar
  true_mask_tensor <- batch[[2]][i, .., drop = FALSE]
  
  # Predict
  inferred_mask_tensor <- model(img_tensor$to(device = device))
  
  # Compute metrics (optional, but keep for console output)
  bce <- nnf_binary_cross_entropy(inferred_mask_tensor, true_mask_tensor)$to(device = "cpu") %>% as.numeric()
  dc <- calc_dice_loss(inferred_mask_tensor, true_mask_tensor)$to(device = "cpu") %>% as.numeric()
  cat(sprintf("\nSample %d, bce: %3f, dice: %3f\n", i, bce, dc))
  
  # Convert to 2D matrices (use first channel, first batch dimension)
  img_mat <- img_tensor$to(device = "cpu")[1, 1, , ] %>% as.array()   # choose one channel for display
  true_mat <- true_mask_tensor$to(device = "cpu")[1, 1, , ] %>% as.array()
  
  # Predicted mask: probability -> binary
  prob_mat <- inferred_mask_tensor$to(device = "cpu") %>% as.array() %>% .[1, 1, , ]
  pred_mat <- ifelse(prob_mat > 0.48, 1, 0)
  
  # Build data frames for each type
  bind_rows(
    expand.grid(x = 1:ncol(img_mat), y = 1:nrow(img_mat)) %>%
      mutate(value = as.vector(img_mat),
             case = paste0("Case ", i),
             type = "Brain Image"),
    expand.grid(x = 1:ncol(true_mat), y = 1:nrow(true_mat)) %>%
      mutate(value = as.vector(true_mat),
             case = paste0("Case ", i),
             type = "True Mask"),
    expand.grid(x = 1:ncol(pred_mat), y = 1:nrow(pred_mat)) %>%
      mutate(value = as.vector(pred_mat),
             case = paste0("Case ", i),
             type = "Predicted Mask")
  )
})

## 
## Sample 1, bce: 0.316758, dice: 0.771990
## 
## Sample 2, bce: 0.057257, dice: 0.983731
## 
## Sample 3, bce: 0.033557, dice: 0.126262
## 
## Sample 4, bce: 0.031052, dice: 0.566501
## 
## Sample 5, bce: 0.085775, dice: 0.217008
## 
## Sample 6, bce: 0.086270, dice: 0.215835
## 
## Sample 7, bce: 0.197902, dice: 0.480702
## 
## Sample 8, bce: 0.032555, dice: 0.177114
## 
## Sample 9, bce: 0.157686, dice: 0.461385
## 
## Sample 10, bce: 0.189908, dice: 0.350955

# Create static ggplot grid
ggplot(all_cases, aes(x = x, y = y, fill = value)) +
  geom_raster() +
  facet_grid(case ~ type) +   # rows = cases, columns = image types
  scale_y_reverse() +
  scale_fill_gradient(low = "black", high = "white") +
  coord_fixed() +
  theme_void() +
  theme(legend.position = "none",
        strip.text = element_text(size = 10, face = "bold"),
        strip.background = element_blank()) +
  labs(title = paste0("Model Validation (Torch) – ", N, " Cases\nBrain Image | True Mask | Predicted Mask"))

Finally, we can assess the unet model performance on the independent validation images, report the DCNN-derived tumor masks (as images) and some quantitative measures (e.g., Binary Cross-Entropy, Dice coefficient). The figure displays the results - brain image, true mask, and DCNN-estimated mask.

1.3 Torch-based DNN-based Image Reconstruction and Synthetic Image Generation

1.3.1 Data Reparametrization

This image-normalization (standardization) is necessary to avoid extreme differences in image intensities (e.g., sMRI, fMRI, PET, SPECT, dMDI/DTI, MRS, and other imaging modalities tend to have vastly different scales).

reparametrize <- function(mu, logvar){
    std = torch_exp(0.5*logvar)
    eps = torch_randn_like(std)
    z = mu + std * eps
    return (z)
}

1.3.2 Define the Architecture of the UNet Encoder Branch

h_dim <- c(32, 64, 128, 256, 512)

net_encoder <- nn_module(
        initialize = function(latent_size=1000){

        self$latent_size <- latent_size  # Z

        # h_dim <- c(32, 64, 128, 256, 512)
        in_channels = 3

        # Encoder

        self$encoder <- nn_sequential(
                nn_conv2d(in_channels, out_channels=h_dim[1],
                          kernel_size= 3, stride= 2, padding = 1),
                nn_batch_norm2d(h_dim[1]),
                nn_leaky_relu(),

                nn_conv2d(h_dim[1], out_channels=h_dim[2],
                          kernel_size= 3, stride= 2, padding = 1),
                nn_batch_norm2d(h_dim[2]),
                nn_leaky_relu(),

                nn_conv2d(h_dim[2], out_channels=h_dim[3],
                          kernel_size= 3, stride= 2, padding = 1),
                nn_batch_norm2d(h_dim[3]),
                nn_leaky_relu(),

                nn_conv2d(h_dim[3], out_channels=h_dim[4],
                          kernel_size= 3, stride= 2, padding = 1),
                nn_batch_norm2d(h_dim[4]),
                nn_leaky_relu(),

                nn_conv2d(h_dim[4], out_channels=h_dim[5],
                          kernel_size= 3, stride= 2, padding = 1),
                nn_batch_norm2d(h_dim[5]),
                nn_leaky_relu()
                )

        self$mu_layer <- nn_linear(h_dim[5]*64, latent_size)
        self$logvar_layer <- nn_linear(h_dim[5]*64, latent_size)
        },

        forward = function(x){

            list_output = c()
            hidden = self$encoder(x)
            hidden = torch_flatten(hidden, start_dim=2)
            mu = self$mu_layer(hidden)
            logvar = self$logvar_layer(hidden)
            z = reparametrize(mu, logvar)
            list_output = append(list_output, z)
            list_output = append(list_output, mu)
            list_output = append(list_output, logvar)

            return (list_output)
        }
)

1.3.3 Define the Architecture of the UNet Decoder Branch

net_decoder <- nn_module(
        initialize = function(latent_size=1000){

        self$latent_size <- latent_size  # Z

        self$decoder_input <- nn_linear(latent_size, h_dim[5]*64)

        hidden_dims = c(32, 64, 128, 256, 512)

        # decoder
        self$decoder <- nn_sequential(
                nn_conv_transpose2d(hidden_dims[5],
                                   hidden_dims[4],
                                   kernel_size=3,
                                   stride = 2,
                                   padding=1,
                                   output_padding=1),
                nn_batch_norm2d(hidden_dims[4]),
                nn_leaky_relu(),

                nn_conv_transpose2d(hidden_dims[4],
                                   hidden_dims[3],
                                   kernel_size=3,
                                   stride = 2,
                                   padding=1,
                                   output_padding=1),
                nn_batch_norm2d(hidden_dims[3]),
                nn_leaky_relu(),

                nn_conv_transpose2d(hidden_dims[3],
                                   hidden_dims[2],
                                   kernel_size=3,
                                   stride = 2,
                                   padding=1,
                                   output_padding=1),
                nn_batch_norm2d(hidden_dims[2]),
                nn_leaky_relu(),

                nn_conv_transpose2d(hidden_dims[2],
                                   hidden_dims[1],
                                   kernel_size=3,
                                   stride = 2,
                                   padding=1,
                                   output_padding=1),
                nn_batch_norm2d(hidden_dims[1]),
                nn_leaky_relu()
                )

        self$final_layer <- nn_sequential(
                            nn_conv_transpose2d(hidden_dims[1],
                                               hidden_dims[1],
                                               kernel_size=3,
                                               stride=2,
                                               padding=1,
                                               output_padding=1),
                            nn_batch_norm2d(hidden_dims[1]),
                            nn_leaky_relu(),
                            nn_conv2d(hidden_dims[1], out_channels= 3,
                                      kernel_size= 3, padding= 1),
                            nn_sigmoid())
        },

        forward = function(z){

            z = self$decoder_input(z)
            z = z$view(c(-1, 512, 8, 8))
            x_hat = self$final_layer(self$decoder(z))

            return (x_hat)
        }
)

1.3.4 Define the Loss Function & Optimizer

loss_function <- function(x_hat, x, mu, logvar){
    N = mu$shape[1]
    Z = mu$shape[2]
    loss = nnf_binary_cross_entropy(x_hat, x, reduction='sum') -0.5*torch_sum(logvar-mu^2-torch_exp(logvar))-0.5*Z*N
    return (loss/N)
}

1.3.5 Train DNN UNet (Encoder & Decoder Branches)

For GPU optimization of this long calculation see this GPUComputingWithR module.

model_encoder = net_encoder()
model_decoder = net_decoder()
# model = model$to(device='cuda')
optimizer <- optim_adam(model_decoder$parameters)
# optimizer <- optim_adam(model$parameters)
# train_data <- train_data / 255 # torch tensor with shape (N, C, H, W)

# torch_model$cuda()

# Epochs
capture.output(     # Capture/Suppress Pytorch output, which overwhelms the knitted HTML report
  for (epoch in 1:5) {   # could increase the number of epochs for better results
    l <- c()
    coro::loop(for (b in train_dl) {
      optimizer$zero_grad()
      gnd = b[[1]]
      list_output <- model_encoder(gnd)
      z = list_output[[1]]
      mu = list_output[[2]]
      logvar = list_output[[3]]
      output <- model_decoder(z)

      loss <- loss_function(output, gnd, mu, logvar)
      print(loss)
      loss$backward()
      optimizer$step()
      l <- c(l, loss$item())
    })

    cat(sprintf("Loss at epoch %d: %3f\n", epoch, mean(l)))
  })

# capture.output(     # Capture/Suppress Pytorch output, which overwhelms the knitted HTML report
#   for (epoch in 1:5) {   # could increase the number of epochs for better results
#     l <- c()
#     coro::loop(for (b in train_dl) {
#       # optimizer$zero_grad()
#       encoderOptimizer$zero_grad()
#       decoderOptimizer$zero_grad()
#       gnd = b[[1]]
#       list_output <- model_encoder(gnd)
#       z = list_output[[1]]
#       mu = list_output[[2]]
#       logvar = list_output[[3]]
#       output <- model_decoder(z)
#
#       loss <- loss_function(output, gnd, mu, logvar)
#       print(loss)
#       loss$backward()
#       # optimizer$step()
#       encoderOptimizer$step()
#       decoderOptimizer$step()
#       l <- c(l, loss$item())
#     })
#
#     cat(sprintf("Loss at epoch %d: %3f\n", epoch, mean(l)))
#   })

1.3.6 Obtain the DNN-derived Image Reconstructions and Random Synthetic Images

The synthetic images are derived by randomly sampling the latent space of the DNN (bottom of network) and using solely the decoding branch of the UNet.

img <- (train_ds[1][1]$img)$view(c(1, 3, 256, 256))
z = (model_encoder(img))[[1]]
reconstructed <-  model_decoder(z)
reconstructed <- (reconstructed$view(c(3, 256, 256)))$permute(c(2, 3, 1)) %>% as.array()

# Generate a random tensor of numbers in the Latent space using a uniform distribution on the interval [0,1)
# these will be used to seed the VAE decoder branch of the Unet and synthetically generate 2D brain images
z <- torch_rand(c(1, 1000))
generated_images <- model_decoder(z)
generated_images <- (generated_images$view(c(3, 256, 256)))$permute(c(2, 3, 1)) %>% as.array()

1.3.7 Result Visualization

Finally, plot examples of the DNN-derived image reconstruction and synthetic image generation.

library (plotly)
rasterPlotly <- function (image, name="", hovermode = NULL) {
  myPlot <- plot_ly(type="image", z=image, name=name, hoverlabel=name, text=name,
            hovertext=name, hoverinfo="name+x+y+z") %>%
    layout(title=name, hovermode = hovermode, xaxis = list(hoverformat = '.1f'), yaxis = list(hoverformat = '.1f'))
  return(myPlot)
}

p1 <- rasterPlotly(image=255*reconstructed, name = "DNN-reconstructed RGB Image")
# rasterPlotly(image=255*generated_images, name = "Synthetic DNN-generated RGB Image")
p2 <- rasterPlotly(image=20/(0.1+generated_images), name = "Synthetic DNN-generated RGB Image")
# subplot(p1, p2) %>% layout(title="DNN (Unet) Image Reconstruction and Synthetic Generation")
p1

p2

2 Tensorflow Image Pre-processing Pipeline

The image prepossessing steps above use torch syntax. This blog provides context on similarities and differences between torch/pytorch/libtorch and tensorflow/keras.

Next, we will use tensorflow (TF) to again ingest independently the images employing tf$image functions, e.g., decode_png(). This introduction to RStudio-Tensorflow is helpful to understand the tf syntax.

Assuming the data is already loaded, see the previous section (Torch Data Import), we will first preload all the necessary tensorflow libraries.

# clean environment
rm(list = ls())
gc()

##            used  (Mb) gc trigger  (Mb) max used  (Mb)
## Ncells  3534060 188.8    7236325 386.5  7236325 386.5
## Vcells 16495443 125.9   53993932 412.0 84365517 643.7

venv_name <- "r-tensorflow"
reticulate::use_virtualenv(virtualenv = venv_name, required = TRUE)
library(reticulate)

library(tensorflow)
library(tfdatasets)
library(rsample)  # for training() method
# library(reticulate)
library(purrr)
library(keras)
library(unet)
library(tibble)
library(plotly)

Remember that we already have the ZIP data (training=data and validation=mri_valid) downloaded and expanded in a local partition /data/. We will split the data into training:testing $80:20$ and read the imaging/masking data from the PNG files into tensorflow data objects training_dataset and testing_dataset (this is the separate validation set).

library(rsample)

####    OLD
# train_dir <- "/data/data"
# valid_dir <- "/data/mri_valid"
#
# # Load PNG images
# data_train <- tibble(
#   img = grep(list.files(train_dir, full.names = TRUE, pattern = "\\.png$", recursive = TRUE), # or "tif"
#         pattern = 'mask', invert = TRUE, value = TRUE),
#   mask = grep(list.files(train_dir, full.names = TRUE, pattern = "\\.png$", recursive = TRUE),
#         pattern = 'mask', value = TRUE)
# )
# length(data_train$mask)  # [1] 2520
# data_train$img[[1]]  # [1] "/data/data/TCGA_CS_4941_19960909/TCGA_CS_4941_19960909_1.tif"

# Use relative paths (assuming the data folders are inside the project's 'data/' directory)
train_dir <- "./data/data"
valid_dir <- "./data/mri_valid"

# Check that directories exist
stopifnot(dir.exists(train_dir), dir.exists(valid_dir))

# Use the correct file extension: .tif (not .png)
pattern <- "\\.png$"

data_train <- tibble(
  img = grep(list.files(train_dir, full.names = TRUE, pattern = pattern, recursive = TRUE),
             pattern = 'mask', invert = TRUE, value = TRUE),
  mask = grep(list.files(train_dir, full.names = TRUE, pattern = pattern, recursive = TRUE),
             pattern = 'mask', value = TRUE)
)

data_valid <- tibble(
  img = grep(list.files(valid_dir, full.names = TRUE, pattern = pattern, recursive = TRUE),
             pattern = 'mask', invert = TRUE, value = TRUE),
  mask = grep(list.files(valid_dir, full.names = TRUE, pattern = pattern, recursive = TRUE),
             pattern = 'mask', value = TRUE)
)

# Confirm non‑empty data frames
if(nrow(data_train) == 0) stop("No training images found. Check directory and file pattern.")
if(nrow(data_valid) == 0) stop("No validation images found.")

data_train$img[[2520]]; data_train$mask[[2520]]

## [1] "./data/data/TCGA_HT_7605_19950916/TCGA_HT_7605_19950916_20.png"

## [1] "./data/data/TCGA_HT_7605_19950916/TCGA_HT_7605_19950916_20_mask.png"

# ##### Convert all TIFF files to PNG (recommended for simplicity)
# ##### Run this once before building the TensorFlow datasets:
# library(magick)
#
# convert_tiff_to_png <- function(dir_path) {
#   tiff_files <- list.files(dir_path, pattern = "\\.tif$", full.names = TRUE, recursive = TRUE)
#   for (f in tiff_files) {
#     img <- image_read(f)
#     png_path <- sub("\\.tif$", ".png", f)
#     image_write(img, path = png_path, format = "png")
#     # Optional: delete original TIFF to save space
#     # unlink(f)
#   }
# }
#
# convert_tiff_to_png(train_dir)
# convert_tiff_to_png(valid_dir)

# # Or Load the TIFF images
# data_train <- tibble(
#   img = grep(list.files(train_dir, full.names = TRUE, pattern = "*\\d\\.tif$", recursive = TRUE),
#              pattern = 'mask', invert = TRUE, value = TRUE),
#   mask = grep(list.files(train_dir, full.names = TRUE, pattern = "*mask\\.tif$", recursive = TRUE),
#             pattern = 'mask', value = TRUE)
# )
# length(data_train$mask)  # [1] 3216
# data_train$img[[1]]  # [1] "/data/data/TCGA_CS_4941_19960909/TCGA_CS_4941_19960909_1.tif"
# data_train$img[[3216]]; data_train$mask[[3216]]

data_train_valid <- initial_split(data_train, prop = 0.8)

# data_train_valid$data$img[1111];  data_train_valid$data$mask[1111]
# # "/data/data/TCGA_DU_7299_19910417/TCGA_DU_7299_19910417_17_mask.png"
# tarfile <- 'C:/Users/Dinov/Desktop/BrainTumorImagingDataZipArchive.tgz'
# tar(tarfile,'/data/',compression='gzip')
# untar(tarfile,'/data/',compression='gzip')

# Check the tensor shape
# tf$image$decode_png(tf$io$read_file(data_train$img[[1]]))
# Inspect the structure of  data_train_valid
# str(data_train_valid)

training_dataset <- training(data_train_valid) %>%
  tensor_slices_dataset() %>%
  dataset_map(~.x %>% list_modify(
    # decode_jpeg yields a 3d tensor of shape (256, 256, 3)
    # Check tensor shapes!
    img = tf$image$decode_png(tf$io$read_file(.x$img), channels = 3),
    # img = ifelse(as.character(tf$image$decode_png(tf$io$read_file(.x$img)))$shape=="(256, 256, 1)",
    #                    fix3ChannelImg(tf$image$decode_png(tf$io$read_file(.x$img))),
    #                    tf$image$decode_png(tf$io$read_file(.x$img))),
    # Note that decode_gif yields a 4d tensor of shape (1, 256, 256, 3),
    # so we remove the unneeded batch dimension and all but one
    # of the 3 (identical) channels: tf$image$decode_gif(tf$io$read_file(.x$mask))[1,,,][,,1,drop=FALSE]
    mask = tf$image$decode_png(tf$io$read_file(.x$mask))
    # wrong_tensor_shape = (as.character(.x$shape) =="(256, 256, 1)")
    # if (as.character(img$shape) == "(256, 256, 1)")
    #   img = tf$image$grayscale_to_rgb(img)
  ))
  # %>%
  # tfdatasets::dataset_filter(
  #      function(record) {
  #         tf$equal(as.character(record$img$shape), "(256, 256, 3)")
  #      }
  #   )

testing_dataset <- testing(data_train_valid) %>%
  tensor_slices_dataset() %>%
  dataset_map(~.x %>% list_modify(
    # decode_jpeg yields a 3d tensor of shape (256, 256, 3)
    # Check tensor shapes!
    img = tf$image$decode_png(tf$io$read_file(.x$img), channels = 3),
    # img = ifelse(as.character(tf$image$decode_png(tf$io$read_file(.x$img)))$shape=="(256, 256, 1)",
    #                    fix3ChannelImg(tf$image$decode_png(tf$io$read_file(.x$img))),
    #                    tf$image$decode_png(tf$io$read_file(.x$img))),
    # Note that decode_gif yields a 4d tensor of shape (1, 256, 256, 3),
    # so we remove the unneeded batch dimension and all but one
    # of the 3 (identical) channels: tf$image$decode_gif(tf$io$read_file(.x$mask))[1,,,][,,1,drop=FALSE]
    mask = tf$image$decode_png(tf$io$read_file(.x$mask))
  ))

# Check the size of the entire (training & testing) TF datasets
tf$data$experimental$cardinality(training_dataset)$numpy()  # [1] 2572

## [1] 2653

tf$data$experimental$cardinality(testing_dataset)$numpy()   # [1] 644

## [1] 664

example <- training_dataset %>% reticulate::as_iterator() %>% reticulate::iter_next(); example

## $img
## tf.Tensor(
## [[[0 0 0]
##   [0 0 0]
##   [0 0 0]
##   ...
##   [0 0 0]
##   [0 0 0]
##   [0 0 0]]
## 
##  [[0 0 0]
##   [1 0 0]
##   [1 0 0]
##   ...
##   [0 0 0]
##   [0 0 0]
##   [0 0 0]]
## 
##  [[0 0 0]
##   [0 0 0]
##   [0 0 0]
##   ...
##   [0 0 0]
##   [0 0 0]
##   [0 0 0]]
## 
##  ...
## 
##  [[0 0 0]
##   [0 0 0]
##   [0 0 0]
##   ...
##   [0 0 0]
##   [0 0 0]
##   [0 0 0]]
## 
##  [[0 0 0]
##   [0 0 0]
##   [0 0 0]
##   ...
##   [0 0 0]
##   [0 0 0]
##   [0 0 0]]
## 
##  [[0 0 0]
##   [0 0 0]
##   [0 0 0]
##   ...
##   [0 0 0]
##   [0 0 0]
##   [0 0 0]]], shape=(256, 256, 3), dtype=uint8)
## 
## $mask
## tf.Tensor(
## [[[0]
##   [0]
##   [0]
##   ...
##   [0]
##   [0]
##   [0]]
## 
##  [[0]
##   [0]
##   [0]
##   ...
##   [0]
##   [0]
##   [0]]
## 
##  [[0]
##   [0]
##   [0]
##   ...
##   [0]
##   [0]
##   [0]]
## 
##  ...
## 
##  [[0]
##   [0]
##   [0]
##   ...
##   [0]
##   [0]
##   [0]]
## 
##  [[0]
##   [0]
##   [0]
##   ...
##   [0]
##   [0]
##   [0]]
## 
##  [[0]
##   [0]
##   [0]
##   ...
##   [0]
##   [0]
##   [0]]], shape=(256, 256, 1), dtype=uint8)

# Check to confirm all brain-images (tensors) are of the same RGB 3-channel shape (256, 256, 3)
capture.output(
  tryCatch({
    ind2 <- list(); i <- 1
    iter <- make_iterator_one_shot(training_dataset)
    until_out_of_range({
        case <- iterator_get_next(iter)
        ind2[[i]] <- ifelse (as.character(case[[1]]$shape)=="(256, 256, 1)", "Incorrect", "Correct")
        if (ind2[[i]] =="Incorrect") print(".")
        i <- i+1
        # print("."); # str(case)
    })
    print("Check if any brain-images (tensors) are NOT of the correct RGB 3-channel shape (256, 256, 3)")
    table(unlist(ind2))
  }, error = function(e) e, finally = print("Done!")))

## [1] "[1] \"Check if any brain-images (tensors) are NOT of the correct RGB 3-channel shape (256, 256, 3)\""
## [2] "[1] \"Done!\""                                                                                       
## [3] ""                                                                                                    
## [4] "Correct "                                                                                            
## [5] "   2653 "

In practice, the uint8 data type of the RGB values in PNG/TIFF files are human-interpretable, the U-net expects floating point tensor elements. Thus, we will convert the 3-channel input images to real numbers and scale them to values in the interval $[0,1)$.

For improving computational efficiency, sometimes it may be necessary to reduce the computational burden by down-sizing the images, e.g., to $128\times 128$ or even $32\times 32$. In principle, we want to protect the native aspect ratio in the images, keep the pixels isotropic, and avoid distortion. In this case we won’t reduce the image size, but it may be useful sometimes.

The Tensorflow protocol also includes augmentation of the imaging data. Earlier, we used torch augmentation, now we use tensorflow. Of course, we will apply the same augmentation-transformations (scale, rotate, flip) to the tumor-mask as well as the brain images, see these three methods resize(), flip(), rotate() in the create_dataset() function above. In addition, we can augment the data by intensity-transformations that preserve the spatial image structure but alter the contrast, brightness, and image saturation, see this random_bsh() method.

# mat_to_df helper function used to transform matrix tensors to dataframes
mat_to_df <- function(mat, name) {
  expand.grid(x = 1:ncol(mat), y = 1:nrow(mat)) %>%
    mutate(value = as.vector(mat),
           panel = name)
}

# For data augmentation - contrast, brightness & saturation perturbations
random_bsh <- function(img) {
  img %>%
    tf$image$random_brightness(max_delta = 0.2) %>%
    tf$image$random_contrast(lower = 0.3, upper = 0.6) %>%
    tf$image$random_saturation(lower = 0.5, upper = 0.6) %>%
    # ensure image intensities are between 0 and 1
    tf$clip_by_value(0, 1)
}
# Test Brightness-Saturation-Contrast Hue intensity augmentation
myIterator <- training_dataset %>% reticulate::as_iterator()
#### Old plot_ly() version for interactive testing
# example <- myIterator$get_next(); example <- myIterator$get_next(); example <- myIterator$get_next() # example
# bshExample <- random_bsh(tf$image$convert_image_dtype(example$img, dtype = tf$float32))
# 
# arr1 <- array(as.numeric(unlist(example$img)), dim=c(256, 256, 3))
# p1 <- plot_ly(type="heatmap", z=~arr1[,,1], name="Image")
# arr2 <- array(as.numeric(unlist(example$mask)), dim=c(256, 256, 1))
# p2 <- plot_ly(type="heatmap", z=~arr2[,,1], name="Mask")
# arr3 <- 255*array(as.numeric(unlist(bshExample)), dim=c(256, 256, 3))
# p3 <- plot_ly(type="heatmap", z=~arr3[,,1], name="BSC-Image")
# subplot(p1, p2, p3, nrows=1) %>% layout(title="Brightness-Saturation-Contrast Hue Image Intensity Augmentation") %>% hide_colorbar()
library(ggplot2)
library(patchwork)

# Get one example (same as original)
myIterator <- training_dataset %>% reticulate::as_iterator()
example <- myIterator$get_next()
example <- myIterator$get_next()
example <- myIterator$get_next()
bshExample <- random_bsh(tf$image$convert_image_dtype(example$img, dtype = tf$float32))

# Extract matrices
img_mat <- array(as.numeric(unlist(example$img)), dim=c(256,256,3))[,,1]   # first channel
mask_mat <- array(as.numeric(unlist(example$mask)), dim=c(256,256,1))[,,1]
bsh_mat <- array(as.numeric(unlist(bshExample)), dim=c(256,256,3))[,,1]

# # Function to convert matrix to data frame for ggplot
# mat_to_df <- function(mat, name) {
#   expand.grid(x = 1:ncol(mat), y = 1:nrow(mat)) %>%
#     mutate(value = as.vector(mat),
#            panel = name)
# }

bind_rows(
  mat_to_df(img_mat, "Raw Image"),
  mat_to_df(mask_mat, "Raw Mask"),
  mat_to_df(bsh_mat, "BSC Image")
) %>%
  ggplot(aes(x = x, y = y, fill = value)) +
  geom_raster() +
  facet_wrap(~ panel, nrow = 1) +
  scale_y_reverse() +
  scale_fill_gradient(low = "black", high = "white") +
  coord_fixed() +
  theme_void() +
  theme(legend.position = "none",
        strip.text = element_text(size = 12, face = "bold")) +
  labs(title = "Brightness‑Saturation‑Contrast Hue Image Intensity Augmentation")

# Random Rotation
# Rotation requires TF-Addon package: https://www.tensorflow.org/addons/overview

# reticulate::py_install(c('tensorflow-addons'), pip = TRUE)
# tfaddons::install_tfaddons()
# devtools::install_github('henry090/tfaddons')

library(tfaddons)
# random_rotate = function(img, mask, rot_param=15) {
#     rnd_rot <- runif(1, -rot_param, rot_param)  # random rotation in degrees +/- rot_param
#     img  <- img_rotate(img, angle = rnd_rot)    # tfa$image$rotate(img, angle = rnd_rot)
#     mask <- img_rotate(mask, angle = rnd_rot)   # tfa$image$rotate(mask, angle = rnd_rot)
#     # c(img, mask)
#     return (list(img=img, mask=mask))
# }

#### This is a work-around for the data-augmentation rotation problem
#### related to the zeallot right operator `%->%` and the `magrittr` pipe operator `%>%`
rot_param <- 15 # degrees of rotation
rnd_rot <- runif(1000, -rot_param, rot_param)
randomRotationPaired <- c(rbind(rnd_rot, rnd_rot)) # double the random rotations
currentRotationIndex <- 1

random_rotate = function(img, mask) {
    img  <- img_rotate(img, angle=randomRotationPaired[currentRotationIndex])    # tfa$image$rotate(img, angle = rnd_rot)
    mask <- img_rotate(mask, angle=randomRotationPaired[currentRotationIndex])   # tfa$image$rotate(mask, angle = rnd_rot)
    # c(img, mask)
    currentRotationIndex <- (currentRotationIndex + 1) %% 1000
    return (list(img=img, mask=mask))
}
# tesingRot = img_rotate(example$img, angle=10)

# Test rotator
myIterator <- training_dataset %>% reticulate::as_iterator()
example <- myIterator$get_next(); example <- myIterator$get_next(); example <- myIterator$get_next() # example
# arr1 <- array(as.numeric(unlist(example$img)), dim=c(256, 256, 3))
# p1 <- plot_ly(type="heatmap", z=~arr1[,,1], name="Raw Img")
# arr2 <- array(as.numeric(unlist(example$mask)), dim=c(256, 1, 256))
# p2 <- plot_ly(type="heatmap", z=~arr2[,1,], name="Raw Mask")
# subplot(p1, p2, nrows=1) %>% layout(title="Original Image + Mask") %>% hide_colorbar()
# 
# rotExample <- random_rotate(example$img, example$mask)
# rotExample$img$shape; rotExample$mask$shape
# arr1 <- array(as.numeric(unlist(rotExample$img)), dim=c(256, 256, 3))
# p1 <- plot_ly(type="heatmap", z=~arr1[,, 1], name="Rotated Img")
# arr2 <- array(as.numeric(unlist(rotExample$mask)), dim=c(256, 256, 1))
# p2 <- plot_ly(type="heatmap", z=~arr2[,, 1], name="Rotated Mask")
# subplot(p1, p2, nrows=1) %>% layout(title="Augmented/Rotated Image + Mask") %>% hide_colorbar()

# Same extraction as above (assuming `example` is still the current batch item)
img_mat_orig <- array(as.numeric(unlist(example$img)), dim=c(256,256,3))[,,1]
mask_mat_orig <- array(as.numeric(unlist(example$mask)), dim=c(256,1,256))[,1,]   # shape correction

bind_rows(
  mat_to_df(img_mat_orig, "Raw Image"),
  mat_to_df(mask_mat_orig, "Raw Mask")
) %>%
  ggplot(aes(x = x, y = y, fill = value)) +
  geom_raster() +
  facet_wrap(~ panel, nrow = 1) +
  scale_y_reverse() +
  scale_fill_gradient(low = "black", high = "white") +
  coord_fixed() +
  theme_void() +
  theme(legend.position = "none") +
  labs(title = "Original Image + Mask")

currentRotationIndex <- 1 # reset the currentRotationIndex

# c(img, mask) %<-% random_rotate(example$img, example$mask)
# img$shape; mask$shape
# arr1 <- array(as.numeric(unlist(img)), dim=c(256, 256, 3))
# p1 <- plot_ly(type="heatmap", z=~arr1[,, 1])
# arr2 <- array(as.numeric(unlist(mask)), dim=c(256, 256, 1))
# p2 <- plot_ly(type="heatmap", z=~arr2[,, 1])
# subplot(p1, p2, nrows=1) %>% layout(title="Augmented/Rotated Image + Mask") %>% hide_colorbar()

# Inspect one case
# example <- training_dataset %>% reticulate::as_iterator() %>% reticulate::iter_next()
# example$img %>% as.array() %>% as.raster() %>% plot()

The core functions we built above can be integrated into a new function, create_dataset(), which represents the complete end-to-end image preprocessing pipeline, The input to this pipeline is a dataframe of filenames containing the brain-images and tumor-masks, and the corresponding output contains the training_dataset and the validation_dataset as tensorflow dataset-objects that will be used in the model-fitting phase.

library(zeallot)   # for the reverse-piping assignment operator: %<-%

create_dataset <- function(data, train, batch_size = 32L) {
  dataset <- data %>%   # load all PNG files as images
    tensor_slices_dataset() %>%
    dataset_map(~.x %>% list_modify(
      img = tf$image$decode_png(tf$io$read_file(.x$img), channels = 3),
      mask = tf$image$decode_png(tf$io$read_file(.x$mask))
    )) %>%     # convert image intensities from int8 to 32-bit float
    dataset_map(~.x %>% list_modify(
      img = tf$image$convert_image_dtype(.x$img, dtype = tf$float32),
      mask = tf$image$convert_image_dtype(.x$mask, dtype = tf$float32)
    )) %>%     # reshape all tensor-shape to (256 * 256) to ensure spatial-index homologies
    dataset_map(~.x %>% list_modify(
      img = tf$image$resize(.x$img, size = shape(256, 256)),
      mask = tf$image$resize(.x$mask, size = shape(256, 256))
    ))

  # data augmentation performed on training set only
  if (train) {
    dataset <- dataset %>%
      dataset_map(~.x %>% list_modify(
        # random_rotate(img=.x$img, mask=.x$mask) %->% c(img=img, mask=mask)
        # c(img=img, mask=mask) %<-% random_rotate(img=.x$img, mask=.x$mask) # <- Check this issue, it should work like this:
          # example <- myIterator$get_next(); example <- myIterator$get_next(); example <- myIterator$get_next()
          # c(img, mask) %<-% random_rotate(example[[1]], example[[2]])
          # random_rotate(img=example[[1]], mask=example[[2]]) %->% c(img, mask)
          # img$shape; mask$shape
          #
          # Alternative Solutions
          # c(img, mask) %<-% random_rotate(.x)
          # Syntax: ListOfLists <- c(a, b) %<-% list(a=list(1,2,3), b=list(4,5,6,7)); ListOfLists$b
          # c(img, mask) %<-% random_rotate(.x$img, .x$mask)
          #
          # This works, but we need joint, not separate, rotation of img + mask
          img = random_rotate(img=.x$img, mask=.x$mask)$img,
          mask= random_rotate(img=.x$img, mask=.x$mask)$mask
          # c(img, mask) %<-% unlist(random_rotate(img=.x$img, mask=.x$mask))
          # # mask = random_rotate(.x$img, .x$mask)[[2]]
      )) %>%
      dataset_map(~.x %>% list_modify(
        img = random_bsh(.x$img)
      ))
  }

  # shuffling on training set only
  if (train) {
    dataset <- dataset %>%
      dataset_shuffle(buffer_size = batch_size*4)
  }

  # train in batches; batch size might need to be adapted depending on
  # available memory
  dataset <- dataset %>%
    dataset_batch(batch_size)

  dataset %>%
    # output needs to be unnamed as required by Keras
    dataset_map(unname) # makes example$img -> example[[1]]
}

# Generate the Training and Testing data at iterated TF-Data objects
training_dataset <- create_dataset(training(data_train_valid), train = TRUE)
validation_dataset <- create_dataset(testing(data_train_valid), train = FALSE)
# tf$data$experimental$cardinality(training_dataset)$numpy()
myIterator <- training_dataset %>% reticulate::as_iterator()
example <- myIterator$get_next()  # ; example <- myIterator$get_next()  # shape=(32, 256, 256, 3 or 1)
c(img,mask) %<-% random_rotate(img=example[[1]][1,,,1], mask=example[[2]][1,,,1])
img$shape; mask$shape

## TensorShape([256, 256])

## TensorShape([256, 256])

#### Old plot_ly() for interactive demos
# arr1 <- array(as.numeric(img), dim=c(256, 256))
# p1 <- plot_ly(type="heatmap", z=~arr1, name="Rotated Img")
# arr2 <- array(as.numeric(mask), dim=c(256, 256))
# p2 <- plot_ly(type="heatmap", z=~arr2, name="Rotated Mask")
# subplot(p1, p2, nrows=1) %>% layout(title="Augmented/Rotated Image + Mask") %>% hide_colorbar()

# rotExample <- random_rotate(example$img, example$mask)
# rot_img_mat <- array(as.numeric(unlist(rotExample$img)), dim=c(256,256,3))[,,1]
# rot_mask_mat <- array(as.numeric(unlist(rotExample$mask)), dim=c(256,256,1))[,,1]

c(img,mask) %<-% 
  random_rotate(
    img=example[[1]][1,,,1], 
    mask=example[[2]][1,,,1])
# report tensor shapes
img$shape; mask$shape

## TensorShape([256, 256])

## TensorShape([256, 256])

# Helper function (define once)
mat_to_df <- function(mat, name) {
  expand.grid(x = 1:ncol(mat), y = 1:nrow(mat)) %>%
    mutate(value = as.vector(mat),
           panel = name)
}

# Assuming 'img' and 'mask' are the rotated matrices from your earlier code
bind_rows(
  mat_to_df(img, "Rotated Image"),
  mat_to_df(mask, "Rotated Mask")
) %>%
  ggplot(aes(x = x, y = y, fill = value)) +
  geom_raster() +
  facet_wrap(~ panel, nrow = 1) +
  scale_y_reverse() +
  scale_fill_gradient(low = "black", high = "white") +
  coord_fixed() +
  theme_void() +
  theme(legend.position = "none") +
  labs(title = "Augmented / Rotated Image + Mask")

The UNet DCNN model summary includes:

Column 1: Layer type and specifications
Column 2: “output shape” contains the expected network U-shape parameters. Note that during the encoding phase, the image Width and Height sizes decrease initially (until the middle of the “U”, reaching a minimum resolution of $8\times 8$). Then, they start increasing again during the decoding phase, until reaching the sizes of the original image resolution. Similarly, the number of filters increases during encoding and then decreases during the decoding phase terminating with an output layer having a single filter. Finally, the model architecture includes concatenation layers in the decoding phase that aggregate information from “below” with information that comes “laterally” from the parallel nodes in the encoding phase.
Column 3: shows the number of parameters used in each layer of the DCNN.
Column 4: for each layer (row), column 4 shows the parent (previous) layer in the network.

In this image-segmentation problem, the loss function needs to account for the result of labeling ALL pixel intensities in the brain images. Hence, every pixel location contributes equally to the total loss measure. As Binary classification yields a $1$ (tumor) or $0$ (normal brain tissue), the binary_crossentropy() method is one appropriate choice of a loss function. During the iterative transfer learning process, we can track the classification accuracy, dice coefficient or other evaluation metrics that capture the proportion of correctly classified pixels.

Next we will define, compile and test the base (pre-training default) model, prior to model fitting. This is a naive prediction without model tuning or transfer learning.

# Model Training - Starting with a pre-trained U-net model and then expanding it with Transfer-learning
library(unet)
model <- unet::unet(input_shape = c(256, 256, 3))  # for RGB use 3-channels: input_shape = c(256, 256, 3)
summary(model)

## Model: "model_1"
## ________________________________________________________________________________
##  Layer (type)           Output Shape            Param   Connected to            
##                                                  #                              
## ================================================================================
##  input_2 (InputLayer)   [(None, 256, 256, 3)]   0       []                      
##  conv2d_19 (Conv2D)     (None, 256, 256, 64)    1792    ['input_2[0][0]']       
##  conv2d_20 (Conv2D)     (None, 256, 256, 64)    36928   ['conv2d_19[0][0]']     
##  max_pooling2d_4 (MaxP  (None, 128, 128, 64)    0       ['conv2d_20[0][0]']     
##  ooling2D)                                                                      
##  conv2d_21 (Conv2D)     (None, 128, 128, 128)   73856   ['max_pooling2d_4[0][0]'
##                                                         ]                       
##  conv2d_22 (Conv2D)     (None, 128, 128, 128)   14758   ['conv2d_21[0][0]']     
##                                                 4                               
##  max_pooling2d_5 (MaxP  (None, 64, 64, 128)     0       ['conv2d_22[0][0]']     
##  ooling2D)                                                                      
##  conv2d_23 (Conv2D)     (None, 64, 64, 256)     29516   ['max_pooling2d_5[0][0]'
##                                                 8       ]                       
##  conv2d_24 (Conv2D)     (None, 64, 64, 256)     59008   ['conv2d_23[0][0]']     
##                                                 0                               
##  max_pooling2d_6 (MaxP  (None, 32, 32, 256)     0       ['conv2d_24[0][0]']     
##  ooling2D)                                                                      
##  conv2d_25 (Conv2D)     (None, 32, 32, 512)     11801   ['max_pooling2d_6[0][0]'
##                                                 60      ]                       
##  conv2d_26 (Conv2D)     (None, 32, 32, 512)     23598   ['conv2d_25[0][0]']     
##                                                 08                              
##  max_pooling2d_7 (MaxP  (None, 16, 16, 512)     0       ['conv2d_26[0][0]']     
##  ooling2D)                                                                      
##  dropout_1 (Dropout)    (None, 16, 16, 512)     0       ['max_pooling2d_7[0][0]'
##                                                         ]                       
##  conv2d_27 (Conv2D)     (None, 16, 16, 1024)    47196   ['dropout_1[0][0]']     
##                                                 16                              
##  conv2d_28 (Conv2D)     (None, 16, 16, 1024)    94382   ['conv2d_27[0][0]']     
##                                                 08                              
##  conv2d_transpose_4 (C  (None, 32, 32, 512)     20976   ['conv2d_28[0][0]']     
##  onv2DTranspose)                                64                              
##  concatenate_4 (Concat  (None, 32, 32, 1024)    0       ['conv2d_26[0][0]',     
##  enate)                                                  'conv2d_transpose_4[0][
##                                                         0]']                    
##  conv2d_29 (Conv2D)     (None, 32, 32, 512)     47191   ['concatenate_4[0][0]'] 
##                                                 04                              
##  conv2d_30 (Conv2D)     (None, 32, 32, 512)     23598   ['conv2d_29[0][0]']     
##                                                 08                              
##  conv2d_transpose_5 (C  (None, 64, 64, 256)     52454   ['conv2d_30[0][0]']     
##  onv2DTranspose)                                4                               
##  concatenate_5 (Concat  (None, 64, 64, 512)     0       ['conv2d_24[0][0]',     
##  enate)                                                  'conv2d_transpose_5[0][
##                                                         0]']                    
##  conv2d_31 (Conv2D)     (None, 64, 64, 256)     11799   ['concatenate_5[0][0]'] 
##                                                 04                              
##  conv2d_32 (Conv2D)     (None, 64, 64, 256)     59008   ['conv2d_31[0][0]']     
##                                                 0                               
##  conv2d_transpose_6 (C  (None, 128, 128, 128)   13120   ['conv2d_32[0][0]']     
##  onv2DTranspose)                                0                               
##  concatenate_6 (Concat  (None, 128, 128, 256)   0       ['conv2d_22[0][0]',     
##  enate)                                                  'conv2d_transpose_6[0][
##                                                         0]']                    
##  conv2d_33 (Conv2D)     (None, 128, 128, 128)   29504   ['concatenate_6[0][0]'] 
##                                                 0                               
##  conv2d_34 (Conv2D)     (None, 128, 128, 128)   14758   ['conv2d_33[0][0]']     
##                                                 4                               
##  conv2d_transpose_7 (C  (None, 256, 256, 64)    32832   ['conv2d_34[0][0]']     
##  onv2DTranspose)                                                                
##  concatenate_7 (Concat  (None, 256, 256, 128)   0       ['conv2d_20[0][0]',     
##  enate)                                                  'conv2d_transpose_7[0][
##                                                         0]']                    
##  conv2d_35 (Conv2D)     (None, 256, 256, 64)    73792   ['concatenate_7[0][0]'] 
##  conv2d_36 (Conv2D)     (None, 256, 256, 64)    36928   ['conv2d_35[0][0]']     
##  conv2d_37 (Conv2D)     (None, 256, 256, 1)     65      ['conv2d_36[0][0]']     
## ================================================================================
## Total params: 31031745 (118.38 MB)
## Trainable params: 31031745 (118.38 MB)
## Non-trainable params: 0 (0.00 Byte)
## ________________________________________________________________________________

# define a custom loss function (dice coefficient)
dice <- custom_metric("dice", function(y_true, y_pred, smooth = 1.0) {
  y_true_f <- k_flatten(y_true)
  y_pred_f <- k_flatten(y_pred)
  intersection <- k_sum(y_true_f * y_pred_f)
  (2 * intersection + smooth) / (k_sum(y_true_f) + k_sum(y_pred_f) + smooth)
})

# Compile the DCNN model, packaging an optimizer, loss and performance metrics
model %>% compile(
  optimizer = optimizer_rmsprop(learning_rate = 1e-5),
  loss = "binary_crossentropy",
  metrics = list(dice, metric_binary_accuracy)
)

# Naive Prediction using the default model (prior to Re-Training or Transfer Learning)
batch <- validation_dataset %>% reticulate::as_iterator() %>% reticulate::iter_next()
predictions <- predict(model, batch[[1]])

## 1/1 - 19s - 19s/epoch - 19s/step

images <- tibble(
  image = batch[[1]] %>% array_branch(1),
  predicted_mask = predictions[,,,1] %>% array_branch(1),
  mask = batch[[2]][,,,1]  %>% array_branch(1)
)

# Check performance of the default base model on one case (# 22)
#### OLD PLOT_LY() VERSION
# i=22
# z1 <- ifelse(as.matrix(as.data.frame(images$predicted_mask[i])) > 0.5, 1, 0)
# pl1 <- plot_ly(z = ~ 255*as.matrix(as.data.frame(images$mask[i])[,1:256]), type="heatmap", name=paste0("Tumor Mask ", i))
# pl2 <- plot_ly(z = ~ 255*as.matrix(as.data.frame(images$image[i])[,1:256]), type="heatmap", name=paste0("Brain Image ", i))
# pl3 <- plot_ly(z = ~ z1, type="heatmap", name=paste0("Pred Mask ", i))
# subplot(pl1, pl2, pl3, nrows=1) %>% hide_colorbar()
# Assuming `batch` and `predictions` are already computed as in the original
# batch[[1]] = images, batch[[2]] = true masks, predictions = predicted masks

i <- 22   # or any index

# Extract the three matrices
img_mat <- batch[[1]][i,,,1] %>% as.array()   # 1 channel for display
true_mat <- batch[[2]][i,,,1] %>% as.array()
# pred_mat <- ifelse(predictions[,,,1][,,i] > 0.5, 1, 0)   # binarize
pred_mat <- 
  ifelse(as.matrix(as.data.frame(images$predicted_mask[i])) > 0.5, 1, 0) # binarize

# Build data frames
df_img <- expand.grid(x = 1:ncol(img_mat), y = 1:nrow(img_mat))
df_img$value <- as.vector(img_mat)
df_img$panel <- "Brain Image"

df_true <- expand.grid(x = 1:ncol(true_mat), y = 1:nrow(true_mat))
df_true$value <- as.vector(true_mat)
df_true$panel <- "True Mask"

df_pred <- expand.grid(x = 1:ncol(pred_mat), y = 1:nrow(pred_mat))
df_pred$value <- as.vector(pred_mat)
df_pred$panel <- "Predicted Mask"

bind_rows(df_img, df_true, df_pred) %>%
  ggplot(aes(x = x, y = y, fill = value)) +
  geom_raster() +
  facet_wrap(~ panel, nrow = 1) +
  scale_y_reverse() +
  scale_fill_gradient(low = "black", high = "white") +
  coord_fixed() +
  theme_void() +
  theme(legend.position = "none",
        strip.text = element_text(size = 12, face = "bold")) +
  labs(title = paste("Default Model Prediction – Case", i))

# training_dataset <- create_dataset(training(data_train_valid), train = TRUE)
# validation_dataset <- create_dataset(testing(data_train_valid), train = FALSE)
#
# library(unet)
#
# model <- unet::unet(input_shape = c(256, 256, 3))  # for RGB use 3-channels: input_shape = c(256, 256, 3)
# # summary(model)
#
# library(keras)
# dice <- custom_metric("dice", function(y_true, y_pred, smooth = 1.0) {
#   y_true_f <- k_flatten(y_true)
#   y_pred_f <- k_flatten(y_pred)
#   intersection <- k_sum(y_true_f * y_pred_f)
#   (2 * intersection + smooth) / (k_sum(y_true_f) + k_sum(y_pred_f) + smooth)
# })
#
# model %>% compile(
#   optimizer = optimizer_rmsprop(learning_rate = 1e-5),
#   loss = "binary_crossentropy",
#   metrics = list(dice, metric_binary_accuracy)
# )

The part below shows the UNet/DCNN model fitting for a single epoch. This block of code is not run (eval=F) as it takes hours to complete, but below we show the results from offline DCNN trainings with different settings.

# MODEL FITTING
epochs = 1
history <- model %>% fit(training_dataset, epochs = epochs, validation_data = validation_dataset)

# ASSESSMENT - Actual Model Evaluation (after Transfer-learning retraining)
# Naive Prediction
batch <- validation_dataset %>% reticulate::as_iterator() %>% reticulate::iter_next()
predictions <- predict(model, batch[[1]])
images <- tibble(
  image = batch[[1]] %>% array_branch(1),
  predicted_mask = predictions[,,,1] %>% array_branch(1),
  mask = batch[[2]][,,,1]  %>% array_branch(1)
)
##### OLD plot_ly() interactig display
# i=22
# pl1 <- plot_ly(z = ~ 255*as.matrix(as.data.frame(images$mask[i])[,1:256]), type="heatmap", name=paste0("Tumor Mask ", i))
# pl2 <- plot_ly(z = ~ 255*as.matrix(as.data.frame(images$image[i])[,1:256]), type="heatmap", name=paste0("Brain Image ", i))
# pl3 <- plot_ly(z = ~ 255*as.matrix(as.data.frame(images$predicted_mask[i])[,1:256]), type="heatmap", name=paste0("Pred Mask ", i))
# subplot(pl1, pl2, pl3, nrows=1)

library(ggplot2)
library(patchwork)

# After model fitting and obtaining `batch` and `predictions` as in the original
# (assuming `images` tibble is already created)

i <- 22

# Extract matrices (already scaled 0-1, we don't multiply by 255)
mask_mat <- as.matrix(as.data.frame(images$mask[i])[,1:256])
img_mat  <- as.matrix(as.data.frame(images$image[i])[,1:256])
pred_mat <- as.matrix(as.data.frame(images$predicted_mask[i])[,1:256])

# Helper to convert a matrix to a ggplot heatmap
mat_to_gg <- function(mat, title) {
  df <- expand.grid(x = 1:ncol(mat), y = 1:nrow(mat))
  df$value <- as.vector(mat)
  ggplot(df, aes(x = x, y = y, fill = value)) +
    geom_raster() +
    scale_fill_gradient(low = "black", high = "white") +
    coord_fixed() +
    theme_void() +
    theme(legend.position = "none",
          plot.title = element_text(hjust = 0.5, size = 12)) +
    labs(title = title)
}

# Create three plots
p_mask <- mat_to_gg(mask_mat, paste0("Tumor Mask ", i))
p_img  <- mat_to_gg(img_mat,  paste0("Brain Image ", i))
p_pred <- mat_to_gg(pred_mat, paste0("Pred Mask ", i))

# Combine side by side
(p_mask | p_img | p_pred) +
  plot_annotation(title = paste("Model Evaluation after Transfer Learning – Case", i))


library(keras)
dice <- custom_metric("dice", function(y_true, y_pred, smooth = 1.0) {
  y_true_f <- k_flatten(y_true)
  y_pred_f <- k_flatten(y_pred)
  intersection <- k_sum(y_true_f * y_pred_f)
  (2 * intersection + smooth) / (k_sum(y_true_f) + k_sum(y_pred_f) + smooth)
})

model %>% compile(
  optimizer = optimizer_rmsprop(learning_rate = 1e-5),
  loss = "binary_crossentropy",
  metrics = list(dice, metric_binary_accuracy)
)

When training the DCNN model, keep in mind that this step is very computationally intensive (each epoch can take hours to complete). Running a small number of epochs may be feasible in an interactive RStudio session, but dozens of epochs are necessary to train the network to perform well (e.g., yield high Dice coefficient values).

# MODEL FITTING
history <- model %>% fit(training_dataset, epochs = 1, validation_data = validation_dataset, verbose = 0)

# 81/81 [==============================] - 3313s 41s/step -
# loss: 0.3121 - dice: 0.0106 - binary_accuracy: 0.9604 - val_loss: 0.1762 - val_dice: 1.7162e-04 - val_binary_accuracy: 0.9900

# EVALUATION
pl_loss <- plot_ly(x = ~c(1:history$params$epochs),  y = ~history$metrics$loss,
        type = "scatter", mode="markers+lines", name="Loss") %>%
  add_trace(x = ~c(1:history$params$epochs),  y = ~history$metrics$val_loss,
        type = "scatter", mode="markers+lines", name="Validation Loss") %>%
  layout(title="DNN Training/Validation Performance", xaxis=list(title="epoch"),
         yaxis=list(title="Metric Value"), legend = list(orientation='h'))

pl_acc <- plot_ly(x = ~c(1:history$params$epochs),  y = ~history$metrics$binary_accuracy,
        type = "scatter", mode="markers+lines", name="Accuracy") %>%
  add_trace(x = ~c(1:history$params$epochs),  y = ~history$metrics$binary_accuracy,
        type = "scatter", mode="markers+lines", name="Validation Binary Accuracy") %>%
  layout(title="DNN Training/Validation Performance", xaxis=list(title="epoch"),
         yaxis=list(title="Metric Value"), legend = list(orientation='h'))

subplot(pl_loss, pl_acc, nrows=2, shareX = TRUE, titleX = TRUE)

# model %>% evaluate(training_dataset %>% dataset_batch(25), verbose = 0)
model %>% evaluate(validation_dataset, verbose = 0)
#           loss            dice    binary_accuracy
#   0.1762175262    0.0001716204        0.9899647236

# Save/Load the pretrained model in HDF5 format
save_model_hdf5(model, "C:/Users/IvoD/Desktop/model_TF_Brain_epoch_1.h5",
                overwrite = TRUE, include_optimizer = TRUE)
# Mind that loading a model with custom layers/functions, e.g., dice() method, requires special import using a list
# https://github.com/rstudio/keras/issues/1240

Once we have pre-trained (history <- model %>% fit()) and saved (save_model_hdf5()) the model (model), we can load it back as mod1 in the interactive session and use it for prediction. Several pre-trained models (e.g., 50 and 100 epochs) are available on the Canvas site. Below, we are using one of the pre-trained Unet models (model_TF_Brain_epoch_100.h5, 250MB) to illustrate the prediction performance, i.e., automated brain-tumor segmentation using 2D neuroimaging data.

Notice the steady model improvement (accuracy, binary tumor-labeling, and dice-coefficient) with respect to the epoch index.

# # This HDF5 model is available on Canvas:
# # https://umich.instructure.com/courses/38100/files/folder/Case_Studies/36_TCGA_MRI_Segmentation_Data_Phenotypes
# # https://umich.instructure.com/files/22559456/download?download_frd=1
# mod1 <-
#   load_model_hdf5("E:/Ivo.dir/Research/UMichigan/Proposals/2014/UMich_MINDS_DataSciInitiative_2014/2016/MIDAS_HSXXX_DataScienceAnalyticsCourse_Porposal/MOOC_Implementation/appendix/model_TF_Brain_epoch_50.h5",
#                         custom_objects = list("dice" = dice), compile = TRUE)
#
# # Update the model's last layer: https://tensorflow.rstudio.com/guide/keras/faq/
# # mod2 <- mod1 %>%
# #   # get_layer(index=33) %>%
# #   pop_layer() %>%  # remove the output layer first, then plug in a new output layer
# #   # layer_conv_2d(filters = 64, kernel_size = c(3, 3), padding = "same")
# #   layer_conv_2d(filters = 64, kernel_size = c(3,3), activation = "relu")
# input <- mod1$input                            # UNet input layer
# base <- (mod1 %>% get_layer(index=33))$output  # Get the last output layer of "mod1"
# target <- base %>%                             # Replace the output layer by a new conv2D layer outputting a 3-channel 2D brain image
#   layer_conv_2d(filters = 3, kernel_size = c(1,1), activation = "relu")
# transfer_model <- keras_model(input, target)   # Update mod1 for transfer learning
# summary(transfer_model)
# # Mind the final layer change:
# # Layer (type)        Output Shape         Param #    Connected to
# # conv2d_23 (Conv2D)  (None, 256, 256, 3)  6          conv2d_18[0][0]

# # Define the Dice coefficient
# dice <- custom_metric("dice", function(y_true, y_pred, smooth = 1.0) {
#   y_true_f <- k_flatten(y_true)
#   y_pred_f <- k_flatten(y_pred)
#   intersection <- k_sum(y_true_f * y_pred_f)
#   (2 * intersection + smooth) / (k_sum(y_true_f) + k_sum(y_pred_f) + smooth)
# })
#
# # Recompile the model with our  custom dice() loss function
# mod1 %>% compile(
#   optimizer = optimizer_rmsprop(learning_rate = 1e-5),
#   loss = "binary_crossentropy",
#   metrics = list(dice, metric_binary_accuracy)
# )
# # Check re-loaded mod1 model and evaluate performance metrics on validation dataset
# # summary(mod1)
# mod1 %>% evaluate(validation_dataset)
#
# # Finally display some of the results -- ASSESSMENT -- Actual Model Evaluation (after Transfer-learning retraining)
# batch <- validation_dataset %>% reticulate::as_iterator() %>% reticulate::iter_next()
# predictions <- predict(mod1, batch[[1]])
# images <- tibble(
#   image = batch[[1]] %>% array_branch(1),
#   predicted_mask = predictions[,,,1] %>% array_branch(1),
#   mask = batch[[2]][,,,1]  %>% array_branch(1)
# )
# i=16
# pl1 <- plot_ly(z = ~ 255*as.matrix(as.data.frame(images$mask[i])[,1:256]), type="heatmap", name=paste0("Tumor Mask ", i)) %>%
#     layout(hovermode = "y unified", xaxis = list(hoverformat = '.1f'), yaxis = list(hoverformat = '.1f'))
# pl2 <- plot_ly(z = ~ 255*as.matrix(as.data.frame(images$image[i])[,1:256]), type="heatmap", name=paste0("Brain Image ", i)) %>%
#     layout(hovermode = "y unified", xaxis = list(hoverformat = '.1f'), yaxis = list(hoverformat = '.1f'))
# pl3 <- plot_ly(z = ~ 255*as.matrix(as.data.frame(images$predicted_mask[i])[,1:256]), type="heatmap", name=paste0("Pred Mask ", i)) %>%
#     layout(hovermode = "y unified", xaxis = list(hoverformat = '.1f'), yaxis = list(hoverformat = '.1f'))
# subplot(pl1, pl2, pl3, nrows=1, shareY = TRUE) %>%
#   layout(title=paste0("Case ", i))

# batch <- validation_dataset %>% reticulate::as_iterator() %>% reticulate::iter_next()
# predictions <- predict(mod1, batch)
# str(predictions)  # num [1:32, 1:256, 1:256, 1] 0.466 0.463 0.48 0.449 0.494 ...
# hist(predictions)
#
# images <- tibble(
#   image = batch[[1]] %>% array_branch(1),
#   predicted_mask = predictions[,,,1] %>% array_branch(1),
#   mask = batch[[2]][,,,1]  %>% array_branch(1)
# )

# Display the Transfer-Learning Model Performance across epochs
# These metrics are precomputed on SOCR-Lighthouse server, as the transfer-learning fine-tuning training
# takes days to go through 50 epochs and dozens of iterations per epoch
## https://umich.instructure.com/courses/38100/files/folder/Case_Studies/36_TCGA_MRI_Segmentation_Data_Phenotypes
##
# epochs <- 48
# # Import the metrics: Loss,   Dice,   Binary_Accuracy,    Validation_Loss,    Validation_Dice,    Validation_Binary_Accuracy
# historyMetrics_epochs48 <- read.csv("https://umich.instructure.com/files/22559501/download?download_frd=1", stringsAsFactors = F)
# hist_df <- data.frame(epoch=historyMetrics_epochs48$Epoch, loss=historyMetrics_epochs48$Loss,
#                       dice=historyMetrics_epochs48$Dice, binAccuracy=historyMetrics_epochs48$Binary_Accuracy,
#                       valid_loss=historyMetrics_epochs48$Validation_Loss, valid_dice=historyMetrics_epochs48$Validation_Dice,
#                       valid_binAccuracy=historyMetrics_epochs48$Validation_Binary_Accuracy)
#
# p_loss <- plot_ly(hist_df, x = ~epoch)  %>%
#   add_trace(y = ~loss, name = 'Training Loss', type="scatter", mode = 'lines+markers') %>%
#   add_trace(y = ~valid_loss, name = 'Validation Loss', type="scatter", mode = 'lines+markers')
# p_acc <- plot_ly(hist_df, x = ~epoch)  %>%
#   add_trace(y = ~dice, name = 'Training Dice', type="scatter", mode = 'lines+markers') %>%       # DICE
#   add_trace(y = ~valid_dice, name = 'Validation Accuracy', type="scatter", mode = 'lines+markers') %>%
#   add_trace(y = ~binAccuracy, name = 'Training Binary-Accuracy', type="scatter", mode = 'lines+markers') %>%    # ACCURACY
#   add_trace(y = ~valid_binAccuracy, name = 'Validation Binary-Accuracy', type="scatter", mode = 'lines+markers')
# subplot(p_loss, p_acc, nrows=2) %>%
#   layout(legend = list(orientation = 'h'), title="Tensorflow Transfer-Learning Performance (Brain Imaging), Epochs=48")

# # MODEL FITTING on Lighthouse SOCR Compute Server
# epochs = 100
# history <- model %>% fit(training_dataset, epochs = epochs, validation_data = validation_dataset)
#
# # ASSESSMENT - Actual Model Evaluation (after Transfer-learning retraining)
# # Naive Prediction
# batch <- validation_dataset %>% reticulate::as_iterator() %>% reticulate::iter_next()
# predictions <- predict(model, batch[[1]])
# images <- tibble(
#   image = batch[[1]] %>% array_branch(1),
#   predicted_mask = predictions[,,,1] %>% array_branch(1),
#   mask = batch[[2]][,,,1]  %>% array_branch(1)
# )
# # i=22
# # pl1 <- plot_ly(z = ~ 255*as.matrix(as.data.frame(images$mask[i])[,1:256]), type="heatmap", name=paste0("Tumor Mask ", i))
# # pl2 <- plot_ly(z = ~ 255*as.matrix(as.data.frame(images$image[i])[,1:256]), type="heatmap", name=paste0("Brain Image ", i))
# # pl3 <- plot_ly(z = ~ 255*as.matrix(as.data.frame(images$predicted_mask[i])[,1:256]), type="heatmap", name=paste0("Pred Mask ", i))
# # subplot(pl1, pl2, pl3, nrows=1)
#
#
# # MODEL FITTING
# # history <- model %>% fit(training_dataset, epochs = 1, validation_data = validation_dataset, verbose = 2)
#
# # 81/81 [==============================] - 3313s 41s/step -
# # loss: 0.3121 - dice: 0.0106 - binary_accuracy: 0.9604 - val_loss: 0.1762 - val_dice: 1.7162e-04 - val_binary_accuracy: 0.9900
#
# # EVALUATION
# # pl_loss <- plot_ly(x = ~c(1:history$params$epochs),  y = ~history$metrics$loss,
# #                    type = "scatter", mode="markers+lines", name="Loss") %>%
# #   add_trace(x = ~c(1:history$params$epochs),  y = ~history$metrics$val_loss,
# #             type = "scatter", mode="markers+lines", name="Validation Loss") %>%
# #   layout(title="DNN Training/Validation Performance", xaxis=list(title="epoch"),
# #          yaxis=list(title="Metric Value"), legend = list(orientation='h'))
# #
# # pl_acc <- plot_ly(x = ~c(1:history$params$epochs),  y = ~history$metrics$binary_accuracy,
# #                   type = "scatter", mode="markers+lines", name="Accuracy") %>%
# #   add_trace(x = ~c(1:history$params$epochs),  y = ~history$metrics$binary_accuracy,
# #             type = "scatter", mode="markers+lines", name="Validation Binary Accuracy") %>%
# #   layout(title="DNN Training/Validation Performance", xaxis=list(title="epoch"),
# #          yaxis=list(title="Metric Value"), legend = list(orientation='h'))
# #
# # subplot(pl_loss, pl_acc, nrows=2, shareX = TRUE, titleX = TRUE)
#
# # model %>% evaluate(training_dataset %>% dataset_batch(25), verbose = 0)
# model %>% evaluate(validation_dataset, verbose = 0)
# #           loss            dice    binary_accuracy
# #   0.1762175262    0.0001716204        0.9899647236
#
# # Save/Load the pretrained model in HDF5 format
# save_model_hdf5(model, "/home/dinov/DSPA/test_code/Chap22_DCNN_UNet_modeling/model_TF_Brain_epoch_100.h5",
#                 overwrite = TRUE, include_optimizer = TRUE)
# # Save History object tracking model performance across epochs,
# # see https://cran.r-project.org/web/packages/keras/vignettes/training_visualization.html
# history_df <- as.data.frame(history)
# # str(history_df)
# # 'data.frame':   120 obs. of  4 variables:
# #   $ epoch : int  1 2 3 4 5 6 7 8 9 10 ...
# # $ value : num  0.87 0.941 0.954 0.962 0.965 ...
# # $ metric: Factor w/ 2 levels "acc","loss": 1 1 1 1 1 1 1 1 1 1 ...
# # $ data  : Factor w/ 2 levels "training","validation": 1 1 1 1 1 1 1 1 1 1 ...
# save(history_df, file="/home/dinov/DSPA/test_code/Chap22_DCNN_UNet_modeling/historyModel_TF_Brain_Epochs50.Rda")
# # Then load it with:
# #

# REPORT Training and Validation Performance: load the performance metrics (history_df) over 100 epochs
# https://umich.instructure.com/courses/38100/files/folder/Case_Studies/36_TCGA_MRI_Segmentation_Data_Phenotypes | historyModel_TF_Brain_Epochs100.Rda
# load(url("https://umich.instructure.com/files/22647027/download?download_frd=1"))
# 1. Define the URL and the temporary destination name
# file_url <- "https://umich.instructure.com/files/22647027/download?download_frd=1"
dest_file <- "D:/IvoD/Desktop/Ivo.dir/Research/UMichigan/Publications_Books/2023/DSPA_Springer_2nd_Edition_2023/Rmd_HTML/historyModel_TF_Brain_Epochs50.Rda"

# # 2. Download the file securely to your machine
# download.file(file_url, destfile = dest_file, mode = "wb")

# 3. Load the local file
load(dest_file)

lineWidth <- ifelse(history_df$data=='training', 1, 4)
lineNames <- paste0(history_df$data, " ", history_df$metric)
plot_ly(history_df, x = ~epoch, y=~value, color=~metric, type="scatter", mode="lines+markers", name=~lineNames,
        line = list(color = ~metric, width = lineWidth))  %>%
  layout(legend = list(orientation = 'h'), title="Tensorflow Transfer-Learning Performance (Brain Imaging), Epochs=100")

# PREDICTION
# Mind that loading a model with custom layers/functions, e.g., dice() method, requires special import using a list
# https://github.com/rstudio/keras/issues/1240
# Once we have trained (`history <- model %>% fit(()`) and saved (`save_model_hdf5()`) the model (`model`), we can load it back as `mod1`  the interactive session and use it for prediction.
# Define the Dice coefficient
dice <- custom_metric("dice", function(y_true, y_pred, smooth = 1.0) {
  y_true_f <- k_flatten(y_true)
  y_pred_f <- k_flatten(y_pred)
  intersection <- k_sum(y_true_f * y_pred_f)
  (2 * intersection + smooth) / (k_sum(y_true_f) + k_sum(y_pred_f) + smooth)
})

# The pre-computed model_TF_Brain_epoch_100.h5 model file is available in Canvas:
# https://umich.instructure.com/courses/38100/files/folder/Case_Studies/36_TCGA_MRI_Segmentation_Data_Phenotypes
# localModelsFolder <- "C:/Users/IvoD/Desktop/Ivo.dir/Research/UMichigan/Proposals/2014/UMich_MINDS_DataSciInitiative_2014/2016/MIDAS_HSXXX_DataScienceAnalyticsCourse_Porposal/MOOC_Implementation/appendix/"

localModelsFolder <- getwd()

mod1 <- load_model_hdf5(paste0(localModelsFolder, "/appendix/model_TF_Brain_epoch_100.h5"), compile = FALSE)
                        #  custom_objects = list("dice" = dice), compile = TRUE)

# # Recompile the model with our  custom dice() loss function
mod1 %>% compile(
  optimizer = optimizer_rmsprop(learning_rate = 1e-5),
  loss = "binary_crossentropy",
  metrics = list(dice, metric_binary_accuracy)
)
# Check re-loaded mod1 model and evaluate performance metrics on validation dataset
# summary(mod1)
mod1 %>% evaluate(validation_dataset)

## 21/21 - 368s - loss: 0.0185 - dice: 0.6665 - binary_accuracy: 0.9938 - 368s/epoch - 18s/step

##            loss            dice binary_accuracy 
##      0.01854581      0.66647464      0.99375749

# 16/16 [==============================] - 180s 11s/step - loss: 0.0213 - dice: 0.6842 - binary_accuracy: 0.9932
#           loss            dice binary_accuracy
#     0.02131428      0.68422282      0.99317247

# Finally display some of the results -- ASSESSMENT -- Actual Model Evaluation (after Transfer-learning retraining)
batch <- validation_dataset %>% reticulate::as_iterator() %>% reticulate::iter_next()
predictions <- predict(mod1, batch[[1]])

## 1/1 - 18s - 18s/epoch - 18s/step

images <- tibble(
  image = batch[[1]] %>% array_branch(1),
  predicted_mask = predictions[,,,1] %>% array_branch(1),
  mask = batch[[2]][,,,1]  %>% array_branch(1)
)
# i=16
# pl1 <- plot_ly(z = ~ 255*as.matrix(as.data.frame(images$mask[i])[,1:256]), type="heatmap", name=paste0("Tumor Mask ", i)) %>%
#   layout(hovermode = "y unified", xaxis = list(hoverformat = '.1f'), yaxis = list(hoverformat = '.1f'))
# pl2 <- plot_ly(z = ~ 255*as.matrix(as.data.frame(images$image[i])[,1:256]), type="heatmap", name=paste0("Brain Image ", i)) %>%
#   layout(hovermode = "y unified", xaxis = list(hoverformat = '.1f'), yaxis = list(hoverformat = '.1f'))
# pl3 <- plot_ly(z = ~ 255*as.matrix(as.data.frame(images$predicted_mask[i])[,1:256]), type="heatmap", name=paste0("Pred Mask ", i)) %>%
#   layout(hovermode = "y unified", xaxis = list(hoverformat = '.1f'), yaxis = list(hoverformat = '.1f'))
# subplot(pl1, pl2, pl3, nrows=1, shareY = TRUE) %>%
#   layout(title=paste0("DCNN HDF5 Model (epochs=100) Validation on Case ", i))

#### Old plot_ly() version - only test in interactive mode!
# pl_list <- list()
# n = 10
# for (i in 1:n) {    # to limit to only 1-channel, restrict the column range to 1:256: images$mask[i])[,1:256]
#   pl1 <- plot_ly(z = ~ 255*as.matrix(as.data.frame(images$mask[i])), type="heatmap", name=paste0("Tumor Mask ", i)) %>%
#   layout(hovermode = "y unified", xaxis = list(hoverformat = '.1f'), yaxis = list(hoverformat = '.1f'))
# pl2 <- plot_ly(z = ~ 255*as.matrix(as.data.frame(images$image[i])), type="heatmap", name=paste0("Brain Image ", i)) %>%
#   layout(hovermode = "y unified", xaxis = list(hoverformat = '.1f'), yaxis = list(hoverformat = '.1f'))
# pl3 <- plot_ly(z = ~ 255*as.matrix(as.data.frame(images$predicted_mask[i])), type="heatmap", name=paste0("Pred Mask ", i)) %>%
#   layout(hovermode = "y unified", xaxis = list(hoverformat = '.1f'), yaxis = list(hoverformat = '.1f'))
# pl4 <- plot_ly(z = ~ 255*as.matrix(as.data.frame(images$mask[i+n])), type="heatmap", name=paste0("Tumor Mask ", i+n)) %>%
#   layout(hovermode = "y unified", xaxis = list(hoverformat = '.1f'), yaxis = list(hoverformat = '.1f'))
# pl5 <- plot_ly(z = ~ 255*as.matrix(as.data.frame(images$image[i+n])), type="heatmap", name=paste0("Brain Image ", i+n)) %>%
#   layout(hovermode = "y unified", xaxis = list(hoverformat = '.1f'), yaxis = list(hoverformat = '.1f'))
# pl6 <- plot_ly(z = ~ 255*as.matrix(as.data.frame(images$predicted_mask[i+n])), type="heatmap", name=paste0("Pred Mask ", i+n)) %>%
#   layout(hovermode = "y unified", xaxis = list(hoverformat = '.1f'), yaxis = list(hoverformat = '.1f'))
# pl_list[[i]] <- subplot(pl1, pl2, pl3, pl4, pl5, pl6, nrows=1, shareY = TRUE) %>%
#   layout(title=paste0("DCNN HDF5 Model (epochs=100) Validation on Case ", i)) %>% hide_colorbar()
# }
# pl_list %>%
#   subplot(nrows = length(pl_list)) %>%
#      layout(title=paste0("DCNN HDF5 Model (epochs=100) Predictions (N=", 2*n, " Cases)"))

# Assuming `images` tibble (with columns image, predicted_mask, mask) exists
# and `n = 10` is set

n <- 10   # or however many cases you want
all_cases <- map_dfr(1:n, function(i) {
  # For each case, create a data frame for the three columns
  img_mat <- as.matrix(as.data.frame(images$image[i]))[,1:256]
  true_mat <- as.matrix(as.data.frame(images$mask[i]))[,1:256]
  pred_mat <- as.matrix(as.data.frame(images$predicted_mask[i]))[,1:256]

  bind_rows(
    expand.grid(x = 1:ncol(img_mat), y = 1:nrow(img_mat)) %>%
      mutate(value = as.vector(img_mat),
             case = paste0("Case ", i),
             type = "Brain Image"),
    expand.grid(x = 1:ncol(true_mat), y = 1:nrow(true_mat)) %>%
      mutate(value = as.vector(true_mat),
             case = paste0("Case ", i),
             type = "True Mask"),
    expand.grid(x = 1:ncol(pred_mat), y = 1:nrow(pred_mat)) %>%
      mutate(value = as.vector(pred_mat),
             case = paste0("Case ", i),
             type = "Predicted Mask")
  )
})

ggplot(all_cases, aes(x = x, y = y, fill = value)) +
  geom_raster() +
  facet_grid(case ~ type) +    # rows = cases, columns = image types
  scale_y_reverse() +
  scale_fill_gradient(low = "black", high = "white") +
  coord_fixed() +
  theme_void() +
  theme(legend.position = "none",
        strip.text = element_text(size = 10, face = "bold"),
        strip.background = element_blank()) +
  labs(title = paste0("DCNN HDF5 Model (epochs=100) Predictions – ", n, " Cases"))

# batch <- validation_dataset %>% reticulate::as_iterator() %>% reticulate::iter_next()
# predictions <- predict(mod1, batch)
# str(predictions)  # num [1:32, 1:256, 1:256, 1] 0.466 0.463 0.48 0.449 0.494 ...
# hist(predictions)
#
# images <- tibble(
#   image = batch[[1]] %>% array_branch(1),
#   predicted_mask = predictions[,,,1] %>% array_branch(1),
#   mask = batch[[2]][,,,1]  %>% array_branch(1)
# )

# Save Predictions on validation data
# batch <- validation_dataset %>% reticulate::as_iterator() %>% reticulate::iter_next()
# predictions <- predict(mod1, batch[[1]])
PredictedMasks <- list(predictions=predictions, images=images)
save(PredictedMasks, file="C:/Users/IvoD/Desktop/predictionsModel_TF_Brain_Epochs100.Rda")
# Then load it with:
# load("C:/Users/Dinov/Desktop/predictionsModel_TF_Brain_Epochs100.Rda")
# https://umich.instructure.com/courses/38100/files/folder/Case_Studies/36_TCGA_MRI_Segmentation_Data_Phenotypes
# load(url("https://umich.instructure.com/files/22648924/download?download_frd=1"))

Finally, we can explicate some of the Transfer Learning process by modifying the pretrained model mod2 and obtaining a new model, modTransferLearning, which demonstrates synthetic image generation. This new modified modTransferLearning DCNN will output 3-channel brain-images, not tumor masks (predicted by mod2). Note that 31M parameters part of the original pretrained mod2 are fixed and we are only estimating the remaining $16K$ parameters in this transfer-learning tuning process.

# This HDF5 model (model_TF_Brain_epoch_100.h5) is available on Canvas:
# https://umich.instructure.com/courses/38100/files/folder/Case_Studies/36_TCGA_MRI_Segmentation_Data_Phenotypes
# https://umich.instructure.com/files/22647026/download?download_frd=1
mod2 <- load_model_hdf5(paste0(localModelsFolder, "/appendix/model_TF_Brain_epoch_100.h5"), compile=FALSE)
          # custom_objects = list("dice" = dice), compile = TRUE)

#  mod2 <- load_model_hdf5("E:/Ivo.dir/Research/UMichigan/Proposals/2014/UMich_MINDS_DataSciInitiative_2014/2016/MIDAS_HSXXX_DataScienceAnalyticsCourse_Porposal/MOOC_Implementation/appendix/model_TF_Brain_epoch_100.h5", custom_objects = list("dice" = dice), compile = TRUE)

# Freeze the first 32 layers and inspect the number of trainable and non-trainable (frozen DCNN parameters)
for (layer in mod2$layers) layer$trainable <- FALSE

# summary(mod2); length(mod2$layers)
# Update the model's last layer: https://tensorflow.rstudio.com/guide/keras/faq/
# More about Keras layers: https://keras.rstudio.com/reference/index.html#section-core-layers
# mod2 <- mod1 %>%
#   # get_layer(index=33) %>%
#   pop_layer() %>%  # remove the output layer first, then plug in a new output layer
#   # layer_conv_2d(filters = 64, kernel_size = c(3, 3), padding = "same")
#   layer_conv_2d(filters = 64, kernel_size = c(3,3), activation = "relu")
input <- mod2$input                            # Pretrained UNet input layer
base <- (mod2 %>% get_layer(index=32))$output  # Get the prior-to-last output layer of "mod2", total #layers =33
target <- base %>%                             # Replace the output layer by a new conv2D layer outputting a 3-channel 2D brain image, not a mask
  layer_conv_2d(filters = 64, kernel_size = c(1,1), activation = "relu") %>%
  layer_max_pooling_2d(pool_size = c(2,2)) %>%
  layer_conv_2d(filters = 64, kernel_size = c(1,1), activation = "relu") %>%
  layer_max_pooling_2d(pool_size = c(2,2)) %>%
  layer_conv_2d_transpose(filters = 32, kernel_size = c(2,2), strides = 2, padding = "same", activation = "relu") %>%  # deconvolution layers
  layer_conv_2d_transpose(filters = 3, kernel_size = c(2,2), strides = 2, padding = "same", activation = "relu")
modTransferLearning <- keras_model(input, target)   # Update mod2 for transfer learning
summary(modTransferLearning); length(modTransferLearning$layers)

## Model: "model_2"
## ________________________________________________________________________________
##  Layer (type)       Output Shape         Para   Connected to         Trainable  
##                                          m #                                    
## ================================================================================
##  input_1 (InputLay  [(None, 256, 256,    0      []                   N          
##  er)                3)]                                                         
##  conv2d (Conv2D)    (None, 256, 256, 6   1792   ['input_1[0][0]']    N          
##                     4)                                                          
##  conv2d_1 (Conv2D)  (None, 256, 256, 6   3692   ['conv2d[0][0]']     N          
##                     4)                   8                                      
##  max_pooling2d (Ma  (None, 128, 128, 6   0      ['conv2d_1[0][0]']   N          
##  xPooling2D)        4)                                                          
##  conv2d_2 (Conv2D)  (None, 128, 128, 1   7385   ['max_pooling2d[0]   N          
##                     28)                  6      [0]']                           
##  conv2d_3 (Conv2D)  (None, 128, 128, 1   1475   ['conv2d_2[0][0]']   N          
##                     28)                  84                                     
##  max_pooling2d_1 (  (None, 64, 64, 128   0      ['conv2d_3[0][0]']   N          
##  MaxPooling2D)      )                                                           
##  conv2d_4 (Conv2D)  (None, 64, 64, 256   2951   ['max_pooling2d_1[   N          
##                     )                    68     0][0]']                         
##  conv2d_5 (Conv2D)  (None, 64, 64, 256   5900   ['conv2d_4[0][0]']   N          
##                     )                    80                                     
##  max_pooling2d_2 (  (None, 32, 32, 256   0      ['conv2d_5[0][0]']   N          
##  MaxPooling2D)      )                                                           
##  conv2d_6 (Conv2D)  (None, 32, 32, 512   1180   ['max_pooling2d_2[   N          
##                     )                    160    0][0]']                         
##  conv2d_7 (Conv2D)  (None, 32, 32, 512   2359   ['conv2d_6[0][0]']   N          
##                     )                    808                                    
##  max_pooling2d_3 (  (None, 16, 16, 512   0      ['conv2d_7[0][0]']   N          
##  MaxPooling2D)      )                                                           
##  dropout (Dropout)  (None, 16, 16, 512   0      ['max_pooling2d_3[   N          
##                     )                           0][0]']                         
##  conv2d_8 (Conv2D)  (None, 16, 16, 102   4719   ['dropout[0][0]']    N          
##                     4)                   616                                    
##  conv2d_9 (Conv2D)  (None, 16, 16, 102   9438   ['conv2d_8[0][0]']   N          
##                     4)                   208                                    
##  conv2d_transpose   (None, 32, 32, 512   2097   ['conv2d_9[0][0]']   N          
##  (Conv2DTranspose)  )                    664                                    
##  concatenate (Conc  (None, 32, 32, 102   0      ['conv2d_7[0][0]',   N          
##  atenate)           4)                           'conv2d_transpose              
##                                                 [0][0]']                        
##  conv2d_10 (Conv2D  (None, 32, 32, 512   4719   ['concatenate[0][0   N          
##  )                  )                    104    ]']                             
##  conv2d_11 (Conv2D  (None, 32, 32, 512   2359   ['conv2d_10[0][0]'   N          
##  )                  )                    808    ]                               
##  conv2d_transpose_  (None, 64, 64, 256   5245   ['conv2d_11[0][0]'   N          
##  1 (Conv2DTranspos  )                    44     ]                               
##  e)                                                                             
##  concatenate_1 (Co  (None, 64, 64, 512   0      ['conv2d_5[0][0]',   N          
##  ncatenate)         )                            'conv2d_transpose              
##                                                 _1[0][0]']                      
##  conv2d_12 (Conv2D  (None, 64, 64, 256   1179   ['concatenate_1[0]   N          
##  )                  )                    904    [0]']                           
##  conv2d_13 (Conv2D  (None, 64, 64, 256   5900   ['conv2d_12[0][0]'   N          
##  )                  )                    80     ]                               
##  conv2d_transpose_  (None, 128, 128, 1   1312   ['conv2d_13[0][0]'   N          
##  2 (Conv2DTranspos  28)                  00     ]                               
##  e)                                                                             
##  concatenate_2 (Co  (None, 128, 128, 2   0      ['conv2d_3[0][0]',   N          
##  ncatenate)         56)                          'conv2d_transpose              
##                                                 _2[0][0]']                      
##  conv2d_14 (Conv2D  (None, 128, 128, 1   2950   ['concatenate_2[0]   N          
##  )                  28)                  40     [0]']                           
##  conv2d_15 (Conv2D  (None, 128, 128, 1   1475   ['conv2d_14[0][0]'   N          
##  )                  28)                  84     ]                               
##  conv2d_transpose_  (None, 256, 256, 6   3283   ['conv2d_15[0][0]'   N          
##  3 (Conv2DTranspos  4)                   2      ]                               
##  e)                                                                             
##  concatenate_3 (Co  (None, 256, 256, 1   0      ['conv2d_1[0][0]',   N          
##  ncatenate)         28)                          'conv2d_transpose              
##                                                 _3[0][0]']                      
##  conv2d_16 (Conv2D  (None, 256, 256, 6   7379   ['concatenate_3[0]   N          
##  )                  4)                   2      [0]']                           
##  conv2d_17 (Conv2D  (None, 256, 256, 6   3692   ['conv2d_16[0][0]'   N          
##  )                  4)                   8      ]                               
##  conv2d_39 (Conv2D  (None, 256, 256, 6   4160   ['conv2d_17[0][0]'   Y          
##  )                  4)                          ]                               
##  max_pooling2d_9 (  (None, 128, 128, 6   0      ['conv2d_39[0][0]'   Y          
##  MaxPooling2D)      4)                          ]                               
##  conv2d_38 (Conv2D  (None, 128, 128, 6   4160   ['max_pooling2d_9[   Y          
##  )                  4)                          0][0]']                         
##  max_pooling2d_8 (  (None, 64, 64, 64)   0      ['conv2d_38[0][0]'   Y          
##  MaxPooling2D)                                  ]                               
##  conv2d_transpose_  (None, 128, 128, 3   8224   ['max_pooling2d_8[   Y          
##  9 (Conv2DTranspos  2)                          0][0]']                         
##  e)                                                                             
##  conv2d_transpose_  (None, 256, 256, 3   387    ['conv2d_transpose   Y          
##  8 (Conv2DTranspos  )                           _9[0][0]']                      
##  e)                                                                             
## ================================================================================
## Total params: 31048611 (118.44 MB)
## Trainable params: 16931 (66.14 KB)
## Non-trainable params: 31031680 (118.38 MB)
## ________________________________________________________________________________

## [1] 38

# 31M parameters are fixed and we are only estimating 16K parameters in the Transfer-Learning tuning.

# Inspect the number of trainable and non-trainable (frozen DCNN parameters)
# ....
# ...
# conv2d_17 (Conv2D)    (None, 256, 256, 64)              36928              conv2d_16[0][0]   # Last Layers of base model mod2
# ___________________________________________________________________________________________
# conv2d_44 (Conv2D)    (None, 256, 256, 64)              4160               conv2d_17[0][0]   # First new TF layer
# ___________________________________________________________________________________________
# max_pooling2d_25 (MaxPooling2D)   (None, 128, 128, 64)    0                 conv2d_44[0][0]
# ___________________________________________________________________________________________
# conv2d_43 (Conv2D)    (None, 128, 128, 64)              4160         max_pooling2d_25[0][0]
# ___________________________________________________________________________________________
# max_pooling2d_24 (MaxPooling2D)  (None, 64, 64, 64)       0                  conv2d_43[0][0]
# ___________________________________________________________________________________________
# conv2d_transpose_17 (Conv2DTranspose)  (None, 128, 128, 32) 8224      max_pooling2d_24[0][0]
# ____________________________________________________________________________________________
# conv2d_transpose_16 (Conv2DTranspose)  (None, 256, 256, 3)  387    v2d_transpose_17[0][0]   # New Last Output Layer of TL model  modTransferLearning
# ============================================================================================
# Total params: 31,048,611
# Trainable params: 16,931
# Non-trainable params: 31,031,680
# ________________________________
# [1] 38 # Layers of new modTransferLearning model


## MODEL RE-TRAINING (for Transfer Learning)
# Compile the DCNN model, packaging an optimizer, loss and performance metrics
modTransferLearning %>% compile(
  optimizer = optimizer_rmsprop(learning_rate = 1e-5),
  # loss = 'loss_kullback_leibler_divergence',
  loss = 'mse',   # as we are looking at minimizing ||OrigImage - SynthImage||, i.e., the RMSE
  metrics = list(metric_mean_absolute_error, metric_poisson)
)

currentRotationIndex <- 1 # reset

### PREP New Data (replace masks by the native brain images)
create_TL_dataset <- function(data, train, batch_size = 32L) {
  dataset <- data %>%   # load all PNG files as images
    tensor_slices_dataset() %>%
    dataset_map(~.x %>% list_modify(
      img = tf$image$decode_png(tf$io$read_file(.x$img), channels = 3),  # <------ IMG
      mask= tf$image$decode_png(tf$io$read_file(.x$img), channels = 3)   # <------ MASK == IMG
    )) %>%     # convert image intensities from int8 to 32-bit float
    dataset_map(~.x %>% list_modify(
      img = tf$image$convert_image_dtype(.x$img, dtype = tf$float32),
      mask= tf$image$convert_image_dtype(.x$mask, dtype = tf$float32)
    )) %>%     # reshape all tensor-shape to (256 * 256) to ensure spatial-index homologies
    dataset_map(~.x %>% list_modify(
      img = tf$image$resize(.x$img, size = shape(256, 256)),
      mask= tf$image$resize(.x$mask, size = shape(256, 256))
    ))

  # data augmentation performed on training set only
  if (train) {
    dataset <- dataset %>%
      dataset_map(~.x %>% list_modify(
        img  = random_bsh(.x$img) # ,
        # mask = img
      )) %>%
      dataset_map(~.x %>% list_modify(  # see discussion in create_dataset()
        img = random_rotate(img=.x$img, mask=.x$mask)$img,
        mask= random_rotate(img=.x$img, mask=.x$mask)$mask
      ))
  }

  # shuffling on training set only
  if (train) {
    dataset <- dataset %>%
      dataset_shuffle(buffer_size = batch_size*4)
  }

  # train in batches; batch size might need to be adapted depending on
  # available memory
  dataset <- dataset %>%
    dataset_batch(batch_size)

  dataset %>%
    # output needs to be unnamed
    dataset_map(unname)
}

# Generate the Transfer Learning (TL) Training and Testing data at iterated TF-Data objects
training_TL_dataset <- create_TL_dataset(training(data_train_valid), train = TRUE)
validation_TL_dataset <- create_TL_dataset(testing(data_train_valid), train = FALSE)
# tf$data$experimental$cardinality(training_TL_dataset)$numpy()

# testing_TL_dataset <- testing(data_train_valid) %>%
#   tensor_slices_dataset() %>%
#   dataset_map(~.x %>% list_modify(
#     # decode_jpeg yields a 3d tensor of shape (256, 256, 3)
#     # Check tensor shapes!
#     img = tf$image$decode_png(tf$io$read_file(.x$img), channels = 3),
#     mask =tf$image$decode_png(tf$io$read_file(.x$img), channels = 3)
#   ))
# 
# example_TL <- training_TL_dataset %>% reticulate::as_iterator() %>% reticulate::iter_next()
# # example_TL
###### OLD plot_ly() interactive version
# rasterPlotly <- function (image, name="", hovermode = NULL) {
#   myPlot <- plot_ly(type="image", z=image, name=name, hoverlabel=name, text=name,
#             hovertext=name, hoverinfo="name+x+y") %>%
#     layout(hovermode = hovermode, xaxis = list(hoverformat = '.1f'), yaxis = list(hoverformat = '.1f'))
#   return(myPlot)
# }
# 
# # Training Case
# img_and_mask <- training_TL_dataset %>%
#       reticulate::as_iterator() %>% reticulate::iter_next()  # shape=(batch=32, 256, 256, channel=3)
# img <- img_and_mask[[1]][1,,,1]  %>% as.array()  # 3-channel Input image
# target<- img_and_mask[[2]][1,,,1]  %>% as.array()  # 3-channel Output image
# p1 <- plot_ly(z=~255*img, type="heatmap", name = "Input Image") %>% hide_colorbar()
# p2 <- plot_ly(z=~255*target, type="heatmap", name = "Output Image") %>% hide_colorbar()
# subplot(p1,p2, shareY = TRUE) %>% layout(title="Training Case: 3-Channel Input Image (Left) & Output Image (Right)")

library(ggplot2)
library(patchwork)

# Get one batch from the transfer learning dataset
example_TL <- training_TL_dataset %>% 
  reticulate::as_iterator() %>% 
  reticulate::iter_next()

# Extract input image (first sample, first channel) and target (same structure)
img <- example_TL[[1]][1,,,1] %>% as.array()   # Input image (first channel)
target <- example_TL[[2]][1,,,1] %>% as.array() # Output image (should equal input in this TL setup)

# Helper function to convert matrix to ggplot heatmap
mat_to_gg <- function(mat, title, multiply_255 = TRUE) {
  if(multiply_255) mat <- 255 * mat
  df <- expand.grid(x = 1:ncol(mat), y = 1:nrow(mat))
  df$value <- as.vector(mat)
  ggplot(df, aes(x = x, y = y, fill = value)) +
    geom_raster() +
    scale_fill_gradient(low = "black", high = "white") +
    coord_fixed() +
    theme_void() +
    theme(legend.position = "none",
          plot.title = element_text(hjust = 0.5, size = 12)) +
    labs(title = title)
}

# Create the two plots
p_input <- mat_to_gg(img, "Input Image", multiply_255 = TRUE)
p_output <- mat_to_gg(target, "Output Image", multiply_255 = TRUE)

# Combine side by side with a shared title
(p_input | p_output) +
  plot_annotation(
    title = "Training Case: 3‑Channel Input Image (Left) & Output Image (Right)",
    theme = theme(plot.title = element_text(hjust = 0.5))
  )

# MODEL FITTING - uncomment this fragment to run 6-epochs (2+ hrs)
# modTransferLearning_History_6Epochs <- modTransferLearning %>% fit(training_TL_dataset, epochs = 6, validation_data = validation_TL_dataset, verbose = 2)
# 63/63 - 1009s - loss: 1.2179 - mean_absolute_error: 0.4255 - poisson: 1.1990 - kullback_leibler_divergence: 2.0313 - val_loss: 0.4258 - val_mean_absolute_error: 0.3001 - val_poisson: 1.0742 - val_kullback_leibler_divergence: 2.0180
# Save model HDF5
# save_model_hdf5(modTransferLearning, "C:/Users/Dinov/Desktop/model_TF_TL_SynthBrainImages_epoch_6.h5",
#                 overwrite = TRUE, include_optimizer = TRUE)
# Model is on Canvas: https://umich.instructure.com/courses/38100/files/folder/Case_Studies/36_TCGA_MRI_Segmentation_Data_Phenotypes
# mod3_TL <-
#  load_model_hdf5("E:/Ivo.dir/Research/UMichigan/Proposals/2014/UMich_MINDS_DataSciInitiative_2014/2016/MIDAS_HSXXX_DataScienceAnalyticsCourse_Porposal/MOOC_Implementation/appendix/model_TF_TL_SynthBrainImages_epoch_6.h5", compile = FALSE)

Once we have trained (history <- model %>% fit(()) and saved (save_model_hdf5()) the model (model), we can load it back as mod1 the interactive session and use it for prediction.

# mod3_TL <- modTransferLearning
# mod3_TL <- load_model_hdf5("E:/Ivo.dir/Research/UMichigan/Proposals/2014/UMich_MINDS_DataSciInitiative_2014/2016/MIDAS_HSXXX_DataScienceAnalyticsCourse_Porposal/MOOC_Implementation/appendix/model_TF_TL_SynthBrainImages_epoch_6.h5", compile = F)
# Load the h5 model from Canvas: https://umich.instructure.com/courses/38100/files/folder/Case_Studies/36_TCGA_MRI_Segmentation_Data_Phenotypes
mod3_TL <- load_model_hdf5(paste0(localModelsFolder,
                  "/appendix/model_TF_TL_SynthBrainImages_epoch_6.h5"), compile = FALSE)

# Recompile the model with our  custom dice() loss function
mod3_TL %>% compile(
  optimizer = optimizer_rmsprop(learning_rate = 1e-5),
  # loss = 'loss_kullback_leibler_divergence',
  loss = 'mse',   # as we are looking at minimizing ||OrigImage - SynthImage||, i.e., the RMSE
  metrics = list(metric_mean_absolute_error, metric_poisson)
)

# Load the performance metrics of pre-computed 6-epoch model(modTransferLearning_History_6Epochs_DF.csv)
modTransferLearning_History_6Epochs_DF <- read.csv("https://umich.instructure.com/files/22825191/download?download_frd=1", header = T)
# modTransferLearning_History_6Epoc_DF <- as.data.frame(modTransferLearning_History_6Epochs$metrics)
#
# modTransferLearning_History_6Epochs_DF <- data.frame(
#   epoch = c(1:dim(modTransferLearning_History_6Epoc_DF)[1]),
#   loss = modTransferLearning_History_6Epoc_DF$loss,
#   mean_absolute_error=modTransferLearning_History_6Epoc_DF$mean_absolute_error,
#   poisson=modTransferLearning_History_6Epoc_DF$poisson,
#   val_loss = modTransferLearning_History_6Epoc_DF$val_loss,
#   val_mean_absolute_error = modTransferLearning_History_6Epoc_DF$val_mean_absolute_error,
#   val_poisson = modTransferLearning_History_6Epoc_DF$val_poisson)

head(modTransferLearning_History_6Epochs_DF)

##   X epoch       loss mean_absolute_error  poisson   val_loss
## 1 1     1 0.06418122          0.14311932 1.006497 0.05342003
## 2 2     2 0.04373177          0.12279665 1.006611 0.03685201
## 3 3     3 0.03213732          0.10813147 1.017886 0.02786113
## 4 4     4 0.02621992          0.09721498 1.042131 0.02372373
## 5 5     5 0.02366107          0.08954988 1.059174 0.02232249
## 6 6     6 0.02273074          0.08592803 1.048274 0.02177707
##   val_mean_absolute_error val_poisson
## 1              0.13215400    1.024515
## 2              0.11384638    1.023064
## 3              0.10080072    1.040739
## 4              0.09157515    1.067587
## 5              0.08643402    1.073721
## 6              0.08500605    1.057148

# write.csv(modTransferLearning_History_6Epochs_DF, "C:/Users/Dinov/Desktop/modTransferLearning_History_6Epochs_DF.csv")
# modTransferLearning_History_6Epochs_DF <- read.csv("https://umich.instructure.com/files/22825191/download?download_frd=1", header = T)
# modTransferLearning_History_4Epochs_DF <- read.csv("https://umich.instructure.com/files/22704304/download?download_frd=1", header = T)

# EVALUATION
plot_ly(data=modTransferLearning_History_6Epochs_DF, x = ~epoch,
                   y = ~loss, type = "scatter", mode="markers+lines", name="Loss") %>%
  add_trace(x = ~epoch, y = ~mean_absolute_error, type = "scatter", mode="markers+lines", name="Mean Abs Error") %>%
  add_trace(x = ~epoch, y = ~poisson, type = "scatter", mode="markers+lines", name="Poisson") %>%
  add_trace(x = ~epoch, y = ~val_loss, type = "scatter", mode="markers+lines", name="Valid Loss") %>%
  add_trace(x = ~epoch, y = ~val_mean_absolute_error, type = "scatter", mode="markers+lines", name="Valid MAE") %>%
  add_trace(x = ~epoch, y = ~val_poisson, type = "scatter", mode="markers+lines", name="Valid Poisson") %>%
  layout(title="Transfer-Learning Validation Synth Image Generation Performance", xaxis=list(title="epoch"),
         yaxis=list(title="Metric Value"), legend = list(orientation='h'))

# # mod3_TL %>% evaluate(training_TL_dataset %>% dataset_batch(25), verbose = 0)
# mod3_TL %>% evaluate(validation_TL_dataset, verbose = 0)
# #  Epoch4 loss         mean_absolute_error   poisson      kullback_leibler_divergence
# #         0.1845085    0.2612749             0.8180779    1.2870733
# # Epoch6       loss       mean_absolute_error             poisson
# #              0.02177707          0.08500605          1.05714786
# 
# # TL Model Prediction using the default (prior to Transfer Learning) model
# batch <- validation_TL_dataset %>% reticulate::as_iterator() %>% reticulate::iter_next()
# predictions <- predict(mod3_TL, batch[[1]])
# images <- tibble(
#   image     = batch[[1]][,,,1]  %>% array_branch(1),
#   synth_img = predictions[,,,1] %>% array_branch(1),
#   target    = batch[[2]][,,,1]  %>% array_branch(1)
# )

# Check performance of the Transfer-Learning model on one case (# 22)
###### OLD plot_ly() interactive graph
# i=22
# pl1 <- plot_ly(z = ~ 255*as.matrix(as.data.frame(images$image[i])[,1:256]), type="heatmap", name=paste0("Input Image ", i))
# pl2 <- plot_ly(z = ~ 255*as.matrix(as.data.frame(images$target[i])[,1:256]), type="heatmap", name=paste0("Target ", i))
# pl3 <- plot_ly(z = ~ 255*as.matrix(as.data.frame(images$synth_img[i])[,1:256]), type="heatmap", name=paste0("Synth Image ", i))
# subplot(pl1, pl2, pl3, nrows=1) %>% hide_colorbar()
# 
# pl_list <- list()
# for (i in 1:10) {
#   image = as.matrix(as.data.frame(images$image[i])[,1:256])
#   synth_img = as.matrix(as.data.frame(images$synth_img[i])[,1:256])
#   target = as.matrix(as.data.frame(images$target[i])[,1:256])
# 
#   p1 <- plot_ly(z=~255*image, type="heatmap", name=paste0("Image ", i))
#   p2 <- plot_ly(z=~255*target, type="heatmap", name=paste0("Target=Image ", i))
#   p3 <- plot_ly(z=~255*synth_img, type="heatmap", name=paste0("Synth Img ", i))
#   pl_list[[i]] <- subplot(p1,p2, p3, nrows=1) %>% hide_colorbar() # %>%
#     # layout(yaxis=list(scaleanchor = "x",  scaleratio = 1),  hovermode = "y unified")
#     # layout(hovermode = "y unified")
# }
# pl_list %>%
#   subplot(nrows = length(pl_list)) %>%
#      layout(title="Input Brain Images, Targets, and Synth Reconstructions for N=10 Cases")


# ---- Evaluation (unchanged) ----
mod3_TL %>% evaluate(validation_TL_dataset, verbose = 0)

##                loss mean_absolute_error             poisson 
##          0.02244135          0.08668123          1.07857680

# ---- TL Model Prediction ----
batch <- validation_TL_dataset %>% reticulate::as_iterator() %>% reticulate::iter_next()
predictions <- predict(mod3_TL, batch[[1]])

## 1/1 - 19s - 19s/epoch - 19s/step

images <- tibble(
  image     = batch[[1]][,,,1]  %>% array_branch(1),
  synth_img = predictions[,,,1] %>% array_branch(1),
  target    = batch[[2]][,,,1]  %>% array_branch(1)
)

# ---- Helper function to convert matrix to ggplot heatmap ----
mat_to_gg <- function(mat, title, multiply_255 = TRUE) {
  if(multiply_255) mat <- 255 * mat
  df <- expand.grid(x = 1:ncol(mat), y = 1:nrow(mat))
  df$value <- as.vector(mat)
  ggplot(df, aes(x = x, y = y, fill = value)) +
    geom_raster() +
    scale_fill_gradient(low = "black", high = "white") +
    coord_fixed() +
    theme_void() +
    theme(legend.position = "none",
          plot.title = element_text(hjust = 0.5, size = 10)) +
    labs(title = title)
}

# ---- Single case (i = 22) ----
i <- 22
img_mat    <- as.matrix(as.data.frame(images$image[i])[,1:256])
target_mat <- as.matrix(as.data.frame(images$target[i])[,1:256])
synth_mat  <- as.matrix(as.data.frame(images$synth_img[i])[,1:256])

p_img    <- mat_to_gg(img_mat,    paste0("Input Image ", i))
p_target <- mat_to_gg(target_mat, paste0("Target ", i))
p_synth  <- mat_to_gg(synth_mat,  paste0("Synth Image ", i))

single_plot <- (p_img | p_target | p_synth) +
  plot_annotation(title = paste("Transfer Learning – Case", i),
                  theme = theme(plot.title = element_text(hjust = 0.5)))
print(single_plot)

# ---- Multi‑case (10 cases) – static faceted version ----
all_cases <- map_dfr(1:10, function(i) {
  img_mat    <- as.matrix(as.data.frame(images$image[i])[,1:256])
  target_mat <- as.matrix(as.data.frame(images$target[i])[,1:256])
  synth_mat  <- as.matrix(as.data.frame(images$synth_img[i])[,1:256])

  bind_rows(
    expand.grid(x = 1:ncol(img_mat), y = 1:nrow(img_mat)) %>%
      mutate(value = as.vector(255 * img_mat),
             case = paste0("Case ", i),
             type = "Input Image"),
    expand.grid(x = 1:ncol(target_mat), y = 1:nrow(target_mat)) %>%
      mutate(value = as.vector(255 * target_mat),
             case = paste0("Case ", i),
             type = "Target (Input)"),
    expand.grid(x = 1:ncol(synth_mat), y = 1:nrow(synth_mat)) %>%
      mutate(value = as.vector(255 * synth_mat),
             case = paste0("Case ", i),
             type = "Synthesised Image")
  )
})

ggplot(all_cases, aes(x = x, y = y, fill = value)) +
  geom_raster() +
  facet_grid(case ~ type) +
  scale_y_reverse() +
  scale_fill_gradient(low = "black", high = "white") +
  coord_fixed() +
  theme_void() +
  theme(legend.position = "none",
        strip.text = element_text(size = 10, face = "bold")) +
  labs(title = "Input Brain Images, Targets, and Synth Reconstructions for N=10 Cases")

3 Notes about the Tensorflow pipeline protocol

3.1 Preprocessing

The preprocessing pipeline allows inspecting intermediate results using reticulate::as_iterator() on the dataset.

# library(reticulate)
# example <- training_dataset %>% reticulate::as_iterator() %>% reticulate::iter_next()
# ex_img <- example$img %>% as.array()  #  %>% as.raster()
# plot_ly(type="image", z=255*ex_img)       #  255*EBImage::Image(t(ex_img)))

# display one brain image
batch <- training_dataset %>% reticulate::as_iterator()

# pl_list <- list()
# for (i in 1:10) {
#   record <- reticulate::iter_next(batch)
#   image = record[[1]] %>% as.array()
#   mask = record[[2]] %>% as.array()
#   p1 <- plot_ly(z=~image[i,,,3], type="heatmap", name=paste0("Image ", i))
#   p2 <- plot_ly(z=~mask[i,,,1], type="heatmap", name=paste0("Mask ", i))
#   pl_list[[i]] <- subplot(p1,p2, nrows=1) %>% hide_colorbar() # %>%
#     # layout(yaxis=list(scaleanchor = "x",  scaleratio = 1),  hovermode = "y unified")
#     # layout(hovermode = "y unified")
# }
# pl_list %>%
#   subplot(nrows = length(pl_list)) %>%
#      layout(title="Brain Images and Masks for N=10 Cases")

batch <- training_dataset %>% reticulate::as_iterator()

all_pairs <- map_dfr(1:10, function(i) {
  record <- reticulate::iter_next(batch)
  img_mat <- record[[1]] %>% as.array()   # shape (32,256,256,3) – we take first sample
  mask_mat <- record[[2]] %>% as.array()

  # Use first image in batch, first channel
  img_slice <- (1.3)*img_mat[1,,,1]
  mask_slice <- mask_mat[1,,,1]

  bind_rows(
    expand.grid(x = 1:ncol(img_slice), y = 1:nrow(img_slice)) %>%
      mutate(value = as.vector(img_slice),
             case = paste0("Case ", i),
             type = "Brain Image"),
    expand.grid(x = 1:ncol(mask_slice), y = 1:nrow(mask_slice)) %>%
      mutate(value = as.vector(mask_slice),
             case = paste0("Case ", i),
             type = "Mask")
  )
})

ggplot(all_pairs, aes(x = x, y = y, fill = value)) +
  geom_raster() +
  facet_grid(case ~ type) +
  scale_y_reverse() +
  scale_fill_gradient(low = "black", high = "white") +
  coord_fixed() +
  theme_void() +
  theme(legend.position = "none",
        strip.text = element_text(size = 10, face = "bold")) +
  labs(title = "Brain Images and Masks for N=10 Cases")

For the subsequent transfer-learning, we need to add additional layers to the base-model (mod1). The output of every Conv2D and MaxPooling2D layer is a 3D tensor of shape (height, width, channels). In our-case, the final mask-output is a single-channel image. The width and height dimensions typically shrink with network layer depth. The number of output channels for each Conv2D layer is controlled by the filters parameter (e.g., 32 or 64). As the width and height shrink, we can add more output channels in each Conv2D layer in the NN.

Note that the new transfer-learning model (TL_model) has only $17K$ parameters to estimate, as the $31M$ parameters of the base model (mod) are now frozen, i.e., they will not be tuned or estimated during the TL_model re-fitting, which will be much faster than the estimation of the original model.

Total params: 31,048,611
Trainable params: 16,931
Non-trainable params: 31,031,680

After we design the DCNN model (Unet), we need to again compile it and estimate the fit to obtain the remaining trainable parameters.

3.2 Network Layers

Convolutional Layers: In the late 1990s, LeCun introduced one of the most popular strategies for generating signature (feature) vectors corresponding to single- or multi-channel 2D images. Previously, alternative methods, such as wavelet or spectral decomposition, could be used to map images as features. Then more classical AI/ML techniques, such as support vector machine, knn, logistic regression, among others, may be employed to model, analyze, predict and classify images.

Transforming 2D or higher-dimensional images as feature-vectors disregard some of the spatial interaction between pixels, voxels, and tensors. Convolution layers tend to capture and protect some of the spatial information from neighboring spatial locations. This is accomplished by down-sampling the image into features by convolving the images with kernels (filters) and then using the resampled convolution images to predict specific outcomes (images, values, classes, etc.) The use of multiple convolution kernels to “filter” the image involves computing a product to extract different features from the images.

For an image $f(m,n)$, and a kernel $g(k,l)$ defined over an integer grid $\{m,n,k,l\in \mathbb{Z}\}$, the discrete convolution of $f$ and $g$ is:

\[\underbrace{(f*g)}_{convolution}[m,n]=\sum_{m=-\infty}^{\infty} {\sum_{n=-\infty}^{\infty} {\left (f[k, l]\times g[m-k, n-l]\right )}},\] where typically, the support of $f$ and $g$ is compact, e.g., $0\leq m,k\leq M-1$ and $0\leq n,l\leq N-1$.

The convolution of two finite sequences is defined by extending the sequences to finitely supported functions on the set of integers. When the sequences are the coefficients of two polynomials, then the coefficients of the ordinary product of the two polynomials are the convolution of the original two sequences. This is known as the Cauchy product of the coefficients of the sequences.

For instance, edge detection in an image can be done using a Sobel kernel matrix for vertical ($y$) and horizontal ($x$) edges.

\[{\displaystyle g_{x}={\begin{bmatrix}+1&0&-1\\+2&0&-2\\+1&0&-1\end{bmatrix}} \quad {\mbox{and}}\quad g_{y}={\begin{bmatrix}+1&+2&+1\\0&0&0\\-1&-2&-1\end{bmatrix}}}.\]

sobelX = matrix(c(1,2,1, 0,0,0, -1,-2,-1), nrow = 3, ncol = 3); sobelX

##      [,1] [,2] [,3]
## [1,]    1    0   -1
## [2,]    2    0   -2
## [3,]    1    0   -1

sobelY=t(sobelX); sobelY

##      [,1] [,2] [,3]
## [1,]    1    2    1
## [2,]    0    0    0
## [3,]   -1   -2   -1

library(jpeg)
library(magick)

img_url <- "https://umich.instructure.com/files/1627149/download?download_frd=1"
f <- image_read(img_url)
plot(f)

# To apply the convolution process manually, we use a convolve() function of the 'magick' package.

imgX <- image_convolve(f, sobelX)
imgY <- image_convolve(f, sobelY)
# plot(imgX, imgY)

# Rotate 90 degrees
F <- imager::mirror(imager::imrotate(imager::magick2cimg(f), 90), "x")
ImgX <- imager::mirror(imager::imrotate(imager::magick2cimg(imgX), 90), "x")
ImgY <- imager::mirror(imager::imrotate(imager::magick2cimg(imgY), 90), "x")
p1 <- plot_ly(z=~255*(ImgX)[,,1,], type="image", name="(f*SobelX)")
p2 <- plot_ly(z=~255*(F)[,,1,], type="image", name="f")
p3 <- plot_ly(z=~255*(ImgY)[,,1,], type="image", name="(f*SobelY)")
subplot(p1, p2, p3, nrows=1) %>%
  layout(title="Image convolution with Sobel Kernel Filters (Left=f*SobelX, Middle=f, Right=f*SobelY)")

During the DCNN training process, the encoding phase typically included 2D convolutional layers (layer_conv_2d()) paired with pooling layers that reduce the size of the tensor shape and transform images by sliding the kernel filter by a stride (1, 2, 3, etc.) pixels to the right and down. The relation between the resulting feature-vector size and the kernel size is represented by:

\[Feature\ size = \frac{Image\ size\ −\ Kernel\ size}{Stride} + 1.\] For instance, a 10x10 square image, a filter of size 4x4, and a stride of 2 pixels, the $Feature\ size = \frac{10 − 4}{2} + 1 = 3$. Similarly, during the decoding phase, we include deconvolutional layers (layer_conv_2d_transpose()) that reverse the process by increasing the grid-size (tensor shape sizes) until we reach the desired output layer shape, usually the same as the input images, but could also be different. More information about tensorflow/keras layers, loss functions, and DCNN model performance metrics is available on the RStudio Tensorflow/Keras website.

Max Pooling Layer. Max pooling layers shrink the spatial extent of the convolved features and reduce overfitting by providing an abstracted feature representation. Instead of convolving (i.e., dot-product multiplying the image and the kernel) between the input and the kernel, Max-Pooling layers take the maximum value of image intensity over the region covered by the kernel filter. There are many alternatives to max-pooling, e.g., average-pooling, which computes the mean (arithmetic-average) of all image intensities covered by the kernel filter.
Fully Connected layers: Input nodes (from the left) in fully connected layers are connected to every node in the subsequent layer to the right. One or several fully connected layers may be common towards the end of a DCNN to provide support for learning non-linear affinities between high-level features generated as outputs of the prior convolutional layer. Good network designs typically include dropout layers between two consecutive fully connected layers (to reduce overfitting) and specify activation functions to capture non-linearity.

The final fully connected layer allows us to control the output tensor shape size which reflects the expected type of classification, prediction, or forecasting. For instance, if the outp[ut is expected to be 6-class labeling, the final layer will output a vector of size 6, i.e., one node for each possible class label. A softmax of this 6-feature vector would yield a 6D vector containing probabilities ($0\leq p_i\leq 1$, $\forall 1\leq i\leq 6$) one for each class label. Dropout layers provide a mechanism for regularizing the model and reducing overfitting of the DCNN. Dropout layers may follow fully connected layers or appear after other max-pooling layers to generate image noise augmentation. Dropout layers randomly annihilate (set to zero) some of the connections of the input tensor, according to a Bernoulli distribution; hence some inputs are triaged with probability $p$.

3.3 Model Tracking and Network Visualization

One can use tensorboard to dynamically track the progression of an ongoing training process as well as to visualize a neural network structure as a graph. See these examples for more details, example 1 and example 2. The basic mechanism for this involves the following steps.

Install tensorboard (in a terminal/shell outside R/RStudio): This can be done in different ways and is system/OS dependent. Some examples include:
- python -m pip install --user --upgrade pip
- conda install -c conda-forge tensorboard
- conda activate tensorflow
Launch the tensorboard UI (from terminal shell): > tensorboard --logdir logs/run_a. Alternatively, you can launch tensorboard from RStudio directly via Rstudio –> Tools –> Shell and entering this command in the shell tensorboard --logdir logs/run_a.
Open a local browser and point to this URL address: http://localhost:6006/
Your code has to use callback mechanism for tracking, see the example above (print_dot_callback()) with printing periods or this example.
Start the NN training process in the Rmd/R/RStudio environment and observe the tracking metrics dynamically updating in the browser. Note that your python/conda/anaconda shell needs to be open/live for this (localhost) process to work and to keep the browser portal active and listening to python updates during the fitting/training process.

4 References

Chollet and Allaire’s Deep Learning with R textbook (ISBN 9781617295546) and the corresponding code notebook
Deep Neural Networks
Google’s TensorFlow API
Olaf Ronneberger, Philipp Fischer, Thomas Brox (2015). U-Net: Convolutional Networks for Biomedical Image Segmentation, Medical Image Computing and Computer-Assisted Intervention (MICCAI), Springer, LNCS, Vol.9351: 234–241, 2015, arXiv:1505.04597.

DSPA2: Data Science and Predictive Analytics (UMich HS650)

Deep Learning, Neural Networks (Part 4)

SOCR/MIDAS (Ivo Dinov)

May 2026

1 `Torch` Deep Convolutional Neural Network (CNN)

1.1 Data Import

1.2 Torch-based Transfer Learning

1.3 Torch-based DNN-based Image Reconstruction and Synthetic Image Generation

1.3.1 Data Reparametrization

1.3.2 Define the Architecture of the UNet Encoder Branch

1.3.3 Define the Architecture of the UNet Decoder Branch

1.3.4 Define the Loss Function & Optimizer

1.3.5 Train DNN UNet (Encoder & Decoder Branches)

1.3.6 Obtain the DNN-derived Image Reconstructions and Random Synthetic Images

1.3.7 Result Visualization

2 Tensorflow Image Pre-processing Pipeline

3 Notes about the Tensorflow pipeline protocol

3.1 Preprocessing

3.2 Network Layers

3.3 Model Tracking and Network Visualization

4 References

DSPA2: Data Science and Predictive Analytics (UMich HS650)

Deep Learning, Neural Networks (Part 4)

SOCR/MIDAS (Ivo Dinov)

May 2026

1 Torch Deep Convolutional Neural Network (CNN)

1.1 Data Import

1.2 Torch-based Transfer Learning

1.3 Torch-based DNN-based Image Reconstruction and Synthetic Image Generation

1.3.1 Data Reparametrization

1.3.2 Define the Architecture of the UNet Encoder Branch

1.3.3 Define the Architecture of the UNet Decoder Branch

1.3.4 Define the Loss Function & Optimizer

1.3.5 Train DNN UNet (Encoder & Decoder Branches)

1.3.6 Obtain the DNN-derived Image Reconstructions and Random Synthetic Images

1.3.7 Result Visualization

2 Tensorflow Image Pre-processing Pipeline

3 Notes about the Tensorflow pipeline protocol

3.1 Preprocessing

3.2 Network Layers

3.3 Model Tracking and Network Visualization

4 References

1 `Torch` Deep Convolutional Neural Network (CNN)