
1 Parkinson’s Disease example

Use Principal Component Analysis (PCA), Singular Value Decomposition (SVD), Independent Component Analysis (ICA), and Factor Analysis (FA) to reduce the dimensionality of the PD data. Interpret the results of each method.
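As a starting point, the following minimal sketch illustrates how PCA and SVD relate on a standardized data matrix. It uses a small synthetic matrix as a stand-in for the PD data (the matrix `X`, its size, and the feature names are assumptions made for illustration only) and relies solely on base R.

```r
# Synthetic stand-in for the PD features (hypothetical; replace with the real data)
set.seed(1234)
n <- 200
latent <- matrix(rnorm(n * 2), n, 2)              # two hidden sources
mixing <- matrix(runif(2 * 6, -1, 1), 2, 6)       # mix them into 6 observed features
X <- latent %*% mixing + matrix(rnorm(n * 6, sd = 0.3), n, 6)
colnames(X) <- paste0("feature", 1:6)

# PCA on centered, unit-variance features
pca <- prcomp(X, center = TRUE, scale. = TRUE)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)     # proportion of variance per PC
print(round(var_explained, 3))

# SVD of the same standardized matrix
Xs <- scale(X)
sv <- svd(Xs)

# The PC standard deviations are the singular values rescaled by sqrt(n - 1),
# and the right singular vectors match the PCA loadings up to sign
sdev_from_svd <- sv$d / sqrt(n - 1)
print(all.equal(pca$sdev, sdev_from_svd))
print(all.equal(abs(unname(pca$rotation)), abs(sv$v)))
```

The same standardized matrix `Xs` could then be passed to an ICA or FA routine, so that all four methods are compared on identical input.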

2 Allometric Relations in Plants example

2.1 Load data

Load Allometric Relations in Plants data and perform proper type conversion, e.g., convert “Province” and “Born”.
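The type conversion might look like the sketch below. The data frame `plants` and all of its values are hypothetical placeholders for the imported table, and the sketch assumes that “Province” and “Born” are categorical and should become factors, while a measurement such as “L” that was read in as text should become numeric.

```r
# Hypothetical excerpt standing in for the freshly imported table
plants <- data.frame(
  Province = c("A", "B", "A", "C"),
  Born     = c("X", "Y", "Y", "X"),
  L        = c("12.1", "15.3", "9.8", "11.0"),   # numeric values read as text
  stringsAsFactors = FALSE
)

# Assumed conversions: categorical columns to factors, text measurements to numeric
plants$Province <- as.factor(plants$Province)
plants$Born     <- as.factor(plants$Born)
plants$L        <- as.numeric(plants$L)

str(plants)   # confirm the resulting column types
```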

2.2 Principal Component Analysis

  • Generate a data summary
  • Apply factoextra and compare it to the results of prcomp
  • Report the rotations (loadings) and the scores
  • Show the scree plot
  • Select the number of PCs and employ a bootstrap test
  • Perform SVD and ICA and compare the results with those of PCA.
    • Use the three variables “L”, “M”, and “D” to perform ICA and show pair-plots of the data before and after ICA. The functions plot_ly() and scatterplot3d(), which you saw in Chapter 2 and Chapter 3, may be helpful.
  • Perform factor analysis
    • Use require(nFactors) to determine the number of factors and show a scree plot, as described in the notes
    • Use factanal() to apply FA and compare the “varimax” and “promax” rotations
    • Report the loadings and consider an appropriate visualization method
  • Interpret the findings in the context of the case-study.
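The steps in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not a reference solution: the data are synthetic, the columns named “L”, “M”, and “D” are stand-ins for the real measurements, `v4` to `v6` are invented variable names, the choice of two components is arbitrary, and the row-resampling bootstrap shown here is only one possible way to assess the variance captured by the leading PCs. The calls to factoextra and fastICA run only if those packages are installed.

```r
# Synthetic data with a known two-factor structure (hypothetical stand-in)
set.seed(2024)
n <- 300
factors <- matrix(rnorm(n * 2), n, 2)
true_loadings <- rbind(c(0.9, 0.8, 0.7, 0.1, 0.0, 0.1),
                       c(0.0, 0.1, 0.2, 0.8, 0.9, 0.7))
X <- factors %*% true_loadings + matrix(rnorm(n * 6, sd = 0.5), n, 6)
colnames(X) <- c("L", "M", "D", "v4", "v5", "v6")
plants_num <- as.data.frame(X)

summary(plants_num)                        # data summary

# PCA: loadings are in $rotation, scores are in $x
pca <- prcomp(plants_num, center = TRUE, scale. = TRUE)
print(pca$rotation)
print(head(pca$x))
screeplot(pca, type = "lines", main = "Scree plot")

# Optional factoextra view of the same fit
if (requireNamespace("factoextra", quietly = TRUE)) {
  print(factoextra::fviz_eig(pca))
}

# Bootstrap (row resampling) of the variance explained by the first k PCs
k <- 2
prop_var <- function(fit, k) sum(fit$sdev[1:k]^2) / sum(fit$sdev^2)
observed <- prop_var(pca, k)
boot_prop <- replicate(500, {
  idx <- sample(n, replace = TRUE)
  prop_var(prcomp(plants_num[idx, ], center = TRUE, scale. = TRUE), k)
})
ci <- quantile(boot_prop, c(0.025, 0.975))
print(c(observed = observed, ci))

# Optional ICA on L, M, D with before/after pair-plots
if (requireNamespace("fastICA", quietly = TRUE)) {
  lmd <- scale(plants_num[, c("L", "M", "D")])
  ica <- fastICA::fastICA(lmd, n.comp = 3)
  pairs(lmd, main = "Before ICA")
  pairs(ica$S, main = "After ICA (estimated sources)")
}

# Factor analysis with two rotations
fa_varimax <- factanal(plants_num, factors = 2, rotation = "varimax")
fa_promax  <- factanal(plants_num, factors = 2, rotation = "promax")
print(fa_varimax$loadings, cutoff = 0.3)
print(fa_promax$loadings, cutoff = 0.3)
```

Varimax keeps the factors orthogonal, whereas promax allows them to correlate, so comparing the two printed loading tables shows how much the simple structure depends on that assumption.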

3 3D Volumetric Brain Study

Use the 3D Brain Tumor Segmentation (BraTS) image dataset. Split it into training and testing sets. The complete brain MR dataset contains \(257\) 3D volumes of dimensions \(240(x)\times 240(y)\times 155(z)\). For each case, there is a categorical (phenotypic) label and four imaging modalities: T1 (T1-weighted), T1C (contrast-enhanced T1-weighted), T2 (T2-weighted), and FLAIR (Fluid-Attenuated Inversion Recovery). See this recent publication: DOI:10.1016/j.bspc.2021.102458.
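One way to split the \(257\) cases is a split stratified by the class label, sketched below. The label values, the two-class setup, and the 80/20 ratio are assumptions chosen for illustration; only the case count comes from the description above.

```r
set.seed(42)
n_cases <- 257

# Hypothetical phenotype labels; replace with the labels shipped with the data
labels <- factor(sample(c("classA", "classB"), n_cases, replace = TRUE))

# Stratified split: draw 80% of the case indices within each class
train_fraction <- 0.8
ids_by_class <- split(seq_len(n_cases), labels)
train_idx <- unlist(lapply(ids_by_class, function(ids) {
  n_train <- floor(train_fraction * length(ids))
  ids[sample.int(length(ids), n_train)]
}), use.names = FALSE)
test_idx <- setdiff(seq_len(n_cases), train_idx)

# Class proportions should be similar in both sets
print(prop.table(table(labels[train_idx])))
print(prop.table(table(labels[test_idx])))
```

Splitting on case indices, rather than on individual volumes, keeps all four modalities of a case in the same set.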

Consider each voxel (the 3D generalization of a pixel) in the 3D brain volume as a feature. Use both t-SNE and UMAP to reduce the high-dimensional data (\(240\times 240\times 155=8,928,000\) features per volume) to 2D or 3D. Color-code the lower-dimensional projections by the categorical labels associated with disease (clinical phenotypes). Are there clearly identifiable patterns suggesting discrimination between the clinical class labels? Compare your findings against the SOCR PCA/t-SNE/UMAP interactive webapp using the default UKBB dataset.
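The flattening and embedding steps might be sketched as follows. To stay runnable, the sketch uses tiny synthetic volumes (\(8\times 8\times 5\) voxels, 60 cases, two invented classes) in place of the real images; at full resolution, \(257\) volumes of \(8,928,000\) double-precision voxels would need roughly 18 GB per modality, so downsampling or a PCA pre-reduction is likely necessary. The PCA pre-reduction to 20 components and the perplexity value are assumptions, and the t-SNE and UMAP calls run only if the Rtsne and umap packages are installed.

```r
set.seed(7)
n_cases <- 60
dims <- c(8, 8, 5)                       # toy volume size (real data: 240 x 240 x 155)
labels <- factor(rep(c("classA", "classB"), each = n_cases / 2))

# Synthetic volumes stored as a cases x X x Y x Z array, with extra signal for classB
volumes <- array(rnorm(n_cases * prod(dims)), dim = c(n_cases, dims))
is_b <- labels == "classB"
volumes[is_b, 1:4, 1:4, 1:2] <- volumes[is_b, 1:4, 1:4, 1:2] + 1.5

# Flatten: one row per case, one column per voxel
X <- matrix(volumes, nrow = n_cases)

# PCA pre-reduction before the nonlinear embeddings (assumed: keep 20 PCs)
pcs <- prcomp(X, center = TRUE)$x[, 1:20]
plot(pcs[, 1:2], col = as.integer(labels), pch = 19, main = "First two PCs")

# t-SNE embedding (requires 3 * perplexity < number of cases - 1)
if (requireNamespace("Rtsne", quietly = TRUE)) {
  tsne <- Rtsne::Rtsne(pcs, dims = 2, perplexity = 15, pca = FALSE,
                       check_duplicates = FALSE)
  plot(tsne$Y, col = as.integer(labels), pch = 19, main = "t-SNE")
}

# UMAP embedding
if (requireNamespace("umap", quietly = TRUE)) {
  um <- umap::umap(pcs)
  plot(um$layout, col = as.integer(labels), pch = 19, main = "UMAP")
}
```

Because t-SNE and UMAP are stochastic and sensitive to their hyperparameters, it is worth repeating the embeddings with several seeds and settings before drawing conclusions about class separation.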
