1 Learn and predict a power-function

In Chapter 6 we saw the square-root function, which is just one example of a power function.

Why did the accuracy of the neural network (NN) prediction of the square root decrease outside the interval \([0,1]\) (note that we trained only inside \([0,1]\))? How can you improve the predictions of the square-root network?
Can you design a more generic NN that can learn and predict a power function for a given power parameter \(\lambda \in \mathbb{R}\)?
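
One direction worth exploring for the questions above relies on the identity \(x^\lambda = \exp(\lambda \log x)\) for \(x > 0\): in log-log coordinates a power function is a straight line, so a model fit on \((\log x, \log y)\) pairs can extrapolate beyond the training interval. The sketch below is a minimal Python illustration under that assumption. It is not a neural network: it uses a single linear unit fit by ordinary least squares as a stand-in, and the names `fit_log_linear` and `predict_power` are hypothetical.

```python
import math

def fit_log_linear(xs, ys):
    """Least-squares fit of log(y) = a * log(x) + b; returns (a, b).

    Assumes every x and y is strictly positive.
    """
    u = [math.log(x) for x in xs]
    v = [math.log(y) for y in ys]
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    slope = (sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))
             / sum((ui - mean_u) ** 2 for ui in u))
    intercept = mean_v - slope * mean_u
    return slope, intercept

def predict_power(x, slope, intercept):
    """Map the log-space linear prediction back to the original scale."""
    return math.exp(slope * math.log(x) + intercept)

# Train only inside (0, 1], as in the square-root example (lambda = 0.5).
lam = 0.5
train_x = [i / 100 for i in range(1, 101)]
train_y = [x ** lam for x in train_x]

slope, intercept = fit_log_linear(train_x, train_y)

# Extrapolate well outside the training interval.
pred_at_9 = predict_power(9.0, slope, intercept)
print(f"estimated lambda = {slope:.4f}, prediction at x = 9: {pred_at_9:.4f}")
```

The fitted slope estimates \(\lambda\), so the same code handles other powers by changing `lam`. Whether a similar log-transform of inputs and outputs also helps a genuine NN is something to verify experimentally.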

2 Pediatric Schizophrenia Study

Use the SOCR Normal and Schizophrenia pediatric neuroimaging study data to complete the following tasks:

Conduct some initial data visualization and exploration
Use derived neuroimaging biomarkers (e.g., Age, FS_IQ, TBV, GMV, WMV, CSF, Background, L_superior_frontal_gyrus, R_superior_frontal_gyrus, …, brainstem) to train a NN model and predict DX (Normals=1; Schizophrenia=2)
Try one hidden layer with different numbers of nodes
Try multiple hidden layers and compare the results to the single layer. Which model is better?
Compare the type I (false-positive) and type II (false-negative) errors for the alternative methods
Train separate models to predict DX (diagnosis) for the Male and Female cohorts, respectively. Explain your findings
Train an SVM (using ksvm in kernlab and svm in e1071) on Age, FS_IQ, TBV, GMV, WMV, CSF, Background to predict DX. Compare the results of linear, Gaussian and polynomial SVM kernels
Add Sex to your models and see if this makes a difference
Expand the model by training on all derived neuroimaging biomarkers and re-train the SVM using Age, FS_IQ, TBV, GMV, WMV, CSF, Background, L_superior_frontal_gyrus, R_superior_frontal_gyrus, …, brainstem. Again, try linear, Gaussian and polynomial kernels. Compare the results
Are there differences between the alternative kernels?
For Age, FS_IQ, TBV, GMV, WMV, CSF, and Background, tune parameters for Gaussian and polynomial kernels
Draw a CV (cross-validation) plot and interpret the resulting graph
Repeat the experiment with different random seeds. Are the results stable?
Inspecting the results above, explain why it makes sense to tune over a range such as \(\exp(-5:8)\)
How can we design alternative tuning strategies other than greedy search?
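
To make the tuning items above concrete, here is a minimal Python sketch that builds the analogue of R's `exp(-5:8)`, a grid of 14 values equally spaced on the log scale, and contrasts an exhaustive pass over that grid with a simple random search over the same range. The function `toy_cv_error` is a hypothetical stand-in for a real cross-validated error estimate, and `grid_search` and `random_search` are illustrative names, not library functions.

```python
import math
import random

# Python analogue of R's exp(-5:8): e^-5, e^-4, ..., e^8 (14 values).
grid = [math.exp(k) for k in range(-5, 9)]

def toy_cv_error(cost):
    """Hypothetical error curve with its minimum at log(cost) = 1.3.

    Replace this with a real cross-validated error for the model being tuned.
    """
    return (math.log(cost) - 1.3) ** 2

def grid_search(candidates, score):
    """Evaluate every candidate and return the one with the lowest score."""
    return min(candidates, key=score)

def random_search(score, n_draws, low_exp=-5.0, high_exp=8.0, seed=0):
    """Draw candidates log-uniformly from [e^low_exp, e^high_exp]."""
    rng = random.Random(seed)
    draws = [math.exp(rng.uniform(low_exp, high_exp)) for _ in range(n_draws)]
    return min(draws, key=score)

best_grid = grid_search(grid, toy_cv_error)
best_random = random_search(toy_cv_error, n_draws=14)

print(f"grid covers {grid[0]:.4f} to {grid[-1]:.1f}")
print(f"best grid value:   {best_grid:.4f}")
print(f"best random value: {best_random:.4f}")
```

Consecutive grid values differ by a constant factor of \(e\), so the grid covers about six orders of magnitude with only 14 evaluations. Random search spends the same budget but is not restricted to the grid points.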

3 Use the ABIDE case-study

The Autism Brain Imaging Data Exchange (ABIDE) data include imaging, clinical, genetic, and phenotypic data for over 1,000 pediatric cases.

Apply several models (e.g., C5.0, k-Means, linear models, neural nets, random forest) to predict the clinical diagnosis using part of the data (training data)
Evaluate the model’s performance, using confusion matrices, accuracy, \(\kappa\), precision and recall, F-measure, etc.
Evaluate, compare and interpret the results
Use the ROC curve to examine the tradeoff between detecting true positives and avoiding false positives, and report the AUC
Finally, apply cross-validation to the C5.0 model and report the CV error.
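
The evaluation measures listed above can all be derived from the four cells of a binary confusion matrix. The Python sketch below shows one way to compute them, including the type I and type II error rates. The function name `binary_metrics` is hypothetical, the choice of which diagnosis counts as the "positive" class is an assumption you must make explicit, and the counts in the example are arbitrary illustrative numbers, not results from ABIDE.

```python
def binary_metrics(tp, fp, fn, tn):
    """Summary measures from a 2x2 confusion matrix.

    tp, fp, fn, tn are counts of true positives, false positives,
    false negatives, and true negatives. All four marginal totals
    are assumed to be nonzero.
    """
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # sensitivity, true-positive rate
    f1 = 2 * precision * recall / (precision + recall)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_observed = accuracy
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (p_observed - p_chance) / (1 - p_chance)

    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "type_I_rate": fp / (fp + tn),   # false-positive rate
        "type_II_rate": fn / (fn + tp),  # false-negative rate
    }

# Arbitrary illustrative counts (not real study results).
metrics = binary_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in metrics.items():
    print(f"{name:>12}: {value:.4f}")
```

Reporting these side by side for each model makes the comparison easier to interpret, since accuracy alone can hide an imbalance between the two error types.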

DSPA2: Data Science and Predictive Analytics (UMich HS650)

Assignment 6: Black Box Machine-Learning Methods: Neural Networks, Support Vector Machines, Random Forests

SOCR/MIDAS (Ivo Dinov)

March 2022
