
In Chapter 6 we saw the square-root function, which is just one example of a power function.

- Why did the accuracy of the NN prediction of the square root decrease outside the interval \([0,1]\) (recall that we trained only inside \([0,1]\))? How can you improve the predictions of the square-root network?
- Can you design a more generic NN that can learn and predict a power function for a given power parameter (\(\lambda \in \Re\))?
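As a hint for both questions, note that \(x^\lambda = e^{\lambda \log x}\), so on the log scale a power function is linear: \(\log y = \lambda \log x\). The minimal sketch below uses base R only and swaps in a linear model (`lm`) for the network, purely to illustrate how a log transform of the inputs and outputs can make extrapolation beyond the training interval feasible. The sample size, the lower bound 0.01, and the test points are assumptions chosen for illustration.

```r
set.seed(1234)
lambda <- 0.5                                  # power parameter (0.5 gives the square root)

# Training inputs inside (0, 1], mirroring the training interval in the exercise
train <- data.frame(x = runif(500, 0.01, 1))
train$y <- train$x^lambda

# On the log scale the relation is linear with slope lambda and no intercept
fit <- lm(log(y) ~ 0 + log(x), data = train)
lambda_hat <- unname(coef(fit))                # estimated power parameter

# Extrapolate well outside the training interval, then back-transform
test <- data.frame(x = c(4, 9, 100))
pred <- unname(exp(predict(fit, newdata = test)))
print(cbind(test, pred = pred, truth = test$x^lambda))
```

A network trained on \((\log x, \log y)\) pairs faces the same linear relation, which is far easier to extend beyond \([0,1]\) than the raw square-root curve learned through saturating activations.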

Use the SOCR Normal and Schizophrenia pediatric neuroimaging study data to complete the following tasks:

- Conduct some initial data visualization and exploration
- Use derived neuroimaging biomarkers (e.g., *Age*, *FS_IQ*, *TBV*, *GMV*, *WMV*, *CSF*, *Background*, *L_superior_frontal_gyrus*, *R_superior_frontal_gyrus*, …, *brainstem*) to train a `NN` model and predict *DX* (Normals=1; Schizophrenia=2)
- Try one hidden layer with different numbers of nodes
- Try multiple hidden layers and compare the results to the single layer. Which model is better?
- Compare the type I (false-positive) and type II (false-negative) errors for the alternative methods
- Train separate models to predict *DX* (diagnosis) for the *Male* and *Female* cohorts, respectively. Explain your findings
- Train an *SVM* (using `ksvm` in `kernlab` and `svm` in `e1071`) for *Age*, *FS_IQ*, *TBV*, *GMV*, *WMV*, *CSF*, *Background* to predict *DX*. Compare the results of linear, Gaussian and polynomial SVM kernels
- Add *Sex* to your models and see if this makes a difference
- Expand the model by training on all derived neuroimaging biomarkers and re-train the SVM using *Age*, *FS_IQ*, *TBV*, *GMV*, *WMV*, *CSF*, *Background*, *L_superior_frontal_gyrus*, *R_superior_frontal_gyrus*, …, *brainstem*. Again, try linear, Gaussian and polynomial kernels. Compare the results
- Are there differences between the alternative kernels?
- For *Age*, *FS_IQ*, *TBV*, *GMV*, *WMV*, *CSF*, and *Background*, tune parameters for Gaussian and polynomial kernels
- Draw a CV (cross-validation) plot and interpret the resulting graph
- Repeat the experiment with different random seeds. Are the results stable?
- Inspecting the results above, explain why it makes sense to tune over a range such as `exp(-5:8)`
- How can we design alternative tuning strategies other than greedy search?
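For the tuning tasks, it may help to see what a range such as `exp(-5:8)` looks like: it is a log-spaced grid of 14 values, from about 0.0067 to about 2981, in which each value is \(e\) times the previous one. The sketch below builds that grid and, if `e1071` is installed, runs a 5-fold cross-validated search over the cost parameter of a Gaussian-kernel SVM. The `toy` data frame is a hypothetical stand-in, not the SOCR study data, and the fixed `gamma = 0.5` and the seed are arbitrary assumptions.

```r
# Candidate cost values on a log scale: each value is e times the previous one
grid <- exp(-5:8)
ratios <- grid[-1] / grid[-length(grid)]
print(signif(range(grid), 3))

# Hypothetical stand-in data (NOT the SOCR study): two noisy, non-linearly separated classes
set.seed(1234)
n <- 200
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$DX <- factor(ifelse(toy$x1^2 + toy$x2^2 + rnorm(n, sd = 0.5) > 1.5, 2, 1))

# Run the tuning step only when e1071 is available
if (requireNamespace("e1071", quietly = TRUE)) {
  tuned <- e1071::tune(e1071::svm, DX ~ ., data = toy,
                       kernel = "radial", gamma = 0.5,
                       ranges = list(cost = grid),
                       tunecontrol = e1071::tune.control(cross = 5))
  print(tuned$best.parameters)

  # CV plot: cross-validated error against log(cost)
  plot(log(tuned$performances$cost), tuned$performances$error, type = "b",
       xlab = "log(cost)", ylab = "5-fold CV error")
}
```

Changing the value passed to `set.seed` and re-running the block is one simple way to check whether the selected cost is stable across random CV partitions.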

Use the Autism Brain Imaging Data Exchange (ABIDE) data, which include imaging, clinical, genetics and phenotypic data for over 1,000 pediatric cases, to complete the following tasks:

- Apply several models (e.g., C5.0, k-Means, linear models, neural nets, random forest) to predict the clinical diagnosis using part of the data (training data)
- Evaluate the model’s performance, using confusion matrices, accuracy, \(\kappa\), precision and recall, F-measure, etc.
- Evaluate, compare and interpret the results
- Use the ROC curve to examine the tradeoff between detecting true positives and avoiding false positives, and report the AUC
- Finally, apply cross validation on C5.0 and report CV error.
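The evaluation measures listed above can all be computed from a confusion matrix and a vector of predicted scores. The following base R sketch shows one way to do this. The `actual`, `predicted`, and `score` vectors are made-up values for illustration only (not ABIDE results), and treating the level `"1"` as the positive class is an assumption.

```r
# Hypothetical labels and classifier scores, for illustration only
actual    <- factor(c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0), levels = c(0, 1))
predicted <- factor(c(1, 1, 1, 0, 1, 0, 0, 0, 0, 0), levels = c(0, 1))
score     <- c(0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.05)

# Confusion matrix, with "1" treated as the positive class (an assumption)
cm <- table(Predicted = predicted, Actual = actual)
TP <- cm["1", "1"]; FP <- cm["1", "0"]
FN <- cm["0", "1"]; TN <- cm["0", "0"]
total <- sum(cm)

accuracy  <- (TP + TN) / total
precision <- TP / (TP + FP)
recall    <- TP / (TP + FN)
f_measure <- 2 * precision * recall / (precision + recall)

# Cohen's kappa: observed agreement corrected for chance agreement
p_chance  <- sum(rowSums(cm) * colSums(cm)) / total^2
kappa_hat <- (accuracy - p_chance) / (1 - p_chance)

# AUC via the rank-sum (Mann-Whitney) identity; ties receive average ranks
r     <- rank(score)
n_pos <- sum(actual == "1")
n_neg <- sum(actual == "0")
auc   <- (sum(r[actual == "1"]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

print(round(c(accuracy = accuracy, kappa = kappa_hat, precision = precision,
              recall = recall, F = f_measure, AUC = auc), 4))
```

The same function calls apply unchanged to the predictions of any of the models above, once the predicted classes and scores on the held-out test data are available.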