Use some of the methods below to do classification, prediction, and model performance evaluation.
Model | Learning Task | Method | Parameters |
---|---|---|---|
KNN | Classification | knn | k |
Naïve Bayes | Classification | nb | fL, usekernel |
Decision Trees | Classification | C5.0 | model, trials, winnow |
OneR Rule Learner | Classification | OneR | None |
RIPPER Rule Learner | Classification | JRip | NumOpt |
Linear Regression | Regression | lm | None |
Regression Trees | Regression | rpart | cp |
Model Trees | Regression | M5 | pruned, smoothed, rules |
Neural Networks | Dual use | nnet | size, decay |
Support Vector Machines (Linear Kernel) | Dual use | svmLinear | C |
Support Vector Machines (Radial Basis Kernel) | Dual use | svmRadial | C, sigma |
Random Forests | Dual use | rf | mtry |
\[\textbf{Table 1}\]
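To make the "Parameters" column of Table 1 concrete, the following is a minimal sketch of what tuning a single parameter means, using k for KNN as the example. It is written in plain Python, independent of any modeling package, and is not the course's own implementation. The toy data, the candidate grid, the fold assignment, and the helper names `knn_predict` and `cv_accuracy` are all assumptions made for illustration.

```python
import math
import random
from collections import Counter

def knn_predict(train_x, train_y, query, k):
    """Predict by majority vote among the k training points nearest to query."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(train_x[i], query))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

def cv_accuracy(xs, ys, k, folds=5):
    """Fraction of points classified correctly when each fold is held out once."""
    n = len(xs)
    correct = 0
    for fold in range(folds):
        train = [i for i in range(n) if i % folds != fold]
        held_out = [i for i in range(n) if i % folds == fold]
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        correct += sum(knn_predict(train_x, train_y, xs[i], k) == ys[i]
                       for i in held_out)
    return correct / n

# Hypothetical toy data: two Gaussian clusters labeled "A" and "B".
rng = random.Random(0)
xs = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(30)]
xs += [(rng.gauss(3, 1), rng.gauss(3, 1)) for _ in range(30)]
ys = ["A"] * 30 + ["B"] * 30

# Candidate values of the tuning parameter k (an assumed grid).
grid = [1, 3, 5, 7, 9]
results = {k: cv_accuracy(xs, ys, k) for k in grid}
best_k = max(results, key=results.get)
print(results, "best k:", best_k)
```

The same pattern (evaluate each candidate setting by resampling, keep the best) applies to the other tunable parameters in the table, such as cp, mtry, or C and sigma.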
From the course datasets, use the 05_PPMI_top_UPDRS_Integrated_LongFormat1.csv case-study data to perform a multi-class prediction.
Use ResearchGroup as the response, which has three classes: “PD”, “Control”, and “SWEDD”.
Delete the ID column, impute missing values with the mean or median, and justify your choice.
Normalize the covariates.
Implement an automated parameter tuning process and report the optimal accuracy and \(\kappa\).
Set the arguments and rerun the tuning, trying different method and number settings.
Train a random forest with the tuned parameters, report the results, and output a cross table.
Use a bagging algorithm and report the accuracy and \(\kappa\).
Perform randomForest and report the accuracy and \(\kappa\).
Report the accuracy achieved by AdaBoost, making sure to try all three methods.
Finally, give a brief summary of all the model improvement approaches.
Try the procedure on other data in the list of Case-Studies, e.g., Traumatic Brain Injury Study and the corresponding dataset.
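The preprocessing and reporting steps in the task list above (mean or median imputation, normalization, accuracy and \(\kappa\)) can be sketched as follows. This is a minimal illustration in plain Python of the arithmetic involved, not the course's solution. The example column, the label vectors, and the helper names `impute`, `zscore`, and `accuracy_and_kappa` are hypothetical. Normalization is assumed here to mean z-scoring with the sample standard deviation, and \(\kappa\) is assumed to be Cohen's unweighted kappa.

```python
from collections import Counter
from statistics import mean, median, stdev

def impute(column, how="median"):
    """Replace None entries with the mean or median of the observed values."""
    observed = [v for v in column if v is not None]
    fill = median(observed) if how == "median" else mean(observed)
    return [fill if v is None else v for v in column]

def zscore(column):
    """Center to mean 0 and scale by the sample standard deviation (n - 1)."""
    m, s = mean(column), stdev(column)
    return [(v - m) / s for v in column]

def accuracy_and_kappa(actual, predicted):
    """Return (accuracy, Cohen's kappa) for two equal-length label lists."""
    n = len(actual)
    p_obs = sum(a == p for a, p in zip(actual, predicted)) / n
    a_counts, p_counts = Counter(actual), Counter(predicted)
    # Agreement expected by chance from the two marginal distributions.
    p_exp = sum(a_counts[c] * p_counts[c] for c in a_counts) / n ** 2
    return p_obs, (p_obs - p_exp) / (1 - p_exp)

# Hypothetical covariate with one missing value and one outlier.
column = [1.0, 2.0, None, 3.0, 100.0]
by_median = impute(column, "median")   # missing value becomes 2.5
by_mean = impute(column, "mean")       # missing value becomes 26.5
normalized = zscore(by_median)

# Hypothetical true and predicted ResearchGroup labels.
actual = ["PD", "PD", "PD", "PD", "Control", "Control", "Control",
          "SWEDD", "SWEDD", "SWEDD"]
predicted = ["PD", "PD", "PD", "Control", "Control", "Control", "PD",
             "SWEDD", "SWEDD", "PD"]
acc, kappa = accuracy_and_kappa(actual, predicted)
print(f"accuracy={acc:.3f} kappa={kappa:.3f}")
```

In this toy column the outlier pulls the mean far from the bulk of the data while the median stays close to it, which is one kind of evidence that can be used when justifying the imputation choice.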