Demonstrate cross-validation on two case-studies, ALS and Case06_QoL_Symptom_ChronicIllness.csv, treating each independently.
Go through the following protocol:
- Review each case-study.
- Choose appropriate dichotomous, polytomous or continuous outcome variables, e.g., use ALSFRS_slope for ALS, CHRONICDISEASESCORE (cutoff at 1.2) for Case06_QoL_Symptom_ChronicIllness.csv and binarize the outcome.
- Apply proper data preprocessing.
- Perform regression modeling (OLS, glmnet, Forward or Backward model selection, etc.) for continuous outcomes.
- Perform classification and prediction using various methods (e.g., LDA, QDA, AdaBoost, SVM, Neural Network, KNN) for discrete outcomes.
- Apply cross-validation on these regression and classification methods, respectively.
- Report the standard error for the regression-type approaches.
- Report appropriate quality metrics that can be used to rank the forecasting approaches based on the predictive power of their results.
- Compare the results of model-driven and data-driven (e.g., KNN) techniques.
- Compare sensitivity and specificity.
- Use unsupervised classification methods, e.g., k-means and spectral clustering.
- Evaluate and justify the k-means model and assess the level of agreement between the model's clusters and the real cluster labels.
- Report the discrepancy (difference of agreement) between k-means and k-means++, including diagnostics for k-means++.
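
A few of the steps above (binarizing an outcome at a cutoff, k-fold cross-validation of a classifier, and reporting sensitivity and specificity) can be sketched as follows. This is a minimal illustration under stated assumptions, not the expected solution: it uses only the Python standard library, the data are synthetic stand-ins rather than either case-study file, all function names are hypothetical, values strictly above the cutoff are assumed to form the positive class, and a one-feature KNN stands in for the richer classifiers listed in the protocol. The standard error shown is the usual across-fold estimate, which treats fold accuracies as if they were independent.

```python
import math
import random
from statistics import mean, stdev


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def binarize(values, cutoff):
    """Return 1 for values strictly above the cutoff, else 0 (assumed direction)."""
    return [1 if v > cutoff else 0 for v in values]


def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k nearest training points on a single feature."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    votes = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes > k else 0


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec


def cross_validate(x, y, k_folds=5, k_neighbors=3, seed=0):
    """Return per-fold accuracies and one out-of-fold prediction per row."""
    folds = k_fold_indices(len(x), k_folds, seed)
    accuracies, y_pred = [], [None] * len(x)
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(x)) if i not in held_out]
        train_x = [x[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        # Each held-out row is predicted by a model that never saw it.
        for i in test_idx:
            y_pred[i] = knn_predict(train_x, train_y, x[i], k_neighbors)
        accuracies.append(mean(1.0 if y_pred[i] == y[i] else 0.0 for i in test_idx))
    return accuracies, y_pred


# Synthetic stand-in data (NOT the case-study files): one feature, noisy score.
rng = random.Random(42)
feature = [rng.uniform(-2, 2) for _ in range(200)]
score = [1.2 + 0.8 * f + rng.gauss(0, 0.5) for f in feature]
label = binarize(score, cutoff=1.2)

accs, preds = cross_validate(feature, label)
se = stdev(accs) / math.sqrt(len(accs))  # across-fold standard error of mean accuracy
sens, spec = sensitivity_specificity(label, preds)
print(f"accuracy = {mean(accs):.3f} (SE {se:.3f}), "
      f"sensitivity = {sens:.3f}, specificity = {spec:.3f}")
```

Pooling the out-of-fold predictions before computing sensitivity and specificity gives one pair of values per method, which makes it straightforward to rank model-driven and data-driven approaches on the same held-out rows.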