RCurl and httr

- Download CaseStudy12_ AdultsHeartAttack_Data.xlsx, or access it online.
- Load the data as a data frame.
- Use export() or write.xlsx() to rewrite the xlsx file.
- Use the rio package to convert this “.xlsx” file to “.csv”.
- Generate generalized tabular data structures.
- Generate a data.table.
- Create disk-based data frames and perform basic calculations.
- Perform basic calculations on the last 5 columns as a big matrix.
- Use DIAGNOSIS, SEX, DRG, CHARGES, LOS and AGE to predict DIED with randomForest, setting ntree=20000. Note: sample without replacement to obtain the largest possible balanced dataset.
- Run train() in caret and measure the execution time.
- Detect the number of cores and create an appropriate number of clusters.
- Rerun train() in parallel and compare the execution times.
- Use foreach and doMC to design a parallelized random forest with ntree=20000 and compare its execution time against sequential execution.
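The balanced sampling and the foreach/doMC comparison could be approached along the following lines. This is only a sketch under stated assumptions, not a reference solution: the data frame is a small synthetic stand-in with made-up values (not the heart-attack case study), the tree count is far below the ntree=20000 that the exercise asks for, and the packages randomForest, foreach, and doMC are assumed to be installed. Since doMC relies on process forking, it is generally not usable on Windows.

```r
library(randomForest)
library(foreach)
library(doMC)

set.seed(1234)

# HYPOTHETICAL stand-in data with made-up values; replace with the
# data frame loaded from the case-study file.
n <- 2000
dat <- data.frame(
  DIAGNOSIS = factor(sample(c("d1", "d2", "d3"), n, replace = TRUE)),
  SEX       = factor(sample(c("F", "M"), n, replace = TRUE)),
  DRG       = factor(sample(c("g1", "g2", "g3"), n, replace = TRUE)),
  CHARGES   = rlnorm(n, meanlog = 9, sdlog = 1),
  LOS       = rpois(n, 5),
  AGE       = sample(20:90, n, replace = TRUE),
  DIED      = factor(rbinom(n, 1, 0.1))
)

# Balance the classes: draw, without replacement, as many rows from each
# class as the minority class contains.
n_min <- min(table(dat$DIED))
idx <- unlist(lapply(split(seq_len(nrow(dat)), dat$DIED),
                     function(i) sample(i, n_min, replace = FALSE)))
balanced <- dat[idx, ]

total_trees <- 400   # kept small here; the exercise uses ntree = 20000
n_workers <- max(1, parallel::detectCores() - 1)
registerDoMC(cores = n_workers)

rf_formula <- DIED ~ DIAGNOSIS + SEX + DRG + CHARGES + LOS + AGE

# Sequential fit: one process grows all trees.
t_seq <- system.time(
  rf_seq <- randomForest(rf_formula, data = balanced, ntree = total_trees)
)

# Parallel fit: each worker grows a share of the trees, and
# randomForest::combine() merges the sub-forests into one forest.
trees_per_worker <- ceiling(total_trees / n_workers)
t_par <- system.time(
  rf_par <- foreach(nt = rep(trees_per_worker, n_workers),
                    .combine = randomForest::combine,
                    .packages = "randomForest") %dopar%
    randomForest(rf_formula, data = balanced, ntree = nt)
)

# Elapsed wall-clock seconds for the two approaches.
print(c(sequential = t_seq[["elapsed"]], parallel = t_par[["elapsed"]]))
```

With so few trees the parallel run may not be faster, because starting the workers and merging the sub-forests has a fixed cost; the difference should become clearer as the tree count grows toward the value in the exercise.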
R, C++ and Python Integration

Write an R-markdown (Rmd) electronic notebook that demonstrates data passing, cross-language object transfer, and integrated computing.
- In an R code block, load a 2D image, e.g., the 2D brain hematoma image we saw in Chapter 8.
- Write C++ code (within the same Rmd notebook) that convolves (smooths) the image with a 2D Gaussian kernel.
- Pass the R image object to the C++ block, retrieve the smoothed image within the R environment, and use plot_ly() to show the original and the C++-smoothed images next to each other.
- In a Python block, retrieve both the original and the smoothed images, and compute their difference.
- In an R code block, retrieve the difference image from the Python environment and display all 3 images using plot_ly() in 2D and in 3D, as surfaces.
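One possible shape for the data flow in this exercise is sketched below as a single R script. It is an illustration under several assumptions, not the intended notebook: the image is a synthetic matrix standing in for the brain hematoma image, the function name gaussSmooth and its sigma and radius parameters are hypothetical, and the packages Rcpp, reticulate, and plotly (plus a C++ compiler and a Python installation with NumPy) are assumed to be available. In an actual Rmd notebook the C++ and Python parts would normally live in their own Rcpp and python chunks; here Rcpp::cppFunction() and reticulate::py_run_string() stand in for those chunks.

```r
library(Rcpp)
library(reticulate)
library(plotly)

# HYPOTHETICAL stand-in image: a bright disc on a noisy background.
# Replace with the image matrix loaded in the first R block.
set.seed(1)
n <- 64
disc <- outer(seq_len(n), seq_len(n),
              function(i, j) (i - n / 2)^2 + (j - n / 2)^2 < 15^2)
img <- matrix(rnorm(n * n, sd = 0.1), n, n) + disc

# C++ step: direct 2D convolution with a Gaussian kernel. Weights are
# renormalized near the borders, where part of the kernel falls outside.
cppFunction('
NumericMatrix gaussSmooth(NumericMatrix img, double sigma, int radius) {
  int nr = img.nrow(), nc = img.ncol();
  NumericMatrix out(nr, nc);
  for (int i = 0; i < nr; i++) {
    for (int j = 0; j < nc; j++) {
      double acc = 0.0, wsum = 0.0;
      for (int di = -radius; di <= radius; di++) {
        for (int dj = -radius; dj <= radius; dj++) {
          int ii = i + di, jj = j + dj;
          if (ii < 0 || ii >= nr || jj < 0 || jj >= nc) continue;
          double w = std::exp(-(di * di + dj * dj) / (2.0 * sigma * sigma));
          acc  += w * img(ii, jj);
          wsum += w;
        }
      }
      out(i, j) = acc / wsum;
    }
  }
  return out;
}')

# R -> C++ -> R: pass the matrix in, get the smoothed matrix back.
smoothed <- gaussSmooth(img, sigma = 2, radius = 6)

# R -> Python: expose both matrices to Python's main module, where they
# arrive as NumPy arrays, then compute the difference on the Python side.
py$orig   <- img
py$smooth <- smoothed
py_run_string("import numpy as np\ndiff = np.asarray(orig) - np.asarray(smooth)")

# Python -> R: retrieve the difference image as an R matrix.
diff_img <- py$diff

# 2D views side by side, and one 3D surface; print p2d or p3d to render.
p2d <- subplot(
  plot_ly(z = img,      type = "heatmap", showscale = FALSE),
  plot_ly(z = smoothed, type = "heatmap", showscale = FALSE),
  plot_ly(z = diff_img, type = "heatmap", showscale = FALSE)
)
p3d <- plot_ly(z = diff_img, type = "surface")
```

The same surface call can be repeated for img and smoothed to cover the 3D part of the exercise. Comparing diff_img with img - smoothed computed directly in R is a quick way to confirm that nothing was altered while the objects crossed language boundaries.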