
1 Working with website data

2 Network data and visualization

  • Download 03_les miserablese_GraphData.txt
  • Visualize this undirected network graph
  • Summarize the graph and explain the output
  • Calculate the degree and the centrality of this graph
  • Identify some important nodes (corresponding to novel characters)
  • Will the results change if we assume the graph is directed?
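
As a language-neutral illustration of the degree, centrality, and directedness questions above, here is a minimal Python sketch. It assumes a tiny hand-made edge list (the character names are real, but the edges are hypothetical and not taken from the assignment's data file), and the helper names `degrees` and `degree_centrality` are illustrative only. Degree centrality is taken here as degree divided by n - 1, one common normalization.

```python
from collections import defaultdict

# Toy edge list: hypothetical edges, NOT read from the assignment's data file.
edges = [("Valjean", "Javert"), ("Valjean", "Cosette"),
         ("Cosette", "Marius"), ("Valjean", "Marius")]

def degrees(edges, directed=False):
    """Return (in_degree, out_degree) dicts.

    For an undirected graph every edge counts for both endpoints,
    so the two dicts are identical.
    """
    in_deg, out_deg = defaultdict(int), defaultdict(int)
    for u, v in edges:
        out_deg[u] += 1
        in_deg[v] += 1
        if not directed:
            out_deg[v] += 1
            in_deg[u] += 1
    nodes = {n for edge in edges for n in edge}
    return ({n: in_deg[n] for n in nodes}, {n: out_deg[n] for n in nodes})

def degree_centrality(deg):
    """Normalize each degree by n - 1, the maximum possible number of neighbors."""
    n = len(deg)
    return {node: d / (n - 1) for node, d in deg.items()}

und_in, und_out = degrees(edges)                  # undirected view
dir_in, dir_out = degrees(edges, directed=True)   # directed view (source -> target)

centrality = degree_centrality(und_out)
most_central = max(centrality, key=centrality.get)
print(most_central, centrality[most_central])
print("directed in-degree:", dir_in)
print("directed out-degree:", dir_out)
```

In the undirected view a node has a single degree, while the directed view splits it into in-degree and out-degree, so a node that looks central in one view may not in the other.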

3 Data conversion and parallel computing

  • Download CaseStudy12_ AdultsHeartAttack_Data.xlsx or access it online
  • Load this data as a data frame
  • Use export() or write.xlsx() to re-save the xlsx file
  • Use the rio package to convert this “.xlsx” file to “.csv”
  • Generate generalized tabular data structures
  • Generate a data.table
  • Create disk-based data frames and perform basic calculations
  • Perform basic calculations on the last 5 columns as a big matrix
  • Use DIAGNOSIS, SEX, DRG, CHARGES, LOS and AGE to predict DIED with randomForest, setting ntree=20000. Note: sample without replacement to obtain the largest possible balanced dataset
  • Run train() in caret and measure the execution time
  • Detect the number of cores and create an appropriate number of clusters
  • Rerun train() in parallel and compare the execution times
  • Use foreach and doMC to design a parallelized random forest with ntree=20000 and compare its execution time against sequential execution
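
The note about sampling without replacement to get the largest possible balanced dataset can be sketched as follows. This is a minimal Python illustration, not the R workflow the exercise asks for. The helper `balanced_sample` and the toy records are hypothetical; only the column name `DIED` comes from the exercise. The idea is to downsample every class to the size of the rarest class.

```python
import random

def balanced_sample(rows, label_key, seed=0):
    """Downsample each class, without replacement, to the size of the rarest class."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    n_min = min(len(group) for group in by_class.values())
    sample = []
    for group in by_class.values():
        # random.sample draws without replacement, so no record is repeated
        sample.extend(rng.sample(group, n_min))
    rng.shuffle(sample)
    return sample

# Toy records (hypothetical values, not from the case-study file):
# 2 records with DIED == 1 and 8 records with DIED == 0.
rows = [{"id": i, "DIED": 1 if i < 2 else 0} for i in range(10)]
balanced = balanced_sample(rows, "DIED")
print(len(balanced), sum(r["DIED"] for r in balanced))
```

Because the rarest class limits the size, the balanced set has as many records per class as that class contains, which is the largest balanced subset available without reusing any record.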

4 R, C++ and Python Integration

Write an R-markdown (Rmd) electronic notebook that demonstrates data passing, cross-language object transfer, and integrated computing.

  • Start with an R code block - load in a 2D image, e.g., the 2D brain hematoma image we saw in Chapter 8.
  • Write a C++ code block (within the same Rmd notebook) that convolves (smooths) the image with a 2D Gaussian kernel.
  • Pass the R image object to the C++ block and retrieve within the R environment the smoothed image and use plot_ly() to show both the original and the C++-smoothed images next to each other.
  • In a new Python block, retrieve both the original and the smoothed images, and compute their difference.
  • In a follow up R code block, retrieve the difference image from the Python environment and display all 3 images using plot_ly() in 2D and in 3D, as surfaces.
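
The smoothing and difference steps above can be prototyped before wiring up the cross-language notebook. The sketch below is a pure-Python illustration under stated assumptions: the image is a small synthetic list of lists rather than the brain hematoma image, edges are handled by replicating border pixels, and the names `gaussian_kernel`, `convolve2d`, and `difference` are hypothetical helpers, not functions from any package used in the course.

```python
import math

def gaussian_kernel(radius, sigma):
    """Build a (2*radius+1) x (2*radius+1) Gaussian kernel normalized to sum to 1."""
    size = 2 * radius + 1
    kernel = [[math.exp(-((x - radius) ** 2 + (y - radius) ** 2) / (2 * sigma ** 2))
               for x in range(size)] for y in range(size)]
    total = sum(map(sum, kernel))
    return [[v / total for v in row] for row in kernel]

def convolve2d(image, kernel):
    """Smooth an image with a symmetric kernel; output has the same shape.

    Out-of-range indices are clamped to the nearest border pixel (edge
    replication). The kernel is symmetric, so correlation and convolution agree.
    """
    h, w = len(image), len(image[0])
    r = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    ii = min(max(i + di, 0), h - 1)
                    jj = min(max(j + dj, 0), w - 1)
                    acc += image[ii][jj] * kernel[di + r][dj + r]
            out[i][j] = acc
    return out

def difference(a, b):
    """Pixel-wise difference a - b of two equally sized images."""
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

# Synthetic 5x5 test image: a single bright pixel in the center.
image = [[0.0] * 5 for _ in range(5)]
image[2][2] = 1.0

kernel = gaussian_kernel(radius=1, sigma=1.0)
smoothed = convolve2d(image, kernel)
diff = difference(image, smoothed)
print(round(smoothed[2][2], 4), round(diff[2][2], 4))
```

In the notebook itself, the same three arrays (original, smoothed, difference) are what the R, C++, and Python blocks would pass among themselves; a nested-loop version like this is mainly useful as a reference to check the C++ output against.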