User-data driven visual analytics of extremely high-dimensional studies using TensorBoard

SOCR Team

This SOCR HTML5 resource demonstrates:
  1. The t-distributed stochastic neighbor embedding (t-SNE) statistical method for manifold dimension reduction,
  2. The TensorBoard machine learning platform,
  3. A hands-on Big Data Analytics activity using the UK Biobank data, and
  4. Interactive Visual Analytics using user-provided data.
Before you begin, review the SOCR hands-on high-dimensional t-SNE Data Analytics Learning Module and the DSPA Dimensionality Reduction Chapter.

You can analyze your own dataset in the same way as the UK Biobank study.
To do so, provide a pair of ASCII text files loaded from your computer.

The first file contains the tab-delimited (TSV) data, that is, the predictor vectors, with one row per case and one column per feature.
The second file is optional; it is a TSV file containing metadata, such as a label for each case (row).
Examples of the two data formats that can be loaded from your computer are included below. Once the data is loaded in the app, you can run the same dimensionality reduction analysis on it that we ran in the UK Biobank activity.
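As an illustration of the two-file layout described above, here is a minimal Python sketch that writes a small synthetic dataset to a vectors file and a matching metadata file. The function name write_projector_files, the file names, and the toy values are hypothetical. The sketch also assumes the app follows the usual TensorBoard Embedding Projector convention, in which the vectors file has no header row and a single-column metadata file has no header row either; check the app's own examples if your files do not load.

```python
import csv
import os
import random
import tempfile


def write_projector_files(vectors, labels, vectors_path, metadata_path):
    """Write feature vectors and per-case labels as two tab-delimited files.

    vectors: list of equal-length numeric rows (one row per case).
    labels:  list with exactly one label per case, in the same row order.
    """
    if len(vectors) != len(labels):
        raise ValueError("need exactly one label per case (row)")
    width = len(vectors[0])
    if any(len(row) != width for row in vectors):
        raise ValueError("every case must have the same number of features")

    # Vectors file: one case per line, features separated by tabs, no header
    # (assumed convention, see the note above).
    with open(vectors_path, "w", newline="") as f:
        csv.writer(f, delimiter="\t", lineterminator="\n").writerows(vectors)

    # Metadata file: one label per line, aligned row-by-row with the vectors.
    with open(metadata_path, "w", newline="") as f:
        csv.writer(f, delimiter="\t", lineterminator="\n").writerows(
            [label] for label in labels
        )


# Toy data: 4 cases, 5 features each (synthetic values, for illustration only).
rng = random.Random(0)
vectors = [[round(rng.gauss(0, 1), 4) for _ in range(5)] for _ in range(4)]
labels = ["control", "patient", "control", "patient"]

out_dir = tempfile.mkdtemp()
vectors_path = os.path.join(out_dir, "vectors.tsv")
metadata_path = os.path.join(out_dir, "metadata.tsv")
write_projector_files(vectors, labels, vectors_path, metadata_path)
print("wrote", vectors_path, "and", metadata_path)
```

The row order is the only link between the two files, so the sketch refuses to write anything when the number of labels differs from the number of cases.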

Video Demonstration using UKBB Data


You can see the complete UK Biobank activity here.