User-data driven visual analytics of extremely high-dimensional studies using TensorBoard

SOCR Team

This SOCR HTML5 resource demonstrates:

the t-distributed stochastic neighbor embedding (t-SNE) statistical method for manifold dimension reduction,

the TensorBoard machine learning platform,

a hands-on Big Data Analytics activity using the UK Biobank data, and

interactive Visual Analytics using user-provided data.

Before you begin, review the SOCR hands-on high-dimensional t-SNE Data Analytics Learning Module and the DSPA Dimensionality Reduction Chapter.

As with the analysis of the UK Biobank study, you can analyze your own dataset.
To do so, provide a pair of ASCII text files that can be loaded from your computer.

The first file contains tab-delimited (TSV) data with the predictor vectors (rows = cases, columns = features).
The second file is an optional TSV file containing metadata, such as a label for each case (row), if any.
Examples of the two data formats that can be loaded from your computer are included below.
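
As a rough illustration of the two-file layout described above, the following Python sketch writes a small vectors file and a matching metadata file, then reads them back to check that they line up. All values, file names, and the helper functions `write_tsv` and `read_tsv` are hypothetical. The sketch also assumes the usual TensorBoard Embedding Projector convention that the vectors file has no header row while a multi-column metadata file starts with one; adjust this if the app expects something different.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical toy data: 4 cases (rows) x 3 features (columns).
vectors = [
    [0.12, 1.50, 3.30],
    [0.98, 1.10, 2.70],
    [0.45, 0.30, 4.10],
    [0.77, 2.20, 1.90],
]

# Hypothetical metadata: one row per case, in the same order as `vectors`.
metadata_header = ["Class", "Phenotype"]
metadata = [
    ["A", "control"],
    ["B", "case"],
    ["A", "case"],
    ["B", "control"],
]


def write_tsv(path, rows, header=None):
    """Write rows as tab-delimited ASCII text, with an optional header row."""
    with open(path, "w", newline="", encoding="ascii") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        if header is not None:
            writer.writerow(header)
        writer.writerows(rows)


def read_tsv(path):
    """Read a tab-delimited ASCII file into a list of string rows."""
    with open(path, newline="", encoding="ascii") as handle:
        return list(csv.reader(handle, delimiter="\t"))


out_dir = Path(tempfile.mkdtemp())
vectors_path = out_dir / "vectors.tsv"    # assumed file name
metadata_path = out_dir / "metadata.tsv"  # assumed file name

# Assumption: no header in the vectors file, one header row in the metadata file.
write_tsv(vectors_path, vectors)
write_tsv(metadata_path, metadata, header=metadata_header)

# Sanity checks before loading the files into the app.
loaded_vectors = read_tsv(vectors_path)
loaded_metadata = read_tsv(metadata_path)
feature_counts = {len(row) for row in loaded_vectors}

assert len(feature_counts) == 1, "every case needs the same number of features"
assert len(loaded_metadata) - 1 == len(loaded_vectors), "one metadata row per case"

print(f"{len(loaded_vectors)} cases, {feature_counts.pop()} features")
```

Checking the row counts up front is a cheap way to catch a misaligned metadata file, since labels are matched to cases by row position.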

Load the data

SOCR Country Ranking Data

Optionally

Class

Phenotype

SOCR Top-30 Country Rank Indicator Labels

Details about these Country Ranking data are available here.

Once the data are loaded in the app, you can analyze them much as we did in the dimensionality reduction activity using the UK Biobank data.

Video Demonstration using UKBB Data

You can see the complete UK Biobank activity here.