In this case study, we use the (tab-delimited) SOCR Longitudinal data: human
electrocardiogram (ECG) signals.
This ECG data structure represents a 2D array/tensor, [1:162, 1:3500]. The
complete data is available here, [1:162, 1:65536]. Each row of the complete
data is an ECG recording of 65536 temporal measurements over a period of 512
seconds, sampled at 128 Hz.
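The relation between sample count, sampling rate, and recording length can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the function and constant names are illustrative, not part of the SOCR materials.

```python
# Sampling parameters as stated in the text.
SAMPLING_RATE_HZ = 128
N_SAMPLES_FULL = 65536
N_SAMPLES_TRUNCATED = 3500


def duration_seconds(n_samples, rate_hz):
    """Length of a recording in seconds."""
    return n_samples / rate_hz


def time_axis(n_samples, rate_hz):
    """Time stamp (in seconds) of each sample, starting at t = 0."""
    return [i / rate_hz for i in range(n_samples)]


full_duration = duration_seconds(N_SAMPLES_FULL, SAMPLING_RATE_HZ)
truncated_duration = duration_seconds(N_SAMPLES_TRUNCATED, SAMPLING_RATE_HZ)
print(full_duration, truncated_duration)  # 512.0 27.34375
```

Under these numbers, the truncated 3500-point version covers roughly the first 27 seconds of each recording.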
The Labels file represents a vector of 162 diagnostic labels, one for each row
of Data. The three diagnostic categories are 'ARR', 'CHF', and 'NSR':
ARR: 96 recordings from persons with arrhythmia.
CHF: 30 recordings from persons with congestive heart failure.
NSR: 36 recordings from persons with normal sinus rhythms.
Research Goal: Train a classifier to distinguish between the 3 clinical
phenotypes: ARR, CHF, and NSR.
ECGDataTensor_T3500.tsv includes only 3500 timepoints; ECGDataTensor.tsv
includes the complete 65536 temporal measurements.
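Before training a classifier, it may help to confirm that the data matrix and the label vector line up, and to note how unbalanced the classes are. The following is a minimal sketch using only the Python standard library. It assumes the TSV file has no header row and no row-name column, which should be verified against the actual files; the helper names and the toy values are hypothetical stand-ins, not real ECG data.

```python
import csv
import io
from collections import Counter

# Class counts as stated in the text (ARR: 96, CHF: 30, NSR: 36).
EXPECTED_COUNTS = {"ARR": 96, "CHF": 30, "NSR": 36}


def read_tsv_matrix(handle):
    """Parse tab-delimited numeric rows from an open text handle.

    Assumption: the file has no header row and no row-name column.
    """
    reader = csv.reader(handle, delimiter="\t")
    return [[float(cell) for cell in row] for row in reader if row]


def check_dataset(matrix, labels, n_timepoints):
    """Verify one label per recording and a constant row length."""
    if len(matrix) != len(labels):
        raise ValueError("number of recordings and labels differ")
    if any(len(row) != n_timepoints for row in matrix):
        raise ValueError("unexpected number of timepoints in a row")
    return Counter(labels)


# Toy stand-in for the real files: 3 recordings x 4 timepoints (made-up values).
toy_tsv = io.StringIO(
    "0.1\t0.2\t0.3\t0.4\n0.0\t0.1\t0.0\t0.1\n0.5\t0.4\t0.3\t0.2\n"
)
toy_labels = ["ARR", "CHF", "NSR"]
toy_counts = check_dataset(read_tsv_matrix(toy_tsv), toy_labels, n_timepoints=4)

# Majority-class baseline implied by the stated counts.
n_total = sum(EXPECTED_COUNTS.values())
baseline_accuracy = max(EXPECTED_COUNTS.values()) / n_total
print(n_total, round(baseline_accuracy, 3))  # 162 0.593
```

With the real files, the same check would be called with `n_timepoints=3500` or `n_timepoints=65536`. Since ARR accounts for 96 of the 162 recordings, a classifier that always predicts ARR already reaches about 59% accuracy, so any trained model should be judged against that baseline.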
Note that longitudinal data has to be in the wide format, where each column represents
a time index for the observed value in the corresponding cell.
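To illustrate the wide format, the sketch below reshapes long-format records (one observation per row) into a wide layout (one recording per row, one time index per column). The function name, the record identifiers, and the numeric values are made up for illustration.

```python
from collections import defaultdict


def long_to_wide(records):
    """Convert (subject_id, time_index, value) triples to wide format.

    Returns (subject_ids, time_indices, rows), where rows[i][j] is the value
    for subject_ids[i] at time_indices[j], or None if it was not observed.
    """
    by_subject = defaultdict(dict)
    for subject_id, time_index, value in records:
        by_subject[subject_id][time_index] = value
    subject_ids = sorted(by_subject)
    time_indices = sorted({t for obs in by_subject.values() for t in obs})
    rows = [[by_subject[s].get(t) for t in time_indices] for s in subject_ids]
    return subject_ids, time_indices, rows


# Long format: one (recording, time index, value) triple per observation.
long_records = [
    ("rec1", 1, 0.10), ("rec1", 2, 0.12), ("rec1", 3, 0.09),
    ("rec2", 1, -0.05), ("rec2", 2, 0.00), ("rec2", 3, 0.07),
]
subject_ids, time_indices, wide_rows = long_to_wide(long_records)

# Wide format: each row is a recording, each column a time index.
for subject_id, row in zip(subject_ids, wide_rows):
    print(subject_id, row)
```

The ECG files described above are already in this layout: 162 rows (recordings) by 3500 or 65536 columns (time indices).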