In this case study, we will look at another interesting example of a large structured tabular dataset, the UK Biobank (UKBB) data. The goal of this study is to compare the data representations, as well as the subsequent data analytics, between classical time-series and kime-series.
A previous investigation, Predictive Big Data Analytics using the UK Biobank Data, based on \(7,614\) clinical, phenotypic, and neuroimaging features of \(9,914\) UKBB subjects, reported the twenty most salient derived imaging biomarkers associated with mental health conditions (e.g., depression, anxiety). By jointly representing and modeling the significant clinical and demographic variables along with the derived salient neuroimaging features, the researchers predicted the presence and progression of depression and other mental health disorders in the cohort of UKBB participating volunteers. We will explore the effects of the kime-direction on the findings based on the same data and similar analytical methods. To streamline the demonstration, enable efficient calculations, and facilitate direct interpretation, we will transform the data into a tight computable object of dimensions \(9,914\times 106\) (participants × features) that can be processed, interpreted, and visualized more intuitively. An interactive demonstration of this tensor data showing linear and non-linear dimensionality reduction is available online.
dim(tight106_UKBB_data)
## [1] 9914 106
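As a rough illustration of the linear part of that dimensionality-reduction demo, here is a minimal sketch on a synthetic stand-in matrix of the same dimensions (the data below are simulated, not UKBB; prcomp is base R):
# Minimal sketch: PCA on a synthetic 9914 x 106 stand-in for the UKBB matrix
set.seed(1234)
toy_data <- matrix(rnorm(9914 * 106), nrow = 9914, ncol = 106)
pca <- prcomp(toy_data, center = TRUE, scale. = TRUE)
summary(pca)$importance[, 1:3]   # proportion of variance of the first 3 PCs
plot(pca$x[, 1:2], pch = ".", main = "PC1 vs PC2 (synthetic data)")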
The UKBB archive contains incomplete records. We employed multiple imputation using chained equations (MICE) to obtain complete instances of the data and avoid issues with missing observations.
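As a minimal, runnable illustration of chained-equation imputation, here is a toy sketch on the small nhanes dataset that ships with the mice package, using the same 'cart' method as the full run below (this is a stand-in, not the UKBB data):
library(mice)
# toy MICE run: 2 imputations, 3 iterations, CART imputation models
imp_toy <- mice(nhanes, m = 2, maxit = 3, method = 'cart', seed = 1234, printFlag = FALSE)
head(complete(imp_toy))  # first completed (imputed) dataset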
# This chunk is not set up to run due to long runtime; instead we load the pretrained imputation results.
# MICE imputation (with parallel core computing)
library(mice)
library(parallel)
library(doParallel)

number_cores <- detectCores() - 2
clustUKBB <- makeCluster(number_cores)
clusterSetRNGStream(clustUKBB, 1234)
registerDoParallel(clustUKBB)

imp_tight106_UKBB_data <-
  foreach(no = 1:number_cores,
          .combine = ibind,
          .export = "tight106_UKBB_data",
          .packages = "mice") %dopar% {
    mice(tight106_UKBB_data, m = 2, maxit = 3, printFlag = TRUE,
         seed = 1234, method = 'cart')
  }
comp_imp_tight106_UKBB_data <- as.matrix(complete(imp_tight106_UKBB_data),
                                         dimnames = list(NULL, colnames(tight106_UKBB_data)))
stopCluster(clustUKBB)
save(comp_imp_tight106_UKBB_data, file = "origin_after_imputation.RData")
To assess the quality of kime-series based analytical inference, we will focus on predicting a specific clinical phenotype, Ever depressed for a whole week, i.e., column “X4598.2.0”. A number of model-based and model-free methods can be used for supervised and unsupervised, retrodictive and predictive strategies for binary, categorical, or continuous regression, clustering, and classification. We will demonstrate a Random Forest classification approach for forecasting this binary clinical depression phenotype.
Other preprocessing steps were necessary prior to the main data analytics. To construct a kime-series, we randomly split the UKBB archive into 11 time epochs. We used all 11 epochs for phase estimation, and we iteratively used 10 epochs for model training and 1 epoch for model testing. Many alternative kime-direction aggregator functions can be employed; for this demonstration, we separately used nil-phase, averaging, and smoothing splines with different degrees of freedom to estimate the unknown phase-angles.
Note that in later analysis, we found that the first 50 columns of neuroimaging data do not contribute much to this specific classification task. Therefore, in the following analysis, the building blocks of our model contain only elements from the last 56 clinical features (54 of them as predictors, 1 as the response variable, and 1 discarded due to its close relationship with the response variable); see the column-role sketch below.
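For reference, the column roles used throughout the rest of the analysis are sketched here (the indices follow the code comments in the FFT chunk below; interpreting X4598.0.0 as an earlier instance of the same field is our reading of the UKBB field.instance.array naming convention):
# Column roles in the 9,914 x 106 matrix:
#   1:50              derived neuroimaging biomarkers (dropped for this task)
#   c(51:94, 97:106)  54 clinical predictors
#   95                X4598.0.0, discarded (closely related to the response)
#   96                X4598.2.0, response: Ever depressed for a whole week
predictor_cols <- c(51:94, 97:106)
response_col <- 96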
load("origin_after_imputation.RData")
# randomly splitted the UKBB archive into 11 time epochs
set.seed(1234)
<- sample(1:nrow(comp_imp_tight106_UKBB_data))
suffle_rid <- comp_imp_tight106_UKBB_data[suffle_rid[1:9900],]
comp_imp_tight106_UKBB_data <- colnames(comp_imp_tight106_UKBB_data)
col_name dim(comp_imp_tight106_UKBB_data) <- c(11, 900, dim(comp_imp_tight106_UKBB_data)[2])
dim(comp_imp_tight106_UKBB_data)
## [1] 11 900 106
We applied the Fourier transform to shift the 54 predictors from the time domain to the frequency domain.
# FFT of all 54 predictors
set.seed(1234)
FT_epochs_tight106_UKBB <- array(complex(), c(11, 900, 54))
mag_FT_epochs_tight106_UKBB <- array(NA, c(11, 900, 54))
phase_FT_epochs_tight106_UKBB <- array(NA, c(11, 900, 54))

for (i in 1:11) {
  # Fourier transform
  # col 96 <X4598.2.0>: response variable; col 95 <X4598.0.0>: highly related to the response variable
  FT_epochs_tight106_UKBB[i, , ] <- fft(comp_imp_tight106_UKBB_data[i, , c(51:94, 97:106)])
  X2 <- FT_epochs_tight106_UKBB[i, , ]
  mag_FT_epochs_tight106_UKBB[i, , ] <- sqrt(Re(X2)^2 + Im(X2)^2)
  phase_FT_epochs_tight106_UKBB[i, , ] <- atan2(Im(X2), Re(X2))
}
dim(phase_FT_epochs_tight106_UKBB)
## [1] 11 900 54
We applied Nil-Phase estimation as a trivial baseline, which sets the phase to 0 at each position of the FFT-transformed data. We will show its model performance compared to the true-phase and other phase estimation methods in the next section.
# Nil-Phase reconstruction
set.seed(1234)
ift_NilPhase <- array(NA, c(11, 900, 54))
for (i in 1:11) {
  Real_Nil <- mag_FT_epochs_tight106_UKBB[i, , ] * cos(0)
  Imaginary_Nil <- mag_FT_epochs_tight106_UKBB[i, , ] * sin(0)
  ift_NilPhase[i, , ] <- Re(fft(Real_Nil + 1i*Imaginary_Nil, inverse = TRUE) /
                              length(mag_FT_epochs_tight106_UKBB[i, , ]))
}
Next, we applied Average-Phase estimation. By this method, we average the 11 epoch-specific phase estimates at each position of the \(900\times 54\) matrix.
# Average-Phase reconstruction
avgPhase_FT_epochs_tight106_UKBB <- apply(phase_FT_epochs_tight106_UKBB, c(2, 3), mean)
set.seed(1234)
ift_AvgPhase <- array(NA, c(11, 900, 54))
for (i in 1:11) {
  Real_Avg <- mag_FT_epochs_tight106_UKBB[i, , ] * cos(avgPhase_FT_epochs_tight106_UKBB)
  Imaginary_Avg <- mag_FT_epochs_tight106_UKBB[i, , ] * sin(avgPhase_FT_epochs_tight106_UKBB)
  ift_AvgPhase[i, , ] <- Re(fft(Real_Avg + 1i*Imaginary_Avg, inverse = TRUE) /
                              length(mag_FT_epochs_tight106_UKBB[i, , ]))
}
We also fit a smoothing spline to the 11 estimated phase values at each position of the \(900\times 54\) matrix. We will see later that different degrees of freedom of the smoothing spline result in different prediction accuracies. The average-phase and nil-phase estimations can be viewed as special cases of the smoothing spline estimation with degrees of freedom df = 1 and df = 0, respectively.
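For reference, at each position the fitted smoothing spline \(\hat{f}\) minimizes the penalized least-squares criterion
\[ \sum_{i=1}^{11} \left(\phi_i - f(t_i)\right)^2 + \lambda \int f''(t)^2 \, dt, \]
where \(\phi_i\) is the estimated phase at epoch \(t_i = i\) and the penalty weight \(\lambda\) is chosen to match the specified degrees of freedom: smaller df values flatten the fit toward an aggregate trend, while larger df values track the 11 epoch-specific phases more closely.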
We visualize this estimation using one biomarker column “X31.0.0” as an example:
plot(1:11, phase_FT_epochs_tight106_UKBB[, 10, 1], type = 'b',
     main = "phase_FT_epochs_tight106_UKBB[,10,1]",
     ylab = "fft phase", xlab = "epochs 1-11", lty = 1, lwd = 1)
lines(1:11, smooth.spline(1:11, phase_FT_epochs_tight106_UKBB[, 10, 1], df = 10)$y, col = 'red')
lines(1:11, smooth.spline(1:11, phase_FT_epochs_tight106_UKBB[, 10, 1], df = 5)$y, col = 'blue')
legend("topleft", legend = c("df = 10", "df = 5"), lty = 1, col = c("red", "blue"), cex = 0.8)
After obtaining the phase estimates, we need to invert the FFT back into the time domain to conduct the subsequent classification tasks. How well the reconstruction reflects the structure of the original dataset after the inverse FFT indicates the adequacy of the phase estimation process.
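The underlying round-trip logic can be checked on synthetic data (a minimal sketch, not UKBB data): recombining the FFT magnitudes with the true phases and inverting recovers the signal to machine precision, whereas a nil phase does not.
# synthetic round-trip check: magnitude + true phase vs. magnitude + nil phase
set.seed(1)
x <- rnorm(16)
X <- fft(x)
mag <- Mod(X); ph <- Arg(X)
x_true <- Re(fft(mag * exp(1i * ph), inverse = TRUE)) / length(x)
x_nil <- Re(fft(mag + 0i, inverse = TRUE)) / length(x)
max(abs(x_true - x))  # ~1e-15: exact recovery with the true phases
max(abs(x_nil - x))   # large: discarding the phases loses the signal structure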
We show the results of smoothing spline phase estimation with df = 10 as an example:
# inverse FFT example - smoothing spline df = 10
set.seed(1234)
ift_splinePhase <- array(NA, c(11, 900, 54))
smooth_spline_value <- array(NA, c(11, 900, 54))
df <- 10

for (i in 1:900) {
  for (j in 1:54) {
    smooth_spline_value[, i, j] <- smooth.spline(1:11, phase_FT_epochs_tight106_UKBB[, i, j], df = df)$y
  }
}
for (i in 1:11) {
  Real_Avg <- mag_FT_epochs_tight106_UKBB[i, , ] * cos(smooth_spline_value[i, , ])
  Imaginary_Avg <- mag_FT_epochs_tight106_UKBB[i, , ] * sin(smooth_spline_value[i, , ])
  ift_splinePhase[i, , ] <- Re(fft(Real_Avg + 1i*Imaginary_Avg, inverse = TRUE) /
                                 length(mag_FT_epochs_tight106_UKBB[i, , ]))
}
Original Data
head(comp_imp_tight106_UKBB_data[1, ,52],100)
## [1] 1 3 3 4 3 3 3 2 4 4 2 2 2 3 3 3 4 4 3 3 2 3 4 4 3 3 3 3 3 3 4 3 4 4 3 2 3
## [38] 3 4 3 3 3 3 4 2 4 3 4 2 3 3 4 3 1 3 4 3 4 3 3 3 3 4 3 3 4 3 3 2 2 4 3 3 2
## [75] 3 2 1 2 4 4 2 2 3 3 3 4 3 3 4 2 3 3 2 4 4 3 3 1 4 4
Raw Inverse FFT
head(ift_splinePhase[1,,2],100)
## [1] 1.3247109 3.0109421 3.0380485 3.9536578 3.0965326 2.9792134 3.0052876
## [8] 2.0398020 4.0458510 4.0057234 1.9994767 1.9411536 2.0524283 3.0157179
## [15] 3.0360039 3.0777665 3.9860074 3.9957264 2.9719703 3.0162383 1.9161712
## [22] 2.9620848 3.9687503 3.9462660 2.9716492 3.0064971 3.0089882 3.1169813
## [29] 3.0573894 3.0341785 3.9858190 2.9811874 3.9915375 3.9722213 3.0124213
## [36] 2.0457208 2.9877710 2.9418941 3.9399941 3.0442297 2.9720348 2.9641899
## [43] 3.1037576 4.0743839 2.0338855 3.9690007 2.9351730 3.9271476 1.9860520
## [50] 2.9196992 3.0628154 3.9515632 3.0055973 0.9939673 2.9840694 3.9977194
## [57] 2.9547721 3.9556126 3.0118504 3.0141401 2.9367009 3.0495779 3.9797034
## [64] 3.0051608 2.9035839 3.9462564 2.9772724 3.0457601 1.9507338 2.0121289
## [71] 4.0372838 3.0624453 2.9687833 2.0286521 2.9988872 1.9180096 1.0311940
## [78] 1.9083107 4.0499099 4.0093490 1.9983529 2.0102444 3.0531050 3.0401639
## [85] 3.0352656 3.9668564 2.9811530 3.0232930 3.9619172 2.0037215 2.9440798
## [92] 2.9938688 2.0727398 4.0473087 3.9915757 3.0407106 2.9565383 0.9115500
## [99] 3.9625790 4.0143996
Round Inverse FFT
head(round(ift_splinePhase[1,,2]),100)
## [1] 1 3 3 4 3 3 3 2 4 4 2 2 2 3 3 3 4 4 3 3 2 3 4 4 3 3 3 3 3 3 4 3 4 4 3 2 3
## [38] 3 4 3 3 3 3 4 2 4 3 4 2 3 3 4 3 1 3 4 3 4 3 3 3 3 4 3 3 4 3 3 2 2 4 3 3 2
## [75] 3 2 1 2 4 4 2 2 3 3 3 4 3 3 4 2 3 3 2 4 4 3 3 1 4 4
Then, we compare the model performance on a classification task based on the original time-series data versus the transformed data from each phase estimation approach.
We first build the model on the original data. For comparison across the different phase estimation methods, we use the 4th epoch as the test set and the other 10 epochs as the training set for all methods. We also implemented cross-validation with each of the 11 epochs serving as the testing set and found that the model performance remains similar. So, for simplicity, we will only show the results for one train-test epoch setting.
library(randomForest)

rf_train_data_ift <- c()
label_train <- c()
test_epoch_idx <- 4
for (i in setdiff(1:11, test_epoch_idx)) {
  rf_train_data_ift <- rbind(rf_train_data_ift, comp_imp_tight106_UKBB_data[i, , c(51:94, 97:106)])
  label_train <- c(label_train, comp_imp_tight106_UKBB_data[i, , 96])
}
rf_train_data_ift <- as.data.frame(rf_train_data_ift)
rf_train_label_ift <- as.factor(label_train)

set.seed(1234)
rf_Phase_ift_original_data <- randomForest(rf_train_label_ift ~ ., data = rf_train_data_ift)
rf_Phase_ift_original_data
##
## Call:
## randomForest(formula = rf_train_label_ift ~ ., data = rf_train_data_ift)
## Type of random forest: classification
## Number of trees: 500
## No. of variables tried at each split: 7
##
## OOB estimate of error rate: 19.18%
## Confusion matrix:
## 0 1 class.error
## 0 3886 577 0.1292852
## 1 1149 3388 0.2532510
library(caret)
pred_train <- predict(rf_Phase_ift_original_data, type = "class")
confusionMatrix(pred_train, rf_train_label_ift)
## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 1
## 0 3886 1149
## 1 577 3388
##
## Accuracy : 0.8082
## 95% CI : (0.7999, 0.8163)
## No Information Rate : 0.5041
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.6168
##
## Mcnemar's Test P-Value : < 2.2e-16
##
## Sensitivity : 0.8707
## Specificity : 0.7467
## Pos Pred Value : 0.7718
## Neg Pred Value : 0.8545
## Prevalence : 0.4959
## Detection Rate : 0.4318
## Detection Prevalence : 0.5594
## Balanced Accuracy : 0.8087
##
## 'Positive' Class : 0
##
Next, we show the results for the Random Forest based on the transformed data, using smoothing spline phase estimation with df = 10 as an example. Later, we will show how the training and testing accuracy trend as the degrees of freedom increase from 2 to 11. We can see that smoothing spline phase estimation with df = 10 gives a competitive classification accuracy compared to the true-phase Random Forest.
rf_train_data_ift <- c()
label_train <- c()
test_epoch_idx <- 4
for (i in setdiff(1:11, test_epoch_idx)) {
  rf_train_data_ift <- rbind(rf_train_data_ift, ift_splinePhase[i, , ])
  label_train <- c(label_train, comp_imp_tight106_UKBB_data[i, , 96])
}
rf_train_data_ift <- as.data.frame(rf_train_data_ift)
rf_train_label_ift <- as.factor(label_train)

set.seed(1234)
rf_Phase_ift_spline <- randomForest(rf_train_label_ift ~ ., data = rf_train_data_ift)
rf_Phase_ift_spline
##
## Call:
## randomForest(formula = rf_train_label_ift ~ ., data = rf_train_data_ift)
## Type of random forest: classification
## Number of trees: 500
## No. of variables tried at each split: 7
##
## OOB estimate of error rate: 19.56%
## Confusion matrix:
## 0 1 class.error
## 0 3855 608 0.1362312
## 1 1152 3385 0.2539123
pred_train <- predict(rf_Phase_ift_spline, type = "class")
confusionMatrix(pred_train, rf_train_label_ift)
## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 1
## 0 3855 1152
## 1 608 3385
##
## Accuracy : 0.8044
## 95% CI : (0.7961, 0.8126)
## No Information Rate : 0.5041
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.6093
##
## Mcnemar's Test P-Value : < 2.2e-16
##
## Sensitivity : 0.8638
## Specificity : 0.7461
## Pos Pred Value : 0.7699
## Neg Pred Value : 0.8477
## Prevalence : 0.4959
## Detection Rate : 0.4283
## Detection Prevalence : 0.5563
## Balanced Accuracy : 0.8049
##
## 'Positive' Class : 0
##
In contrast, with Average-Phase estimation, the model has poor classification performance. This also indicates the importance of the phase information.
rf_train_data_ift <- c()
label_train <- c()
test_epoch_idx <- 4
for (i in setdiff(1:11, test_epoch_idx)) {
  rf_train_data_ift <- rbind(rf_train_data_ift, ift_AvgPhase[i, , ])
  label_train <- c(label_train, comp_imp_tight106_UKBB_data[i, , 96])
}
rf_train_data_ift <- as.data.frame(rf_train_data_ift)
rf_train_label_ift <- as.factor(label_train)

set.seed(1234)
rf_Phase_ift_avg <- randomForest(rf_train_label_ift ~ ., data = rf_train_data_ift)
rf_Phase_ift_avg
##
## Call:
## randomForest(formula = rf_train_label_ift ~ ., data = rf_train_data_ift)
## Type of random forest: classification
## Number of trees: 500
## No. of variables tried at each split: 7
##
## OOB estimate of error rate: 46.69%
## Confusion matrix:
## 0 1 class.error
## 0 2325 2138 0.4790500
## 1 2064 2473 0.4549262
pred_train <- predict(rf_Phase_ift_avg, type = "class")
confusionMatrix(pred_train, rf_train_label_ift)
## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 1
## 0 2325 2064
## 1 2138 2473
##
## Accuracy : 0.5331
## 95% CI : (0.5227, 0.5435)
## No Information Rate : 0.5041
## P-Value [Acc > NIR] : 1.961e-08
##
## Kappa : 0.066
##
## Mcnemar's Test P-Value : 0.2601
##
## Sensitivity : 0.5210
## Specificity : 0.5451
## Pos Pred Value : 0.5297
## Neg Pred Value : 0.5363
## Prevalence : 0.4959
## Detection Rate : 0.2583
## Detection Prevalence : 0.4877
## Balanced Accuracy : 0.5330
##
## 'Positive' Class : 0
##
Finally, we visualize the training and testing accuracy of the inverse FFT-based Random Forest for different degrees of freedom in the smoothing spline phase estimation process. Note that we include the Nil-Phase and Average-Phase estimations as special cases with df = 0 and df = 1, respectively. The dashed horizontal line in the plot represents the testing accuracy of the Random Forest trained on the original data. This plot provides strong evidence of the importance of correctly estimating the kime-phases.
# helper function 1: inverse FFT using smoothing spline phase estimates with a given df
ift_df <- function(df) {
  set.seed(1234)
  ift_splinePhase <- array(NA, c(11, 900, 54))
  smooth_spline_value <- array(NA, c(11, 900, 54))
  for (i in 1:900) {
    for (j in 1:54) {
      smooth_spline_value[, i, j] <- smooth.spline(1:11, phase_FT_epochs_tight106_UKBB[, i, j], df = df)$y
    }
  }
  for (i in 1:11) {
    Real_Avg <- mag_FT_epochs_tight106_UKBB[i, , ] * cos(smooth_spline_value[i, , ])
    Imaginary_Avg <- mag_FT_epochs_tight106_UKBB[i, , ] * sin(smooth_spline_value[i, , ])
    ift_splinePhase[i, , ] <- Re(fft(Real_Avg + 1i*Imaginary_Avg, inverse = TRUE) /
                                   length(mag_FT_epochs_tight106_UKBB[i, , ]))
  }
  return(ift_splinePhase)
}
# helper function 2: compute model accuracy for a given test epoch
# (mainly developed for the cross-validation part not shown)
rf_model_ift <- function(ift_splinePhase, test_epoch_idx) {
  # dim(ift_splinePhase): 11, 900, 54
  rf_train_data_ift <- c()
  label_train <- c()
  for (i in setdiff(1:11, test_epoch_idx)) {
    rf_train_data_ift <- rbind(rf_train_data_ift, ift_splinePhase[i, , ])
    label_train <- c(label_train, comp_imp_tight106_UKBB_data[i, , 96])
  }
  rf_train_data_ift <- as.data.frame(rf_train_data_ift)
  rf_train_label_ift <- as.factor(label_train)
  # dim(rf_train_data_ift): 9000, 54

  set.seed(1234)
  rf_Phase_ift <- randomForest(rf_train_label_ift ~ ., data = rf_train_data_ift)
  # OOB_accuracy <- 1 - rf_Phase_ift$err.rate[, 1][500]  # 500 = number of trees

  pred_train <- predict(rf_Phase_ift, type = "class")
  train_accuracy <- sum(pred_train == rf_train_label_ift) / length(rf_train_label_ift)

  # test
  rf_test_data_ift <- ift_splinePhase[test_epoch_idx, , ]
  rf_test_label_ift <- as.factor(comp_imp_tight106_UKBB_data[test_epoch_idx, , 96])

  pred_test <- predict(rf_Phase_ift, rf_test_data_ift, type = "class")
  test_accuracy <- sum(pred_test == rf_test_label_ift) / length(rf_test_label_ift)

  return(c(train_accuracy, test_accuracy))
}
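As the comment in helper function 2 indicates, the same function supports the cross-validation mentioned earlier; a minimal sketch of that loop (not run here due to runtime, assuming ift_splinePhase from the df = 10 chunk is in scope):
# hypothetical cross-validation sketch: each of the 11 epochs serves once as the test set
cv_accuracy <- t(sapply(1:11, function(ep) rf_model_ift(ift_splinePhase, ep)))
colnames(cv_accuracy) <- c("train_accuracy", "test_accuracy")
summary(cv_accuracy[, "test_accuracy"])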
# This chunk is not set up to run due to long runtime; instead we load the pretrained results.
train_test_accuracy <- c()
for (df in 2:11) {
  train_test_accuracy <- rbind(train_test_accuracy, c(rf_model_ift(ift_df(df), 4), df = df))
}
train_test_accuracy <- as.data.frame(train_test_accuracy)
names(train_test_accuracy) <- c("train_accuracy", "test_accuracy", "df")
# add results from nil-phase and average-phase
train_test_accuracy <- rbind(train_test_accuracy, c(rf_model_ift(ift_AvgPhase, 4), 1))  # avg-phase
train_test_accuracy <- rbind(train_test_accuracy, c(rf_model_ift(ift_NilPhase, 4), 0))  # nil-phase
origin_acc <- rf_model_ift(comp_imp_tight106_UKBB_data[, , c(51:94, 97:106)], 4)  # original data
save(train_test_accuracy, origin_acc, file = "train_test_accuracy_testepoch4.RData")
library(ggplot2)
load("train_test_accuracy_testepoch4.RData")
accuracy_data <- reshape(train_test_accuracy, idvar = "df", varying = list(1:2),
                         v.names = "accuracy", direction = "long")
names(accuracy_data) <- c("df", "group", "accuracy")
accuracy_data$group <- factor(accuracy_data$group, levels = c(1, 2), labels = c("train", "test"))

ggplot(data = accuracy_data, aes(x = df, y = accuracy, group = group)) +
  geom_line(aes(color = group)) +
  geom_point(aes(shape = group, color = group)) +
  scale_x_continuous(breaks = accuracy_data$df) +
  geom_text(aes(label = round(accuracy, 2)), hjust = 0, vjust = 0.2) +
  ggtitle("Train & Test Accuracy of Inverse FFT-based RF with Different Degrees of Freedom") +
  geom_hline(yintercept = origin_acc[2], linetype = "dashed")
We further compare, in kime space, the characteristics of the two groups of participants, with and without depression. For better visualization, we show the kime-surface of one feature (“X4631.2.0: Ever unenthusiastic/disinterested for a whole week”), the feature most correlated with depression, as an example.
# data preparation
library(dplyr)

load("origin_after_imputation.RData")
set.seed(1234)
suffle_rid <- sample(1:nrow(comp_imp_tight106_UKBB_data))
comp_imp_tight106_UKBB_data <- as.data.frame(comp_imp_tight106_UKBB_data[suffle_rid[1:9900], ])

depressed_data <- filter(comp_imp_tight106_UKBB_data, X4598.2.0 == 1)       # 5007, 106
not_depressed_data <- filter(comp_imp_tight106_UKBB_data, X4598.2.0 == 0)

depressed_idx <- which(comp_imp_tight106_UKBB_data$X4598.2.0 == 1)
not_depressed_idx <- which(comp_imp_tight106_UKBB_data$X4598.2.0 == 0)

load("ift_splinePhase.RData")
dim(ift_splinePhase) <- c(9900, 54)
depressed_data_ift <- ift_splinePhase[depressed_idx, ]          # dim: 5007, 54
not_depressed_data_ift <- ift_splinePhase[not_depressed_idx, ]  # dim: 4893, 54

# divide into 110 epochs
comp_imp_tight106_UKBB_data_3D <- as.matrix(comp_imp_tight106_UKBB_data)
dim(comp_imp_tight106_UKBB_data_3D) <- c(110, 90, 106)
ift_splinePhase_3D <- ift_splinePhase
dim(ift_splinePhase_3D) <- c(110, 90, 54)

# col 45: (most salient predictor) X4631.2.0 (Ever unenthusiastic/disinterested for a whole week)
depressed_data_ift_avg_col45 <- c()
not_depressed_data_ift_avg_col45 <- c()
for (i in 1:110) {
  depressed_idx_epoch <- which(comp_imp_tight106_UKBB_data_3D[i, , 96] == 1)
  not_depressed_idx_epoch <- which(comp_imp_tight106_UKBB_data_3D[i, , 96] == 0)
  depressed_data_ift_avg_col45[i] <- mean(ift_splinePhase_3D[i, depressed_idx_epoch, 45])
  not_depressed_data_ift_avg_col45[i] <- mean(ift_splinePhase_3D[i, not_depressed_idx_epoch, 45])
}
We apply the NuLT function, which implements a numerical Laplace transform, to each of the two time-series, and we visualize the two resulting kime-surfaces.
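For orientation, NuLT numerically approximates the Laplace transform
\[ \mathcal{L}\{f\}(z) = \int_{0}^{\infty} f(t)\, e^{-z t}\, dt, \qquad z \in \mathbb{C}, \]
evaluated here on a \(50 \times 50\) grid of complex points \(z = x + iy\) constructed below.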
source('LT_ILT.R')
library(plotly)

x2 <- seq(from = 1, to = 11, length.out = 50)
y2 <- seq(from = -5, to = 5, length.out = 50)
XY <- expand.grid(X = x2, Y = y2)
complex_xy <- mapply(complex, real = XY$X, imaginary = XY$Y)
time_points <- seq(0 + 0.001, 2*pi, length.out = 110)

Y1 <- depressed_data_ift_avg_col45
kime_depressed <- NuLT(time_points, Y1, complex_xy, mirror = FALSE, range = 2*pi, k = 3)
dim(kime_depressed) <- c(length(x2), length(y2))
Y2 <- not_depressed_data_ift_avg_col45
kime_not_depressed <- NuLT(time_points, Y2, complex_xy, mirror = FALSE, range = 2*pi, k = 3)
dim(kime_not_depressed) <- c(length(x2), length(y2))

kime_grid1 <- kime_depressed
kime_grid2 <- kime_not_depressed
phase_depressed <- atan2(Im(kime_grid1), Re(kime_grid1))
magnitude_depressed <- sqrt(Re(kime_grid1)^2 + Im(kime_grid1)^2)
phase_not_depressed <- atan2(Im(kime_grid2), Re(kime_grid2))
magnitude_not_depressed <- sqrt(Re(kime_grid2)^2 + Im(kime_grid2)^2)

plot_ly(hoverinfo = "none", showscale = FALSE) %>%
  add_trace(z = magnitude_depressed, surfacecolor = phase_depressed, type = "surface",
            showlegend = TRUE, name = 'Depressed') %>%
  add_trace(z = magnitude_not_depressed, surfacecolor = phase_not_depressed, type = "surface",
            opacity = 0.7, showlegend = TRUE, name = 'Not Depressed') %>%
  # X4631.2.0 (Ever unenthusiastic/disinterested for a whole week)
  layout(title = "X4631.2.0 Kimesurface Difference: Depressed vs Not Depressed",
         showlegend = TRUE,
         scene = list(aspectmode = "manual",
                      aspectratio = list(x = 1, y = 1, z = 1.0)))
We further recover the time-series from the kime-surface by applying the continuous-time inverse Laplace transform (ctILT) to the NuLT function output.
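The inverse transform that ctILT approximates numerically is the Bromwich contour integral
\[ f(t) = \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} F(z)\, e^{z t}\, dz, \]
for a real constant \(c\) chosen to the right of all singularities of \(F\).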
z2_funct_depressed <- function(z) NuLT(time_points, depressed_data_ift_avg_col45, z, k = 3,
                                       mirror = FALSE, fitwarning = FALSE, range = 2*pi)
inv_result_depressed <- ctILT(z2_funct_depressed, nnt = 110, tend = 2*pi)
## [1] -2.980958e+03 2.980958e+03 -2.980956e+03 2.980935e+03 -2.980803e+03
## [6] 2.980144e+03 -2.977506e+03 2.968840e+03 -2.945008e+03 2.889400e+03
## [11] -2.778184e+03 2.586084e+03 -2.297934e+03 1.921121e+03 -1.490479e+03
## [16] 1.059837e+03 -6.830244e+02 3.948739e+02 -2.027736e+02 9.155766e+01
## [21] -3.594967e+01 1.211768e+01 -3.451502e+00 8.139694e-01 -1.545862e-01
## [26] 2.270959e-02 -2.420876e-03 1.665740e-04 -5.552467e-06
# plot to validate LT and ILT
plot(time_points, inv_result_depressed, col="blue",
xlim=c(0,2*pi))
lines(time_points,depressed_data_ift_avg_col45, col="red")
legend("top",c("recovered","data"), fill=c("blue","red"))
grid(nx = NULL, ny = NULL, col = "lightgray", lty = "dotted")
We plot the difference between the recovered and the original time-series to validate the LT and ILT process.
# difference between recovered and original time-series
plot(time_points, inv_result_depressed - depressed_data_ift_avg_col45,
     type = 'l', lty = 1, lwd = 1, col = 'blue', ylab = "diff",
     main = 'Difference of ILT recovery and original time-series')