Sparse regularization on lexical data
Here are some results from using L1- and L2-norm regularization with logistic regression (LR) on lexical data.
code: demo3.m
function: applies ridge, lasso, and elastic-net regularization with LR. The optimal parameters lambda and alpha are selected once for the whole dataset; that is, the parameters are shared across all runs.
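A minimal sketch of the kind of sweep demo3.m performs, written with MATLAB's lassoglm (Statistics and Machine Learning Toolbox). The grid values, variable names (Xtrain, yvalid, etc.), and the validation-accuracy selection rule here are illustrative assumptions, not the actual settings used:

```matlab
% Sketch: elastic-net-regularized logistic regression with one shared
% (alpha, lambda) grid. Xtrain/Xvalid are n-by-m data matrices
% (observations x voxels); ytrain/yvalid are binary label vectors.
alphas  = [0.25 0.5 0.75 1.0];   % alpha = 1 is the lasso; alpha near 0 approaches ridge
lambdas = logspace(-4, 0, 25);

bestAcc = -inf;
for a = alphas
    % lassoglm fits the whole lambda path for one alpha in a single call
    [B, FitInfo] = lassoglm(Xtrain, ytrain, 'binomial', ...
                            'Alpha', a, 'Lambda', lambdas);
    for j = 1:numel(FitInfo.Lambda)
        coef = [FitInfo.Intercept(j); B(:, j)];
        p    = glmval(coef, Xvalid, 'logit');    % validation probabilities
        acc  = mean((p > 0.5) == yvalid);
        if acc > bestAcc
            bestAcc    = acc;
            bestAlpha  = a;
            bestLambda = FitInfo.Lambda(j);
        end
    end
end
```

Note that lassoglm requires alpha in (0, 1], so a pure ridge fit (alpha = 0) would need a separate routine.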
Data matrices produced
The data matrix is stored in the directory:
/NAS_II/Projects/MVPA_Language/lexical/${subj}/stats_images_fmri
each with the name:
${data}_sexemplar_beta_tstat_matrix_${mask}.mat
where
${data} = 'animtool', 'mamnonmam', 'allmamnonmam'
${mask} = 'lh_mask_vtc2', 'mask_vtc2', 'gray_mask2'
The workflow and the data at each step
Raw data --> whole-scan data matrix --> task-specific data matrix --> sweep-parameter classification results --> best-parameter classification result for each subject
Raw data --->
some code (I don't remember which)
---> whole-scan data matrix --->
/mnt/home/kittipat/Dropbox/random_matlab_code/fMRI/lexical_project/rdm/
make_datamatrix_allmamnonmam.m
make_datamatrix_animtool.m
make_datamatrix_mamnonmam.m
---> task-specific data matrix --->
/mnt/home/kittipat/Dropbox/random_matlab_code/fMRI/lexical_project/lr_regularization/
experiments_animtool_4fold.m
experiments_mamnonmam_4fold.m
experiments_animtool_lou.m
experiments_mamnonmam_lou.m
---> sweep-parameter classification results --->
/mnt/home/kittipat/Dropbox/random_matlab_code/fMRI/lexical_project/lr_regularization/
summarize_out_4fold.m
summarize_out_lou.m (not completed yet)
---> the best results from a selected criterion
More details regarding the stored files.
raw data: the NIfTI files containing the beta coefficients for each observation.
dir: /NAS_II/Projects/MVPA_Language/lexical/${subject}/stats_images_fmri/{run,srun}${run_id}_${data_level}
file: ${exemplar_name}_{zstat,beta}.nii.gz
where
subject = {3211, 3402, 3424, ...}
run_id = {1, 2, 3, 4}
run = regular run; srun = smoothed run
data_level = {exemplar, item}
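One of these beta images can be loaded into a row of the data matrix roughly as follows; niftiread requires MATLAB R2017b or newer, and the exemplar name below is a placeholder:

```matlab
% Sketch: load one beta image and flatten it into a voxel vector.
subj = '3211';  run_id = 1;  data_level = 'exemplar';
f = sprintf(['/NAS_II/Projects/MVPA_Language/lexical/%s/' ...
             'stats_images_fmri/run%d_%s/%s_beta.nii.gz'], ...
            subj, run_id, data_level, 'dog');   % 'dog' is a made-up exemplar name
vol = niftiread(f);        % 3-D volume of beta coefficients
x   = double(vol(:))';     % one 1-by-m row of the data matrix
```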
whole-scan data matrix: the n-by-m data matrix, where n and m are the number of observations and the number of features, respectively; m is the number of voxels in the whole scan. Each row of the data matrix represents the brain response to one particular stimulus presented to the subject.
dir: /NAS_II/Projects/MVPA_Language/lexical/${subject}/stats_images_fmri/
file: filtered_${data_level}_beta_tstat_matrix.mat
data_level = {exemplar, sexemplar, item, sitem}
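A quick way to inspect one of these files; the variable name stored inside the .mat file is an assumption, so check it with whos first:

```matlab
% Sketch: inspect a whole-scan data matrix file.
subj = '3211';  data_level = 'sexemplar';
f = sprintf(['/NAS_II/Projects/MVPA_Language/lexical/%s/' ...
             'stats_images_fmri/filtered_%s_beta_tstat_matrix.mat'], ...
            subj, data_level);
whos('-file', f)            % list the variables actually stored in the file
S = load(f);                % S.<varname> should be n observations x m voxels
```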
task-mask-specific data matrix: this data matrix is trimmed by the choice of a specific task, e.g., "animals vs tools" or "mammals vs non-mammals", and the choice of mask, e.g., "lh_mask_vtc2", "mask_vtc2", "gray_mask2". So it is essentially a submatrix of the whole-scan data matrix.
dir: /NAS_II/Projects/MVPA_Language/lexical/${subject}/stats_images_fmri/
file: ${task}_${data_level}_beta_tstat_matrix_${mask}.mat
task = {animtool, mamnonmam, allmamnonmam}
data_level = {exemplar, sexemplar, item, sitem}
mask = {lh_mask_vtc2, mask_vtc2, gray_mask2}
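The relation between the two matrices can be sketched as plain row/column indexing; the variable names (mask_vol, labels, whole_matrix) are assumptions about the stored data:

```matlab
% Sketch: the task-mask-specific matrix is the whole-scan matrix with
% only the task's observations (rows) and the mask's voxels (columns).
mask_idx  = find(mask_vol(:) > 0);                    % voxels inside e.g. lh_mask_vtc2
task_rows = ismember(labels, {'mammal', 'nonmammal'}); % observations for this task
sub_matrix = whole_matrix(task_rows, mask_idx);        % submatrix of the whole scan
```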
sweep-parameter classification results
dir
file
sweep-parameter classification results: here we report the classification results, which include the train/validation/test accuracy for each (lambda, alpha) pair, the number of non-zero coefficients, etc. The optimal parameters and the best accuracy can be read off from here.
/NAS_II/Projects/MVPA_Language/lexical/${subject}/sparse_LR
${subject}_${mask}_combined_${task}_${cv_type}.mat
cv_type = {lou,4fold}
best-parameter classification result for each subject
dir
file
best-parameter classification result for each subject: we pick the optimal parameters from the previous step according to whichever criterion we like; the result here corresponds to the optimal parameters for each subject.
/NAS_II/Projects/MVPA_Language/lexical/${subject}/sparse_LR
${subject}_${mask}_${classifier_regu}_${criterion}_${task}_${cv_type}.mat
classifier_regu = {lr_lasso, lr_ridge, lr_none, lr_elnet}
criterion = {cri1, cri2} // criterion#1, #2
Experimental data produced
Each dataset can be processed in 2 ways:
- 4fold = 4-fold cross-validation, sweeping over alpha and lambda (both from 0 to 1) and reporting only the single best accuracy. We evaluate the train, validation, and test sets.
- file: ${subj}_${mask}_combined_${type}_4fold
- lou = leave-one-out, sweeping over alpha and lambda
- file: ${subj}_${mask}_combined_${type}_lou
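The two cross-validation schemes can be set up with MATLAB's cvpartition (Statistics and Machine Learning Toolbox); this is only a skeleton of the loop, with the parameter sweep left as a comment:

```matlab
% Sketch: the two CV schemes used on the same (alpha, lambda) sweep.
n     = size(X, 1);                      % number of observations
cv4   = cvpartition(n, 'KFold', 4);      % 4fold
cvLou = cvpartition(n, 'LeaveOut');      % lou: leave-one-observation-out

for k = 1:cv4.NumTestSets
    Xtr = X(training(cv4, k), :);  ytr = y(training(cv4, k));
    Xte = X(test(cv4, k), :);      yte = y(test(cv4, k));
    % ... sweep alpha and lambda here, recording train/validation/test
    %     accuracy and the number of non-zero coefficients per setting ...
end
```

Swapping cv4 for cvLou in the loop gives the leave-one-out variant.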
Official results#1: 4-fold cross validation
(using the code summarize_out_4fold.m)
There are 2 experiments:
- animals vs tools
- mammals vs nonmammals
Each reported on 3 masks
- lh_mask_vtc2
- mask_vtc2
- gray_mask2
There are 4 types of classifiers used in this experiment:
- LR+lasso
- LR+ridge
- LR+elastic net
- LR alone without regularization
For optimal parameter selection, I use 2 criteria:
- criterion#1: pick the parameters (lambda, alpha) based on the maximum validation accuracy alone. However, the consistency of the selected voxels across the 4 folds under this criterion is not desirable, so I propose another criterion.
- criterion#2: pick the parameters (lambda, alpha) such that 1) the train accuracy is greater than 0.85, 2) the number of non-zero coefficients (selected voxels) is greater than 20, 3) among those, the validation accuracy is greatest, and 4) ties are broken by preferring larger alpha and lambda, since a sparser solution gives greater interpretability.
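Criterion#2 can be sketched as a few vectorized filters over the sweep results; acc_train, acc_valid, nnz_coef, alpha_grid, and lambda_grid are assumed to be aligned column vectors, one entry per swept (alpha, lambda) setting:

```matlab
% Sketch of criterion#2 applied to the sweep results.
ok   = acc_train > 0.85 & nnz_coef > 20;      % rules 1) and 2)
best = max(acc_valid(ok));                    % rule 3): best validation accuracy
cand = find(ok & acc_valid == best);

% rule 4): tie-break toward larger alpha, then larger lambda (sparser fit)
[~, order] = sortrows([alpha_grid(cand), lambda_grid(cand)], [-1 -2]);
pick = cand(order(1));                        % index of the chosen (alpha, lambda)
```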
The reported results include:
- The optimally selected parameters alpha*, lambda* for the 2 criteria.
- The accuracies for the train, validation, and test sets: acc_train, acc_valid, acc_test
- The sparsity, in terms of the number of non-zero coefficients
- The consistency of the selected voxels across the 4 folds within the same subject. Consistency is measured by the fraction of folds in which a voxel is selected, reported at 6 levels:
- #intersect>0: number of voxels that appear in at least one fold
- #intersect>0.25: number of voxels that appear in more than 1/4 of the folds
- #intersect>0.5: number of voxels that appear in more than 1/2 of the folds
- #intersect>0.75: number of voxels that appear in more than 3/4 of the folds
- #intersect=1: number of voxels that appear in every fold
- #intersect=0: number of voxels that never appear in any fold
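These counts reduce to thresholding a per-voxel selection frequency; W is assumed to be an m-by-4 matrix of fitted coefficients, one column per fold:

```matlab
% Sketch of the consistency counts across folds.
freq = mean(W ~= 0, 2);          % fraction of folds that selected each voxel

n_any     = sum(freq > 0);       % #intersect > 0
n_quarter = sum(freq > 0.25);    % #intersect > 0.25
n_half    = sum(freq > 0.5);     % #intersect > 0.5
n_three4  = sum(freq > 0.75);    % #intersect > 0.75
n_all     = sum(freq == 1);      % #intersect = 1
n_never   = sum(freq == 0);      % #intersect = 0
```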
The data log from post-processing for each subject (generated by the code summarize_out_4fold.m).
The summary table for lexical data
The summary bar plot when averaging across subjects.
Official results#2: leave-one-observation-out cross validation
(using the code summarize_out_lou.m)
There are 2 experiments:
- animals vs tools
- mammals vs nonmammals
Each reported on 3 masks
- lh_mask_vtc2
- mask_vtc2
- gray_mask2
There are 4 types of classifiers used in this experiment:
- LR+lasso
- LR+ridge
- LR+elastic net
- LR alone without regularization
For optimal parameter selection, I use 2 criteria:
- criterion#1: pick the parameters (lambda, alpha) based on the maximum validation accuracy alone. However, the consistency of the selected voxels across folds under this criterion is not desirable, so I propose another criterion.
- criterion#2: pick the parameters (lambda, alpha) such that 1) the train accuracy is greater than 0.85, 2) the number of non-zero coefficients (selected voxels) is greater than 20, 3) among those, the validation accuracy is greatest, and 4) ties are broken by preferring larger alpha and lambda, since a sparser solution gives greater interpretability.
The reported results include:
- The optimally selected parameters alpha*, lambda* for the 2 criteria.
- The accuracies for the train, validation, and test sets: acc_train, acc_valid, acc_test
- The sparsity, in terms of the number of non-zero coefficients
- The consistency of the selected voxels across folds within the same subject. Consistency is measured by the fraction of folds in which a voxel is selected, reported at 6 levels:
- #intersect>0: number of voxels that appear in at least one fold
- #intersect>0.25: number of voxels that appear in more than 1/4 of the folds
- #intersect>0.5: number of voxels that appear in more than 1/2 of the folds
- #intersect>0.75: number of voxels that appear in more than 3/4 of the folds
- #intersect=1: number of voxels that appear in every fold
- #intersect=0: number of voxels that never appear in any fold