A story to tell

Post date: Jul 30, 2012 6:34:12 PM

Distinguishability

Sometimes we want to tell whether a voxel is relevant or irrelevant to the stimulus category. Testing single-voxel classification accuracy is one way to find out, but it is expensive.

Mutual information (MI) is widely used but not normalized, which makes it inconvenient to compare across subjects and applications.
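To illustrate the normalization problem, here is a small sketch. It assumes the voxel responses have already been discretized (how MI was actually estimated is not described in this post), and the function name is made up for the example. MI between a voxel and the class label is bounded by the label entropy, which is log2(K) bits for K balanced classes, so the same "perfect" voxel gets a different MI depending on how many classes the experiment has.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """MI in bits between two equally long sequences of discrete values."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


# A perfectly informative (discretized) voxel under two toy designs:
# its value simply equals the class label.
labels_2 = [0, 1] * 4        # 2 balanced classes
labels_4 = [0, 1, 2, 3] * 2  # 4 balanced classes
mi_2 = mutual_information(labels_2, labels_2)  # reaches the bound log2(2)
mi_4 = mutual_information(labels_4, labels_4)  # reaches the bound log2(4)
```

Both voxels are equally "perfect", yet their raw MI values differ, which is why a measure bounded in [0, 1] is easier to compare.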

So I came up with a distinguishability measure (V) based on pairwise t-tests, which is fast to compute.
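The exact definition of V is not spelled out in this post, so the following is only a sketch under assumptions: V is taken to be the fraction of class pairs whose responses in a voxel differ according to a Welch t-test, and the test decision uses a fixed critical value t_crit on |t| as a stand-in for a proper p-value cutoff. The function names and the data layout are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two samples (assumes nonzero variance)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def distinguishability(responses, t_crit=2.0):
    """Fraction of class pairs that one voxel separates (assumed form of V).

    responses: dict mapping class label -> list of that voxel's responses.
    t_crit: assumed critical value for |t|, standing in for a p-value cutoff.
    """
    pairs = list(combinations(sorted(responses), 2))
    rejected = sum(
        abs(welch_t(responses[a], responses[b])) > t_crit for a, b in pairs
    )
    return rejected / len(pairs)


# Toy voxel: "face" responses differ from the other two classes,
# while "house" and "chair" overlap completely.
voxel = {
    "face": [2.1, 1.9, 2.0, 2.2, 1.8],
    "house": [0.1, -0.1, 0.0, 0.2, -0.2],
    "chair": [0.0, 0.1, -0.1, 0.15, -0.15],
}
v_value = distinguishability(voxel)  # 2 of the 3 pairs are separated
```

Under this reading V always lies between 0 and 1, and each voxel needs only K(K-1)/2 cheap tests for K classes instead of training and cross-validating a classifier.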

Distinguishability vs MI

It is nice to see how the V value, single-voxel accuracy, and MI correlate across voxels. The results show that:

V and MI are highly correlated, suggesting that MI can be replaced by V, as the latter is computationally cheaper and normalized.

corr(V, single-voxel accuracy) < corr(MI, single-voxel accuracy) when considering only voxels in VTC.

corr(V, single-voxel accuracy) > corr(MI, single-voxel accuracy) when considering voxels from the whole brain.
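For reference, this kind of comparison can be sketched as a plain Pearson correlation over per-voxel scores. Which correlation coefficient was actually used is not stated here, so Pearson is an assumption, and the numbers below are made-up toy values, not results from the experiments.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Made-up scores for five voxels (one entry per voxel).
v_scores = [0.0, 0.25, 0.5, 0.75, 1.0]
mi_scores = [0.1, 0.6, 1.1, 1.6, 2.1]   # exactly linear in V
acc_scores = [0.2, 0.1, 0.4, 0.3, 0.5]  # only roughly increasing with V

r_v_mi = pearson(v_scores, mi_scores)
r_v_acc = pearson(v_scores, acc_scores)
```

The same function would be applied once to the VTC voxels and once to all voxels in the brain to get the two comparisons listed above.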

We explored the distribution of the V values of voxels in both VTC and the whole brain (see "Brain Functional Distinguishability Measure") and found that:

whole brain: most voxels are irrelevant!

VTC: voxels are considerably more relevant than in the whole brain.

In both cases there are very few voxels with V>=0.8, which suggests that there are hardly any all-around relevant voxels.

There are also some relevant voxels outside the VTC, so the VTC contains only a subset of the relevant voxels.

Class-specific measure

It is interesting to see which region of the brain responds to which stimulus in particular.

So, based on V, I developed a class-specific V that tells how well a voxel can distinguish class k from the rest.

However, this measure only estimates how many classes are distinguishable using the voxel; it does not tell how well they can be distinguished (how far apart class k and the rest are).

This is where the class-specific MI would fill in, as it can tell how far class k is separated from the rest.

In fact, I plan to use the p-value of each test rather than the binary (0 or 1) hypothesis decision, so that V can handle this issue.
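Here is one possible reading of the class-specific V and of the planned p-value variant, written as a sketch. Everything in it is an assumption: the class-specific V for class k is taken as the fraction of other classes that the voxel separates from k, the "soft" variant averages 1 - p instead of the 0/1 decisions, and the p-value comes from a normal approximation to the t distribution to keep the code short. Names such as class_specific_v and alpha are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def approx_p_value(a, b):
    """Two-sided p-value for Welch's t statistic, using a normal
    approximation to the t distribution (an assumption made for brevity)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    t = (mean(a) - mean(b)) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


def class_specific_v(responses, k, alpha=0.05, soft=False):
    """Assumed class-specific V: how well one voxel separates class k.

    soft=False: fraction of other classes with p < alpha (0/1 decisions).
    soft=True: mean of (1 - p) over the other classes.
    """
    others = [c for c in responses if c != k]
    p_values = [approx_p_value(responses[k], responses[c]) for c in others]
    if soft:
        return mean(1.0 - p for p in p_values)
    return mean(1.0 if p < alpha else 0.0 for p in p_values)


# Toy voxel: "chair" is only slightly shifted away from "house".
voxel = {
    "face": [2.1, 1.9, 2.0, 2.2, 1.8],
    "house": [0.1, -0.1, 0.0, 0.2, -0.2],
    "chair": [0.2, 0.0, 0.1, 0.3, -0.1],
}
hard_house = class_specific_v(voxel, "house")
soft_house = class_specific_v(voxel, "house", soft=True)
```

In this toy case the 0/1 version counts the weak house/chair difference as zero, while the soft version still gives it partial credit, so the score reflects how clearly the classes are separated rather than only how many are.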

Distinguishability and classification

By thresholding the V value at V_thr, each voxel is labeled versatile when V>=V_thr and irrelevant otherwise.
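The thresholding step itself is simple. A minimal sketch, assuming the V values are stored in a list with one entry per voxel (the function name and the toy values are made up):

```python
def split_voxels(v_values, v_thr):
    """Split voxel indices into versatile (V >= v_thr) and irrelevant."""
    versatile = [i for i, v in enumerate(v_values) if v >= v_thr]
    irrelevant = [i for i, v in enumerate(v_values) if v < v_thr]
    return versatile, irrelevant


# Toy V values for five voxels.
v_values = [0.05, 0.3, 0.85, 0.2, 0.1]
versatile, irrelevant = split_voxels(v_values, v_thr=0.2)
```

The classifier is then trained on the columns listed in versatile (or in irrelevant, for the control experiment), and sweeping v_thr gives one accuracy value per threshold.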

Our experiment in "Classification using Versatile Voxels" shows that after the irrelevant voxels are filtered out by thresholding, the accuracy improves:

VTC: 87.5% to 96.25%

whole brain: 37.5% to 97.5%

Now we perform classification using only the irrelevant voxels:

whole brain: at V_thr = 0.2, the accuracy of the irrelevant voxels is as low as 0.1X. Even if we include more voxels by increasing V_thr to 0.9, the accuracy is still bad (0.4 at best). That suggests there is a very high level of noise, or many anti-informative voxels, in the data, which reduces the classification accuracy when they are included.

VTC: the irrelevant voxels (V<0.2) combined together can reach almost 0.5 (50%) accuracy, unlike in the whole brain, which supports the idea that there are noisy/anti-informative voxels in the brain outside VTC.

I also compared the accuracy curves of the versatile and the irrelevant voxels at the same threshold.

One interesting thing is that voxels with V<0.2 should not be able to give more than 0.2 accuracy when each is considered independently as a single voxel.

The result of 0.5 accuracy suggests that there might be synergy when all the voxels are used together.

Information in the brain is distributed

That prompted me to extend the experiment to see whether information in the brain is distributed rather than held in isolated pieces.

I showed this by first calculating the accuracy using each irrelevant voxel one at a time, then comparing it with the accuracy when using many irrelevant voxels at once.

The latter accuracy is significantly better, meaning that there is synergy between the voxels and thus that information is distributed in the brain.

An experiment was conducted by adding more voxels at random to the classifier (LR); for a given threshold value V_thr, the accuracy increases as the number of voxels increases.

This is true for all values of V_thr. In this experiment, we used only the voxels in VTC.
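The shape of these two experiments can be reproduced on synthetic data. The sketch below is not the fMRI analysis: the data are random Gaussian "voxels" that each carry the same weak class shift, and a nearest-centroid classifier is swapped in for LR so that the example needs only the standard library. All names and parameter values are made up for the illustration.

```python
import random


def make_data(n_samples, n_voxels, shift, rng):
    """Two balanced classes; every voxel carries the same weak mean shift."""
    data = []
    for i in range(n_samples):
        label = i % 2
        x = [rng.gauss(shift * label, 1.0) for _ in range(n_voxels)]
        data.append((x, label))
    return data


def centroid_accuracy(train, test, voxels):
    """Nearest-centroid test accuracy using only the given voxel indices."""
    cents = {}
    for lab in (0, 1):
        rows = [x for x, y in train if y == lab]
        cents[lab] = [sum(r[v] for r in rows) / len(rows) for v in voxels]
    correct = 0
    for x, y in test:
        dist = {
            lab: sum((x[v] - c) ** 2 for v, c in zip(voxels, cents[lab]))
            for lab in (0, 1)
        }
        correct += min(dist, key=dist.get) == y
    return correct / len(test)


rng = random.Random(0)  # fixed seed so the run is repeatable
n_voxels = 100
train = make_data(200, n_voxels, 0.3, rng)
test = make_data(200, n_voxels, 0.3, rng)

# Experiment 1: each weak voxel on its own, averaged over voxels.
single = [centroid_accuracy(train, test, [v]) for v in range(n_voxels)]
mean_single = sum(single) / n_voxels

# Experiment 2: add voxels in random order and track the accuracy.
order = list(range(n_voxels))
rng.shuffle(order)
curve = {n: centroid_accuracy(train, test, order[:n]) for n in (1, 10, 100)}
```

In this toy setting each voxel alone is barely better than chance, while the pooled set does much better. Note that this only shows weak, independent evidence adding up; it does not by itself separate that effect from genuine interactions between voxels.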

Now I restrict the added voxels to close neighbors of a seed point, populating the neighborhood from the k nearest neighbors (k-nn) of the seed point. The bigger k is, the better the accuracy we got, so the accuracy of the k-nn clusters supports the distributed information model.
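Building the neighborhood can be sketched as follows. This assumes that "close" means Euclidean distance between voxel coordinates and that the seed itself is not counted among its k neighbors; neither detail is stated above, and the function name and coordinates are made up.

```python
from math import dist


def knn_cluster(coords, seed, k):
    """Indices of the k voxels closest to the seed voxel (seed excluded)."""
    others = [i for i in range(len(coords)) if i != seed]
    others.sort(key=lambda i: dist(coords[i], coords[seed]))
    return others[:k]


# Toy voxel coordinates (x, y, z); voxel 0 is the seed point.
coords = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (3, 3, 3), (0, 0, 1), (5, 0, 0)]
cluster = knn_cluster(coords, seed=0, k=3)
```

The classifier is then trained on the seed plus the voxels in cluster, and the experiment is repeated for increasing k.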