To run the script, just do sbatch slurm.sh. Inside slurm.sh you can set the max running time (currently 8h), memory, number of cores per job, and how many jobs to run (currently set to 300).
To check job status: squeue -u yourusername
To cancel running jobs (e.g., if you realize you submitted them in error): scancel -s INT jobarraynumber (you get that number from the squeue command).
Project specific
Our files are in /N/project/cogai/hpc_bvc_fits
To run interactive jobs on the HPC
srun -p general -A general --pty bash
To sync files between HPC system and local compute
rsync -azP --ignore-existing ehnewman@quartz.uits.iu.edu:hpc_bvc_fits/BVC_EHREN_COLIN_COLLAB/cellFitResults/general_kfold_\* cellFitResults/.
To avoid recopying existing files, add --ignore-existing
If you want just a subset of the files, you can use this:
rsync -azP --ignore-existing --include='*/' --include='*.mat' --exclude='*' ehnewman@quartz.uits.iu.edu:hpc_bvc_fits/BVC_EHREN_COLIN_COLLAB/cellFitResults/ cellFitResults/
To download only _fullResults files:
rsync -azP --ignore-existing --include='*/' --include='*_fullResults.mat' --exclude='*' ehnewman@quartz.uits.iu.edu:hpc_bvc_fits/BVC_EHREN_COLIN_COLLAB/cellFitResults/ cellFitResults/
Weber-Fechner code is in answer_questions.m
How to assemble data: run extract... from general_..._analysis.
To launch parameter fitting:
Terminal: sbatch slurm_with_shifts_general.sh
To combine results:
Matlab: combineKFoldResults
To create _fullResults.mat files with all parameters and fit statistics:
Matlab: [DATA] = extract_data_all_general2
[contrasts] = testParametersStability(DATA,paramName)
plotParamEstimates(contrasts,paramName)
Compare NLL values across kernel types:
comparisons = compareWithShiftGeneralModelTypes(srchStr)
To make figures showing examples of fits:
Matlab: from docs_Inna/plot_model_fits (plots k=1 only) as of 240730
For plotting histograms over all types:
plotHistographOfParamsOverModels
reality check parameter fitting: check that model fitting can detect parameters we set
no timeshift or blur: Can it estimate spatial parameters
with timeshift but no blur: can it estimate timeshift
with timeshift and blur: can it detect blur
across_cells: Best fits per cell fitting all sessions simultaneously
2ndRound_with_shifts_general: Best fits per cell, unique per session
make sure fits exist for all cells/sessions
Does adding time-blur improve fit?
glme between {none / boxcar / gauss / exp}
Is one time-kernel best across all fits (boxcar vs. gauss vs. exp)?
Does adding time-shift improve fit of the best time blur kernel?
glme between <best_kernel>_shft0 & <best_kernel>_shft1
A priori questions/answers (quote from grant)
"Analyses of model parameters will be used to answer: 1) Is the encoding an exponential function of the past? We’ll accomplish this by testing if the log-likelihood scores are significantly better when the window of temporal encoding is modeled as an exGaussian rather than a Gaussian function. 2) Are the temporal scales logarithmically distributed? To address this, we will test if the improvement in model fit obtained by fitting neurons individually (i.e., unique parameter estimates per neuron) outweighs the cost of the additional degrees of freedom relative to a model that fits all neurons with a common set of parameters."
Regarding normalized vs. non-normalized NLL
The initial computation of nll was too dependent on non-spikes: fewer spikes = better nll.
When comparing within a session (e.g., different parameter sets for the same data) - it is important to use the non-normalized version so that it remains sensitive to the overall number of spikes. Without this, the sa1 and sa2 parameters don't reflect actual spike rates.
When comparing between sessions (e.g., goodness of fits of different models) - it is important to use the normalized version otherwise the dominant source of variance is the total number of spikes in each session and not the model.
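As a reference for this distinction, here is a minimal sketch (not the actual computeNLL code); it assumes a per-bin Bernoulli spiking model with predicted spike probability p and a 0/1 spike indicator spk of the same size:
    binNLL  = -(spk .* log(p) + (1 - spk) .* log(1 - p));       % per-bin negative log-likelihood
    nllRaw  = sum(binNLL);                                      % non-normalized: dominated by the many non-spike bins
    nllNorm = mean(binNLL(spk == 1)) + mean(binNLL(spk == 0));  % normalized: spike and non-spike bins weighted equally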
Comparison of spatial model types (gauss vs exgauss vs constant)
Smu
Ssigma
Tshift
Twidth
gauss vs exgauss vs constant
"the BVCs of the subiculum are unlikely to be the laplace cells, as they better resemble the inverse cells (given that the exp. component adds no value to the fitting of their firing"
code in answer_questions.
code in answer_questions.m
230417 - we spent considerably more time thinking about the analytical fits. Eventually I (Ehren) argued that we were wasting our time on this. This was based on the idea that we weren't doing true analytical fits so much as cobbling together ad hoc methods for approximating the curve without using analytically grounded or precedented methods. This meant that the results would be suspect regardless. Given that we had hoped to use these fits to build confidence in the maximum-likelihood fits, this was not going to fly. If there was a mismatch, our bias would have been to assume we still had not implemented these 'analytical fits' correctly. Indeed, the NLL scores from the ML approaches were considerably higher than the last set of results we obtained with our analytical approaches.
We also remembered that the last time we had been working on this, we left off trying to get the analytical fits to look reasonable by estimating the parameters a1 and a2. The catch had been that our methods were not working reasonably when there was a long positive tail in the firing, but were working decently well when the tail was short or the firing dropped off to zero.
Some cells have what look like bands of activity that don't track walls. We tried fitting these with overt bands but didn't find that these fits were any better.
When parameterizing the 'angle' that a cell is tuned to, there is a degeneracy in whether one wall or its opposing wall anchors the tuning in rectangular enclosures. Which wall is used to anchor the tuning has effects on the parameter that defines how far from the wall the cell is tuned (Smu) and the angle.
Circle enclosures break this symmetry but then the opposite wall often takes over
The same cell can have multiple angles fit to it
Across k-fold, there can be more or less variability in individual parameters over cells
missing 2 of 3 dark sessions
Missing others: Need to sort out what these are
Q: When tau is small (so the kernel is asymmetric), is this when the 'mean' parameter is 'outside of the picture' (meaning beyond the bound of the enclosure)?
More generally, we are trying to understand why is the 'mean' so often outside of the enclosure. How is the fitting using this to approximate the data?
Q: Another related question is whether, for the fits where the 'gauss' model put the mean outside of the enclosure, the fit changed or did something different when it was fit with the ex-gauss model? E: current guess, based on the fact that the exponential only changes the right side of the kernel (the part that is away from the enclosure when the mean is already past the right boundary of the enclosure), is that it doesn't change much.
compute auto-correlation
Notes from our call with Colin
why r1266_18303_T1c2_T2c1_cells_5i_5h CMBH_18303t.mat now has different tetrode-cell numbers
try multiple wall exponential (ask Zoran)
try band cell model
Done. Did not fit data better.
Might be interesting to think about adding some different weights to the spikes that happen at longer or shorter times after contacts with the wall
distribution of mean and variance will depend on which wall the cell is reacting to (nearest/furthest) in cases like 180/360 and 90/270 (e.g., cell 26)
6. We should discuss our results with Colin for implications for boundary coding.
5. We should discuss our results with Zoran for implications for the SITH framework.
4. We should consider fitting the other cells that were not flagged by Colin as being BVCs.
3. We will look at the distributions of the gaussian means & exponential decays. The core question is what distribution they might be sampled from. Is it possibly an exponential? The efficiency of the SITH framework rests on which exact distribution it is. There is discussion of this point in the grant write up.
2. Check for Weber-Fechner scaling. This involves plotting the ‘mean’ and ‘std’ parameters against each other and looking for increasing std as a function of mean. We’ll have to double check the exact predicted relationship. This will be important to include in any publication because this WF law is central to the SITH framework but it won’t be hugely noteworthy if we find evidence of the WF law. If we find it is not there, that would be more noteworthy:
fit with second order function x^2/a^2 - y^2/b^2 = 0 (see the sketch after this list)
use all the data for this X fit
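A minimal sketch of one way to do that X fit (mu and sd are hypothetical variable names for the fitted mean and std values): since x^2/a^2 - y^2/b^2 = 0 reduces to y = +/-(b/a)*x, only the ratio b/a is identifiable, so one option is to fit that slope directly by least squares.
    % Hedged sketch; mu and sd are assumed to hold the per-fit mean and std estimates.
    sFit = sum(sd(:) .* abs(mu(:))) / sum(mu(:).^2);   % least-squares slope b/a for sd = (b/a)*|mu|
    plot(mu, sd, '.', mu, sFit * abs(mu), 'r.')        % data cloud vs. the fitted 'X'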
221017 = end of day notes: we plotted all the relevant points for the 0:90:360 deg best fits but realized that each enclosure could have variable widths making the reflection (and thus the X we might expect to find) unreliable. To address this, we should scale the mean and std values by the width of the enclosure.
Scale the NLL values relative to the min and max of the current enclosure to account for differences across sessions
221205 = starting the day remembering that we have to double check where things stand now that we fixed it so that boundaries cannot start outside of the enclosure - the outcome was that, as expected, the mean can never be to the left of the left-most wall. It can, however, still be to the right of the right-most wall. We tried plotting the mean-vs-std and saw that the means cluster along the walls with widely varying standard deviations. There is no obvious evidence of Weber-Fechner scaling. We are still plagued, however, by uncertainty related to whether the mean should be based on the closest wall or on the left-most wall. The uncertainty comes from the fact that the enclosure is symmetric. The only viable reason (i.e., more than chance in the optimization of the NLL values) for there to be a difference between the left and right walls is that the mean can be beyond the right wall but not the left wall. We also compared the nll values for the 'best rotation' vs 'all rotations' to perform a sanity check that the best rotation is indeed better. The results were underwhelming--the histograms of these two types of nll values were not obviously different. Our interpretation is that there is more variation across cells in the ability to fit the cell at all than there is variation across rotations to fit the cell. This strikes me as a very bad thing, because if the best nll values for some cells are still lower than the nll obtained when the rotation is way off, this is not a good fit to begin with.
221207 - We accomplished three main things today.
First, we established that spike rate is an almost perfect predictor of the final best NLL. The lower the spike rate, the better the final NLL.
Second, we set up a way to sort the fit plots by the NLL so we can look across them and see how nll varies with basic features of the cell and the fits. Looking at these, the clearest thing we saw was consistent with what we saw with spike rate.
Third, assuming that the reason why spike rate was so tightly correlated with final NLL was that the model was getting more points by fitting where the cell didn't fire than where it did, we added weighting factors to the loss-function so that it would worry about spiking and non-spiking with equal weight. We then relaunched the analysis on BigRed200.
Do model selection? This will involve computing something like the Bayes Factor, which balances model complexity (# of degrees of freedom) with goodness of fit (nll). We don't expect the 'constant model' to do best, but it will provide a baseline against which we can compare the other models (e.g., with log likelihood ratios). It will be more interesting to see if adding the exponential term in the exGauss model provides a meaningful benefit over the gaussian model.
230920 - Today we looked at the fitting from the 'with_shifts'. The code allowed us to look for the Weber-Fechner scaling by plotting the mean and standard deviation over either A) all recordings over cells, or B) cells by collapsing over recordings per cell.
When we looked over enclosures, there were two clear clusters along the x-axis representing the 'distance from wall' or 'mean'. These clusters likely represented tunings that were anchored off the proximal wall versus against the distal wall. It didn't look very Weber-Fechner like because the far wall cluster also had low standard deviations.
When we plotted over cells, it looked much better. However, this is almost certainly for all the wrong reasons. There were no longer two clusters; instead the 'means' were broadly distributed along the x-axis. This likely was the result of averaging across fits wherein some were to the close wall and some were to the far wall. This type of averaging makes no sense. A more rational way of forcing the fits to converge to the same anchor points is required. To fix this, we should re-run the fitting by combining all of the data across recordings from a single cell into a single set of data to be fit by a single parameter set.
OTHER THINGS TO DO WHEN WE COME BACK:
Set up a comparison between 'with_angle', 'with_shifts', and 'with_angle_exp_ker' to compare nll. The key question being whether the _with shifts and _exp_ker add power to the fitting. - Done, see 231027
Look at the fits between cells with meaningful shifts (e.g., 500ms or greater) compared to the comparable _with_angle version to see if the difference is noticeable.
Run a more formal analysis of the shift and integration_width parameters to look at the distributions of these. (That is, in the same way that we looked at the distributions of mu values (asking if it is log distributed), look at the same for the time-shifts / time-widths).
231004 - Today we compared the NLLs of the fits that compare the spiking to a 'step-function' convolution of the animal position and fits that compare the spiking to a convolution of the animal position with a kernel that has an exponential decay function. Both had the possibility of having a time shift relative to a zero time lag. We also compared both of these to the prior analyses which did not do any convolution and simply modeled the spiking as a function of the instantaneous position.
We found that the step function kernel performed significantly better than the version with no kernel (i.e., instantaneous)
We found that the exponential function kernel performed significantly worse than both the version with no kernel and the version with the step function kernel.
We then began looking at the parameters selected for each:
For the exponential kernel version - We found that ~2/3rds of the fits selected exponential decay terms that were the smallest allowed, amounting to having no decay. This suggests that the fitting did best when the firing was modeled off of a continuously updated running average of the complete history of the animals' positions.
For the step function kernel version - We found that ~2/3rds of the fits used the maximum temporal offset (+1500) and about 1/8th of the fits used the minimum offset (-1500). This suggests that the best fit came from modeling the firing from the distant future or the distant past behavior. Neither makes much sense. We also found that ~2/3rds of the fits of time_width were set to zero. HOWEVER - this is suspicious because the range of allowable values was set to 1-1500, so how did it set values below 1? This likely indicates that we are not looking at the right number in the results structure when we try to see what time_width is. Ignoring this for now, the fact that so many fits use large offsets (as with the exp kernel version selecting small decay constants) suggests that we are doing something wrong. Why is the fitting so often selecting the least reasonable parameter values??? - SEE BELOW FOR RESOLUTION
231027 & 231025 - This week we looked carefully at the sessions that had notable (weird) parameter fits from the time_shift & time_width analysis.
The key observation was that there were many with the minimum and maximum values as well as some values which did not seem to be in the correct range.
The first thing we noticed is that the 'answer_questions' script was not always pulling values from the correct place in the results structure. Once we corrected this, the values were in the correct ranges (meaning we were looking at the wrong values for our parameters).
The next thing we noticed is that 9 out of ~224 sessions had the time_width parameter set to the maximum of 1500. This means that the spiking was being predicted based on an average position over > 30 seconds of past positions. It is possible that this is a collapsing of a long-tailed distribution onto the maximum of our range, or possibly degenerate behavior of the model.
To rule out degenerate behavior, we stepped through the model and carefully made sure that it was working as expected. It was. yay.
We then looked carefully at the fits for those 9 sessions and saw that they were quite reasonable. These sessions had a lot of diffuse activity distributed across the enclosure.
From these results we concluded that these large values are likely a collapsing of the long-tailed distribution. This likely means that this parameter estimate is inaccurate and we would get a better value if we allowed a wider range of possible values. Thus, we deleted these results and restarted the fit for them with a maximum possible value of 4000. This upper bound was set based on other limitations in how the code is written (i.e., how the Gaussian convolution kernel is built).
We also looked at the parameters for the time_shift and saw that there were 4 out of the total number of sessions that had the maximum value. Of these, 3 were included among the sessions that had maximum time_width values. For the last one, we also assumed that this is not a computation error but reflects an actual estimate of its parameter being greater than the max allowed value. Again, we deleted the file, updated the maximum range and restarted the fitting.
We then looked at the NLL values (because Ehren couldn't remember the results). As expected, adding additional parameters (for time_width and time_shift) improved the nll values. However, strangely, adding the exponential decay as the method for blurring time made the NLL values worse. This is strange because the model should be able to replicate the performance of the simpler model with no exponential decay.
We debugged this by looking carefully at the code for the exponential decay model generation. We found that the model had the kernel flipped so that it was effectively modeling "future position" rather than a blurred version of the past. We then fixed the code to allow it to either model the past or the future so that it could select the direction that works best. We then cleared all of the exponential fits and restarted the analysis.
Possible idea for a new temporal_width-related smoothing function - in addition to the boxcar and exponential that we have presently, we could use a Gauss or an exGauss, much as we did with spatial tuning. This might have smoother edges that could help with fitting sessions with non-zero time_shifts.
231030 - We looked at the fits from the with_shifts_exp_kern - they mostly look good and certainly look better than they did before we fixed the 'predicting the future only' issue by allowing it to look at the future and past.
We discussed how we might approach the statistics of answering whether a more complicated model fits the data better. One option is to consider something like the Bayes Factor. We can discuss this with Zoran. Another possibility is to look at generalization performance. For this, we would take the fits from each recording and use it to predict the spiking of other recordings in the same type of enclosure (e.g., square -> square) for the same cell. We would then compare the NLL values for the predicted spiking in other recordings across different model types. The most predictive fits would have the best NLL values when applied to other recordings of the same cells (assuming that the tuning functions of each cell don't change much from one recording to the next).
Not every cell looks like its tuning has been optimized. We should turn up the 'max tries'.
Other things to look at:
The NLLs for with_shifts_exp_kern compared to with_shifts and the basic model
The parameter distributions for with_shifts_exp_kern, inc: time_shift, lambda
231107 - We downloaded the ~10 fixed with_shifts results that were reaching the upper bound of the time_shift but didn't yet look at them to see what the new fits are. We also relaunched the with_shifts with maxTries = 20. We then spent the rest of the time considering what makes the most sense for how to consider the blurring and shifting of time. This was sticky and it raised a bunch of basic questions about how we should be thinking about what each model can or should be telling us. Here are some conclusions and open thoughts that came from this:
The boxcar results, though they have a time_shift parameter for the 'right' edge of the boxcar, should be interpreted keeping in mind that there is a set of time points that all contribute equally to predicting the firing of the cell. Any plotting should bear this in mind.
The with_shifts and with_shifts_exp_kern fits have the ability to implement 'non-local in time' tunings. That is, they could in principle have a gap between the time covered by the kernel as parameterized and the present state of the animal. An example of this for with_shifts would be a time_shift that is negative as this would leave a gap between the 'right' edge and zero. An example of this for with_shifts_exp_kern would be a negative lambda (making it sensitive to the history before the max of the kernel) and a negative time_shift as this would create a gap between the kernel's right edge and the present.
Ehren was wrapped up in the idea that it may make sense to ask whether the non-local-in-time aspect of these models substantially improves their fits. To test this, we could run variants of each of the models where time_shift was locked at zero. Inna pointed out that this expands the number of models by a lot and could hurt our ability to do meaningful stats. We agreed on this point. Inna also pointed out that the current models have the ability to have zero time shift if that is what is best. But as we discussed this further, we recognized that running variants with and without individual terms (e.g., time_shift) would allow us to quantify how much each term contributes to the model fitting. From this standpoint, we should run variants of the model both with time_shift=0 and with time_width=1.
A separate but related issue is in regards to whether to use Gaussian and/or Half Gaussian kernels to blur time.
The Gaussian functions can be considered to be a bit weird in that they fold in information from before and after the timepoint with the maximum predictive power of the cell firing. The oddness of this is that this can involve the future behavior of the animal in the case where time_shift=0 or is sufficiently close to zero. Ehren said that that may just be part of how cells work (i.e., having some element of the past and some prediction of the future). When asked why to consider the Gaussian, Ehren says that this is a tuning predicted by the SIPI framework as the inverse Laplace cells.
The Half-Gaussian would address the weirdness that the future can be sampled by the regular Gaussian. Ehren was uncomfortable about this creating a gap between the timepoint with the maximum predictive power of the cell firing and the present, but Inna pointed out that this was already true of both the with_shifts boxcar approach and the with_shifts_exp_kern exponential functions.
Considering this from the SIPI model framework - assuming that the model is taking place in time, then:
Laplace cells would be the exp_kern with no time shift
Inverse Laplace cells would be either the Gaussian or exGaussian with shifts into the past.
231108 - We looked at the parameter distributions for the with_shift and with_shift_exp_ker fits (see figures below). Some observations:
with_shift - AKA boxcar or uniform distribution of time smearing.
We see a clear positive relationship between the time_shift and time_width parameters. This indicates that when the firing is modeled by the past behavior, it takes into account a wider period of time.
Remarkably, the timescales of both the time_shift and time_width parameters were relatively large. The median time_shift values were in the 5 to 8 second range. Very few time shift fits were in the <2 second range, meaning that few represent the most recent behavior of the animal. We should double check that we understand how time_shift relates to the smoothing - e.g., does a time_shift of 2 seconds and a time_width of 4 seconds include the present or not?
with_shift_exp_ker - AKA exponential decay kernel of time smearing
We do not see any obvious relationships between the time_shift and decay rate parameters.
The timescales of the time_shift and time_width parameters are much smaller than those selected by the uniform distribution smearing.
For time_shift - There is still a bias away from the present, but with a smaller shift. The median time_shift parameter values are in the range of 2-5 sec into the past.
For decay rate - the values are generally very small, less than 1.5 - 2 seconds. This means that the kernel has decayed by ~66% within 1.5-2 seconds.
Comparing how time_shift and decay_rate align over cells, very few (if any?) sessions have decay rates slow enough, given their time_shift parameters, to substantially weight the 'present' (0 delay). Much more often, the decay rate is smaller than the time_shift or is negative (meaning the kernel extends further into the past, before its peak, which is itself already in the past). That is to say, there is a strong bias towards representing the past and not so much the present.
231129 - We decided to motivate the model generation with a design-matrix type approach, enabling us to perform contrasts between models that do and do not have various features. Per this approach, we have a partial 2x2x3 design, representing 2 levels of smoothing/convolution [with and without], 2 levels of time-shifting [with and without], and 3 types of smoothing function [exp, exGauss, Gauss]. To make running this many models practical, we updated the existing code to be more general. Whereas previously the type of smoothing kernel was effectively hard-coded into the scripts, we have updated it so that the model type and whether or not to smooth or timeshift are fully flexible. We got the code updated so that everything is general and the ML_fitAllCells_... function can control which models get run. It is set up to loop over all possible types but can also be directed to run one type versus another. These are all designated with filenames including the word 'general'. We did not test the code yet. We also did not implement any models beyond 'exp' at this point. To add more model types, create a function named 'prepareKern_<type>.m', where <type> is the model type, following the example provided by prepareKern_exp.m, and then insert a call to this function in 'setKernParams.m'.
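For reference, a hedged sketch of what such a function could look like (the signature, field names, and units here are assumptions; follow the real prepareKern_exp.m for the actual interface):
    function kern = prepareKern_gauss(kernParams, dt)
    % Hypothetical prepareKern_<type>.m example (interface assumed, not the real code).
    % kernParams.time_sig - temporal sigma in ms (assumed field name)
    % dt                  - sampling interval in ms
    sigBins = max(kernParams.time_sig / dt, eps);      % sigma expressed in bins
    t       = -ceil(4 * sigBins):ceil(4 * sigBins);    % support out to +/- 4 sigma
    kern    = exp(-t.^2 / (2 * sigBins^2));
    kern    = kern / sum(kern);                        % unit-area kernel
    end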
231130 - We changed the comparison scheme into a 2x4 table with 4 kernels (including no kernel), with or without shift. We finished writing the code for the 'none' kernel and the 'exp' kernel and started the job on bigred200. We started writing the 'gauss' kernel and need to set the boundaries of the parameters for 'time_sig'.
231208 - ELN notes on discussion with Zoran about efforts with Inna. Zoran took a copy of the code to audit. We haven't heard back from him yet. He also recommended several sanity checks:
Build synthetic data manipulating one or more of the coding principles (e.g., time_shift) and see that the fitting is capable of recovering the ground truth.
Use cross validation to test generalizability of the extracted parameters (e.g., k-fold cross validation). We discussed that we could do this by extracting from one session and applying to other sessions of the same cell. This would be most conservative. Could also break up the data from a single session into slices. Keep in mind, however, that time-shifts could blur the boundaries between slices.
Build null distributions by intentionally disrupting a feature of the spiking to see what nll values are obtained.
rerun code to see how NLL varies as we sweep over values for one parameter (e.g., t_shift)
231213 - We finished the code for gauss kernel and fixed the rest of the code for 3x2 table 'exp', 'gauss', 'none' with or without shifts models. We checked on Inna's computer that the code is running for all 6 models (on Ehren's computer 'particleswarm' gives an error). We ran the code on the bigred200.
240410 - We finished aggregating the results into tables. We looked at the mean and std of the nllTrn for each pair of with/without-shift models in two ways: as a group test and as a paired test. For the group test, all differences are insignificant. For the paired test, we need to make sure that we pair the result lines according to the session number.
240416 - We made it possible to compare the data tables directly row-by-row so that we can test for changes per session. This revealed significant effects. The table below shows something like the T-stat for each comparison (mean / std_err), such that any value with a magnitude larger than 1.96 is significant.
Here are some statements that can be made based on the above stats:
Adding shift significantly improves fit for exp kernel (entry 1,2) and no smoothing (entry 5,6) but only a trend towards an improvement for gauss (entry 3,4)
Best overall is the exp kernel with shift. This is significantly better than all other models except no smoothing with shift, relative to which it is only trending towards being better.
Worst overall is gauss kernel with no shift followed by gauss with shift.
NOTE: this table does not yet correct for model complexity (e.g., with AIC) nor are we dealing with multiple comparisons corrections.
240424 - We started doing the analysis using a glme in the script runglme.m and organized the data. The idea is that a glme might allow us to perform the statistical comparisons like an ANOVA would, while controlling for different numbers of observations per condition and handling the within-session-like design of our testing. But the function fitglme throws an error: "Error using . FIELDNAMES must be a string or a cell array of strings. Error in table2dataset (line 19): p = t.Properties;"
Okay, we got this working and here are the results (note: it uses exp kernel with no time shift as the 'intercept')
This shows that 'none' type kernel is not different than the 'exp' kernel and that 'gauss' is worse and that adding a time shift helps.
This shows that for both no smoothing and exp smoothing, the time shifts are consistently positive by several hundred milliseconds.
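For reference, a minimal sketch of the kind of fitglme call described above (the table and variable names tbl, nllTrn, kernType, hasShift, and sessionID are stand-ins, not the actual runglme.m code):
    tbl.kernType = reordercats(categorical(tbl.kernType), {'exp','none','gauss'});  % 'exp' first -> intercept
    tbl.hasShift = categorical(tbl.hasShift);
    glme = fitglme(tbl, 'nllTrn ~ kernType + hasShift + (1|sessionID)');            % session as a random effect
    disp(glme.Coefficients)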
240509 - Getting oriented on a todo list:
Re-implement the boxcar smoothing (see 'Old versions/with_shifts_boxcar_ker')
Check for smoothness of optimized values (use make synthetic data to vary one parameter around optimal set and plot resulting nll's checking for smoothness)
Check that the model fitting is working using synthetic data. (this involves generating synthetic spike trains with known tuning properties and checking that we can reclaim these using our existing fitting)
We made some progress on this by creating a script called artificialSpikingTestCases.m. This is almost working, except that the model2spk function was not generating the right number of spikes. This was due to a silly bug, which we fixed on 240514.
implement null models - this will show that the models capture spiking better than nonsense models and thereby show that the parameter fits are good. Note: this may be redundant with an approach where we locally vary individual parameter values to show smoothness around the local optimum.
240514 - Accomplishments
We computed the null distribution of nll values for a given model using a real trajectory across a range of numbers of spikes:
This shows that there are a range of nll values that could be gotten for random spike trains all generated from the same spiking model (M).
We thought about what this could be useful for and what we would expect if we compared this distribution to the nll value obtained with the real spike times for a cell that matches the model:
We expect that the empirical nll value would be worse, reflecting the idea that the model is not capturing all of the complexity of the cell spiking, whereas the null-distribution spikes are generated only by the model. Might this be useful for saying that we are not missing any 'significant additional variance' in the case where the empirical nll is not significantly worse?
Another way we thought this could be useful is for comparing whether spikes generated from one model produce significantly worse nll values when evaluated with a different model than would be produced by spikes generated from that 2nd model itself. For example, create spike rasters based on a model that includes time delays and then ask if the nlls that result when assessed with a model without time delays are significantly worse than the distribution of nlls generated by spike rasters made from the model without time delays. This could allow us to ask what sensitivity we have to the contribution of different parameters based on their magnitude and spike rates.
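A hedged sketch of how such a null distribution could be built (model2spk and computeNLL are the existing functions, but their signatures here are assumed):
    nIter   = 200;
    nullNLL = nan(nIter, 1);
    for i = 1:nIter
        spkNull    = model2spk(M, traj, nSpikes);        % random spikes consistent with model M (signature assumed)
        nullNLL(i) = computeNLL(M, traj, spkNull);       % NLL of M on its own synthetic spikes (signature assumed)
    end
    pct = 100 * mean(nullNLL <= empiricalNLL);           % fraction of null draws at least as good as the real spikes
                                                         % (smaller NLL = better)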
We examined how spike rate relates to the nll values obtained:
This shows how our confidence intervals on the nll value vary as a function of the number of spikes (i.e., the density with which we sample the probability distribution implicit to the spiking model).
We also coded up 'ArtificialSpikingTestCases.m' & 'checkModelStabilities.m'
'ArtificialSpikingTestCases.m' creates a fake model, generates a random spike raster that could have come from that model, and then tries to estimate the parameters of the model from that raster. This takes a long time (~20min) and the results were mixed.
Using the following parameters [0 1 0 25 0] = [a1 a2 mu sig ang], we got an nll= -0.9083 with params = [0.1739 0.5 5.8 33.5 -715 0]
Angle did the best - off by 5 degrees
a2 was fit to 0.5, the cap specified by the code. We had set it to 1. So it got as close as it could given the constraints.
a1 was fit to 0.174, though we had set it to 0. This is likely in compensation for the fact that it could not tune a2 as high as it wanted.
mu was fit to 5.8, though we had set it to 0. It is possible that this was also in compensation for the fact that a2 couldn't be tuned high enough. (Hard to know)
sig was fit to 33.5, though we had set it to 25. Again, this could be inflated due to the a2 parameter.
AS WE FINISHED: we restarted the model using parameters [0 0.1 0 25 0] = [a1 a2 mu sig ang] to see if having a lower a2 allows the rest of the parameters to fit more accurately. This is more realistic after all.
The parameter fits are: [0.1739 0.5000 6.0375 33.2 720 0] = [a1 a2 mu sig ang] giving an nll of -0.9082
240515 - Today we looked at how well the model is able to estimate various sets of preset parameters.
It was able to detect the time shift of 250 with precision of ~10ms. It also was okay at detecting no time shift (detected 0ms with precision of +/-15ms)
It was terrible at estimating a1 and a2. It consistently way overestimates these.
We spent a bunch of time thinking about how our model2spk.m function works and realized that it could be sampling spikes from distributions that are different from the model we feed it. This is based on mathematical arguments rather than an obvious empirical demonstration. We coded up two more variants (cases 2 & 3) which generate synthetic spikes in different ways. Case 3 is probably most true to the intended distribution since it samples it directly. However, it does not control the number of spikes generated.
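For reference, a hedged sketch of the 'sample the model directly' idea (roughly what case 3 is described as doing; the real model2spk.m code and variable names may differ):
    p   = modelRate * dt;            % per-bin spike probability implied by the model (assumed variables)
    spk = rand(size(p)) < p;         % independent Bernoulli draw per time bin
    % note: the total spike count is not fixed, which matches the caveat above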
Following up on this, on 240516 (ELN): I ran artificialSpikingTestCases.m with the same set of parameters ([0.1 0.3 100 25 90 250 0] = [a1 a2 mu sig ang time_shift time_width]) while either setting the time_shift range to [0 0] (preventing this variable from being fit) or to [-1500 1500] for 100 iterations with the same random spike raster.
In either case, a1 and a2 were set too high
[0,0]: a1 = mean(std) = 0.1856 (0.0001); a2 = 0.5 (0.0000)
[-1500,1500]: a1 = mean(std) = 0.2155 (0.0001); a2 = 0.5 (0.0000)
The version that was given room to fit time_shift, found the correct value ( time_shift = 251.40 (0.30) ) and had a better total nll score
[0,0]: nll = -1.0081 (0.0000)
[-1500,1500]: nll = -0.7440 (0.0004)
The rest of the parameters for both versions :
spatial mu (distance from wall) true vs. [0,0] vs [-1500,1500]: 100 vs. 97.67 (4.18) vs. 96.73 (3.56)
spatial sigma (variance around mu) true vs. [0,0] vs [-1500,1500]: 25 vs. 9.98 (0.05)** vs. 28.06 (0.02) **note, [0,0] version was wrong
angle (wall spiking is anchored to) true vs. [0,0] vs [-1500,1500]: 90 vs. 89.40 (2.16) vs. 89.96 (2.16)
I spoke with Zoran about why a1 and a2 could be systematically wrong. We agreed it is probably our weird normalized nll.
240517 -
I tried re-running the fitting with nll that sums over all time points with no normalization a few times
For the same target parameters as above:
0.1000 0.3000 100.0000 25.0000 90.0000 250.0000
The parameter fits were:
0.0000 0.0573 90.5406 4.3028 92.2309 251.3443
0.0000 0.0573 90.5418 4.3032 92.2311 251.2150
0.0000 0.0579 99.9828 4.2376 90.5211 251.1699
The a1 and a2 parameters were much lower suggesting that removing the normalization was impactful but now they were too low.
The spatial sigma was too low
Wondering whether a1 and a2 came out too low because I had subsampled 2K spikes out of what should have been ~114K spikes given the a1 and a2 values and trajectory length, I reran the spike randomization to produce the full expected number of spikes given the a1 and a2 values.
Again, using the same target parameters as above:
0.1000 0.3000 100.0000 25.0000 90.0000 250.0000
When time_shift was open to [-1500,1500]: the parameter fit was:
0.0989 0.2876 90.8181 25.1369 93.0249 251.8506
The fit is very good overall. a1 and a2 are very close to correct.
When time_shift was locked to [0,0]: the parameter fits were:
0.0983 0.2315 99.4730 31.9827 89.6890 0
0.0983 0.2314 99.7970 32.0638 85.3189 0
The fit of a1 and a2 is much better. The fit of a2 is still a bit low however.
Fit of sigma is a bit high. Perhaps due to lack of accounting for variability from the time_shift?
240521 -
I implemented the boxcar branch of the analysis
I also uploaded the version of computeNLL that does not perform the normalization (the version that gave the most accurate a1 and a2 values above)
Start with cartoon fit of a pretty BVC cell from Lever paper with nice round parameter values
Do a sweep across tShift values ranging from [-500 : 500] and, for each value, compute the expected NLL value for parameter fitting that could vs could not fit tShift (see the sketch after this list).
Plot NLL for each version (with vs without tShift fitting) as a function of tShift magnitude
Plot the estimated values for each of the tShifts
Expected result: NLL values are significantly better when tShift can be fit so long as tShift is sufficiently large. Most important is that it is significantly better for tShift values which the empirical tShift values fall into (~200ms).
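A hedged sketch of that sweep (makeSyntheticSpikes and fitModel are hypothetical stand-ins for the real generation and fitting code):
    tShifts  = -500:100:500;                    % ms, synthetic ground-truth shifts
    nllFixed = nan(size(tShifts));              % fitting with tShift locked at 0
    nllFree  = nan(size(tShifts));              % fitting with tShift free
    for i = 1:numel(tShifts)
        spk = makeSyntheticSpikes(trueParams, tShifts(i));        % hypothetical generator
        nllFixed(i) = fitModel(spk, 'tShiftRange', [0 0]);        % hypothetical fitter
        nllFree(i)  = fitModel(spk, 'tShiftRange', [-1500 1500]);
    end
    plot(tShifts, nllFixed, '-o', tShifts, nllFree, '-s')
    xlabel('synthetic tShift (ms)'); ylabel('NLL'); legend('tShift fixed at 0', 'tShift free')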
<<Results from doing the parameter mapping described above>>
Figure XX: Parameter fitting is sensitive to shifts in spike timing relative to behavior. NLL values are higher when the parameter fitting is allowed to add time shifts to the model of the cell spiking in cases when the spiking was generated to have non-zero time shifts. LEFT: Parameter fits were highly stable over twenty fits of the same spike train as indicated by the tight errorbars on a majority of the points. RIGHT: When spike trains were resampled for each of the twenty fits, the NLL values were somewhat more variable but were statistically higher when the fitting was permitted to fit the time shifts.
<<Replacement right part of above image>>
Using some tShift value in the range of the empirical values (e.g., 200ms), do a sweep over time_width blurring magnitude to show that we could detect it if it were there
FIRST THING: Figure out which time_width values are reasonable and map over these
We did this and found that it samples all values between -1500:1500 almost uniformly except at the boundaries, which it oversamples. We wondered whether the fitting is insensitive to this parameter (update: it was for boxcar!). This is what we are answering by doing the current analysis. . .
We then implemented the analysis to manipulate tWidth and test whether the fitting can reliably fit it and then uploaded and started this on the HPC.
As above, plot NLL for fitting that could vs. couldn't fit this parameter over the range of synthetic time_width values.
Repeat this for each of the types of kernels
Expected result: NLL is better when time_width can be fit. NOTE: at present we don't think there is any blurring in the data based on the results we have seen so far (exp was not significantly better than 'none' and 'gauss' was worst), so the point here is really just to say that we could have detected it had it been there, so long as it was at least _____ big.
NOTE: so long as the result remains that there is no time blurring, this should be sufficient. However, if it becomes the case that one of the kernels starts to beat 'none', then we will need to show that the kernel types are actually sensitive to the appropriate type of variance. We can do this by cross-validating each fitting on synthetic data made from each of the other kernels.
Demonstrate that we can fit:
a1 & a2
mu
sigma
angle
HOW: for each, take a few example parameters and show the mean and variance of our ability to fit them. This can be reported in a table but also with a figure that plots what the fit should have looked like and what the range of fits actually looked like (see below)
240730 - we met after several weeks of travel and re-booted.
We found that boxcar fits of the empirical data probably did not use time_width and restarted these
We loaded the _exp_shift0 results and looked through them casually and noted that many used very small Tlambda values. We believe that this causes LOTS of temporal smearing and were surprised that this should be so common. We started plotting the fits so that we could examine this. We checked the nll values for those with very small Tlambda values and saw that they were at least as good as those with larger values (less smearing?!?).
We plotted the log transform of the Tlambda values and saw that the very small values were not zero but a log-normal distribution around small positive numbers. When we plotted the inverse of this to look at the time constant tau, we saw that these corresponded to positive lags (into the future as we figure) of a few hundred milliseconds (generally less than 500).
We learned that our implementation with lambda was bad because large values (e.g., 10 and -10) are effectively the same (near-instantaneous decay), whereas values infinitesimally close to zero (e.g., 0.00001 and -0.00001) are discontinuous, producing very slow decays from either the future or the past.
The new version works by fitting a parameter that scales tau, the decay time constant. The key difference is that this fits the inverse of what we were fitting above. As such, whereas large positive and negative lambda values had yielded similarly instantaneous drop-offs while being numerically far apart, with the inverse these become numerically similar. And, conversely, whereas lambda values very close to zero had generated minimally decaying functions in opposite directions, creating a discontinuity in parameter space, with the inverse these become numerically very dissimilar.
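A small illustration of why the tau parameterization behaves better (the kernel form here is a stand-in; the real prepareKern code may differ):
    t = 0:1:5000;                                  % ms of lag; the sign of the parameter picks past vs. future
    kernFromLambda = @(lam) exp(-abs(lam) * t);    % lambda = +/-10: near-instant decay, yet numerically far apart
                                                   % lambda = +/-1e-5: near-flat kernels in opposite directions,
                                                   % i.e., a discontinuity in parameter space around 0
    kernFromTau    = @(tau) exp(-t / abs(tau));    % tau ~ 1/lambda: +/-0.1 are numerically close (both fast decay),
                                                   % while +/-1e5 are numerically far apart (both slow decay)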
240807
We restarted the exp fitting because few finished.
We downloaded the boxcar results that were re-run after fixing the t-width to be used.
We considered what should be done with the k-fold results:
Drop any single fit that is an apparent outlier (e.g., nll is outside of some window from the rest of the K); see the sketch after this list. Note: this is based on testing the ability of the fitting to reliably detect synthetic values.
Collapse across the remaining k so long as it varies minimally and smoothly
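A minimal sketch of that outlier rule (the threshold and field names are assumptions, not the actual findAndRemoveOutliers.m logic):
    nll   = [kfold.nllTrn];                              % one NLL per k-fold fit (assumed field name)
    keep  = abs(nll - median(nll)) <= 3 * mad(nll, 1);   % drop fits far outside the window around the rest of the K
    kfold = kfold(keep);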
240815
We babysat jobs on the HPC, they will hopefully finish for next time but they might also need to be split so that each 'k' is a separate job
We started implementing the outlier removal but again found evidence that the number of spikes impacts the outcome. We found that some of the k had lower nllTrn values than others, but those same k also had higher nllTst values. An explanation for this is that the number of spikes in each split varies, and having more or fewer spikes drives the nll values.
We started looking at whether we could use the normalized nll values instead to control for this but then saw that the nll values were all exactly the same. We need to look into how these were computed to make sure we aren't making a mistake.
We checked this at our next meeting (240819); see below.
240819
We found and fixed the problem with nllNorm
We implemented an outlier removal script (findAndRemoveOutliers.m)
We implemented a method for the fitting to step over subsets of k instead of all 10 so that it can finish in the 4 days
240822
We discussed how to test if parameters are stable over kFold and over Sessions for a given cell. This is summarized here:
In the image above, the idea is as follows. If we compute some comparison (e.g., mean squared difference) across all pairs (ordered by kFold sets and sessions across a cell), then we can take the mean difference across the k-fold fits as a lower bound on the differences we might expect to see (white larger box), and the mean difference between each entry and all entries for other cells as an upper bound (the blue box). We can then test whether the parameter fits across sessions for a given cell (green boxes) are more like the differences within a session or the differences across different cells.
We implemented this in a function called testParametersStability.m
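A hedged sketch of that comparison (the data layout and names here are assumptions, not the actual testParametersStability.m code):
    % p, sessID, cellID: one entry per k-fold fit (numeric column vectors; assumed layout)
    d        = abs(p - p');                              % all pairwise absolute differences in the parameter estimate
    sameSess = sessID == sessID';
    sameCell = cellID == cellID';
    withinKfold   = mean(d(sameSess & ~eye(numel(p))));  % lower bound: same session, different k-folds
    acrossSession = mean(d(sameCell & ~sameSess));       % test: same cell, different sessions
    acrossCells   = mean(d(~sameCell));                  % upper bound: fits from other cells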
240828.
Today we started to use testParametersStability to understand the stability of parameter fits within and between sessions across cells. The following figure best summarized the types of things we saw (made with plotParamEstimates):
In the top panel of this figure, we plot the mean differences across the k-fold fits for each session, clustered by cellID for the parameter Ssigma (the standard deviation of the spatial tuning (i.e., spatial width)). In the bottom panel, we plot the mean differences between the k-fold fits for a given session and the fits to all other sessions that were not from the same cell. The idea is that the bottom plot should show something like the null hypothesis - the mean differences across recordings for which there should be no consistency. (NOTE: data comes from Tkern = none, TshiftType = 0)
One of the first things we noticed is that there is high variability across sessions in the mean differences in the parameter estimate across k-fold. That is, for some sessions, there is huge variability while in others there is very little. This can be seen in the top plot. Interestingly, this isn't consistent across multiple recordings for the same cell. For example, the cluster shown around 9 has two sessions with high variability but the rest are very low.
We also see variability in the between group differences. We spent time discussing what this might tell us about the structure of the estimates. We identified a couple of different patterns:
low within group variability & low between group variability: This indicates consistent parameter estimates that resemble the estimates of most fits.
low within group variability & high between group variability: This indicates consistent parameter estimates that diverge from the estimates of most fits.
high within group variability & high between group variability: This indicates inconsistent parameter estimates that diverge from the estimates of most fits.
high within & low between group variability: (We don't see this much) this indicates inconsistent parameter estimates that nonetheless occupy the space of values commonly used to fit recordings.
Interesting next steps:
See what differences might exist in the conditions that lead to high versus low within group variability (e.g., enclosure type or testing conditions)
Look at what the actual parameter estimates look like across sessions, cells, etc.
Look at other parameters. Do the same sessions have high within group variability across multiple parameters?
Look at different TshiftTypes (with vs without) to see if these are consistent.
Look at different TKern values
Compare these values to something that we know isn't a BVC to see if those behave differently (it will take some thought to decide what can be used for this purpose: e.g., can we use other cells not labeled as BVC? should we generate our own? etc.)
Look at how within session-vs-self (top plot above) and session-vs-otherCellsSessions (bottom plot above) compare to cell-vs-self variability (not plotted above but added in figure below as grey bars)
POST MEETING NOTES:
ELN: I wrote a script to combine the kFold results. Some of the result files were corrupted and needed to be re-run.
We noticed that many of the fits used a maximum blur value and the fits don't look good. We should see if we can understand why. It may be due to the use of blur effectively changing the number of spikes - in the past, this has been a very strong predictor of the overall NLL values. Plot more fits and study them to understand what it is doing.
240919 - We brought the images above to Zoran and asked how we might make the parameter fits more stable over kFold. He didn't have any immediate suggestion. He agreed that it could be worth trying each parameter set on the other k-fold chunks of data to test whether differences were due to variability in the data or unreliability in the parameter estimation. The logic being that if each parameter set yields the best NLL score on its own k-fold chunk of data, this would indicate that the optimization is working but that differences in the chunks lead to different parameter estimates. On the other hand, if parameter sets from other k-fold chunks yield better NLL scores on the current chunk than its own 'optimized parameters,' then this would suggest that the optimization is not finding the global optimum.
We implemented this to be done in the extract_data_all_general2.m function as it is importing the data. (note: this is slow, taking hours to import all data).
The results strongly suggest that the optimization is not optimal. We found this by examining the rank scores given by applying each parameter set to all other k-fold chunks and then asking which parameter set worked best for each chunk. We tried this for four different types of nll: with vs without normalization (equating error for spike and non-spike bins) and train vs. test.
'paramSetNormXValRank_trn' = in 9.48% of the fits, the particle-swarm-selected parameters were actually the best for the k-fold chunk they were optimized for
'paramSetNormXValRank_tst' = in 8.43% of the fits, the particle-swarm-selected parameters were actually the best for the k-fold chunk they were optimized for
'paramSetXValRank_trn' = in 41.78% of the fits, the particle-swarm-selected parameters were actually the best for the k-fold chunk they were optimized for
'paramSetXValRank_tst' = in 5.24% of the fits, the particle-swarm-selected parameters were actually the best for the k-fold chunk they were optimized for
The 'best' performance appears for the non-normalized training nll (paramSetXValRank_trn); this makes sense because this is the actual value the particle swarm was optimizing. Yet, it is still less than half of the time that the 'optimized' parameters were actually the best for the current chunk of data. It is also concerning that a parameter set was the optimum for its own chunk less than 10% of the time, which is the chance value if which parameter set was best for each k-fold chunk were random. ** NOTE: the above results are for the 'exp' kernel with time shift (general_kfold_exp_sft1). The actual parameter values aren't totally reasonable due to time-shifts causing different amounts of data to be fit for each k-fold. Thus, there is a chance that some of the current issues result from doing weird things while fitting.
With these data we could also look at whether one of the parameter sets did the best broadly across the k-folds, suggesting it is the best overall. For this, we computed the rankSum for each parameter set across the k-folds. To examine the many computed parameter sets, we plotted the histogram of rankSum values. To anticipate what these can / should look like, consider two opposing possibilities. First, there could be an absolute best parameter set, a second best, a third best, etc. In this case, the rankSum values would be uniformly distributed from 10 to 100 (i.e., from the sum of 10 k-fold rank #1s up to the sum of 10 k-fold rank #10s). On the other hand, each parameter set might do about as well as each of the others. In this case, it would rank #1 as often as #5 as often as #10, so the rankSums would be sums over values 1-10 drawn with equal probability, leading to values consistently around 55. So, we looked at the following images to determine which was happening:
This plots the histogram of rankSum values for the general_kfold_exp_sft1 parameter set for each of the four types of nll scores. paramSetNormXValRank_trn has the most uniform distribution, suggesting that when the nll computed from the training set is normalized, there is such a thing as parameter sets that do consistently better (or worse) than the others. At the other end, paramSetXValRank_tst has the least uniform distribution, suggesting that it is effectively random which parameter set does the best on each k-fold. Note, it is possible that one could see this distribution in the case that each parameter set does the best on its own k-fold. However, we ruled this out in the analysis above, where we found that this is true only ~5% of the time with these nll scores.
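For reference, a hedged sketch of the rank computation behind these numbers (nllXVal is an assumed 10x10 matrix where entry (i,j) is the NLL of the parameter set fit on chunk i evaluated on chunk j):
    [~, idx] = sort(nllXVal, 1);              % within each chunk (column), order parameter sets by NLL (smaller = better)
    [~, rnk] = sort(idx, 1);                  % rnk(i,j): rank of parameter set i on chunk j
    ownRank    = diag(rnk);                   % rank of each parameter set on the chunk it was optimized for
    pctOwnBest = 100 * mean(ownRank == 1);    % how often the 'optimized' set is actually best on its own chunk
    rankSum    = sum(rnk, 2);                 % summed rank per parameter set across all 10 chunks
    histogram(rankSum)                        % uniform over 10..100 -> a consistent ordering of sets exists;
                                              % clustered near 55 -> no parameter set stands out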
We then tried this again for general_kfold_exp_sft0 and got effectively the same result; see below (graphs not shown, but they look similar again).
'paramSetNormXValRank_trn',...8.8614 %
'paramSetNormXValRank_tst',...9.5050 %
'paramSetXValRank_trn',... 52.7723 %
'paramSetXValRank_tst'... 1.8812%
We then tried this again for general_kfold_boxcar_sft1 and got effectively the same result; see below (graphs not shown, but they look similar again).
Percentages for each of the four nll types in same order: 9.0306 10.0510 37.2959 3.9796
We then tried this again for general_kfold_boxcar_sft0 and got effectively the same result; see below (graphs not shown, but they look similar again).
Percentages for each of the four nll types in same order: 9.1282 10.6154 33.7436 3.7436
We then tried this again for general_kfold_none_sft1 and got effectively the same result; see below (graphs not shown, but they look similar again).
Percentages for each of the four nll types in same order: 10.0526 9.2105 36.6842 2.9474
We then tried this again for general_kfold_none_sft0 and got effectively the same result; see below (graphs not shown, but they look similar again).
Percentages for each of the four nll types in same order: 10.0000 9.5288 53.2461 1.2042
240923 - We mostly did file management:
We updated 'combineKFoldResults' to run on the HPC and started combining what we have there
We updated 'extract_data_all_general2' to skip importing files that don't have k=10 worth of results and started extracting the data for the files we have
We started downloading the combinedKFold results
240924 - we looked at the results and performed the comparisons between the current (partial) results:
{'general_kfold_boxcar_sft0'}
{'general_kfold_boxcar_sft1'}
{'general_kfold_exp_sft0' }
{'general_kfold_exp_sft1' }
{'general_kfold_gauss_sft0' }
{'general_kfold_gauss_sft1' }
{'general_kfold_none_sft0' }
{'general_kfold_none_sft1' }
Diff_effect =
NaN 18.4401 -34.1301 17.0818 -34.0756 12.2057 -11.7936 1.3481
-18.4401 NaN -39.3860 -2.5070 -39.3966 -0.9004 -12.9220 -1.1431
34.1301 39.3860 NaN 39.4718 1.2383 28.5633 -2.1130 3.6263
-17.0818 2.5070 -39.4718 NaN -39.4340 2.2038 -13.6785 0.9381
34.0756 39.3966 -1.2383 39.4340 NaN 28.6706 -1.5131 3.7615
-12.2057 0.9004 -28.5633 -2.2038 -28.6706 NaN -13.8709 0
11.7936 12.9220 2.1130 13.6785 1.5131 13.8709 NaN 0
-1.3481 1.1431 -3.6263 -0.9381 -3.7615 0 0 NaN
Positive values mean the column model's NLL is bigger than the row model's.
NLL values are positive; smaller is better, bigger is worse.
pctMissingComps =
23.3929 28.0357 24.5536 27.8571 25.0893 62.2321 93.3929 99.8214
28.0357 20.2679 28.9286 32.0536 29.4643 66.6071 94.8214 99.8214
24.5536 28.9286 24.2857 28.7500 25.7143 62.2321 93.3929 99.8214
27.8571 32.0536 28.7500 27.5893 29.2857 65.8036 94.1071 99.8214
25.0893 29.4643 25.7143 29.2857 24.8214 62.2321 93.3929 99.8214
62.2321 66.6071 62.2321 65.8036 62.2321 62.2321 93.5714 100.0000
93.3929 94.8214 93.3929 94.1071 93.3929 93.5714 93.3929 100.0000
99.8214 99.8214 99.8214 99.8214 99.8214 100.0000 100.0000 99.8214
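For reference, a minimal sketch of how a Diff_effect-style matrix and the pctMissingComps matrix could be assembled; the assumed layout (nllPerModel as sessions x models, NaN where a fit is missing) is an illustration, not necessarily how compareWithShiftGeneralModelTypes does it:

nSessions = 50; nModels = 8;
nllPerModel = 100 * rand(nSessions, nModels);          % stand-in for real NLLs
nllPerModel(rand(size(nllPerModel)) > 0.7) = NaN;      % simulate missing fits
Diff_effect     = nan(nModels);
pctMissingComps = nan(nModels);
for r = 1:nModels
    for c = 1:nModels
        ok = ~isnan(nllPerModel(:,r)) & ~isnan(nllPerModel(:,c));
        pctMissingComps(r,c) = 100 * mean(~ok);
        if r ~= c && any(ok)
            % positive => column model's NLL is bigger (worse) than the row model's
            Diff_effect(r,c) = mean(nllPerModel(ok,c) - nllPerModel(ok,r));
        end
    end
end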
241001 - Today we're looking at the newest parameter fits. For the record, these should match the number of NaNs across k-folds and hopefully also across kernel types (though we haven't checked this second point). The optimization is based on the non-normalized NLL.
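A minimal sketch of what matching the NaNs means in practice (illustrative variable names, not the real pipeline): build one shared mask so every k-fold optimizes on exactly the same bins.

perFoldPredictors = rand(1000, 10);                              % bins x k-folds, stand-in data
perFoldPredictors(rand(size(perFoldPredictors)) > 0.98) = NaN;   % simulate scattered bad bins
sharedNanMask = any(isnan(perFoldPredictors), 2);                % drop a bin everywhere if it is bad in any fold
validBins = ~sharedNanMask;
fprintf('dropping %d of %d bins in every fold\n', sum(sharedNanMask), numel(sharedNanMask));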
For boxcar_sft0:
From this, we see that there are many recordings for which tWidth is 0 (~140 k-chunks), many for which it is between 1 and 1250 ms with a peak around 400 ms (~1500 k-chunks), and then a long tail out to about 2000 ms (~117 k-chunks). Finally, there are a number of roughly uniformly distributed k-chunks between 2000 ms and 15000 ms (~206, including a cluster at 15000).
We again examined whether there is evidence of an optimal parameter set from among the k-folds by computing rankSum scores from applying each parameter set to all other k-fold chunks and then asking which parameter set worked best for each chunk. We tried this for four different types of nll: with vs. without normalization (equating error for spike and non-spike bins), and train vs. test.
In order, we show the following values:
'paramSetNormXValRank_trn', 'paramSetNormXValRank_tst', 'paramSetXValRank_trn', 'paramSetXValRank_tst'
For general_kfold_boxcar_sft0 : 9.8387 9.2473 49.7849 2.2581
(NOTE: the paramSetXValRank_trn value is up to ~50% from ~34% before we matched the number of NaNs)
(ALSO: the distribution of rankSum values looks like it did before (above): the normalized train nll values show a uniform distribution of rankSums from 10 to 100, indicating that there truly are parameter sets that are best on the normalized data, which disregards spike rates, while the non-normalized train nll values cluster near 50, indicating there is less of a clear winner when we don't normalize out firing rates. We considered looking at the test nll values that come from applying each parameter set to all other k-folds, but decided these aren't valid 'test' sets because 9 out of 10 times these 'test sets' were actually used to establish the parameter estimates.)
If we look at the tWidth fits from only the parameter sets that have a normalized training rankSum of < 20 (only clear winners at cross-generalization), then the clusters at 0 ms and between 1 and 1250 ms remain but many of the higher (more uniformly spaced) values drop out (see plot on left below). On the other hand, if we only look at those with a rankSum of > 90 (only clear losers at cross-generalization), then we find a much higher proportion of the oddly uniform or very high tWidths (> 2000 ms or at 15000 ms; see plot on right below). Together, these figures suggest that the high values come from fits that failed to find an optimum.
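A small sketch of the winner/loser split used for those two plots (stand-in data; the real values come from the combined k-fold results):

nChunks = 2000;
tWidth  = abs(randn(nChunks,1)) * 500;        % stand-in for fitted tWidths (ms)
rankSum = randi([10 100], nChunks, 1);        % stand-in for normalized-train rankSums
winners = tWidth(rankSum < 20);               % clear winners at cross-generalization
losers  = tWidth(rankSum > 90);               % clear losers
subplot(1,2,1); histogram(winners); title('rankSum < 20');
subplot(1,2,2); histogram(losers);  title('rankSum > 90');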
Now we looked at the stability of Ssigma fitting across the fits. As a reminder, in these plots we show the mean difference in Ssigma between the k-folds (small bars, top left) or across sessions (gray bars, top left; blue bars, right) for each cell, and the mean difference in Ssigma between cells (bottom left; red, right). Ideally, the within-cell bars (top left; blue, right) should be smaller than the between-cell bars (bottom left; red, right), indicating stable fits. If the within-cell bars are higher than the between-cell bars, this suggests that the fits were more variable within a cell than they were across all cells (bad).
From this we see that there are some cells (e.g., 5, 6, 7, 11, 18, 30, 31, 34) that have very stable within-cell fits (good!) and that there are other cells (e.g., 4, 24, 26, 29, 32, 33) where the within-cell fits are at least as variable as the fits across cells (bad!).
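A crude sketch of the within- vs. between-cell comparison (the real plots use mean pairwise differences across k-folds and sessions; the layout and deviation measure here are simplified illustrations):

Ssigma = arrayfun(@(c) randn(10,1) + c, 1:30, 'UniformOutput', false);   % one vector of fits per cell (stand-in)
withinCell  = cellfun(@(v) mean(abs(v - mean(v))), Ssigma);              % spread among a cell's own fits
cellMeans   = cellfun(@mean, Ssigma);
betweenCell = mean(abs(cellMeans - mean(cellMeans)));                    % spread across the population
bar(withinCell); yline(betweenCell, 'r');                                % stable cells stay below the red line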
241002 - Today we reviewed this document to clean it up.
We decided to create two journals - one for testing parameter fitting and the other for parameter fitting of empirical data. Finally, we'll create a section for out-of-date stuff that we've done but that turned out to be wrong.
241021 - Today we nudged the HPC jobs along. In doing so, we found that the exp_sft1 jobs aren't finishing and tried to sort out why. That revealed some weird message we haven't seen before about raw_pos / fixPos. After trying to hunt down the source of the message we gave up and decided to just restart the jobs and see what comes of it. We also started collapsing across k-fold and extracting results for later use. Finally, we continued to clean up this log.
241031 - We looked at new results for gauss and exp kernels.
gauss with shift: all params reasonable besides Tsigma
gauss without shift: all ok
boxcar with shift: Sa2, Twidth, and Tshift lean towards the extreme values of their ranges. Ssigma is 0 for many sessions.
boxcar without shift: all ok except Ssigma --- again zeros
none with shift: _fullResults.mat is old? Or there are many problems with the parameters; see screenshots in the 'parameters distributions' folder alongside _fullResults.mat. We saw two things:
no peaks at 90/180/270/360 for rotDeg, instead uniform distribution
no peak at 200 for Smu, instead uniform distribution
none without shift: _fullResults.mat is old?
exp with shift: not enough results -- need to download them
exp without shift: don't know what the distribution for Tomega should look like, otherwise ok.
241113 -
Today we restarted jobs that need standardized nanmasks
We then looked at the parameter fits for gauss, boxcar, and exp. In the case of gauss and boxcar we looked at time_shifts, and for exp we looked at Tomega. In all three cases, the fits suggest that BVCs care about the future, not the past.
We looked for other work with indications of future coding and found a paper about grid cells published in Science (https://fujisawalab.riken.jp/publications/Ouchi2024.pdf). We looked through this paper and agreed that it could provide a template for how we will write our paper. There were several things their paper inspired us to think about doing:
plot the spatial tuning over different values of the time_shift / Tomega parameter for representative cells
plot a summary stat (they used gridness; maybe we can have a BVC index or just use NLL) over a wide range of time_shift / Tomega values (see the sketch after this list)
look at cells that don't obviously look like BVCs without time_shift to see if they start to look like BVCs.
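A minimal sketch of the time_shift sweep idea; evalModelNLL here is a hypothetical stand-in (a dummy quadratic) for whatever evaluates the model NLL at a fixed time_shift with the other parameters held at their best-fit values:

evalModelNLL = @(shift) (shift - 300).^2 / 1e4 + 100;    % dummy curve purely for illustration
timeShifts   = -2000:100:2000;                           % ms; negative = past, positive = future
nllByShift   = arrayfun(evalModelNLL, timeShifts);
plot(timeShifts, nllByShift);
xlabel('time shift (ms)'); ylabel('NLL (smaller is better)');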
Other things important to check
Does adding a time_shift improve NLLs? (it must for this result to be meaningful)
What time_shift does Exp use?
The SIPI framework hypothesizes that the brain encodes the world with many temporal scales. Those many temporal scales translate to many spatial scales when one assumes that time can be modulated by factors such as velocity, so that time spent moving at a given velocity tracks distance. This is hypothesis number 1 of the model. As a second assumption, SIPI suggests that the brain doesn't simply use a random set of different scales but, rather, uses a set of exponentially distributed scales. The rationale is that exponentially distributed scales can be shown to be 'optimal' in the sense that they provide 'scale-free encoding.' <<the specifics of this should be unpacked elsewhere>>. Thus, hypothesis number 2 of the model is that the scales, when quantified, should be exponentially distributed.
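One concrete reading of 'exponentially distributed scales' (an assumption about the intended spacing, not a quote from SIPI) is scales that are geometrically spaced, i.e., evenly spaced on a log axis:

nScales = 20;
tauMin  = 0.1;                              % smallest temporal scale (s)
ratio   = 1.4;                              % growth factor between neighboring scales
tau     = tauMin * ratio.^(0:nScales-1);    % tau_n = tauMin * ratio^n
semilogy(tau, 'o-'); xlabel('scale index'); ylabel('\tau (s)');   % a straight line on the log axis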
To address these hypotheses in the context of spatial encoding, we examined the scales of boundary vector cells (Lever et al., 2009).
Example of possible Weber-Fechner result
These show distributions of the spatial parameters, seeking evidence for whether there is a logarithmic distribution.
?? Spatial kernel type ?? (gauss vs exGauss vs constant)
Smu
Ssigma
Tshift
Twidth
Gauss vs ExGauss vs Constant
relevant code in answer_questions.m
Establish what tunings fit over all sessions for a single cell. This will force the same parameters for rectangles and circles, which may resolve the ambiguity about near- vs. far-wall anchoring.
Test if the improvement in model fit obtained by fitting neurons individually (i.e., unique parameter estimates per neuron) outweighs the cost of the additional degrees of freedom relative to a model that fits all neurons with a common set of parameters.
230329 - We were looking at how the addition of the exponential component of the model changes the alignment of the best-fit Gaussian to the enclosure. The goal was to better understand how the model was using the exponential. The point is that the exponential only manipulates the right half of the Gaussian, so any time the mean of the Gaussian is adjusted to be on the right side of the enclosure or past it, the exponential does little to change the shape of the curve.
We had been looking at how the mean changes, but in looking at the ex-Gauss kernels, we saw that the center of mass changes based on tau and the mean value. Nonetheless, the figure had looked like the figure shown above in this 'collapsible group'.
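A small sketch of how the exponential drags the center of mass of an ex-Gaussian kernel rightward; this uses the textbook construction (Gaussian convolved with a one-sided exponential), which may differ in detail from how our model composes them, and the parameter values are arbitrary:

x  = linspace(-100, 400, 2001);
dx = x(2) - x(1);
mu = 50; sigma = 20; tau = 80;
gaussPdf = exp(-(x - mu).^2 / (2*sigma^2)) / (sigma*sqrt(2*pi));
expPdf   = (x >= 0) .* exp(-x/tau) / tau;                % one-sided exponential of scale tau
exGauss  = conv(gaussPdf, expPdf, 'full') * dx;
xConv    = 2*x(1) + (0:numel(exGauss)-1) * dx;
comGauss   = sum(x .* gaussPdf) * dx;                    % ~mu
comExGauss = sum(xConv .* exGauss) * dx;                 % ~mu + tau
fprintf('center of mass: gauss %.1f, ex-gauss %.1f\n', comGauss, comExGauss);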
In the current version of the lag / integration-window analyses, we consider the past strictly as a function of time. That is, a window of size X only cares about the last X milliseconds and has no regard for what the animal did during that time. Perhaps this should account for velocity within the window, as it can in SIPI. To do this, each cell's activity would be modeled as a function of integrated velocity, which is distance. As such, the firing of each cell would be a function of where the animal had been in the last X centimeters instead of the last X milliseconds.
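A minimal sketch of the distance-window idea (illustrative variable names and stand-in data): integrate speed to get cumulative path length, then look back a fixed number of centimeters rather than milliseconds.

dt       = 0.02;                              % s per position sample
speed    = abs(randn(5000,1)) * 10;           % stand-in for running speed (cm/s)
pathLen  = cumsum(speed * dt);                % cumulative distance travelled (cm)
windowCm = 30;                                % look back 30 cm instead of a fixed time
t        = 4000;                              % some sample index
inWindow = (pathLen > pathLen(t) - windowCm) & ((1:numel(pathLen))' <= t);
% position samples in the last windowCm of travel would then feed the model at sample t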
We should consider whether an equivalent could be accomplished for heading.