e060517spont_2_lambda

This page contains a detailed version of the appendix of the third STAR manuscript.

The "renewal test plot" (second plot) of the automatic analysis of the spontaneous activity of neuron 2 shows that the discharge cannot be described by a renewal process. We therefore try models depending not only on the elapsed time since the last spike but also on the duration of the last inter-spike intervals (isi).

The analysis presented in the sequel can be automatically replicated by downloading the R script file attached at the bottom of this page: e060515spont_2.R. Then start R from the directory where the file was downloaded and type:

> source("e060517spont_2.R", verbose=TRUE)

We give next comments on the most important commands. We start by attaching/loading the STAR package:

> library(STAR)

We then load the data set in our workspace:

> data(e060517spont)

In order to use gss functions, that is, the functions doing the actual work of implementing the smoothing spline approach, we have to format our data into a suitable data frame. This is done with function mkGLMdf:

> spontDF <- mkGLMdf(e060517spont,0.002,0,60)

Here the second argument 0.002 (second) is the bin length used to discretize the time axis and corresponds to the duration of the shortest isi observed in this train as reported in the preliminary automatic analysis report. This data frame contains data from the 3 neurons recorded simultaneously in this data set. To save memory we will work with the part of the data frame containing the data relevant for the predicition of the spike times of neuron 2:

> spontDFn2 <- spontDF[spontDF$neuron == "2",]

Before embarking into fitting a candidate model it is a good idea to look at the distribution of the variables we will include in it and , if necessary, to try to transform them in order to have a uniform distribution of the model's variables on their definition domains. Considering the variable containing the elapsed time since the last spike (called TmO bellow but called lN.2 in the spontDFn2 data frame we just generated) we show next the distributions obtained with 4 different "transformations":

The fourth root transformation generates clearly the most uniform distribution of the variable on its definition domain and will therefore be chosen here.

We next construct the variable containing the duration of the last isi:

> spontDFn2$d <- isi(spontDFn2,lag=1)

> spontDFn2 <- spontDFn2[!is.na(spontDFn2$d),]

The last command ensures that only defined values are contained in the data frame passed to gss functions. Carrying out on variable d the same kind of analysis just performed on TmO shows that the 6th root is the best but since the 4th is almost as good it will be used here for simplicity. The same results are moreover obtained using the 6th as you can easely check yourself by repeating this analysis, changing slightly the relevant commands in the script file. We therefore go on and define our working variables, TmO.fr and d.fr:

> spontDFn2$TmO.fr <- (spontDFn2$lN.2)^(1/4)

> spontDFn2$d.fr <- (spontDFn2$d)^(1/4)

Now remember that we fit half of the data and check the model on the other half, we therefore split our data frame in two parts, an "early" and a "late" part:

> spontDFn2e <- spontDFn2[spontDFn2$time <= 30,]

> spontDFn2l <- spontDFn2[spontDFn2$time > 30,]

We can now fit to the late part a model where our two variables, TmO.fr and d.fr,can interact, that is, a non additive model:

GF0 <- gssanova(event ~ TmO.fr*d.fr, data=spontDFn2l, family="binomial", seed=20061001)

Check the documentation of gssanova for a description of the last three argument if it is not immediately meaningful for you. After a few minutes the job is done and we can time transform the early part of the data using the model fitted on the late part:

> spont.n2e.tt0 <- GF0 %tt% spontDFn2e

The "extended Ogata's tests" shown bellow are obtained with (see the second STAR manuscript for details about these tests):

> plot(summary(spont.n2e.tt0),which=c(1,2,4,6))

The results look good so we next try simplifying our model moving to an additive one. A cool feature of gss is that we can already get a good idea of the necessity of any term in a model by using the built-in Kullback-Leibler projection test:

> project(GF0,include=c("TmO.fr","d.fr"))

$ratio

[1] 0.02728272

$kl

[1] 0.0002188074

$check

[1] 0.9996708

Here the critical component is the $ratio one suggesting that the interaction term is not necessary. We therefore go on refitting our data with an additive model:

> GF1 <- gssanova(event ~ TmO.fr+d.fr, data=spontDFn2l, family="binomial", seed=20061001)

Notice the change in the formula (first argument) compared to the first call: TmO.fr*d.fr, became: TmO.fr+d.fr. We again time transform the early part of the data and generated the extended Ogata's tests:

> spont.n2e.tt1 <- GF1 %tt% spontDFn2e

> plot(summary(spont.n2e.tt1),which=c(1,2,4,6))

The results look good again and we decide to keep this model.

We want next to look at the two terms of the model displaying them with confidence intervals. In order to get better estimates, we refit the whole data set with our "best model":

> GF2 <- gssanova(event ~ TmO.fr+d.fr, data=spontDFn2, family="binomial", seed=20061001)

and after a short sequence of commands (found in the script file) we get the following plot:

Notice the non linear tick spacing on the left column.

Looking at the "time since last spike" effect (upper row) we clearly see the refractory period as the early negative part of the graph (until roughly 5 ms). It is followed by an "excitatory effect" lasting till 120 ms (with a peak at 25 ms). This excitatory period seems to be followed by a slight inhibitory one lasting until 1s. The effect of the duration of the last isi (variable d, lower row) is excitatory between 0 and 37 ms and is roughly null afterwards. This excitatory effect might seem short lasting but 49% of the isi of this train are shorter or equal to this duration as can be checked with the following command:

> isiN2 <- diff(e060517spont[["neuron 2"]])

> sum(isiN2 <= 0.037)/length(isiN2)

This short lasting excitatory effect of the last isi duration is likely to be a key factor in generating the bursty aspect of the neuron's discharge. Notice that its peak value is comparable to the one of the "time since last spike" effect.

We next take a look at the preformances of a static model, as we call them in the third STAR manuscript, that is, an inhomogenous Poisson model (IPM). In this setting we cannot split anymore our data set in two parts and we have to fit our IPM to the whole set. We start by discretizing the time axis into bins of 500 ms (giving an average of 4 spikes per bin since the mean isi is: 123 ms) and we tally the spikes in each bin. These counts become the "raw data" to which the IPM is fitted:

mySeq1 <- seq(0,max(e060517spont[[2]])+.5,by=0.5)

pdat1 <- hist(e060517spont[[2]],mySeq1,plot=FALSE)[c("mids","counts")]

names(pdat1) <- c("time","counts")

PF0 <- gssanova(counts ~ time,data=pdat1,family="poisson")

predPF0 <- predict(PF0,data.frame(time=pdat1$time),se=TRUE)

After short manipulations described in the script file we get the following graph displaying the intensity of the inhomogenous Poisson proces together with 95% confidence intervals (dashed curves) and the "raw data" (circles).

We can use the estiamted intensity to time transform the train and apply our goodness of fit tests as shown bellow. Notice that in such a setting the "Wiener Process Test" (bottom right) is meaningless since by construction of the fitted curve predicts exactly the number of actually observed events. The red curve on the graph should therefore not be the realization of a Wiener process but of a Brownian bridge. Two of the test look evertheless still rather ugly, ruling out the relevance of an IPM for these data.

In order to get a better feel for the difference between the spike train "macro structure" captured by the IPM and the "micro structure" captured by the proper dynamical model, we plot the conditional intensity estimated by both of them for a short time preriod. The figure bellow diplays the graph of the dynamical model (continuous curve) and of the IPM (dashed curve). The abrupt downward inflexions of the dynamical model graph occur when a spike is observed. In fact the curve appears continuous due to a lack of time resolution, it is truly the graph of a left continuous right limited function (see Fig. 1A of the second STAR manuscript).