stack.data.fnc

Copy, Paste and Adapt

stacked = stack.data.fnc(dat, between.factor=between.factor, within.factor=within.factor, col.start.rm=3)

stacked = stack.data.fnc(dat, between.factor=between.factor, within.factor=within.factor, col.start.rm=3, n.item=80)

Objective

In repeated measures designs, it is useful to stack all measures taken on the J x K repeated-measures (RM) conditions in a single column named dv. The experimental condition to which each value in this column belongs is tracked by other variables that the function generates automatically (subject, item, factors). With the stacked data the user can request many statistics and exploratory graphs for any combination of factors, regardless of the type of manipulation involved (within or between).

Stacking the data is mandatory for all mixed-model analyses and for the F1, F2 and minF ANOVAs.
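To make the idea of stacking concrete, here is a minimal base R sketch on a small made-up wide data set. It is not the toolbox's implementation of stack.data.fnc; the data frame wide, its values, and the column names a1 and a2 are illustrative assumptions.

```r
# Hypothetical wide data: one row per subject, one column per RM condition.
wide <- data.frame(a1 = c(10, 12, 11), a2 = c(14, 15, 13))

# Stack the RM columns into a single dv column and record, for each value,
# the subject and the condition it came from.
stacked <- data.frame(
  dv        = unlist(wide, use.names = FALSE),
  subject   = factor(rep(paste0("suj", seq_len(nrow(wide))), times = ncol(wide))),
  condition = factor(rep(names(wide), each = nrow(wide)))
)

print(stacked)
```

Each row of stacked now carries one measurement plus the labels needed to know where it came from, which is the shape that the graphing and analysis functions expect.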

Stack Data (Experiments without items)

We start with the OBrienKaiser data, a split-plot design 3 x 2 x 3 x 5 (treatment x gender x phase x hour). treatment and gender are between factors, included in the between.factor argument. The within (within-group) factors phase and hour are defined in the within.factor list.

We declare the structure of the design with the between.factor and within.factor objects. From the above figure we see that the repeated measures begin with the variable pre.1 in column 3.

between.factor= c('treatment','gender')

within.factor= list(
    phase = c('before','after','following'),
    hour= 1:5 )

Next we stack the data and assign it to dat.st (stacked):

dat.st=stack.data.fnc(OBrienKaiser,
    between.factor=between.factor,
    within.factor=within.factor,
    col.start.rm=3)

As our experiment does not contain items within each experimental condition, we simply pass three arguments: between.factor, within.factor and col.start.rm (the column where the repeated measures start).

Once the data are stacked, the function shows the header (first 6 records). The first column (dv) contains 240 values: 15 repeated measures (3 x 5) obtained from each of 16 participants. Six new variables were automatically created in this data set. These variables let us track the membership of each value in the dv column. As can be seen, the first value of dv (1) belongs to subject 1, at level before of the repeated-measures factor phase and at hour 1 of the other factor. This value also belongs to level control of the treatment factor and to the female gender. Finally, the function creates the condition variable, where the value control.F.before.1 clearly indicates the membership of that measure.

With the stacked data we can quickly and easily request many exploratory graphs of the RM experimental conditions, such as histograms and panel plots.
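The bookkeeping described above can be checked with a short base R sketch. It only rebuilds the condition labels and the expected number of stacked values from the within.factor list; it does not call stack.data.fnc. The assumption that the last factor in the list (hour) varies fastest across columns is based on the column order of OBrienKaiser (pre.1 to pre.5, then post.1, and so on), not on the function's documentation.

```r
within.factor <- list(phase = c("before", "after", "following"), hour = 1:5)
n.subjects <- 16  # participants in OBrienKaiser

# expand.grid varies its first argument fastest, so the list is reversed
# to make hour change fastest, then the columns are put back in order.
cells <- expand.grid(rev(within.factor), stringsAsFactors = FALSE)[names(within.factor)]
cells$condition <- paste(cells$phase, cells$hour, sep = ".")

nrow(cells)               # number of RM conditions (3 x 5)
nrow(cells) * n.subjects  # number of values in the stacked dv column
head(cells$condition)
```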

histogram.fnc(dat.st, which.factor='phase:hour', layout=c(5,3))

panelplot.fnc(dat.st, which.factor='phase:hour',
    by.panel='subject')

histogram

panelplot

Or ask for a table of descriptives for all experimental conditions:

descriptives.fnc(dat.st, dv='dv', graphic=T,
    which.factor='treatment:gender:phase:hour')

Stack Data (Experiments with non-repeated items)

We will read the external file length.txt, which corresponds to a simple experiment designed to test whether longer words are read more slowly than shorter words. The columns in the file are separated by tabs. We have a within factor (length) with two levels, short and long, and 60 items (30 per condition).

leng = read.file.fnc("length.txt", separator='tab')

*** The file length.txt has 14 rows and 60 variables ***
*** If the original data structure is variables x subjects ***
*** include the argument transposed=T. ***

   V1  V2  V3  V4  V5  V6  V7  V8    V9 V10 V11 V12 V13 V14 V15  V16 V17 V18
1 548 323 327 380  NA 326 838 333    NA 295 796 322 350 392 335  332 274 216
2 306 470 406 422 339 317 324 304   419 499 358 646 346 292 277  636 564 321
3 676 394 538 563 452 486 318 374   476 507 431 462 710 546 450  393 505 481
4 263 263 502 207 242 234 605 228   135 336 109 238 176 309 190  337 284 378
5 246 444 326 390 454 321 547 461 12565 466 431 299 599 350 357 1694 634 381
6 334 471 430 502 450 521 387 524   563 721 432 411 568 302 401  349 317 538

These are the reaction times of the first 6 subjects on the first 18 items. We are now going to stack this matrix, but first we define the within.factor object that characterizes this design.

within.factor=list(leng=c('short','long'))

We will assign the stacked data to leng.st (st is short for stacked). Note that the call includes the n.item argument. If we incorrectly omit it, the function assumes only two columns (one for each condition) instead of 60.

leng.st=stack.data.fnc(leng, within.factor=within.factor,
    n.item=60)

Error in col.start.rm:(n.col + col.start.rm - 1) : NA/NaN argument

Because we omitted the col.start.rm argument we get an error. We now include it and run the function again.

leng.st=stack.data.fnc(leng,
    within.factor=within.factor, n.item=60,
    col.start.rm=1)

#------------------------------------------------------------------
# STACK DATA
#------------------------------------------------------------------

*** This is the head of stacked data: ***

   dv  item subject  leng condition
1 548 item1    suj1 short     short
2 306 item1    suj2 short     short
3 676 item1    suj3 short     short
4 263 item1    suj4 short     short
5 246 item1    suj5 short     short
6 334 item1    suj6 short     short
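As a rough illustration of why n.item matters, the following base R sketch lays out how the 60 item columns could be mapped onto the two levels of the within factor. It assumes that the first 30 columns are the short items and the last 30 the long ones, which this page does not state explicitly; the object names are hypothetical.

```r
n.item <- 60
levels.length <- c("short", "long")
items.per.level <- n.item / length(levels.length)   # 30 items per condition

# One row per column of the wide matrix: its item label and its condition.
layout <- data.frame(
  column    = seq_len(n.item),
  item      = paste0("item", seq_len(n.item)),
  condition = rep(levels.length, each = items.per.level)
)

table(layout$condition)

# With 14 subjects, stacking gives 14 x 60 rows in the long format.
n.subjects <- 14
n.subjects * n.item
```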

Now we could, for example, request a histogram of reaction time (dv) in each within experimental condition, or a panel plot for each subject.

histogram.fnc(leng.st, which.factor='leng', check=T)

panelplot.fnc(leng.st, which.factor='leng',
    by.panel='subject',ylim=c(0,1500))

histogram

panelplot

In the histogram we can see a clear outlier for a "short" item and two extreme values in the other condition. 99% of the times are clearly below 2000 ms. For this reason we changed the axis limits of the panel plot (0-1500 ms). In this second figure we clearly see that reading times are longer for long words for all subjects except number 5, who should be studied in depth because he or she could be an extreme subject from the repeated-measures point of view.
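A quick numeric check can complement the plots. The sketch below uses the 18 reaction times listed earlier for subject 5 and flags those above a cutoff. The 2000 ms cutoff is taken from the discussion of the histogram; flagging by a fixed cutoff is an illustrative choice, not a procedure prescribed by the toolbox.

```r
# Reaction times of subject 5 on the first 18 items (from the data listing).
rt <- c(246, 444, 326, 390, 454, 321, 547, 461, 12565,
        466, 431, 299, 599, 350, 357, 1694, 634, 381)

cutoff <- 2000                      # ms
extreme <- which(rt > cutoff)       # positions of the extreme values
extreme
rt[extreme]

# The median is far less affected by the extreme value than the mean.
mean(rt)
median(rt)
```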

Stack Data (Experiments with repeated items)

Most psycholinguistic designs use different items for different experimental conditions. However, sometimes the same set of items passes through different experimental conditions. The next figure shows a design with two factors (factor B nested in factor A) and 20 items, where each set of 10 items passes through every level of factor B.

This is the procedure to stack the 30 x 40 matrix (20 items, each repeated in both levels of factor B). First, we create a vector with the item numbers in the order in which they repeat across the levels of that factor. Second, the make.item.name.fnc function renames the columns with the correct item names.

Next we create a simulated data set with 40 columns (20 items, each repeated twice):

library(MASS) # provides mvrnorm
dat=data.frame(mvrnorm(30,rep(0,40),Sigma=diag(1,40)))

We assign the item names, following the design structure shown in the previous figure:

dat=make.item.name.fnc(dat, n.item=c(1:10,1:10,11:20,11:20),
    col.start.item=1)

head(dat) # only the first 24 columns are shown

  item1 item2 item3 item4 item5 item6 item7 item8 item9 item10 item1 item2
1 -1.22 -1.20 -0.46  1.73  0.08 -0.78  0.40  0.72 -1.39  -0.01 -1.02  0.71
2  1.31  0.98 -1.61 -0.23  1.15  0.02  0.04  0.59 -0.18  -0.13  0.06 -0.32
3  1.69 -1.54 -0.37 -0.42  1.28 -0.05 -0.45  0.06 -0.52  -0.83 -1.58 -1.47
4 -0.32 -1.26 -1.02  1.44  0.92  1.64 -0.16 -1.38 -0.09  -0.45  1.39 -1.26
5  0.15  0.24 -1.32 -0.35  0.07 -1.54  2.64  0.35  0.88   0.94  0.63 -2.04
6  0.69 -0.73 -0.06 -1.76  0.29  0.56 -1.91 -0.48  0.35   1.15 -0.19  0.22

  item3 item4 item5 item6 item7 item8 item9 item10 item11 item12 item13 item14
1 -0.21 -1.18 -1.55  0.36 -0.54 -0.87  0.85  -0.61   0.78   0.53  -1.59   1.45
2  0.10  0.75  1.08 -0.41 -0.32 -1.12 -1.50   0.27   0.05   0.08   0.40   0.00
3  0.32  1.80 -0.34  0.24 -1.35 -1.18 -0.24   1.61  -0.24  -0.50   0.63   0.10
4  0.82  0.14 -0.62  0.29  0.29 -0.85 -0.66  -0.70   0.03   1.18  -0.09  -0.54
5  0.08  0.83 -0.46 -0.45  0.37  0.59  2.15  -1.34  -0.03  -0.34  -1.71  -0.24
6 -1.50  0.39  1.35  1.59  0.91 -3.20 -0.27   1.12  -1.46   0.58   0.24   1.18

You can see how the item names repeat according to the indicated sequence.
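The naming pattern can be reproduced in base R. This sketch only builds the vector of names from the item sequence; it does not reproduce make.item.name.fnc itself, whose internals are not shown on this page.

```r
item.seq <- c(1:10, 1:10, 11:20, 11:20)
item.names <- paste0("item", item.seq)

length(item.names)            # one name per column
item.names[c(1, 11, 21, 31)]  # first name of each block of ten columns
length(unique(item.names))    # number of distinct items
```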

within.factor=list(mrA=c('a1','a2'), mrB=c('b1','b2'))

dat.st= stack.data.fnc(dat, within.factor=within.factor,
    col.start.rm=1,
    n.item=40)

Note that although we have 20 items, we set the n.item argument to 40, because 40 columns define the whole design.

head(dat.st)

          dv  item subject mrA mrB condition
1 -0.2367242 item1    suj1  a1  b1     a1.b1
2 -0.4497758 item1    suj2  a1  b1     a1.b1
3  0.2195531 item1    suj3  a1  b1     a1.b1
4  0.9843178 item1    suj4  a1  b1     a1.b1
5 -0.5966198 item1    suj5  a1  b1     a1.b1
6  0.4662435 item1    suj6  a1  b1     a1.b1

tail(dat.st) # tail shows the last rows of the data set

            dv   item subject mrA mrB condition
1195 1.0356814 item20   suj25  a2  b2     a2.b2
1196 2.1248818 item20   suj26  a2  b2     a2.b2
1197 0.7438226 item20   suj27  a2  b2     a2.b2
1198 1.7708118 item20   suj28  a2  b2     a2.b2
1199 0.3146449 item20   suj29  a2  b2     a2.b2
1200 0.6525984 item20   suj30  a2  b2     a2.b2
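The layout of the 40 columns can be sketched in base R as follows. The assumption that mrA varies slowest across columns (a1.b1, a1.b2, a2.b1, a2.b2, ten columns each) is inferred from the item sequence and from the head and tail output, not from the function's documentation.

```r
within.factor <- list(mrA = c("a1", "a2"), mrB = c("b1", "b2"))
item.seq <- c(1:10, 1:10, 11:20, 11:20)

# Condition of each block of ten columns, with mrB varying fastest.
cells <- expand.grid(rev(within.factor), stringsAsFactors = FALSE)[names(within.factor)]
condition <- rep(paste(cells$mrA, cells$mrB, sep = "."), each = 10)

layout <- data.frame(item = paste0("item", item.seq), condition = condition)

# Each item appears under both levels of mrB but under one level of mrA.
table(layout$item, layout$condition)[c("item1", "item20"), ]

30 * nrow(layout)  # rows of the stacked data for 30 subjects
```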

We can now request the ANOVA by subjects and by items using the Anova.F1F2.fnc function, with the within.factor.item argument set to the name of the repeated-measures factor that all items pass through (mrB).

Anova.F1F2.fnc(dat.st, within.factor=within.factor,
    within.factor.item='mrB')
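For reference, once F1 (by subjects) and F2 (by items) are available, minF' can be computed by hand with the usual formula minF' = F1*F2/(F1+F2). The helper below is a hypothetical illustration and not part of the toolbox; the F values and degrees of freedom in the usage line are made up.

```r
# minF.sketch: hypothetical helper. F1 and F2 are the by-subject and by-item
# F ratios, df.num is their shared numerator df, and df1.err and df2.err are
# the error (denominator) df of F1 and F2 respectively.
minF.sketch <- function(F1, F2, df.num, df1.err, df2.err) {
  minF   <- (F1 * F2) / (F1 + F2)
  df.den <- (F1 + F2)^2 / (F1^2 / df2.err + F2^2 / df1.err)
  p      <- pf(minF, df.num, df.den, lower.tail = FALSE)
  c(minF = minF, df.num = df.num, df.den = df.den, p = p)
}

# Made-up values, for illustration only.
res <- minF.sketch(F1 = 10, F2 = 10, df.num = 1, df1.err = 20, df2.err = 20)
res
```

Because minF' is always smaller than both F1 and F2, an effect can be significant in both separate analyses and still fail this more conservative test.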
