aggregate.data.fnc

Copy, Paste, Adapt

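# Mean of 'rt' per subject in each fa x fb cell
aggr.subject = aggregate.data.fnc(dat, dv='rt', which.factor=c('subject','fa','fb'))

# Mean of 'rt' per item in each fa x fb cell
aggr.item = aggregate.data.fnc(dat, dv='rt', which.factor=c('item','fa','fb'))

# Same aggregation by item, using the sum instead of the mean
aggr.item = aggregate.data.fnc(dat, dv='rt', which.factor=c('item','fa','fb'), statistics='sum')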

Objective

Generates summary statistics of a quantitative variable at the intersection of J user-defined factors. The most common use is aggregating by subject and by item, with the mean as the summary statistic.

Aggregate

We simulate a fictitious experiment in which 20 subjects are exposed to 20 stimuli (items) arranged in a fully repeated-measures A x B (2 x 2) design, with 5 items per condition.

library(MASS)  # mvrnorm() comes from the MASS package

dat=data.frame(mvrnorm(20,rep(0,20),Sigma=diag(1,20)))

head(dat[,1:5])

X1 X2 X3 X4 X5

1 1.29142446 0.4948692 -0.4428219 0.4840264 -2.0553614

2 0.06735783 1.2716655 -0.4561439 -1.6141761 0.8010648

3 1.28391676 -0.6396540 -1.3635776 -0.2230140 0.7595894

4 -1.00765876 0.5141792 -0.5008012 -0.4013213 0.5519320

5 -1.38675584 0.5778478 -1.8057209 0.5890165 0.1616928

6 0.50568643 1.5108605 -1.6978354 -1.5681316 0.2611976

With within.factor we declare the within-subject structure of our design.

within.factor=list(A=c('a1','a2'), B=c('b1','b2'))

We need to stack the data in order to properly handle the 20 repeated measures across the 4 experimental conditions.

dat.st=stack.data.fnc(dat, within.factor=within.factor, col.start.rm=1, n.item=20)

*** Head of your stacked data

dv item subject A B condition

1 1.5178825 item1 sub1 a1 b1 a1#b1

2 -1.5168830 item1 sub2 a1 b1 a1#b1

3 -1.3864176 item1 sub3 a1 b1 a1#b1

4 -1.5203952 item1 sub4 a1 b1 a1#b1

5 -0.3080751 item1 sub5 a1 b1 a1#b1

6 -0.2596997 item1 sub6 a1 b1 a1#b1
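The stacking step can be pictured with a base-R sketch. This is not the code of stack.data.fnc, whose internals are not shown here. The miniature data set (3 subjects, 8 items), the object names wide, conds, item.cond and long, and the rule that consecutive items share a condition are all assumptions chosen to mirror the output above.

```r
# Hypothetical miniature data: 3 subjects (rows) x 8 items (columns X1..X8)
set.seed(1)
wide <- data.frame(matrix(rnorm(24), nrow = 3))

within.factor <- list(A = c("a1", "a2"), B = c("b1", "b2"))

# Condition grid with B varying fastest: a1b1, a1b2, a2b1, a2b2
conds <- expand.grid(B = within.factor$B, A = within.factor$A,
                     stringsAsFactors = FALSE)[, c("A", "B")]

n.sub  <- nrow(wide)
n.item <- ncol(wide)

# Assumption: consecutive blocks of items belong to the same condition
item.cond <- rep(seq_len(nrow(conds)), each = n.item / nrow(conds))

# One row per subject x item; unlist() walks the columns (items) in order
long <- data.frame(
  dv      = unlist(wide, use.names = FALSE),
  item    = rep(paste0("item", seq_len(n.item)), each = n.sub),
  subject = rep(paste0("sub", seq_len(n.sub)), times = n.item),
  A       = rep(conds$A[item.cond], each = n.sub),
  B       = rep(conds$B[item.cond], each = n.sub)
)
long$condition <- paste(long$A, long$B, sep = "#")

head(long)
```

Each subject contributes one row per item, and every item carries the levels of A and B of the condition it belongs to.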

Now we can aggregate the data from our experiment. We first aggregate by subject, estimating the mean of each subject in each of the four experimental conditions, and then by item.

aggr.subject=aggregate.data.fnc(dat.st, which.factor=c('subject','A','B'))

aggr.item=aggregate.data.fnc(dat.st, which.factor=c('item','A','B'))

The which.factor argument lists, in order, the factors by which the data are aggregated. By default the aggregation statistic is the mean, but through the statistics argument the user may instead request the sum, the standard deviation or the number of observations, with the values 'sum', 'dt' and 'n' respectively.

*** This is the head of your data aggregated by mean statistic ***

subject A B dv

1 sub1 a1 b1 0.5028665

21 sub1 a2 b1 0.1032125

41 sub1 a1 b2 0.1516793

61 sub1 a2 b2 0.4537608

2 sub10 a1 b1 -0.5041226

22 sub10 a2 b1 0.1702154

*** This is the head of your data aggregated by mean statistic ***

item A B dv

1 item1 a1 b1 0.12288472

11 item10 a1 b2 -0.05703570

6 item11 a2 b1 0.10010628

7 item12 a2 b1 0.09607783

8 item13 a2 b1 0.01986696

9 item14 a2 b1 0.18900893
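What this kind of aggregation amounts to can be sketched with base R's aggregate(). The helper aggregate_by below is hypothetical and is not the toolbox function; it only reuses the labels 'sum', 'dt' and 'n' mentioned above. The toy data (2 subjects, 2 observations per A x B cell) are invented for illustration.

```r
# Hypothetical stacked data: 2 subjects x (A x B) x 2 observations per cell
long <- expand.grid(rep = 1:2, B = c("b1", "b2"), A = c("a1", "a2"),
                    subject = c("sub1", "sub2"), stringsAsFactors = FALSE)
long$dv <- seq_len(nrow(long))   # deterministic values 1..16

# Hypothetical helper: collapse `dv` over the cells defined by `which.factor`
aggregate_by <- function(data, dv, which.factor, statistics = "mean") {
  fun <- switch(statistics,
                mean = mean,
                sum  = sum,
                dt   = sd,       # standard deviation
                n    = length,   # number of observations
                stop("unknown statistic: ", statistics))
  aggregate(data[dv], by = data[which.factor], FUN = fun)
}

by.subject.mean <- aggregate_by(long, "dv", c("subject", "A", "B"))
by.subject.sum  <- aggregate_by(long, "dv", c("subject", "A", "B"), "sum")
by.subject.n    <- aggregate_by(long, "dv", c("subject", "A", "B"), "n")

head(by.subject.mean)
```

The result has one row per combination of the listed factors, which is the stacked layout shown in the output above.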

Note that the aggregated data are stacked by subject: the four measures of each subject occupy four rows. Because we will use Anova.fnc, these data must be "unstacked" so that each subject fills one row of four columns (A x B, 2 x 2). We use the unstack.data.fnc function for this.

dat.by.subjects=unstack.data.fnc(aggr.subject, within.factor=within.factor)

*** This is header of your unstacked data ***

a1.b1 a1.b2 a2.b1 a2.b2

1 0.5028665 0.1516793 0.10321254 0.4537608257

2 -0.5041226 0.2701002 0.17021537 0.6514121143

3 -0.1626694 -0.6595365 -0.35134359 -0.0001791568

4 0.2556698 -0.3524445 0.22970505 0.1882953943

5 0.8032216 0.2896018 -0.05547946 -0.0210073675

6 0.3013787 -0.4888167 0.83278611 -0.7162498370
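The move from the stacked to the unstacked layout can be illustrated in base R with tapply(). This is only a sketch of the reshaping idea, not the implementation of unstack.data.fnc; the data frame agg and its values are made up.

```r
# Hypothetical aggregated data: one row per subject x A x B cell
agg <- expand.grid(subject = c("sub1", "sub2", "sub3"),
                   A = c("a1", "a2"), B = c("b1", "b2"),
                   stringsAsFactors = FALSE)
agg$dv <- seq_len(nrow(agg)) / 10

# Label each cell of the design, e.g. "a1.b1"
agg$condition <- paste(agg$A, agg$B, sep = ".")

# Pivot: rows = subjects, columns = conditions.
# Each cell holds a single value, so mean() simply returns it.
wide <- as.data.frame(tapply(agg$dv, list(agg$subject, agg$condition), mean))

wide
```

The columns come out in alphabetical order (a1.b1, a1.b2, a2.b1, a2.b2), which matches the column order of the unstacked output above.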

We can perform the analysis of variance of our data aggregated by subject.

within.factor=list(A=c('a1','a2'), B=c('b1','b2'))

Anova.fnc(dat.by.subjects, within.factor=within.factor, col.start.rm=1)

Univariate Type III Repeated-Measures ANOVA Assuming Sphericity

            SS        num Df  Error SS  den Df  F       Pr(>F)

(Intercept) 0.010565  1       5.4339    19      0.0369  0.8496

A           0.082717  1       2.5356    19      0.6198  0.4408

B           0.008070  1       4.3604    19      0.0352  0.8532

A:B         0.052527  1       4.7680    19      0.2093  0.6525
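For readers without the toolbox at hand, a comparable by-subject analysis can be sketched with base R's aov() and an Error() term, working directly on stacked data. This is a stand-in, not what Anova.fnc does internally. The simulated data frame agg is hypothetical, so its F values will not reproduce the table above.

```r
# Hypothetical subject means: 20 subjects x A (2 levels) x B (2 levels)
set.seed(3)
agg <- expand.grid(subject = factor(paste0("sub", 1:20)),
                   A = factor(c("a1", "a2")),
                   B = factor(c("b1", "b2")))
agg$dv <- rnorm(nrow(agg))

# Two-way repeated-measures ANOVA: every effect is tested against
# its own subject-by-effect error stratum
fit <- aov(dv ~ A * B + Error(subject / (A * B)), data = agg)

summary(fit)
```

With 20 subjects and two-level factors, each within-subject effect has 1 numerator and 19 denominator degrees of freedom, as in the table above.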

If you look at the data aggregated by item, you will see that the design is entirely between-groups: items 1 to 5 belong to condition a1b1, items 6 to 10 to a1b2, and so on. From the point of view of the item analysis, then, we have a two-factor (A and B) between-groups design. We now estimate the ANOVA for this data matrix.

Anova.fnc(aggr.item, between.factor=c('A','B'), dv='dv')

Anova Table (Type III tests)

Response: dv

            Sum Sq   Df  F value  Pr(>F)

(Intercept) 0.00394  1   0.1327   0.7204

A           0.00043  1   0.0144   0.9061

B           0.00243  1   0.0818   0.7786

A:B         0.01313  1   0.4424   0.5154

Residuals   0.47493  16
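The by-item analysis can likewise be sketched in base R with a plain two-way aov(). This is an illustrative stand-in for the Anova.fnc call, not its implementation; the data frame items is simulated and hypothetical. Note that aov() reports sequential sums of squares, which in a balanced design such as this one coincide with Type III for the factor effects.

```r
# Hypothetical item means: 20 items, 5 per cell of the A x B design
set.seed(4)
items <- expand.grid(rep = 1:5,
                     B = factor(c("b1", "b2")),
                     A = factor(c("a1", "a2")))
items$item <- paste0("item", seq_len(nrow(items)))
items$dv   <- rnorm(nrow(items), sd = 0.2)

# Each item appears in exactly one cell, so A and B are between-items factors
fit.items <- aov(dv ~ A * B, data = items)
tab <- summary(fit.items)[[1]]

tab
```

With 20 items in 4 cells the residual term has 16 degrees of freedom, the same as in the table above.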
