Post date: Feb 25, 2014 9:59:46 PM
Hypothesis Testing Lab
For this lab we will use four datasets:
CAMPNET:
This is a dichotomous adjacency matrix of 18 participants in a qualitative methods class. Ties are directed: a tie from ego to alter means that ego named alter as one of the three people with whom s/he spent the most time during the seminar.
ZACKAR & ZACHATTR:
ZACKAR is a stacked dataset containing two matrices: ZACHE, a dichotomous adjacency matrix that records the simple presence or absence of ties between members of a karate club, and ZACHC, which contains valued data counting the number of interactions between actors. ZACHATTR is a rectangular matrix with three columns of attributes for each of the actors in ZACKAR.
KRACK-HIGH-TEC & HIGH-TEC-ATTRIBUTES
KRACK-HIGH-TEC is another stacked dataset, containing three dichotomous relations (REPORTS_TO, ADVICE, FRIENDSHIP). HIGH-TEC-ATTRIBUTES contains several attributes about the nodes in KRACK-HIGH-TEC, including Age, Level (CEO, Manager, Staff), Tenure, and Department.
WIRING
This is a stacked dataset containing several relations among the 14 employees of the bank wiring room of Western Electric, used in the famous Hawthorne Studies. Each relation is a dichotomous adjacency matrix. RDGAM records people playing games together during work breaks (these ties are symmetric), RDCON records conflict between people, RDPOS records positive interactions, and RDNEG records negative interactions.
1) Testing dyadic hypotheses
a. Run Data | Unpack on ZACKAR (if you have not yet), which will create ZACHE and ZACHC. ZACHE has dichotomous data about the ties and ZACHC has valued data (the strength of ties).
b. Run Tools | Similarities and use the cross-product measure to compute similarities among the rows of ZACHE. (The cross product is a very powerful and common matrix operation that, in this case, will count how many friends each pair of actors have in common.) Call the output CommonFriends.
c. Go to Tools | Testing Hypotheses | Dyadic (QAP) | QAP Correlation and browse to include both ZACHC and CommonFriends to be correlated and click okay. What do the results mean?
d. Congratulations, you have just statistically demonstrated the first part of Granovetter’s famous “strength of weak ties” theory, which states that I have stronger ties (ZACHC) with those people with whom I share more friends in common (CommonFriends).
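To make steps b and c concrete, here is a minimal Python sketch of what a cross-product similarity and a QAP correlation compute. It is not UCINET's implementation. The matrices `ties` and `strength` are hypothetical toy data (not the Zachary matrices), the function names are invented for illustration, and the one-tailed p-value convention (share of permutations with a correlation at least as large as the observed one) is an assumption.

```python
import random

def cross_product(adj):
    """Row-by-row cross product: cell (i, j) counts alters tied to both i and j."""
    n = len(adj)
    return [[sum(adj[i][k] * adj[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]

def offdiag(m, order=None):
    """Off-diagonal cells of m, with rows and columns relabeled together by `order`."""
    n = len(m)
    order = order or list(range(n))
    return [m[order[i]][order[j]] for i in range(n) for j in range(n) if i != j]

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def qap_correlation(a, b, n_perm=2000, seed=0):
    """Observed dyadic correlation plus a permutation p-value (assumed one-tailed)."""
    rng = random.Random(seed)
    observed = pearson(offdiag(a), offdiag(b))
    order = list(range(len(a)))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(order)  # relabel the actors of matrix a, keeping its structure
        if pearson(offdiag(a, order), offdiag(b)) >= observed:
            hits += 1
    return observed, hits / n_perm

# Hypothetical toy data: 5 actors, symmetric ties and interaction counts.
ties = [[0, 1, 1, 0, 0],
        [1, 0, 1, 0, 0],
        [1, 1, 0, 1, 0],
        [0, 0, 1, 0, 1],
        [0, 0, 0, 1, 0]]
strength = [[0, 4, 3, 0, 0],
            [4, 0, 5, 0, 0],
            [3, 5, 0, 1, 0],
            [0, 0, 1, 0, 2],
            [0, 0, 0, 2, 0]]

common_friends = cross_product(ties)
r, p = qap_correlation(strength, common_friends)
print(f"observed r = {r:.3f}, one-tailed p = {p:.3f}")
```

The key design point is that rows and columns are permuted together, so each permuted matrix keeps the same network structure and only the assignment of actors to positions changes.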
2) Testing multivariate dyadic hypotheses
a. Unpack the WIRING dataset if you have not done so yet.
b. Go to Tools | Testing Hypotheses | Dyadic (QAP) | QAP Regression | Double Dekker. Put RDCON (conflict between members about whether the windows should be open or shut) in as the dependent variable. Put RDPOS (positive relationships), RDNEG (negative relationships), and RDGAM (playing games together) in as independent variables. Before running it, what do you think would most significantly predict conflict? After running it, are your results what you expected? How would you explain the results?
c. Record the standardized coefficient and significance for any significant predictor, then run the same procedure two more times (still using the default value of 2000 for the number of permutations) and record the same results. Now run the same procedure three more times with the number of random permutations set to 100000, and record the same results. How did the parameter affect the results? Why?
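One way to think about the permutation-count question in step c: a permutation p-value is a proportion estimated from a finite number of random draws, so it carries ordinary binomial sampling error. The sketch below assumes a true p-value of 0.05, which is an illustrative number and not a result from the WIRING data; the function name is hypothetical.

```python
import math

def p_value_standard_error(p, n_perm):
    """Binomial standard error of a p-value estimated from n_perm random permutations."""
    return math.sqrt(p * (1 - p) / n_perm)

# Assumed true p-value of 0.05, for illustration only.
for n_perm in (2000, 100000):
    se = p_value_standard_error(0.05, n_perm)
    print(f"{n_perm:>6} permutations: standard error about {se:.5f}")
```

Compare the size of these standard errors with how much your recorded significance values moved from run to run.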
3) Testing monadic hypotheses.
a. You should have already unpacked the KRACK-HIGH-TEC dataset, but if not, do so now. You will get three datasets (REPORTS_TO, ADVICE, FRIENDSHIP). We are going to use the ADVICE dataset. Run Network | Centrality | Degree on this dataset, using the directed version, telling it NOT to treat the data as symmetric. By default, it will name the output FreemanDegree.
b. We are particularly interested in who is sought after for advice, which is captured by InDegree centrality. So, we are going to pull out just that column from the results by using Data | Extract | Submatrix. Specify FreemanDegree as your input dataset and that we want to “Keep” “ALL” rows. Then click on the L to the right of the box for “Which Columns”, select the column labeled “InDegree”, and call your output ADVISING. This is a measure of how many people said they sought advice from each person.
c. Display (D) the HIGH-TEC-ATTRIBUTES dataset to determine which columns the AGE and TENURE attributes are in.
d. Now, it is common wisdom that people look to the “senior” people for advice, but it is unclear in an organizational context whether senior means “older” or “longer tenured”. You will test whether either of these is supported by the data. Run Tools | Testing Hypotheses | Node-Level | Regression, specifying ADVISING as your dependent dataset with the appropriate column, and HIGH-TEC-ATTRIBUTES with the appropriate columns as your independent dataset (i.e., the columns for Age and Tenure separated by a space), and set the number of permutations to 10000. Which meaning of “senior” do the data support?
e. Why did we use the Regression option of Node-Level instead of T-Test or Anova? When would we use those?
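The idea behind a node-level permutation regression can be sketched in a few lines of Python. This is an illustration under stated assumptions, not UCINET's routine: it assumes the dependent variable is shuffled across nodes, it uses a two-tailed test on each standardized coefficient, and it relies on the closed-form standardized coefficients for exactly two predictors. The values in `advising`, `age`, and `tenure` are made up and are not the HIGH-TEC data.

```python
import random

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def std_betas(y, x1, x2):
    """Standardized OLS coefficients for two predictors, from the three correlations."""
    r1, r2, r12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    denom = 1 - r12 ** 2
    return (r1 - r2 * r12) / denom, (r2 - r1 * r12) / denom

def permutation_regression(y, x1, x2, n_perm=10000, seed=0):
    """Observed betas and two-tailed permutation p-values (y shuffled across nodes)."""
    rng = random.Random(seed)
    observed = std_betas(y, x1, x2)
    shuffled = list(y)
    hits = [0, 0]
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        perm = std_betas(shuffled, x1, x2)
        for k in range(2):
            if abs(perm[k]) >= abs(observed[k]):
                hits[k] += 1
    return observed, [h / n_perm for h in hits]

# Hypothetical node attributes for 8 actors (not the HIGH-TEC values).
advising = [2, 5, 3, 8, 6, 9, 4, 7]
age = [25, 40, 31, 52, 45, 38, 29, 50]
tenure = [1, 6, 3, 12, 8, 14, 4, 10]

betas, p_values = permutation_regression(advising, age, tenure)
print("standardized betas (age, tenure):", [round(b, 3) for b in betas])
print("permutation p-values (age, tenure):", p_values)
```

Permutation is used instead of the classical t distribution because centrality scores of nodes in the same network are not independent observations.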
4) Testing mixed dyadic/monadic hypotheses
a. Since it is only fitting that we end where we started, we shall use the campnet data for these final exercises.
b. You will run Tools | Testing Hypotheses | Mixed Dyadic/Nodal | Categorical attributes | Anova Density twice. For both, specify CAMPNET as the network matrix, and the gender column of the CAMPATTR matrix as the Actor Attribute. For the first run, choose “Constant Homophily” for your model, and for the second, choose “Variable Homophily”. Interpret both sets of results. What do they mean? Is there homophily? Which gender tends to be more homophilous?
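When interpreting the two models, it can help to look at the descriptive table underneath them: the density of ties within and between groups. The following Python sketch computes that table for a directed network. It covers only the descriptive densities, not the permutation test, and the network and group labels are hypothetical toy data rather than CAMPNET and CAMPATTR.

```python
def block_densities(adj, groups):
    """Density of directed ties from each group to each group, ignoring the diagonal."""
    labels = sorted(set(groups))
    n = len(adj)
    table = {}
    for g in labels:
        for h in labels:
            cells = [adj[i][j] for i in range(n) for j in range(n)
                     if i != j and groups[i] == g and groups[j] == h]
            table[(g, h)] = sum(cells) / len(cells) if cells else float("nan")
    return table

# Hypothetical toy data: 4 actors, directed ties, two groups.
adj = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 0, 0, 1],
       [0, 0, 0, 0]]
groups = ["F", "F", "M", "M"]

densities = block_densities(adj, groups)
for (sender, receiver), value in densities.items():
    print(f"{sender} -> {receiver}: {value:.2f}")
```

Roughly speaking, a constant homophily model asks whether within-group density differs from between-group density by a single amount, while a variable homophily model lets each group have its own within-group density.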
5) Using QAP for mixed monadic/dyadic hypothesis testing.
a. Using Data | Attribute to matrix, create a matrix of exact matches among the actors in Campnet based on gender.
b. View this new matrix (named CAMPATTR-MAT by default) in Netdraw. What does the diagram show?
c. Use Tools | Testing Hypotheses | Dyadic (QAP) | MR-QAP Linear Regression | Double-Dekker MRQAP to regress the Campnet network on this new matrix of gender similarity, CAMPATTR-MAT. What do the results show?
d. Do you prefer this approach or the ANOVA Density Tables? When might you use each of these separate techniques? What research question might involve using Moran’s I (or Geary’s C) instead of the ANOVA Density Tables? In that case, how would you use QAP to test for autocorrelation?
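As a rough illustration of two ideas in this section, the Python sketch below builds an exact-match matrix from a categorical attribute (the kind of matrix step a produces) and computes Moran's I for a numeric attribute over a network. Both functions are illustrative and use hypothetical names. Setting the diagonal of the match matrix to 0 is a choice made here, on the assumption that the diagonal is ignored in dyadic tests. The example data are made up and are not the Campnet data.

```python
def exact_match_matrix(attribute):
    """1 where two different actors share the same attribute value, else 0."""
    n = len(attribute)
    return [[1 if i != j and attribute[i] == attribute[j] else 0 for j in range(n)]
            for i in range(n)]

def morans_i(weights, values):
    """Moran's I: network autocorrelation of a numeric node attribute."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    w_sum = sum(weights[i][j] for i, j in pairs)
    numerator = sum(weights[i][j] * dev[i] * dev[j] for i, j in pairs)
    denominator = sum(d * d for d in dev)
    return (n / w_sum) * numerator / denominator

# Hypothetical toy data: 4 actors in two tied pairs.
gender = [1, 1, 2, 2]
network = [[0, 1, 0, 0],
           [1, 0, 0, 0],
           [0, 0, 0, 1],
           [0, 0, 1, 0]]
score = [1, 1, 0, 0]  # a made-up numeric attribute

same_gender = exact_match_matrix(gender)
print("exact-match matrix:", same_gender)
print("Moran's I:", morans_i(network, score))
```

An exact-match matrix suits a categorical attribute such as gender, whereas Moran's I is aimed at numeric attributes where tied actors may have similar values.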