Post date: Apr 11, 2014 1:11:37 AM
For this lab we will use three datasets:
KAPTAIL:
This is a stacked dataset containing four dichotomous matrices. There are two adjacency matrices each for social ties (indicating the pair had social interaction) and instrumental ties (indicating the pair had work-related interaction). The two pairs of matrices represent two different points in time. The names of the datasets encode the type of tie in the sixth letter and the time period in the seventh. Thus, the dataset KAPFTS1 is social ties at time 1, KAPFTI2 is instrumental ties at time 2, and so on.
ZACKAR & ZACHATTR:
ZACKAR is another stacked dataset, containing a dichotomous adjacency matrix, ZACHE, which represents the simple presence or absence of ties between members of a Karate Club, and ZACHC, which contains valued data counting the number of interactions between actors. ZACHATTR is a rectangular matrix with three columns of attributes for each of the actors from the ZACKAR datasets.
PV504:
PV504 is a 504-actor network of consultants working for an R&D consulting firm. The data are symmetric and valued, and represent the number of days that each pair of individuals worked on a project together.
EXERCISES:
1) Hierarchical Clustering using UCINET with ZACKAR
a) This section uses the ZACHE dataset (you may have to unpack ZACKAR using Data | Unpack to create ZACHE) and the ZACHATTR attribute dataset
b) Now run hierarchical clustering (HiClus) on ZACHE using the SIMPLE_AVERAGE method. Interpret your results. Do you know why you might use SIMPLE_AVERAGE over other methods? It isn't necessary, but you may want to experiment with the other methods to see what works and what doesn't.
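If it helps to see what an averaging rule does when clusters are merged, here is a minimal Python sketch of agglomerative clustering with average linkage on a small, made-up similarity matrix. The function names and the four-actor data are illustrative assumptions, not part of the lab data, and this is not UCINET's code. Whether UCINET's SIMPLE_AVERAGE option uses exactly this rule is also an assumption you should check against the UCINET help.

```python
from itertools import combinations

def mean_similarity(sim, a, b):
    """Average similarity over all pairs with one member in each cluster."""
    return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

def average_linkage(sim, labels):
    """Repeatedly merge the two clusters with the highest mean similarity.
    Returns the merges in order as (cluster_a, cluster_b, mean_similarity)."""
    clusters = [(i,) for i in range(len(labels))]
    merges = []
    while len(clusters) > 1:
        a, b = max(combinations(clusters, 2),
                   key=lambda pair: mean_similarity(sim, *pair))
        merges.append((tuple(labels[i] for i in a),
                       tuple(labels[i] for i in b),
                       mean_similarity(sim, a, b)))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return merges

# Hypothetical valued ties among four actors (symmetric, zero diagonal).
toy_labels = ["A", "B", "C", "D"]
toy_sim = [
    [0, 5, 1, 0],
    [5, 0, 1, 0],
    [1, 1, 0, 4],
    [0, 0, 4, 0],
]
toy_merges = average_linkage(toy_sim, toy_labels)
for left, right, level in toy_merges:
    print(left, "+", right, "at", level)
```

Reading the merges from first to last is the same as reading a HiClus dendrogram from the tightest groups outward: pairs that join at a high level are strongly tied, and the final merge shows how weakly the two halves are connected.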
2) Girvan-Newman using NetDraw with ZACKAR
a) Open the ZACKAR stacked dataset in NetDraw. It should open displaying the ZACHE relation; if it does not, switch to ZACHE before continuing.
b) Now, open the attribute file, ZACHATTR, using the folder with the A next to it.
c) Run the Girvan-Newman analysis (Analysis | Subgroups | Girvan-Newman), specifying a minimum of 2 and a maximum of 40 clusters. It should automatically color your nodes so that each node is one of two colors. What it has done behind the scenes is color the nodes based on the ngPart_2 partition (a partition with 2 groups). Click on the color palette icon and use the drop-down list to select ngPart_3 to see how it partitions the network next, and then ngPart_4. How useful are these partitions?
d) Using the color palette, go back to the ngPart_2 partition. Now, click on the shape palette icon, and select “Club” from the list. This will shape the nodes according to which club the members went to after the split. How well did the Girvan-Newman algorithm predict the affiliation of the club members?
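To see roughly what the Girvan-Newman procedure does behind the scenes, here is a minimal Python sketch: compute shortest-path betweenness for every edge, delete the edge with the highest score, and repeat until the network falls apart into the requested number of pieces. The function names and the six-node toy graph are illustrative assumptions. This is a sketch of the general algorithm for an undirected, unweighted graph without self-loops, not NetDraw's implementation.

```python
from collections import deque

def edge_betweenness(adj):
    """Shortest-path betweenness of each edge (Brandes-style accumulation).
    Every pair is counted from both ends, which does not affect the ranking."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        dist, paths, preds, order = {s: 0}, {s: 1}, {s: []}, []
        queue = deque([s])
        while queue:                       # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w], paths[w], preds[w] = dist[v] + 1, 0, []
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        credit = {v: 0.0 for v in order}
        for w in reversed(order):          # push credit back toward s
            for v in preds[w]:
                share = paths[v] / paths[w] * (1 + credit[w])
                score[frozenset((v, w))] += share
                credit[v] += share
    return score

def components(adj):
    """Connected components as a list of node sets."""
    seen, parts = set(), []
    for s in adj:
        if s in seen:
            continue
        part, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v not in part:
                part.add(v)
                stack.extend(adj[v] - part)
        seen |= part
        parts.append(part)
    return parts

def girvan_newman(adj, k):
    """Remove highest-betweenness edges until there are at least k components."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}   # work on a copy
    while len(components(adj)) < k:
        score = edge_betweenness(adj)
        if not score:                      # no edges left to remove
            break
        u, v = max(score, key=score.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)

# Hypothetical toy network: two triangles joined by a single bridge (3-4).
toy_edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]
toy_graph = {}
for u, v in toy_edges:
    toy_graph.setdefault(u, set()).add(v)
    toy_graph.setdefault(v, set()).add(u)
toy_parts = girvan_newman(toy_graph, 2)
print(sorted(sorted(p) for p in toy_parts))
```

Asking for 3 or 4 groups simply continues the edge removal, which is why the ngPart_3 and ngPart_4 partitions in NetDraw are nested refinements of ngPart_2.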
3) Factions using NetDraw with ZACKAR
Now run Analysis | Subgroups | Factions, selecting 2 for the desired number of groups. This time, instead of using the color palette, use the “Nodes” tab in the control area on the right-hand side of the screen, scroll down to the last attribute, which should be called “Factions 2”, and then click the “Color” checkbox. How does Factions compare with the Girvan-Newman algorithm in terms of predicting the affiliations? How could you display the Girvan-Newman results, the Factions results, and the Hierarchical Clustering results ALL at the same time?
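One way to put a number on "how well did the partition predict the club" is to cross-tabulate the two labelings and compute purity: give each detected group the club label held by most of its members, then count the share of nodes that match. The Python sketch below does this on made-up labels; the function names and the eight-node data are illustrative assumptions, not the actual Factions or Girvan-Newman output.

```python
from collections import Counter

def crosstab(predicted, actual):
    """Count how many nodes fall in each (predicted group, actual club) cell."""
    return Counter(zip(predicted, actual))

def purity(predicted, actual):
    """Share of nodes whose club matches the majority club of their group."""
    best = {}
    for (group, _club), n in crosstab(predicted, actual).items():
        best[group] = max(best.get(group, 0), n)
    return sum(best.values()) / len(predicted)

# Hypothetical labels for eight nodes: detected group versus actual club.
toy_predicted = [1, 1, 1, 2, 2, 2, 2, 1]
toy_actual = ["A", "A", "A", "B", "B", "B", "A", "B"]

toy_table = crosstab(toy_predicted, toy_actual)
toy_purity = purity(toy_predicted, toy_actual)
print(dict(toy_table))
print(toy_purity)
```

Computing the same score for each method's partition gives a simple side-by-side comparison to go with the visual check of colors against shapes.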
4) Cliques using UCINET and NetDraw with KAPFTS2
a) If you have not done so before, unpack KAPTAIL using Data | Unpack.
b) In UCINET run Network | Subgroups | Cliques on KAPFTS2 with a minimum size of 3. How many cliques do you get? How many actors are in this network? How useful is this?
c) Visualize KAPFTS2 in NetDraw. Does this help us identify clique structures?
d) Now open CliqueOverlap in NetDraw. This is an actor-by-actor matrix, created when we ran Cliques in UCINET, in which each cell holds the number of different cliques that the pair of actors belongs to together. Starting from 1, increase the filter at the bottom of the “Rels” tab in the control panel on the right side of the screen using the “+” button. Does this indicate significant or minimal overlap between cliques in this network?
e) Now set the filter back down to 0, open CliqueSets in NetDraw, and redraw the picture (lightning bolt). This is a two-mode network where lines indicate that actors (typically red circles with names) belong to a specific clique (typically blue squares with numbers). What does this picture convey about the structure of the network? Are there actors who seem embedded in a lot of different cliques?
f) Run Analysis | Centrality on these data, specifying the undirected option. (Although this is a 2-mode network, NetDraw allows you to run centrality on it.) Now size the nodes by degree centrality. Who is embedded in the largest number of distinct cliques? Who is next?
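The three clique outputs used in this exercise can be reproduced on a toy network with a short Python sketch: list the maximal cliques of at least a minimum size, count how many cliques each pair of actors shares (the idea behind CliqueOverlap), and count how many cliques each actor belongs to (which is the actor's degree in the two-mode actor-by-clique picture). The function names and the five-actor graph are illustrative assumptions. The clique search is a basic Bron-Kerbosch enumeration, not UCINET's own routine.

```python
from collections import Counter
from itertools import combinations

def maximal_cliques(adj, min_size=3):
    """Enumerate maximal cliques with basic Bron-Kerbosch (no pivoting)."""
    found = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            if len(current) >= min_size:
                found.append(sorted(current))
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adj), set())
    return sorted(found)

def clique_overlap(cliques):
    """Number of cliques each pair of actors appears in together."""
    overlap = Counter()
    for clique in cliques:
        for u, v in combinations(clique, 2):   # clique is sorted, so u < v
            overlap[(u, v)] += 1
    return overlap

def clique_memberships(cliques):
    """Number of cliques each actor belongs to."""
    return Counter(v for clique in cliques for v in clique)

# Hypothetical toy network: two triangles sharing the edge b-c, plus a pendant e.
toy_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
toy_graph = {}
for u, v in toy_edges:
    toy_graph.setdefault(u, set()).add(v)
    toy_graph.setdefault(v, set()).add(u)

toy_cliques = maximal_cliques(toy_graph, min_size=3)
toy_overlap = clique_overlap(toy_cliques)
toy_members = clique_memberships(toy_cliques)
print(toy_cliques)
print(toy_overlap[("b", "c")], toy_members["b"], toy_members["e"])
```

Raising the filter in NetDraw corresponds to keeping only the pairs whose overlap count exceeds the threshold, and sizing nodes by degree in the CliqueSets picture corresponds to the membership counts.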
5) K-CORES using NetDraw with PV504
a) Open PV504 in NetDraw. Because it is very large, NetDraw does not optimize the layout automatically when opening it. To make the diagram more readable, turn off labels (using the script L button on the icon bar), and then redraw the network. This may take some time, but let it finish. You should begin to see some structure in the network as it draws it.
b) These are valued data about the number of days individuals worked together on projects. Let’s increase the filtering to be greater than 3 by clicking on the “+” button toward the bottom of the Rels tab in the control region three times. Now redraw the network by clicking on the lightning bolt. Much more structure should be visible.
c) Now run Analysis | K-Cores. It will automatically color the nodes according to their k-Core. Select the Nodes tab, and pull down to the *K-core attribute, and use the “s” button below the values to step through the k-cores from 0 to 10. What does this tell you about the network?
d) Since all nodes of a higher “coreness” are automatically members of the lower cores, we’d like to step down from the highest coreness to the lowest, but cumulatively. To do this, press the “a” button below the values in the control region to select all the check boxes, then press the “i” button to “inverse” the selection (i.e., uncheck everything that is checked and check everything that is unchecked). This should leave no boxes checked and a blank screen. Now check the box next to the highest value (it should be 10) and look at the graph. Now ALSO check the box next to the second highest value. Repeat until you have checked all boxes. Was this more or less useful in evaluating the structure of the network?
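The nesting of k-cores that step d) relies on can be seen in a short Python sketch. It assigns each node a core number by repeatedly peeling off the node with the smallest remaining degree, then lists the cores cumulatively from the highest value down, which mirrors checking the boxes one at a time. The function names and the seven-node toy graph are illustrative assumptions. This is a sketch of the standard peeling idea for an undirected, dichotomized graph, not NetDraw's implementation.

```python
def core_numbers(adj):
    """Core number of each node: the largest k such that the node belongs to
    a subgraph in which every node has at least k neighbours."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.get)   # peel the weakest node
        k = max(k, degree[v])                # core level never decreases
        core[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return core

def cumulative_cores(core):
    """Nodes with coreness >= k, for each observed k from highest to lowest."""
    levels = sorted(set(core.values()), reverse=True)
    return [(k, sorted(v for v in core if core[v] >= k)) for k in levels]

# Hypothetical toy network: a 4-clique (1-4), a tail 4-5-6, and an isolate 7.
toy_edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5), (5, 6)]
toy_graph = {7: set()}
for u, v in toy_edges:
    toy_graph.setdefault(u, set()).add(v)
    toy_graph.setdefault(v, set()).add(u)

toy_core = core_numbers(toy_graph)
toy_nested = cumulative_cores(toy_core)
for k, members in toy_nested:
    print(k, members)
```

Each line of output contains all the nodes from the line before it, which is the cumulative view you build by hand in NetDraw when you check the boxes from the highest k-core downward.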