Stata Graph Library for Network Analysis
Hirotaka Miura (contact)
03/31/2012 update:
The Stata Journal has kindly published the program. The command name has been changed from network to netsis to avoid possible complications with commands created in the future. Please use findit netsis to download the program. Thanks.
08/14/2011 update:
SGL QA (Quality Assurance) program files added.
08/08/2011 update:
SGL version 1.1.4 released. Max-flow min-cut routines and Floyd-Warshall all-pairs shortest-path algorithm added. This version will likely be the one finalized for the Stata Journal so there probably won't be any more updates in the near future as far as the program goes.
NEXT >> FAQ page; contemplating the next phase of the project...please let me know if you have any input regarding possible areas of development, what the current package can improve on, etc. Thanks!
08/02/2011 update:
Planned for version 1.1.4 release:
Floyd-Warshall algorithm for dense networks. Based on test results, the algorithm will be applied automatically when calculating distance matrices if density (|E|/|V|^2, with undirected edges counted as two directed edges) is greater than 0.75 for unweighted networks or greater than 0.5 for weighted networks.
Edmonds-Karp algorithm for computing maximum flow value, flow matrix, and residual capacity matrix. Will implement breadth-first search on the residual graph to find minimum cut. As there are multiple outputs, it's going to take some time working out the logistics with the ado Stata wrapper.
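For readers unfamiliar with the max-flow item above, here is a minimal Python sketch of the Edmonds-Karp idea: repeated breadth-first searches for augmenting paths, followed by one more search on the residual graph to read off a minimum cut. It is not the SGL Mata code. The function name edmonds_karp, the dense capacity-matrix input, and the four return values are assumptions made for illustration only.

```python
from collections import deque


def edmonds_karp(capacity, source, sink):
    """Illustrative sketch (not SGL's implementation).

    capacity is an n x n matrix of non-negative edge capacities.
    Returns (max_flow, flow, residual, cut_edges).
    """
    if source == sink:
        raise ValueError("source and sink must differ")
    n = len(capacity)
    # Net flow matrix: flow[v][u] == -flow[u][v] (skew-symmetric convention).
    flow = [[0] * n for _ in range(n)]

    def bfs():
        # Breadth-first search over edges with positive residual capacity.
        parent = [-1] * n
        parent[source] = source
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and capacity[u][v] - flow[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        return parent

    max_flow = 0
    while True:
        parent = bfs()
        if parent[sink] == -1:
            break  # no augmenting path is left
        # Find the bottleneck residual capacity along the path found.
        bottleneck = float("inf")
        v = sink
        while v != source:
            u = parent[v]
            bottleneck = min(bottleneck, capacity[u][v] - flow[u][v])
            v = u
        # Push the bottleneck amount along the path.
        v = sink
        while v != source:
            u = parent[v]
            flow[u][v] += bottleneck
            flow[v][u] -= bottleneck
            v = u
        max_flow += bottleneck

    residual = [[capacity[u][v] - flow[u][v] for v in range(n)]
                for u in range(n)]
    # Vertices still reachable from the source in the final residual graph
    # form the source side of a minimum cut.
    reachable = {v for v in range(n) if parent[v] != -1}
    cut_edges = [(u, v) for u in sorted(reachable) for v in range(n)
                 if v not in reachable and capacity[u][v] > 0]
    return max_flow, flow, residual, cut_edges


# Small made-up network: vertex 0 is the source, vertex 3 is the sink.
example_capacity = [
    [0, 3, 2, 0],
    [0, 0, 1, 2],
    [0, 0, 0, 3],
    [0, 0, 0, 0],
]
value, flow_matrix, residual_matrix, min_cut = edmonds_karp(example_capacity, 0, 3)
```

Because the routine produces several results at once (flow value, flow matrix, residual capacity matrix, cut), any wrapper has to decide how to hand each of them back to the caller, which is the logistics issue mentioned above.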
07/26/2011 update:
SGL version 1.1.3 released. Improved code for Dijkstra's algorithm and for the algorithm that computes betweenness centrality for weighted networks. Under review at the Stata Journal.
Stata Conference Chicago 2011: Had a great opportunity to present the Stata Graph Library at this year's Stata Conference. Thank you to all the organizers and attendees. The conference presentation slides are available here.
NEXT >> Floyd-Warshall algorithm for dense networks and maximum flow...
06/06/2011 update:
SGL version 1.1.2 (beta) updated with revised documentation and help files for network and netsummarize.
05/09/2011 update:
A bit of a disclaimer: As of now, large networks cannot be handled efficiently, so Stata Graph Library functions are only well suited for smaller networks: 1000 vertices or fewer if unweighted, and 250 vertices or fewer if weighted, when density is around 0.1 (density is defined as the number of edges over the number of vertices^2, where undirected edges are treated as two directed edges). For large networks, or very dense ones, I suggest using MatlabBGL if you have access to Matlab - it's faster and handles large datasets better than most R packages. Otherwise, designing your own C++ wrapper for the Boost Graph Library, via Stata's st_plugin() if that's of interest, may be the other "free" alternative for large networks. I definitely have plans to improve SGL algorithm speeds in the future, but it'll take me a while to get there.
The idea of switching the project over to a wrapper for BGL has been discussed, but as the C++ code will have to be compiled for each platform, not to mention the (sometimes random) "segmentation fault"s (probably due to my poor coding) which can wreak havoc on the user's dataset, I think we'll stick with Mata for now.
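To make the density definition in the disclaimer concrete, here is a small Python sketch (not SGL code). It computes density with undirected edges counted twice, applies the Floyd-Warshall switch-over thresholds quoted in the 08/02/2011 entry, and runs a plain Floyd-Warshall pass on a weight matrix. The function names and the matrix representation are assumptions made for illustration.

```python
import math


def density(n_vertices, directed_edges=0, undirected_edges=0):
    """|E| / |V|^2, counting each undirected edge as two directed edges."""
    return (directed_edges + 2 * undirected_edges) / n_vertices ** 2


def prefers_floyd_warshall(dens, weighted):
    # Thresholds quoted in the 08/02/2011 entry:
    # greater than 0.5 for weighted, greater than 0.75 for unweighted networks.
    return dens > (0.5 if weighted else 0.75)


def floyd_warshall(weights):
    """All-pairs shortest-path lengths.

    weights[i][j] is the edge weight from i to j, or math.inf if no edge.
    """
    n = len(weights)
    dist = [row[:] for row in weights]
    for i in range(n):
        dist[i][i] = 0
    for k in range(n):          # allow vertex k as an intermediate stop
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


# Made-up example: an undirected, weighted path 0 - 1 - 2.
INF = math.inf
path_weights = [
    [INF, 1, INF],
    [1, INF, 2],
    [INF, 2, INF],
]
path_density = density(3, undirected_edges=2)   # 4 directed edges over 9
path_dist = floyd_warshall(path_weights)
```

Floyd-Warshall always does on the order of |V|^3 work regardless of how many edges exist, which is why it only pays off once the network is dense.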
Installation tips: What seems to work for a lot of people is to save the .ado and .mlib files in the 'C:\ado\personal' directory (assuming a PC). For me at least, saving the files in the current working directory works as well. Either way, when you launch a new session of Stata and type mata mata mlib query, lsgl should show up, indicating that the library is loaded.
SGL version 1.1.2 (beta) released. Version to be presented at this year's Stata Conference in Chicago. Totally re-designed ado-file wrappers (sorry folks) but now you can use if exps and in ranges so it's pretty cool. Making it byable would be nice too...
03/12/2011 update:
SGL version 1.1.1: Under review (still) at the Stata Journal. Command name changed from NETGEN; otherwise the program is identical to NETGEN version 1.1.3. R/MatlabBGL programs used to test the Stata code are now included in the program attachment. The program SGLX demonstrates a method for implementing additional custom-written routines with SGL.
NETGEN version 1.1.3: Katz-Bonacich centrality added. Several syntax changes.
NETGEN version 1.1.2: Eigenvector centrality added.
NETGEN version 1.1.1: Under second review at the Stata Journal.
NETWORK SGL: Older version of NETGEN.
RGRAPH: Stata command to generate random graphs.
NETWORKS: Older version of NETWORK SGL project.
NETPLOT & PAJEK2STATA: Rense Corten has developed NETPLOT, which provides network visualization capabilities to Stata, and PAJEK2STATA, which allows importing of relational data from Pajek to Stata. You can also follow his blog here.
STATA2PAJEK: Gabriel Rossman has developed STATA2PAJEK for exporting data from Stata to Pajek.
CENTPOW: Stata module to compute network centrality and power developed by Z. P. Neal.