About me, according to one of the grad students: "Every time he looks at an Overleaf document, a small piece of him dies."
"Science progresses one funeral at a time" - a paraphrase of Planck (1950). This insightful observation also applies to administrators of the scientific enterprise.
"Simple citation analysis presupposes a highly rational model of reference-giving, in which citations are held to reflect primarily scientific appreciation of previous work of high quality or importance, and potential citers all have the same chance to cite particular papers... Such a model is obviously a grossly oversimplified and possibly highly misleading representation of reference giving..." Martin & Irvine, Research Policy, 12 (1983) 61-90
“When the seagulls follow the trawler, it is because they think sardines will be thrown into the sea.” Eric Cantona (1995)
(Picasso-themed sketches courtesy of gemini@google)
Updates:
Oct 18, Simulations under SASCA-ReS result in networks exceeding 200M nodes.
Oct 12, Simulations under SASCA-ReS (Scalable Agent-based Simulator for Citation Analysis with Recency-emphasized Sampling) are under way.
Oct 3, Dindoost et al. (2025) On the Optimization of Methods for Establishing Well-Connected Communities. (Accepted, CNA 2025)
Oct 3, Vu-Le et al. (2025) Dense Subgraph Clustering and a new Cluster Ensemble Method (In Press, CNA 2025)
Oct 3, Park et al. (2025) Very Large Scale Simulations of Network Growth with the Scalable Agent-based Simulator for Citation Analysis with sampling (SASCA-s) (In Press, CNA 2025)
Aug 7, 2025: Milestone reached. We generated a network of more than 100 million nodes using SASCA-s.
At the University of Illinois Urbana-Champaign, I have two jobs. In the first, I am research faculty in computer science. In the second, I run a research analytics unit for the College of Engineering. The two roles complement each other, although time will tell whether either has had broad impact. My work has been supported by awards from the National Institutes of Health, the US National Science Foundation, private foundations, and industry. At present, we are supported by an award from the Illinois:Insper Partnership and a grant from the NSF.
My research interests fall in an area bounded by computer science, informatics, scientometrics, the history of science, biomedical research, philosophy, and sociology. The ideas of the Kuhnian research community, the center-periphery structure observed by Price and Beaver, and community detection in graphs come together (in a 'computer-sciency' sense) in my work. I am also interested in epistemic and post-epistemic misconduct; there is no shortage of case studies there. Before academia, I worked in industry and, even before that, in the federal government. My PhD work concerned signaling by the low-affinity Fcγ receptor on human platelets and was performed in the laboratory of Clark Anderson, MD. For a while after, I worked on proximal signaling by antigen receptors; then my interests evolved towards research assessment, which is referred to as "meta-research" by some and "science of science" by others. My work does not fit well into either category.
While in biology, I was fortunate to interact with a few outstanding researchers whose influence on me is evident even today. Predictably, I also encountered a few unprincipled types who served to mark the other end of the spectrum. This calibration awakened an interest in the scientific enterprise as a living organism in a symbiotic relationship with society.
My principal collaborator is Tandy Warnow, also at Illinois. In this collaboration, we combine common and complementary interests in theory, methods development, and discovery. I currently work with David Bader from NJIT, Ananth Grama from Purdue, Pablo Robles Granda from Illinois, and Fabio Ayres from Insper, São Paulo.
More recently, we've begun to work on a new agent-based modeling project (manuscript under review). Making a refined version of the model scalable by at least one order of magnitude falls under the umbrella of the SASCA (Scalable Agent-based Simulator for Citation Analysis) project, which builds upon our initial ABM work with Pablo and Ananth. Other active projects concern community detection and the generation of realistic synthetic networks.