Santa Fe Institute

Graduate Workshop

Project Overviews

2022


Organizers: John Miller, Carnegie Mellon University & Scott E Page, University of Michigan

Sponsor: Bob Brown, author of Sys-Tao

Coordinator: Carla Shedivy, Santa Fe Institute

Overview

An interdisciplinary group of ten students spent less than two weeks learning about complex systems and agent-based models, working collectively on homework projects, and starting innovative research projects. This document summarizes their projects, which were exceptional and demonstrate how the open intellectual environment of the Santa Fe Institute promotes novel research.

The Projects

Political Speech in Sermons

Shayla Olson: University of Michigan, Political Science

Shayla's project analyzes text from sermons to study political content in religious services and variation among and within denominations. Her data consist of approximately 260,000 sermons from 1,500 US churches. Using a la carte embeddings built on pretrained word embeddings, she identifies the local context for keywords such as abortion or immigration. For example, on abortion, Catholic churches use the words womb, fetus, and miscarriage more than other religions. On immigration, Mainline churches use the terms influx and non-white, and on taxes, Black Protestant churches talk about paychecks and deductions, terms that other churches do not use. Overall, she finds meaningful differences in contexts that align with historical differences between churches.
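The context-averaging idea behind a la carte embeddings can be sketched as follows. This is only an illustration, not Shayla's pipeline: the three-dimensional vectors in `PRETRAINED`, the example sentences, and the function names are made up, and the learned linear transform that the a la carte method applies to the averaged context vector is omitted (in effect assumed to be the identity).

```python
import math

# Hypothetical toy "pretrained" vectors; real ones would come from a trained model.
PRETRAINED = {
    "womb": [0.9, 0.1, 0.0],
    "fetus": [0.8, 0.2, 0.0],
    "law": [0.1, 0.9, 0.0],
    "vote": [0.0, 0.8, 0.2],
}
DIM = 3

def context_vector(tokens, keyword, window=2):
    """Average the pretrained vectors of words within `window` of each keyword hit."""
    total, count = [0.0] * DIM, 0
    for i, tok in enumerate(tokens):
        if tok != keyword:
            continue
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            vec = PRETRAINED.get(tokens[j])
            if j != i and vec is not None:
                total = [t + v for t, v in zip(total, vec)]
                count += 1
    return [t / count for t in total] if count else None

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Two invented corpora that use the same keyword in different local contexts.
group_a = context_vector("the womb and abortion the fetus".split(), "abortion")
group_b = context_vector("the law on abortion the vote".split(), "abortion")
```

Comparing `cosine(group_a, PRETRAINED["womb"])` with `cosine(group_b, PRETRAINED["womb"])` then shows which group's usage of the keyword sits closer to a given term.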

Empirically Measuring Organizational Structure

Brandon Freiberg: Columbia Business School

Brandon's project considers a core question in organizational theory: why is organizational structure important? Classic studies consider the benefits of division of labor (horizontal structure) and hierarchy (vertical structure). Existing work relies on case studies, field studies, surveys, and some agent-based models. Brandon relies on 500 million LinkedIn profiles, which allow him to build time series of company structures. His goal is to build a model that predicts a given company's division of labor / role diversity / degree of employee-level specialization. He relies on word embeddings of job titles to understand 90 million promotion paths within his data set. As a test of his methods, he finds that software engineer is much closer to web developer than to financial analyst or accountant. He finds, first, that variance in average job title difference decreases as firms become larger. Second, maximal role diversity increases with the number of employees. Third, average distance and median distance increase the likelihood of additional funding, conditional on having already received funding, both with and without controls. One possible theory is that job title diversity corresponds to product market fit: good fit could imply that job categories align with specific markets. Finally, he finds that low division of labor is correlated with the adoption of upward-trending technologies over legacy technologies. One explanation is that specialization leads to routinization and inflexibility, decreasing the probability that employees try new technologies or adapt to dynamic environments. Future work includes sample validation (i.e., which industries are best captured by LinkedIn data), the creation of additional measures (e.g., hierarchy, divisional breakdown), and the generation of computational models (e.g., ABM) to support or explain empirical results.
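One way to turn job-title embeddings into the distance measures mentioned above is to take pairwise cosine distances among a firm's titles. The sketch below is an assumption about how such a measure could be computed, not Brandon's actual method; the two-dimensional vectors in `TITLE_VECS` and the function names are invented for illustration.

```python
import math
from itertools import combinations
from statistics import mean, median

# Hypothetical 2-d title embeddings, chosen only for illustration.
TITLE_VECS = {
    "software engineer": [1.0, 0.1],
    "web developer": [0.9, 0.2],
    "financial analyst": [0.1, 1.0],
    "accountant": [0.2, 0.9],
}

def cosine_distance(u, v):
    """1 minus cosine similarity; 0 means the titles point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def role_diversity(titles):
    """Return (mean, median, max) pairwise cosine distance among a firm's titles."""
    dists = [cosine_distance(TITLE_VECS[a], TITLE_VECS[b])
             for a, b in combinations(titles, 2)]
    return mean(dists), median(dists), max(dists)

narrow_firm = role_diversity(["software engineer", "web developer"])
broad_firm = role_diversity(["software engineer", "web developer", "financial analyst"])
```

With these toy vectors, the firm that adds a financial analyst has a larger mean and maximum distance than the firm with only the two engineering titles.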

Demographic Impacts on Anthropogenic Ecosystem Engineering

Elic M Weitzel: University of Connecticut, Archeology


Humans have great capacity to modify our environments, and these modifications need not be detrimental. Controlled burns are just one example of beneficial ecosystem engineering. Ecosystem engineering can be purposeful or a byproduct of other behavior. Elic considers whether population decline reduces the scale and scope of ecosystem engineering. As an archaeologist, he looks to historical cases. His study, an agent-based model, analyzes the deer and human populations in 17th century New England. During this time, 50% to 98% of the Indigenous population died. In his model, deer move, die, and reproduce. People move, hunt, and burn. Burning creates habitat for deer. The human population is experimentally reduced while the deer population fluctuates. He can calibrate his model so that each time step corresponds to a day and assign empirically accurate reproduction rates. He finds that the total number of fires set (an ecosystem engineering technique) correlates positively with population, but the mean number of fires set per individual correlates negatively with population. That is, he finds a population-level decrease in ecosystem engineering but an increase in per capita rates of fire setting, the latter being necessary to maintain the deer population, which he finds stays within the same Lotka-Volterra cycle.
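The schedule described above (people move, hunt, and burn; deer move, die, and reproduce; burning creates habitat) can be sketched as a toy agent-based model. Everything numeric here is an assumption made for illustration: the grid size, the burn, hunting, mortality, and birth probabilities, and the ten-step habitat lifetime are not taken from Elic's model, and this sketch makes no attempt to reproduce his calibration or results.

```python
import random

def run(n_people, n_deer=40, size=10, steps=50, seed=0):
    """Toy schedule: people move, burn, and hunt; deer move, die, and reproduce."""
    rng = random.Random(seed)
    habitat = [[0] * size for _ in range(size)]  # steps of burned habitat left per cell

    def rand_cell():
        return (rng.randrange(size), rng.randrange(size))

    def step_from(pos):
        # Random walk on a wrapped grid.
        x, y = pos
        return ((x + rng.choice((-1, 0, 1))) % size,
                (y + rng.choice((-1, 0, 1))) % size)

    people = [rand_cell() for _ in range(n_people)]
    deer = [rand_cell() for _ in range(n_deer)]
    fires = 0
    for _ in range(steps):
        # Burned patches regrow over time.
        habitat = [[max(0, h - 1) for h in row] for row in habitat]
        people = [step_from(p) for p in people]
        for x, y in people:
            if habitat[x][y] == 0 and rng.random() < 0.2:  # assumed burn probability
                habitat[x][y] = 10                         # assumed habitat lifetime
                fires += 1
        hunters = set(people)
        next_deer = []
        for old in deer:
            d = step_from(old)
            if d in hunters and rng.random() < 0.5:        # assumed hunting success
                continue
            if rng.random() < 0.05:                        # assumed background mortality
                continue
            next_deer.append(d)
            if habitat[d[0]][d[1]] > 0 and rng.random() < 0.2:  # births on burned habitat
                next_deer.append(d)
        deer = next_deer
    per_person = fires / n_people if n_people else 0.0
    return {"fires": fires, "deer": len(deer), "fires_per_person": per_person}
```

Calling `run` for several values of `n_people` with the same seed gives total and per-person fire counts that can be compared across population sizes.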




Whoso List to Hunt, I Know where is an Hind

Thomas Wyatt (1503-1542)


Whoso list to hunt, I know where is an hind,

But as for me, hélas, I may no more.

The vain travail hath wearied me so sore,

I am of them that farthest cometh behind.

Yet may I by no means my wearied mind

Draw from the deer, but as she fleeth afore

Fainting I follow. I leave off therefore,

Sithens in a net I seek to hold the wind.

Who list her hunt, I put him out of doubt,

As well as I may spend his time in vain.

And graven with diamonds in letters plain

There is written, her fair neck round about:

Noli me tangere, for Caesar's I am,

And wild for to hold, though I seem tame.

How do super-local network features shape residential segregation?

A modified Schelling/Sakoda model

Laura Fursich: Linkoping University, Institute for Analytic Sociology

Laura analyzes an extension of Schelling's Segregation Model. The original model reveals how mild preferences for homogeneity in location choices produce macro-level segregation. Schelling's original model does not consider social networks. Evidence and logic suggest that people prefer to live closer to their networks. If individuals belong to homogeneous networks, then adding this effect should produce stronger lock-in of segregation. Her model makes three changes to the original model. First, rather than satisfy , agents will move if their friends are far away. Second, in her model, specific friends matter. Third, her model allows for compensating behavior. She begins with clusters of friends in personal networks. Considering only the personal network effects, she finds that people cluster within their networks. As she increases the homophily (sameness) of networks, the networks become more segregated. When she then adds a preference for ethnic segregation, she does not find as much segregation as Schelling. With a higher weight on personal networks and high levels of homophily, she does find high levels of segregation. When she abandons the clustering assumption and rewires friendships within ethnic groups, she finds that the model produces segregation, as personal network effects amplify segregation effects.
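A move rule that combines a Schelling-style neighborhood preference with distance to specific friends might look like the sketch below. It is a guess at the general shape of such a rule, not Laura's specification: the weighted sum, the `scale` and `threshold` values, and all names are assumptions. The weighted score is one simple way to allow compensating behavior, since nearby friends can offset unlike neighbors.

```python
def like_neighbor_share(agent, pos, group, radius=1):
    """Share of agents within `radius` (Chebyshev distance) in the agent's own group."""
    x, y = pos[agent]
    nbrs = [a for a, (ax, ay) in pos.items()
            if a != agent and max(abs(ax - x), abs(ay - y)) <= radius]
    if not nbrs:
        return 1.0  # assumed: an agent with no neighbors is content on this dimension
    return sum(group[a] == group[agent] for a in nbrs) / len(nbrs)

def friend_closeness(agent, pos, friends, scale=5.0):
    """1 when friends are co-located, falling toward 0 as mean distance grows."""
    if not friends[agent]:
        return 1.0
    x, y = pos[agent]
    dists = [abs(pos[f][0] - x) + abs(pos[f][1] - y) for f in friends[agent]]
    return 1.0 / (1.0 + (sum(dists) / len(dists)) / scale)

def wants_to_move(agent, pos, group, friends, weight=0.5, threshold=0.5):
    """Move if the weighted mix of friend closeness and like neighbors is too low."""
    score = (weight * friend_closeness(agent, pos, friends)
             + (1 - weight) * like_neighbor_share(agent, pos, group))
    return score < threshold

# Hypothetical four-agent world: "a" lives next to friends from the other group,
# while "d" lives alone, far from its only friend.
pos = {"a": (0, 0), "b": (0, 1), "c": (1, 0), "d": (9, 9)}
group = {"a": "X", "b": "Y", "c": "Y", "d": "X"}
friends = {"a": ["b", "c"], "b": ["a"], "c": ["a"], "d": ["a"]}
```

With `weight=0.0` the rule reduces to a pure neighborhood-composition test and agent "a" wants to move; with `weight=0.8` its nearby friends keep it in place.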

A Formal Model of In-group animosity and polarized information diffusion

Herbert Chang: Annenberg School for Communication and Journalism

Polarization can be attitudinal (distance in beliefs) or affective (dislike). These can be difficult to disentangle empirically. Herbert's research to date has been empirical. He finds asymmetries in how elected Democrats and Republicans use Twitter, and in the impact of their followers: Republicans tend not to jump into Democratic conversations, but Democrats will jump into Republican conversations. In his project, he constructs an agent-based model in which the probability of endorsement is a function of attitudinal and partisan status among members of Congress. First, he finds high variation in diffusion for moderate infection and dormancy rates. Second, increasing assortativity does not diminish diffusion for random graphs. Third (surprisingly), to maximize spread, choosing the most popular person overall creates more diffusion than choosing the person most popular with the opposite party. Finally, he finds no interaction between popularity with the outside group and assortativity for diffusion.
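A diffusion process in which endorsement depends on both attitudinal distance and party can be sketched as a simple cascade on a graph. The functional form of `endorse_prob`, its parameter values, and the example network are assumptions made for illustration; dormancy is not modeled here, and none of this is taken from Herbert's implementation.

```python
import math
import random

def endorse_prob(att_src, att_dst, same_party, base=0.6, out_party_penalty=0.5):
    """Endorsement is less likely with attitudinal distance and across party lines."""
    p = base * math.exp(-abs(att_src - att_dst))
    return p if same_party else p * out_party_penalty

def cascade(graph, attitude, party, seed_node, rng):
    """Independent-cascade style spread: each new endorser gets one try per neighbor."""
    endorsed, frontier = {seed_node}, [seed_node]
    while frontier:
        src = frontier.pop()
        for dst in graph[src]:
            if dst in endorsed:
                continue
            p = endorse_prob(attitude[src], attitude[dst], party[src] == party[dst])
            if rng.random() < p:
                endorsed.add(dst)
                frontier.append(dst)
    return endorsed

# Hypothetical four-member network with attitudes on a left-right scale.
graph = {"d1": ["d2", "r1"], "d2": ["d1", "r1"], "r1": ["d1", "d2", "r2"], "r2": ["r1"]}
attitude = {"d1": -1.0, "d2": -0.8, "r1": 0.7, "r2": 1.0}
party = {"d1": "D", "d2": "D", "r1": "R", "r2": "R"}
reached = cascade(graph, attitude, party, "d1", random.Random(42))
```

Averaging `len(cascade(...))` over many random seeds and different seed nodes is one way to compare how far a message spreads from different starting members.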

Self Censorship

Qiankun Zhong: UC Davis, Communications

In Qiankun's model, the government issues a policy and people decide whether to question the policy. Everyone has a fixed belief about the policy, either high (H) or low (L). Type H can signal high or low, while type L will always signal low. People who signal high to a type H receive a benefit but risk punishment. If they signal high to a type L, they pay a peer cost. Signalling can result in censorship by the state. To evolve strategies, she assumes social learning: people are more likely to copy the strategies of people with higher fitness. She finds that "in union there is strength": lots of type H's lead to more high signalling. She finds an asymmetry in that increasing the peer benefit has a larger effect than reducing the peer cost. Also, having high peer benefit and peer cost leads to more signalling than having low values of each.
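The payoff structure and the social learning step can be illustrated with a short sketch. The expected-payoff formula, the zero baseline for signalling low, and the Fermi imitation rule are assumptions chosen for illustration; the summary above only says that fitter strategies are more likely to be copied, not which rule Qiankun uses.

```python
import math

def expected_payoff(signal_high, share_h, benefit, peer_cost, p_punish, punishment):
    """Expected payoff of a type H agent who meets a random partner."""
    if not signal_high:
        return 0.0  # assumed: signalling low is the zero baseline
    # Benefit from type H partners, peer cost from type L partners, risk of punishment.
    return share_h * benefit - (1 - share_h) * peer_cost - p_punish * punishment

def copy_prob(own_fitness, other_fitness, temperature=1.0):
    """Fermi rule: imitation is more likely when the other agent is fitter."""
    return 1.0 / (1.0 + math.exp(-(other_fitness - own_fitness) / temperature))

# Hypothetical parameter values: the same strategy with few versus many type H peers.
few_h = expected_payoff(True, share_h=0.2, benefit=2.0, peer_cost=1.0,
                        p_punish=0.1, punishment=3.0)
many_h = expected_payoff(True, share_h=0.8, benefit=2.0, peer_cost=1.0,
                         p_punish=0.1, punishment=3.0)
```

Under these made-up numbers, signalling high pays off only when type H agents are common, so `copy_prob(0.0, many_h)` is above one half while `copy_prob(0.0, few_h)` is below it.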

Concentration & Supply in Rental Markets

Benjamin Preis: MIT, Urban Studies and Planning

Rental markets for homes in the US have changed drastically in the last three decades, with more landlords and higher concentration in the market. Rental housing markets are extremely complex, owing to search costs, informational constraints on proper pricing given the variety of units, the effect of neighborhood characteristics, and the downside risk of not having housing. Benjamin's model considers renters who follow search strategies that embed a moving cost. It assumes two types of landlords: small landlords with little information who must rent units, and large landlords who have better information and care about average return rather than renting each unit. The specific model has two towns, either unequal or equal distributions of income, and landlords who set rents at a rate of return on capital plus error. His baseline model (equal numbers of tenants and houses) converges to an equilibrium. When he doubles the number of renters, he finds ever-increasing rates and only the rich renting houses. Reversing the assumption, with more houses than renters, prices fall, and no vacancy exists. Returning to the case with an equal number of houses and renters, when the number of landlords falls and each owns lots of houses, prices rise relative to the case of lots of small owners.
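The two behavioral rules named above, landlords pricing at a return on capital plus error and renters weighing a moving cost, can be written down in a few lines. This is a minimal sketch under stated assumptions: the Gaussian error, the twelve-period horizon, and the function names are hypothetical and not drawn from Benjamin's model.

```python
import random

def asking_rent(unit_value, target_return, rng, noise_sd=0.0):
    """Landlord pricing rule: target return on capital plus a random error."""
    return unit_value * target_return + rng.gauss(0.0, noise_sd)

def should_move(current_rent, offered_rent, moving_cost, horizon=12):
    """Renter moves only if savings over `horizon` periods exceed the moving cost."""
    return (current_rent - offered_rent) * horizon > moving_cost

rng = random.Random(0)
# A poorly informed small landlord could be given a larger error than a large one.
small_landlord_rent = asking_rent(200_000, 0.005, rng, noise_sd=100.0)
large_landlord_rent = asking_rent(200_000, 0.005, rng, noise_sd=10.0)
```

For example, `should_move(1000, 950, 500)` is true because twelve periods of savings (600) exceed the moving cost, while `should_move(1000, 990, 500)` is false.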

Campaign Contributions on a Network

Elaine Yao: Princeton, Politics

Elaine's project asks why people donate to campaigns. Existing theories propose donations as a consumption good, as a way to influence policies, or as a response to a particular campaign having become focal. Her model assumes that people donate in part because of social connections and ideology. It assumes agents with ideologies as well as wealth endowments. Ideologies are normally distributed while wealth has an exponential distribution. She endogenously builds a network by connecting people with similar ideologies. Candidates seek a minimal spanning tree. Voters decide whether to donate based on ideological closeness and the percentage of their contacts who donate. The model produces distributions that include many runs with no donations, but if donations do occur they can be large. An analysis that changes the relative influence of friends shows that if friends matter more, then runs with no donations are more likely, but if donations occur, the total funds raised are larger. If friends matter a lot, then ideology does not matter. If ideology matters then, as expected, more moderate candidates raise more. Finally, she shows that on more connected networks, candidates raise more.
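The donation decision described above, driven by ideological closeness and by the share of contacts who donate, can be sketched as a threshold rule iterated until nothing changes. The linear score, the 0.5 threshold, and the four-voter chain network are assumptions for illustration only; wealth endowments and the candidates' spanning-tree search are left out.

```python
def donate_round(ideology, friends, candidate, donors, friend_weight, threshold=0.5):
    """One synchronous update: who donates given closeness and donating friends."""
    new_donors = set()
    for voter, position in ideology.items():
        closeness = max(0.0, 1.0 - abs(position - candidate))
        contacts = friends[voter]
        share = sum(f in donors for f in contacts) / len(contacts) if contacts else 0.0
        score = (1 - friend_weight) * closeness + friend_weight * share
        if score >= threshold:
            new_donors.add(voter)
    return new_donors

def run_to_fixed_point(ideology, friends, candidate, friend_weight, max_rounds=50):
    """Start with no donors and repeat the update until the donor set stops changing."""
    donors = set()
    for _ in range(max_rounds):
        updated = donate_round(ideology, friends, candidate, donors, friend_weight)
        if updated == donors:
            break
        donors = updated
    return donors

# Hypothetical voters on a 0-1 ideology line, linked to their ideological neighbors.
ideology = {"a": 0.0, "b": 0.2, "c": 0.7, "d": 0.9}
friends = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
low_friend_weight = run_to_fixed_point(ideology, friends, candidate=0.1, friend_weight=0.2)
high_friend_weight = run_to_fixed_point(ideology, friends, candidate=0.1, friend_weight=0.9)
```

In this toy setup, a low friend weight lets the two voters closest to the candidate donate, while a high friend weight leaves everyone waiting on friends who never start, so no donations occur.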

Why Do Companies Invest in Unprofitable Assets?

Likun Cao: Chicago, Sociology

Likun's project relies on Kauffman's NK model to study product evolution. She assumes that companies offer products that contain observable features and other features that consumers might not take into account when deciding whether to buy. In her repurposing of the NK model, the first half of the N features are observable and the others are not. The value of the NK landscape corresponds to the cost. Revenue depends on the number of the first N/2 features that take value one. Last, she reduces a company's profits if its product is too similar to other products. She also varies whether the interdependencies are random or modular. With N=12, K=6, and random interdependencies, she finds that as the level of competition increases, both observable and non-observable features become diverse. She also considers modular interactions, that is, no interdependencies between observable and non-observable features. Without competitive pressure, products are similar. Competition drives diversity.
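The profit calculation described above can be sketched on a small random NK landscape. This is an illustrative reading of the summary, not Likun's code: the normalization of revenue, the Hamming-distance test for "too similar", the penalty size, and the lazily drawn contribution tables are all assumptions, and only the random-interdependency case is shown.

```python
import random

def make_landscape(n, k, rng):
    """Random NK landscape: each feature's contribution depends on itself and k others."""
    deps = [[i] + rng.sample([j for j in range(n) if j != i], k) for i in range(n)]
    tables = [{} for _ in range(n)]  # contributions are drawn lazily, then cached

    def value(bits):
        total = 0.0
        for i in range(n):
            key = tuple(bits[j] for j in deps[i])
            if key not in tables[i]:
                tables[i][key] = rng.random()
            total += tables[i][key]
        return total / n

    return value

def profit(bits, rivals, cost_fn, similarity_penalty=0.1, min_distance=2):
    """Revenue from observable features, minus landscape cost, minus a crowding penalty."""
    half = len(bits) // 2
    revenue = sum(bits[:half]) / half          # only the first N/2 features are observable
    cost = cost_fn(bits)                       # landscape value is read as cost
    too_close = sum(1 for r in rivals
                    if sum(a != b for a, b in zip(bits, r)) < min_distance)
    return revenue - cost - similarity_penalty * too_close

cost = make_landscape(12, 6, random.Random(0))
product = (1,) * 12
alone = profit(product, [], cost)
crowded = profit(product, [product], cost)
```

Here `crowded` is lower than `alone` by exactly the similarity penalty, which is the pressure that pushes competing products apart.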

Modelling Collaboration in Coding Projects

Kesong Cao: University of Wisconsin, Cognitive Psychology

Any information processing system can be analyzed at three levels: computation (what to solve), algorithm (how to solve), and implementation (hardware and wetware). Cognitive psychology focuses on the first two levels and neuroscience on the third. In Kesong's project, he analyzes what types of project teams are more likely to succeed. Teams may differ in structure, cognitive diversity, communication style, and risk preference. Success might be measured by profit, satisfaction, or popularity. Using GitHub data from 2012 to 2019, Kesong gathered project and user metadata along with source code. Using terms such as join, leave, create, delete, fork, watch, issue, commit, pull request, dependence, and comment, he can measure team dynamics. Individual contributions can be measured by lines of code, and the attributes of the team and its success can be measured using the aforementioned terms. For example, fork and dependence correlate with success. By filtering for active projects, he is left with 390,000 projects. He finds collaboration events to be bursty, ruling out a Poisson model. He finds less hierarchy to produce greater success, and diversity to be negatively correlated with success. He is working on an agent-based model of the collaborative process.
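One common way to quantify burstiness is the coefficient B = (sigma - mu) / (sigma + mu) computed over the gaps between events, where mu and sigma are the mean and standard deviation of the gaps. The summary does not say which measure Kesong uses, so this choice is an assumption, and the event times below are invented. For a Poisson process the gaps are exponentially distributed with sigma equal to mu, so B is near 0; perfectly regular events give B = -1, and bursty events give B above 0.

```python
from statistics import mean, pstdev

def burstiness(event_times):
    """B = (sigma - mu) / (sigma + mu) over inter-event gaps; needs at least 2 events."""
    gaps = [b - a for a, b in zip(event_times, event_times[1:])]
    mu, sigma = mean(gaps), pstdev(gaps)
    return (sigma - mu) / (sigma + mu)

# Hypothetical commit timestamps: evenly spaced versus clustered into bursts.
regular = burstiness([0, 10, 20, 30, 40, 50])
bursty = burstiness([0, 1, 2, 3, 100, 101, 102, 103, 200])
```

A value of B well above 0 across many projects would be evidence against a Poisson model of collaboration events.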

Original Document: July 7, 2022

Final Document: