I'm a PhD student in the Berlin School of Economics / BDPEMS, a joint program of Humboldt University, Free University, Technical University, ESMT Berlin, WZB and DIW, and a PhD Fellow at ESMT Berlin.


My research focuses on organization design and leadership, combining methods from deep learning and network analysis.

Recent Research

Semantic Decision Networks (2021)
with Matthew Bothner

Working paper coming soon!


When would members of an organization interpret a choice made by its main decision-maker as ambiguous, important, irrelevant, surprising, or symbolic? To address this question, we develop a formal model of networks of choices. Extending research on natural language processing, these networks mirror networks of words and enable us to identify the meaning of choices, as well as the level of ambiguity surrounding these meanings. Using our model, we uncover latent relationships between choices and the contexts in which a focal decision-maker makes these choices. Our contributions are methodological and theoretical. Our method combines a state-of-the-art deep learning model with classical network analysis, which enables us to present a series of network-based measures characterizing choices. Our primary theoretical contribution is to cast new light on how audiences interpret choices as a function of their organizational context.

Using Semantic Networks to Identify the Meanings of Leadership (2021)
with Nghi Truong and Matthew Bothner

Download working paper

YouTube Presentation: Using Semantic Networks to Identify the Meanings of Leadership


We develop a novel method that integrates techniques from machine learning with canonical concepts from network analysis in order to examine how the meaning of leadership has evolved over time. Using articles in Harvard Business Review from 1990 through 2019, we induce yearly semantic networks composed of roles structurally equivalent to the role of leader. Such roles, from which leader derives meaning, vary in content from coach and colleague to commander and dictator. Yearly shifts in the structural equivalence of leader to clusters of thematically linked roles reveal a decline in the degree to which leadership is associated with consultative activities and a corresponding rise in the extent to which a leader is understood to occupy a hierarchical position. Our analyses further reveal that the role of leader comes to eclipse the role of manager, measured through changes in PageRank centrality as well as betweenness centrality over the course of our panel. Implications for new research on leadership, culture, and networks are discussed.

Random Forest Consensus Clustering for Regression and Classification (2021)
with Ebru Koca Marquart

Download working paper here

Download the Python package here


Random forests are invariant and robust estimators that can fit complex interactions between input data of different types and binary, categorical, or continuous outcome variables, including those with multiple dimensions. In addition to these desirable properties, random forests impose a structure on the observations from which researchers and data analysts can infer clusters or groups of interest. These clusters not only provide a structure to the data at hand, they also can be used to elucidate new patterns, define subgroups for further analysis, derive prototypical observations, identify outlier observations, catch mislabeled data, and evaluate the performance of the estimation model in more detail.

We present a novel clustering algorithm called Random Forest Consensus Clustering and implement it in the Scikit-Learn / SciPy data science ecosystem. This algorithm differs from prior approaches by making use of the entire tree structure. Observations become proximate if they follow similar decision paths across the trees of a random forest. We illustrate why this approach improves the resolution and robustness of clustering, and show that it is especially suited to hierarchical approaches.
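The idea that observations are proximate when they share decision paths can be sketched roughly as follows. This is an illustrative approximation only, not the RFCC algorithm or the package's API: the helper path_distance_matrix, the Jaccard-style overlap measure, the toy data and the average-linkage clustering are all assumptions made for this example.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.datasets import make_blobs
from sklearn.ensemble import RandomForestClassifier


def path_distance_matrix(forest, X):
    """Pairwise distances from decision-path overlap (hypothetical helper)."""
    # indicator[i, k] is nonzero if sample i passes through node k of any tree
    indicator, _ = forest.decision_path(X)
    indicator = indicator.astype(np.float64)
    shared = (indicator @ indicator.T).toarray()  # nodes visited by both samples
    lengths = shared.diagonal()                   # nodes visited by each sample
    union = lengths[:, None] + lengths[None, :] - shared
    dist = 1.0 - shared / union                   # Jaccard distance on visited nodes
    np.fill_diagonal(dist, 0.0)
    return dist


# Toy data: three well-separated groups (assumed for illustration)
X, y = make_blobs(n_samples=60, centers=3, cluster_std=0.5, random_state=0)
forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

dist = path_distance_matrix(forest, X)

# Hierarchical clustering on the path-based distances, cut into at most 3 groups
Z = linkage(squareform(dist, checks=False), method="average")
labels = fcluster(Z, t=3, criterion="maxclust")
```

Because the distance uses every node along each path rather than only the terminal leaf, two observations that separate late in a tree remain closer than two that separate at the root, which is one way to read the claim about improved resolution.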

Text analysis and deep learning: A network approach (2021)
with Nghi Truong and Matthew Bothner

Working paper coming soon!


In the recent past, deep neural networks have revolutionized natural language processing and have since set the new state of the art in most language modeling tests. Transformer architectures, such as BERT or GPT-x, have been particularly successful by generating flexible, context-aware representations of text inputs for downstream tasks. However, much less is known about how researchers can use these models to analyze existing text. This is a question of great importance, because much information available to applied researchers is contained within written language or spoken text. Although the use of these models' ability to capture sophisticated linguistic relations is thus eminently desirable, much uncertainty about how they operate remains, and there is substantial debate about how much meaning they truly capture. We propose a novel method that combines transformer models with network analysis to form a self-referential representation of language use in a corpus of interest. This approach avoids many issues related to understanding the internal workings of the deep neural network. It produces linguistic relations strongly consistent with the underlying model as well as mathematically well-defined operations on them. In an analysis of a random sample of news publications from 1990-2018, we find that ties in our network track the semantics of discourse over time, while higher-order structures allow us to identify clusters of semantic and syntactic relations. This new approach offers several advantages over the use of contextual word embeddings, and gives researchers a new tool to make sense of language use while reducing the number of discretionary choices of representation and distance measures. We discuss how this method can also complement and inform analyses of the behavior of deep learning models.

Graph Embedding on Hierarchical Manifolds (2021)

More information coming soon!

When does catalyzing social comparisons cause growth? (2020)
with Nghi Truong, Richard Haynes and Matthew Bothner

Download working paper here


When does a manager’s choice to activate social comparisons among employees prompt organizational growth? When should a manager instead allow employees to form aspirations and exert effort in relative autonomy, based on their own past performance? To address these questions, we develop an agent-based model that examines the growth-related effects of these two contrasting approaches. Our analyses reveal that activating social comparisons can be either beneficial or corrupting depending on three features of organizational context drawn from performance feedback theory: (i) employees’ goal adaptation rates, (ii) employees’ tendencies to engage in self-improvement, self-assessment, or self-enhancement, and (iii) the skewness of the distribution of their initial goals. We find that whether this distribution is right-skewed (the highly ambitious constitute the right tail) or left-skewed (the unambitious comprise the left tail) acts as the governing contextual moderator. Under right skew, social comparisons promote growth. Under left skew, this effect reverses, but not if employees self-improve or adapt slowly: slow adaptation “purifies” the intrafirm monitoring network of otherwise corrupting stimuli and thus restores the link between social comparisons and growth. Implications for research in performance feedback theory and organizational design are discussed.

Other work

How to Manage ‘Invisible Transitions’ in Leadership (MIT Sloan Management Review, 2021)
with Nora Grasselli and Gianluca Carnabuci

Taking on a substantial new role without a change in title or authority is hard, but there are ways to manage this transition.

Read here!

Threshold Spectral Community Detection for NetworkX
A Python Package for Community Detection with NetworkX

Community detection for NetworkX, based on the algorithm proposed in Guzzi et al. 2013 (*).

Developed for semantic similarity networks, this algorithm specifically targets weighted and directed graphs. This implementation adds a couple of options to the algorithm proposed in the paper, such as passing an arbitrary community detection function (e.g. python-louvain). Similarity networks are typically dense, weighted and difficult to cluster. Experience shows that algorithms such as python-louvain have difficulty finding outliers and smaller partitions.

Given a networkx.DiGraph object, threshold-clustering tries to remove insignificant ties according to a local threshold. This threshold is refined until the network breaks into distinct components, yielding a sparse, undirected network.
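The pruning loop described above can be sketched roughly as follows. This is a simplified illustration, not the package's code or interface: the function names, the mean-plus-alpha-times-standard-deviation cutoff, the step size and the toy graph are all assumptions, and the spectral part of the algorithm is left out entirely.

```python
import statistics

import networkx as nx


def prune_by_local_threshold(graph, alpha):
    """Keep a tie (u, v) only if its weight reaches u's local cutoff (hypothetical rule)."""
    pruned = nx.Graph()  # result is undirected
    pruned.add_nodes_from(graph.nodes)
    for u in graph.nodes:
        weights = [d["weight"] for _, _, d in graph.out_edges(u, data=True)]
        if not weights:
            continue
        # Assumed local cutoff: mean plus alpha standard deviations of u's out-weights
        cutoff = statistics.mean(weights) + alpha * statistics.pstdev(weights)
        for _, v, d in graph.out_edges(u, data=True):
            if d["weight"] >= cutoff:
                pruned.add_edge(u, v, weight=d["weight"])
    return pruned


def threshold_components(graph, step=0.1, max_alpha=3.0):
    """Raise alpha until the pruned network splits into more than one component."""
    alpha = 0.0
    while alpha <= max_alpha:
        pruned = prune_by_local_threshold(graph, alpha)
        components = list(nx.connected_components(pruned))
        if len(components) > 1:
            return alpha, components
        alpha += step
    # No split found within the allowed range: report a single community
    return alpha, [set(graph.nodes)]


# Toy similarity network: two tight groups joined by one weak reciprocal tie
G = nx.DiGraph()
for group in ("abc", "def"):
    for u in group:
        for v in group:
            if u != v:
                G.add_edge(u, v, weight=0.9)
G.add_edge("a", "d", weight=0.1)
G.add_edge("d", "a", weight=0.1)

alpha, components = threshold_components(G)
```

Using a per-node cutoff rather than one global cutoff means that a weak tie is judged relative to its sender's other ties, which is what lets small or peripheral groups separate in an otherwise dense network.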

More information: on GitHub

Python RFCC - Data understanding, clustering and outlier detection for regression and classification tasks
Python Package, joint with Ebru Marquart

Companion package to "Random Forest Consensus Clustering for Regression and Classification (2021)"

More information: on GitHub

Get in touch at ingo.marquart@esmt.org