Modularity is algorithmic independence of mechanisms.
A dynamic system encompasses a sequence of modifications to the mechanisms.
Modularity in a dynamic system is the conditional algorithmic independence of mechanisms, conditioned on its previous state.
Learning algorithms are dynamic systems.
Modularity requires independent feedback (e.g. gradients).
By formally treating learning algorithms as algorithmic causal graphs, we can directly test, without any training, whether the causal structure of the credit assignment mechanism makes it possible to modify the learnable mechanisms independently by inspecting whether the gradients it produces are d-separated by the previous state of the learner's weights before the credit assignment update.
Theoretical question: Which reinforcement learning algorithms produce independent gradients?
Empirical question: Does modularity improve transfer efficiency?
Modular algorithm - CVS:
Chang, Michael, et al. "Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions." ICML (2020).
Non-modular algorithm - PPO:
Schulman, John, et al. "Proximal policy optimization algorithms." (2017).