ICLR 2025 Workshop on Modularity for Collaborative, Decentralized, and Continual Deep Learning (MCDC)
While the success of large-scale deep learning models has hinged on the "bigger is better" approach – scaling model size and training data – this paradigm may be rapidly reaching an inflection point. Beyond the prohibitive cost of training and maintaining gigantic models, this approach exposes and exacerbates inherent flaws in the current design philosophy of machine learning systems.
One of the most glaring contradictions lies in the development life cycle of these models: once deprecated, they are simply discarded in favor of new ones, which are generally trained from scratch.
This unsustainable practice stems from the fact that models are currently built and trained as generalist, black-box, monolithic systems. Their functionalities and emergent capabilities are intertwined in their parameters, so any attempt to change a specific aspect can have unpredictable and potentially disastrous consequences for the entire model's performance (e.g., catastrophic forgetting).
In stark contrast, a fundamental principle in software development is the organization of code into modular components. This allows developers to import modules and seamlessly integrate new functionalities, improving code reusability and maintainability.
Similarly, biological systems provide compelling evidence for the benefits of modularity and functional specialization, such as rapid adaptation to new environments and resilience to perturbations. Despite these clear benefits, modular approaches are rarely applied in the development of machine learning models, presenting significant opportunities for innovation.
Scope and Topics: This workshop covers all methods that enable the collaborative development of modular models. This includes mixture-of-experts models in which each expert can be trained independently, decentralized training in which experts regularly share information, and upcycling to reuse existing models.
Distinguished Scientist @ TogetherAI
Founding Engineer @ PrimeIntellect
Research Scientist @ Meta
Research Scientist @ Google DeepMind
Research Scientist @ Cohere For AI
PhD student @ University of Washington
Senior Research Scientist @ Google DeepMind