A key and distinctive feature of the proposed workshop is the panel discussion, where we bring together leading academic and industry experts to engage in an open discussion on the data-centric methods that drive the development of large models today. The panel has two main goals: 1) to candidly share insights and opinions on opaque data-centric methods; 2) to transparently discuss the merits of data-centric vs. model-centric development. As the community shifts from model-driven innovation towards a greater emphasis on improving data quality, quantity, and diversity, this discussion is both timely and essential.
The discussion will explore topics such as:
The practical challenges of building high-quality datasets at scale;
The gap between academic insights and industrial practices;
How data-centric methods add value to model-centric AI;
The challenges of using synthetic data;
Future directions for data and model optimization.
The discussion will be guided by a moderator, and the panelists will be encouraged to share their insights on the topics above. Audience members are encouraged to ask questions and take part in the discussion. By creating a structured but open discussion, the panel aims to share and discuss a vision for the next generation of data in data-centric AI.