Science meets Engineering of Deep Learning 2019

SEDL at NeurIPS 2019



Why SEDL?


Despite its tremendous success in recent years, deep learning remains a complex mix of art and engineering, and there is still progress to be made before it fully evolves into a mature scientific discipline. The interdependence of architecture, data, and optimization gives rise to an enormous landscape of design and performance intricacies that are not well understood. Pushing disciplinary boundaries is one way to drive the evolution of deep learning from engineering towards science.

Unlike in the natural and physical sciences, where experimental capabilities can hamper progress (there are limits on which quantities can be probed and measured in physical systems, and on how much and how often), in deep learning the vast majority of relevant quantities we wish to measure can be tracked in some way. As such, a more significant limiting factor for scientific understanding and principled design in deep learning is how to harness the field's tremendous collective experimental capability insightfully.
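To make this concrete: in a training run, essentially any internal quantity can be logged on demand. Below is a minimal sketch of such instrumentation, assuming PyTorch and a hypothetical toy regression task; none of the specifics are prescribed by the workshop.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Hypothetical toy setup: a small MLP on synthetic regression data.
    torch.manual_seed(0)
    X, y = torch.randn(256, 10), torch.randn(256, 1)
    model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 1))
    opt = torch.optim.SGD(model.parameters(), lr=0.1)

    for step in range(101):
        loss = F.mse_loss(model(X), y)
        opt.zero_grad()
        loss.backward()
        # Unlike in a physical experiment, internal quantities are directly
        # measurable: loss, gradient norms, weight norms, per-layer stats, etc.
        grad_norm = torch.sqrt(sum(p.grad.pow(2).sum() for p in model.parameters())).item()
        weight_norm = torch.sqrt(sum(p.pow(2).sum() for p in model.parameters())).item()
        if step % 20 == 0:
            print(f"step={step:3d}  loss={loss.item():.4f}  "
                  f"|grad|={grad_norm:.4f}  |w|={weight_norm:.4f}")
        opt.step()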

As a community, some primary aims would be to (i) identify obstacles in the way of better models and algorithms; (ii) identify the general trends from which we would like to build scientific and, potentially, theoretical understanding; and (iii) rigorously design scientific experiments and experimental protocols whose purpose is to clearly resolve and pinpoint the origin of mysteries (so-called `smoking-gun' experiments), while ensuring reproducibility and robustness of conclusions.
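One simple ingredient of such a protocol is to repeat every experiment over independent seeds and report aggregate statistics rather than a single run. A minimal sketch in Python/NumPy, where run_experiment is a hypothetical stand-in for a full training run:

    import numpy as np

    def run_experiment(seed: int) -> float:
        """Hypothetical stand-in for a full training run, returning test
        accuracy; in practice the seed would fix every source of randomness
        (initialization, data order, augmentation)."""
        rng = np.random.default_rng(seed)
        return 0.90 + 0.02 * rng.standard_normal()  # placeholder result

    # Aggregating over independent seeds makes conclusions robust to
    # run-to-run noise and keeps the experiment reproducible.
    results = np.array([run_experiment(seed) for seed in range(10)])
    print(f"accuracy: {results.mean():.3f} +/- {results.std(ddof=1):.3f} "
          f"over {len(results)} seeds")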


On Bridging Communities

We anticipate that the transition from experimental mystery to rigorous resolution will occur in multiple stages, one of which should involve bringing together diverse groups working toward seemingly different goals. While on the surface, the goals of the practitioner and theoretician may not appear to be aligned, a collaboration between the two has great potential to further both agendas in the long run. The goal of this workshop is to support the transition to such collaboration.

This workshop would bring emerging issues, empirically interesting phenomena, and recent theoretical insights to the attention of the greater community. Our goal is twofold:

  • Provide a venue for deep learning experimenters to present empirical phenomena they have routinely observed, or obstacles they have consistently encountered in practice, reduced to the simplest setup that still gives rise to the same behavior. These minimal examples would be a first step towards identifying phenomena that the community can study empirically and theoretically in greater depth.

  • Allow scientists and theoreticians to present their ideas for guiding and designing experiments to resolve mysterious phenomena on larger, more complex systems and tasks than this community currently targets.

Finally, we will provide a seed set of problems to attack; at a minimum, these will constitute minimal examples of realistic problems in deep learning (one such example is sketched below). We will bring together expert practitioners who are interested in learning more about the foundations of machine learning, and expert theoreticians who are interested in identifying challenging problems in cutting-edge applications.
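As one concrete illustration of what such a minimal example can look like, consider the well-known observation that over-parametrized networks can perfectly fit random labels. The sketch below, in PyTorch, reduces it to a few lines; all settings are chosen purely for illustration.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Illustrative minimal example: an over-parametrized MLP fits labels
    # that carry no signal at all.
    torch.manual_seed(0)
    X = torch.randn(128, 20)
    y = torch.randint(0, 2, (128,))  # random labels: nothing to "learn"
    model = nn.Sequential(nn.Linear(20, 512), nn.ReLU(), nn.Linear(512, 2))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)

    for step in range(2000):
        loss = F.cross_entropy(model(X), y)
        opt.zero_grad()
        loss.backward()
        opt.step()

    acc = (model(X).argmax(dim=1) == y).float().mean().item()
    print(f"train accuracy on random labels: {acc:.2f}")  # typically near 1.0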


Structure of the Workshop

The workshop will consist of three themed sessions composed of invited talks and short panel discussions, followed by a final session with a longer panel and contributed talks covering a wide variety of methods and applications, including but not limited to:

  • Historical impact, insights, and potential limitations of theoretical studies for large-scale machine learning

  • Network training dynamics and their dependence on tuning & over-parametrization

  • Hyper-parameter choices and architectural search in computer vision, natural language processing, speech, and reinforcement learning applications

  • Disentangling the role of architecture, data, and algorithms

  • Current challenges and open problems in computer vision, natural language processing, speech, and reinforcement learning


Workshop Schedule

The workshop took place on Saturday, December 14th, 2019, in room West 121 + 122 at Canada Place, Vancouver.


08:00 - 08:15 Welcoming remarks and introduction (video)


08:15 - 09:45 Session 1 - Theory

08:15-08:35 Surya Ganguli An analytic theory of generalization dynamics and transfer learning in deep linear networks (video)

08:35-08:55 Yasaman Bahri Tractable limits for deep networks: an overview of the large width regime (video)

08:55-09:15 Florent Krzakala Learning with "realistic" synthetic data (video)

09:15-09:45 Theory Panel Discussion: Surya Ganguli, Yasaman Bahri, Florent Krzakala (video)

Moderator: Lenka Zdeborova

09:45 - 10:30 Coffee break and posters

10:30 - 12:00 Session 2 - Vision

10:30-10:50 Carl Doersch Self-supervised visual representation learning: putting patches into context

10:50-11:10 Raquel Urtasun Science and Engineering for Self-driving

11:10-11:30 Sanja Fidler TBA

11:30-12:00 Vision Panel Discussion: Raquel Urtasun, Carl Doersch, Sanja Fidler

Moderator: Natalia Neverova

12:00 - 14:00 Lunch break and posters

14:00 - 15:30 Session 3 - Further Applications

14:00-14:20 Douwe Kiela Benchmarking Progress in AI: A New Benchmark for Natural Language Understanding (video)

14:20-14:40 Audrey Durand Trading off theory and practice: A bandit perspective (video)

14:40-15:00 Kamalika Chaudhuri A Three Sample Test to Detect Data Copying in Generative Models (video)

15:00-15:30 Further Applications Panel Discussion: Audrey Durand, Douwe Kiela, Kamalika Chaudhuri (video)

Moderator: Yann Dauphin

15:30 - 16:15 Coffee break and posters

16:15 - 17:10 Panel - The Role of Communication at Large

Aparna Lakshmiratan, Jason Yosinski, Been Kim, Surya Ganguli, Finale Doshi-Velez (video)

Moderator: Zack Lipton

17:10 - 18:00 Contributed Session - Spotlight Submissions

17:10 - 17:20 Non-Gaussian Processes and Neural Networks at Finite Widths, Sho Yaida (Facebook AI Research) (video)

17:20 - 17:30 Training Batchnorm and Only Batchnorm, Jonathan Frankle (MIT); David J Schwab (ITS, CUNY Graduate Center); Ari S Morcos (Facebook AI Research (FAIR)) (video)

17:30 - 17:40 Asymptotics of Wide Networks from Feynman Diagrams, Guy Gur-Ari (Google); Ethan Dyer (Google) (video)

17:40 - 17:50 Fantastic Generalization Measures and Where to Find Them, YiDing Jiang (Google); Behnam Neyshabur (Google); Dilip Krishnan (Google); Hossein Mobahi (Google Research); Samy Bengio (Google Research, Brain Team) (video)

17:50 - 18:00 Complex Transformer: A Framework for Modeling Complex-Valued Sequence, Martin Ma (Carnegie Mellon University); Muqiao Yang (Carnegie Mellon University); Dongyu Li (Carnegie Mellon University); Yao-Hung Tsai (Carnegie Mellon University); Ruslan Salakhutdinov (Carnegie Mellon University) (video)

Advisors

  • Theory Session advisors: Joan Bruna, Adji Bousso Dieng

  • Vision Session advisors: Ilija Radosavovic, Riza Alp Guler

  • Further Applications Session advisors: Dilan Gorur, Orhan Firat

  • Panel advisors: Michela Paganini, Anima Anandkumar


Contributed Session - Spotlight Submissions


Abstracts of the contributed talks can be found here.


Contributed Posters and Reviewers

A detailed list of contributed posters can be found here.

We would like to thank our reviewers, who helped us choose the papers for our workshop.


Call for Papers

We welcome and encourage both theoretical and empirical contributions under the workshop theme. We are interested in positive and negative results that help identify clear engineering challenges, guide new experiments, and develop principled understanding. We expect short but high-impact observations, code, and data.

To reiterate: the core aim of the workshop is to provide a venue for researchers with different ways of thinking to communicate and collaborate. Therefore, submissions will be evaluated on their accessibility to experts on either side. For instance, a theoretical work may also include a proposal for robust empirical verification. An engineering work may consist of several hypotheses that can be understood and studied by more theoretically inclined researchers. Submissions may also combine existing theoretical understanding of toy models with robust empirical verification.

Submissions should be extended abstracts of no more than 3 pages (excluding references), and may include previously published work. We recommend that authors rely on the supplementary material only for minor details that do not fit in the 3 pages. All submissions should be in PDF format, must use the NeurIPS 2019 template, and will be managed through the CMT website. The review process is double-blind; abstracts should therefore be appropriately anonymized. We will not be able to provide comments on the submissions, so decisions will be binary. However, the post-workshop report will include an overview of selected submissions.

Accepted works will be presented as posters during the workshop and listed on this website. Additionally, a small number of abstracts will be selected to be presented as lightning talks during the workshop.

Dates (DD/MM/YYYY):

    • Submission deadline: 26/09/2019

    • Author notification: 01/10/2019

    • Workshop date: 14/12/2019

Some important questions and answers about the submissions:

Q: Can we submit a paper that will also be submitted to ICLR 2020?

A: Yes.

Q: Is it OK to submit a paper that was rejected from the NeurIPS main conference?

A: Yes.

Q: Will there be official archival proceedings?

A: No.


Organizers

Levent Sagun
Facebook AI Research

Adriana Romero
Facebook AI Research

Nando de Freitas
DeepMind
University of Oxford