First International Workshop on the

Efficiency of Modern Data Centers (EMDC)

EMDC 2021 will be held in conjunction with The Web Conference 2021

April 12-23, 2021, Ljubljana, Slovenia

Introduction to EMDC Workshop

Major research challenges in data center operations include performance, power efficiency, availability, scalability, and security, among many others. As Internet of Things (IoT) devices proliferate, data center capabilities will extend beyond basic management operations. That is, traditional management capabilities for CPU, memory and input/output operations need to be replaced with more advanced IoT-based management capabilities that cover items such as temperature sensors, fan speed sensors, power sensors and moisture sensors. Many modern data centers continuously collect and aggregate a wide range of telemetry data in order to avoid critical downtime. For example, as the heat load of modern data centers increases, the ability to monitor and manage ambient temperature becomes increasingly vital for the availability, reliability, serviceability, safety, manageability and scalability of these mission-critical assets. However, such management capabilities also consume network bandwidth, computational processing power and data storage. Therefore, we need more rigorous architectures and design methods for efficient modern data centers, more sophisticated design and simulation tools, reliable equipment and software systems benchmarks, and accurate performance evaluation methods, among other advances.

This workshop will provide researchers and practitioners with a venue to discuss the efficiency of modern data centers. The workshop’s ambition is to help shape a community of interest around the research opportunities and challenges associated with the engineering design and management of modern data centers. We believe that a dedicated workshop bringing researchers and practitioners together will help investigate innovative ideas and approaches to this new research challenge, with a main focus on the efficiency of modern data centers, while fostering collaboration and the exchange of viewpoints.

Research Topics

    • IoT-Based Management for Ambient Devices

    • Reliability and Performance Methods

    • Design Methodologies for Data Centers

    • IT Equipment and Software Systems Benchmarks

    • Risk Management and Implementation Methods

    • Disaster Recovery Planning Methods

    • Server Metrics and Dashboards

    • Datacenter Design and Simulation Tools

    • Datacenter Standards and Certifications

    • Datacenter Capacity Planning Methods

    • Datacenter Big Data Analytics

    • Datacenter Trends and Research Challenges

    • Datacenter Architecture Design

    • Datacenter Environmental Conditions and Energy Efficiency

    • Datacenter Power Provisioning and Management

    • Datacenter Cooling Systems

    • Datacenter (Renewable) Energy Sources and Management

    • Datacenter Workload Management

    • Datacenter Hardware Design and Optimization

    • Datacenter Availability and Reliability

    • Datacenter Infrastructure Optimization

    • Datacenter Network Provisioning, Design and Optimization

    • Datacenter Compute Provisioning, Design and Optimization

    • Datacenter Storage Provisioning, Design and Optimization

    • Datacenter Operation Optimization

    • Datacenter Emerging Energy Sources

    • Datacenter Hardware/Software Stack Co-Optimization

    • Datacenter Performance Monitoring, Accounting, and Optimization

    • Datacenter Service Pricing Design and Optimization

    • Multi-tenant datacenter design and optimization

    • Virtualization in datacenters

    • Software-defined compute, network and storage

    • Distributed training of deep learning models in datacenters

    • Deep learning inference optimization in datacenters

    • Green datacenters

    • Micro-Datacenter management for fog computing

    • Resource allocation and management for fog computing

Important Dates

    • Feb 14, 2021: Due date for full workshop paper submissions

    • Feb 24, 2021: Notification of paper acceptance to authors

    • Mar 1, 2021: Camera-ready versions of accepted papers due (firm date)

    • Apr 15, 2021: Workshop Day

Keynote Speakers

  • Sean James (Director of Energy Research, Microsoft): The Future of Energy in Data Centers

Abstract: Microsoft is a pioneer and leader in the field of corporate sustainability, with ambitious goals such as becoming carbon negative, including across its global datacenter footprint. Learn about Microsoft's journey in clean tech research and a variety of energy technologies such as clean fuels, energy storage, and hydrogen. Gain a first-hand understanding of how they are learning and adapting, both separately and in close collaboration with others around the world.

Biography: Sean James runs Microsoft’s datacenter research and development program within Cloud Operations + Innovation (CO+I). CO+I provides the foundational cloud infrastructure for over 1,000,000,000 customers, 20,000,000 businesses, and 200+ Microsoft online services in 90 markets. Sean drives new datacenter technology for Microsoft’s next-generation data centers, including its evaluation, development, and testing. Sean joined Microsoft in 2006 to manage operations at one of Microsoft’s datacenters. Later, he joined the construction team and oversaw the design and building of new Microsoft datacenters. Prior to joining Microsoft, Sean worked in datacenter management, overseeing the day-to-day maintenance and repair operations for both IT hardware and critical infrastructure, such as electrical infrastructure and cooling equipment. He also served in the US Navy Submarine Fleet as an electrician. Sean holds many patents related to datacenters and energy, a degree in Information Technology, and is a certified Project Management Professional from the Project Management Institute. He enjoys spending time with his family, guitar, and technology.


  • Minjia Zhang (Principal Researcher, Microsoft): DL Inference and Training Optimization Towards Speed and Scale

Abstract: The application of deep learning models has brought significant improvements to many Microsoft services and products. However, it is challenging to provide efficient computation and memory capabilities for both DNN inference and training, given that model size and complexity keep increasing. On the serving side, many DL models suffer from long inference latency and high cost, preventing their deployment in production. On the training side, large-scale model training often requires complex refactoring of models and access to prohibitively expensive GPU clusters, which are not always accessible to many practitioners. We want to deliver solid solutions and systems while exploring cutting-edge techniques to address these issues. In this talk, I will introduce our experience and lessons from designing and implementing optimizations for both DNN serving and training at large scale, with remarkable compute and memory efficiency improvements and infrastructure cost reduction.

Biography: Minjia Zhang is currently a principal researcher at Microsoft. His research interests lie in machine learning algorithms, modeling, and systems, and their applications to natural language processing and information retrieval. In particular, he focuses on building ultra-fast, high-throughput DL inference acceleration libraries (USENIX ATC'18, OpML'19), automated DL compilation (NeurIPS'20, ICLR'21, IPDPS'21), democratizing large-scale training (NeurIPS'20, NVMW'20, HPCA'21), and large-scale next-generation vector search engines (CIKM'19, SIGMOD'20, NeurIPS'20, NVMW'21). Several of his research results have been transferred to Microsoft systems and products, such as Bing, Ads, Azure SQL, and Windows, leading to significant latency and capacity improvements.

Thursday, April 15, 2021 (virtual). All times are in Pacific Daylight Time (PDT); use a time zone converter to find the exact time in your location.

Virtual Meeting: The workshop is held online using Microsoft Teams.

Workshop Schedule

  • 10:00 am – 10:05 am Opening Remarks and Welcome

  • 10:10 am – 10:40 am [Keynote] The Future of Energy in Data Centers
    Sean James (Microsoft)

  • 11:00 am – 11:30 am [Keynote] DL Inference and Training Optimization Towards Speed and Scale
    Minjia Zhang (Microsoft)

  • 11:50 am – 12:00 pm Short Break (10 mins)

  • 12:00 pm – 12:15 pm Parallelizing DNN Training on GPUs: Challenges and Opportunities
    Weizheng Xu, Youtao Zhang, Xulong Tang

  • 12:25 pm – 12:40 pm Transitive Power Modeling for Improving Resource Efficiency in a Hyperscale Datacenter
    Alexander Gilgur, Brian Coutinho, Iyswarya Narayanan, Parth Malani

  • 12:50 pm – 1:05 pm Reliability of Large-scale GPU Clusters for Deep Learning Workloads
    Junjie Qian, Taeyoon Kim, Myeongjae Jeon

  • 1:05 pm – 1:10 pm Closing Remarks

Workshop Team

Chairs:

    • Eyhab Al-Masri (University of Washington, USA)

    • Di Wang (Microsoft Research, USA)

Program Committee Members (pending):

    • Xiaolin Chang (Beijing Jiaotong University, China)

    • Jordi Guitart (Universitat Politècnica de Catalunya, Spain)

    • Myeongjae Jeon (Ulsan National Institute of Science and Technology, South Korea)

    • Chao Li (Shanghai Jiao Tong University, China)

    • Shaolei Ren (University of California, Riverside, USA)

    • Krishna Malladi (Samsung, USA)

    • Anand Ramesh (Google, USA)

    • Alexandros Daglis (Georgia Institute of Technology, USA)

    • Abhinandan Majumdar (Intel, USA)

Submission Instructions

We welcome contributions describing original ideas, experiments and applications relevant to the workshop theme that have not been published previously and are not currently under submission at any other venue. All submitted papers must include the names and affiliations of all authors. Submitted papers will be peer-reviewed by members of the Workshop Program Committee. All accepted papers will be included in the main conference proceedings (see Proceedings section below).

Submission Categories:

    • Long Papers: up to 8 pages (research at a mature stage)

    • Short/Work-in-Progress Papers: up to 4 pages (early or intermediate stage)

Paper Submission Link:

    • Please use this link for submitting papers to EMDC.

Templates:

Camera Ready Instructions:

Proceedings:

    • All accepted papers will be published in the ACM Digital Library and included in the Companion volume of the Web Conference (WebConf) 2021 proceedings. At least one author of each accepted paper must register for the conference and present the paper at the workshop for the paper to be included in the companion volume. Registration details will be posted on the main conference's page.

    • Excellent papers selected from the EMDC 2021 Workshop will be recommended for publication in an upcoming MDPI IoT Journal Special Issue "Efficiency of Modern Data Centers (EMDC)" (deadline: August 28, 2021).