2021 IEEE Big Data Conference

International Workshop on Serverless Machine Learning for Intelligent and Scalable AI Workflow

With several innovations emerging in the domain of AI automation, the next wave of automation will focus on AI applications. The design and development of AI workflows is at the core of these emerging applications. AI workflows are “dataset centric”, with the characteristics and quality of datasets varying across industries and applications. For example, “big noisy data” is common in the oil and gas industry, as many oil pumps are installed across a wide geographical area. On the other hand, the data generated by imaging technology is “wide clean data” (i.e., fewer records, but with a very high number of attributes). Researchers have introduced purpose-built programming models and pipeline APIs, such as the new CodeFlare Pipelines, that allow end users to construct a “Pipeline Graph” to create an AI workflow for automated model discovery. These programming interfaces support multiple machine learning ecosystems and frameworks, e.g., scikit-learn, Keras, pyearth, and XGBoost, as part of the same pipeline graph definition. Moreover, a Pipeline Graph can specify multiple machine learning tasks such as classification, regression, imputation, time series forecasting, imbalance learning, data sampling, etc. In the race to achieve state-of-the-art results, data scientists construct very large Pipeline Graphs. Typically, the size of a Pipeline Graph varies across applications, across AI tasks, and even across personas. In summary, executing a Pipeline Graph generates a bursty workload, and that execution is also ad hoc.

With serverless platform offerings emerging, for example IBM Cloud Functions and Code Engine, there are new opportunities to build a serverless machine learning toolkit that supports the seamless execution of Pipeline Graphs as well as other common operations that can be scaled out. The on-demand capability of spinning up cloud resources with negligible instantiation overhead is what makes serverless technology attractive for AI workloads. The focus of this workshop is to introduce serverless technology and show how it can be leveraged to build a next-generation reusable serverless machine learning toolkit for various AI applications.

Research topics of the workshop include, but are not limited to, the following:

· AI application demonstration using serverless technology

· Design, development, and API extension of popular ML libraries such as sklearn to natively support serverless execution

· Experimental Analysis of Serverless vs traditional pre-configured

· Workflow manager design

· Scalability and fault tolerance

· Automated ML using serverless

· Feature engineering

· Benchmark papers including emerging technology such as Ray

· Explainability scale out

· Case studies

· On demand data cleaning and repair

· Web-service scaling

· Big Data Search

· Performance Benchmark of serverless platforms

Regular paper submissions must be at most 10 pages long, including all figures, tables, and references. They must be formatted according to the paper submission formatting guidelines provided in the IEEE Big Data 2021 Call for Papers. We also encourage short paper submissions (at most 6 pages) describing new work in progress.

Important dates

· Nov 2, 2021: Due date for full workshop paper submissions (extension requests should be sent to the organizers)

· Nov 15, 2021: Notification of paper acceptance to authors

· Nov 20, 2021: Camera-ready of accepted papers

· Dec 15-18, 2021: Workshops

Program Committee Members

· Dr. David LO, Associate Professor, Singapore Management University, Singapore

· Dr. Vijay Mago, Associate Professor, Lakehead University, Canada

· Dr. Kewen Liao, Associate Professor, Australian Catholic University, Australia

· Dr. Deepak P, Associate Professor, Queen's University Belfast, UK

· Dr. Zeyar Aung, Associate Professor, Khalifa University, Abu Dhabi, UAE

· Dr. Sukanya Manna, Assistant Professor, Santa Clara University, USA

· Dr. Hai Dong, Assistant Professor, RMIT University, Australia

· Dr. Chandresh Maurya, Assistant Professor, Indian Institute of Technology - Indore

· Dr. Rutvij Jhaveri, Assistant Professor, Pandit Deendayal Energy University, India

· Kuheli Sai, Researcher, University of Pittsburgh School of Computing and Information, USA

· Dr. Pankesh Patel, Researcher, University of South Carolina, USA

· Dr. Raghava Mutharaju, Assistant Professor, IIIT - Delhi

· Dr. Sahely Bhadra, Assistant Professor, IIT-Palakkad

· Dr. Lahari Poddar, Researcher, Amazon

· Mr. Sandeep Singh, UCLA, USA

· A few more pending approval

Invited keynote speakers

· Prof. Evgenia Smirni, Sidney P. Chockley Professor of Computer Science, College of William and Mary

Workshop Organizers

Dhaval Patel (pateldha@us.ibm.com)

Carlos Costa (chcost@us.ibm.com)

Shuxin Lin (shuxin.lin@ibm.com)

Paper Submission

BigData-2021 Conference System (wi-lab.com)
