The 3rd International Workshop on Big Data Tools, Methods, and Use Cases for Innovative Scientific Discovery (BTSD) 2021

in conjunction with 2021 IEEE International Conference on Big Data (IEEE BigData 2021)

December 15-18, 2021 @ Taking place virtually

Workshop Date/Time: 12/15/2021 8:50am-

Call for Papers

Program Chairs

  • Sangkeun (Matt) Lee

  • Pravallika Devineni

  • Jong Youl Choi

Organizers’ Background

  • Sangkeun (Matt) Lee received his Ph.D. degree in computer science and engineering from Seoul National University in 2012. He is currently an R&D Associate in the Computer Science and Mathematics Division at Oak Ridge National Laboratory. He studies big data, data science, and machine learning, and has applied state-of-the-art data analysis technologies in many application domains. He has developed a number of data analytics software packages, one of which, ORiGAMI, won the 2016 DOE R&D 100 Award. He has contributed to many leading computer science conferences and journals, such as ACM WWW, ACM RecSys, and Expert Systems with Applications. For the last few years, he has collaborated with scientists across various domains, including material science, nuclear science, and mechanical engineering, and has published papers in scientific journals such as Journal of Nuclear Materials, Acta Materialia, The Electricity Journal, and Advanced Theory and Simulations.

  • Pravallika (Pravi) Devineni is a Research Scientist in the Computing Directorate at Oak Ridge National Laboratory, TN, USA. She received her Ph.D. from the University of California, Riverside in 2018, where her dissertation focused on mining patterns and anomalies in dynamic graph networks. Her research interests include tensor decompositions for machine learning and data science, explainable AI using HPC, large-scale network mining, natural language processing and its applications, and applying computing techniques across a variety of scientific domains. Pravi actively serves on conference and journal committees such as IJCAI, KDD, PAKDD, WSDM, and IEEE TMC. Pravi has a passion for advocating for women in tech. She is an organizing committee member for Women in High Performance Computing (WHPC) and is the co-chair of the AI track for vGHC 2021.

  • Jong Youl (Jong) Choi is a researcher in the Discrete Algorithms Group, Computer Science and Mathematics Division, Oak Ridge National Laboratory (ORNL), Oak Ridge, Tennessee, USA. He earned his Ph.D. degree in Computer Science at Indiana University Bloomington in 2012 and his MS degree in Computer Science from New York University in 2004. His research interests span data mining and machine learning algorithms, high-performance data-intensive computing, and parallel and distributed systems. More specifically, he focuses on researching and developing data-centric machine learning algorithms for large-scale data management, in situ/in transit data processing, and data management for code coupling. Jong Choi actively serves on conference committees and as a journal reviewer for venues such as ParaMo, CCPE, and CLUS.

Introduction to Workshop

  • Advances in big data technology, artificial intelligence, and machine learning have produced many success stories in a wide range of areas, especially in industry. These success stories have motivated scientists in physics, chemistry, materials, medicine, and many other fields to explore new ways of using big data tools in their scientific work.

  • However, there are barriers to overcome. Most existing big data tools, systems, and methodologies have been developed without considering scientific purposes or scientists’ specific requirements. They were not originally developed for scientists who have little or no knowledge of programming or computer science. Computer scientists, on the other hand, often find it very challenging to understand the domain problem because they lack sufficient background knowledge.

  • We expect that big data technologies can contribute greatly to scientific innovation in many ways. Many ongoing scientific projects around the world already aim to discover novel hypotheses, analyze big multidimensional data that could not be handled manually, and use machines to reduce the time required by complex calculations. This workshop intends to bring domain scientists and computer scientists together to explore and extend opportunities in the development of big data tools, systems, and methodologies for scientific discovery, to share success stories and lessons learned, and to discuss challenges that, if overcome, would enable successful collaboration across different domains, especially between domain scientists and computer/data scientists.

  • In this workshop, we discuss the following questions:

    • What makes big data tools for scientists different from the existing tools?

    • What specific needs and challenges do domain scientists face when they try to adopt big data tools?

    • How can computer scientists and domain scientists communicate to define a feasible problem together?

    • What are the barriers to using big data for scientific discovery, and how do these barriers differ across science domains?

Workshop History

The international workshop on Big Data Tools, Methods, and Use Cases for Innovative Scientific Discovery (BTSD) was first held in December 2019 in conjunction with the IEEE Big Data 2019 conference, organized by Matt Lee and Travis Johnston. A total of 26 submissions were received, and 12 papers were accepted. It was a great start toward building a strong scientific collaboration community. The second BTSD workshop was held in December 2020 as a virtual workshop. A total of 26 submissions were received, and 11 papers were accepted and presented. It was a great opportunity to communicate and to learn from experiences across many scientific domains.

Research Topics Included in the Workshop

  • Big data tools, systems, and methods related to, but not limited to:

    • Scientific data processing

    • Artificial intelligence/Deep neural networks/Machine learning

    • Text mining/Graph mining

    • Database/Query processing/Query Optimization

    • Parallel computation/High Performance Computing

    • Visualization/User Interface/HCI

    • Parallelization/Performance/Scalability

    • High Performance Computing …

  • that facilitate innovation and discovery in a scientific domain, such as:

    • Physics

    • Chemistry

    • Material science

    • Mechanical engineering

    • Nuclear engineering

    • Biomedical science …

  • Use cases, success stories, and lessons learned in scientific discovery using big data tools, systems, and methods

Program Committee Members

  • Ramakrishnan Kannan, Oak Ridge National Laboratory, kannanr@ornl.gov

  • Yan Da, University of Alabama Birmingham, yanda@uab.edu

  • Seungha Shin, University of Tennessee, sshin@utk.edu

  • Feng Bao, Florida State University, fbao@fsu.edu

  • Youngjae Kim, SOGANG University, Seoul, Republic of Korea, youkim@sogang.ac.kr

  • Supriya Chinthavali, Oak Ridge National Laboratory

  • Michael Churchill, Princeton Plasma Physics Laboratory

  • Pei Zhang, Oak Ridge National Laboratory

  • Ivy Peng, Lawrence Livermore National Laboratory

  • Shaden Smith, Microsoft

  • Priyanka Ghosh, Pacific Northwest National Laboratory

  • Christine Klymko, Lawrence Livermore National Laboratory

  • Gopinath Chennupati, Los Alamos National Laboratory

  • Ralph Kube, Princeton Plasma Physics Laboratory

  • Anika Tabassum, Oak Ridge National Laboratory

Paper Submission

Please submit a short paper (up to 4 pages in IEEE 2-column format) or a full paper (up to 8 pages in IEEE 2-column format) through the online submission system.

https://wi-lab.com/cyberchair/2021/bigdata21/scripts/submit.php?subarea=S30&undisplay_detail=1&wh=/cyberchair/2021/bigdata21/scripts/ws_submit.php

Papers should be formatted according to the IEEE Computer Society Proceedings Manuscript Formatting Guidelines (see link to "formatting instructions" below).

Formatting Instructions

8.5" x 11" (DOC, PDF)

LaTeX Formatting Macros

Important Dates

  • Deadline extended (FINAL EXTENSION): Oct 24, 2021 (previously Oct 17, 2021): Due date for full workshop paper submission

  • As per many requests, we opened the submission site again, but this will be the final extension.

  • Notice: If you intend to submit your paper by the deadline, please submit a document with an abstract or a work-in-progress document as soon as you can. (You can update the submitted document anytime before the deadline.) This is not mandated, but it allows us to estimate the number of submitted papers. Thank you!

  • Nov 10, 2021 (previously Nov 8, 2021): Notification of paper acceptance to authors. We are finalizing the reviews; please stay tuned. Sorry for the delay.

  • Nov 21, 2021 (new deadline): Camera-ready versions of accepted papers

Presentation Preparation

Registration

Please see the registration information and register by 11/21:

https://web.cvent.com/event/122a6d98-2b39-4b98-9536-6680548b454f/summary

Thanks for your participation.

Workshop Primary Contact

  • Sangkeun (Matt) Lee, Computational Data Analytics Group, Computer Science and Mathematics Division, Oak Ridge National Laboratory, TN, USA. Tel: +1 865 574 8858 Email: lees4@ornl.gov

Agenda

Date: December 15, 2021 (Eastern Standard Time)

Join Link

Please use the following link to join:

https://underline.io/events/222/sessions?eventSessionId=9599

Opening Remarks

BTSD_opening_2021.pdf