The 4th International Workshop on Big Data Tools, Methods, and Use Cases for Innovative Scientific Discovery (BTSD) 2022
Call for Papers
Program Chairs
Sangkeun (Matt) Lee
Jong Youl Choi
Anika Tabassum
Organizers’ Background
Sangkeun (Matt) Lee received his Ph.D. degree in computer science and engineering from Seoul National University in 2012. He is currently an R&D Associate in the Computer Science and Mathematics Division at Oak Ridge National Laboratory. He studies big data, data science, and machine learning, and has applied state-of-the-art data analysis technologies in many application domains. He has developed a number of data analytics software packages, one of which, ORiGAMI, won the 2016 DOE R&D 100 Award. He has contributed to many leading computer science conferences and journals, such as ACM WWW, ACM RecSys, and Expert Systems with Applications. For the last few years, he has collaborated with scientists across various domains, including materials science, nuclear science, and mechanical engineering, and has published papers in scientific journals such as Journal of Nuclear Materials, Acta Materialia, The Electricity Journal, and Advanced Theory and Simulations.
Jong Youl (Jong) Choi is a researcher in the Discrete Algorithms Group, Computer Science and Mathematics Division, Oak Ridge National Laboratory (ORNL), Oak Ridge, Tennessee, USA. He earned his Ph.D. degree in Computer Science from Indiana University Bloomington in 2012 and his MS degree in Computer Science from New York University in 2004. His research interests span data mining and machine learning algorithms, high-performance data-intensive computing, and parallel and distributed systems. More specifically, he focuses on researching and developing data-centric machine learning algorithms for large-scale data management, in situ/in-transit data processing, and data management for code coupling. Jong Choi actively serves on conference committees and as a journal reviewer for venues such as ParaMo, CCPE, and CLUS.
Anika Tabassum is currently a postdoctoral researcher at Oak Ridge National Laboratory, where she contributes to deep learning for multi-scale and multimodal battery analytics and plasma simulation for fusion energy. Her research focuses on developing deep learning models for robust scientific computing; specifically, she works on knowledge-guided ML and scientific ML. She received her Ph.D. from the Department of Computer Science at Virginia Tech, where she worked on applying knowledge-guided ML to multiple challenges in power system failures and clean energy. Her Ph.D. research was funded by an NSF Urban Computing fellowship. Apart from her primary research focus, she also worked on designing a COVID-19 forecasting model for the CDC challenge. She has published in multiple venues such as ACM SIGKDD, AAAI, CIKM, IEEE BigData, and IAAI, and in journals like ACM TIST and Elsevier. She received her bachelor's degree in Computer Science and Engineering from the Bangladesh University of Engineering and Technology.
Introduction to Workshop
Advances in big data technology, artificial intelligence, and machine learning have produced many success stories in a wide range of areas, especially in industry. These success stories have motivated scientists in physics, chemistry, materials, medicine, and many other fields to explore new ways of using big data tools in their scientific work.
However, there are barriers to overcome. Most existing big data tools, systems, and methodologies were developed without considering scientific purposes or scientists’ specific requirements. They were not designed for scientists who have little or no knowledge of programming or computer science. Conversely, computer scientists often find it very challenging to understand the domain problem because they lack sufficient background knowledge.
We expect that big data technologies can contribute greatly to scientific innovation in many ways. Many ongoing scientific projects around the world already aim to discover novel hypotheses, analyze big multidimensional data that cannot be handled manually, and reduce the time required for complex calculations by using machines. This workshop intends to bring domain scientists and computer scientists together to explore and extend opportunities in the development of big data tools, systems, and methodologies for scientific discovery, to share success stories and lessons learned, and to discuss challenges that, if overcome, would enable successful collaboration across different domains, especially between domain scientists and computer/data scientists.
In this workshop, we discuss the following questions:
What makes big data tools for scientists different from the existing tools?
What specific needs and challenges do domain scientists face when they try to adopt big data tools?
How can computer scientists and domain scientists communicate to define a feasible problem together?
What are the barriers to using big data for scientific discovery, and how do these barriers differ across scientific domains?
Workshop History
The International Workshop on Big Data Tools, Methods, and Use Cases for Innovative Scientific Discovery (BTSD) was first held in December 2019 in conjunction with the IEEE Big Data 2019 conference, organized by Matt Lee and Travis Johnston. A total of 12 papers were accepted. It was a great start toward building a strong scientific collaboration community. The second BTSD workshop was held in December 2020 as a virtual workshop in conjunction with IEEE Big Data 2020. A total of 11 papers were accepted and presented. The third BTSD workshop was held in December 2021 as a virtual workshop in conjunction with IEEE Big Data 2021. A total of 9 papers were accepted and presented. It was a great opportunity to communicate and to learn from experiences across many scientific domains.
Research Topics Included in the Workshop
Big data tools, systems, and methods related to, but not limited to:
Scientific data processing
Artificial intelligence/Deep neural networks/Machine learning
Text mining/Graph mining
Database/Query processing/Query Optimization
Parallel computation/High Performance Computing
Visualization/User Interface/HCI
Parallelization/Performance/Scalability
…
that facilitate innovation and discovery in a scientific domain, such as:
Physics
Chemistry
Materials science
Mechanical engineering
Nuclear engineering
Biomedical science …
Use cases, success stories, lessons learned in scientific discovery using big data tools, systems, and methods
Program Committee Members
Youngjae Kim, Sogang University, South Korea
Feng Bao, Florida State University, USA
Supriya Chinthavali, Oak Ridge National Laboratory, USA
Guimu Guo, Rowan University, USA
Ramakrishnan Kannan, Oak Ridge National Laboratory, USA
Seungha Shin, University of Tennessee, USA
Pei Zhang, Oak Ridge National Laboratory, USA
Ivy Peng, Lawrence Livermore National Laboratory, USA
Ralph Kube, Princeton Plasma Physics Laboratory, USA
Ohyung Kwon, Korea Institute of Industrial Technology, South Korea
Nikhil Muralidhar, Stevens Institute of Technology, USA
Gopinath Chennupati, Amazon Alexa, USA
Paper Submission
Please submit a short paper (minimum 4 pages, up to 6 pages, in IEEE 2-column format) or a full paper (minimum 8 pages, up to 10 pages, in IEEE 2-column format) through the online submission system.
Papers should be formatted according to the IEEE Computer Society Proceedings Manuscript Formatting Guidelines (see the link to "formatting instructions" below).
Formatting Instructions
Important Dates
Abstract Submission: Oct 1, 2022. Please submit your abstract through the online submission system (you can update your submission with your full paper later).
Authors are strongly encouraged to submit abstracts of their papers by the deadline to indicate their intention to submit.
Abstract submission is not mandatory, but please consider submitting your abstract so that the workshop can be well organized.
Due date for full workshop papers submission: Closed
Nov 4 (previously Nov 1), 2022: Notification of paper acceptance to authors. The notification date may be postponed due to the submission deadline extension.
Nov 27, 2022: Camera-ready versions of accepted papers due
Presentation Preparation
The author registration deadline is Nov 27, 2022
The final camera-ready paper submission deadline is Nov 27, 2022. Please do not miss the deadline; otherwise, your paper will not be published in the conference proceedings. Please use the following URL for camera-ready paper submission:
https://wi-lab.com/cyberchair/2022/bigdata22/index.php
Authors of all accepted papers must submit a pre-recorded video of their paper presentation and upload it to the conference virtual platform so that online participants can access all the presentations of the conference. Please follow the instructions at the URL below to prepare your video recording and upload it to the virtual platform on or before November 28, 2022. The submission site will close on Nov. 29. For a workshop paper, your video should be less than 20 minutes.
https://docs.google.com/document/d/1y-sU5aqc_-6nhjNDkKfgVdo0p9ft-wJukmobraSVGo8/edit?usp=sharing
ALL pre-recorded presentations must be uploaded on or before 28 November 2022
The conference schedule will be announced around mid-November. If you register for the virtual conference, we will email you the login instructions for the virtual platform a few days before the event. Everyone who wants to attend the conference, whether in person or online, must register accordingly.
A detailed workshop schedule will be announced in early December.
Registration
https://web.cvent.com/event/346deae6-ccea-449f-82b5-cdd40f84f74d/summary
Author Deadline: November 27, 2022
Workshop Primary Contact
Sangkeun (Matt) Lee <lees4@ornl.gov>, Jong Youl Choi <choij@ornl.gov>, Anika Tabassum <tabassuma@ornl.gov>
Workshop Schedule
Please prepare a 15-minute presentation for a short paper or a 20-minute presentation for a long paper.
A live presentation is required, even if you have uploaded a pre-recorded presentation video.
Please join 20 minutes before your presentation (the schedule may be flexible).
Online Zoom link:
https://us06web.zoom.us/j/89813849809?pwd=SUlJNW5Ld3pLeVhtTkhuNmFMc2FzZz09
For more details, log in through the BigData22 website with your registration information: https://events.rdmobile.com/Events/Details/15890
Thanks for your participation.