The First International Workshop on
Big Data eXploration, Compression and Systems (BDXCS 2026)
Afternoon (13:30 - 16:30) on Monday, January 26th, 2026
The First International Workshop on Big Data eXploration, Compression and Systems (BDXCS) is held in conjunction with SCA/HPCAsia 2026 and focuses on big data, data compression, and their associated systems.
SCA/HPCAsia 2026: https://www.sca-hpcasia2026.jp/
Call for Papers
Objectives
In addition to traditional applications, the rise of AI and cloud computing has significantly increased the volume of data processing and communication required in high-performance computing (HPC).
Efficient data analytics and data movement across distributed and parallel environments (e.g., the Internet, inter-node networks, and system interconnects) have become critical factors in determining the performance and energy efficiency of supercomputers, data centers, and cloud platforms.
This workshop aims to address key research challenges related to big data from multiple perspectives, including data exploration, data compression, and big data systems.
To tackle these challenges, the workshop will explore practical and effective approaches to data analytics and mining, big data visualization, data integration, scalable data compression, and storage and processing systems for big data.
These investigations will consider both the characteristics of large-scale data workloads and the constraints of modern hardware architectures.
In particular, the workshop will emphasize optimization strategies for big data processing, adaptive and general-purpose compression techniques, and high-performance systems designed for high-throughput, low-latency, and hardware-efficient data operations.
Scopes
Big Data Exploration
Data Analytics and Mining: Statistical and Machine Learning Methods for Big Data, Graph Analytics and Network Mining, Pattern Recognition and Anomaly Detection, Time Series and Spatial Data Analysis
Interactive Visualization and Visual Analytics: Real-time and Interactive Visualization Techniques, Scalable Visualization Algorithms, Exploratory Data Analysis, Visualization of Complex Data Structures
Data Integration and Fusion: Multi-source and Multi-modal Data Integration, Schema Matching and Data Harmonization, Semantic Web and Knowledge Graphs, Data Fusion Techniques for Heterogeneous Data
Big Data Compression
Lossless and Lossy Data Compression: Compression Techniques for Structured and Unstructured Scientific Data, Multimedia Data Compression, Time-series Data Compression, Textual Data Compression
Compression Algorithms and Techniques: Quantization, Predictive Coding, Transform-based Compression, Dictionary/Entropy-based Compression, Tensor Decomposition and Low-rank Approximations
Compression and Analytics Integration: Compression-aware Data Mining and Machine Learning, Performance Evaluation of Applied Compression, Analysis of Power Consumption Associated with Compression
Compression/Reduction-conscious Architecture: Offloading Data Compression/Reduction to the Network, Data Reduction in Smart NICs, Adaptive Compression with Dedicated Hardware, Online Data Compression Methods
Big Data Systems
Scalable and Distributed Systems: Distributed Storage Systems (e.g., Hadoop, HDFS, Ceph), Distributed and Parallel Computing Frameworks (e.g., Spark, Flink, MPI), Cloud and Edge Computing Platforms for Big Data
Performance and Optimization: Big Data-based Resource Scheduling and Load Balancing, Hardware-accelerated Big Data Processing (GPU, FPGA), Energy-efficient and Cost-efficient Big Data Processing
Reliability, Privacy, and Security: Fault Tolerance and Reliability in Big Data Systems, Data Security and Encryption in Large-scale Storage, Privacy-preserving Analytics and Differential Privacy
Architecture and Middleware: Big Data Workflow Management, Middleware Systems for Data-intensive Computing, Containers and Virtualization for Big Data Applications, Custom Architecture for Big Data Systems
Keynotes
Speaker: Takaki Hatsui, RIKEN SPring-8 Center
Title: TBA
Speaker: Xiaoyi Lu, University of Florida
Dr. Xiaoyi Lu is a tenured Associate Professor in the Department of Electrical and Computer Engineering at the University of Florida, where he leads the Parallel and Distributed Systems Laboratory (PADSYS Lab). His research focuses on parallel and distributed computing; scalable and efficient systems across diverse computing paradigms, including HPC, big data, AI, cloud, and edge computing; high-performance communication and I/O technologies (e.g., RDMA, NVMe, GPUs, and DPUs); scalable algorithms and applications; and interdisciplinary computing for social good. Dr. Lu has published over 180 papers in prestigious conferences and journals and has received ten Best Paper or Best Student Paper Awards or Nominations, including those at SC 2019 and IPDPS 2024. He has delivered over 100 invited talks, tutorials, and presentations worldwide and is deeply engaged in academic service and community leadership. His research outputs, including OpenDOTA, SR-APPFL, PMIdioBench, HiBD, MVAPICH2-Virt, and DataMPI, have made a broad impact across both industry and academia. Dr. Lu has received notable honors, including the NSF CAREER Award and research awards from Amazon, Google, and Meta/Facebook.
Title: Heterogeneity-Enriched Communication: Rethinking Data Movement for Data-Intensive AI
Modern AI workloads are increasingly data-intensive, and their scalability depends on how effectively computation and data movement are coordinated across heterogeneous systems. As datasets and model sizes continue to grow, the efficiency of data handling becomes increasingly important. This talk introduces Heterogeneity-Enriched Communication (HEC) as a new perspective for understanding and optimizing data movement in large-scale, data-driven AI. I will discuss how HEC principles inspire new opportunities in data compression and communication-efficient system designs that leverage modern CPUs, GPUs, DPUs, and emerging accelerators. Through recent studies and examples, I will illustrate how rethinking data movement alongside computation can improve the scalability and efficiency of distributed, data-intensive AI workloads and can provide a path toward the next generation of high-performance big data computing systems.
Important Dates
Paper submission deadline: November 11, 2025 (AoE, Firm); extended from October 21, 2025 and November 4, 2025 (AoE)
Notification of acceptance: November 19, 2025
Camera-ready paper deadline: December 15, 2025
Paper Submission
Please submit your paper here
Accepted papers will be published together with SCA/HPCAsia 2026 proceedings.
The paper format and page limits are the same as for SCA/HPCAsia 2026. For details, please see the following page.
https://www.sca-hpcasia2026.jp/submit/papers.html
For this workshop, we will accept both single- and double-column formats.
The page limit is 18 pages for single-column submissions and 10 pages for double-column submissions (both limits include all content, such as references, figures, and tables).
Location
Registration
Please refer to the SCA/HPCAsia 2026 registration page: https://www.sca-hpcasia2026.jp/