HiPC BigDF 2016

Archived section for HiPC BigDF 2016

International Workshop on Foundations of Big Data Computing (BigDF)

 In conjunction with HiPC 2016

19 December 2016

Hyderabad International Conference Centre, Hyderabad, India



Keynote Speaker: Prof. Geoffrey Fox, Distinguished Professor, School of Informatics and Computing, Indiana University, Bloomington, Indiana, U.S.A. 

Title: Big Data and High Performance Computing: Convergence?



We summarize a community discussion (http://www.exascale.org/bdec/) of the relationship between Big Data and High Performance Computing (HPC), with possible "convergence" on aspects of both. HPC can include both traditional supercomputer simulations and analysis of research data. We can identify issues for hardware, software and, perhaps more important, system requirements. The latter include both application structure, as studied by NIST (http://dsc.soic.indiana.edu/publications/NISTUseCase.pdf), and differences arising from the intrinsically federated nature of research (data), which lacks the "central ownership" seen in commercial systems, which allow common approaches.

At Indiana University (http://dsc.soic.indiana.edu/publications/HPCBigDataConvergence.pdf), the concept of HPC-ABDS software -- High Performance Computing Enhanced Apache Big Data Stack -- is explored, where we try to blend the usability and functionality of the community big data stack with the performance of HPC. Here we examine major Apache programming environments including Spark, Flink, Hadoop, Storm, Heron and Beam. We suggest that parallel and distributed computing often implement similar concepts (such as reduction, communication or dataflow) but that these need to be implemented differently to respect the different performance, fault-tolerance, synchronization, and execution flexibility requirements of parallel and distributed programs. We present early results on the HPC-ABDS strategy of implementing best-practice runtimes for both these computing paradigms in major Apache environments.

Keynote Speaker: Prof. Vijay V. Raghavan, Alfred and Helen Lamson Endowed Professor in Computer Science, University of Louisiana at Lafayette, U.S.A.

Title: A Framework for Real-Time Event Detection for Emergency Situations using Social Media Streams


In this presentation, we propose an approach to aid real-time event detection. Social media generates information about news and events in real time. Given the vast amount of data available and the rate of information propagation, reliably identifying events can be a challenge. Most state-of-the-art techniques are post hoc techniques, which detect an event after it has happened. Our goal is to detect the onset of an event as it is happening, using the user-generated information from Twitter streams. To achieve this goal, we use a discriminative model to identify a sudden change in the pattern of conversations over time. We also use a topic evolution model to identify credible events and propose an approach to eliminate the random noise that is prevalent in many of the existing topic detection approaches. The simplicity of our proposed approach allows us to perform fast and efficient event detection, permitting discovery of events within minutes of the start of the first conversation relating to an event. We also show that this approach is applicable to other social media datasets to detect change over longer periods of time.


We extend the proposed event detection approach to incorporate information from multiple data sources with different velocity and volume. We study the event clusters generated by the event detection approach for changes in events over time. We also propose and evaluate a location detection approach to identify the location of a user or an event based on tweets related to them.



Session 1


Opening Remarks by Workshop Chairs


QoS aware Resource Management for Apache Cassandra 
Yasaswi Kishore, Venkat Datta, KV Subramaniam, Dinkar Sitaram


A High Performance Computing Framework for Data Mining 
Navneet Goyal, Sundar Balasubramaniam, Poonam Goyal, Saiyedul Islam, Mohit Sati


Application of an Asset Bubble Model to Microblog Data Analytics 
K. M. George, Ashwin Kumar Thandapani Kumarasamy


Coffee Break


Session 2


Keynote Address: “A Framework for Real-Time Event Detection for Emergency Situations using Social Media Streams,” by Prof. Vijay V. Raghavan, Alfred and Helen Lamson Endowed Professor in Computer Science, University of Louisiana at Lafayette, U.S.A.


A fast, Apriori based approach to Association Rule Mining in large and growing datasets (short paper)
Atmika Honnalgere, Abhinav Patluri


A Hybrid Recommender System using Weighted Ensemble Similarity Metrics and Digital Filters 
Ramesh Naidu Laveti, Janaki Ch, Supriya N Pal, N. Sarat Chandra Babu


Lunch Break


Session 3


Keynote Address: “Big Data and High Performance Computing: Convergence?” by Prof. Geoffrey Fox, Distinguished Professor, School of Informatics and Computing, Indiana University, Bloomington, Indiana, U.S.A.


High Frequency Trading with Complex Event Processing (short paper)
Ajay Acharya, Nandini Sidnal


Big Data Analytics Architecture for Agro Advisory System 
Purnima Shah, Deepak Hiremath, Sanjay R Chaudhary


Coffee Break


Session 4


#ChennaiFloods: Leveraging Human and Machine Learning for Crisis Mapping during Disasters using Social Media 
Bhuvaneswari Anbalagan, Valliyammai Chinnaiah


Multi-Dimensional Predictive Analytics for Risk Estimation of Extreme Events 
Laks Raghupathi, David Randell, Emma Ross, Kevin C. Ewans, Philip Jonathan


Compression acceleration using GPGPU 
Krishnaprasad Shastry, Avinash Pandey, Ashutosh Agrawal, Ravi Sarveswara





What constitutes a “Big Data” problem? What application domains are best suited to benefit from Big Data analytics and computing? What are the traits and characteristics of an application that make it suited to exploit Big Data analytics? How can Big Data systems and frameworks be designed to allow the integration and analysis of complex data sets? How can research in Big Data Analytics benefit from the latest advances in supercomputing and High Performance Computing (HPC) architectures? The goal of this workshop is to address questions like these that are fundamental to the advancement of Big Data computing, and in the process, build a diverse research community that has a shared vision to advance the state of knowledge and discovery through Big Data computing.

Topics of interest include, but are not limited to, research contributions and innovative methods in the following areas:

  • Scalable tools, techniques and technologies for Big Data analytics (e.g., graph and stream data analysis, machine learning and emerging deep learning methods)
  • Algorithms and Programming Models for Big Data
  • Big Data applications - Challenges and Solutions (e.g., life sciences, health informatics, geoinformatics, climate, socio-cultural dynamics, business analytics, cybersecurity)
  • Scalable Big Data systems, platforms, services, and management
  • Big Data toolkits, workflows, metrics, and provenance

We invite paper submissions that describe original research contributions in the area of Big Data computing, and position papers that highlight the potential challenges and opportunities that arise in Big Data computing. We also invite short papers that describe work-in-progress original research. 

Regular papers can be up to 8 pages long and short papers can be up to 4 pages long. All submissions will undergo rigorous peer review by the technical program committee, and accepted manuscripts will appear in the HiPC workshop proceedings and will be indexed by the IEEE digital library. Authors of accepted manuscripts will be required to present their work at the workshop.

EDAS Submission link: All submissions need to be made through EDAS at http://edas.info/N22849

Authors can submit an abstract prior to submitting the full paper for review. The abstract is not mandatory (i.e., authors can submit a full paper without having submitted an abstract) but is recommended to help organizers plan the review phase in a timely fashion. However, only submissions that include a full paper will be reviewed.

Organizing Committee:

Technical Program Committee:

  • Rekha Singhal, TCS Research
  • Arindam Pal, TCS Research
  • Mahantesh Halappanavar, Pacific Northwest National Laboratory
  • Abhinav Vishnu, Pacific Northwest National Laboratory
  • Yinglong Xia, Huawei Research America
  • Jaroslaw Zola, SUNY Buffalo 
  • Nabanita Das, Indian Statistical Institute
  • Hui Huang, Google
  • Marlon Pierce, Indiana University
  • Devesh Tiwari, Oak Ridge National Laboratory
  • Sharma Thankachan, Georgia Institute of Technology/Univ. Central Florida
  • Suresh Marru, Indiana University

Schedule and Review Process:

 (optional) Abstract Deadline: August 12, 2016 (extended from July 24, 2016)
 Paper Submission Deadline: August 12, 2016 (extended from August 1, 2016)
 Author Notification: September 19, 2016 (originally September 15, 2016)
 Camera-ready Paper Deadline: October 3, 2016
 Author Registration Deadline: October 16, 2016   
 Final Schedule Available: November 30, 2016
 BigDF 2016 Workshop:  December 19, 2016