EDGELAB
Group Leader
Subhajit Sidhanta, Assistant Professor, CSE, IIT Bhilai
Team members
Kolichala Rajsekhar, Ph.D student, CSE, IIT Bhilai
Sachin Kumar, Project Associate, IIT Bhilai
Gowry Sailaja, PhD student, DSAI, IIT Bhilai
Deepali Thombre, PhD student, DSAI, IIT Bhilai
Sivananda, PhD student (Part-time), CSE, IIT Bhilai
Chaitanya Bisht, B.Tech Honours, CSE, IIT Bhilai
Ananya Hooda, B.Tech Honours, CSE, IIT Bhilai
Vamsi Krishna, BTech, DSAI, IIT Bhilai
Chaitanya Sai, BTech, DSAI, IIT Bhilai
Hemanth Gaddey, BTech, EE, IIT Bhilai
Vaibhabh Arora, BTech, CSE, IIT Bhilai
Past Members:
Shashwat Jaiswal, BTech Hons student, EECS, IIT Bhilai; now pursuing a Ph.D. at UIUC
Rohit Das, M.Tech student, EECS, IIT Bhilai; now with RadiSys
Saptarshi Mukherjee, BTech Hons student, EECS, IIT Bhilai; now with Google India
Sreechakra Muttareddygar, M.Tech student, IIT Bhilai; now with Deloitte
Ashutosh Garg, B.Tech Hons student, IIT Bhilai; now with Amazon India
Chitraksh Sadayat, BTech student, IIT Jodhpur
Ajat Prabha, BTech student, IIT Jodhpur
Collaborators:
Yogesh Simmhan, Associate Professor, Indian Institute of Science
Supratik Mukhopadhyay, Professor, Louisiana State University
Yimin Zhu, Professor, Louisiana State University
Sushanta Karmakar, Associate Professor, IIT Guwahati
Open Positions
Multiple internship positions are open in Distributed Systems, Federated Learning, and Reinforcement Learning. We will begin seeking applicants soon. Email subhajit AT iitbhilai DOT ac DOT in.
Ph.D positions available. Please apply through formal channels.
Projects:
Reinforcement Learning for Performance Optimization in Edge/Fog/Cloud Computing
The explosive growth of IoT devices makes the cloud computing paradigm unsuitable for real-time applications. Edge computing instead offers relatively low response times by processing data close to its source. However, the limited resources of edge servers make the edge computing paradigm more challenging, and the mobility of IoT devices and edge servers makes the problem harder still. In this project, we study the mobility-aware scenario of drone deployment in a disaster area. In such scenarios, we tackle problems such as the optimal deployment of a swarm of drones acting as network access points, the optimal assignment of IoT data sources to edge devices in an edge computing cluster, and minimizing churn in an edge cluster comprising mobile edge devices. To that end, we develop reinforcement learning-based solutions.
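As a minimal illustration of the reinforcement-learning angle, the sketch below uses tabular Q-learning to learn an assignment of IoT data sources to edge servers. The latency matrix, reward shaping, and hyperparameters are illustrative assumptions, not the lab's actual formulation.

```python
import random

random.seed(0)

N_SOURCES, N_EDGES = 4, 3
# Hypothetical per-(source, edge) latencies in ms; lower is better.
LATENCY = [[20, 5, 30], [8, 25, 12], [30, 18, 6], [10, 9, 40]]

ALPHA, GAMMA, EPS = 0.5, 0.0, 0.2   # one-step assignment, so GAMMA = 0
Q = [[0.0] * N_EDGES for _ in range(N_SOURCES)]

def choose(src):
    if random.random() < EPS:                            # explore
        return random.randrange(N_EDGES)
    return max(range(N_EDGES), key=lambda e: Q[src][e])  # exploit

for episode in range(2000):
    src = random.randrange(N_SOURCES)
    edge = choose(src)
    reward = -LATENCY[src][edge]                     # minimise latency
    Q[src][edge] += ALPHA * (reward - Q[src][edge])  # one-step Q update

# Greedy policy after training: each source maps to its lowest-latency edge.
policy = [max(range(N_EDGES), key=lambda e: Q[s][e]) for s in range(N_SOURCES)]
print(policy)  # → [1, 0, 2, 1]
```

The real problem adds mobility and churn, so the state would include device positions and cluster membership rather than a static source index.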
Federated Learning: Performance and Accuracy Tradeoff
Owing to the large number of clients, synchronous FL imposes considerable waiting time at the parameter server, which must wait until all (or all sampled) clients are ready. This wait also introduces staleness into the model, because not all devices are able to participate in every round. Asynchronous Federated Learning (AFL) addresses this: the parameter server does not wait for the clients and aggregates each update as soon as it arrives. FedAvg can cope with non-independent and identically distributed (non-IID) data to a certain degree, but much research indicates that a deterioration in FL accuracy is almost inevitable on non-IID or heterogeneous data. We are developing a library/package that enables developers and users to build FL applications providing different degrees of accuracy and performance guarantees while operating asynchronously on non-IID data.
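To make the asynchronous aggregation step concrete, here is a minimal sketch of a staleness-weighted server update: each client update is mixed into the global model immediately, but updates computed against an old global model are down-weighted. The decay function and weights are illustrative assumptions, not the library's actual rule.

```python
def staleness_weight(staleness, a=0.5):
    """Down-weight updates computed against an old global model."""
    return 1.0 / (1.0 + a * staleness)

def apply_update(global_model, client_model, client_round, server_round, lr=1.0):
    """Mix one client's model into the global model immediately,
    without waiting for other clients (asynchronous FedAvg-style step)."""
    w = lr * staleness_weight(server_round - client_round)
    return [(1 - w) * g + w * c for g, c in zip(global_model, client_model)]

# Toy one-parameter "models": a fresh update moves the model fully,
# a stale one (staleness 4) only by weight 1/3.
model = [0.0]
model = apply_update(model, [1.0], client_round=10, server_round=10)
fresh = model[0]                                               # 1.0
model = apply_update(model, [2.0], client_round=6, server_round=10)
print(fresh, round(model[0], 3))                               # 1.0 1.333
```

A real AFL server would also track per-client data skew, since staleness weighting alone does not resolve the non-IID accuracy gap.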
Automated Tuning of Distributed NoSQL Databases
Users and developers who employ distributed NoSQL databases for storing data, especially replicated databases, are burdened with choosing a suitable client-centric tuning of the database configuration for each storage operation. This tuning is difficult to reason about, as it requires weighing the tradeoff between performance metrics and the consistency of the data. The performance and consistency of a given operation depend on the client-centric database setting applied, as well as on dynamic parameters such as the current workload and network conditions. We are developing a reinforcement learning-based predictive toolchain that automates the choice of database setting under user-specified thresholds given in the service level agreement (SLA).
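The per-operation decision can be sketched as follows: given a latency threshold from the SLA, pick the strongest consistency level whose predicted latency still fits. The Cassandra-style level names and the stub latency model are assumptions for illustration; the actual toolchain learns the predictions from observed workload and network conditions.

```python
# Consistency levels ordered strongest → weakest (Cassandra-style names).
LEVELS = ["ALL", "QUORUM", "ONE"]

def predicted_latency_ms(level, load):
    """Hypothetical latency model: stronger consistency and higher load
    both increase the expected operation latency."""
    base = {"ALL": 40.0, "QUORUM": 18.0, "ONE": 6.0}[level]
    return base * (1.0 + load)

def choose_level(sla_ms, load):
    """Return the strongest consistency level predicted to meet the SLA,
    falling back to the weakest level if none does."""
    for level in LEVELS:
        if predicted_latency_ms(level, load) <= sla_ms:
            return level
    return LEVELS[-1]

print(choose_level(sla_ms=50.0, load=0.2))  # → ALL    (48.0 ms fits)
print(choose_level(sla_ms=25.0, load=0.2))  # → QUORUM (21.6 ms fits)
print(choose_level(sla_ms=5.0, load=0.2))   # → ONE    (nothing fits; weakest)
```

Replacing the stub predictor with a learned model is what makes the choice adaptive to dynamic conditions rather than a static lookup.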
Power, Temperature, and Performance Tradeoff in Edge Computing and IoT
With the recent advances in IoT and cyber-physical systems, the computing tasks involved in such monitoring activities are performed on edge devices, such as a Jetson Nano or Raspberry Pi, onboard the drone. The temperature of the monitored environment can become so high that it results in violations of application-level correctness as well as substantial degradation of observed performance for computing tasks on the onboard devices. We propose to develop optimal path-planning algorithms that can operate under very high temperatures in the vicinity of bushfires and forest fires. We are working towards models of safe temperature ranges for edge computing workloads run on exposed systems such as the Jetson Nano and Raspberry Pi. For example, we are developing a library that models the performance-temperature tradeoff and predicts the expected performance of a particular workload run on the device at a given temperature.
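As a toy version of such a performance-temperature model, the sketch below fits a linear trend to (temperature, throughput) samples and predicts the expected throughput at a target temperature. The sample data are made up; the lab's library would calibrate against real Jetson Nano / Raspberry Pi runs, where the relationship is generally nonlinear once thermal throttling kicks in.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

# Hypothetical measurements: throughput (ops/s) falls as the ambient
# temperature (°C) rises on the exposed onboard device.
temps = [25, 35, 45, 55, 65]
tput = [1000, 950, 900, 850, 800]

a, b = fit_line(temps, tput)
print(round(a + b * 50))  # predicted throughput at 50 °C → 875
```

The same fitted model can be inverted to answer the planning question: given a workload's minimum acceptable throughput, what is the highest temperature at which it can safely run?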
Edge Resource Management: Migration and Auto-scaling
Mission-critical edge analytics applications carry stringent Service Level Agreement (SLA) guarantees: the latency recorded with benchmark workloads must satisfy the SLA deadlines, and the application must always be available. Such deadline-aware application design is only possible with a benchmark workload generator that produces dynamically varying workloads. To meet the SLA guarantees imposed by the services hosted in a given edge cluster, efficient load-balancing mechanisms need to be in place. Moreover, load balancing becomes harder when we consider that edge nodes are intrinsically mobile, failure-prone, and work with intermittent network connectivity. In that case, these mechanisms must adapt to conditions such as edge devices leaving and joining the cluster, or existing devices changing their location. This mandates adequate migration techniques in lightweight edge clusters to handle node failures, overloading, and reconfiguration while remaining on par with the quality of cloud services.
Single-board computers like the Raspberry Pi usually come with a quad-core processor and at most 4 GB of RAM. With the advent of data analytics, these resources are too limited for a single node to handle vast amounts of data. So we face the following problems:
To enable fast processing of a huge incoming data stream, either the rate of data fed into the cluster must be regulated, or the stream must be parallelized across multiple nodes.
Some nodes will be overburdened by resource-hungry processes, such as applications involving deep learning or object recognition on a live video feed, so efficient load-balancing algorithms are essential.
In case of node failure, the containers running on the failed node must be proactively migrated to a standby node, and the new node must be up and running with as little downtime as possible.
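The failover step above can be sketched as a simple greedy re-placement: when a node fails, move each of its containers onto the live node with the most free memory. Node names, capacities, and the greedy policy are illustrative assumptions, not the lab's actual scheduler.

```python
def migrate(nodes, failed, containers):
    """Greedily re-place each container from the failed node onto the
    live node with the most remaining free memory (in MB)."""
    free = {n: cap for n, cap in nodes.items() if n != failed}
    placement = {}
    # Place the largest containers first to reduce fragmentation.
    for name, mem in sorted(containers.items(), key=lambda kv: -kv[1]):
        target = max(free, key=free.get)
        if free[target] < mem:
            raise RuntimeError(f"no live node can host {name}")
        placement[name] = target
        free[target] -= mem
    return placement

nodes = {"pi-1": 1024, "pi-2": 2048, "pi-3": 1536}     # free MB per node
containers = {"video-infer": 900, "mqtt-broker": 200}  # ran on the failed node
print(migrate(nodes, failed="pi-1", containers=containers))
# → {'video-infer': 'pi-2', 'mqtt-broker': 'pi-3'}
```

A production scheduler would also weigh CPU load, network locality, and predicted node mobility, and would pre-copy container state to the standby node so the switchover itself is near-instant.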
CONTACT
Dr. Subhajit Sidhanta
subhajit[AT]iitbhilai[DOT]ac[DOT]in
Room No 404B, ED1, IIT Bhilai, Kutelabhata, Durg 491001