Since the invention of the Turing machine and computability theory, the design and analysis of computer algorithms have played intrinsic roles in efficiently solving computational problems. Scheduling is one of such computational problems, which has been a part of human life. The concept of scheduling has emerged from the scenario, where a number of tasks are given with an objective and is asked to achieve the objective efficiently. Every day a layman schedules at least one of the following activities such as daily routine, meetings, assignments, workloads, academic curriculum, technical events, cultural shows, sports, transportation of vehicles, patient appointments, treatment mechanisms, production, and manufacturing operations. When all tasks to be scheduled are known from the beginning, then the scheduling is known as "Offline scheduling" and it is a bit easier to develop offline scheduling algorithms in polynomial time. However, when all tasks are not given at the outset, it is a non-trivial challenge to design efficient scheduling algorithms. Modern world applications offer tasks one by one in order and ask to schedule each task irrevocably before releasing the next task. Such scheduling is termed “Online Scheduling”. Online scheduling finds extensive applications in emerging areas such as High-Performance Computing (HPC), Cloud Computing, Intelligent Computing, Routing, and has been a well-studied research problem in the literature. It is important to understand the online scheduling problem in a theoretical framework to identify all its variants. Efficient solutions to the problem seek an amalgamation of methods from Computer Science and Optimization Theory with novel ideas and formalizations. In the current literature, less attention has been paid to the online scheduling of parallel submissions, where tasks from multiple sources are given online and are asked to optimize the objective of each of the sources fairly. The development of a model to formally capture the problem, the design of an online scheduling algorithm, and a quantitative fairness measure are the exciting research challenges, which I am trying to address in my ongoing works.
Our Contributions
– Computational Hardness
An alternative proof technique has been developed to show that the Multi-processor Scheduling Problem (MPSP) is NP-Complete by introducing Scheduling Solution Space Tree(SSST) and Weighted Scheduling Solution Space Tree(WSSST) as novel data structures.
A polynomial-time verifier has been derived based on the proposed SSST and WSSST data structures to establish a unique NP-proof method.
A non-deterministic polynomial time algorithm named Magic Scheduling (MS) has been designed as a novel framework to develop optimal offline algorithms for the MPSP problem.
The Multi-user Multi-processor Scheduling Problem (MUMPSP) has been formally defined and proved to be NP-complete by a polynomial-time reduction from the MPSP problem.
- Online Scheduling
The variants of the online scheduling problem based on machine models, job characteristics, and optimality criteria have been extensively studied with a comprehensive review of the well-known deterministic and randomized online scheduling algorithms, important competitive analysis results, non-trivial research challenges, and open problems.
A novel multi-list scheduling (MLS) model as a standard variant of the online scheduling over the list model has been developed to represent the parallel job submissions pattern in a multi-user multi-processor setting.
A 2-competitive deterministic online algorithm named Largest Job on the Least Loaded Machine (LJLLM) has been designed based on the proposed MLS model.
– Fairness in Online Scheduling
A standard model with new parameters to analyze the fairness of online scheduling algorithms has been developed.
An innovative fairness analysis method has been proposed to quantify fairness and unfairness in multi-user multi-processor online scheduling based on makespan as the user’s objective.
Prospective Ongoing and Future Research
The following non-trivial research questions drive his ongoing research work.
– Computational Hardness
• Can the Scheduling Solution Space Tree and Weighted Scheduling Solution Space Tree be refined to develop as standard data structures?
• How the SSST and WSSST data structures can be used to establish the computational complexity of any new optimization problem?
• Design of deterministic optimal offline algorithms for the variants of the MPSP using the proposed non-deterministic algorithmic framework MS.
– Online Scheduling
• Can the upper bound of 2 on the competitive ratio for the online scheduling over the MLS model be improved?
• Is the MLS model practically useful in online scheduling on uniformly related machine settings?
• How to characterize real-world inputs to develop meaningful data sets to implement well-known online scheduling algorithms?
• Can a standard mathematical framework be developed to measure algorithmic fairness?
The scheduling problems in modern Computational Science are neither completely Offline (i.e. all inputs of the problem are known from the beginning) nor Online (i.e. inputs are given one by one in order), rather some extent in between i.e. Semi-online (i.e. inputs are given one by one in order with some relevant information of future inputs). To understand a bit more about semi-online scheduling let me highlight a few emerging areas, where semi-online scheduling finds applications :
Resource Management in Operating System: In a multi-user, time-shared operating system, it is not known at the outset the sequence of jobs or the number of jobs that would be submitted to the system. Here, jobs are given to the scheduler over time. However, it is the inherent property of the scheduler to make an educated guess about the maximum and minimum time required to complete a resident job. The objective is to irrevocably assign the required computer resources such as memory and processors immediately upon the availability of a job to attain a minimum completion time.
Distributed Data Management: Distributed and parallel systems are often confronted to store files of varying sizes on limited capacity remote servers. It is evident that files are submitted from a known source on the fly and each received file must be assigned immediately to one of the remote servers. The central scheduler of the system is handicapped about the successive submissions prior to an irrevocable assignment. However, it is known for instance that the source stores the files in k unit capacity servers, which provides a hint for the total size of the files to be received. The challenge is to store the files on the remote servers with a minimum storage requirement.
Server Request Management or Web Caching: In a client-server model, it is not known in advance the number of requests that would be submitted to the remote servers nor the time required to process the requests. However, the hierarchical organizations of servers can serve as an extra piece of information for the scheduler to cater different levels of services to the requests with a broader objective of processing all requests as latest as possible.
Production and Manufacturing: Orders from clients arrive on the fly to a production system. The resources such as human beings, machinery equipment(s), and manufacturing unit(s) must be allocated immediately upon receiving each client order with no knowledge of the future orders. However, one could estimate the minimum or maximum time required to complete the order. Online arrival of the orders has a high impact on the renting and purchasing of the high-cost machines in the manufacturing units.
Our Contributions
A state-of-the-art survey of the important competitive analysis results, well-known deterministic semi-online algorithms, critical ideas, non-trivial research challenges, and open problems have been presented with a unique taxonomy of related works.
The non-preemptive semi-online scheduling has been investigated with a makespan minimization objective in identical parallel machine settings. The semi-online scheduling with lookahead has been investigated and a (4/ 3)-competitive optimal semi-online algorithm named 2 − LA1 has been designed for a 2-identical machine setting.
A (16 /11)-competitive semi-online algorithm named 3 − LA1 has been proposed for a 3- identical machine setting.
The semi-online scheduling with combined partial information has been investigated in a 3-identical machine setting and a (10/ 9)-competitive optimal algorithm named 3−DS has been designed by considering known Decr and Sum as the combined partial input information.
Prospective Ongoing and Future Research
• Can a standard look-ahead model be defined as a practically significant alternative to online scheduling with a buffer?
• How much lookahead is necessary and sufficient to beat the best-known competitive bound for online scheduling on 2-uniformly related machines?
• Consider an industry scenario, where a list of projects from several clients arrives on the fly. A project manager has to allocate resources such as working teams, machinery, and time slots to the received project without knowledge of successive ones. A research question arises here. What can be the Extra Piece of Information (EPI) about future projects known a priori that helps efficiently allocate resources to the current project for minimizing the project cost and maximizing the company's throughput?
Given a simple non-planner, undirected graph G (v, e). Consider an edge coloring, not necessarily proper. A path between two vertices is a rainbow path if no two edges in this path have the same color. If there is a rainbow path between every pair of vertices, then the coloring is called rainbow coloring.
In online rainbow coloring, the inputs are a non-trivial undirected connected simple graph G (v, e) and a set of colors c. The output is a rainbow-colored graph, where there must be at least one rainbow path between every distinct pair of vertices. Our goal is to use a minimum number of colors while making G (v, e) rainbow-colored. We have the following constraints:
The edges of G (v, e) are not known from the beginning and are given one by one in order.
Each edge must be colored irrevocably as soon as it is given before the availability of the next edge.
The partial graph formed after the addition of each new edge must be connected.
Online Rainbow coloring finds application in Cellular Networks for frequency distribution among different links. When it is required distinct communication channels between a pair of mobile stations to communicate, then rainbow coloring can be applied to minimize the number of unique channels in the whole network.
Our Contributions
A deterministic online algorithm named Least Recently used Color (LRUC) has been designed for efficient bandwidth allocation in cellular networks as an application of online rainbow coloring and the performance of algorithm LRUC has been analyzed by competitive analysis method by considering various critical graph structures.
Design of efficient online algorithms for online rainbow coloring of well-known graphs.
When users of a cellular network (4G or 5G) are joining and leaving the network frequently, then how to capture the live communication links for such networks through an appropriate graph architecture for efficient frequency distributions among the live links through online rainbow coloring?
Exploration of new applications and design of efficient algorithms for online rainbow coloring in Data Communications and Computer Networks.