Cascade AI (device → edge → cloud)
AI is most useful when it runs where data is created—on phones, drones, and sensors in farms, hospitals, and factories. But there's a tradeoff: the tiny models that fit on these devices often lack the accuracy and reliability of their larger, cloud-based counterparts. We're working to close that gap. Our research focuses on the algorithmic problems in designing hierarchical/cascading inference systems that intelligently combine device, edge, and cloud resources—making on-device AI reliable, robust, and energy-efficient. Explore our recent work below.
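The decision loop behind a device-to-edge-to-cloud cascade can be sketched in a few lines (the model stubs and confidence thresholds here are illustrative placeholders, not taken from any of the papers below): run the tiny on-device model first, and escalate a sample to the edge or cloud only when the smaller model's confidence is low.

```python
import numpy as np

def softmax(logits):
    """Convert raw logits to a probability distribution."""
    e = np.exp(logits - logits.max())
    return e / e.sum()

def cascade_infer(x, tiny_model, edge_model, cloud_model,
                  tiny_thresh=0.85, edge_thresh=0.95):
    """Run the smallest model first; escalate only on low confidence.

    Each model maps an input x to a logit vector. Returns the predicted
    label and the tier that produced it.
    """
    probs = softmax(tiny_model(x))
    if probs.max() >= tiny_thresh:
        return int(probs.argmax()), "device"
    probs = softmax(edge_model(x))
    if probs.max() >= edge_thresh:
        return int(probs.argmax()), "edge"
    probs = softmax(cloud_model(x))   # last resort: always accept
    return int(probs.argmax()), "cloud"
```

Fixed hand-tuned thresholds are only the simplest rule; several of the papers below study how to learn such offloading decisions online.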
V. N. Moothedath, U. Agarwal, J. R. Gross, J. P. Champati, S. Moharir, "Inference Offloading for Cost-Sensitive Binary Classification at the Edge," to appear in AAAI, 2026.
A. P. Behera, P. Daubaris, I. Bravo, J. Gallego, R. Morabito, J. Widmer, J. P. Champati, "Exploring the Boundaries of On-Device Inference: When Tiny Falls Short, Go Hierarchical," IEEE Internet of Things Journal, 2025.
P. Datta, S. Moharir, J. P. Champati, "Online Learning with Stochastically Partitioning Experts," UAI, 2025.
G. Al-Atat, P. Datta, S. Moharir, J. P. Champati, "Regret Bounds for Online Learning for Hierarchical Inference," ACM MobiHoc, 2024.
V. N. Moothedath, J. P. Champati, J. Gross, "Getting the Best Out of Both Worlds: Algorithms for Hierarchical Inference at the Edge," IEEE Transactions on Machine Learning in Communications and Networking (TMLCN), 2024.
A. P. Behera, R. Morabito, J. Widmer, J. P. Champati, "Improved Decision Module Selection for Hierarchical Inference in Resource-Constrained Edge Devices," ACM MobiCom, 2023.
G. Al-Atat, A. Fresa, A. P. Behera, V. N. Moothedath, J. Gross, J. P. Champati, "The Case for Hierarchical Deep Learning Inference at the Network Edge," NetAI Workshop at ACM MobiSys, 2023.
A. Fresa, J. P. Champati, "Offloading Algorithms for Maximizing Inference Accuracy on Edge Device Under a Time Constraint," IEEE Transactions on Parallel and Distributed Systems, 2023.
Past Research
AoI Analysis and Optimization
Age of Information (AoI) is a freshness metric: the time elapsed since the generation of the freshest packet available at the receiver. In contrast to system delay, AoI increases linearly between packet receptions, so it also accounts for how frequently the information source is sampled. We analyze AoI for fundamental queueing systems and study optimal sampling and transmission strategies for minimizing AoI in these systems. Selected publications below.
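The sawtooth behaviour of AoI can be made concrete with a short numerical sketch (the timestamps below are invented for illustration): given the generation and reception times of delivered packets, the AoI at time t is t minus the generation time of the freshest packet received by time t.

```python
import numpy as np

def aoi_trajectory(events, horizon, dt=0.001):
    """AoI sample path from (generation_time, reception_time) pairs.

    AoI at time t is t minus the generation time of the freshest packet
    received by t (assuming a fresh packet at t = 0, i.e. initial age 0).
    """
    events = sorted(events, key=lambda e: e[1])  # order by reception time
    t = np.arange(0.0, horizon, dt)
    age = np.empty_like(t)
    freshest_gen, idx = 0.0, 0
    for i, ti in enumerate(t):
        # absorb every packet received by time ti
        while idx < len(events) and events[idx][1] <= ti:
            freshest_gen = max(freshest_gen, events[idx][0])
            idx += 1
        age[i] = ti - freshest_gen
    return t, age

# One packet generated at t = 1 and received at t = 2: the age grows to 2,
# drops to 1 on reception, then grows again; its time average over [0, 4] is 1.5.
t, age = aoi_trajectory([(1.0, 2.0)], horizon=4.0)
```

Note that on reception the age resets to the packet's system delay, not to zero, which is why freshness depends on generation times and not merely on how many packets arrive.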
M. Fidler, J. P. Champati, J. Widmer, M. Noroozi, "Statistical Age-of-Information Bounds for Parallel Systems: When Do Independent Channels Make a Difference?" IEEE Journal on Selected Areas in Information Theory, 2023.
J. P. Champati, H. Al-Zubaidy, J. Gross, "Statistical Guarantee Optimization for AoI in Single-Hop and Two-Hop FCFS Systems with Periodic Arrivals," IEEE Transactions on Communications, vol. 69, no. 1, pp. 365–381, 2021.
J. P. Champati, R. R. Avula, T. J. Oechtering, J. Gross, "Minimum Achievable Peak Age of Information Under Service Preemptions and Request Delay," IEEE Journal on Selected Areas in Communications, vol. 39, no. 5, pp. 1365–1379, 2021.
J. P. Champati, H. Al-Zubaidy, J. Gross, "On the Distribution of AoI for the GI/GI/1/1 and GI/GI/1/2* Systems: Exact Expressions and Bounds," in Proc. IEEE INFOCOM, May 2019.
Transient Delay Analysis and Optimization
Most queueing-theoretic analyses of network performance consider systems in steady state. For example, for the M/M/1 queue and more general Markovian queueing systems, the steady state is governed by the (conceptually simple) flow-balance equations, whereas transient analysis of the same systems leads to intractable differential equations. Using stochastic network calculus, we derived the end-to-end delay-violation probability for a sequence of time-critical packets, given the transient network state (queue backlogs) at the moment the packets enter the network. Leveraging this analysis, we compute effective resource-allocation strategies for wireless protocols such as WirelessHART to support the QoS requirements of time-critical industrial applications.
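The gap between the two regimes is easy to see by simulation (the parameters below are made up for illustration and are unrelated to the analysis in the papers that follow): an M/M/1 queue with arrival rate λ = 0.5 and service rate μ = 1 has steady-state mean occupancy ρ/(1 − ρ) = 1, but when it starts from a large backlog its transient mean stays far above that value for a long time.

```python
import random

def mm1_queue_at(t_end, lam, mu, q0, rng):
    """Number in an M/M/1 system at time t_end, starting with q0 customers."""
    t, q = 0.0, q0
    next_arrival = rng.expovariate(lam)
    next_departure = rng.expovariate(mu) if q > 0 else float("inf")
    while min(next_arrival, next_departure) <= t_end:
        if next_arrival <= next_departure:
            t = next_arrival
            q += 1
            if q == 1:  # server was idle: schedule the first departure
                next_departure = t + rng.expovariate(mu)
            next_arrival = t + rng.expovariate(lam)
        else:
            t = next_departure
            q -= 1
            next_departure = t + rng.expovariate(mu) if q > 0 else float("inf")
    return q

# Average over independent runs: starting from a backlog of 20, the mean at
# t = 1 is still close to 20, while for large t it approaches rho/(1-rho) = 1.
rng = random.Random(1)
def mean_queue(t, runs=500):
    return sum(mm1_queue_at(t, 0.5, 1.0, 20, rng) for _ in range(runs)) / runs
```

A steady-state model applied at t = 1 would therefore grossly underestimate the delay seen by time-critical packets entering a loaded network, which is precisely the regime the transient analysis targets.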
S. Zoppi, J. P. Champati, J. Gross, W. Kellerer, "Scheduling of Wireless Edge Networks for Feedback-Based Interactive Applications," IEEE Transactions on Communications, 2022.
J. P. Champati, H. Al-Zubaidy, J. Gross, "Transient Analysis for Multi-hop Wireless Networks Under Static Routing," IEEE/ACM Transactions on Networking, 2020.
Computation Offloading Algorithms (PhD work)
Edge computing (or fog computing), in which computational resources are placed close to (e.g., one hop away from) the entities that offload computational tasks or data for processing, is a key architectural component of 5G and future wireless networks. Offloading computational tasks from mobile devices to edge servers instead of the cloud saves Internet bandwidth and avoids the long delays of transferring the data of the offloaded tasks to a cloud data centre somewhere on the Internet. Above all, edge computing compensates for the limited compute and memory of edge devices. Selected publications below:
J. P. Champati, B. Liang, "Single Restart with Time Stamps for Parallel Task Processing with Known and Unknown Processors," IEEE Transactions on Parallel and Distributed Systems, vol. 31, no. 1, pp. 187–200, 2020.
J. P. Champati, B. Liang, "Semi-Online Algorithms for Computational Task Offloading with Communication Delay," IEEE Transactions on Parallel and Distributed Systems, vol. 28, no. 4, pp. 1189–1201, 2017.