A robot must balance task accuracy against compute cost, such as energy or latency, when choosing among an array of heterogeneous compute resources. Our interpretable model selection policy π_select leverages the statistical correlation between a fast computation model f_fast and a slow computation model f_slow to dynamically decide which model to invoke.
A robot can invoke heterogeneous computation resources such as CPUs, cloud GPU servers, or even human computation to achieve a high-level goal. The problem of invoking an appropriate computation model so that the robot successfully completes its task while keeping compute costs within a budget is called the model selection problem. In this paper, we present an optimal solution to the model selection problem with two compute models, the first fast but less accurate, and the second slow but more accurate. The main insight behind our solution is that a robot should invoke the slower compute model only when the benefit from the gain in accuracy outweighs the additional computational cost. We show that this cost-benefit analysis can be performed by leveraging the statistical correlation between the accuracies of the fast and slow compute models. We apply our approach to diverse problems, including high-dimensional linear regression, perception with deep neural networks, and safe navigation of a simulated Mars Rover, to demonstrate its broad applicability.
To simultaneously achieve high task accuracy while minimizing the cost of compute, we introduce a per-timestep reward.
Informally: Reward = −α (Loss) − β (Cost), where α and β are user-given constants.
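The reward above directly yields a cost-benefit decision rule: query the slow model only when the expected loss reduction outweighs the extra compute cost. A minimal sketch of that rule, with assumed (hypothetical) per-query costs and weights:

```python
def reward(loss, cost, alpha=1.0, beta=0.5):
    """Per-timestep reward (informal Eq. 1): higher is better.
    alpha and beta are user-given weights on loss and compute cost."""
    return -alpha * loss - beta * cost

# Illustrative per-query costs for the two models (assumed values).
FAST_COST = 0.1
SLOW_COST = 1.0

def select_model(predicted_slow_gain, alpha=1.0, beta=0.5):
    """Invoke the slow model only when the predicted accuracy gain
    (expected loss reduction, estimated from the statistical relation
    between the two models) outweighs the extra compute cost."""
    benefit = alpha * predicted_slow_gain
    extra_cost = beta * (SLOW_COST - FAST_COST)
    return "slow" if benefit > extra_cost else "fast"
```

Here `predicted_slow_gain` stands in for whatever estimate of the slow model's accuracy advantage the policy maintains; it is an assumed interface, not the paper's exact estimator.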
Left: DNN Cost vs. Accuracy: Cost vs. Loss trade-off achieved by various offloading strategies. Fast has low cost but low accuracy, Slow has high accuracy but high cost, and Random lies sub-optimally in the middle (with high variance). Only our offloading policy (Our Selector), proposed in this paper, achieves a delicate balance by minimizing cost while maximizing accuracy. Right: DNN Rewards: Rewards gathered by various offloading strategies. Our model selection policy (Our Selector) clearly achieves the maximum reward among the benchmark policies. Policies like Random have higher variance because they do not leverage the relationship between the fast and the slow model, and therefore come with no guarantee on the reward gathered by their decisions.
Left: Mars Rover Reachable Sets. Green: reachable set computed by the slow model to detect collision; Blue: reachable set computed by the fast model to detect collision. The safety of the path is determined from the computed reachable set (accounting for uncertainties). Note that our model selection policy uses the slow model (to determine safety) only when the rover makes tricky maneuvers; when the rover is far from obstacles, the fast model suffices. This plot thus shows how our model selection policy delicately balances cost and accuracy. Right: Reachable Set Relationship. Blue: reachable set computed by the fast model; Black: reachable set computed by the slow model; Cyan: over-approximation of the slow model's reachable set, as computed from the fast model. Note that our model selection policy offloads only when the cyan set intersects an obstacle (red). Intuitively, this means the policy invokes the slow model only when it suspects a collision (by leveraging the relationship between the fast and the slow model).
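The offloading rule in the right panel can be sketched concretely. Assuming axis-aligned rectangular (interval) reachable sets and a calibration `margin` that inflates the fast model's set into an over-approximation of the slow model's set (the cyan set); both the set representation and the margin value are illustrative assumptions, not the paper's exact construction:

```python
def inflate(fast_set, margin):
    """Over-approximate the slow model's reachable set (cyan) by inflating
    the fast model's set (blue) with a margin derived from the statistical
    relation between the two models. Sets are (xmin, xmax, ymin, ymax)."""
    xmin, xmax, ymin, ymax = fast_set
    return (xmin - margin, xmax + margin, ymin - margin, ymax + margin)

def intersects(a, b):
    """Axis-aligned rectangle intersection test."""
    return a[0] <= b[1] and b[0] <= a[1] and a[2] <= b[3] and b[2] <= a[3]

def should_offload(fast_set, obstacle, margin=0.5):
    """Invoke the slow model only when the over-approximated reachable
    set could touch the obstacle, i.e. when a collision is suspected."""
    return intersects(inflate(fast_set, margin), obstacle)
```

When `should_offload` returns False, the fast model's set already certifies safety, so the slow model's cost is avoided.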
Same as the previous figure.
Left: (Reachable Set Cost vs. Accuracy) (Scenario 1) Cost vs. Loss trade-off achieved by various offloading strategies over a period of time. Our policy (Our Selector) delicately balances accuracy and efficiency by querying the larger model only when it is close to obstacles and suspects a collision. In contrast, Fast has low cost but low accuracy, Slow has high accuracy but high cost, and Random lies sub-optimally in the middle (with high variance). Right: (Reachable Set Rewards) Rewards (Eq. 1) gathered by various offloading strategies. Our model selection policy (Our Selector) clearly achieves the maximum reward among the benchmark policies. Policies like Random have higher variance because they do not leverage the relationship between the fast and the slow model, and therefore come with no guarantee on the reward gathered by their decisions.
Rewards gathered by various offloading strategies. Our model selection policy (Our Selector) clearly achieves the maximum reward among the benchmark policies. Policies like Random have higher variance because they do not leverage the relationship between the fast and the slow model, and therefore come with no guarantee on the reward gathered by their decisions.
Cost vs. Loss trade-off achieved by various offloading strategies. Fast has low cost but low accuracy, Slow has high accuracy but high cost, and Random lies sub-optimally in the middle (with a high variance). Only our offloading policy (Our Selector), proposed in this paper, achieves a delicate balance by minimizing cost and maximizing accuracy.
Frequency of offloading for various policies. Note that Fast never offloads, Slow always offloads, and Random offloads roughly 50% of the time. In contrast, the offloading frequency of our optimal model selection policy (Our Selector) is delicately balanced to achieve the best cost vs. loss performance.
This technical report discusses the work used in Sec. IV-C.