DIPP: Discriminative Impact Point Predictor for Catching Diverse In-Flight Objects

Ngoc Huy Nguyen, Kazuki Shibata, Takamitsu Matsubara

Nara Institute of Science and Technology (NAIST), Robot Learning Laboratory

Youtube

arXiv

Dataset

Abstract

In this study, we address the problem of in-flight object catching using a quadruped robot with a basket. Our objective is to accurately predict the impact point, defined as the object's landing position. This task poses two key challenges: the absence of public datasets capturing diverse objects under unsteady aerodynamics, which are essential for training reliable predictors; and the difficulty of accurate early-stage impact point prediction when trajectories appear similar across objects. To overcome these issues, we construct a real-world dataset of 8,000 trajectories from 20 objects, providing a foundation for advancing in-flight object catching under complex aerodynamics. We then propose the Discriminative Impact Point Predictor (DIPP), consisting of two modules: (i) a Discriminative Feature Embedding (DFE) that separates trajectories by dynamics to enable early-stage discrimination and generalization, and (ii) an Impact Point Predictor (IPP) that estimates the impact point from these features. Two IPP variants are implemented: a Neural Acceleration Estimator (NAE)-based method that predicts trajectories and derives the impact point, and a Direct Point Estimator (DPE)-based method that directly outputs it. Experimental results show that our dataset is more diverse and complex than existing datasets, and that our method outperforms baselines on both 15 seen and 5 unseen objects. Furthermore, we show that improved early-stage prediction enhances catching success in simulation and demonstrate the effectiveness of our approach through real-world experiments.

Real-world dataset for in-flight objects

In this study, we construct a real-world dataset that captures complex aerodynamic effects across diverse objects. The dataset comprises 2,000 measured trajectories from 20 objects (100 per object), which are further expanded to 8,000 trajectories through translational and rotational augmentation. This dataset provides the foundation for developing and evaluating our prediction method and contributes to advancing in-flight object catching under complex aerodynamics.
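The expansion from 2,000 to 8,000 trajectories is a factor of four. The page does not describe how the augmented copies are generated, so the following is only a minimal sketch under stated assumptions: trajectories are arrays of (x, y, z) positions with z pointing up, rotation is applied about the vertical axis so that the direction of gravity is preserved, translation is horizontal only, and each measured trajectory is kept alongside three augmented copies. The function names, the shift range, and the random sampling are hypothetical.

```python
import numpy as np


def augment_trajectory(traj, yaw_rad, shift_xy):
    """Rotate a trajectory about the vertical axis, then shift it horizontally.

    traj: array of shape (T, 3) with columns x, y, z (z up is an assumption).
    """
    c, s = np.cos(yaw_rad), np.sin(yaw_rad)
    rot_z = np.array([[c, -s, 0.0],
                      [s, c, 0.0],
                      [0.0, 0.0, 1.0]])
    out = np.asarray(traj, dtype=float) @ rot_z.T  # new array, input untouched
    out[:, :2] += np.asarray(shift_xy, dtype=float)  # heights stay unchanged
    return out


def augment_dataset(trajs, copies=3, max_shift=0.5, seed=0):
    """Keep each trajectory and add `copies` augmented versions (assumed 1 + 3)."""
    rng = np.random.default_rng(seed)
    out = []
    for traj in trajs:
        out.append(np.asarray(traj, dtype=float))
        for _ in range(copies):
            yaw = rng.uniform(-np.pi, np.pi)
            shift = rng.uniform(-max_shift, max_shift, size=2)
            out.append(augment_trajectory(traj, yaw, shift))
    return out
```

Because the transform is rigid and leaves z untouched, heights and distances between consecutive samples are preserved, so the aerodynamic behaviour recorded in a trajectory carries over to its copies.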

Proposed framework

The DIPP framework consists of two modules: the DFE and the IPP. The DFE maps historical motion states into a feature space where trajectories with similar dynamics lie close together and those with dissimilar dynamics lie farther apart, enabling discriminative representation and generalization to unseen objects. The IPP estimates the impact point from these features. We consider two variants of the IPP: an NAE-based method, which learns the dynamics to predict the trajectory and derives the impact point from it, and a DPE-based method, which directly outputs the impact point from the historical states.
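Two ideas in this description can be illustrated with a small sketch. The page does not state the loss used to train the DFE or the integration scheme used by the NAE-based variant, so both choices below are assumptions: a triplet margin loss is shown as one common way to pull similar embeddings together and push dissimilar ones apart, and a semi-implicit Euler rollout is shown as one way to turn an acceleration model into an impact point. The function `stand_in_acceleration` (gravity plus quadratic drag) is a hypothetical placeholder for a learned network, and all names and parameters are illustrative.

```python
import numpy as np

GRAVITY = np.array([0.0, 0.0, -9.81])  # m/s^2, z up (assumed convention)


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Zero once each anchor is closer to its positive than its negative by `margin`."""
    d_pos = np.linalg.norm(anchor - positive, axis=-1)
    d_neg = np.linalg.norm(anchor - negative, axis=-1)
    return np.maximum(d_pos - d_neg + margin, 0.0).mean()


def stand_in_acceleration(pos, vel, drag=0.0):
    """Hypothetical placeholder for a learned acceleration model."""
    return GRAVITY - drag * np.linalg.norm(vel) * vel


def rollout_impact_point(pos, vel, accel_fn, catch_height=0.0,
                         dt=1e-3, max_steps=20000):
    """Integrate forward until the object descends through `catch_height`.

    Returns the interpolated crossing position, or None if no crossing occurs.
    """
    pos = np.asarray(pos, dtype=float).copy()
    vel = np.asarray(vel, dtype=float).copy()
    for _ in range(max_steps):
        new_vel = vel + accel_fn(pos, vel) * dt
        new_pos = pos + new_vel * dt  # semi-implicit Euler step
        if new_vel[2] < 0 and new_pos[2] <= catch_height <= pos[2]:
            # Linear interpolation between the two samples around the crossing.
            frac = (pos[2] - catch_height) / (pos[2] - new_pos[2])
            return pos + frac * (new_pos - pos)
        pos, vel = new_pos, new_vel
    return None
```

For example, `rollout_impact_point([0, 0, 1], [2, 0, 3], stand_in_acceleration)` returns the point where a drag-free throw crosses the ground plane. A DPE-style predictor would skip the rollout and regress the impact point directly from the embedded history.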

Demonstration Video

To demonstrate the practical applicability of our approach, we conducted real-world catching experiments using a quadruped robot. We tested two seen objects and three unseen objects, and compared our method (DIPP-NAE) with a baseline (NAE) under similar conditions, keeping the robot's initial pose, throwing motion, and impact point as consistent as possible.

soft_frisbee-NAE.mp4

soft_frisbee-DIPP.mp4

boomerang-NAE.mp4

boomerang-DIPP.mp4

pinwheel-NAE.mp4

pinwheel-DIPP.mp4

fan-NAE.mp4

fan-DIPP.mp4

plane-NAE.mp4

plane-DIPP.mp4
