NIPS_2024_DiMoP3D

Harmonizing Stochasticity and Determinism:

Scene-responsive Diverse Human Motion Prediction

Anonymous author

in submission to NIPS 2024 (do not distribute)

1. Video

2. Motivation

3. Proposed Method

4. Comparisons with the state-of-the-art baseline

5. Qualitative comparison with BiFU

1. Video

2. Motivation

● Existing approaches to diverse motion prediction concentrate on the stochastic nature of human movement and often overlook the external environment. Applied to real-world contexts, this leads to significant issues such as predicted motions that penetrate the scene or are otherwise inconsistent with it.

● Cross-modal analysis of the observed motion and the scene is needed to understand potential human intentions within 3D scenes.

● Scene-aware motion prediction requires the predicted motion to be consistent with the scene context, including obstacle avoidance.

3. Proposed Method

DiMoP3D integrates two input modalities: 1) past motion sequences and 2) 3D scene point clouds.

● Context-Aware Intermodal Interpreter identifies interactive objects in the scene, analyzes potential human interest through a cross-modal InterestNet, and finally samples an object as the movement target based on this analysis.

● Behaviorally-Consistent Stochastic Planner first predicts the human-object interaction poses as the final state of the predicted motion, and then searches for obstacle-free trajectories from the observation toward the destination.

● Self-Prompted Motion Generator generates diverse human motions while preserving the observation and the planned trajectory by overwriting intermediate results at each denoising step.

● MotionCLIP is introduced to further supervise the predicted motion so that it stays consistent with the target object.
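
Two of the steps above, sampling a target object from interest scores and overwriting known frames at each denoising step, can be illustrated with a minimal toy sketch. This is not the authors' implementation. The names sample_target, self_prompted_denoise, and denoise_step are hypothetical, the softmax over interest scores is an assumed way of turning scores into a sampling distribution, and each motion frame is reduced to a single scalar for readability. A real diffusion model would typically also match the known values to the current noise level before overwriting, which this sketch omits.

```python
import math
import random


def sample_target(interest_scores, rng):
    """Sample an object index with probability softmax(interest_scores).

    Assumption: interest scores are unnormalized, one per candidate object.
    """
    peak = max(interest_scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in interest_scores]
    return rng.choices(range(len(interest_scores)), weights=weights, k=1)[0]


def self_prompted_denoise(known, num_frames, num_steps, denoise_step, rng):
    """Toy reverse loop that re-imposes known frames after every step.

    known: dict mapping frame index -> value (observed frames and
        planned trajectory waypoints).
    denoise_step: hypothetical callable (x, t) -> x standing in for
        the learned denoiser.
    """
    # Start from pure Gaussian noise, one scalar per frame.
    x = [rng.gauss(0.0, 1.0) for _ in range(num_frames)]
    for t in reversed(range(num_steps)):
        x = denoise_step(x, t)
        # Overwrite the intermediate result so the observation and the
        # planned trajectory are preserved regardless of the denoiser output.
        for index, value in known.items():
            x[index] = value
    return x
```

For example, calling self_prompted_denoise with known = {0: 1.0, 1: 1.5, 9: 4.0} and any denoise_step returns a 10-frame sequence whose frames 0, 1, and 9 equal the given values exactly, while the remaining frames depend on the initial noise and the denoiser. Different random seeds therefore yield different motions that all respect the same constraints.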

Experiments show that DiMoP3D can predict motions with diverse actions and can also produce varied motions toward a fixed target object, while keeping each motion sequence physically consistent.