Jointly Learning Human Skeletal Control and Motion Prediction from Single-view Videos under Geometric Constraints by Leveraging SDE

Liyan Chen, Charless Fowlkes

Dataset

Abstract

Motion prediction is a highly ill-posed problem due to the high dimensionality of the space of plausible trajectories. Common approaches introduce probabilistic models over the pose space, yet suffer from divergence and instability. Furthermore, we argue that motion prediction without the geometric constraints of the environment is ill-defined and of limited practical use. In this work, we formulate the problem as learning a predictive model of human motion from single-view videos. Unlike fully deep models, we adopt a more physical approach: we derive parameterized differential equations from geometric mechanics, constrain the solution with geometric boundaries from the scene, and learn the human skeletal control as a Neural Differential Equation, generalized to a Stochastic Differential Equation to model the intrinsic stochasticity of motion. Learning human skeletal control is itself a valuable task, and we argue that learning the two tasks jointly benefits both: human skeletal controls have repeatable patterns and serve as a strong prior, as demonstrated by prior work; an analytical predictive model facilitates learning a better controller; and the hybrid model, with the limited parameterization of its physics-based component, alleviates the burden of estimating the Lagrangian state of the system. We demonstrate our method on the Geometric Pose Affordance (GPA) 1.2 Dataset, showing that it outperforms current motion prediction methods and generalizes maneuvers to scenes with new geometry.
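
To make this formulation concrete, below is a minimal, hypothetical PyTorch sketch of the core idea: a neural SDE whose drift and diffusion networks are learned, rolled out with Euler-Maruyama integration, and penalized by a toy geometric constraint (a flat ground plane standing in for the GPA scene geometry). All names, state layouts, and the constraint are illustrative assumptions, not the actual implementation.

```python
import torch
import torch.nn as nn

class SkeletalControlSDE(nn.Module):
    """Minimal neural SDE for skeletal state dynamics (illustrative sketch).

    The state x is assumed to concatenate joint positions and velocities; the
    drift network stands in for the learned controller plus analytical dynamics,
    and the diffusion network models the intrinsic stochasticity of human motion.
    """

    def __init__(self, state_dim: int, hidden: int = 128):
        super().__init__()
        self.drift = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.Tanh(), nn.Linear(hidden, state_dim)
        )
        self.diffusion = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, state_dim), nn.Softplus(),  # keep noise scale positive
        )

    def step(self, x: torch.Tensor, dt: float) -> torch.Tensor:
        """One Euler-Maruyama step: x + f(x) dt + g(x) sqrt(dt) * noise."""
        noise = torch.randn_like(x)
        return x + self.drift(x) * dt + self.diffusion(x) * (dt ** 0.5) * noise


def ground_penalty(joint_heights: torch.Tensor) -> torch.Tensor:
    """Toy geometric constraint: penalize joints predicted below the plane z = 0.

    A real scene constraint derived from the GPA geometry would replace this.
    """
    return torch.relu(-joint_heights).mean()


if __name__ == "__main__":
    state_dim = 2 * 17 * 3          # assumed layout: 17 joints, 3D positions + velocities
    model = SkeletalControlSDE(state_dim)
    x = torch.zeros(4, state_dim)   # batch of initial states estimated from the observed video
    trajectory = []
    for _ in range(30):             # roll out 30 future frames
        x = model.step(x, dt=1.0 / 30.0)
        trajectory.append(x)
    pred = torch.stack(trajectory)  # (T, batch, state_dim) sampled future motion
    # joint heights assumed to be the z-coordinates of the position half of the state
    heights = pred[..., 2:state_dim // 2:3]
    print(pred.shape, ground_penalty(heights).item())
```

In training, the constraint penalty would be added to the trajectory reconstruction loss so that sampled rollouts respect the scene geometry; at test time, repeated rollouts give a distribution over plausible future motions.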

Publications

Jointly Learning Human Skeletal Control and Motion Prediction from Single-view Videos under Geometric Constraints by Leveraging SDE
Liyan Chen and Charless Fowlkes. In preparation.
Implementation Supplementary