Parametric 3D models have played a fundamental role in modeling deformable objects, such as human bodies, faces, and hands; however, constructing such parametric models requires significant manual intervention and domain expertise.
Recently, neural implicit 3D representations have shown great expressiveness in capturing 3D shape geometry. We observe that deformable object motion is often semantically structured, and thus propose to learn Structured Implicit Parametric Models (SPAMs) as a deformable object representation that structurally decomposes non-rigid object motion into part-based disentangled representations of shape and pose, with each being represented by deep implicit functions. This enables a structured characterization of object movement, with part decomposition characterizing a lower-dimensional space in which we can establish coarse motion correspondence.
In particular, we can leverage the part decompositions at test time to fit to new depth sequences of unobserved shapes, by establishing part correspondences between the input observation and our learned part spaces; this guides a robust joint optimization between the shape and pose of all parts, even under dramatic motion sequences. Experiments demonstrate that our part-aware shape and pose understanding leads to state-of-the-art performance in reconstruction and tracking of depth sequences of complex deforming object motion.
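To make the idea of a part-based decomposition into shape and pose more concrete, here is a minimal toy sketch. It is not the SPAMs architecture: the learned deep implicit functions are replaced by analytic spheres, the shape code by a single radius, and the pose code by a pure translation. The names `Part`, `part_sdf`, and `object_sdf` are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Part:
    center: tuple   # canonical part location (hypothetical toy value)
    radius: float   # stand-in for a learned per-part shape code
    offset: tuple   # stand-in for a learned per-part pose code (translation only)


def part_sdf(part, x):
    """Signed distance from query point x to one posed part."""
    # Undo the pose to map the query into the part's canonical space.
    canonical = tuple(xi - oi for xi, oi in zip(x, part.offset))
    # Evaluate the toy shape function (a sphere) in canonical space.
    return math.dist(canonical, part.center) - part.radius


def object_sdf(parts, x):
    """Union of parts: smallest part distance and the index of that part."""
    dists = [part_sdf(p, x) for p in parts]
    label = min(range(len(parts)), key=dists.__getitem__)
    return dists[label], label


# Two toy parts: an unposed "torso" and an "arm" translated along y.
torso = Part(center=(0.0, 0.0, 0.0), radius=1.0, offset=(0.0, 0.0, 0.0))
arm = Part(center=(2.0, 0.0, 0.0), radius=0.5, offset=(0.0, 1.0, 0.0))
distance, label = object_sdf([torso, arm], (2.0, 1.0, 0.0))
```

In this toy setup, the query point sits at the center of the posed arm, so `object_sdf` returns a negative distance and the arm's index. The part index plays the role of a coarse part label for the query point.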
Here we show how our test-time model fitting converges across optimization steps, guided by our coarse semantic correspondences between input points (left point cloud in each image) and posed estimates (right). Thanks to this structured characterization of object movement, we can even recover from poor initial pose estimates (e.g., green, right arm in optim step 0).
Figure panels: Optim Step 0, Optim Step 200.
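The role of part correspondences in test-time fitting can be illustrated with a small toy optimization. This is only a sketch under strong assumptions, not the method used in the paper: parts are spheres with known radii, each observed point comes with a given part label (standing in for the predicted correspondences), only a per-part translation is optimized, and plain gradient descent minimizes the mean squared signed distance of each part's points. The function names, the step count, and the learning rate are hypothetical.

```python
import math


def surface_points(center, radius):
    """Six points on a sphere, one at each end of the three axes."""
    pts = []
    for axis in range(3):
        for sign in (-1.0, 1.0):
            p = list(center)
            p[axis] += sign * radius
            pts.append(tuple(p))
    return pts


def fit_part_offsets(points, labels, centers, radii, steps=200, lr=0.5):
    """Gradient descent on per-part translations, starting from zero offsets."""
    offsets = [[0.0, 0.0, 0.0] for _ in centers]
    for _ in range(steps):
        grads = [[0.0, 0.0, 0.0] for _ in centers]
        counts = [0] * len(centers)
        for x, k in zip(points, labels):
            # Each point only influences the part it corresponds to.
            posed = [c + o for c, o in zip(centers[k], offsets[k])]
            diff = [xi - pi for xi, pi in zip(x, posed)]
            norm = math.hypot(*diff) or 1e-9
            d = norm - radii[k]  # signed distance to the posed part
            for i in range(3):
                # Gradient of d**2 with respect to the part offset.
                grads[k][i] += 2.0 * d * (-diff[i] / norm)
            counts[k] += 1
        for k, g in enumerate(grads):
            if counts[k]:
                for i in range(3):
                    offsets[k][i] -= lr * g[i] / counts[k]
    return offsets


# Toy observation: the second part has moved, the first has not.
centers = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
radii = [1.0, 0.5]
true_offsets = [(0.0, 0.0, 0.0), (0.3, 0.1, 0.0)]
points, labels = [], []
for k in range(2):
    posed = tuple(c + t for c, t in zip(centers[k], true_offsets[k]))
    for p in surface_points(posed, radii[k]):
        points.append(p)
        labels.append(k)

offsets = fit_part_offsets(points, labels, centers, radii)
```

Because every point is assigned to a part, each part's pose is pulled only by its own observations. This is a simplified picture of how coarse correspondences can keep the optimization from drifting when the initial pose estimate is poor.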
We observe that our part-aware disentanglement of shape and pose provides significant robustness in pose tracking and, in particular, remains robust in the absence of pose encoder initialization (which may be unavailable in scenarios such as generalizing to different sensor inputs).
@article{palafox2021spams,
title={SPAMs: Structured Implicit Parametric Models},
author={Palafox, Pablo and Sarafianos, Nikolaos and Tung, Tony and Dai, Angela},
journal={CVPR},
year={2022}
}