ActorsNeRF: Animatable Few-shot Human Rendering with Generalizable NeRFs

(ICCV 2023)

¹University of California San Diego, ²ByteDance

We present ActorsNeRF, a category-level human actor NeRF model that generalizes to unseen actors in a few-shot setting.

With only a few images (e.g., 30 frames) of an unseen actor in the AIST++ Dataset, ActorsNeRF animates the actor with novel poses from the 3DPW Dataset.



Abstract

While NeRF-based human representations have shown impressive novel view synthesis results, most methods still rely on a large number of images or views for training. In this work, we propose a novel animatable NeRF called ActorsNeRF. It is first pre-trained on diverse human subjects, and then adapted with few-shot monocular video frames for a new actor with unseen poses. Building on previous generalizable NeRFs with parameter sharing using a ConvNet encoder, ActorsNeRF further adopts two human priors to capture the large human appearance, shape, and pose variations. Specifically, in the encoded feature space, we first align different human subjects in a category-level canonical space, and then align the same human from different frames in an instance-level canonical space for rendering. We quantitatively and qualitatively demonstrate that ActorsNeRF significantly outperforms the existing state-of-the-art on few-shot generalization to new people and poses on multiple datasets.

Video

ActorsNeRF Approach

In order to achieve generalization to novel actors, a category-level NeRF model is first trained on a diverse set of subjects. During the inference phase, we fine-tune the pre-trained category-level NeRF model using only a few images of the target actor, enabling the model to adapt to the specific characteristics of the actor.
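The two-stage schedule described above (pre-train on many subjects, then fine-tune on a few samples of a new one) can be illustrated with a deliberately tiny stand-in. The sketch below is not the ActorsNeRF model: a one-dimensional linear fit replaces the NeRF, a shared slope plays the role of the category-level prior, and every name in it (`SHARED_SLOPE`, `make_samples`, `fit`, `mse`) is hypothetical. It only shows why starting from a shared prior can help when the few available samples cover a narrow range.

```python
# Toy stand-in, NOT the ActorsNeRF architecture: each "actor" is a line
# y = SHARED_SLOPE * x + offset. The shared slope stands in for the
# category-level prior; the offset is the actor-specific detail.
SHARED_SLOPE = 2.0


def make_samples(offset, xs):
    """Generate (x, y) pairs for one hypothetical actor."""
    return [(x, SHARED_SLOPE * x + offset) for x in xs]


def fit(params, data, lr=0.1, steps=20):
    """Plain gradient descent on mean squared error for y = w * x + b."""
    w, b = params
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b


def mse(params, data):
    """Mean squared error of the fitted line on a dataset."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


wide_xs = [-1.0, -0.5, 0.0, 0.5, 1.0]   # well-covered input range
narrow_xs = [-0.2, 0.0, 0.2]            # few-shot samples, narrow coverage

# Stage 1: pre-train one shared model on several training actors.
train_pool = [
    sample
    for offset in (-1.0, -0.3, 0.4, 0.9)
    for sample in make_samples(offset, wide_xs)
]
pretrained = fit((0.0, 0.0), train_pool, steps=200)

# Stage 2: adapt to an unseen actor using only the few narrow samples.
few_shot = make_samples(0.5, narrow_xs)
held_out = make_samples(0.5, wide_xs)   # inputs not covered by few_shot

adapted = fit(pretrained, few_shot, steps=20)
scratch = fit((0.0, 0.0), few_shot, steps=20)

err_pretrained = mse(adapted, held_out)
err_scratch = mse(scratch, held_out)
print(f"fine-tuned from pre-trained: {err_pretrained:.4f}")
print(f"trained from scratch:        {err_scratch:.4f}")
```

In this toy setting the few samples barely constrain the slope, so the run that starts from the pre-trained parameters extrapolates better than the one trained from scratch. The numbers it prints say nothing about the results reported for ActorsNeRF itself.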

Few-shot Generalization

We demonstrate on multiple datasets that, with only a few images (e.g., 30 frames from a monocular video), ActorsNeRF synthesizes novel views of a new person with novel poses.


Different Actors with Synchronized Novel Actions


Same Actor with Different Novel Actions


ZJU-MoCap Dataset



Comparisons

We vary the number of input images to test novel view synthesis of novel actors with unseen poses. The results suggest that the category-level prior of ActorsNeRF improves rendering quality across a wide range of few-shot settings.
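For readers who want to set up a similar comparison, a few-shot subset of a given size has to be drawn from the monocular video. This page does not state how the frames are chosen, so the sketch below assumes evenly spaced sampling; the function name `select_few_shot_frames` and the 300-frame video length are hypothetical.

```python
def select_few_shot_frames(num_frames: int, k: int) -> list[int]:
    """Pick k evenly spaced frame indices from a video of num_frames frames.

    Assumption: uniform spacing. The frame-selection rule actually used
    for the experiments is not described on this page.
    """
    if num_frames <= 0 or k <= 0:
        raise ValueError("num_frames and k must be positive")
    if k >= num_frames:
        # Not enough frames to subsample: use all of them.
        return list(range(num_frames))
    # Take the centre of each of k equal-length segments (integer arithmetic).
    return [((2 * i + 1) * num_frames) // (2 * k) for i in range(k)]


# Build the 10-, 30-, and 100-shot subsets for a hypothetical 300-frame clip.
subsets = {k: select_few_shot_frames(300, k) for k in (10, 30, 100)}
for k, indices in subsets.items():
    print(k, indices[:5])
```

Sampling segment centres rather than the first k frames spreads the selected poses across the whole clip, which seems a reasonable default when only a handful of frames are available.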

10-shot



30-shot



100-shot


BibTeX

@inproceedings{mu2023actorsnerf,
  author    = {Mu, Jiteng and Sang, Shen and Vasconcelos, Nuno and Wang, Xiaolong},
  title     = {ActorsNeRF: Animatable Few-shot Human Rendering with Generalizable NeRFs},
  booktitle = {ICCV},
  pages     = {18391--18401},
  year      = {2023}
}