RigAnyFace: Scaling Neural Facial Mesh Auto-Rigging with Unlabeled Data

¹Penn State University, ²Roblox
Equal contribution. *Work partially completed during an internship at Roblox
NeurIPS 2025

TL;DR: We propose a scalable neural auto-rigging framework for facial meshes of diverse topologies with multiple disconnected components.

RigAnyFace teaser

We present RigAnyFace (RAF), an auto-rigging framework that supports facial meshes of diverse topologies with multiple disconnected components such as eyeballs. These meshes are drawn from diverse sources and cover both humanoid and non-humanoid heads. Given only a neutral facial mesh and explicitly controllable FACS parameters specifying activated action units, RAF accurately deforms the input mesh into corresponding FACS poses, creating an expressive blendshape rig.

Abstract

In this paper, we present RigAnyFace (RAF), a scalable neural auto-rigging framework for facial meshes of diverse topologies, including those with multiple disconnected components. RAF deforms a static neutral facial mesh into industry-standard FACS poses to form an expressive blendshape rig. Deformations are predicted by a triangulation-agnostic surface learning network, augmented with a tailored architecture design to condition on FACS parameters and efficiently process disconnected components. For training, we curate a dataset of facial meshes, a subset of which is meticulously rigged by professional artists to serve as accurate 3D ground truth for deformation supervision. Because manual rigging is costly, this subset is limited in size, constraining the generalization ability of models trained exclusively on it. To address this, we design a 2D supervision strategy for unlabeled neutral meshes without rigs. This strategy increases data diversity and enables scaled training, thereby enhancing the generalization ability of models trained on the augmented data. Extensive experiments demonstrate that RAF rigs meshes of diverse topologies, not only on our artist-crafted assets but also on in-the-wild samples, outperforming prior work in accuracy and generalizability. Moreover, our method advances beyond prior work by supporting multiple disconnected components, such as eyeballs, enabling more detailed expression animation.

Applications

RigAnyFace enables various downstream applications:

  • User-Controlled Animation: Artists can directly edit FACS parameters to pose meshes (see the blendshape sketch after this list).
  • Video-to-Mesh Retargeting: Transfer facial expressions from videos to 3D meshes.
  • Animating Generated Meshes: Automatically rig meshes from text-to-3D models.
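To make the blendshape interface concrete, the sketch below evaluates a rig in the standard linear way: the posed mesh is the neutral mesh plus a FACS-weighted sum of per-action-unit displacement fields. This is a minimal illustration, not RAF's actual API; the array names, the action-unit count, and the JAW_OPEN index are all assumptions.

    # Minimal sketch of user-controlled animation with a blendshape rig.
    # Shapes and names are illustrative; RAF's real interface may differ.
    import numpy as np

    NUM_AUS = 52   # assumed number of FACS action units in the rig
    JAW_OPEN = 17  # hypothetical index of the jaw-open action unit

    def pose_mesh(neutral_verts, au_deltas, facs_weights):
        """Linear blendshape evaluation: V(w) = V_neutral + sum_i w_i * D_i.

        neutral_verts: (V, 3) neutral vertex positions.
        au_deltas:     (NUM_AUS, V, 3) per-AU displacement fields from the rig.
        facs_weights:  (NUM_AUS,) user-edited FACS activations in [0, 1].
        """
        offset = np.tensordot(facs_weights, au_deltas, axes=1)  # (V, 3)
        return neutral_verts + offset

    # Example: open the jaw at 80% intensity, all other AUs neutral.
    neutral_verts = np.zeros((1000, 3))       # placeholder mesh
    au_deltas = np.zeros((NUM_AUS, 1000, 3))  # placeholder rig
    facs = np.zeros(NUM_AUS)
    facs[JAW_OPEN] = 0.8
    posed = pose_mesh(neutral_verts, au_deltas, facs)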

Data Collection

Data Collection Pipeline

We collect a diverse set of artist-crafted facial meshes for model training and evaluation. (a) (i) The dataset includes meshes with multiple disconnected components (e.g., eyeballs) and diverse facial shapes. (ii) A subset of neutral head meshes is annotated with blendshape rigs by professional artists. (iii) To expand the dataset, we apply a head interpolation strategy based on standardized UV layouts. (b) For the remaining unrigged samples, we generate 2D supervision. Given a posed image rendered from a rigged head and a neutral image from an unrigged head, a 2D animation model transfers the expression while preserving identity. A flow estimation model then predicts pixel offsets between the neutral and synthesized posed images as 2D displacements.
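Read as pseudocode, the 2D labeling step in (b) can be summarized as below. The sketch assumes generic render, animate_2d (e.g., a MegActor-style portrait animator), and estimate_flow (an off-the-shelf optical-flow model) callables; all of these names are placeholders rather than real library APIs.

    # Sketch of 2D supervision generation for an unrigged neutral head.
    # `render`, `animate_2d`, and `estimate_flow` are assumed callables
    # standing in for a renderer, a 2D portrait animation model, and an
    # optical-flow estimator; none of these are actual RAF APIs.

    def make_2d_labels(unrigged_neutral, rigged_head, facs_params,
                       render, animate_2d, estimate_flow):
        # 1) Render a posed "driver" image from an artist-rigged head.
        driver_img = render(rigged_head.pose(facs_params))
        # 2) Render the unrigged head in its neutral state.
        neutral_img = render(unrigged_neutral)
        # 3) Transfer the driver's expression onto the neutral identity,
        #    preserving that identity in the synthesized posed image.
        posed_img = animate_2d(source=neutral_img, driver=driver_img)
        # 4) Dense pixel offsets between the neutral and posed images
        #    serve as 2D displacement labels for the deformation network.
        flow_2d = estimate_flow(neutral_img, posed_img)  # (H, W, 2)
        return flow_2d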

Method Overview

Method Architecture

(a) Given a neutral facial mesh, our deformation model predicts the 3D displacements needed to deform the mesh into different expressions based on the input FACS vector. During training, 2D supervision is applied to both rigged and unrigged heads, while 3D supervision is applied exclusively to rigged heads. (b) We modify the original DiffusionNet block to accept the FACS vector as an additional conditioning input (left). Additionally, we design a global encoder that processes the vertex positions and normals of the neutral facial mesh to capture holistic information across disconnected components (right).
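A rough sketch of how the two supervision signals could combine in a training step is shown below, assuming the model predicts per-vertex 3D displacements from the neutral mesh and a FACS vector. The project and sample_flow helpers, the L1 losses, and the batch fields are hypothetical stand-ins, not the authors' actual code.

    # Sketch of mixed 2D/3D supervision for one training sample.
    # `project` maps per-vertex 3D offsets to image-plane 2D offsets via
    # the render camera; `sample_flow` reads the precomputed 2D flow at
    # each vertex's pixel location. Both helpers are hypothetical.
    import torch.nn.functional as F

    def training_loss(model, sample, project, sample_flow):
        # Predict per-vertex 3D displacements conditioned on the FACS vector.
        pred_3d = model(sample.neutral_verts, sample.faces, sample.facs)  # (V, 3)
        # 2D supervision applies to both rigged and unrigged heads.
        pred_2d = project(pred_3d, sample.camera)                    # (V, 2)
        target_2d = sample_flow(sample.flow, sample.vertex_pixels)   # (V, 2)
        loss = F.l1_loss(pred_2d, target_2d)
        # 3D supervision is available only for artist-rigged heads.
        if sample.has_rig:
            loss = loss + F.l1_loss(pred_3d, sample.gt_displacements)
        return loss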

Results

Artist-Crafted Meshes

Our method achieves high-quality rigging results on diverse facial meshes, including humanoid and non-humanoid characters with multiple disconnected components.

Artist-Crafted Results Gallery

Qualitative results on our artist-crafted unrigged heads.

Baseline Comparison Gallery

Comparison with Baseline Methods. A reference mesh and corresponding points are provided for Deformation Transfer.

In-the-Wild Meshes

In-the-Wild Results Gallery

RigAnyFace generalizes well to in-the-wild facial meshes from ICT FaceKit, Objaverse, and CGTrader, compared with the prior art NFR.

BibTeX

@inproceedings{ma2025riganyface,
  title     = {RigAnyFace: Scaling Neural Facial Mesh Auto-Rigging with Unlabeled Data},
  author    = {Ma, Wenchao and Kneubuehler, Dario and Chu, Maurice and Sachs, Ian and Jiang, Haomiao and Huang, Sharon X.},
  booktitle = {39th Conference on Neural Information Processing Systems (NeurIPS)},
  year      = {2025}
}

Acknowledgement

We thank Hsueh-Ti Derek Liu, Chrystiano Araújo, and Jinseok Bae for proofreading the draft and providing helpful comments, and Jihyun Yoon for curating the dataset. We also thank the authors of DiffusionNet, MegActor, and NFR for releasing their code.