# audio2face-pytorch

This repository provides PyTorch implementations of audio-driven models that animate face meshes or blendshapes.
Currently, it supports the following models:

- Audio2Face
- VOCA
- FaceFormer

The following feature extractors are available (a minimal usage sketch follows the list):

- Wav2Vec
- MFCCExtractor
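
To illustrate the two kinds of features these extractors produce, here is a rough sketch that calls torchaudio and Hugging Face transformers directly. The `Wav2Vec` and `MFCCExtractor` classes in this repository may wrap these libraries with a different interface, so treat this as an assumption about the inputs the models consume, not as this repository's API; `speech.wav` is a placeholder path.

```python
# Sketch only: raw torchaudio / transformers calls, not this repo's wrappers.
import torch
import torchaudio
from transformers import Wav2Vec2Processor, Wav2Vec2Model

waveform, sr = torchaudio.load("speech.wav")  # placeholder file; (channels, samples)
waveform = torchaudio.functional.resample(waveform, sr, 16_000)

# MFCC features: (channels, n_mfcc, time)
mfcc = torchaudio.transforms.MFCC(sample_rate=16_000, n_mfcc=13)(waveform)

# Wav2Vec 2.0 features: (batch, time, hidden_size)
processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base-960h")
inputs = processor(waveform.squeeze(0).numpy(), sampling_rate=16_000,
                   return_tensors="pt")
with torch.no_grad():
    hidden = model(inputs.input_values).last_hidden_state
```

Both outputs are per-frame feature sequences; an audio-to-face model maps such a sequence to a sequence of mesh vertices or blendshape weights.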

## Dataset

This repository uses the template from VOCASET, which was introduced in "Capture, Learning, and Synthesis of 3D Speaking Styles" (CVPR 2019).
Additionally, the FLAME_sample template has been extracted and converted to assets/FLAME_sample.obj, and the Renderer has been redesigned. As a result, the psbody library, which is known to cause installation issues for Apple Silicon users, is not required by this repository.
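
Because the template now ships as a plain Wavefront OBJ, any standard mesh library can read it. As a minimal sketch (trimesh is an assumption here; the redesigned Renderer may load the file differently):

```python
# Hypothetical example: reading the converted template without psbody.
import trimesh

mesh = trimesh.load("assets/FLAME_sample.obj", process=False)  # keep vertex order
print(mesh.vertices.shape)  # FLAME topology: (5023, 3) vertices
print(mesh.faces.shape)     # (9976, 3) triangle faces
```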

## License

VOCA (link)

## References

- VOCASET (ref)
- Cudeiro, Daniel, et al. "Capture, Learning, and Synthesis of 3D Speaking Styles." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2019. (ref)
- TimoBolkart/voca (ref)
- Fan, Yingruo, et al. "FaceFormer: Speech-Driven 3D Facial Animation with Transformers." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2022. (ref)
- NVIDIA. "Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion." (ref)