"The CAPP (Contrastive Audio and Pose Pretraining) model should be available in a few weeks." #21

johndpope · 2024-10-30T12:17:16Z

Joint Optimization:
Primary diffusion loss ensures high-quality generation
CAPP loss ensures better audio-pose alignment
Weighted combination controls the relative importance of the two losses
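
To make the joint objective concrete, here is a minimal sketch of how the two terms could be combined. Everything in it is an assumption for illustration: the names `capp_alignment_loss` and `joint_loss`, the default weight of 0.1, and especially the use of `1 - cosine similarity` as a stand-in for the CAPP loss. The real CAPP model is not released yet, so its actual loss may look quite different.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def capp_alignment_loss(audio_emb, pose_emb):
    """Hypothetical stand-in for the CAPP loss: 0 when the audio and pose
    embeddings point the same way, larger as they diverge."""
    return 1.0 - cosine_similarity(audio_emb, pose_emb)


def joint_loss(diffusion_loss, capp_loss, capp_weight=0.1):
    """Weighted sum of the primary diffusion loss and the alignment loss.
    capp_weight (an assumed hyperparameter) sets how much alignment matters."""
    return diffusion_loss + capp_weight * capp_loss


# Toy values only: a diffusion loss of 1.0 and two small embeddings.
alignment = capp_alignment_loss([1.0, 0.0], [0.0, 1.0])  # orthogonal vectors
total = joint_loss(1.0, alignment, capp_weight=0.2)
```

Setting `capp_weight` to 0 recovers the "without CAPP" case, where only the motion-prediction term is optimized.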

Training Insights:

Without CAPP:

Only optimizes for motion prediction
No explicit audio-pose alignment objective

With CAPP:

Direct feedback on alignment quality
Better learning of natural head movements
Improved synchronization with speech patterns

Validation Benefits:

CAPP score provides quantitative metric
Helps identify best checkpoints
Better model selection criteria
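
As a rough illustration of checkpoint selection, the sketch below picks the checkpoint with the best validation CAPP score. It assumes that a higher score means better audio-pose alignment and that scores have already been computed per checkpoint; the function name `select_best_checkpoint` and the checkpoint labels are hypothetical, and the numbers are made-up placeholders rather than real results.

```python
def select_best_checkpoint(capp_scores):
    """Return the checkpoint name with the highest validation CAPP score.

    capp_scores maps a checkpoint name to its score. Assumes higher is
    better; flip max to min if the metric is a distance instead.
    """
    if not capp_scores:
        raise ValueError("no checkpoints to choose from")
    return max(capp_scores, key=capp_scores.get)


# Placeholder scores for illustration only.
validation_scores = {
    "step_1000": 0.41,
    "step_2000": 0.57,
    "step_3000": 0.52,
}
best = select_best_checkpoint(validation_scores)
```

In practice this would probably be combined with the diffusion validation loss rather than used alone, since a good alignment score does not by itself guarantee good generation quality.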
