An aggregation of human motion understanding research.
Topics covered: Motion Generation, Motion Editing, Motion Stylization, Human-Object Interaction, Human-Scene Interaction, Human-Human Interaction.
- (ArXiv 2025) Gu et al. Humanoid Locomotion and Manipulation: Current Progress and Challenges in Control, Planning, and Learning.
- (ArXiv 2024) Zhao et al. Motion Generation Review: Exploring Deep Learning for Lifelike Animation with Manifold, Zhao et al.
- (TPAMI 2023) Zhu et al. Human Motion Generation: A Survey, Zhu et al.
- (ICRA 2025) MotionGlot. MotionGlot: A Multi-Embodied Motion Generation Model, Harithas et al.
- (ICLR 2025) CLoSD. CLoSD: Closing the Loop between Simulation and Diffusion for multi-task character control, Tevet et al.
- (ICLR 2025) DART. DART: A Diffusion-Based Autoregressive Motion Model for Real-Time Text-Driven Motion Control, Zhao et al.
- (ICLR 2025) Motion-Agent. Motion-Agent: A Conversational Framework for Human Motion Generation with LLMs, Wu et al.
- (ArXiv 2025) MotionPCM. MotionPCM: Real-Time Motion Synthesis with Phased Consistency Model, Jiang et al.
- (ArXiv 2025) GestureLSM. GestureLSM: Latent Shortcut based Co-Speech Gesture Generation with Spatial-Temporal Modeling, Liu et al.
- (ArXiv 2025) Free-T2M. Free-T2M: Frequency Enhanced Text-to-Motion Diffusion Model With Consistency Loss, Chen et al.
- (ArXiv 2025) LS-GAN. LS-GAN: Human Motion Synthesis with Latent-space GANs, Amballa et al.
- (ArXiv 2025) FlexMotion. FlexMotion: Lightweight, Physics-Aware, and Controllable Human Motion Generation, Tashakori et al.
- (ArXiv 2025) PackDiT. PackDiT: Joint Human Motion and Text Generation via Mutual Prompting, Jiang et al.
- (3DV 2025) Unimotion. Unimotion: Unifying 3D Human Motion Synthesis and Understanding, Li et al.
- (AAAI 2025) UniMuMo. UniMuMo: Unified Text, Music and Motion Generation, Yang et al.
- (AAAI 2025) ALERT-Motion. Autonomous LLM-Enhanced Adversarial Attack for Text-to-Motion, Miao et al.
- (AAAI 2025) MotionCraft. MotionCraft: Crafting Whole-Body Motion with Plug-and-Play Multimodal Controls, Bian et al.
- (AAAI 2025) Light-T2M. Light-T2M: A Lightweight and Fast Model for Text-to-motion Generation, Zeng et al.
- (WACV 2025) ReinDiffuse. ReinDiffuse: Crafting Physically Plausible Motions with Reinforced Diffusion Model, Han et al.
- (WACV 2025) MoRAG. MoRAG -- Multi-Fusion Retrieval Augmented Generation for Human Motion, Shashank et al.
- (ArXiv 2024) SemTalk. SemTalk: Holistic Co-speech Motion Generation with Frame-level Semantic Emphasis, Zhang et al.
- (ArXiv 2024) InterDance. InterDance: Reactive 3D Dance Generation with Realistic Duet Interactions, Li et al.
- (ArXiv 2024) ScaMo. ScaMo: Exploring the Scaling Law in Autoregressive Motion Generation Model, Lu et al.
- (ArXiv 2024) EnergyMoGen. EnergyMoGen: Compositional Human Motion Generation with Energy-Based Diffusion Model in Latent Space, Zhang et al.
- (ArXiv 2024) Move-in-2D. Move-in-2D: 2D-Conditioned Human Motion Generation, Huang et al.
- (ArXiv 2024) Motion-2-to-3. Motion-2-to-3: Leveraging 2D Motion Data to Boost 3D Motion Generation, Pi et al.
- (ArXiv 2024) Language of Motion. The Language of Motion: Unifying Verbal and Non-verbal Language of 3D Human Motion, Chen et al.
- (ArXiv 2024) Mogo. Mogo: RQ Hierarchical Causal Transformer for High-Quality 3D Human Motion Generation, Fu et al.
- (ArXiv 2024) CoMA. CoMA: Compositional Human Motion Generation with Multi-modal Agents, Sun et al.
- (ArXiv 2024) SoPo. SoPo: Text-to-Motion Generation Using Semi-Online Preference Optimization, Tan et al.
- (ArXiv 2024) RMD. RMD: A Simple Baseline for More General Human Motion Generation via Training-free Retrieval-Augmented Motion Diffuse, Liao et al.
- (ArXiv 2024) BiPO. BiPO: Bidirectional Partial Occlusion Network for Text-to-Motion Synthesis, Hong et al.
- (ArXiv 2024) AToM. AToM: Aligning Text-to-Motion Model at Event-Level with GPT-4Vision Reward, Han et al.
- (ArXiv 2024) MVLift. Lifting Motion to the 3D World via 2D Diffusion, Li et al.
- (ArXiv 2024) DisCoRD. DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding, Cho et al.
- (ArXiv 2024) MoTe. MoTe: Learning Motion-Text Diffusion Model for Multiple Generation Tasks, Wu et al.
- (ArXiv 2024) InfiniDreamer. InfiniDreamer: Arbitrarily Long Human Motion Generation via Segment Score Distillation, Zhuo et al.
- (ArXiv 2024) FTMoMamba. FTMoMamba: Motion Generation with Frequency and Text State Space Models, Li et al.
- (ArXiv 2024) MARDM. Rethinking Diffusion for Text-Driven Human Motion Generation, Meng et al.
- (ArXiv 2024) KinMo. KinMo: Kinematic-aware Human Motion Understanding and Generation, Zhang et al.
- (ArXiv 2024) LLaMo. Human Motion Instruction Tuning, Li et al.
- (ArXiv 2024) Morph. Morph: A Motion-free Physics Optimization Framework for Human Motion Generation, Li et al.
- (ArXiv 2024) KMM. KMM: Key Frame Mask Mamba for Extended Motion Generation, Zhang et al.
- (ArXiv 2024) MotionGPT-2. MotionGPT-2: A General-Purpose Motion-Language Model for Motion Generation and Understanding, Wang et al.
- (ArXiv 2024) Lodge++. Lodge++: High-quality and Long Dance Generation with Vivid Choreography Patterns, Li et al.
- (ArXiv 2024) MotionCLR. MotionCLR: Motion Generation and Training-free Editing via Understanding Attention Mechanisms, Chen et al.
- (ArXiv 2024) LEAD. LEAD: Latent Realignment for Human Motion Diffusion, Andreou et al.
- (ArXiv 2024) Leite et al. Enhancing Motion Variation in Text-to-Motion Models via Pose and Video Conditioned Editing, Leite et al.
- (ArXiv 2024) MotionRL. MotionRL: Align Text-to-Motion Generation to Human Preferences with Multi-Reward Reinforcement Learning, Liu et al.
- (ArXiv 2024) MotionLLM. MotionLLM: Understanding Human Behaviors from Human Motions and Videos, Chen et al.
- (ArXiv 2024) Wang et al. Quo Vadis, Motion Generation? From Large Language Models to Large Motion Models, Wang et al.
- (ArXiv 2024) T2M-X. T2M-X: Learning Expressive Text-to-Motion Generation from Partially Annotated Data, Liu et al.
- (ArXiv 2024) Mandelli et al. Generation of Complex 3D Human Motion by Temporal and Spatial Composition of Diffusion Models, Mandelli et al.
- (ArXiv 2024) BAD. BAD: Bidirectional Auto-regressive Diffusion for Text-to-Motion Generation, Hosseyni et al.
- (ArXiv 2024) synNsync. Synergy and Synchrony in Couple Dances, Maluleke et al.
- (EMNLP 2024) Dong et al. Word-Conditioned 3D American Sign Language Motion Generation, Dong et al.
- (NeurIPS D&B 2024) Text to Blind Motion. Text to Blind Motion, Kim et al.
- (NeurIPS 2024) UniMTS. UniMTS: Unified Pre-training for Motion Time Series, Zhang et al.
- (NeurIPS 2024) Christopher et al. Constrained Synthesis with Projected Diffusion Models, Christopher et al.
- (NeurIPS 2024) MoMu-Diffusion. MoMu-Diffusion: On Learning Long-Term Motion-Music Synchronization and Correspondence, You et al.
- (NeurIPS 2024) MoGenTS. MoGenTS: Motion Generation based on Spatial-Temporal Joint Modeling, Yuan et al.
- (NeurIPS 2024) M3GPT. M3GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation, Luo et al.
- (NeurIPS Workshop 2024) Bikov et al. Fitness Aware Human Motion Generation with Fine-Tuning, Bikov et al.
- (ICPR 2024) FG-MDM. FG-MDM: Towards Zero-Shot Human Motion Generation via ChatGPT-Refined Descriptions, Shi et al.
- (ACM MM 2024) SynTalker. Enabling Synergistic Full-Body Control in Prompt-Based Co-Speech Motion Generation, Chen et al.
- (ACM MM 2024) L3EM. Towards Emotion-enriched Text-to-Motion Generation via LLM-guided Limb-level Emotion Manipulating, Yu et al.
- (ACM MM 2024) StableMoFusion. StableMoFusion: Towards Robust and Efficient Diffusion-based Motion Generation Framework, Huang et al.
- (ACM MM 2024) SATO. SATO: Stable Text-to-Motion Framework, Chen et al.
- (ICANN 2024) PIDM. PIDM: Personality-Aware Interaction Diffusion Model for Gesture Generation, Shibasaki et al.
- (HFES 2024) Macwan et al. High-Fidelity Worker Motion Simulation With Generative AI, Macwan et al.
- (ECCV 2024) Jin et al. Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation, Jin et al.
- (ECCV 2024) Motion Mamba. Motion Mamba: Efficient and Long Sequence Motion Generation, Zhong et al.
- (ECCV 2024) EMDM. EMDM: Efficient Motion Diffusion Model for Fast, High-Quality Human Motion Generation, Zhou et al.
- (ECCV 2024) CoMo. CoMo: Controllable Motion Generation through Language Guided Pose Code Editing, Huang et al.
- (ECCV 2024) CoMusion. CoMusion: Towards Consistent Stochastic Human Motion Prediction via Motion Diffusion, Sun et al.
- (ECCV 2024) Shan et al. Towards Open Domain Text-Driven Synthesis of Multi-Person Motions, Shan et al.
- (ECCV 2024) ParCo. ParCo: Part-Coordinating Text-to-Motion Synthesis, Zou et al.
- (ECCV 2024) Sampieri et al. Length-Aware Motion Synthesis via Latent Diffusion, Sampieri et al.
- (ECCV 2024) ChroAccRet. Chronologically Accurate Retrieval for Temporal Grounding of Motion-Language Models, Fujiwara et al.
- (ECCV 2024) MHC. Generating Physically Realistic and Directable Human Motions from Multi-Modal Inputs, Liu et al.
- (ECCV 2024) ProMotion. Plan, Posture and Go: Towards Open-vocabulary Text-to-Motion Generation, Liu et al.
- (ECCV 2024) FreeMotion. FreeMotion: MoCap-Free Human Motion Synthesis with Multimodal Large Language Models, Zhang et al.
- (ECCV 2024) Text Motion Translator. Text Motion Translator: A Bi-Directional Model for Enhanced 3D Human Motion Generation from Open-Vocabulary Descriptions, Qian et al.
- (ECCV 2024) FreeMotion. FreeMotion: A Unified Framework for Number-free Text-to-Motion Synthesis, Fan et al.
- (ECCV 2024) Kinematic Phrases. Bridging the Gap between Human Motion and Action Semantics via Kinematic Phrases, Liu et al.
- (ECCV 2024) MotionChain. MotionChain: Conversational Motion Controllers via Multimodal Prompts, Jiang et al.
- (ECCV 2024) SMooDi. SMooDi: Stylized Motion Diffusion Model, Zhong et al.
- (ECCV 2024) BAMM. BAMM: Bidirectional Autoregressive Motion Model, Pinyoanuntapong et al.
- (ECCV 2024) MotionLCM. MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model, Dai et al.
- (ECCV 2024) Ren et al. Realistic Human Motion Generation with Cross-Diffusion Models, Ren et al.
- (ECCV 2024) M2D2M. M2D2M: Multi-Motion Generation from Text with Discrete Diffusion Models, Chi et al.
- (ECCV 2024) Large Motion Model. Large Motion Model for Unified Multi-Modal Motion Generation, Zhang et al.
- (ECCV 2024) TesMo. Generating Human Interaction Motions in Scenes with Text Control, Yi et al.
- (ECCV 2024) TLControl. TLControl: Trajectory and Language Control for Human Motion Synthesis, Wan et al.
- (ICME 2024) ExpGest. ExpGest: Expressive Speaker Generation Using Diffusion Model and Hybrid Audio-Text Guidance, Cheng et al.
- (ICME Workshop 2024) Chen et al. Anatomically-Informed Vector Quantization Variational Auto-Encoder for Text-to-Motion Generation, Chen et al.
- (ICML 2024) HumanTOMATO. HumanTOMATO: Text-aligned Whole-body Motion Generation, Lu et al.
- (ICML 2024) GPHLVM. Bringing Motion Taxonomies to Continuous Domains via GPLVM on Hyperbolic Manifolds, Jaquier et al.
- (SIGGRAPH 2024) CondMDI. Flexible Motion In-betweening with Diffusion Models, Cohan et al.
- (SIGGRAPH 2024) CAMDM. Taming Diffusion Probabilistic Models for Character Control, Chen et al.
- (SIGGRAPH 2024) LGTM. LGTM: Local-to-Global Text-Driven Human Motion Diffusion Models, Sun et al.
- (SIGGRAPH 2024) TEDi. TEDi: Temporally-Entangled Diffusion for Long-Term Motion Synthesis, Zhang et al.
- (SIGGRAPH 2024) A-MDM. Interactive Character Control with Auto-Regressive Motion Diffusion Models, Shi et al.
- (SIGGRAPH 2024) Starke et al. Categorical Codebook Matching for Embodied Character Controllers, Starke et al.
- (SIGGRAPH 2024) SuperPADL. SuperPADL: Scaling Language-Directed Physics-Based Control with Progressive Supervised Distillation, Juravsky et al.
- (CVPR 2024) ProgMoGen. Programmable Motion Generation for Open-set Motion Control Tasks, Liu et al.
- (CVPR 2024) PACER+. PACER+: On-Demand Pedestrian Animation Controller in Driving Scenarios, Wang et al.
- (CVPR 2024) AMUSE. Emotional Speech-driven 3D Body Animation via Disentangled Latent Diffusion, Chhatre et al.
- (CVPR 2024) Liu et al. Towards Variable and Coordinated Holistic Co-Speech Motion Generation, Liu et al.
- (CVPR 2024) MAS. MAS: Multi-view Ancestral Sampling for 3D motion generation using 2D diffusion, Kapon et al.
- (CVPR 2024) WANDR. WANDR: Intention-guided Human Motion Generation, Diomataris et al.
- (CVPR 2024) MoMask. MoMask: Generative Masked Modeling of 3D Human Motions, Guo et al.
- (CVPR 2024) ChatPose. ChatPose: Chatting about 3D Human Pose, Feng et al.
- (CVPR 2024) AvatarGPT. AvatarGPT: All-in-One Framework for Motion Understanding, Planning, Generation and Beyond, Zhou et al.
- (CVPR 2024) MMM. MMM: Generative Masked Motion Model, Pinyoanuntapong et al.
- (CVPR 2024) AAMDM. AAMDM: Accelerated Auto-regressive Motion Diffusion Model, Li et al.
- (CVPR 2024) OMG. OMG: Towards Open-vocabulary Motion Generation via Mixture of Controllers, Liang et al.
- (CVPR 2024) FlowMDM. FlowMDM: Seamless Human Motion Composition with Blended Positional Encodings, Barquero et al.
- (CVPR 2024) Digital Life Project. Digital Life Project: Autonomous 3D Characters with Social Intelligence, Cai et al.
- (CVPR 2024) EMAGE. EMAGE: Towards Unified Holistic Co-Speech Gesture Generation via Expressive Masked Audio Gesture Modeling, Liu et al.
- (CVPR Workshop 2024) STMC. Multi-Track Timeline Control for Text-Driven 3D Human Motion Generation, Petrovich et al.
- (CVPR Workshop 2024) InstructMotion. Exploring Text-to-Motion Generation with Human Preference, Sheng et al.
- (ICLR 2024) SinMDM. Single Motion Diffusion, Raab et al.
- (ICLR 2024) NeRM. NeRM: Learning Neural Representations for High-Framerate Human Motion Synthesis, Wei et al.
- (ICLR 2024) PriorMDM. PriorMDM: Human Motion Diffusion as a Generative Prior, Shafir et al.
- (ICLR 2024) OmniControl. OmniControl: Control Any Joint at Any Time for Human Motion Generation, Xie et al.
- (ICLR 2024) Adiya et al. Bidirectional Temporal Diffusion Model for Temporally Consistent Human Animation, Adiya et al.
- (ICLR 2024) Duolando. Duolando: Follower GPT with Off-Policy Reinforcement Learning for Dance Accompaniment, Li et al.
- (AAAI 2024) HuTuMotion. HuTuMotion: Human-Tuned Navigation of Latent Motion Diffusion Models with Minimal Feedback, Han et al.
- (AAAI 2024) AMD. AMD: Anatomical Motion Diffusion with Interpretable Motion Decomposition and Fusion, Jing et al.
- (AAAI 2024) MotionMix. MotionMix: Weakly-Supervised Diffusion for Controllable Motion Generation, Hoang et al.
- (AAAI 2024) B2A-HDM. Towards Detailed Text-to-Motion Synthesis via Basic-to-Advanced Hierarchical Diffusion Model, Xie et al.
- (AAAI 2024) Everything2Motion. Everything2Motion: Synchronizing Diverse Inputs via a Unified Framework for Human Motion Synthesis, Fan et al.
- (AAAI 2024) MotionGPT. MotionGPT: Finetuned LLMs are General-Purpose Motion Generators, Zhang et al.
- (AAAI 2024) Dong et al. Enhanced Fine-grained Motion Diffusion for Text-driven Human Motion Synthesis, Dong et al.
- (AAAI 2024) UNIMASKM. A Unified Masked Autoencoder with Patchified Skeletons for Motion Synthesis, Mascaro et al.
- (TPAMI 2024) GUESS. GUESS: GradUally Enriching SyntheSis for Text-Driven Human Motion Generation, Gao et al.
- (WACV 2024) Xie et al. Sign Language Production with Latent Motion Transformer, Xie et al.
- (NeurIPS 2023) GraphMotion. Act As You Wish: Fine-grained Control of Motion Diffusion Model with Hierarchical Semantic Graphs, Jin et al.
- (NeurIPS 2023) MotionGPT. MotionGPT: Human Motion as Foreign Language, Jiang et al.
- (NeurIPS 2023) FineMoGen. FineMoGen: Fine-Grained Spatio-Temporal Motion Generation and Editing, Zhang et al.
- (NeurIPS 2023) InsActor. InsActor: Instruction-driven Physics-based Characters, Ren et al.
- (ICCV 2023) AttT2M. AttT2M: Text-Driven Human Motion Generation with Multi-Perspective Attention Mechanism, Zhong et al.
- (ICCV 2023) TMR. TMR: Text-to-Motion Retrieval Using Contrastive 3D Human Motion Synthesis, Petrovich et al.
- (ICCV 2023) MAA. Make-An-Animation: Large-Scale Text-conditional 3D Human Motion Generation, Azadi et al.
- (ICCV 2023) PhysDiff. PhysDiff: Physics-Guided Human Motion Diffusion Model, Yuan et al.
- (ICCV 2023) ReMoDiffuse. ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model, Zhang et al.
- (ICCV 2023) BeLFusion. BeLFusion: Latent Diffusion for Behavior-Driven Human Motion Prediction, Barquero et al.
- (ICCV 2023) GMD. GMD: Guided Motion Diffusion for Controllable Human Motion Synthesis, Karunratanakul et al.
- (ICCV 2023) HMD-NeMo. HMD-NeMo: Online 3D Avatar Motion Generation From Sparse Observations, Aliakbarian et al.
- (ICCV 2023) SINC. SINC: Spatial Composition of 3D Human Motions for Simultaneous Action Generation, Athanasiou et al.
- (ICCV 2023) Kong et al. Priority-Centric Human Motion Generation in Discrete Latent Space, Kong et al.
- (ICCV 2023) Fg-T2M. Fg-T2M: Fine-Grained Text-Driven Human Motion Generation via Diffusion Model, Wang et al.
- (ICCV 2023) EMS. Breaking The Limits of Text-conditioned 3D Motion Synthesis with Elaborative Descriptions, Qian et al.
- (SIGGRAPH 2023) GenMM. Example-based Motion Synthesis via Generative Motion Matching, Li et al.
- (SIGGRAPH 2023) GestureDiffuCLIP. GestureDiffuCLIP: Gesture Diffusion Model with CLIP Latents, Ao et al.
- (SIGGRAPH 2023) BodyFormer. BodyFormer: Semantics-guided 3D Body Gesture Synthesis with Transformer, Pang et al.
- (SIGGRAPH 2023) Alexanderson et al. Listen, Denoise, Action! Audio-driven Motion Synthesis with Diffusion Models, Alexanderson et al.
- (CVPR 2023) AGroL. Avatars Grow Legs: Generating Smooth Human Motion from Sparse Tracking Inputs with Diffusion Model, Du et al.
- (CVPR 2023) TALKSHOW. Generating Holistic 3D Human Motion from Speech, Yi et al.
- (CVPR 2023) T2M-GPT. T2M-GPT: Generating Human Motion from Textual Descriptions with Discrete Representations, Zhang et al.
- (CVPR 2023) UDE. UDE: A Unified Driving Engine for Human Motion Generation, Zhou et al.
- (CVPR 2023) OOHMG. Being Comes from Not-being: Open-vocabulary Text-to-Motion Generation with Wordless Training, Lin et al.
- (CVPR 2023) EDGE. EDGE: Editable Dance Generation From Music, Tseng et al.
- (CVPR 2023) MLD. Executing your Commands via Motion Diffusion in Latent Space, Chen et al.
- (CVPR 2023) MoDi. MoDi: Unconditional Motion Synthesis from Diverse Data, Raab et al.
- (CVPR 2023) MoFusion. MoFusion: A Framework for Denoising-Diffusion-based Motion Synthesis, Dabral et al.
- (CVPR 2023) Mo et al. Continuous Intermediate Token Learning with Implicit Motion Manifold for Keyframe Based Motion Interpolation, Mo et al.
- (ICLR 2023) MDM. MDM: Human Motion Diffusion Model, Tevet et al.
- (TPAMI 2023) MotionDiffuse. MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model, Zhang et al.
- (TPAMI 2023) Bailando++. Bailando++: 3D Dance GPT with Choreographic Memory, Li et al.
- (ArXiv 2023) UDE-2. A Unified Framework for Multimodal, Multi-Part Human Motion Synthesis, Zhou et al.
- (ArXiv 2023) MotionScript. MotionScript: Natural Language Descriptions for Expressive 3D Human Motions, Yazdian et al.
- (NeurIPS 2022) NeMF. NeMF: Neural Motion Fields for Kinematic Animation, He et al.
- (SIGGRAPH Asia 2022) PADL. PADL: Language-Directed Physics-Based Character Control, Juravsky et al.
- (SIGGRAPH Asia 2022) Rhythmic Gesticulator. Rhythmic Gesticulator: Rhythm-Aware Co-Speech Gesture Synthesis with Hierarchical Neural Embeddings, Ao et al.
- (3DV 2022) TEACH. TEACH: Temporal Action Composition for 3D Humans, Athanasiou et al.
- (ECCV 2022) Implicit Motion. Implicit Neural Representations for Variable Length Human Motion Generation, Cervantes et al.
- (ECCV 2022) Zhong et al. Learning Uncoupled-Modulation CVAE for 3D Action-Conditioned Human Motion Synthesis, Zhong et al.
- (ECCV 2022) MotionCLIP. MotionCLIP: Exposing Human Motion Generation to CLIP Space, Tevet et al.
- (ECCV 2022) PoseGPT. PoseGPT: Quantizing human motion for large scale generative modeling, Lucas et al.
- (ECCV 2022) TEMOS. TEMOS: Generating diverse human motions from textual descriptions, Petrovich et al.
- (ECCV 2022) TM2T. TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and Texts, Guo et al.
- (SIGGRAPH 2022) AvatarCLIP. AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars, Hong et al.
- (SIGGRAPH 2022) DeepPhase. DeepPhase: Periodic Autoencoders for Learning Motion Phase Manifolds, Starke et al.
- (CVPR 2022) Guo et al. Generating Diverse and Natural 3D Human Motions from Text, Guo et al.
- (CVPR 2022) Bailando. Bailando: 3D Dance Generation by Actor-Critic GPT with Choreographic Memory, Li et al.
- (ICCV 2021) ACTOR. Action-Conditioned 3D Human Motion Synthesis with Transformer VAE, Petrovich et al.
- (ICCV 2021) AIST++. AI Choreographer: Music Conditioned 3D Dance Generation with AIST++, Li et al.
- (SIGGRAPH 2021) Starke et al. Neural Animation Layering for Synthesizing Martial Arts Movements, Starke et al.
- (CVPR 2021) MOJO. We are More than Our Joints: Predicting how 3D Bodies Move, Zhang et al.
- (ECCV 2020) DLow. DLow: Diversifying Latent Flows for Diverse Human Motion Prediction, Yuan et al.
- (SIGGRAPH 2020) Starke et al. Local Motion Phases for Learning Multi-Contact Character Movements, Starke et al.
- (SIGGRAPH Asia 2024) MotionFix. MotionFix: Text-Driven 3D Human Motion Editing, Athanasiou et al.
- (NeurIPS 2024) CigTime. CigTime: Corrective Instruction Generation Through Inverse Motion Editing, Fang et al.
- (SIGGRAPH 2024) Iterative Motion Editing. Iterative Motion Editing with Natural Language, Goel et al.
- (CVPR 2024) DNO. DNO: Optimizing Diffusion Noise Can Serve As Universal Motion Priors, Karunratanakul et al.
- (ArXiv 2024) MulSMo. MulSMo: Multimodal Stylized Motion Generation by Bidirectional Control Flow, Li et al.
- (TSMC 2024) D-LORD. D-LORD for Motion Stylization, Gupta et al.
- (ECCV 2024) HUMOS. HUMOS: Human Motion Model Conditioned on Body Shape, Tripathi et al.
- (SIGGRAPH 2024) SMEAR. SMEAR: Stylized Motion Exaggeration with ARt-direction, Basset et al.
- (CVPR 2024) MCM-LDM. Arbitrary Motion Style Transfer with Multi-condition Motion Latent Diffusion Model, Song et al.
- (CVPR 2024) MoST. MoST: Motion Style Transformer between Diverse Action Contents, Kim et al.
- (ICLR 2024) GenMoStyle. Generative Human Motion Stylization in Latent Space, Guo et al.
- (ArXiv 2025) MobileH2R. MobileH2R: Learning Generalizable Human to Mobile Robot Handover Exclusively from Scalable and Diverse Synthetic Data, Wang et al.
- (AAAI 2025) DiffGrasp. DiffGrasp: Whole-Body Grasping Synthesis Guided by Object Motion Using a Diffusion Model, Zhang et al.
- (3DV 2025) Paschalidis et al. 3D Whole-body Grasp Synthesis with Directional Controllability, Paschalidis et al.
- (3DV 2025) InterTrack. InterTrack: Tracking Human Object Interaction without Object Templates, Xie et al.
- (3DV 2025) FORCE. FORCE: Dataset and Method for Intuitive Physics Guided Human-object Interaction, Zhang et al.
- (ArXiv 2024) SyncDiff. SyncDiff: Synchronized Motion Diffusion for Multi-Body Human-Object Interaction Synthesis, He et al.
- (ArXiv 2024) CHOICE. CHOICE: Coordinated Human-Object Interaction in Cluttered Environments for Pick-and-Place Actions, Lu et al.
- (ArXiv 2024) TriDi. TriDi: Trilateral Diffusion of 3D Humans, Objects and Interactions, Petrov et al.
- (ArXiv 2024) OOD-HOI. OOD-HOI: Text-Driven 3D Whole-Body Human-Object Interactions Generation Beyond Training Domains, Zhang et al.
- (ArXiv 2024) COLLAGE. COLLAGE: Collaborative Human-Agent Interaction Generation using Hierarchical Latent Diffusion and Language Models, Daiya et al.
- (ArXiv 2024) SMGDiff. SMGDiff: Soccer Motion Generation using diffusion probabilistic models, Yang et al.
- (ArXiv 2024) SkillMimic. SkillMimic: Learning Reusable Basketball Skills from Demonstrations, Wang et al.
- (ArXiv 2024) CORE4D. CORE4D: A 4D Human-Object-Human Interaction Dataset for Collaborative Object REarrangement, Zhang et al.
- (ArXiv 2024) Wu et al. Human-Object Interaction from Human-Level Instructions, Wu et al.
- (NeurIPS 2024) HumanVLA. HumanVLA: Towards Vision-Language Directed Object Rearrangement by Physical Humanoid, Xu et al.
- (NeurIPS 2024) OmniGrasp. Grasping Diverse Objects with Simulated Humanoids, Luo et al.
- (NeurIPS 2024) EgoChoir. EgoChoir: Capturing 3D Human-Object Interaction Regions from Egocentric Views, Yang et al.
- (NeurIPS 2024) CooHOI. CooHOI: Learning Cooperative Human-Object Interaction with Manipulated Object Dynamics, Gao et al.
- (NeurIPS 2024) InterDreamer. InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction, Xu et al.
- (NeurIPS 2024) PiMForce. Posture-Informed Muscular Force Learning for Robust Hand Pressure Estimation, Seo et al.
- (ECCV 2024) InterFusion. InterFusion: Text-Driven Generation of 3D Human-Object Interaction, Dai et al.
- (ECCV 2024) CHOIS. Controllable Human-Object Interaction Synthesis, Li et al.
- (ECCV 2024) F-HOI. F-HOI: Toward Fine-grained Semantic-Aligned 3D Human-Object Interactions, Yang et al.
- (ECCV 2024) HIMO. HIMO: A New Benchmark for Full-Body Human Interacting with Multiple Objects, Lv et al.
- (SIGGRAPH 2024) PhysicsPingPong. Strategy and Skill Learning for Physics-based Table Tennis Animation, Wang et al.
- (CVPR 2024) NIFTY. NIFTY: Neural Object Interaction Fields for Guided Human Motion Synthesis, Kulkarni et al.
- (CVPR 2024) HOIAnimator. HOIAnimator: Generating Text-prompt Human-object Animations using Novel Perceptive Diffusion Models, Son et al.
- (CVPR 2024) CG-HOI. CG-HOI: Contact-Guided 3D Human-Object Interaction Generation, Diller et al.
- (IJCV 2024) InterCap. InterCap: Joint Markerless 3D Tracking of Humans and Objects in Interaction, Huang et al.
- (3DV 2024) Phys-Fullbody-Grasp. Physically Plausible Full-Body Hand-Object Interaction Synthesis, Braun et al.
- (3DV 2024) GRIP. GRIP: Generating Interaction Poses Using Spatial Cues and Latent Consistency, Taheri et al.
- (AAAI 2024) FAVOR. FAVOR: Full-Body AR-driven Virtual Object Rearrangement Guided by Instruction Text, Li et al.
- (SIGGRAPH Asia 2023) OMOMO. Object Motion Guided Human Motion Synthesis, Li et al.
- (ICCV 2023) CHAIRS. Full-Body Articulated Human-Object Interaction, Jiang et al.
- (ICCV 2023) HGHOI. Hierarchical Generation of Human-Object Interactions with Diffusion Probabilistic Models, Pi et al.
- (ICCV 2023) InterDiff. InterDiff: Generating 3D Human-Object Interactions with Physics-Informed Diffusion, Xu et al.
- (CVPR 2023) Object Pop Up. Object pop-up: Can we infer 3D objects and their poses from human interactions alone? Petrov et al.
- (CVPR 2023) ARCTIC. A Dataset for Dexterous Bimanual Hand-Object Manipulation, Fan et al.
- (ECCV 2022) TOCH. TOCH: Spatio-Temporal Object-to-Hand Correspondence for Motion Refinement, Zhou et al.
- (ECCV 2022) COUCH. COUCH: Towards Controllable Human-Chair Interactions, Zhang et al.
- (ECCV 2022) SAGA. SAGA: Stochastic Whole-Body Grasping with Contact, Wu et al.
- (CVPR 2022) GOAL. GOAL: Generating 4D Whole-Body Motion for Hand-Object Grasping, Taheri et al.
- (CVPR 2022) BEHAVE. BEHAVE: Dataset and Method for Tracking Human Object Interactions, Bhatnagar et al.
- (ECCV 2020) GRAB. GRAB: A Dataset of Whole-Body Human Grasping of Objects, Taheri et al.
- (ICLR 2025) Sitcom-Crafter. Sitcom-Crafter: A Plot-Driven Human Motion Generation System in 3D Scenes, Chen et al.
- (3DV 2025) Paschalidis et al. 3D Whole-body Grasp Synthesis with Directional Controllability, Paschalidis et al.
- (ArXiv 2024) ZeroHSI. ZeroHSI: Zero-Shot 4D Human-Scene Interaction by Video Generation, Li et al.
- (ArXiv 2024) Mimicking-Bench. Mimicking-Bench: A Benchmark for Generalizable Humanoid-Scene Interaction Learning via Human Mimicking, Liu et al.
- (ArXiv 2024) SCENIC. SCENIC: Scene-aware Semantic Navigation with Instruction-guided Control, Zhang et al.
- (ArXiv 2024) EnvPoser. EnvPoser: Environment-aware Realistic Human Motion Estimation from Sparse Observations with Uncertainty Modeling, Xia et al.
- (ArXiv 2024) Diffusion Implicit Policy. Diffusion Implicit Policy for Unpaired Scene-aware Motion Synthesis, Gong et al.
- (ArXiv 2024) SIMS. SIMS: Simulating Human-Scene Interactions with Real World Script Planning, Wang et al.
- (ArXiv 2024) LaserHuman. LaserHuman: Language-guided Scene-aware Human Motion Generation in Free Environment, Cong et al.
- (SIGGRAPH Asia 2024) LINGO. Autonomous Character-Scene Interaction Synthesis from Text Instruction, Jiang et al.
- (NeurIPS 2024) DiMoP3D. Harmonizing Stochasticity and Determinism: Scene-responsive Diverse Human Motion Prediction, Lou et al.
- (ECCV 2024) Liu et al. Revisit Human-Scene Interaction via Space Occupancy, Liu et al.
- (ECCV 2024) TesMo. Generating Human Interaction Motions in Scenes with Text Control, Yi et al.
- (ECCV 2024 Workshop) SAST. Massively Multi-Person 3D Human Motion Forecasting with Scene Context, Mueller et al.
- (Eurographics 2024) Kang et al. Learning Climbing Controllers for Physics-Based Characters, Kang et al.
- (CVPR 2024) Afford-Motion. Move as You Say, Interact as You Can: Language-guided Human Motion Generation with Scene Affordance, Wang et al.
- (CVPR 2024) GenZI. GenZI: Zero-Shot 3D Human-Scene Interaction Generation, Li et al.
- (CVPR 2024) Cen et al. Generating Human Motion in 3D Scenes from Text Descriptions, Cen et al.
- (CVPR 2024) TRUMANS. Scaling Up Dynamic Human-Scene Interaction Modeling, Jiang et al.
- (ICLR 2024) UniHSI. UniHSI: Unified Human-Scene Interaction via Prompted Chain-of-Contacts, Xiao et al.
- (3DV 2024) Purposer. Purposer: Putting Human Motion Generation in Context, Ugrinovic et al.
- (3DV 2024) InterScene. Synthesizing Physically Plausible Human Motions in 3D Scenes, Pan et al.
- (3DV 2024) Mir et al. Generating Continual Human Motion in Diverse 3D Scenes, Mir et al.
- (ICCV 2023) DIMOS. DIMOS: Synthesizing Diverse Human Motions in 3D Indoor Scenes, Zhao et al.
- (ICCV 2023) LAMA. Locomotion-Action-Manipulation: Synthesizing Human-Scene Interactions in Complex 3D Environments, Lee et al.
- (ICCV 2023) Narrator. Narrator: Towards Natural Control of Human-Scene Interaction Generation via Relationship Reasoning, Xuan et al.
- (CVPR 2023) CIMI4D. CIMI4D: A Large Multimodal Climbing Motion Dataset under Human-scene Interactions, Yan et al.
- (CVPR 2023) Scene-Ego. Scene-aware Egocentric 3D Human Pose Estimation, Wang et al.
- (CVPR 2023) SLOPER4D. SLOPER4D: A Scene-Aware Dataset for Global 4D Human Pose Estimation in Urban Environments, Dai et al.
- (CVPR 2023) CIRCLE. CIRCLE: Capture in Rich Contextual Environments, Araujo et al.
- (CVPR 2023) SceneDiffuser. Diffusion-based Generation, Optimization, and Planning in 3D Scenes, Huang et al.
- (SIGGRAPH 2023) PMP. PMP: Learning to Physically Interact with Environments using Part-wise Motion Priors, Bae et al.
- (SIGGRAPH 2023) QuestEnvSim. QuestEnvSim: Environment-Aware Simulated Motion Tracking from Sparse Sensors, Lee et al.
- (SIGGRAPH 2023) Hassan et al. Synthesizing Physical Character-Scene Interactions, Hassan et al.
- (NeurIPS 2022) Mao et al. Contact-aware Human Motion Forecasting, Mao et al.
- (NeurIPS 2022) HUMANISE. HUMANISE: Language-conditioned Human Motion Generation in 3D Scenes, Wang et al.
- (NeurIPS 2022) EmbodiedPose. Embodied Scene-aware Human Pose Estimation, Luo et al.
- (ECCV 2022) GIMO. GIMO: Gaze-Informed Human Motion Prediction in Context, Zheng et al.
- (ECCV 2022) COINS. COINS: Compositional Human-Scene Interaction Synthesis with Semantic Control, Zhao et al.
- (CVPR 2022) Wang et al. Towards Diverse and Natural Scene-aware 3D Human Motion Synthesis, Wang et al.
- (CVPR 2022) GAMMA. The Wanderings of Odysseus in 3D Scenes, Zhang et al.
- (ICCV 2021) SAMP. Stochastic Scene-Aware Motion Prediction, Hassan et al.
- (ICCV 2021) LEMO. Learning Motion Priors for 4D Human Body Capture in 3D Scenes, Zhang et al.
- (3DV 2020) PLACE. PLACE: Proximity Learning of Articulation and Contact in 3D Environments, Zhang et al.
- (SIGGRAPH 2020) Starke et al. Local Motion Phases for Learning Multi-Contact Character Movements, Starke et al.
- (CVPR 2020) PSI. Generating 3D People in Scenes without People, Zhang et al.
- (SIGGRAPH Asia 2019) NSM. Neural State Machine for Character-Scene Interactions, Starke et al.
- (ICCV 2019) PROX. Resolving 3D Human Pose Ambiguities with 3D Scene Constraints, Hassan et al.
- (ArXiv 2024) Two in One. Two-in-One: Unified Multi-Person Interactive Motion Generation by Latent Diffusion Transformer, Li et al.
- (ArXiv 2024) It Takes Two. It Takes Two: Real-time Co-Speech Two-person’s Interaction Generation via Reactive Auto-regressive Diffusion Model, Shi et al.
- (ArXiv 2024) InterMask. InterMask: 3D Human Interaction Generation via Collaborative Masked Modelling, Javed et al.
- (ArXiv 2024) COLLAGE. COLLAGE: Collaborative Human-Agent Interaction Generation using Hierarchical Latent Diffusion and Language Models, Daiya et al.
- (NeurIPS 2024) Harmony4D. Harmony4D: A Video Dataset for In-The-Wild Close Human Interactions, Khirodkar et al.
- (NeurIPS 2024) InterControl. InterControl: Generate Human Motion Interactions by Controlling Every Joint, Wang et al.
- (ACM MM 2024) PhysReaction. PhysReaction: Physically Plausible Real-Time Humanoid Reaction Synthesis via Forward Dynamics Guided 4D Imitation, Liu et al.
- (ECCV 2024) Shan et al. Towards Open Domain Text-Driven Synthesis of Multi-Person Motions, Shan et al.
- (ECCV 2024) ReMoS. ReMoS: 3D Motion-Conditioned Reaction Synthesis for Two-Person Interactions, Ghosh et al.
- (CVPR 2024) Inter-X. Inter-X: Towards Versatile Human-Human Interaction Analysis, Xu et al.
- (CVPR 2024) ReGenNet. ReGenNet: Towards Human Action-Reaction Synthesis, Xu et al.
- (CVPR 2024) Fang et al. Capturing Closely Interacted Two-Person Motions with Reaction Priors, Fang et al.
- (CVPR Workshop 2024) in2IN. in2IN: Leveraging Individual Information to Generate Human INteractions, Ruiz-Ponce et al.
- (IJCV 2024) InterGen. InterGen: Diffusion-based Multi-human Motion Generation under Complex Interactions, Liang et al.
- (ICCV 2023) ActFormer. ActFormer: A GAN-based Transformer towards General Action-Conditioned 3D Human Motion Generation, Xu et al.
- (ICCV 2023) Tanaka et al. Role-aware Interaction Generation from Textual Description, Tanaka et al.
- (CVPR 2023) Hi4D. Hi4D: 4D Instance Segmentation of Close Human Interaction, Yin et al.
- (CVPR 2022) ExPI. Multi-Person Extreme Motion Prediction, Guo et al.
- (CVPR 2020) CHI3D. Three-Dimensional Reconstruction of Human Interactions, Fieraru et al.
- (ICLR 2025) MotionCritic. Aligning Human Motion Generation with Human Perceptions, Wang et al.
- (AAAI 2025) EMHI. EMHI: A Multimodal Egocentric Human Motion Dataset with HMD and Body-Worn IMUs, Fan et al.
- (ArXiv 2025) Motion-X++. Motion-X++: A Large-Scale Multimodal 3D Whole-body Human Motion Dataset, Zhang et al.
- (ArXiv 2024) Mimicking-Bench. Mimicking-Bench: A Benchmark for Generalizable Humanoid-Scene Interaction Learning via Human Mimicking, Liu et al.
- (ArXiv 2024) LaserHuman. LaserHuman: Language-guided Scene-aware Human Motion Generation in Free Environment, Cong et al.
- (ArXiv 2024) AToM. AToM: Aligning Text-to-Motion Model at Event-Level with GPT-4Vision Reward, Han et al.
- (ArXiv 2024) SCENIC. SCENIC: Scene-aware Semantic Navigation with Instruction-guided Control, Zhang et al.
- (ArXiv 2024) synNsync. Synergy and Synchrony in Couple Dances, Maluleke et al.
- (ArXiv 2024) MotionBank. MotionBank: A Large-scale Video Motion Benchmark with Disentangled Rule-based Annotations, Xu et al.
- (ArXiv 2024) CORE4D. CORE4D: A 4D Human-Object-Human Interaction Dataset for Collaborative Object REarrangement, Zhang et al.
- (GitHub 2024) CMP & CMR. AnimationGPT: An AIGC tool for generating game combat motion assets, Liao et al.
- (Scientific Data 2024) Evans et al. Synchronized Video, Motion Capture and Force Plate Dataset for Validating Markerless Human Movement Analysis, Evans et al.
- (SIGGRAPH Asia 2024) LINGO. Autonomous Character-Scene Interaction Synthesis from Text Instruction, Jiang et al.
- (NeurIPS 2024) Harmony4D. Harmony4D: A Video Dataset for In-The-Wild Close Human Interactions, Khirodkar et al.
- (NeurIPS D&B 2024) EgoSim. EgoSim: An Egocentric Multi-view Simulator for Body-worn Cameras during Human Motion, Hollidt et al.
- (NeurIPS D&B 2024) Muscles in Time. Muscles in Time: Learning to Understand Human Motion by Simulating Muscle Activations, Schneider et al.
- (NeurIPS D&B 2024) Text to Blind Motion. Text to Blind Motion, Kim et al.
- (ACM MM 2024) CLaM. CLaM: An Open-Source Library for Performance Evaluation of Text-driven Human Motion Generation, Chen et al.
- (ECCV 2024) AddBiomechanics. AddBiomechanics Dataset: Capturing the Physics of Human Motion at Scale, Werling et al.
- (ECCV 2024) LiveHPS++. LiveHPS++: Robust and Coherent Motion Capture in Dynamic Free Environment, Ren et al.
- (ECCV 2024) SignAvatars. SignAvatars: A Large-scale 3D Sign Language Holistic Motion Dataset and Benchmark, Yu et al.
- (ECCV 2024) Nymeria. Nymeria: A massive collection of multimodal egocentric daily motion in the wild, Ma et al.
- (Multibody System Dynamics 2024) Human3.6M+. Using musculoskeletal models to generate physically-consistent data for 3D human pose, kinematic, dynamic, and muscle estimation, Nasr et al.
- (CVPR 2024) Inter-X. Inter-X: Towards Versatile Human-Human Interaction Analysis, Xu et al.
- (CVPR 2024) HardMo. HardMo: A Large-Scale Hardcase Dataset for Motion Capture, Liao et al.
- (CVPR 2024) MMVP. MMVP: A Multimodal MoCap Dataset with Vision and Pressure Sensors, Zhang et al.
- (CVPR 2024) RELI11D. RELI11D: A Comprehensive Multimodal Human Motion Dataset and Method, Yan et al.
- (SIGGRAPH Asia 2023) GroundLink. GroundLink: A Dataset Unifying Human Body Movement and Ground Reaction Dynamics, Han et al.
- (NeurIPS D&B 2023) HOH. HOH: Markerless Multimodal Human-Object-Human Handover Dataset with Large Object Count, Wiederhold et al.
- (NeurIPS D&B 2023) Motion-X. Motion-X: A Large-scale 3D Expressive Whole-body Human Motion Dataset, Lin et al.
- (NeurIPS D&B 2023) Humans in Kitchens. Humans in Kitchens: A Dataset for Multi-Person Human Motion Forecasting with Scene Context, Tanke et al.
- (ICCV 2023) CHAIRS. Full-Body Articulated Human-Object Interaction, Jiang et al.
- (CVPR 2023) CIMI4D. CIMI4D: A Large Multimodal Climbing Motion Dataset under Human-scene Interactions, Yan et al.
- (CVPR 2023) FLAG3D. FLAG3D: A 3D Fitness Activity Dataset with Language Instruction, Tang et al.
- (CVPR 2023) Hi4D. Hi4D: 4D Instance Segmentation of Close Human Interaction, Yin et al.
- (CVPR 2023) CIRCLE. CIRCLE: Capture in Rich Contextual Environments, Araujo et al.
- (NeurIPS 2022) MoCapAct. MoCapAct: A Multi-Task Dataset for Simulated Humanoid Control, Wagener et al.
- (ACM MM 2022) ForcePose. Learning to Estimate External Forces of Human Motion in Video, Louis et al.
- (ECCV 2022) BEAT. BEAT: A Large-Scale Semantic and Emotional Multi-Modal Dataset for Conversational Gestures Synthesis, Liu et al.
- (ECCV 2022) BRACE. BRACE: The Breakdancing Competition Dataset for Dance Motion Synthesis, Moltisanti et al.
- (ECCV 2022) EgoBody. EgoBody: Human Body Shape and Motion of Interacting People from Head-Mounted Devices, Zhang et al.
- (ECCV 2022) GIMO. GIMO: Gaze-Informed Human Motion Prediction in Context, Zheng et al.
- (ECCV 2022) HuMMan. HuMMan: Multi-Modal 4D Human Dataset for Versatile Sensing and Modeling, Cai et al.
- (CVPR 2022) ExPI. Multi-Person Extreme Motion Prediction, Guo et al.
- (CVPR 2022) HumanML3D. Generating Diverse and Natural 3D Human Motions from Text, Guo et al.
- (CVPR 2022) BEHAVE. BEHAVE: Dataset and Method for Tracking Human Object Interactions, Bhatnagar et al.
- (ICCV 2021) AIST++. AI Choreographer: Music Conditioned 3D Dance Generation with AIST++, Li et al.
- (CVPR 2021) Fit3D. AIFit: Automatic 3D Human-Interpretable Feedback Models for Fitness Training, Fieraru et al.
- (CVPR 2021) BABEL. BABEL: Bodies, Action and Behavior with English Labels, Punnakkal et al.
- (AAAI 2021) HumanSC3D. Learning complex 3d human self-contact, Fieraru et al.
- (CVPR 2020) CHI3D. Three-Dimensional Reconstruction of Human Interactions, Fieraru et al.
- (ICCV 2019) PROX. Resolving 3D Human Pose Ambiguities with 3D Scene Constraints, Hassan et al.
- (ICCV 2019) AMASS. AMASS: Archive of Motion Capture As Surface Shapes, Mahmood et al.
- (ICLR 2025) CLoSD. CLoSD: Closing the Loop between Simulation and Diffusion for multi-task character control, Tevet et al.
- (ArXiv 2025) Gu et al. Humanoid Locomotion and Manipulation: Current Progress and Challenges in Control, Planning, and Learning.
- (ArXiv 2024) UH-1. Learning from Massive Human Videos for Universal Humanoid Pose Control, Mao et al.
- (ArXiv 2024) Mimicking-Bench. Mimicking-Bench: A Benchmark for Generalizable Humanoid-Scene Interaction Learning via Human Mimicking, Liu et al.
- (ArXiv 2024) Exbody2. Exbody2: Advanced Expressive Humanoid Whole-Body Control, Ji et al.
- (ArXiv 2024) SIMS. SIMS: Simulating Human-Scene Interactions with Real World Script Planning, Wang et al.
- (ArXiv 2024) HOVER. HOVER: Versatile Neural Whole-Body Controller for Humanoid Robots, He et al.
- (ArXiv 2024) Humanoidlympics. Humanoidlympics: Sports Environments for Physically Simulated Humanoids, Luo et al.
- (ArXiv 2024) SkillMimic. SkillMimic: Learning Reusable Basketball Skills from Demonstrations, Wang et al.
- (ArXiv 2024) PhysHOI. PhysHOI: Physics-Based Imitation of Dynamic Human-Object Interaction, Wang et al.
- (RA-L 2024) Liu et al. Human-Humanoid Robots Cross-Embodiment Behavior-Skill Transfer Using Decomposed Adversarial Learning from Demonstration, Liu et al.
- (SIGGRAPH Asia 2024) PDP. PDP: Physics-Based Character Animation via Diffusion Policy, Truong et al.
- (SIGGRAPH Asia 2024) MaskedMimic. MaskedMimic: Unified Physics-Based Character Control Through Masked Motion Inpainting, Tessler et al.
- (NeurIPS 2024) HumanVLA. HumanVLA: Towards Vision-Language Directed Object Rearrangement by Physical Humanoid, Xu et al.
- (NeurIPS 2024) OmniGrasp. Grasping Diverse Objects with Simulated Humanoids, Luo et al.
- (NeurIPS 2024) InterControl. InterControl: Generate Human Motion Interactions by Controlling Every Joint, Wang et al.
- (NeurIPS 2024) CooHOI. CooHOI: Learning Cooperative Human-Object Interaction with Manipulated Object Dynamics, Gao et al.
- (NeurIPS 2024) Radosavovic et al. Humanoid Locomotion as Next Token Prediction, Radosavovic et al.
- (CoRL 2024) HARMON. Harmon: Whole-Body Motion Generation of Humanoid Robots from Language Descriptions, Jiang et al.
- (CoRL 2024) OKAMI. OKAMI: Teaching Humanoid Robots Manipulation Skills through Single Video Imitation, Li et al.
- (CoRL 2024) HumanPlus. HumanPlus: Humanoid Shadowing and Imitation from Humans, Fu et al.
- (CoRL 2024) OmniH2O. OmniH2O: Universal and Dexterous Human-to-Humanoid Whole-Body Teleoperation and Learning, He et al.
- (Humanoids 2024) Self-Aware. Know your limits! Optimize the behavior of bipedal robots through self-awareness, Mascaro et al.
- (ACM MM 2024) PhysReaction. PhysReaction: Physically Plausible Real-Time Humanoid Reaction Synthesis via Forward Dynamics Guided 4D Imitation, Liu et al.
- (IROS 2024) H2O. Learning Human-to-Humanoid Real-Time Whole-Body Teleoperation, He et al.
- (ECCV 2024) MHC. Generating Physically Realistic and Directable Human Motions from Multi-Modal Inputs, Shrestha et al.
- (ICML 2024) DIDI. DIDI: Diffusion-Guided Diversity for Offline Behavioral Generation, Liu et al.
- (SIGGRAPH 2024) MoConVQ. MoConVQ: Unified Physics-Based Motion Control via Scalable Discrete Representations, Yao et al.
- (SIGGRAPH 2024) PhysicsPingPong. Strategy and Skill Learning for Physics-based Table Tennis Animation, Wang et al.
- (SIGGRAPH 2024) SuperPADL. SuperPADL: Scaling Language-Directed Physics-Based Control with Progressive Supervised Distillation, Juravsky et al.
- (CVPR 2024) SimXR. Real-Time Simulated Avatar from Head-Mounted Sensors, Luo et al.
- (CVPR 2024) AnySkill. AnySkill: Learning Open-Vocabulary Physical Skill for Interactive Agents, Cui et al.
- (ICLR 2024) PULSE. Universal Humanoid Motion Representations for Physics-Based Control, Luo et al.
- (ICLR 2024) H-GAP. H-GAP: Humanoid Control with a Generalist Planner, Jiang et al.
- (ICLR 2024) UniHSI. UniHSI: Unified Human-Scene Interaction via Prompted Chain-of-Contacts, Xiao et al.
- (3DV 2024) Phys-Fullbody-Grasp. Physically Plausible Full-Body Hand-Object Interaction Synthesis, Braun et al.
- (RSS 2024) ExBody. Expressive Whole-Body Control for Humanoid Robots, Cheng et al.
- (SIGGRAPH Asia 2023) CASE. C·ASE: Learning Conditional Adversarial Skill Embeddings for Physics-based Characters, Dou et al.
- (SIGGRAPH Asia 2023) AdaptNet. AdaptNet: Policy Adaptation for Physics-Based Character Control, Xu et al.
- (SIGGRAPH Asia 2023) NCP. Neural Categorical Priors for Physics-Based Character Control, Zhu et al.
- (SIGGRAPH Asia 2023) DROP. DROP: Dynamics Responses from Human Motion Prior and Projective Dynamics, Jiang et al.
- (NeurIPS 2023) InsActor. InsActor: Instruction-driven Physics-based Characters, Ren et al.
- (CoRL 2023) Humanoid4Parkour. Humanoid Parkour Learning, Zhuang et al.
- (CoRL Workshop 2023) Words into Action. Words into Action: Learning Diverse Humanoid Robot Behaviors using Language Guided Iterative Motion Refinement, Kumar et al.
- (ICCV 2023) PHC. Perpetual Humanoid Control for Real-time Simulated Avatars, Luo et al.
- (CVPR 2023) Trace and Pace. Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion, Rempe et al.
- (SIGGRAPH 2023) Vid2Player3D. Learning Physically Simulated Tennis Skills from Broadcast Videos, Zhang et al.
- (SIGGRAPH 2023) QuestEnvSim. QuestEnvSim: Environment-Aware Simulated Motion Tracking from Sparse Sensors, Lee et al.
- (SIGGRAPH 2023) Hassan et al. Synthesizing Physical Character-Scene Interactions, Hassan et al.
- (SIGGRAPH 2023) CALM. CALM: Conditional Adversarial Latent Models for Directable Virtual Characters, Tessler et al.
- (SIGGRAPH 2023) Composite Motion. Composite Motion Learning with Task Control, Xu et al.
- (ICLR 2023) DiffMimic. DiffMimic: Efficient Motion Mimicking with Differentiable Physics, Ren et al.
- (NeurIPS 2022) EmbodiedPose. Embodied Scene-aware Human Pose Estimation, Luo et al.
- (NeurIPS 2022) MoCapAct. MoCapAct: A Multi-Task Dataset for Simulated Humanoid Control, Wagener et al.
- (SIGGRAPH Asia 2022) Gopinath et al. Motion In-betweening for Physically Simulated Characters, Gopinath et al.
- (SIGGRAPH Asia 2022) AIP. AIP: Adversarial Interaction Priors for Multi-Agent Physics-based Character Control, Younes et al.
- (SIGGRAPH Asia 2022) ControlVAE. ControlVAE: Model-Based Learning of Generative Controllers for Physics-Based Characters, Yao et al.
- (SIGGRAPH Asia 2022) QuestSim. QuestSim: Human Motion Tracking from Sparse Sensors with Simulated Avatars, Winkler et al.
- (SIGGRAPH Asia 2022) PADL. PADL: Language-Directed Physics-Based Character Control, Juravsky et al.
- (SIGGRAPH Asia 2022) Wang et al. Differentiable Simulation of Inertial Musculotendons, Wang et al.
- (SIGGRAPH 2022) ASE. ASE: Large-Scale Reusable Adversarial Skill Embeddings for Physically Simulated Characters, Peng et al.
- (Journal of NeuroEngineering and Rehabilitation 2021) Learn to Move. Deep Reinforcement Learning for Modeling Human Locomotion Control in Neuromechanical Simulation, Song et al.
- (NeurIPS 2021) KinPoly. Dynamics-Regulated Kinematic Policy for Egocentric Pose Estimation, Luo et al.
- (SIGGRAPH 2021) AMP. AMP: Adversarial Motion Priors for Stylized Physics-Based Character Control, Peng et al.
- (CVPR 2021) SimPoE. SimPoE: Simulated Character Control for 3D Human Pose Estimation, Yuan et al.
- (NeurIPS 2020) RFC. Residual Force Control for Agile Human Behavior Imitation and Extended Motion Synthesis, Yuan et al.
- (ICLR 2020) Yuan et al. Diverse Trajectory Forecasting with Determinantal Point Processes, Yuan et al.
- (ICCV 2019) Ego-Pose. Ego-Pose Estimation and Forecasting as Real-Time PD Control, Yuan et al.
- (SIGGRAPH 2018) DeepMimic. DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills, Peng et al.
- (ICLR 2025) ImDy. ImDy: Human Inverse Dynamics from Imitated Observations, Liu et al.
- (WACV 2025) OpenCapBench. A Benchmark to Bridge Pose Estimation and Biomechanics, Gozlan et al.
- (bioRxiv 2024) Lai et al. Mapping Grip Force to Muscular Activity Towards Understanding Upper Limb Musculoskeletal Intent using a Novel Grip Strength Model, Lai et al.
- (ArXiv 2024) HDyS. Homogeneous Dynamics Space for Heterogeneous Humans, Liu et al.
- (IROS 2024) Shahriari et al. Enhancing Robustness in Manipulability Assessment: The Pseudo-Ellipsoid Approach, Shahriari et al.
- (SIGGRAPH Asia 2024) BioDesign. Motion-Driven Neural Optimizer for Prophylactic Braces Made by Distributed Microstructures, Han et al.
- (Scientific Data 2024) Evans et al. Synchronized Video, Motion Capture and Force Plate Dataset for Validating Markerless Human Movement Analysis, Evans et al.
- (NeurIPS D&B 2024) Muscles in Time. Muscles in Time: Learning to Understand Human Motion by Simulating Muscle Activations, Schneider et al.
- (CoRL 2024) Wei et al. Safe Bayesian Optimization for the Control of High-Dimensional Embodied Systems, Wei et al.
- (HFES 2024) Macwan et al. High-Fidelity Worker Motion Simulation With Generative AI, Macwan et al.
- (ECCV 2024) AddBiomechanics. AddBiomechanics Dataset: Capturing the Physics of Human Motion at Scale, Werling et al.
- (ECCV 2024) MANIKIN. MANIKIN: Biomechanically Accurate Neural Inverse Kinematics for Human Motion Estimation, Jiang et al.
- (ICML 2024) DynSyn. DynSyn: Dynamical Synergistic Representation for Efficient Learning and Control in Overactuated Embodied Systems, He et al.
- (Multibody System Dynamics 2024) Human3.6M+. Using musculoskeletal models to generate physically-consistent data for 3D human pose, kinematic, dynamic, and muscle estimation, Nasr et al.
- (CVPR 2024) HIT. HIT: Estimating Internal Human Implicit Tissues from the Body Surface, Keller et al.
- (Frontiers in Neuroscience 2024) Dai et al. Full-body pose reconstruction and correction in virtual reality for rehabilitation training, Dai et al.
- (ICRA 2024) He et al. Self Model for Embodied Intelligence: Modeling Full-Body Human Musculoskeletal System and Locomotion Control with Hierarchical Low-Dimensional Representation, He et al.
- (SIGGRAPH Asia 2023) SKEL. From skin to skeleton: Towards biomechanically accurate 3d digital humans, Keller et al.
- (SIGGRAPH Asia 2023) MuscleVAE. MuscleVAE: Model-Based Controllers of Muscle-Actuated Characters, Feng et al.
- (SIGGRAPH 2023) Bidirectional GaitNet. Bidirectional GaitNet, Park et al.
- (SIGGRAPH 2023) Lee et al. Anatomically Detailed Simulation of Human Torso, Lee et al.
- (ICCV 2023) MiA. Muscles in Action, Chiquier et al.
- (CVPR 2022) OSSO. OSSO: Obtaining Skeletal Shape from Outside, Keller et al.
- (Scientific Data 2022) Xing et al. Functional movement screen dataset collected with two Azure Kinect depth sensors, Xing et al.
- (NCA 2020) Zell et al. Learning inverse dynamics for human locomotion analysis, Zell et al.
- (ECCV 2020) Zell et al. Weakly-supervised learning of human dynamics, Zell et al.
- (SIGGRAPH 2019) LRLE. Synthesis of biologically realistic human motion using joint torque actuation, Jiang et al.
- (TII 2018) Pham et al. Multicontact Interaction Force Sensing From Whole-Body Motion Capture, Pham et al.
- (ICCV Workshop 2017) Zell et al. Learning-based inverse dynamics of human motion, Zell et al.
- (CVPR Workshop 2017) Zell et al. Joint 3d human motion capture and physical analysis from monocular videos, Zell et al.
- (AIST 2017) HuGaDb. HuGaDB: Human Gait Database for Activity Recognition from Wearable Inertial Sensor Networks, Chereshnev et al.
- (SIGGRAPH 2016) Lv et al. Data-driven inverse dynamics for human motion, Lv et al.