RoboStriker: Hierarchical Decision-Making for Autonomous Humanoid Boxing

Shanghai Jiao Tong University1, Shanghai Artificial Intelligence Laboratory2,
Shanghai Innovation Institute3, Peking University4,
The Hong Kong University of Science and Technology (Guangzhou)5

Abstract

Achieving human-level competitive intelligence and physical agility in humanoid robots remains a major challenge, particularly in contact-rich and highly dynamic tasks such as boxing. While Multi-Agent Reinforcement Learning (MARL) offers a principled framework for strategic interaction, its direct application to humanoid control is hindered by high-dimensional contact dynamics and the absence of strong physical motion priors. We propose RoboStriker, a hierarchical three-stage framework that enables fully autonomous humanoid boxing by decoupling high-level strategic reasoning from low-level physical execution. The framework first learns a comprehensive repertoire of boxing skills by training a single-agent motion tracker on human motion capture data. These skills are subsequently distilled into a structured latent manifold, regularized by projecting the Gaussian-parameterized distribution onto a unit hypersphere. This topological constraint effectively confines exploration to the subspace of physically plausible motions. In the final stage, we introduce Latent-Space Neural Fictitious Self-Play (LS-NFSP), where competing agents learn competitive tactics by interacting within the latent action space rather than the raw motor space, significantly stabilizing multi-agent training. Experimental results demonstrate that RoboStriker achieves superior competitive performance in simulation and exhibits sim-to-real transfer.
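As a rough illustration of the second stage, the hyperspherical regularization can be read as sampling a latent from a diagonal Gaussian and L2-normalizing it so that every latent action lies on the unit sphere. The sketch below is a minimal NumPy version under that assumption. The function name `sample_spherical_latent`, the batch size, and the latent dimension of 32 are hypothetical choices made for illustration, not details taken from the paper.

```python
import numpy as np

def sample_spherical_latent(mu, log_std, rng):
    """Sample from a diagonal Gaussian, then project onto the unit hypersphere.

    Illustrative assumption: the projection is plain L2 normalization.
    """
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(log_std) * eps                    # reparameterized Gaussian sample
    norm = np.linalg.norm(z, axis=-1, keepdims=True)  # per-sample Euclidean norm
    return z / np.maximum(norm, 1e-8)                 # guard against a zero norm

rng = np.random.default_rng(0)
mu = rng.standard_normal((4, 32))     # hypothetical: batch of 4, latent dimension 32
log_std = np.full((4, 32), -1.0)      # hypothetical fixed log standard deviation
z = sample_spherical_latent(mu, log_std, rng)
```

Because every latent has unit norm, a high-level policy acting in this space can only choose a direction on the sphere, which is one way to read the claim that exploration is confined to physically plausible motions.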


Video Demonstrations


Isaac Lab


Stage 3: warm-up with a stationary player

Stage 3: neural fictitious self-play

Stage 3: neural fictitious self-play

MuJoCo sim-to-sim validation

Real-world deployment (more clips are coming soon...)

Approach Overview


BibTeX