ENHANCING FULL-BODY TRACKING IN VIRTUAL REALITY THROUGH IMPROVED POSE ESTIMATION USING FINE-TUNED POSE RESNET
Antonius Teddy Kurniawan, Prof. Dr.-Ing. Mhd. Reza M. I. Pulungan, S.Si., M.Sc.; Novera Istiqomah, S.T., M.T., Ph.D.; Wahyono, S.Kom., Ph.D.
2025 | Skripsi (undergraduate thesis) | Computer Science
Virtual reality (VR) technology offers immersive experiences in the Metaverse, enabling users to interact in virtual worlds. However, current VR systems primarily track the user's head and hands, limiting full-body immersion and resulting in unrealistic avatar movements. These limitations hinder user experience and the scope of virtual interactions.
This research aims to improve 2D pose estimation for VR scenarios to enhance full-body tracking and provide more natural interactions. A fine-tuned Pose ResNet model achieves a mean per-joint position error (MPJPE) of 39.07 with an inference time of 0.032 seconds per batch. Compared to HRNet, which offers higher accuracy but requires extensive computational resources, and HybridTrak, which achieves superior precision (MPJPE of 0.098) at greater complexity, Pose ResNet provides a practical balance between accuracy and efficiency. It also outperforms the physics-based method (MPJPE of 68.1) in accuracy while maintaining computational feasibility.
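For readers unfamiliar with the metric, MPJPE is commonly computed as the average Euclidean distance between each predicted joint and its ground-truth position. The sketch below is an illustration only, not the thesis's evaluation code: the function name `mpjpe`, the nested-list input layout, and the absence of any normalization or alignment step are assumptions, since the abstract does not state the units or preprocessing behind the reported values.

```python
import math


def mpjpe(predicted, target):
    """Mean per-joint position error over a batch of poses.

    predicted, target: lists of poses; each pose is a list of joints;
    each joint is a tuple of coordinates (two values for a 2D pose).
    Assumption: raw Euclidean distance, no normalization or alignment.
    """
    if len(predicted) != len(target):
        raise ValueError("batch sizes differ")
    total, count = 0.0, 0
    for pose_p, pose_t in zip(predicted, target):
        if len(pose_p) != len(pose_t):
            raise ValueError("joint counts differ")
        for joint_p, joint_t in zip(pose_p, pose_t):
            total += math.dist(joint_p, joint_t)  # Euclidean distance for one joint
            count += 1
    if count == 0:
        raise ValueError("no joints to evaluate")
    return total / count


# Hypothetical toy data: one pose with two joints.
pred = [[(0.0, 0.0), (3.0, 4.0)]]
gt = [[(0.0, 0.0), (0.0, 0.0)]]
example_error = mpjpe(pred, gt)  # per-joint distances are 0 and 5, so the mean is 2.5
```

Because the result is expressed in whatever units the joint coordinates use (pixels, millimetres, or a normalized scale), MPJPE values are only directly comparable when they share the same units and preprocessing.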
By refining 2D pose estimation and integrating it with existing three-point tracking systems, this study demonstrates the potential of Pose ResNet to overcome current VR limitations. The findings contribute to advancing full-body tracking for VR applications, enhancing immersion and enabling more realistic virtual interactions in resource-constrained environments.
Keywords: Virtual Reality, 2D Pose Estimation, Full-Body Tracking, Immersion, Metaverse, Interaction