Nonnengießer, F., Kshirsagar, A., Belousov, B., & Peters, J. (2025).
Visuotactile In-Hand Pose Estimation

German Robotics Conference (GRC)

Abstract

This paper presents an approach to robotic in-hand object pose estimation, combining visual and tactile information to accurately determine the position and orientation of objects grasped by a robotic hand. We address the challenge of visual occlusion by fusing visual information from a wrist-mounted RGB-D camera with tactile information from vision-based tactile sensors mounted on the fingertips of a robotic gripper. Our approach employs a weighting and sensor fusion module to combine point clouds from heterogeneous sensor types and to control each modality's contribution to the pose estimation process. We use an augmented Iterative Closest Point (ICP) algorithm adapted for weighted point clouds to estimate the 6D object pose. Our experiments show that incorporating tactile information significantly improves pose estimation accuracy, particularly when occlusion is high. Our method achieves an average pose estimation error of 7.5 mm and 16.7 degrees, outperforming vision-only baselines by up to 20%. To validate the practical applicability of our method, we conducted an insertion task experiment, demonstrating the ability to perform precise object manipulation in a real-world scenario.
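To make the core idea concrete, the sketch below shows one standard way an ICP variant for weighted point clouds can be implemented. This is not the authors' implementation; the function names (`weighted_icp`, `weighted_kabsch`), the numpy/scipy dependencies, and the per-modality weight values are illustrative assumptions. The essential steps, however, match the abstract's description: tactile and visual point clouds are concatenated with per-point confidence weights, and each ICP update solves a weighted least-squares rigid alignment.

```python
import numpy as np
from scipy.spatial import cKDTree

def weighted_kabsch(src, dst, w):
    """Rigid transform (R, t) minimizing sum_i w_i * ||R @ src_i + t - dst_i||^2."""
    w = w / w.sum()
    src_c = (w[:, None] * src).sum(axis=0)              # weighted centroids
    dst_c = (w[:, None] * dst).sum(axis=0)
    H = (w[:, None] * (src - src_c)).T @ (dst - dst_c)  # weighted cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))              # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def weighted_icp(model, scene, scene_w, iters=50, tol=1e-6):
    """Align an object model cloud to a fused, weighted scene cloud.

    scene_w holds per-point confidences, e.g. higher for tactile contact
    points than for partially occluded visual points (assumed scheme).
    """
    R, t = np.eye(3), np.zeros(3)
    tree = cKDTree(scene)
    prev_err = np.inf
    for _ in range(iters):
        moved = model @ R.T + t
        dists, idx = tree.query(moved)        # nearest scene point per model point
        w = scene_w[idx]
        dR, dt = weighted_kabsch(moved, scene[idx], w)
        R, t = dR @ R, dR @ t + dt            # compose the incremental update
        err = np.average(dists**2, weights=w)
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return R, t

# Fuse camera and tactile clouds with modality weights (values are illustrative).
cam_pts = np.random.rand(500, 3)              # stand-in for RGB-D points
tac_pts = np.random.rand(40, 3) * 0.1         # stand-in for fingertip tactile points
scene = np.vstack([cam_pts, tac_pts])
weights = np.concatenate([np.full(len(cam_pts), 0.3),
                          np.full(len(tac_pts), 1.0)])
model = np.random.rand(300, 3)                # object model point cloud
R, t = weighted_icp(model, scene, weights)
```

In this formulation, raising the tactile weights shifts the alignment toward the contact points, which is the mechanism by which tactile sensing can compensate for visual occlusion; the paper's weighting and sensor fusion module controls this trade-off explicitly.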