Image-goal navigation (ImageNav) enables a robot to reach the location where a target image was captured, using visual cues for guidance. However, current methods either depend on data-hungry, computationally expensive learning-based approaches or lack efficiency in complex environments due to insufficient exploration strategies. To address these limitations, we propose Bayesian Embodied Image-goal Navigation using Gaussian Splatting (BEINGS), a novel method that formulates ImageNav as an optimal control problem within a model predictive control framework. BEINGS leverages 3D Gaussian Splatting as a scene prior to predict future observations, enabling efficient, real-time navigation decisions grounded in the robot's sensory experiences. By integrating Bayesian updates, our method dynamically refines the robot's strategy without requiring extensive prior experience or data. We validate BEINGS through extensive simulations and physical experiments, demonstrating its potential for embodied robot systems in visually complex scenarios.
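To make the control loop concrete, below is a minimal sketch of the kind of receding-horizon loop the abstract describes: candidate motions are scored by rendering predicted views from a 3DGS scene prior and comparing them to the goal image, while a Bayesian update refines a belief over where the goal was captured. Every name here (`render_view`, `plan_step`, the toy photometric likelihood) is a hypothetical stand-in for illustration, not the actual BEINGS implementation.

```python
# Illustrative sketch only: assumes a trained 3DGS scene that can render novel
# views, a photometric likelihood, and a discrete belief over candidate goal
# poses. None of these names come from the BEINGS codebase.

import numpy as np

rng = np.random.default_rng(0)

def render_view(pose):
    """Placeholder for rasterizing the 3D Gaussians at a camera pose.

    A real implementation would splat the scene's Gaussians at `pose`;
    here we return a fake 16x16 grayscale image so the sketch runs.
    """
    return rng.random((16, 16))

def likelihood(view, goal_image):
    """Toy photometric likelihood: large when the render matches the goal."""
    return float(np.exp(-np.mean((view - goal_image) ** 2)))

def plan_step(pose, goal_image, candidates, belief):
    """One MPC step: score sampled motions, then update the Bayesian belief."""
    # Score short candidate motions (dx, dy, dtheta) by the predicted
    # observation at their endpoint, as the scene prior enables.
    motions = rng.normal(scale=0.3, size=(32, 3))
    scores = [likelihood(render_view(pose + m), goal_image) for m in motions]
    best = motions[int(np.argmax(scores))]
    # Bayesian update over candidate goal poses: posterior ∝ likelihood * prior.
    belief = belief * np.array(
        [likelihood(render_view(c), goal_image) for c in candidates]
    )
    belief /= belief.sum()
    return pose + best, belief
```

In use, `belief` would start uniform over sampled candidate poses, and `plan_step` would be called each control cycle until the most likely candidate's posterior mass exceeds a stopping threshold.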