We present CPO, a fast and robust algorithm that localizes a 2D panorama with respect to a 3D point cloud of a scene possibly containing changes. To robustly handle scene changes, our approach deviates from conventional feature point matching, and focuses on the spatial context provided by panorama images. Specifically, we propose efficient color histogram generation and subsequent robust localization using score maps. By utilizing the unique equivariance of spherical projections, we propose very fast color histogram generation for a large number of camera poses without explicitly rendering images for all candidate poses. We accumulate the regional consistency of the panorama and point cloud as 2D/3D score maps, and use them to weigh the input color values to further increase robustness. The weighted color distribution quickly finds good initial poses and achieves stable convergence for gradient-based optimization. CPO is lightweight and achieves effective localization in all tested scenarios, showing stable performance despite scene changes, repetitive structures, or featureless regions, which are typical challenges for visual localization with perspective cameras.
CPO is a localization algorithm that takes a panorama image and a colored point cloud as input and finds the camera pose. Panorama images are beneficial for localization as the full 360-degree view provides the holistic scene context with less ambiguity. However, drastic scene changes can make localization challenging, as illustrated on the right. The goal of CPO is to perform robust localization amidst such scene changes.
Given a query image and point cloud, CPO first creates 2D and 3D score maps that reflect regional color consistencies and attenuate regions that possibly contain scene changes. Using the score maps, CPO then selects promising candidate poses, which are further refined with gradient descent optimization.
2D score maps assign higher scores to image regions that are consistent with the point cloud color. To create 2D score maps, synthetic views are first rendered at various locations in the point cloud. Then, intersections are computed between the patch-wise color histograms of the synthetic views and the query image. Finally, for each patch in the 2D score map, the maximum histogram intersection is stored.
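The 2D score map construction described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function names, the 2x4 patch grid, and the 4-level color quantization are assumptions made for the example, and the synthetic views are assumed to be already rendered as RGB arrays with values in [0, 1].

```python
import numpy as np

def patch_histograms(image, patches=(2, 4), bins=4):
    """Normalized color histogram per patch, shape (rows * cols, bins**3).

    `patches` and `bins` are illustrative choices, not values from the paper.
    """
    h, w, _ = image.shape
    rows, cols = patches
    # Quantize each RGB channel (values in [0, 1]) into `bins` levels.
    q = np.clip((image * bins).astype(int), 0, bins - 1)
    idx = (q[..., 0] * bins + q[..., 1]) * bins + q[..., 2]
    hists = []
    for r in range(rows):
        for c in range(cols):
            patch = idx[r * h // rows:(r + 1) * h // rows,
                        c * w // cols:(c + 1) * w // cols]
            hist = np.bincount(patch.ravel(), minlength=bins ** 3).astype(float)
            hists.append(hist / max(hist.sum(), 1.0))
    return np.stack(hists)

def histogram_intersection(h1, h2):
    """Patch-wise intersection: sum of element-wise minima, in [0, 1]."""
    return np.minimum(h1, h2).sum(axis=-1)

def score_map_2d(query, synthetic_views, patches=(2, 4), bins=4):
    """For each query patch, keep the maximum intersection over all views."""
    hq = patch_histograms(query, patches, bins)
    inter = np.stack([
        histogram_intersection(hq, patch_histograms(v, patches, bins))
        for v in synthetic_views
    ])
    return inter.max(axis=0).reshape(patches)
```

A patch whose colors appear in at least one synthetic view receives a score close to 1, while a patch covering a changed region tends to score low in every view and is therefore attenuated.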
3D score maps assign higher scores to point cloud regions that have consistent colors with the query image. To build 3D score maps, we re-use the patch-wise histogram intersections computed between the query image and synthetic views. For each synthetic view, we back-project the histogram intersection values to the point cloud, and average the back-projected histogram intersections.
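As a rough sketch of the back-projection step, the snippet below assigns each 3D point the intersection value of the panorama patch it projects into, then averages over the synthetic views. Several simplifications are assumptions of this example and not taken from the source: every synthetic view is taken to have identity rotation, occlusion is ignored, and the names `point_to_patch` and `score_map_3d` are hypothetical.

```python
import numpy as np

def point_to_patch(points, position, patches=(2, 4)):
    """Index of the panorama patch each 3D point projects into.

    Assumes an identity-rotation view at `position` (illustrative only).
    """
    rows, cols = patches
    d = points - position
    d = d / np.linalg.norm(d, axis=1, keepdims=True)
    theta = np.arctan2(d[:, 1], d[:, 0])          # azimuth in [-pi, pi]
    phi = np.arccos(np.clip(d[:, 2], -1.0, 1.0))  # polar angle in [0, pi]
    col = np.minimum(((theta + np.pi) / (2 * np.pi) * cols).astype(int), cols - 1)
    row = np.minimum((phi / np.pi * rows).astype(int), rows - 1)
    return row * cols + col

def score_map_3d(points, view_positions, view_intersections, patches=(2, 4)):
    """Average the back-projected patch intersections over all views.

    view_intersections[i] holds one intersection value per patch of view i,
    flattened in row-major order.
    """
    scores = np.zeros(len(points))
    for position, inter in zip(view_positions, view_intersections):
        scores += inter[point_to_patch(points, position, patches)]
    return scores / len(view_positions)
```

Points that consistently land in low-intersection patches, such as objects absent from the query image, end up with small 3D scores.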
CPO leverages the 2D score map for candidate pose selection. Given a pool of translations and rotations, CPO renders a synthetic view for each pose and computes the patch-wise color histograms. The histograms are compared against the query image through histogram intersection weighted with 2D score maps. Finally, the top-k poses with the largest intersection values are selected for refinement.
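The selection step can be illustrated as follows, assuming the patch-wise histograms of the query and of every candidate pose are already available as arrays. The function name `select_candidates`, the array shapes, and the use of a plain weighted sum over patches are assumptions made for this sketch.

```python
import numpy as np

def select_candidates(query_hists, candidate_hists, score_2d, k=3):
    """Rank candidate poses by score-map-weighted histogram intersection.

    query_hists:     (P, B) patch histograms of the query image
    candidate_hists: (N, P, B) patch histograms of N candidate poses
    score_2d:        (P,) flattened 2D score map
    Returns the indices of the top-k candidates and all N weighted scores.
    """
    # Patch-wise intersection between each candidate and the query: (N, P).
    inter = np.minimum(candidate_hists, query_hists[None]).sum(axis=-1)
    # Down-weight patches that the 2D score map marks as inconsistent.
    weighted = (inter * score_2d[None]).sum(axis=1)
    top_k = np.argsort(-weighted)[:k]
    return top_k, weighted
```

Because the weights multiply each patch's contribution, a candidate that only disagrees with the query inside low-score (likely changed) patches is penalized less than one that disagrees in reliable regions.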
CPO uses the 3D score map with sampling loss minimization for pose refinement. Sampling loss is defined as the color difference between each 3D point’s color and its projected location’s sampled color. Here, the 3D score map is applied to weigh the color difference of each point. On the right, we show the optimization trajectories of the candidate poses. The candidate pose with the smallest sampling loss is chosen after optimization.
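A minimal sketch of the weighted sampling loss for a single pose is given below. It only evaluates the loss: it uses nearest-neighbor sampling for brevity, whereas gradient-based refinement would need differentiable (for example bilinear) sampling in an automatic differentiation framework. The pose convention (camera-frame point = R (x - t)), the L1 color difference, the normalization by the weight sum, and the function names are all assumptions of this example.

```python
import numpy as np

def project_to_panorama(points, rotation, translation, height, width):
    """Project 3D points to equirectangular pixel indices (row, col).

    Assumes camera-frame coordinates are R (x - t); illustrative convention.
    """
    d = (points - translation) @ rotation.T
    d = d / np.linalg.norm(d, axis=1, keepdims=True)
    theta = np.arctan2(d[:, 1], d[:, 0])          # azimuth in [-pi, pi]
    phi = np.arccos(np.clip(d[:, 2], -1.0, 1.0))  # polar angle in [0, pi]
    col = np.minimum(((theta + np.pi) / (2 * np.pi) * width).astype(int), width - 1)
    row = np.minimum((phi / np.pi * height).astype(int), height - 1)
    return row, col

def sampling_loss(points, colors, weights, image, rotation, translation):
    """Score-weighted color difference between points and sampled pixels."""
    h, w, _ = image.shape
    row, col = project_to_panorama(points, rotation, translation, h, w)
    sampled = image[row, col]                    # nearest-neighbor sampling
    diff = np.abs(sampled - colors).sum(axis=1)  # per-point L1 color difference
    return float((weights * diff).sum() / max(weights.sum(), 1e-8))
```

Setting a point's weight to zero removes its color difference from the loss, which is how a low 3D score keeps changed regions from pulling the pose estimate away.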
Here we show visualizations of the 2D and 3D score maps. On the left, new furniture and a moving person have appeared since the 3D scan, and both are attenuated by the 2D score map. On the right, we display the 3D score map, which places smaller values on regions near the chairs and the blue carpet that are not present in the query image.
@InProceedings{Kim_2022_ECCV,
    author    = {Kim, Junho and Jang, Hojun and Choi, Changwoon and Kim, Young Min},
    title     = {CPO: Change Robust Panorama to Point Cloud Localization},
    booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
    month     = {October},
    year      = {2022},
    pages     = {176-192},
}