We introduce LDL, a fast and robust algorithm that localizes a panorama to a 3D map using line segments. LDL focuses on the sparse structural information of lines in the scene, which is robust to illumination changes and can potentially enable efficient computation. While previous line-based localization approaches tend to sacrifice accuracy or computation time, our method effectively observes the holistic distribution of lines within panoramic images and 3D maps. Specifically, LDL matches the distribution of lines with 2D and 3D line distance functions, which are further decomposed along principal directions of lines to increase the expressiveness. The distance functions provide coarse pose estimates by comparing the distributional information, where the poses are further optimized using conventional local feature matching. As our pipeline solely leverages line geometry and local features, it does not require costly additional training of line-specific features or correspondence matching. Nevertheless, our method demonstrates robust performance on challenging scenarios including object layout changes, illumination shifts, and large-scale scenes, while exhibiting fast pose search terminating within a matter of milliseconds. We thus expect our method to serve as a practical solution for line-based localization, and complement the well-established point-based paradigm. The code for LDL is available through the following link: https://github.com/82magnolia/panoramic-localization.
LDL is a localization algorithm that takes a panorama image as input and finds the camera pose with respect to a 3D point cloud map. LDL exploits line information during localization to further reduce the map size and attain robustness against 2D-3D domain gaps or illumination changes. This is in contrast to many prior methods that use photometric cues, which are susceptible to the aforementioned issues.
LDL performs localization through a three-step process. First, LDL extracts lines, principal directions, and local feature descriptors from the panorama and point cloud. Then, LDL performs coarse pose search using novel line descriptors called line distance functions. As the final step, LDL applies local feature matching between the panorama and selected poses within the map to obtain the refined pose.
During coarse pose search, we exploit line distance functions in 2D and 3D. The distance functions are designed to capture the holistic distribution of lines. First, the 2D line distance function is defined at each point on the sphere as the spherical distance to the nearest line segment in the panorama. 3D line distance functions are similarly defined for a pool of candidate poses within the map. Specifically, for each pose we first project the 3D line segments onto the sphere and then compute the line distance function values.
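The definition above can be illustrated with a minimal NumPy sketch. It is not taken from the LDL codebase: the function names, the pose convention `R @ (x - t)`, and the representation of each line segment as a great-circle arc between two unit endpoint vectors are assumptions made for illustration. The sketch also assumes each arc spans less than 180 degrees, which holds for any 3D segment that does not pass through the camera center.

```python
import numpy as np


def _normalize(v):
    """Scale vectors along the last axis to unit length."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)


def project_segments_to_sphere(starts, ends, R, t):
    """Project 3D segment endpoints onto the unit sphere of a camera pose.

    Assumed convention (hypothetical): camera coordinates are R @ (x - t).
    starts, ends: (M, 3) arrays; R: (3, 3) rotation; t: (3,) camera position.
    """
    s = _normalize((starts - t) @ R.T)
    e = _normalize((ends - t) @ R.T)
    return s, e


def point_to_arc_distance(q, s, e):
    """Spherical distance from query points to great-circle arcs.

    q: (N, 3) unit query points; s, e: (M, 3) unit arc endpoints.
    Returns an (N, M) array of angles in radians.
    """
    n = _normalize(np.cross(s, e))               # (M, 3) great-circle normals
    qn = q @ n.T                                 # (N, M) signed offset from each plane
    p = q[:, None, :] - qn[..., None] * n[None]  # (N, M, 3) projection onto each plane
    # The projection lies on the arc if it is between s and e (counter-clockwise about n).
    after_start = np.einsum("nmk,mk->nm", np.cross(s[None], p), n) >= 0
    before_end = np.einsum("nmk,mk->nm", np.cross(p, e[None]), n) >= 0
    d_circle = np.arcsin(np.clip(np.abs(qn), 0.0, 1.0))
    d_endpoints = np.minimum(
        np.arccos(np.clip(q @ s.T, -1.0, 1.0)),
        np.arccos(np.clip(q @ e.T, -1.0, 1.0)),
    )
    return np.where(after_start & before_end, d_circle, d_endpoints)


def line_distance_function(q, s, e):
    """Distance from each query point to its nearest arc, shape (N,)."""
    return point_to_arc_distance(q, s, e).min(axis=1)
```

For the 3D case, one would call `project_segments_to_sphere` once per candidate pose and evaluate `line_distance_function` at the same query points used for the panorama, so that the 2D and 3D values are directly comparable.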
We further decompose line distance functions using principal directions. Instead of defining a single distance function using all visible lines, we define three line distance functions for each set of lines parallel to the principal directions. The decomposed line distance functions are finally compared using a robust cost function. Here the cost function is defined for each rotation and translation within the map, and compares the decomposed distance function values for uniformly sampled points on the sphere.
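The decomposition and comparison can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: the angular tolerance, the threshold `tau`, and the specific robust cost (a negated count of sample points whose 2D and 3D values agree within `tau`) are assumptions, since the text here does not give the exact form. The distance function values are taken as precomputed inputs, one row per principal direction.

```python
import numpy as np


def assign_to_principal_directions(directions, principal, max_angle_deg=5.0):
    """Group 3D lines by the principal direction they are parallel to.

    directions: (M, 3) unit line directions; principal: (3, 3), one unit
    principal direction per row. Returns an (M,) array with values in
    {0, 1, 2}, or -1 for lines not within the (assumed) angular tolerance.
    """
    cos_sim = np.abs(directions @ principal.T)  # sign of a line direction is irrelevant
    best = cos_sim.argmax(axis=1)
    is_parallel = cos_sim.max(axis=1) >= np.cos(np.deg2rad(max_angle_deg))
    return np.where(is_parallel, best, -1)


def robust_cost(ldf_2d, ldf_3d, tau=0.1):
    """Compare decomposed line distance functions at shared sample points.

    ldf_2d: (3, Q) panorama values, one row per principal direction.
    ldf_3d: (P, 3, Q) values for P candidate poses.
    Returns a (P,) cost; lower is better. Assumed cost: negated inlier count.
    """
    inliers = np.abs(ldf_3d - ldf_2d[None]) < tau
    return -inliers.sum(axis=(1, 2))
```

Under these assumptions, the coarse pose estimate would be the candidate with the smallest cost, for example `np.argmin(robust_cost(ldf_2d, ldf_3d))`. Counting inliers rather than summing differences limits the influence of sample points near spurious or missing lines.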
We evaluate LDL on Stanford2D-3D-S and OmniScenes, which are common datasets used for evaluating panoramic localization. As shown above, numerous scenes in these datasets contain repetitive structures and noisy lines.
The figure on the left shows the translation and rotation recall curves for LDL and NetVLAD, a conventional learning-based pose search method. LDL performs on par with the learning-based method while running an order of magnitude faster.
We find that a small modification to our method can offer lightweight privacy protection in client-server localization scenarios. By changing LDL to only exploit local features near lines during refinement, we can mitigate privacy breaches such as feature inversion attacks, which aim to reveal the original image content from local feature information.
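The line-based filtering described above can be sketched in a few lines. The function name, the array layout, and the use of a precomputed per-keypoint distance to the nearest line are assumptions for illustration; the actual threshold values and feature format are not specified in this text.

```python
import numpy as np


def filter_features_near_lines(keypoints, descriptors, line_distances, threshold):
    """Keep only local features whose keypoint lies close to a line.

    keypoints: (K, D) keypoint coordinates; descriptors: (K, F) descriptors;
    line_distances: (K,) distance from each keypoint to its nearest line
    (assumed precomputed, e.g. by evaluating a line distance function there);
    threshold: maximum allowed distance (hypothetical parameter).
    """
    keep = line_distances <= threshold
    return keypoints[keep], descriptors[keep]
```

A smaller threshold discards more features, leaving less information for an inversion attack to exploit, at the possible expense of fewer matches during refinement.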
Here we plot the image error metrics of feature inversion attacks, measured against the original image, along with the localization accuracy for various line-based filtering threshold values. While the image error increases substantially, the localization accuracy remains relatively constant.
@InProceedings{Kim_2023_ICCV,
    author    = {Kim, Junho and Choi, Changwoon and Jang, Hojun and Kim, Young Min},
    title     = {LDL: Line Distance Functions for Panoramic Localization},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {17882-17892}
}