Monocular visual localization in 3D LiDAR map

2019-07-30

Recently, I have published my new paper about 3D monocular global localization in sparse LiDAR map with deep learning method: EnforceNet: Monocular Camera Localization in Large Scale Indoor Sparse LiDAR Point Cloud.

Pose estimation is a fundamental building block for robotic applica- tions such as autonomous vehicles, UAV, and large scale augmented reality. It is also a prohibitive factor for those applications to be in mass production, since the state-of-the-art, centimeter-level pose estimation often requires long mapping procedures and expensive localization sensors, e.g. LiDAR and high precision GPS/IMU, etc. To overcome the cost barrier, we propose a neural network based solution to localize a consumer degree RGB camera within a prior sparse LiDAR map with comparable centimeter-level precision. We achieved it by introducing a novel network module, which we call “resistor module”, to enforce the network generalize better, predicts more accurately, and converge faster. Such results are benchmarked by several datasets we collected in the large scale indoor parking garage scenes. We plan to open both the data and the code for the community to join the effort to advance this field.

So the beginning of this project is that we need trade-off cost and accuracy of our packet. If we use the LiDAR for 3D localization in 3D prior map, the accuray and robustness of the LiDAR SLAM is much better than visual SLAM. But the cost of LiDAR is high. So in this project, I combine the LiDAR map accuray and camera inference cost together in large scale indoor localization for the autonomous driving car.

Traditional mutual information matching is not performed well in our scenario, so I design an EnforceNet to combat the cross modal problem.

We proposed EnforceNet, an end-to-end solution for camera pose localization within a large scale and sparse 3D LiDAR point cloud. The EnforceNet has a novel resistor module and a weight-sharing scheme that is inspired by the state value function and value-iteration in RL framework. We conducted detailed experiments on real-world datasets of large scale indoor parking garages, and demonstrated the EnforceNet has reached the state-of-the-art localization accuracy with better generalization capability than previous research. We are preparing to open-source our system and the dataset for the ease of further development of similar research for the community.

For more details, see the above link to see my paper.

If you have any problem about the paper, you can contact me through: yuuc99445@gmail.com.