Beyond EKF SLAM

My series of tutorials on SLAM with the extended Kalman filter is finally done. At this point it’s a good idea to reflect on where we’re trying to go and what motivates us. My goal is to make a mobile robot that can autonomously map indoor spaces. I’m doing this mainly because it’s fascinating, but also to win a bet. Learning state estimation with EKFs was a great step toward that goal; it gave us an initial introduction to probabilistic state estimation. But we have a long way to go. We need to develop a SLAM system that’s more robust and better suited to our problem. 


Mapping Methods

My eventual goal is to develop a SLAM system capable of accurately mapping an indoor environment. This gives us a couple of requirements. First, its map representation must be human readable and metrically accurate. A sparse collection of point landmarks, like we used in our EKF, is not sufficient for this. Some sort of dense map representation like an occupancy grid or signed distance function is required.

Example of an occupancy grid map build with RTAB-map. Black cells indicate occupied areas, white cells indicate unoccupied, and gray cells indicate uncertain areas.

 


Sensor Selection

Second, we need to select sensors suited to the task. Our sensors must be able to give us a metrically accurate picture of the local environment’s structure, not just the locations of discrete landmarks as in the UTIAS dataset. A 2D scanning LIDAR is ideal for taking dense measurements of the surrounding environment and they can be found relatively cheaply. Wheel encoders will also provide useful information on the robot’s motion between LIDAR measurements. Also, because we’re trying to build a globally consistent map, we need a topological understanding of the whole environment. That is, we need to understand how different areas in the environment are connected. This means we must be able to detect when the robot revisits a previously-visited location. This can be done using LIDAR data alone, but cameras using visual bag-of-words or deep place recognition could also be helpful here.

Visual depiction of loop closure using ORB-SLAM. Map on left is prior to loop closure with place association indicated by blue line. Map on right is after loop closure and reoptimization of the map and robot trajectory.


SLAM Formulation

Lastly, we need an estimation framework that can solve our problem quickly and efficiently. We want to estimate a global map of the environment and the robot’s full path through the environment. Filtering approaches are ill-suited to this task because they do not allow for re-estimation of previous poses. This makes it difficult to constrain drift in the robot’s state estimate. One would normally use loop closures to do this, but this is difficult with filtering since ideally the full map and robot trajectory should be re-estimated when a loop closure is detected. That is why we will be using a factor graph approach. A factor graph is a representation of the SLAM problem that visualizes the quantities to be estimated (the robot trajectory, map, etc) as nodes and sensor measurements as constraints between those nodes. The graph model can be translated into a single cost function that can be solved through nonlinear optimization. Best of all, this framework allows for the use of an arbitrary combination of sensors as long as we’re able to correctly incorporate the sensor models into the cost function.

SLAM factor graph example: Blue circles denote robot poses at consecutive time steps (x1, x2 , . . .), green circles denote landmark positions (l1 , l2 , . . .), red circle denotes the variable associated with the intrinsic calibration parameters (K). Factors are shown as black squares: the label “u” marks factors corresponding to odometry constraints, “v” marks factors corresponding to camera observations, “c” denotes loop closures, and “p” denotes prior factors. Taken from Cadena et al, 2016.


Moving Forward

So the plan is to design a graph SLAM system for indoor mapping that uses 2D scanning LIDAR and wheel encoders. It should be able to accurately map and estimate its position within its local environment, as well as perform reliable place recognition and loop closure. It should ideally be able to run in real time on a Raspberry Pi or similar single board computer, so careful attention must be paid to the sparsity of the problem’s factor graph. The graph should be as sparse as possible for efficient optimization. This means I will need to include mechanisms to avoid adding redundant information and to sparsify the problem as it proceeds.

I will first be testing my software on a publicly available dataset. The IRC dataset includes sensor streams from odometry and a 2D scanning LIDAR on a wheeled mobile robot as it traverses the Intel Research Center. It will allow us to test all the requirements on our algorithm including accurate mapping and place recognition. After we’re getting good results on the IRC dataset we can implement it on Colin!