This post is the second in a series of tutorials on SLAM using scanning 2D LIDAR and wheel odometry. The other posts in the series can be found in the links below. The links will be updated as work on the series progresses.
- Intro To LIDAR SLAM and the IRC Dataset
- Matching Scans To A Persistent Map
- SLAM Using LIDAR And Wheel Odometry
- Globally Consistent SLAM With LIDAR
In the previous post I introduced the Intel Research Center (IRC) Dataset and we used it in some basic visualizations. In particular, I used the robot’s odometry to get a rough estimate of its pose over time by simply concatenating the relative transformations between timesteps, :
I then used the transform to project the laser scan at each timestep into the global coordinate frame: where is the set of homogeneous scan points in the global frame, is the set of homogeneous points in the robot’s local frame, and is the estimated transform between the global and local coordinate frames. This process is visualized in VisualizeMeasurements.py in my github repo:
Watching this visualization even over a short time, it’s obvious that the robot’s odometry is very noisy and collects drift very quickly. With perfect odometry, the objects measured by the LIDAR would stay static as the robot moves past them. This is clearly not the case. Interestingly, the odometry seems to be fairly reliable for translational motion, but it drifts quickly in rotation. This will be significant later.
How Do We Fix The Drift Problem?
It’s clear to us the robot’s wheel odometry isn’t sufficient to estimate its motion. We know this because we can overlay the robot’s LIDAR scans in our minds and get a sense of how the robot’s estimated motion is deviating from its true motion. If we can do this in our minds, could we tell the robot how to do it? Can the robot use its LIDAR scans to estimate its own motion?
Hopefully you’ve guessed the answer is yes, through a process called scan matching. Basically the goal is to take a new scan from the robot’s LIDAR and find the transformation that best aligns the new scan with either previous scans or some sort of abstracted map. There are many ways to implement this idea and for this tutorial I’m going to demonstrate the simplest method: using the Iterative Closest Point (ICP) algorithm to align the newest LIDAR scan with the previous scan.
Iterative Closest Point In Pictures
The ICP algorithm involves 3 steps: association, transformation, and error evaluation. These are repeated until the scans are aligned satisfactorily. I’ll first demonstrate the process pictorially with an example from the IRL dataset and delve into the math below. Let’s first imagine we want to find the transformation that aligns the two scans pictured below. The previous scan, referred to as the target, is in cyan while the new scan, also called the source is in magenta. The goal is to find the rigid transformation (rotation and translation) that best aligns the source to the target.
Our first step in estimating this transformation is to decide which points in the source scan correspond to the same physical features as points in the target scan. The simplest way to do this is through a nearest neighbor search: points in the source scan are associated to the nearest point in the target scan. In the image below I’ve found the nearest neighbors of each point in the target scan. Associated points are connected with blue lines:
We can immediately see some mistakes in the nearest neighbor search, but in general the associations pictured will pull the source points in the right direction. The next step in the process is transformation. We find the transformation that, when applied to the source points, minimizes the mean-squared distance between the associated points: where is the final estimated transform and and are target points and source points, respectively. The result of this estimation is pictured below:
After this we evaluate the error in the alignment as and decide if we need to repeat the above process. In this case our scans still aren’t aligned very well, so we redo the associations with the transformed source points and repeat the process.
After five iterations our scans the algorithm finds a pretty good alignment:
The Math Details
ICP is actually pretty straightforward, mathematically. The association step is pretty simple. It can just be a brute-force search for the nearest pairs of points between the source and target scans. If you’re working with large scans though, it’s a good idea to use KD Trees for this step.
The real trick to ICP is in the transformation step. Basically, we find the covariance between the two point sets, the matrix .
Where is a matrix whose column is or the source point expressed relative to the centroid of the source point set . Similarly, is a matrix whose column is . Once we have the covariance matrix , we find the rotation between the two point clouds using singular value decomposition:
If you’re wondering how you break the matrix down into , , and , know that most linear algebra packages (including matlab and numpy) have functions for SVD. That’s about as far as you need to get into it. If you’re interested though, the wikipedia page has some good details.
We’ve found the rotation between the point sets, now we just need the translation . Luckily it’s pretty simple, just the difference between the centroids of the point clouds .
Once we have our translation and rotation we evaluate the alignment error as . We use this to determine if we should quit or iterate again. Normally we stop the process if the error at the current step is below a threshold, if the difference between the error at the current step and the previous step’s error is below a threshold, or if we’ve reached a maximum number of iterations.
Below is a visualization of a simple ICP motion estimation algorithm. It simply aligns the newest scan to the previous scan to find the motion of the robot between scans:
Note that this method for motion estimation works pretty well sometimes. If the robot is near large sections of wall at different angles it can estimate its transformation between scans pretty reliably. This is because it has good environmental queues to its motion in all directions. On the other hand, if the robot is in a mostly straight hallway, there’s really nothing in its measurements that will tell it how it’s moving along the hallway. One alignment is as good as any other as long as the walls line up. In these cases the robot’s estimates of its translation are very poor. Similarly, if there just aren’t a lot of unique, persistent features in the scan, which happens sometimes when the robot approaches corners, there aren’t any good cues for the robot to estimate its rotation.
So, matching successive LIDAR scans via the iterative closest point algorithm can give our robot some information about its own movement. However, this on its own is not enough to provide a reliable motion estimate. Luckily, our robot has wheel odometry in addition to LIDAR. Even luckier, in fact, ICP is pretty reliable at estimating rotations but poor with translation in some cases. The wheel odometry, on the other hand, gives us very accurate translation but it is very unreliable with rotation. Next time, we’ll experiment with fusing information from these two sensors to create a more reliable motion estimate.
The Code
Below you can see an implementation of the ICP algorithm in python. You can find the full class, Align2D.py, in my github repo as well as a demonstration of its use in VisualizeICP.py.
# uses the iterative closest point algorithm to find the # transformation between the source and target point clouds # that minimizes the sum of squared errors between nearest # neighbors in the two point clouds # params: # max_iter: int, max number of iterations # min_delta_err: float, minimum change in alignment error def AlignICP(self, max_iter, min_delta_err): mean_sq_error = 1.0e6 # initialize error as large number delta_err = 1.0e6 # change in error (used in stopping condition) T = self.init_T num_iter = 0 # number of iterations tf_source = self.source while delta_err > min_delta_err and num_iter < max_iter: # find correspondences via nearest-neighbor search matched_trg_pts,matched_src_pts,indices = self.FindCorrespondences(tf_source) # find alingment between source and corresponding target points via SVD # note: svd step doesn't use homogeneous points new_T = self.AlignSVD(matched_src_pts, matched_trg_pts) # update transformation between point sets T = np.dot(T,new_T) # apply transformation to the source points tf_source = np.dot(self.source,T.T) # find mean squared error between transformed source points and target points new_err = 0 for i in range(len(indices)): if indices[i] != -1: diff = tf_source[i,:2] - self.target[indices[i],:2] new_err += np.dot(diff,diff.T) new_err /= float(len(matched_trg_pts)) # update error and calculate delta error delta_err = abs(mean_sq_error - new_err) mean_sq_error = new_err num_iter += 1 return T # uses singular value decomposition to find the # transformation from the target to the source point cloud # assumes source and target point clounds are ordered such that # corresponding points are at the same indices in each array # # params: # source: numpy array representing source pointcloud # target: numpy array representing target pointcloud # returns: # T: transformation between the two point clouds def AlignSVD(self, source, target): # first find the centroids of both point clouds src_centroid = self.GetCentroid(source) trg_centroid = self.GetCentroid(target) # get the point clouds in reference to their centroids source_centered = source - src_centroid target_centered = target - trg_centroid # get cross covariance matrix M M = np.dot(source_centered.T,target_centered) # get singular value decomposition of the cross covariance matrix U,W,V_t = np.linalg.svd(M) # get rotation between the two point clouds R = np.dot(V_t.T,U.T) # get the translation (simply the difference between the point clound centroids) t = trg_centroid - src_centroid # assemble translation and rotation into a transformation matrix T = np.identity(3) T[:2,2] = np.squeeze(t) T[:2,:2] = R return T def GetCentroid(self, points): point_sum = np.sum(points,axis=0) return point_sum / float(len(points))