Intro to the EKF Step 0: The Data

This post is part of a series on SLAM with the extended Kalman filter. For other posts in the series see below.

Intro To The UTIAS Dataset

Writing programs for actual robots is difficult. Robots are expensive, testing code takes a lot of time, and the results are often difficult to interpret. This is why we don’t do most of our software development on actual robots. Instead, it is much easier to run the robot once and log all of its sensor data, command inputs, and other relevant data. We can then test our software on the logged data in a few seconds rather than doing a new test run every time we change our code.

Setup used to make the UTIAS dataset

However, it’s difficult to make our own datasets for localization and mapping. In order to evaluate the accuracy of our SLAM software we need to know the actual (or groundtruth) pose of the robot with high accuracy at all times. This requires a motion capture setup, which is well beyond the reach of most hobbyists. The good news is that university research groups routinely make their datasets available to the public. One such dataset that is useful for our purposes is the UTIAS Multi-Robot Cooperative Localization and Mapping (MRCLAM) dataset from the Autonomous Space Robotics Lab at the University of Toronto. I’ll refer to it as the UTIAS dataset for short.

The UTIAS dataset includes everything we’ll need to test our localization and mapping software: logs of the robot’s sensor data, accurate groundtruth poses for the robots throughout the run, and groundtruth positions for each landmark. I’ll go through each of those in detail below. You should take this opportunity to download and extract “Dataset 1” from the download section of the MRCLAM webpage. You should also read the “Data Collection Details” section of that page.


The Robots

The UTIAS dataset actually includes data from five different robots, all of which were run at the same time. This is because the dataset was originally intended for use in cooperative localization and mapping. However, this adds an unnecessary complication for our purposes. We will only be using the data from robot 1 in this tutorial.

The Measurements

The robot’s measurements are split up into two files: one for landmark measurements (Robot1_Measurements.dat) and one for odometry data (Robot1_Odometry.dat).

Illustration of a landmark measurement.

Each line of the Measurements file represents a measurement of a single landmark. We define a “measurement” with three quantities: range, the distance between the robot and the landmark in meters; bearing, the angle in radians between the landmark and the robot’s heading; and landmark ID, the unique identifier of the landmark being measured. Each line of the measurements file records the time the measurement occurred, the barcode for the landmark that was measured, the range, and the bearing. Note that the landmark barcode in the measurements file is not equivalent to the landmark ID in the landmark groundtruth file. The barcodes.dat file defines the mapping between the landmark barcodes and ID numbers.
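
The barcode-to-ID translation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column order of the barcodes file (ID first, then barcode), the handling of `#` comment lines, and the names `Measurement`, `parse_barcodes`, and `parse_measurements` are all hypothetical, and the sample rows are made up rather than taken from the dataset.

```python
from typing import Dict, Iterable, List, NamedTuple


class Measurement(NamedTuple):
    time: float         # seconds
    landmark_id: int    # ID as used in the landmark groundtruth file
    range_m: float      # meters
    bearing_rad: float  # radians


def parse_barcodes(lines: Iterable[str]) -> Dict[int, int]:
    """Build a barcode -> ID lookup. Assumes each row is 'ID barcode'."""
    lookup = {}
    for line in lines:
        fields = line.split("#", 1)[0].split()  # drop comments, split on spaces
        if len(fields) < 2:
            continue
        subject_id, barcode = fields[:2]
        lookup[int(float(barcode))] = int(float(subject_id))
    return lookup


def parse_measurements(lines: Iterable[str],
                       barcode_to_id: Dict[int, int]) -> List[Measurement]:
    """Parse 'time barcode range bearing' rows, replacing barcodes with IDs."""
    measurements = []
    for line in lines:
        fields = line.split("#", 1)[0].split()
        if len(fields) < 4:
            continue
        t, barcode, rng, bearing = fields[:4]
        landmark_id = barcode_to_id.get(int(float(barcode)))
        if landmark_id is None:
            continue  # barcode not in the lookup table, so skip the row
        measurements.append(
            Measurement(float(t), landmark_id, float(rng), float(bearing)))
    return measurements


# Made-up rows for illustration only (not real dataset values).
barcode_lines = ["# ID barcode", "6 72", "7 27"]
measurement_lines = [
    "# time barcode range bearing",
    "10.5 72 2.0 0.25",
    "10.6 99 1.0 -0.1",   # unknown barcode, dropped
    "10.7 27 3.5 -0.5",
]

barcode_to_id = parse_barcodes(barcode_lines)
measurements = parse_measurements(measurement_lines, barcode_to_id)
```

To read the real files, pass an open file object in place of the lists, for example `parse_barcodes(open(path_to_barcodes_file))`. Doing the translation once at load time means the rest of the code only ever deals with landmark IDs.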

The odometry file is pretty simple by comparison. Each line specifies the time, the robot’s linear velocity (in m/s), and its angular velocity (in rad/s).

The Groundtruth

Robot1_Groundtruth.dat specifies the robot’s actual pose in the global reference frame at any given time throughout the run. This is the part of the dataset that required the use of a motion capture system. Most of us could not recreate this on our own, so it’s nice that the University of Toronto made their data publicly available. Each line of the groundtruth file specifies the time, the robot’s x position in meters, the robot’s y position in meters, and the robot’s heading in radians.
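
Since the groundtruth exists so that we can evaluate accuracy, here is a minimal sketch of how an estimated pose might be compared against a groundtruth pose. The function names and the example poses are hypothetical. The one detail worth noting is that the heading difference is wrapped, so headings just either side of ±π count as close together rather than nearly 2π apart.

```python
import math
from typing import Tuple

Pose = Tuple[float, float, float]  # (x [m], y [m], heading [rad])


def wrap_angle(angle: float) -> float:
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def pose_error(estimate: Pose, truth: Pose) -> Tuple[float, float]:
    """Return (position error in meters, absolute heading error in radians)."""
    dx = estimate[0] - truth[0]
    dy = estimate[1] - truth[1]
    position_error = math.hypot(dx, dy)
    heading_error = abs(wrap_angle(estimate[2] - truth[2]))
    return position_error, heading_error


# Made-up poses for illustration: the headings are 3.0 rad and -3.0 rad,
# which are only about 0.28 rad apart once the difference is wrapped.
truth = (1.0, 2.0, 3.0)
estimate = (1.3, 2.4, -3.0)
position_error, heading_error = pose_error(estimate, truth)
```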


The Landmarks

Landmark_Groundtruth.dat specifies the actual position of every landmark in the environment. Each line specifies the landmark ID number, the x position, the y position, the standard deviation of the x position, and the standard deviation of the y position, with all positions and standard deviations in meters.
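
To see how the landmark positions relate to the range and bearing measurements described earlier, the sketch below computes the range and bearing we would expect for a given groundtruth robot pose and landmark position. It assumes the sensor sits at the robot's center, that both positions are in the same global frame, and that bearing is positive counterclockwise from the robot's heading. None of this is confirmed by the dataset description here, and `expected_measurement` is a hypothetical name.

```python
import math
from typing import Tuple


def expected_measurement(robot_pose: Tuple[float, float, float],
                         landmark_xy: Tuple[float, float]) -> Tuple[float, float]:
    """Return (range [m], bearing [rad]) from a robot pose to a landmark.

    robot_pose is (x, y, heading); bearing is measured relative to the
    heading and wrapped to [-pi, pi).
    """
    x, y, heading = robot_pose
    dx = landmark_xy[0] - x
    dy = landmark_xy[1] - y
    rng = math.hypot(dx, dy)
    bearing = math.atan2(dy, dx) - heading
    bearing = (bearing + math.pi) % (2.0 * math.pi) - math.pi
    return rng, bearing


# Made-up example: robot at the origin facing along +y.
pose = (0.0, 0.0, math.pi / 2.0)
range_ahead, bearing_ahead = expected_measurement(pose, (0.0, 2.0))  # straight ahead
range_right, bearing_right = expected_measurement(pose, (1.0, 0.0))  # to the right
```

Comparing values like these against the logged measurements is a quick sanity check that the barcode mapping and coordinate conventions have been read correctly.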


The Utilities

There are also a few useful MATLAB scripts available on the MRCLAM webpage.

loadMRCLAMdataSet.m loads the information in the .dat files into MATLAB variables. sampleMRCLAMdataSet.m resamples the data at 50 Hz, so all of the measurements, odometry, and groundtruth information arrive at a fixed interval of 0.02 seconds. That will make things much easier. Lastly, animateMRCLAMdataSet.m animates the motion of the robots throughout the run.
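
To make the idea of resampling concrete, here is a rough Python sketch that places an irregularly timed signal onto a uniform 50 Hz grid using linear interpolation. It is not a port of sampleMRCLAMdataSet.m, and the choice of linear interpolation is an assumption for illustration. `resample_linear` is a hypothetical name and the sample values are made up.

```python
import bisect
from typing import List, Sequence, Tuple


def resample_linear(times: Sequence[float],
                    values: Sequence[float],
                    sample_period: float = 0.02) -> Tuple[List[float], List[float]]:
    """Linearly interpolate (times, values) onto a uniform time grid.

    times must be strictly increasing. The grid starts at times[0] and
    advances in steps of sample_period (0.02 s corresponds to 50 Hz).
    Headings should be unwrapped first so they do not jump at +/- pi.
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two samples of equal length")
    # Small epsilon guards against floating-point error in the division.
    n_steps = int((times[-1] - times[0]) / sample_period + 1e-9)
    grid = [times[0] + k * sample_period for k in range(n_steps + 1)]
    resampled = []
    for t in grid:
        # Index of the logged sample at or just before t, clamped so that
        # samples i and i + 1 always exist.
        i = bisect.bisect_right(times, t) - 1
        i = min(max(i, 0), len(times) - 2)
        t0, t1 = times[i], times[i + 1]
        weight = (t - t0) / (t1 - t0)
        resampled.append(values[i] + weight * (values[i + 1] - values[i]))
    return grid, resampled


# Made-up signal logged every 0.05 s, resampled every 0.02 s.
log_times = [0.0, 0.05, 0.1]
log_values = [0.0, 0.5, 1.0]
grid, resampled = resample_linear(log_times, log_values)
```

Once every signal lives on the same grid, a filter can step through the run with a constant time step instead of handling each sensor's timestamps separately.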

The problem with these scripts is that they assume we want to work with all five robots. There are adapted versions of the scripts in my GitHub repo that work with robot 1 only. Additionally, I adapted animateMRCLAMdataSet.m to animate both the groundtruth pose of robot 1 and the pose estimated by the EKF localization program. This will allow us to see just how well our program’s pose estimate tracks the robot’s true pose.