FiducialSLAM.md
Author: Addison Pitha (additions by Pito Salas)
Introduction
This project began as an effort to improve the localization capabilities of campus rover beyond what lidar-based localization alone provides. The following issues motivated it. Upon startup, in order to localize the robot, the user must input a relatively accurate initial pose estimate. After the robot moves around, if the initial estimate is good, the estimate should converge to the correct pose.
When this happens, this form of localization is excellent. However, it sometimes fails to converge or begins to diverge, especially if unexpected forces act upon the robot, such as getting caught on obstacles or being picked up. Furthermore, in the absence of identifiable geometry of terrain features, localization may converge to the wrong pose, as discussed in Joint 3D Laser and Visual Fiducial Marker based SLAM for a Micro Aerial Vehicle. While that article discusses aerial vehicles, the issue is common to SLAM in general.
One solution involves fiducials, which are square grids of large black and white pixels, with a bounding box of black pixels that can be easily identified in images. The pattern of pixels on the fiducial encodes data, and from the geometry of the bounding box within an image we can calculate the fiducial's pose with respect to the camera. So, by fixing unique fiducials at different places within the robot's operating environment, these known locations can be used to feed the robot accurate pose estimates without the user needing to do so manually.
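To give a feel for how image geometry carries pose information, here is a minimal sketch of the simplest case: estimating range from apparent size with the pinhole camera model. It is only an illustration, not how aruco_detect works internally; a real detector recovers a full position and orientation from all four corners. The function name and all numbers are hypothetical, and the sketch assumes the marker squarely faces the camera.

```python
def estimate_range_m(marker_side_m, marker_side_px, focal_length_px):
    """Approximate camera-to-marker distance using the pinhole model.

    Assumes the marker faces the camera squarely, so its side projects
    to marker_side_px pixels with no foreshortening.
    """
    if marker_side_px <= 0:
        raise ValueError("marker must span at least one pixel")
    # Similar triangles: side_px / focal_px == side_m / range_m
    return focal_length_px * marker_side_m / marker_side_px


# Hypothetical values: a 0.14 m marker that appears 70 px wide through a
# lens with a focal length of 1000 px.
range_m = estimate_range_m(0.14, 70.0, 1000.0)
print(f"estimated range: {range_m:.2f} m")
```

The same similar-triangles relationship shows why distant or small fiducials are harder to use: as the marker shrinks in the image, a one-pixel error in the detected side length becomes a larger error in range.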
Fiducial SLAM is not a clearly defined algorithm. Not much is written on it specifically, but it is a sub-problem of SLAM, on which a plethora of papers have been written. In TagSLAM: Robust SLAM with Fiducial Markers the authors discuss how Fiducial SLAM can be seen as a variation of Landmark SLAM. This broader algorithm extracts recognizable features from the environment and saves them as landmarks. Doing so is difficult because lighting changes, landmarks are viewed from different angles, and the set of landmarks must be chosen and maintained.
If successful, using landmarks should allow the robot to quickly identify its pose in an environment. Fiducial SLAM provides a clear improvement in a controlled setting: fiducials can be recognized from a wide range of angles and, given that they are black and white, are hopefully more robust to lighting differences. Their main downside is that they must be manually placed around the area. Therefore, the goal of this project was to use fiducial markers to generate a map of known landmarks, then use them for localization around the lab area. It was unknown how accurate these estimates would be. If they were better than lidar data could achieve, fiducials would be used as the only pose estimation source. If not, they could be used simply as a correction mechanism.
What was created
This project uses the ROS package fiducials, which includes the packages aruco_detect and fiducial_slam. Additionally, something must publish the map_frame. For this I use the package turtlebot3/turtlebot3_navigation, which allows amcl to publish to map_frame and also provides utilities for visualizing the localization and providing navigation goals.
I used the Raspberry Pi camera on the robots, recording 1280x960 footage at 10 fps. Setting the framerate too fast made the image transport slow, but running it slow may have introduced extra blur into the images. Ideally the camera is fixed to look directly upwards, but many of the lab's robots have adjustable camera angles. The transform I wrote assumes the robot in use has a fixed-angle camera, which at the time was on Donatello.
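Some quick arithmetic helps explain why a higher framerate can slow image transport. The sketch below computes the bandwidth of an uncompressed stream at this resolution. It assumes 3 bytes per pixel with no compression, which is an assumption on my part; the actual encoding used on the robots may differ, so treat the numbers as an upper-bound illustration.

```python
def raw_stream_bytes_per_second(width_px, height_px, fps, bytes_per_pixel=3):
    """Bandwidth of an uncompressed video stream, in bytes per second."""
    return width_px * height_px * bytes_per_pixel * fps


# 1280x960 at the 10 fps used in this project, and at a hypothetical 30 fps.
rate_10 = raw_stream_bytes_per_second(1280, 960, 10)
rate_30 = raw_stream_bytes_per_second(1280, 960, 30)

print(f"10 fps: {rate_10 / 1e6:.1f} MB/s")
print(f"30 fps: {rate_30 / 1e6:.1f} MB/s")
```

Under these assumptions, bandwidth scales linearly with framerate, so tripling the framerate triples the load on the network link between the robot and the machine running detection.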
To set up the environment for fiducial mapping, fiducials should be placed on the ceiling of the space. There is no special requirement on the numbering of the fiducials, but I used the dictionary of 5x5 fiducials, DICT_5X5_1000, which can be configured in the launch file. They do not need to be at regular intervals, but there should not be large areas without any fiducials in them. The robot should always have multiple fiducials in view, unless it is pressed up against a wall, in which case this is obviously not practical. This is because having multiple fiducials in view at once provides a very direct way to observe the relative position of the fiducials, which fiducial_slam makes use of. Special attention should be paid to making sure the fiducials are flat. If, for example, one is attached on top of a bump in the ceiling, the bump will change the geometry of the fiducial and decrease the precision with which its pose, especially its rotation, can be estimated by aruco_detect.
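One way to sanity-check a planned layout before taping anything to the ceiling is to count how many fiducials an upward-facing camera could see from a given floor position. The sketch below is a simplified model, not part of the fiducials package: it assumes a perfectly vertical camera with a circular field of view, and the fiducial ids, positions, ceiling gap, and field-of-view angle are all hypothetical.

```python
import math


def visible_fiducials(robot_xy, fiducials_xy, ceiling_gap_m, half_fov_deg):
    """Return sorted ids of ceiling fiducials inside the camera's view cone.

    ceiling_gap_m is the vertical distance from the camera to the ceiling.
    The view cone intersects the ceiling in a circle of radius
    ceiling_gap_m * tan(half_fov).
    """
    radius = ceiling_gap_m * math.tan(math.radians(half_fov_deg))
    rx, ry = robot_xy
    return sorted(
        fid
        for fid, (fx, fy) in fiducials_xy.items()
        if math.hypot(fx - rx, fy - ry) <= radius
    )


# Hypothetical layout: fiducial id -> (x, y) position on the ceiling, in metres.
fiducials = {100: (0.0, 0.0), 101: (1.0, 0.0), 102: (4.0, 0.0)}

seen_mid = visible_fiducials((0.5, 0.0), fiducials, 2.0, 30.0)
seen_far = visible_fiducials((3.5, 0.0), fiducials, 2.0, 30.0)

print("between 100 and 101:", seen_mid)
print("near 102:", seen_far)
```

In this made-up layout, the robot sees two fiducials when it stands between the first pair but only one near the isolated third marker, which is the kind of gap the placement advice above is meant to avoid.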
Before using the code, an accurate map generated from lidar data or direct measurements should be constructed. In our case we could use a floor plan of the lab, or use a map generated through SLAM. It does not matter which, as long as the map can be used for lidar localization. Once the map is saved, we can begin constructing a fiducial map. First, amcl should be running, to publish to map_frame. This allows us to transform between the robot and the map. We must also know the transform between the robot and the camera, which is provided by a static transform I wrote, rpicam_tf.py. aruco_detect provides the location of fiducials in the camera's frame. Then we run fiducial_slam, which can now determine the fiducials' locations relative to the map_frame. We must pay special attention to the accuracy of amcl's localization while constructing the fiducial map. Although the purpose of this project is to remove the need to worry about localization diverging, at this stage we need to make sure the localization is accurate, at least while the first few fiducials are located. The entire area should be explored, perhaps with teleop as I did.
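The chain of transforms described above (map to robot, robot to camera, camera to fiducial) can be illustrated with a small planar example. This is a simplified 2D sketch with hypothetical numbers, not the code in fiducial_slam or rpicam_tf.py; the real pipeline works with full 3D transforms through ROS. The helper names make_tf and compose are my own.

```python
import math


def make_tf(x, y, theta):
    """3x3 homogeneous matrix for a planar pose (translation plus rotation)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, x], [s, c, y], [0.0, 0.0, 1.0]]


def compose(a, b):
    """Matrix product a * b: express frame b's child in frame a's parent."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


# Hypothetical values for each link in the chain.
map_T_base = make_tf(2.0, 1.0, math.pi / 2)   # robot pose in the map (from amcl)
base_T_camera = make_tf(0.1, 0.0, 0.0)        # static camera mount offset
camera_T_fiducial = make_tf(0.5, 0.2, 0.0)    # detection in the camera frame

map_T_fiducial = compose(compose(map_T_base, base_T_camera), camera_T_fiducial)
fx, fy = map_T_fiducial[0][2], map_T_fiducial[1][2]
print(f"fiducial in map frame: ({fx:.2f}, {fy:.2f})")
```

Because the fiducial's map position is a product that includes the robot's pose in the map, any error in that pose while mapping is baked directly into the stored fiducial location. That is why the localization has to be accurate at this stage.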
Discussion of the fiducial_slam package
The fiducial_slam package was clearly the most important part of this project, but after using it, I realize it was also the biggest problem with the project, so I am dedicating this section to discussing the drawbacks to using it and how to correct those, should anyone want to try to use fiducial SLAM in the future.
The package is very poorly documented. Its ROS wiki page contains very little information on how it works and how to run it, even if one follows the GitHub link provided. The information on how to run the packages at a very basic level is incomplete, since it does not specify that something must publish to map_frame for the packages to function.
The error messages provided are not very helpful. For example, when fiducial_slam and aruco_detect are run without amcl publishing to move_base, both appear to work fine and show no errors, even though the rviz visualization does not show a camera feed. I also received another error which I still have not solved. Upon launch, the fiducial_slam package began crashing. The error message contained only a path to what I assumed was the source of the error, the camera_data argument, along with a C++ error about creating an already existing file, with no indication of which file that was. However, when I checked the camera_data argument, and even changed the file so I did not provide that parameter, the same error appeared with a different path. Note that when I first tried the package, I did not specify the camera_data argument and this error did not appear, which suggests that the parameter was not actually the problem. Issues like these are incredibly difficult to diagnose, making the package much more challenging to use than it should be.
Code documentation is poor, and even now I am unsure which variety of SLAM the code uses. Compare this to the previously mentioned paper on TagSLAM, which clearly indicates that the algorithm uses a GTSAM nonlinear optimizer for graph-based SLAM. TagSLAM is a potential alternative to fiducial_slam, although since I have not tested its code, I can only judge it by its documentation. I think TagSLAM, or perhaps some other package I have not looked at, would be much easier to use than fiducial_slam. It may also be possible for a large team (at least three people) to implement the algorithm themselves.
Story of the project
The original learning goal of this project was for me to better understand how fiducial detection works and how SLAM works. Instead, my time was mostly spent discovering how the fiducial_slam package works. My first major issue arose while following the tutorial posted on the ROS wiki page. I wasn't given any error messages in the terminal window running fiducial_slam, but I noticed that rviz was not receiving camera data, which I assumed meant that the issue was with the camera.
After much toil and confusion, it turned out that for some reason, the data was not passed along unless something published to move_base. I did not originally try to run the packages with amcl running because I wrongly assumed that the package would work partially if not all parameters were provided, or at least that it would give error messages if they were necessary. After solving that problem, I encountered an issue with the camera transforms, which I quickly figured out was that there was no transform between base_link and the camera. I solved this with a static transform publisher, which I later corrected when fiducial_slam allowed me to visualize where the robot thought observed fiducials were.
Once I was mapping out fiducial locations, I believed that I could start applying the algorithm to solve other problems. However, overnight, my code stopped working in a way I'm still confused about. I do not remember making any modifications to the code, and I am not sure whether any TAs updated the campus rover code in that time in a way that could have broken my code. After trying to run on multiple robots, the error I kept getting was that fiducial_slam crashed and that there was some file the code attempted to create but which already existed. Even after specifying new map files (which didn't yet exist) and tweaking parameters, I could not diagnose where the error was coming from, and my code didn't work after that.
In hindsight, I see that my problem was choosing the fiducial_slam package. I chose it because it worked with ArTag markers, which the lab was already using, and I assumed that would make it simpler to integrate it with any existing code, and I could receive help from TAs in recognizing fiducials with aruco_detect. Unfortunately that did not make up for the fact that the fiducial_slam package wasn't very useful. So one thing I learned from the project was to pick packages carefully, paying special attention to documentation. Perhaps a package works well, but a user might never know if they can't figure out what the errors that keep popping up are.