Object-based Semantic Simultaneous Localization and Mapping for Flying Robots

Ya Wang

Recent developments in RGB-D sensors and deep convolutional neural networks (CNNs) allow robots to estimate their 3D surroundings more accurately and efficiently than before. Object-based semantic simultaneous localization and mapping (SLAM) assigns semantic categories (labels) to detected objects. Instead of storing all visual features in a 3D map, we store only object classes and a few object parameters. This not only requires less memory, but also aids loop closure detection, because computing the similarity between the current image and the stored 3D map becomes easier. The smaller map size and the higher robustness of semantic features compared to simple image features allow us to map larger environments and to use the same map over longer time scales. We apply this approach to SLAM for flying robots.
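
To make this concrete, the following minimal Python sketch (with illustrative names and a deliberately simple matching heuristic, not our actual implementation) shows an object-level map that stores only a class label and a few geometric parameters per landmark, and how loop-closure candidates can be found by matching object classes and positions instead of comparing dense feature descriptors:

```python
# Minimal sketch of an object-level semantic map. Each landmark keeps only
# a class label and a few geometric parameters, rather than thousands of
# raw visual feature descriptors. All names here are illustrative.
from dataclasses import dataclass, field
import numpy as np

@dataclass
class ObjectLandmark:
    label: str            # semantic class from the CNN detector, e.g. "chair"
    position: np.ndarray  # estimated 3D centroid in the map frame, shape (3,)
    extent: np.ndarray    # rough bounding-box dimensions, shape (3,)

@dataclass
class SemanticMap:
    landmarks: list = field(default_factory=list)

    def add(self, landmark: ObjectLandmark) -> None:
        self.landmarks.append(landmark)

    def loop_closure_candidates(self, detections, radius=5.0):
        """Return stored landmarks whose class matches a current detection
        and whose position lies within `radius` metres of the detection.
        Each detection is (label, position), where the position is assumed
        to be already transformed into the map frame using the current pose
        estimate. Matching a handful of object landmarks is far cheaper
        than comparing dense image features against the whole map."""
        candidates = []
        for det_label, det_position in detections:
            for lm in self.landmarks:
                if lm.label == det_label and \
                   np.linalg.norm(lm.position - det_position) < radius:
                    candidates.append((det_label, lm))
        return candidates

# Usage: build a tiny map and query it with the current frame's detections.
smap = SemanticMap()
smap.add(ObjectLandmark("chair", np.array([1.0, 0.5, 0.0]),
                        np.array([0.5, 0.5, 1.0])))
smap.add(ObjectLandmark("monitor", np.array([2.0, 0.0, 0.8]),
                        np.array([0.5, 0.1, 0.3])))

current_detections = [("chair", np.array([1.2, 0.4, 0.0]))]
print(smap.loop_closure_candidates(current_detections))
```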

In this research project, we use two self-built quadrocopters, each equipped with an Intel NUC onboard computer; one carries an Intel RealSense ZR300 RGB-D camera and the other an Intel RealSense R200 RGB-D camera. An Nvidia Jetson TX2 development kit serves as the prototype platform for experiments with GPU-accelerated deep convolutional neural networks.

Fig. 2: The hardware used in this research.