Visual simultaneous localisation and mapping (SLAM) uses cameras to recover a representation of the environment that concurrently enables accurate pose estimation. Although there have been significant recent advances in visual SLAM, these have been primarily concerned with the localisation aspect of the problem, and the development of more meaningful maps of the environment has been somewhat neglected. Incorporating higher level structure into the map enhances its value for tasks involving interaction with the real world and provides a simplified representation that enforces implicit constraints between the features. The main aim of this thesis is to address the problem of discovering and incorporating higher level structure concurrently with normal SLAM operation, in a way that preserves the statistical consistency and accuracy of the system and advances the possibilities of meaningful interaction with the map.

Initially, a novel model-building SLAM system is presented. This system performs online construction of sparse wireframe models and uses them to simultaneously track the location of a camera in real time. The system combines existing robust techniques for model-based tracking with recursive 3-D line estimation and generates a model that could be used as the basis for a higher level map representation.

In order to extend the model-building approach, a state reduction method is proposed that introduces structure into a fully correlated visual SLAM system. Higher level structure is discovered concurrently with normal SLAM operation and incorporated in a rigorous manner that maintains the important correlations between features. This is achieved using a bottom-up process, in which subsets of low level features are ‘folded in’ to a parameterisation of the higher level feature. The method is demonstrated for the specific cases of discovering and incorporating planes and lines from low level point and edgelet features. The online discovery of structure in this method provides a novel form of interaction that can be exploited in augmented reality (AR) applications and is difficult to achieve with other systems.

In the final part of the work, an analysis of the state reduction method in simulation demonstrates that it is possible to incorporate structure without compromising the consistency and accuracy of the underlying SLAM system. Comparison against two nonlinear equality constraint enforcement algorithms shows that the state reduction approach avoids the overhead of applying explicit constraints at every frame, enabling it to operate in real time.
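
The idea of folding low level features into a higher level parameterisation while keeping their correlations can be illustrated with a small sketch. It is not the formulation used in the thesis: it assumes a toy state of stacked 3-D points, a hypothetical height-field plane model z = a*x + b*y + c (so near-vertical planes are excluded), and first-order covariance propagation through a numerically estimated Jacobian. All function names are illustrative.

```python
import numpy as np

def fold_points_into_plane(x):
    """Map stacked points [x1, y1, z1, ...] to [a, b, c, x1, y1, ...].

    Assumed plane model (illustrative only): z = a*x + b*y + c.
    Each point keeps its two in-plane coordinates; its height is
    replaced by the three shared plane parameters.
    """
    pts = x.reshape(-1, 3)
    A = np.column_stack([pts[:, 0], pts[:, 1], np.ones(len(pts))])
    plane, *_ = np.linalg.lstsq(A, pts[:, 2], rcond=None)
    return np.concatenate([plane, pts[:, :2].ravel()])

def numerical_jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian of f evaluated at x."""
    y0 = f(x)
    J = np.zeros((y0.size, x.size))
    for i in range(x.size):
        dx = np.zeros_like(x)
        dx[i] = eps
        J[:, i] = (f(x + dx) - f(x - dx)) / (2.0 * eps)
    return J

def reduce_state(x, P):
    """Return the reduced state and its first-order covariance J P J^T."""
    J = numerical_jacobian(fold_points_into_plane, x)
    return fold_points_into_plane(x), J @ P @ J.T

# Synthetic example: six points lying exactly on z = 0.5x - 0.2y + 1.
rng = np.random.default_rng(0)
n = 6
xy = rng.uniform(-1.0, 1.0, size=(n, 2))
z = 0.5 * xy[:, 0] - 0.2 * xy[:, 1] + 1.0
x = np.column_stack([xy, z]).ravel()

# Dense symmetric positive-definite covariance standing in for a
# fully correlated SLAM map (made-up values).
B = rng.normal(size=(3 * n, 3 * n))
P = 1e-4 * (B @ B.T + 3 * n * np.eye(3 * n))

x_red, P_red = reduce_state(x, P)
print(x.size, "->", x_red.size)
```

In this toy setting the state shrinks from 3n to 3 + 2n entries, and the off-diagonal blocks of `P_red` retain the correlations between the plane parameters and the remaining point coordinates. The planarity constraint is built into the parameterisation, so it does not need to be re-applied as an explicit constraint at each update.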
|Translated title of the contribution|Incorporating Higher Level Structure in Visual SLAM|
|Publication status|Published - 2010|