We present a direct monocular visual odometry system that runs in real time
on a smartphone. Being a direct method, it tracks and maps on the images
themselves instead of extracted features such as keypoints. New images are
tracked using direct image alignment, while geometry is represented in the
form of a semi-dense depth map. Depth is estimated by filtering over many
small-baseline, pixel-wise stereo comparisons. This leads to significantly
fewer outliers and makes it possible to map and use all image regions with
sufficient gradient, including edges. We show how a simple world model for AR
applications can be derived from semi-dense depth maps, and demonstrate its
practical applicability in an AR application in which
simulated objects can collide with real geometry.
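
As an illustration of filtering over many small-baseline stereo comparisons, the sketch below fuses per-pixel depth observations. It assumes each pixel stores a Gaussian estimate of inverse depth and that every stereo comparison yields an independent noisy observation. The abstract does not specify the filter, so this is only one plausible form, and the names `DepthHypothesis`, `fuse`, and `filter_pixel` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DepthHypothesis:
    """Per-pixel Gaussian estimate of inverse depth (hypothetical structure)."""
    inv_depth: float
    variance: float


def fuse(prior: DepthHypothesis, obs: DepthHypothesis) -> DepthHypothesis:
    """Fuse two independent Gaussian estimates by precision weighting."""
    total = prior.variance + obs.variance
    # The estimate with the smaller variance receives the larger weight.
    mean = (obs.variance * prior.inv_depth + prior.variance * obs.inv_depth) / total
    # The fused variance is always smaller than either input variance.
    var = prior.variance * obs.variance / total
    return DepthHypothesis(mean, var)


def filter_pixel(observations):
    """Fold a non-empty sequence of noisy observations into one estimate."""
    it = iter(observations)
    estimate = next(it)
    for obs in it:
        estimate = fuse(estimate, obs)
    return estimate
```

Under these assumptions, each additional observation reduces the variance of the estimate, which is why many individually noisy small-baseline comparisons can still yield a usable depth value.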