SurfaceView: Seamless and tile-based orthomosaics using millions of street-level images from vehicle-mounted cameras

Abstract

We tackle the problem of building city- or country-scale seamless mosaics of the road network from millions of street-level images. These ‘orthomosaics’ provide a virtual top-down, orthographic view, as might be captured by a satellite, though at vastly reduced cost and without the limitations caused by atmospheric interference or occlusion by tree cover. We propose a novel, highly efficient planar visual odometry method that scales to millions of images. This includes a fast search for potentially overlapping images, relative pose estimation from images approximately projected onto the ground plane, and a large-scale optimisation, which we call motion-from-homographies, that exploits multiple motion, GPS and control point priors. Since even city-scale orthomosaics have petapixel resolution, we work with a tile-based mosaic representation, which is more efficient to compute and makes web-based, real-time interaction with the images feasible. Our orthomosaics are seamless both within tiles and across tile boundaries thanks to our proposed novel variant of gradient-domain stitching. We show that our orthomosaics are qualitatively superior to those produced using state-of-the-art structure-from-motion output, yet our pose optimisation is several orders of magnitude faster. We evaluate our methods on a dataset of 1.4M images that we collected.
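
To illustrate the tile-based representation mentioned above, here is a minimal sketch of how a ground-plane point could be mapped to a tile and a pixel within that tile. The abstract does not specify the tiling scheme, so the tile size, the ground resolution and the function name `ground_to_tile` are all hypothetical choices made for illustration only, not the authors' implementation.

```python
import math

# Assumed values for illustration only; the abstract does not state them.
TILE_SIZE_PX = 256       # hypothetical tile edge length in pixels
PIXELS_PER_METRE = 100   # hypothetical ground resolution (1 cm per pixel)


def ground_to_tile(x_m, y_m, tile_size_px=TILE_SIZE_PX,
                   pixels_per_metre=PIXELS_PER_METRE):
    """Map a ground-plane point in metres to (tile_col, tile_row, px_x, px_y).

    The point is first converted to a global integer pixel coordinate,
    then split into a tile index and an offset inside that tile.
    """
    global_x = math.floor(x_m * pixels_per_metre)
    global_y = math.floor(y_m * pixels_per_metre)
    # divmod with a positive divisor keeps the in-tile offset within
    # [0, tile_size_px), also for negative global coordinates.
    tile_col, px_x = divmod(global_x, tile_size_px)
    tile_row, px_y = divmod(global_y, tile_size_px)
    return tile_col, tile_row, px_x, px_y


if __name__ == "__main__":
    print(ground_to_tile(3.0, -0.5))
```

Because each tile is addressed independently, only the tiles touched by a given image need to be loaded or updated, which is what makes a mosaic far too large to hold in memory tractable to compute and to serve interactively.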

Publication
In IEEE Transactions on Intelligent Transportation Systems
Supannee Tanathong
Computer Vision Research Engineer
Will Smith
Professor in Computer Vision