Journal ArticleDOI

‘Structure-from-Motion’ photogrammetry: A low-cost, effective tool for geoscience applications

15 Dec 2012-Geomorphology (Elsevier)-Vol. 179, pp 300-314
TL;DR: The Structure-from-Motion (SfM) method as mentioned in this paper solves the camera pose and scene geometry simultaneously and automatically, using a highly redundant bundle adjustment based on matching features in multiple overlapping, offset images.
About: This article is published in Geomorphology. The article was published on 2012-12-15 and is currently open access. It has received 2901 citations to date. The article focuses on the topics: Photogrammetry & Structure from motion.

Summary

1.1. Photogrammetric Survey Methods

  • Similarly, improvements in the cost and quality of compact and single lens reflex (SLR) cameras, and methods for the calibration of such non-metric cameras (Clarke and Fryer, 1998; Chandler et al., 2005) have democratized access to photogrammetric modelling and encouraged a wide range of uses in geomorphology.
  • Digital photogrammetry has also been applied to a number of geological problems, including discontinuity characterization (e.g. Krosley et al., 2006; Sturzenegger and Stead, 2009) and rock slope stability analysis (e.g. Haneberg, 2008).
  • Close-range applications have also included direct quantification of soil erosion and the morphodynamics of laboratory-scale landscape evolution models (e.g. Stojic et al., 1998; Brasington and Smart, 2003; Lane et al., 2001; Hancock and Willgoose, 2001; Rieke-Zapp and Nearing, 2005; Heng et al., 2010).

1.2. Structure-from-Motion

  • Developed in the 1990s, this technique has its origins in the computer vision community (e.g. Spetsakis and Aloimonos, 1991; Boufama et al., 1993; Szeliski and Kang, 1994) and the development of automatic feature-matching algorithms in the previous decade (e.g. Förstner, 1986; Harris and Stephens, 1988).
  • The approach has been popularized through a range of cloud-processing engines, most notably Microsoft® Photosynth™ (Microsoft, 2010), which uses SfM approaches documented in Snavely (2008) and Snavely et al. (2008).
  • These tools can make direct use of user-uploaded and crowd-sourced photography to generate the necessary coverage of a target scene, and can automatically generate sparse 3-D point clouds from these photosets.
  • The possibilities of SfM appear boundless; however, to date, the technique has rarely been used within the geosciences (e.g. Niethammer et al., 2012), and there exist few quantitative assessments of the quality of terrain products derived from this approach.

1.3. The First Principles of SfM

  • GCPs can be derived post-hoc, identifying candidate features clearly visible in both the resulting point cloud and in the field, and obtaining their coordinates by ground survey (i.e., by GPS).
  • In practice, however, it is often easier to deploy physical targets with a high contrast and clearly defined centroid in the field before acquiring images.
  • This approach simplifies the unambiguous co-location of image and object space targets and also ensures a reliable, well-distributed network of targets across the area of interest, enabling an assessment of any non-linear structural errors in the SfM reconstruction.

1.4. Goals of this Article

  • Applications of SfM to a range of contrasting landscapes and landforms are described, including coastal cliffs, a moraine-dammed lake, and a smaller scale glacially-sculpted bedrock ridge.
  • Importantly, the authors also undertake a detailed assessment of the quality of a derived topographic model, in this case a c. 300 x 300 m cliff section in Aberystwyth, Wales, through comparison with a high resolution terrain model derived from a precision terrestrial laser scan survey.

2.1.1. Image acquisition and keypoint extraction

  • A wide variety of imaging sensors can be used for SfM, from video stills, through to low grade compact digital cameras.
  • The primary requirement is well-exposed photographs of the feature(s) of interest.
  • In the authors' experience, 'bigger' is not necessarily 'better'.
  • Whereas image quality and resolution are improved by using increasingly expensive digital SLR models, images captured at the highest resolutions (e.g. >12 megapixel) will almost inevitably need to be re-sized (with the consequent loss of image detail) to avoid lengthy processing times.
  • If operating in remote regions, specific consideration should be given to robustness and battery life, including methods for charging and performance in extreme temperatures.
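
The processing cost of re-sizing can be illustrated with a little arithmetic. The sketch below is a minimal example, not the authors' procedure: the 4000 x 3000 px frame is a hypothetical 12-megapixel image, and the 0.55 factor is assumed to be a linear scale applied to each side (the summary does not say whether a percentage refers to side length or pixel count).

```python
def rescaled_size(width_px, height_px, linear_scale):
    """Return (new_width, new_height, megapixels) after scaling each side."""
    new_w = round(width_px * linear_scale)
    new_h = round(height_px * linear_scale)
    return new_w, new_h, new_w * new_h / 1e6


# Hypothetical 12-megapixel frame; not a camera named in the source.
orig_w, orig_h = 4000, 3000
new_w, new_h, new_mp = rescaled_size(orig_w, orig_h, 0.55)

# The pixel count falls with the square of the linear scale.
pixel_fraction = (new_w * new_h) / (orig_w * orig_h)
print(new_w, new_h, new_mp, pixel_fraction)
```

Under these assumptions each image retains roughly 30% of its pixels, which is why a modest reduction in side length can shorten keypoint extraction and matching considerably.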

2.1.3. Post-processing and digital elevation model generation

  • When combined, SfM and point-cloud decimation potentially offer a powerful tool for geomorphological analysis.
  • This model may be visualized effectively by draping the orthophoto derived from the SfM processing over this surface.
  • The final result is a fully georeferenced, high-resolution, photo-realistic DEM.
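
One simple way to decimate a dense point cloud into a gridded surface is to bin points into square cells and keep a single elevation per cell. The sketch below is an illustrative assumption, not the decimation tool used in the paper: the function name `decimate_to_grid`, the choice of the mean elevation per cell, and the sample points are all hypothetical.

```python
import math
from collections import defaultdict


def decimate_to_grid(points, cell_size):
    """Bin (x, y, z) points into square cells; return {(col, row): mean z}."""
    sums = defaultdict(lambda: [0.0, 0])  # cell -> [sum of z, point count]
    for x, y, z in points:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        sums[cell][0] += z
        sums[cell][1] += 1
    return {cell: total / count for cell, (total, count) in sums.items()}


# Hypothetical georeferenced points in metres.
cloud = [(0.2, 0.3, 10.0), (0.8, 0.1, 12.0), (1.5, 0.5, 20.0)]
dem = decimate_to_grid(cloud, cell_size=1.0)
print(dem)
```

Other per-cell statistics (for example the minimum elevation) could be substituted where vegetation or overhanging features would bias the mean.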

3.1. Data acquisition and processing

  • The extensive photoset was decomposed into three 'batches', and input photographs were re-scaled to 55% of their original resolution, to reduce computational demand.
  • The processing steps outlined in section 2 were employed, producing unreferenced sparse and dense point-clouds as output (Table 2).
  • The SfM data were transformed to the TLS co-ordinate system through manual identification of matching GCP centroids in both datasets (Fig. 4b-d).
  • The three SfM batches were registered individually, with no significant difference in the quality of the three transformation models, and average transformation residuals of 0.124 m, 0.058 m and 0.031 m for xyz.
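
The registration step described above can be sketched as a least-squares similarity transform (scale, rotation, translation) fitted to matched GCP coordinates. This is a minimal sketch assuming NumPy and a closed-form SVD solution (Umeyama's method); the summary does not state which solver the authors used, and the function names and control-point values below are hypothetical.

```python
import numpy as np


def fit_similarity(src, dst):
    """Least-squares s, R, t such that dst ≈ s * R @ src + t (3-D points)."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - src_mean, dst - dst_mean
    # Cross-covariance between the centred point sets.
    cov = dst_c.T @ src_c / len(src)
    U, S, Vt = np.linalg.svd(cov)
    # Flip the last axis if needed so R is a rotation, not a reflection.
    D = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        D[2, 2] = -1.0
    R = U @ D @ Vt
    var_src = (src_c ** 2).sum() / len(src)
    s = np.trace(np.diag(S) @ D) / var_src
    t = dst_mean - s * R @ src_mean
    return s, R, t


def residuals(src, dst, s, R, t):
    """Mean absolute residual per axis (x, y, z) after the transform."""
    pred = s * (np.asarray(src, dtype=float) @ R.T) + t
    return np.abs(pred - np.asarray(dst, dtype=float)).mean(axis=0)


# Synthetic control points (hypothetical values, not from the survey).
sfm_gcps = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0], [1.0, 1.0, 1.0]])
angle = np.radians(30.0)
true_R = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle), np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
tls_gcps = 2.5 * sfm_gcps @ true_R.T + np.array([10.0, -5.0, 3.0])

s, R, t = fit_similarity(sfm_gcps, tls_gcps)
print(round(float(s), 6), residuals(sfm_gcps, tls_gcps, s, R, t))
```

With real GCPs the residuals would not vanish; their magnitude per axis gives a quick check on the quality of each batch's transformation model.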

4.1. Dig Tsho moraine complex

  • Background photographic information was sufficient to reconstruct the entire lake basin, including the 2 km long northern lateral moraine.
  • As in the previous example, significant topographic detail (sub-metre scale) has been resolved.
  • The entire breach was successfully reconstructed, and notable morphological features captured by the model include the narrow central section and expansive exit, as well as two abandoned spillways.
  • A number of interpolation artefacts are present across the scene, but are largely confined towards the south and correspond to an extensive area of snow cover.

5. Discussion

  • The example applications presented in section 4 were ideally suited to the application of the SfM technique.
  • Minimal vegetation coverage and relatively complex, heterogeneous topography at both the meso- and micro-scales facilitate the extraction of suitable numbers of keypoint descriptors for consistent, dense point cloud coverage.
  • Similarly, the method is ideally suited for application in (semi)arid environments.
  • In contrast, the method's suitability for topographic reconstruction of, for example, riparian landscapes may be limited: at present, only water-free surfaces are suitable for reconstruction, and point density is likely to be limited, and of questionable accuracy, in areas of dense vegetation.

6. Conclusions

  • This paper has outlined a novel low-cost, ground-based, close-range terrestrial photogrammetry and computer vision approach to obtaining high-resolution spatial data suitable for modelling meso- and micro-scale landforms.
  • The nature of the SfM method eliminates the requirement for manual identification of image control prior to processing, instead employing automatic camera pose estimation algorithms to simultaneously resolve 3-D camera location and scene geometry; this is an extremely significant advantage of the technique over traditional digital photogrammetric methods.
  • As the raw SfM output is fixed into a relative co-ordinate system, particular time and attention should be taken in the establishment of a GCP network to facilitate transformation to an absolute coordinate system and the extraction of metric data.
  • Taking the hypothesised effectiveness of an aerial approach into account, the terrestrial data collection method presented herein nevertheless represents an effective, financially viable alternative to traditional manual topographic surveying and photogrammetric techniques, particularly for practical application in remote or inaccessible regions.

Citations
Journal ArticleDOI
TL;DR: In this article, a Structure from Motion (SfM) workflow was applied to derive a 3D model of a landslide in southeast Tasmania from multi-view UAV photography and the geometric accuracy of the model and resulting DEMs and orthophoto mosaics was tested with ground control points coordinated with geodetic GPS receivers.
Abstract: In this study, we present a flexible, cost-effective, and accurate method to monitor landslides using a small unmanned aerial vehicle (UAV) to collect aerial photography. In the first part, we apply a Structure from Motion (SfM) workflow to derive a 3D model of a landslide in southeast Tasmania from multi-view UAV photography. The geometric accuracy of the 3D model and resulting DEMs and orthophoto mosaics was tested with ground control points coordinated with geodetic GPS receivers. A horizontal accuracy of 7 cm and vertical accuracy of 6 cm was achieved. In the second part, two DEMs and orthophoto mosaics acquired on 16 July 2011 and 10 November 2011 were compared to study landslide dynamics. The COSI-Corr image correlation technique was evaluated to quantify and map terrain displacements. The magnitude and direction of the displacement vectors derived from correlating two hillshaded DEM layers corresponded to a visual interpretation of landslide change. Results show that the algorithm can accurately map displacements of the toes, chunks of soil, and vegetation patches on top of the landslide, but is not capable of mapping the retreat of the main scarp. The conclusion is that UAV-based imagery in combination with 3D scene reconstruction and image correlation algorithms provide flexible and effective tools to map and monitor landslide dynamics.

606 citations


Cites methods from "‘Structure-from-Motion’ photogramme..."

  • ...Westoby et al. (2012) used SfM techniques to map the 3D structure of a steep alpine hill slope and demonstrated that the SfM-derived elevation measurements were within 0.1 m of a TLS scan....

Journal ArticleDOI
TL;DR: In this paper, a detailed error analysis of sub-meter resolution terrain models of two contiguous reaches (1.6 and 1.7 km long) of the braided Ahuriri River, New Zealand, generated using Structure-from-Motion (SfM) is presented.

573 citations


Cites background or methods from "‘Structure-from-Motion’ photogramme..."

  • ...While the study reach highlights the capabilities of the described workflow, the reasonable errors of the extended reach illustrate the potential for this workflow to produce qualitatively convincing DEMs from retrofitted data, thus significantly increasing the topographic detail and research opportunities of pre-existing or limited datasets....

  • ...While it is recognized that PhotoSynth and SFMToolkit are both capable of producing quality DEMs (e.g. James and Robson, 2012; Westoby et al., 2012), this research utilized PhotoScan (version 0....

  • ...Traditional photogrammetric DEMs were typically less accurate and precise than airborne LiDAR (Baltsavias, 1999); however, SfM–MVS has produced terrain models with centimeter precision and point cloud resolutions that fall between LiDAR and TLS (Doneus et al., 2011; Fonstad et al., 2013) and has been utilized to accurately model objects on the centimeter to kilometer scale (James and Robson, 2012; Westoby et al., 2012)....

Journal ArticleDOI
TL;DR: The typical workflow applied by SfM-MVS software packages is detailed, practical details of implementing SfM-MVS are reviewed, existing validation studies to assess practically achievable data quality are combined, and the range of applications in physical geography are reviewed.
Abstract: Accurate, precise and rapid acquisition of topographic data is fundamental to many sub-disciplines of physical geography. Technological developments over the past few decades have made fully distributed data sets of centimetric resolution and accuracy commonplace, yet the emergence of Structure from Motion (SfM) with Multi-View Stereo (MVS) in recent years has revolutionised three-dimensional topographic surveys in physical geography by democratising data collection and processing. SfM-MVS originates from the fields of computer vision and photogrammetry, requires minimal expensive equipment or specialist expertise and, under certain conditions, can produce point clouds of comparable quality to existing survey methods (e.g. Terrestrial Laser Scanning). Consequently, applications of SfM-MVS in physical geography have multiplied rapidly. There are many practical options available to physical geographers when planning a SfM-MVS survey (e.g. platforms, cameras, software), yet, many SfM-MVS end-users are uncert...

565 citations


Cites background or methods from "‘Structure-from-Motion’ photogramme..."

  • ...Advice on SfM-MVS image acquisition for specific applications is given in several papers, including Favalli et al. (2012); James and Robson (2012); Westoby et al. (2012); Bemis et al. (2014); Micheletti et al. (2014); Smith et al. (2014) and Stumpf et al. (2015)....

  • ...Full 360° coverage is ideal (Westoby et al., 2012), though not always necessary so long as all surfaces of interest are visible in multiple photographs....

  • ...Scenes devoid of distinct features (e.g. smooth ice surfaces) will be challenging and often produce fewer keypoint correspondences and lower point densities (Westoby et al., 2012)....

  • ..., 2014); however, large images may need to be re-sized to reduce processing times (Westoby et al., 2012)....

Journal ArticleDOI
TL;DR: In this paper, the use of modern 3D photo-based surface reconstruction techniques for high fidelity surveys of trenches, rock exposures and hand specimens is discussed, highlighting their potential for paleoseismology and structural geology.

548 citations

Journal ArticleDOI
TL;DR: The paper demonstrates the great potential of high-resolution UAV data and photogrammetric techniques applied in the agriculture framework to collect multispectral images and evaluate different VI, suggesting that these instruments represent a fast, reliable, and cost-effective resource in crop assessment for precision farming applications.
Abstract: Unmanned Aerial Vehicles (UAV)-based remote sensing offers great possibilities to acquire in a fast and easy way field data for precision agriculture applications. This field of study is rapidly increasing due to the benefits and advantages for farm resources management, particularly for studying crop health. This paper reports some experiences related to the analysis of cultivations (vineyards and tomatoes) with Tetracam multispectral data. The Tetracam camera was mounted on a multi-rotor hexacopter. The multispectral data were processed with a photogrammetric pipeline to create triband orthoimages of the surveyed sites. Those orthoimages were employed to extract some Vegetation Indices (VI) such as the Normalized Difference Vegetation Index (NDVI), the Green Normalized Difference Vegetation Index (GNDVI), and the Soil Adjusted Vegetation Index (SAVI), examining the vegetation vigor for each crop. The paper demonstrates the great potential of high-resolution UAV data and photogrammetric techniques applied in the agriculture framework to collect multispectral images and evaluate different VI, suggesting that these instruments represent a fast, reliable, and cost-effective resource in crop assessment for precision farming applications.

504 citations


Cites background from "‘Structure-from-Motion’ photogramme..."

  • ...), coupled with imaging, ranging, and positioning sensors, are able to collect multispectral imagery at cm-level resolution and offer great possibilities in the precision farming domain [7–10], agriculture and forestry management [11,12], and geosciences [13]....

References
Journal ArticleDOI
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.

46,906 citations

Journal ArticleDOI
TL;DR: New results are derived on the minimum number of landmarks needed to obtain a solution, and algorithms are presented for computing these minimum-landmark solutions in closed form that provide the basis for an automatic system that can solve the Location Determination Problem under difficult viewing.
Abstract: A new paradigm, Random Sample Consensus (RANSAC), for fitting a model to experimental data is introduced. RANSAC is capable of interpreting/smoothing data containing a significant percentage of gross errors, and is thus ideally suited for applications in automated image analysis where interpretation is based on the data provided by error-prone feature detectors. A major portion of this paper describes the application of RANSAC to the Location Determination Problem (LDP): Given an image depicting a set of landmarks with known locations, determine that point in space from which the image was obtained. In response to a RANSAC requirement, new results are derived on the minimum number of landmarks needed to obtain a solution, and algorithms are presented for computing these minimum-landmark solutions in closed form. These results provide the basis for an automatic system that can solve the LDP under difficult viewing

23,396 citations


"‘Structure-from-Motion’ photogramme..." refers methods in this paper

  • ...Keypoints in multiple images are matched using approximate nearest neighbour (Arya et al., 1998) and Random Sample Consensus (RANSAC; Fischler and Bolles, 1987) algorithms, and ‘tracks’ linking specific keypoints in a set of pictures are established....

Book
01 Nov 2008
TL;DR: Numerical Optimization presents a comprehensive and up-to-date description of the most effective methods in continuous optimization, responding to the growing interest in optimization in engineering, science, and business by focusing on the methods that are best suited to practical problems.
Abstract: Numerical Optimization presents a comprehensive and up-to-date description of the most effective methods in continuous optimization. It responds to the growing interest in optimization in engineering, science, and business by focusing on the methods that are best suited to practical problems. For this new edition the book has been thoroughly updated throughout. There are new chapters on nonlinear interior methods and derivative-free methods for optimization, both of which are used widely in practice and the focus of much current research. Because of the emphasis on practical methods, as well as the extensive illustrations and exercises, the book is accessible to a wide audience. It can be used as a graduate text in engineering, operations research, mathematics, computer science, and business. It also serves as a handbook for researchers and practitioners in the field. The authors have strived to produce a text that is pleasant to read, informative, and rigorous - one that reveals both the beautiful nature of the discipline and its practical side.

17,420 citations

Proceedings ArticleDOI
20 Sep 1999
TL;DR: Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.
Abstract: An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low residual least squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.

16,989 citations


"‘Structure-from-Motion’ photogramme..." refers methods in this paper

  • ...This package contains a number of open-source applications including, in order of execution, SiftGPU (Lowe, 1999, 2004), Bundler (Snavely et al., 2008), CMVS and PMVS2 (Furukawa and Ponce, 2007; Furukawa et al., 2010), all of which may be run independently if desired....

  • ...This is implemented in SFMToolkit3, through the incorporation of the SiftGPU algorithm (Lowe, 1999; 2004)....

Proceedings ArticleDOI
01 Jan 1988
TL;DR: The problem the authors are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for topdown recognition techniques to work.
Abstract: The problem we are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for topdown recognition techniques to work. For example, we desire to obtain an understanding of natural scenes, containing roads, buildings, trees, bushes, etc., as typified by the two frames from a sequence illustrated in Figure 1. The solution to this problem that we are pursuing is to use a computer vision system based upon motion analysis of a monocular image sequence from a mobile camera. By extraction and tracking of image features, representations of the 3D analogues of these features can be constructed.

13,993 citations


"‘Structure-from-Motion’ photogramme..." refers methods in this paper

  • ...…in the 1990s, this technique has its origins in the computer vision community (e.g. Spetsakis and Aloimonos, 1991; Boufama et al., 1993; Szeliski and Kang, 1994) and the development of automatic feature-matching algorithms in the previous decade (e.g. Förstner, 1986; Harris and Stephens, 1988)....
