Updated docs for v0.4
Change-Id: I70c9e73f3884db0aace52fc4804085c3b397532b
sweeneychris committed Jul 22, 2015
1 parent cb0e76a commit d41a4c7
Showing 8 changed files with 182 additions and 28 deletions.
4 changes: 2 additions & 2 deletions CMakeLists.txt
@@ -91,11 +91,11 @@ SET(CMAKE_RELEASE_POSTFIX "")
SET(CMAKE_DEBUG_POSTFIX "-debug")

SET(THEIA_VERSION_MAJOR 0)
SET(THEIA_VERSION_MINOR 3)
SET(THEIA_VERSION_MINOR 4)
SET(THEIA_VERSION_PATCH 0)
SET(THEIA_VERSION
${THEIA_VERSION_MAJOR}.${THEIA_VERSION_MINOR}.${THEIA_VERSION_PATCH})
SET(THEIA_ABI_VERSION 0.3.0)
SET(THEIA_ABI_VERSION 0.4.0)

# THEIA data directory
ADD_DEFINITIONS(-DTHEIA_DATA_DIR="${CMAKE_SOURCE_DIR}/data")
10 changes: 10 additions & 0 deletions docs/source/bibliography.rst
@@ -99,10 +99,17 @@ Bibliography
Estimation via Residual Consensus**. *International Conference on Computer
Vision (ICCV)*, 2011.
.. [PhotoTourism] N. Snavely, S. Seitz, and R. Szeliski. **Photo tourism:
exploring photo collections in 3D**. *ACM Transactions on Graphics (TOG)*, 2006.
.. [Stewenius5pt] H. Stewénius, C. Engels, D. Nistér. **Recent developments on
direct relative orientation**. *ISPRS Journal of Photogrammetry and Remote
Sensing*, 2006.
.. [SweeneyCVPR2015] C. Sweeney, L. Kneip, T. Hollerer, M. Turk. **Computing
Similarity Transformations from Only Image Correspondences**. *Computer Vision
and Pattern Recognition (CVPR)*, 2015.
.. [SweeneyGDLS] C. Sweeney, V. Fragoso, T. Hollerer, M. Turk. **gDLS: A
Scalable Solution to the Generalized Pose and Scale Problem**. *European
Conference on Computer Vision (ECCV)*, 2014.
@@ -119,5 +126,8 @@ Bibliography
between two point patterns**. *IEEE Transactions on Pattern Analysis and Machine
Intelligence*, 1991.
.. [VisualSfM] Wu, Changchang. **Towards a Linear Time Incremental Structure From
Motion**. *International Conference on 3D Vision*, 2013.
.. [WilsonECCV2014] Wilson, K. and Snavely, N. **Robust Global Translation with 1DSfM**.
*European Conference on Computer Vision (ECCV)*, 2014.
Binary file added docs/source/global_linear_position_estimation.png
Binary file added docs/source/global_sfm.png
Binary file added docs/source/incremental_sfm.png
Binary file added docs/source/pisa.png
26 changes: 24 additions & 2 deletions docs/source/releases.rst
@@ -9,12 +9,34 @@ HEAD

New Features
------------
* Better rendering for point clouds.

Bug Fixes
---------
* Some Visual Studio bugs and incompatibilities (thanks to Pierre Moulon and Brojeshwar Bhowmick).

`0.4.0 <https://github.com/sweeneychris/TheiaSfM/archive/v0.4.tar.gz>`_
=======================================================================

New Features
------------
* Incremental SfM pipeline.
* New website: `www.theia-sfm.org <http://www.theia-sfm.org>`_.
* Linear method for camera pose registration [JiangICCV]_.
* Better rendering for point clouds.
* Significantly better CMake scripts for Windows (thanks to bvanevery for testing).
* Mutable priority queue class.
* Bundle adjustment method for cameras only (points held constant).
* Calibrated and uncalibrated absolute pose estimators.
* Two-view bundle adjustment will now optimize camera intrinsics if they are not known.
* New small and large-scale benchmarking results on the Theia website.

Bug Fixes
---------
* Some Visual Studio bugs and incompatibilities (thanks to Pierre Moulon and Brojeshwar Bhowmick).
* Sample Consensus estimators were incorrectly counting the number of samples needed (found by inspirit).
* Proper normalization of the 1dSfM axis of projection.
* OpenGL viewer properly sets zero-values of matrices upon initialization.
* Relative translation optimization (with known rotation) is dramatically improved (thanks to Onur Ozyesil).
* Translations solver uses SPARSE_NORMAL_CHOLESKY when no 3D points are used.

`0.3.0 <https://github.com/sweeneychris/TheiaSfM/archive/v0.3.tar.gz>`_
=======================================================================
170 changes: 146 additions & 24 deletions docs/source/sfm.rst
@@ -12,15 +12,17 @@ Theia has a full Structure-from-Motion pipeline that is extremely efficient. Our
overall pipeline consists of several steps. First, we extract features (SIFT is
the default). Then, we perform two-view matching and geometric verification to
obtain relative poses between image pairs and create a :class:`ViewGraph`. Next,
we perform global pose estimation with global SfM. Global SfM is different from
incremental SfM in that it considers the entire view graph at the same time
instead of incrementally adding more and more images to the
:class:`Reconstruction`. Global SfM methods have been proven to be very fast
with comparable or better accuracy to incremental SfM approaches (See
[JiangICCV]_, [MoulonICCV]_, [WilsonECCV2014]_), and they are much more readily
parallelized. After we have obtained camera poses, we perform triangulation and
:class:`BundleAdjustment` to obtain a valid 3D reconstruction consisting of
cameras and 3D points.
we perform either incremental or global SfM. Incremental SfM is the standard
approach that adds one image at a time to grow the reconstruction. While this
method is robust, it is not scalable because it requires repeatedly running
expensive bundle adjustment. Global SfM is different from incremental SfM in
that it considers the entire view graph at the same time instead of
incrementally adding more and more images to the :class:`Reconstruction`. Global
SfM methods have been proven to be very fast, with accuracy comparable to or
better than incremental SfM approaches (see [JiangICCV]_, [MoulonICCV]_,
[WilsonECCV2014]_), and they are much more readily parallelized. After we have
obtained camera poses, we perform triangulation and :class:`BundleAdjustment` to
obtain a valid 3D reconstruction consisting of cameras and 3D points.
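
As a rough illustration of how these stages fit together, the following is a
hedged sketch of a driver program. The ``ReconstructionBuilder`` API shown here
(the umbrella header, class, and method names) is an assumption for
illustration and should be checked against the Theia headers.

.. code-block:: c++

    #include <theia/theia.h>  // assumed umbrella header
    #include <vector>

    int main() {
      theia::ReconstructionBuilderOptions options;
      theia::ReconstructionBuilder builder(options);

      // Steps 1 and 2: feature extraction and two-view matching build the
      // ViewGraph internally.
      builder.AddImage("view_0.jpg");
      builder.AddImage("view_1.jpg");
      builder.AddImage("view_2.jpg");
      builder.ExtractAndMatchFeatures();

      // Step 3 onward: pose estimation (incremental or global SfM),
      // triangulation, and bundle adjustment produce the reconstruction(s).
      std::vector<theia::Reconstruction*> reconstructions;
      builder.BuildReconstruction(&reconstructions);
      return 0;
    }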

The first step towards creating a reconstruction is to determine images which
view the same objects. To do this, we must create a :class:`ViewGraph`.
@@ -29,8 +31,6 @@
#. Match features to obtain image correspondences.
#. Estimate camera poses from two-view matches and geometries.

.. TODO:: Insert figure.

#1. and #2. have been covered in other sections, so we will focus on creating a
reconstruction from two-view matches and geometry. First, we will describe the
fundamental elements of our reconstruction.
@@ -40,7 +40,7 @@ Reconstruction

.. class:: Reconstruction

.. TODO:: Insert figure.
.. image:: pisa.png

At the core of our SfM pipeline is an SfM :class:`Reconstruction`. A
:class:`Reconstruction` is the representation of a 3D reconstruction consisting
@@ -117,8 +117,6 @@ ViewGraph

.. class:: ViewGraph

.. TODO:: INSERT FIGURE HERE

A :class:`ViewGraph` is a basic SfM construct that is created from two-view
matching information. Any pair of views that have a view correlation form an
edge in the :class:`ViewGraph` such that the nodes in the graph are
@@ -296,9 +294,83 @@ In addition to typical getter/setter methods for the camera parameters, the
according to the camera orientation in 3D space. The returned vector is not
unit length.

Incremental SfM Pipeline
========================

.. image:: incremental_sfm.png

The incremental SfM pipeline follows very closely the pipelines of `Bundler
<http://www.cs.cornell.edu/~snavely/bundler/>`_ [PhotoTourism]_ and `VisualSfM
<http://ccwu.me/vsfm/>`_ [VisualSfM]_. The method begins by estimating the 3D
structure and camera poses of two cameras from their relative pose. Additional
cameras are then added sequentially, and new 3D structure is estimated as new
parts of the scene are observed. Bundle adjustment is performed repeatedly as
more cameras are added to ensure high-quality reconstructions and to avoid
drift.

The incremental SfM pipeline is as follows (a code sketch follows the list):

#. Choose an initial camera pair to reconstruct.
#. Estimate the 3D structure of the scene.
#. Run bundle adjustment on the 2-view reconstruction.
#. Localize a new camera to the current 3D points. Choose the camera that
   observes the most 3D points currently in the scene.
#. Estimate new 3D structure.
#. Run bundle adjustment if the model has grown by more than 5% since the last
   bundle adjustment.
#. Repeat steps 4-6 until all cameras have been added.
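
Below is a schematic of this loop in code. Every type and helper in it
(``InitializeFromPair``, ``LocalizeBestRemainingView``, and so on) is a
hypothetical stand-in used to mirror steps 1-7, not Theia's actual incremental
pipeline.

.. code-block:: c++

    #include <cstddef>

    // Hypothetical stand-ins for the steps above; none of these names are
    // part of the Theia API.
    struct Reconstruction { std::size_t num_views = 0; };
    Reconstruction InitializeFromPair();                // steps 1-2
    bool LocalizeBestRemainingView(Reconstruction* r);  // step 4
    void TriangulateNewStructure(Reconstruction* r);    // steps 2 and 5
    void BundleAdjust(Reconstruction* r);               // steps 3 and 6

    Reconstruction RunIncrementalSfm(const double growth_percent = 5.0) {
      Reconstruction recon = InitializeFromPair();
      BundleAdjust(&recon);
      std::size_t views_at_last_ba = recon.num_views;
      while (LocalizeBestRemainingView(&recon)) {  // step 4
        TriangulateNewStructure(&recon);           // step 5
        const double growth =
            100.0 * (recon.num_views - views_at_last_ba) / views_at_last_ba;
        if (growth > growth_percent) {             // step 6
          BundleAdjust(&recon);
          views_at_last_ba = recon.num_views;
        }
      }
      return recon;                                // step 7: all cameras added
    }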

Incremental SfM is generally considered to be more robust than global SfM
methods; however, it requires many more rounds of bundle adjustment (which is
very costly), so incremental SfM is not as efficient or scalable.

.. member:: double ReconstructionEstimatorOptions::multiple_view_localization_ratio

DEFAULT: ``0.8``

If M is the maximum number of 3D points observed by any view, we want to
localize all views that observe > M * multiple_view_localization_ratio 3D
points. This allows for multiple well-conditioned views to be added to the
reconstruction before needing bundle adjustment. For example, with M = 1000
and the default ratio, every view observing more than 800 of the current 3D
points is localized in the same round.

.. member:: double ReconstructionEstimatorOptions::absolute_pose_reprojection_error_threshold

DEFAULT: ``8.0``

When adding a new view to the current reconstruction, this is the
reprojection error that determines whether a 2D-3D correspondence is an
inlier during localization.

.. member:: int ReconstructionEstimatorOptions::min_num_absolute_pose_inliers

DEFAULT: ``30``

Minimum number of inliers for absolute pose estimation to be considered
successful.

.. member:: double ReconstructionEstimatorOptions::full_bundle_adjustment_growth_percent

DEFAULT: ``5.0``

Bundle adjustment of the entire reconstruction is triggered when the
reconstruction has grown by more than this percent. That is, if we last ran
BA when there were K views in the reconstruction and there are now N views,
then G = 100 * (N - K) / K is the percent that the model has grown. We run
bundle adjustment only if G is greater than this variable. For example, with
the default of 5.0, going from K = 100 to N = 106 views gives G = 6% and
triggers a full bundle adjustment.

.. member:: int ReconstructionEstimatorOptions::partial_bundle_adjustment_num_views

DEFAULT: ``20``

During incremental SfM we run "partial" bundle adjustment on the most
recent views that have been added to the 3D reconstruction. This parameter
controls how many views should be part of the partial BA.

Global SfM Pipeline
===================

.. image:: global_sfm.png

The global SfM pipelines in Theia follow a general procedure of filtering
outliers and estimating camera poses or structure. Removing outliers can help
increase performance dramatically for global SfM, though robust estimation
Expand All @@ -325,8 +397,8 @@ follows:
.. class:: ReconstructionEstimator

This is the base class that all SfM reconstruction pipelines derive
from. The reconstruction estimation type can be specified at runtime, though
currently only ``NONLINEAR`` is implemented.
from. The reconstruction estimation type can be specified at runtime
(currently ``NONLINEAR`` and ``INCREMENTAL`` are implemented).

.. function:: ReconstructionEstimator::ReconstructionEstimator(const ReconstructionEstimatorOptions& options)
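
A hedged usage sketch follows; the ``Create`` factory, the ``Estimate``
signature, and the enum names are assumptions based on the description above
rather than a verbatim copy of the Theia API.

.. code-block:: c++

    #include <memory>
    #include <theia/theia.h>  // assumed umbrella header

    void EstimateFromViewGraph(theia::ViewGraph* view_graph) {
      theia::ReconstructionEstimatorOptions options;
      // Assumed enum; the docs above list NONLINEAR and INCREMENTAL types.
      options.reconstruction_estimator_type =
          theia::ReconstructionEstimatorType::INCREMENTAL;

      std::unique_ptr<theia::ReconstructionEstimator> estimator(
          theia::ReconstructionEstimator::Create(options));
      theia::Reconstruction reconstruction;
      estimator->Estimate(view_graph, &reconstruction);
    }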

@@ -583,15 +655,17 @@ Estimating Global Positions
===========================

Positions of cameras may be estimated simultaneously after the rotations are
known. We use a nonlinear optimization to estimate camera positions based. Given
pairwise relative translations from :class:`TwoViewInfo` and the estimated
rotation, the constraint
known. We use either a linear or a nonlinear optimization to estimate camera
positions.

Given pairwise relative translations from :class:`TwoViewInfo`
and the estimated rotation, the constraint

.. math:: R_i * (c_j - c_i) = \alpha_{i,j} * t_{i,j}

Where :math:`\alpha_{i,j} = ||c_j - c_i||^2`. This ensures that we optimize for
positions that agree with the relative positions computed in two-view
estimation.
is used to determine the global camera positions, where :math:`\alpha_{i,j} =
||c_j - c_i||_2`. This ensures that we optimize for positions that agree with
the relative positions computed in two-view estimation.
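
To make the constraint concrete, here is a minimal sketch of how it could be
written as a Ceres residual once the rotations are known. The functor name is
ours, and we assume the measured relative translation has already been rotated
into the world frame (i.e. :math:`R_i^\top t_{i,j}`); this is not necessarily
how Theia implements it.

.. code-block:: c++

    #include <ceres/ceres.h>
    #include <Eigen/Core>

    // Penalizes the difference between the unit baseline direction
    // (c_j - c_i) / ||c_j - c_i|| and the measured relative translation
    // direction in the world frame. Normalizing the baseline plays the role
    // of alpha_{i,j} in the constraint above.
    struct PairwiseTranslationResidual {
      explicit PairwiseTranslationResidual(const Eigen::Vector3d& t_world)
          : t_world_(t_world) {}

      template <typename T>
      bool operator()(const T* c_i, const T* c_j, T* residuals) const {
        Eigen::Matrix<T, 3, 1> baseline(c_j[0] - c_i[0],
                                        c_j[1] - c_i[1],
                                        c_j[2] - c_i[2]);
        baseline.normalize();
        for (int k = 0; k < 3; ++k) {
          residuals[k] = baseline[k] - T(t_world_[k]);
        }
        return true;
      }

      static ceres::CostFunction* Create(const Eigen::Vector3d& t_world) {
        return new ceres::AutoDiffCostFunction<
            PairwiseTranslationResidual, 3, 3, 3>(
            new PairwiseTranslationResidual(t_world));
      }

      const Eigen::Vector3d t_world_;
    };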

.. class:: NonlinearPositionEstimatorOptions

@@ -653,6 +727,40 @@ estimation.
using the nonlinear algorithm described above. Only positions that have an
orientation set are estimated. Returns true upon success and false on failure.


.. class:: LinearPositionEstimator

.. image:: global_linear_position_estimation.png
:width: 40%
:align: center

For the linear position estimator of [JiangICCV]_, we utilize an approximate geometric error to determine the position locations within a triplet as shown above. The cost function we minimize is:

.. math:: f(i, j, k) = c_k - \dfrac{1}{2} \left( (c_i + ||c_k - c_i|| c_{ik}) + (c_j + ||c_k - c_j|| c_{jk}) \right)

This can be formed as a linear constraint in the unknown camera positions :math:`c_i`. The solution that minimizes this cost lies in the null-space of the resultant linear system. Instead of extracting the entire null-space as [JiangICCV]_ does, we instead hold one camera constant at the origin and use the Inverse-Iteration Power Method to efficiently determine the null vector that best solves our minimization. This results in a dramatic speedup without sacrificing accuracy.

.. NOTE:: Currently this position estimation method is not integrated into the Theia global SfM pipeline. More testing needs to be done with this method before it can be reliably integrated.
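
As an illustration, a small sketch of inverse power iteration with Eigen is
shown below (our own sketch, not Theia's code). It assumes the system matrix is
symmetric positive semi-definite and has been made non-singular by holding one
camera at the origin as described above.

.. code-block:: c++

    #include <Eigen/Dense>
    #include <Eigen/Sparse>

    // Repeatedly solving A * x_{k+1} = x_k drives x toward the eigenvector
    // of A's smallest eigenvalue -- the (approximate) null vector when that
    // eigenvalue is near zero. The factorization is computed once, so each
    // iteration is a cheap back-substitution.
    Eigen::VectorXd SmallestEigenvector(const Eigen::SparseMatrix<double>& A,
                                        const int max_power_iterations,
                                        const double eigensolver_threshold) {
      Eigen::SimplicialLDLT<Eigen::SparseMatrix<double> > solver(A);
      Eigen::VectorXd x = Eigen::VectorXd::Random(A.cols()).normalized();
      for (int i = 0; i < max_power_iterations; ++i) {
        Eigen::VectorXd next = solver.solve(x);
        next.normalize();
        const bool converged = (next - x).norm() < eigensolver_threshold;
        x = next;
        if (converged) break;
      }
      return x;
    }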

.. member:: int LinearPositionEstimator::Options::num_threads

DEFAULT: ``1``

The number of threads to use to solve for camera positions.

.. member:: int LinearPositionEstimator::Options::max_power_iterations

DEFAULT: ``1000``

Maximum number of power iterations to perform while solving for camera positions.

.. member:: double LinearPositionEstimator::Options::eigensolver_threshold

DEFAULT: ``1e-8``

This number determines the convergence of the power iteration method. The
lower the threshold, the longer it will take to converge.

Triangulation
=============

@@ -853,7 +961,7 @@ the reprojection error.
Similarity Transformation
=========================

.. function:: void AlignPointCloudsICP(const int num_points, const double left[], const double right[], double rotation[3 * 3], double translation[3])
.. function:: void AlignPointCloudsICP(const int num_points, const double left[], const double right[], double rotation[], double translation[])

We implement ICP for point clouds, using Besl-McKay registration to align
them. We use SVD decomposition to find the rotation, as this is much
@@ -863,7 +971,7 @@
the left and right reconstructions have the same number of points, and that the
points are aligned by correspondence (i.e. left[i] corresponds to right[i]).
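
A short usage sketch with made-up points is shown below; we assume the
function lives in the ``theia`` namespace and that ``rotation`` is a row-major
3x3 matrix, as suggested by the original ``rotation[3 * 3]`` signature.

.. code-block:: c++

    #include <theia/theia.h>  // assumed umbrella header

    void AlignExample() {
      // Three correspondent points per cloud, packed as x, y, z triples
      // (values made up for illustration).
      const int kNumPoints = 3;
      const double left[3 * kNumPoints] = {0, 0, 0, 1, 0, 0, 0, 1, 0};
      const double right[3 * kNumPoints] = {1, 1, 0, 2, 1, 0, 1, 2, 0};
      double rotation[3 * 3];
      double translation[3];
      theia::AlignPointCloudsICP(kNumPoints, left, right, rotation,
                                 translation);
      // rotation and translation now align the two clouds.
    }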

.. function:: void AlignPointCloudsUmeyama(const int num_points, const double left[], const double right[], double rotation[3 * 3], double translation[3], double* scale)
.. function:: void AlignPointCloudsUmeyama(const int num_points, const double left[], const double right[], double rotation[], double translation[], double* scale)

This function estimates the 3D similarity transformation using the least
squares method of [Umeyama]_. The returned rotation, translation, and scale
@@ -903,3 +1011,17 @@ Similarity Transformation
``solution_translation``: the translation of the candidate solutions

``solution_scale``: the scale of the candidate solutions

.. function:: void SimTransformPartialRotation(const Eigen::Vector3d& rotation_axis, const Eigen::Vector3d image_one_ray_directions[5], const Eigen::Vector3d image_one_ray_origins[5], const Eigen::Vector3d image_two_ray_directions[5], const Eigen::Vector3d image_two_ray_origins[5], std::vector<Eigen::Quaterniond>* soln_rotations, std::vector<Eigen::Vector3d>* soln_translations, std::vector<double>* soln_scales)

Solves for the similarity transformation that will transform rays in image
two such that they intersect with rays in image one:

.. math:: s * R * X' + t = X

where s, R, t are the scale, rotation, and translation returned, X' is a
point in coordinate system 2 and X is the point transformed back to
coordinate system 1. Up to 8 solutions will be returned.
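
A usage sketch follows; the ray values are placeholders and the ``theia``
namespace is an assumption.

.. code-block:: c++

    #include <vector>
    #include <Eigen/Core>
    #include <Eigen/Geometry>
    #include <theia/theia.h>  // assumed umbrella header

    void SimTransformExample(const Eigen::Vector3d ray_directions1[5],
                             const Eigen::Vector3d ray_origins1[5],
                             const Eigen::Vector3d ray_directions2[5],
                             const Eigen::Vector3d ray_origins2[5]) {
      // Known rotation axis (e.g., the vertical direction from an IMU).
      const Eigen::Vector3d rotation_axis(0.0, 1.0, 0.0);
      std::vector<Eigen::Quaterniond> rotations;
      std::vector<Eigen::Vector3d> translations;
      std::vector<double> scales;
      theia::SimTransformPartialRotation(
          rotation_axis, ray_directions1, ray_origins1, ray_directions2,
          ray_origins2, &rotations, &translations, &scales);
      // Each (scales[i], rotations[i], translations[i]) is a candidate
      // satisfying s * R * X' + t = X; up to 8 are returned.
    }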

Please cite the paper "Computing Similarity Transformations from Only Image
Correspondences" by C. Sweeney et al. (CVPR 2015) [SweeneyCVPR2015]_ when using this algorithm.
